date:20201018

Re: [PATCH 2/2] arm64: allow hotpluggable sections to be offlined

2020-10-18 Thread Anshuman Khandual




On 10/17/2020 01:04 PM, David Hildenbrand wrote:
> 
>> On 17.10.2020 at 04:03, Sudarshan Rajagopalan wrote:
>>
>> On receiving the MEM_GOING_OFFLINE notification, we disallow offlining of
>> any boot memory by checking if section_early or not. With the introduction
>> of SECTION_MARK_HOTPLUGGABLE, allow boot mem sections that are marked as
>> hotpluggable with this bit set to be offlined and removed. This now allows
>> certain boot mem sections to be offlined.
>>
> 
> The check (notifier) is in arm64 code. I don't see why you cannot make such 
> decisions completely in arm64 code? Why would you have to mark sections?
> 
> Also, I think I am missing from *where* the code that marks sections 
> removable is even called? Who makes such decisions?

From the previous patch.

+EXPORT_SYMBOL_GPL(mark_memory_hotpluggable);

> 
> This feels wrong. 
> 
>> Signed-off-by: Sudarshan Rajagopalan 
>> Cc: Catalin Marinas 
>> Cc: Will Deacon 
>> Cc: Anshuman Khandual 
>> Cc: Mark Rutland 
>> Cc: Gavin Shan 
>> Cc: Logan Gunthorpe 
>> Cc: David Hildenbrand 
>> Cc: Andrew Morton 
>> Cc: Steven Price 
>> Cc: Suren Baghdasaryan 
>> ---
>> arch/arm64/mm/mmu.c | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
>> index 75df62fea1b6..fb8878698672 100644
>> --- a/arch/arm64/mm/mmu.c
>> +++ b/arch/arm64/mm/mmu.c
>> @@ -1487,7 +1487,7 @@ static int prevent_bootmem_remove_notifier(struct notifier_block *nb,
>>
>>     for (; pfn < end_pfn; pfn += PAGES_PER_SECTION) {
>>         ms = __pfn_to_section(pfn);
>> -       if (early_section(ms))
>> +       if (early_section(ms) && !removable_section(ms))

Till challenges related to boot memory removal on arm64 platform get
resolved, no portion of boot memory can be offlined. Let alone via a
driver making such decisions.

>>             return NOTIFY_BAD;
>>     }
>>     return NOTIFY_OK;
>> -- 
>> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
>> a Linux Foundation Collaborative Project
>>
> 
>
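
The rule being defended in this thread can be modelled in user space. The sketch below is not kernel code: `struct section`, `check_offline_range()` and the `MODEL_NOTIFY_*` values are hypothetical stand-ins for `struct mem_section`, the arm64 notifier and the kernel's notifier return codes. It only illustrates the current policy, where a single early (boot) section in the range vetoes the offline request.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's notifier return codes. */
enum { MODEL_NOTIFY_OK = 0, MODEL_NOTIFY_BAD = 1 };

/* Hypothetical model of a memory section: only the "early" (boot) bit. */
struct section {
	bool early;
};

/*
 * Model of the current arm64 policy: walk every section in
 * [first, first + count) and veto the request if any of them
 * was present at boot.
 */
static int check_offline_range(const struct section *sections,
			       size_t first, size_t count)
{
	for (size_t i = first; i < first + count; i++) {
		if (sections[i].early)
			return MODEL_NOTIFY_BAD;
	}
	return MODEL_NOTIFY_OK;
}
```

The proposed patch would add a second per-section flag to this test; the objection above is that no such exception should exist until boot memory removal works on arm64.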

Re: [PATCH v1] hv_balloon: disable warning when floor reached

2020-10-18 Thread Olaf Hering

On Mon, 19 Oct 2020 02:58:08 +, Michael Kelley wrote:

> I think we should take the patch.

Thanks. I only looked at the code briefly and did not understand much of it,
but it feels like the math uses the wrong input. I think it is not the
'pr_warn' that needs changing. The 'Fixes' tag would also be incorrect,
because a 4.12+backports kernel does not show the warning.

Olaf



[PATCH] gfs2: use helper macro abs()

2020-10-18 Thread Xianting Tian

Use helper macro abs() to simplify the "x >= y || x <= -y" cmp.

Signed-off-by: Xianting Tian 
---
 fs/gfs2/super.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/gfs2/super.c b/fs/gfs2/super.c
index 9f4d9e7be..05eb709de 100644
--- a/fs/gfs2/super.c
+++ b/fs/gfs2/super.c
@@ -304,7 +304,7 @@ void gfs2_statfs_change(struct gfs2_sbd *sdp, s64 total, s64 free,
    if (sdp->sd_args.ar_statfs_percent) {
        x = 100 * l_sc->sc_free;
        y = m_sc->sc_free * sdp->sd_args.ar_statfs_percent;
-       if (x >= y || x <= -y)
+       if (abs(x) >= y)
            need_sync = 1;
    }
    spin_unlock(&sdp->sd_statfs_spin);
-- 
2.17.1
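
The rewrite relies on `x >= y || x <= -y` and `abs(x) >= y` being the same test. A minimal user-space sketch of that equivalence follows; it uses `llabs()` in place of the kernel's `abs()` macro, and the function names are hypothetical. It assumes neither operand is `INT64_MIN`, since negating that value overflows in either form.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* The comparison as written before the patch. */
static bool needs_sync_old(int64_t x, int64_t y)
{
	return x >= y || x <= -y;
}

/* The comparison as written after the patch (llabs stands in for abs). */
static bool needs_sync_new(int64_t x, int64_t y)
{
	return llabs(x) >= y;
}

/* Exhaustively compare both forms for every x, y in [lo, hi]. */
static bool forms_agree(int64_t lo, int64_t hi)
{
	for (int64_t x = lo; x <= hi; x++) {
		for (int64_t y = lo; y <= hi; y++) {
			if (needs_sync_old(x, y) != needs_sync_new(x, y))
				return false;
		}
	}
	return true;
}
```

For a negative `y` both forms are always true, so the two agree there as well, not only for the non-negative thresholds the statfs code is expected to produce.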

Re: [RFC] of/platform: Create device link between simple-mfd and its children

2020-10-18 Thread Uwe Kleine-König

On Fri, Oct 16, 2020 at 05:26:56PM +0200, Nicolas Saenz Julienne wrote:
> On Fri, 2020-10-16 at 09:38 -0500, Rob Herring wrote:
> > On Thu, Oct 15, 2020 at 6:43 AM Nicolas Saenz Julienne
> >  wrote:
> > > 'simple-mfd' usage implies there might be some kind of resource sharing
> > > between the parent device and its children.
> > 
> > It does? No! The reason behind simple-mfd was specifically because
> > there was no parent driver or dependency on the parent. No doubt
> > simple-mfd has been abused.
> 
> Fair enough, so we're doing things wrong. Just for the record, I'm looking at
> RPi's firmware interface:
> 
>   firmware: firmware {
>   compatible = "raspberrypi,bcm2835-firmware", "simple-mfd";
>   #address-cells = <1>;
>   #size-cells = <1>;
>   mboxes = <&mailbox>;
> 
>   firmware_clocks: clocks {
>   compatible = "raspberrypi,firmware-clocks";
>   #clock-cells = <1>;
>   };
> 
>   reset: reset {
>   compatible = "raspberrypi,firmware-reset";
>   #reset-cells = <1>;
>   };
>   [...]
>   };
> 
> Note that "raspberrypi,bcm2835-firmware" has a driver, it's not just a
> placeholder. Consumer drivers get a handle to RPi's firmware interface through
> the supplier's API, rpi_firmware_get(). The handle to firmware becomes
> meaningless if it is unbound, which I want to protect myself against.
> 
> A simpler solution would be to manually create a device link between both
> devices ("raspberrypi,bcm2835-firmware" and "raspberrypi,firmware-clocks" for
> example) upon calling rpi_firmware_get(). But I wanted to try addressing the
> problem in a generic way first.

IMHO rpi_firmware_get() should get a reference on the firmware device
(and call try_module_get()) which prevents unbinding it.

Best regards
Uwe


-- 
Pengutronix e.K.   | Uwe Kleine-König|
Industrial Linux Solutions | https://www.pengutronix.de/ |
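
The idea of pinning the provider while a consumer holds a handle can be sketched as a toy user-space model. Everything here is hypothetical (`struct fw_model`, `fw_model_get()`, `fw_model_put()`, `fw_model_try_unbind()`); it does not reproduce `rpi_firmware_get()` or the driver core, and only shows the invariant that the suggestion aims for: no unbind while a reference is outstanding.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a firmware provider device. */
struct fw_model {
	int refcount;	/* consumers currently holding a handle */
	bool bound;	/* true while the provider driver is bound */
};

/* Take a handle; fails once the provider is gone. */
static struct fw_model *fw_model_get(struct fw_model *fw)
{
	if (!fw || !fw->bound)
		return NULL;
	fw->refcount++;
	return fw;
}

/* Drop a handle previously returned by fw_model_get(). */
static void fw_model_put(struct fw_model *fw)
{
	if (fw && fw->refcount > 0)
		fw->refcount--;
}

/* Unbind is refused while any consumer still holds a handle. */
static bool fw_model_try_unbind(struct fw_model *fw)
{
	if (fw->refcount > 0)
		return false;
	fw->bound = false;
	return true;
}
```

A device link, as proposed in the RFC, approaches the same problem from the other side: instead of refusing the unbind, the consumers are unbound first.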



Re: [PATCH 0/2] mm/memory_hotplug, arm64: allow certain bootmem sections to be offlinable

2020-10-18 Thread Anshuman Khandual

Hello Sudarshan,

On 10/17/2020 07:32 AM, Sudarshan Rajagopalan wrote:
> In the patch that enables memory hot-remove (commit bbd6ec605c0f ("arm64/mm: 
> Enable memory hot remove")) for arm64, there’s a notifier put in place that 
> prevents boot memory from being offlined and removed. The commit text 
> mentions that boot memory on arm64 cannot be removed. But x86 and other archs 
> don’t seem to do this prevention.
> 
> The current logic is that only “new” memory blocks which are hot-added can 
> later be offlined and removed. The memory that the system booted up with 
> cannot be offlined and removed. But there could be many use cases, such as 
> inter-VM memory sharing, where a primary VM could offline and hot-remove a 
> block/section of memory and lend it to a secondary VM, which could hot-add 
> it. After the use case is done, the reverse happens: the secondary VM 
> hot-removes the memory and gives it back to the primary, which can hot-add 
> it back. In such cases, the present logic for arm64 doesn’t allow this 
> hot-remove in the primary to happen.
> 
> Also, on systems with a movable zone, which more or less guarantees that 
> pages can be migrated and isolated so that blocks can be offlined, this 
> logic defeats the purpose of having a movable zone that the system can rely 
> on for memory hot-plugging, and which, say, virtio-mem also relies on for 
> fully plugged memory blocks.
> 
> This patch tries to solve this by introducing a new section mem map bit, 
> 'SECTION_MARK_HOTPLUGGABLE', which allows the concerned module drivers to 
> mark required sections as "hotpluggable" by setting this bit. This marking 
> is only allowed for sections which are in movable zone and have unmovable 
> pages. On receiving the MEM_GOING_OFFLINE notification, the arm64 mmu code 
> disallows offlining of any boot memory by checking whether the section is 
> an early section. With the introduction of SECTION_MARK_HOTPLUGGABLE, boot 
> mem sections that are marked as hotpluggable with this bit set are allowed 
> to be offlined and removed, thereby allowing required bootmem sections to 
> be offlinable.

This series was posted right after another thread you initiated in this regard
but without even waiting for it to conclude in any manner.

https://lore.kernel.org/linux-arm-kernel/de8388df2fbc5a6a33aab95831ba7...@codeaurora.org/
 

Inter-VM memory migration could be solved by other methods, as David has
mentioned. Boot memory cannot be removed, and hence offlined, on arm64 due to
multiple reasons, including making kexec non-functional afterwards. Besides,
these intrusive core MM changes are not really required.

- Anshuman

[PATCH] arm64: dts: allwinner: Pine H64: Enable both RGMII RX/TX delay

2020-10-18 Thread Corentin Labbe

Since commit bbc4d71d6354 ("net: phy: realtek: fix rtl8211e rx/tx delay config"),
the network is unusable on PineH64 model A.

This is due to phy-mode incorrectly set to rgmii instead of rgmii-id.

Fixes: 729e1ffcf47e ("arm64: allwinner: h6: add support for the Ethernet on Pine H64")
Signed-off-by: Corentin Labbe 
---
 arch/arm64/boot/dts/allwinner/sun50i-h6-pine-h64.dts | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm64/boot/dts/allwinner/sun50i-h6-pine-h64.dts b/arch/arm64/boot/dts/allwinner/sun50i-h6-pine-h64.dts
index af85b2074867..961732c52aa0 100644
--- a/arch/arm64/boot/dts/allwinner/sun50i-h6-pine-h64.dts
+++ b/arch/arm64/boot/dts/allwinner/sun50i-h6-pine-h64.dts
@@ -100,7 +100,7 @@ &ehci3 {
 &emac {
    pinctrl-names = "default";
    pinctrl-0 = <&ext_rgmii_pins>;
-   phy-mode = "rgmii";
+   phy-mode = "rgmii-id";
    phy-handle = <&ext_rgmii_phy>;
    phy-supply = <&reg_gmac_3v3>;
    allwinner,rx-delay-ps = <200>;
-- 
2.26.2
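
For context, the four RGMII phy-mode strings differ only in which internal delays the PHY is asked to insert. The sketch below encodes that mapping as described in the kernel's ethernet-controller binding; `struct rgmii_delays` and `rgmii_phy_delays()` are hypothetical names used for illustration, not kernel API.

```c
#include <stdbool.h>
#include <string.h>

/* Which clock delays the PHY itself is expected to add. */
struct rgmii_delays {
	bool rx;
	bool tx;
};

/*
 * Translate a phy-mode string into the delays requested from the PHY.
 * Returns false for strings that are not RGMII variants.
 */
static bool rgmii_phy_delays(const char *mode, struct rgmii_delays *out)
{
	if (strcmp(mode, "rgmii") == 0) {
		out->rx = false;
		out->tx = false;
	} else if (strcmp(mode, "rgmii-id") == 0) {
		out->rx = true;
		out->tx = true;
	} else if (strcmp(mode, "rgmii-rxid") == 0) {
		out->rx = true;
		out->tx = false;
	} else if (strcmp(mode, "rgmii-txid") == 0) {
		out->rx = false;
		out->tx = true;
	} else {
		return false;
	}
	return true;
}
```

With plain "rgmii" the PHY adds no delay, which is why a PHY driver that starts honouring the property strictly can break boards whose device tree relied on the old behaviour.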

Re: arm64: dropping prevent_bootmem_remove_notifier

2020-10-18 Thread Anshuman Khandual




On 10/17/2020 03:05 PM, David Hildenbrand wrote:
> On 17.10.20 01:11, Sudarshan Rajagopalan wrote:
>>
>> Hello Anshuman,
>>
> David here,
> 
> in general, if your driver offlines+removes random memory, it is doing
> something *very* wrong and dangerous. You shouldn't ever be
> offlining+removing memory unless
> a) you own that boot memory after boot. E.g., the ACPI driver owns DIMMs
> after a reboot.
> b) you added that memory via add_memory() and friends.

Right.

> 
> Even trusting that offline memory can be used by your driver is wrong.

Right.

> 
> Just imagine you racing with actual memory hot(un)plug, you'll be in
> *big* trouble. For example,
> 
> 1. You offlined memory and assume you can use it. A DIMM can simply get
> unplugged. you're doomed.
> 2. You offlined+removed memory and assume you can use it. A DIMM can
> simply get unplugged and the whole machine would crash.
> 
> Or imagine your driver running on a system that has virtio-mem, which
> will try to remove/offline+remove memory that was added by virtio-mem/
> is under its control.
> 
> Long story short: don't do it.
> 
> There is *one* instance in Linux where we currently allow it for legacy
> reasons. It is powernv/memtrace code that offlines+removes boot memory.
> But here we are guaranteed to run in an environment (HW) without any
> actual memory hot(un)plug.
> 
> I guess you're going to say "but in our environment we don't have ..." -
> this is not a valid argument to change such generic things upstream /
> introducing such hacks.

Agreed.

> 
>> In the patch that enables memory hot-remove (commit bbd6ec605c0f 
>> ("arm64/mm: Enable memory hot remove")) for arm64, there’s a notifier 
>> put in place that prevents boot memory from being offlined and removed. 
>> Also commit text mentions that boot memory on arm64 cannot be removed. 
>> We wanted to understand more about the reasoning for this. X86 and other 
>> archs don’t seem to do this prevention. There’s also a comment in the 
>> code that this notifier could be dropped in future if and when boot 
>> memory can be removed.
> 
> The issue is that with *actual* memory hotunplug (for what the whole
> machinery should be used for), that memory/DIMM will be gone. And as you
> cannot fixup the initial memmap, if you were to reboot that machine, you
> would simply crash immediately.

Right.

> 
> On x86, you can have that easily: hotplug DIMMs on bare metal and
> reboot. The DIMMs will be exposed via e820 during boot, so they are
> "early", although if done right (movable_node, movable_core and
> similar), they can get hotunplugged later. Important in environments
> where you want to hotunplug whole nodes. But as HW on x86 will properly
> adjust the initial memmap / e820, there is no such issue as on arm64.

That is the primary problem.

> 
>>
>> The current logic is that only “new” memory blocks which are hot-added 
>> can later be offlined and removed. The memory that the system booted up 
>> with cannot be offlined and removed. But there could be many use cases such 
>> as inter-VM memory sharing where a primary VM could offline and 
>> hot-remove a block/section of memory and lend it to a secondary VM where 
>> it could hot-add it. And after the use case is done, the reverse happens 
> 
> That use case is using the wrong mechanisms. It shouldn't be
> offlining+removing memory. Read below.
> 
>> where secondary VM hot-removes and gives it back to primary which can 
>> hot-add it back. In such cases, the present logic for arm64 doesn’t 
>> allow this hot-remove in primary to happen.
>>
>> Also, on systems with movable zone that sort of guarantees pages to be 
>> migrated and isolated so that blocks can be offlined, this logic also 
>> defeats the purpose of having a movable zone which system can rely on 
>> memory hot-plugging, which say virt-io mem also relies on for fully 
>> plugged memory blocks.
> 
> The MOVABLE_ZONE is *not* just for better guarantees when trying to
> hotunplug memory. It also increases the number of THP/huge pages. And
> that part works just fine.

Right.

> 
>>
>> So we’re trying to understand the reasoning for such a prevention put in 
>> place for arm64 arch alone.
>>
>> One possible way to solve this is by marking the required sections as 
>> “non-early” by removing the SECTION_IS_EARLY bit in its section_mem_map. 
>> This puts these sections in the context of “memory hotpluggable” which 
>> can be offlined-removed and added-onlined which are part of boot RAM 
>> itself and doesn’t need any extra blocks to be hot added. This way of 
>> marking certain sections as “non-early” could be exported so that module 
>> drivers can set the required number of sections as “memory 
> 
> Oh please no. No driver should be doing that. That's just hacking around
> the root issue: you're not supposed to do that.
> 
>> hotpluggable”. This could have certain checks put in place to see which 
>> sections are allowed, example only movable zone sections can be marked 
>> as “non-early”.

Re: [PATCH v1 4/6] spi: cadence-quadspi: Add QSPI support for Intel LGM SoC

2020-10-18 Thread Ramuthevar, Vadivel MuruganX


Hi Mark,

On 17/10/2020 12:33 am, Mark Brown wrote:
> On Fri, Oct 16, 2020 at 05:31:36PM +0800, Ramuthevar,Vadivel MuruganX wrote:
>
>> +   depends on OF && (ARM || ARM64 || X86 || COMPILE_TEST)
>
>> +   {
>> +       .compatible = "intel,lgm-qspi",
>> +   },
>
> This is an x86 SoC (or SoC series) - is it really going to use DT for
> the firmware interfaces?

Thank you for the review comments...
The Intel LGM SoC does use a DT-based firmware blob.

> It's not specifically a problem, just
> surprising to see something other than ACPI.  Or is the intention to use
> PRP0001?

Yes, you're right, most of them are ACPI-based, but the LGM SoC isn't.

Regards
Vadivel

> There's a new compatible here which wasn't really the use case
> for PRP0001.  Like I say not really a problem, just curious.

Re: [PATCH v1 2/6] dt-bindings: spi: Convert cadence-quadspi.txt to cadence-quadspi.yaml

2020-10-18 Thread Ramuthevar, Vadivel MuruganX


Hi Mark,

On 17/10/2020 12:18 am, Mark Brown wrote:
> On Fri, Oct 16, 2020 at 05:31:34PM +0800, Ramuthevar,Vadivel MuruganX wrote:
>> From: Ramuthevar Vadivel Murugan
>>
>> Convert the cadence-quadspi.txt documentation to cadence-quadspi.yaml
>> remove the cadence-quadspi.txt from Documentation/devicetree/bindings/spi/
>
> Please make YAML conversions the last thing in any patch series -
> there's sometimes a backlog on reviews as the DT maintainers are very
> busy so this means that delays with them don't hold the rest of the
> series up.

Thank you for the comment and suggestions...
Sure, will do that accordingly.

Regards
Vadivel

[PATCH v5 16/17] virt: acrn: Introduce irqfd

2020-10-18 Thread shuo . a . liu

From: Shuo Liu 

irqfd is a mechanism to inject a specific interrupt to a User VM using a
decoupled eventfd mechanism.

Vhost is a kernel-level virtio server which uses eventfd for interrupt
injection. To support vhost on ACRN, irqfd is introduced in HSM.

HSM provides ioctls to associate a virtual Message Signaled Interrupt
(MSI) with an eventfd. The corresponding virtual MSI will be injected
into a User VM once the eventfd is signaled.
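
The eventfd side of this mechanism can be tried from user space on Linux. The sketch below uses only `eventfd(2)`, `read(2)` and `write(2)`; the helper names are hypothetical and nothing here touches the ACRN ioctls. It shows the counter semantics irqfd builds on: each write adds to a 64-bit counter, and a read returns the accumulated value and resets it.

```c
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Add `count` to the eventfd counter; returns 0 on success, -1 on error. */
static int signal_eventfd(int efd, uint64_t count)
{
	ssize_t n = write(efd, &count, sizeof(count));

	return n == (ssize_t)sizeof(count) ? 0 : -1;
}

/*
 * Read and reset the counter. Returns the accumulated value, or 0 when
 * nothing is pending (the descriptor is assumed to be non-blocking).
 */
static uint64_t drain_eventfd(int efd)
{
	uint64_t value = 0;

	if (read(efd, &value, sizeof(value)) != (ssize_t)sizeof(value))
		return 0;
	return value;
}
```

In the patch, the kernel side does not poll by reading; it hooks the eventfd's wait queue and injects the MSI from the wakeup callback. The sketch only covers what the signalling party sees.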

Signed-off-by: Shuo Liu 
Reviewed-by: Zhi Wang 
Reviewed-by: Reinette Chatre 
Cc: Zhi Wang 
Cc: Zhenyu Wang 
Cc: Yu Wang 
Cc: Reinette Chatre 
Cc: Greg Kroah-Hartman 
---
 drivers/virt/acrn/Makefile   |   2 +-
 drivers/virt/acrn/acrn_drv.h |  10 ++
 drivers/virt/acrn/hsm.c  |   7 ++
 drivers/virt/acrn/irqfd.c| 235 +++
 drivers/virt/acrn/vm.c   |   3 +
 include/uapi/linux/acrn.h|  15 +++
 6 files changed, 271 insertions(+), 1 deletion(-)
 create mode 100644 drivers/virt/acrn/irqfd.c

diff --git a/drivers/virt/acrn/Makefile b/drivers/virt/acrn/Makefile
index 755b583b32ca..08ce641dcfa1 100644
--- a/drivers/virt/acrn/Makefile
+++ b/drivers/virt/acrn/Makefile
@@ -1,3 +1,3 @@
 # SPDX-License-Identifier: GPL-2.0
 obj-$(CONFIG_ACRN_HSM) := acrn.o
-acrn-y := hsm.o vm.o mm.o ioreq.o ioeventfd.o
+acrn-y := hsm.o vm.o mm.o ioreq.o ioeventfd.o irqfd.o
diff --git a/drivers/virt/acrn/acrn_drv.h b/drivers/virt/acrn/acrn_drv.h
index c66c620b9f10..8354d0d5881c 100644
--- a/drivers/virt/acrn/acrn_drv.h
+++ b/drivers/virt/acrn/acrn_drv.h
@@ -161,6 +161,9 @@ extern rwlock_t acrn_vm_list_lock;
  * @ioeventfds_lock:   Lock to protect ioeventfds list
  * @ioeventfds:        List to link all hsm_ioeventfd
  * @ioeventfd_client:  I/O client for ioeventfds of the VM
+ * @irqfds_lock:       Lock to protect irqfds list
+ * @irqfds:            List to link all hsm_irqfd
+ * @irqfd_wq:          Workqueue for irqfd async shutdown
  */
 struct acrn_vm {
    struct list_head            list;
@@ -180,6 +183,9 @@ struct acrn_vm {
    struct mutex                ioeventfds_lock;
    struct list_head            ioeventfds;
    struct acrn_ioreq_client    *ioeventfd_client;
+   struct mutex                irqfds_lock;
+   struct list_head            irqfds;
+   struct workqueue_struct     *irqfd_wq;
 };
 
 struct acrn_vm *acrn_vm_create(struct acrn_vm *vm,
@@ -216,4 +222,8 @@ int acrn_ioeventfd_init(struct acrn_vm *vm);
 int acrn_ioeventfd_config(struct acrn_vm *vm, struct acrn_ioeventfd *args);
 void acrn_ioeventfd_deinit(struct acrn_vm *vm);
 
+int acrn_irqfd_init(struct acrn_vm *vm);
+int acrn_irqfd_config(struct acrn_vm *vm, struct acrn_irqfd *args);
+void acrn_irqfd_deinit(struct acrn_vm *vm);
+
 #endif /* __ACRN_HSM_DRV_H */
diff --git a/drivers/virt/acrn/hsm.c b/drivers/virt/acrn/hsm.c
index 16f0280148f3..c7290e177b1e 100644
--- a/drivers/virt/acrn/hsm.c
+++ b/drivers/virt/acrn/hsm.c
@@ -115,6 +115,7 @@ static long acrn_dev_ioctl(struct file *filp, unsigned int cmd,
    struct acrn_vm_memmap memmap;
    struct acrn_msi_entry *msi;
    struct acrn_pcidev *pcidev;
+   struct acrn_irqfd irqfd;
    struct page *page;
    u64 cstate_cmd;
    int ret = 0;
@@ -317,6 +318,12 @@ static long acrn_dev_ioctl(struct file *filp, unsigned int cmd,
 
        ret = acrn_ioeventfd_config(vm, &ioeventfd);
        break;
+   case ACRN_IOCTL_IRQFD:
+       if (copy_from_user(&irqfd, (void __user *)ioctl_param,
+                  sizeof(irqfd)))
+           return -EFAULT;
+       ret = acrn_irqfd_config(vm, &irqfd);
+       break;
    default:
        dev_dbg(acrn_dev.this_device, "Unknown IOCTL 0x%x!\n", cmd);
        ret = -ENOTTY;
diff --git a/drivers/virt/acrn/irqfd.c b/drivers/virt/acrn/irqfd.c
new file mode 100644
index ..a8766d528e29
--- /dev/null
+++ b/drivers/virt/acrn/irqfd.c
@@ -0,0 +1,235 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * ACRN HSM irqfd: use eventfd objects to inject virtual interrupts
+ *
+ * Copyright (C) 2020 Intel Corporation. All rights reserved.
+ *
+ * Authors:
+ * Shuo Liu 
+ * Yakui Zhao 
+ */
+
+#include 
+#include 
+#include 
+#include 
+
+#include "acrn_drv.h"
+
+static LIST_HEAD(acrn_irqfd_clients);
+static DEFINE_MUTEX(acrn_irqfds_mutex);
+
+/**
+ * struct hsm_irqfd - Properties of HSM irqfd
+ * @vm:        Associated VM pointer
+ * @wait:      Entry of wait-queue
+ * @shutdown:  Async shutdown work
+ * @eventfd:   Associated eventfd
+ * @list:      Entry within &acrn_vm.irqfds of irqfds of a VM
+ * @pt:        Structure for select/poll on the associated eventfd
+ * @msi:       MSI data
+ */
+struct hsm_irqfd {
+   struct acrn_vm  *vm;
+   wait_queue_entry_t  wait;
+   struct work_struct  shutdown;
+   struct eventfd_ctx  *eventfd;
+

[PATCH v5 15/17] virt: acrn: Introduce ioeventfd

2020-10-18 Thread shuo . a . liu

From: Shuo Liu 

ioeventfd is a mechanism to register PIO/MMIO regions to trigger an
eventfd signal when written to by a User VM. ACRN userspace can register
any arbitrary I/O address with a corresponding eventfd and then pass the
eventfd to a specific end-point of interest for handling.

Vhost is a kernel-level virtio server which uses eventfd for signalling.
To support vhost on ACRN, ioeventfd is introduced in HSM.

A new I/O client dedicated to ioeventfd is associated with a User VM
during VM creation. HSM provides ioctls to associate an I/O region with
an eventfd. The I/O client signals an eventfd once its corresponding I/O
region is matched with an I/O request.

Signed-off-by: Shuo Liu 
Reviewed-by: Zhi Wang 
Reviewed-by: Reinette Chatre 
Cc: Zhi Wang 
Cc: Zhenyu Wang 
Cc: Yu Wang 
Cc: Reinette Chatre 
Cc: Greg Kroah-Hartman 
---
 drivers/virt/acrn/Kconfig |   1 +
 drivers/virt/acrn/Makefile|   2 +-
 drivers/virt/acrn/acrn_drv.h  |  10 ++
 drivers/virt/acrn/hsm.c   |   8 +
 drivers/virt/acrn/ioeventfd.c | 273 ++
 drivers/virt/acrn/vm.c|   2 +
 include/uapi/linux/acrn.h |  29 
 7 files changed, 324 insertions(+), 1 deletion(-)
 create mode 100644 drivers/virt/acrn/ioeventfd.c

diff --git a/drivers/virt/acrn/Kconfig b/drivers/virt/acrn/Kconfig
index 36c80378c30c..3e1a61c9d8d8 100644
--- a/drivers/virt/acrn/Kconfig
+++ b/drivers/virt/acrn/Kconfig
@@ -2,6 +2,7 @@
 config ACRN_HSM
tristate "ACRN Hypervisor Service Module"
depends on ACRN_GUEST
+   select EVENTFD
help
  ACRN Hypervisor Service Module (HSM) is a kernel module which
  communicates with ACRN userspace through ioctls and talks to
diff --git a/drivers/virt/acrn/Makefile b/drivers/virt/acrn/Makefile
index 21721cbf6a80..755b583b32ca 100644
--- a/drivers/virt/acrn/Makefile
+++ b/drivers/virt/acrn/Makefile
@@ -1,3 +1,3 @@
 # SPDX-License-Identifier: GPL-2.0
 obj-$(CONFIG_ACRN_HSM) := acrn.o
-acrn-y := hsm.o vm.o mm.o ioreq.o
+acrn-y := hsm.o vm.o mm.o ioreq.o ioeventfd.o
diff --git a/drivers/virt/acrn/acrn_drv.h b/drivers/virt/acrn/acrn_drv.h
index 5b824fa1ee57..c66c620b9f10 100644
--- a/drivers/virt/acrn/acrn_drv.h
+++ b/drivers/virt/acrn/acrn_drv.h
@@ -158,6 +158,9 @@ extern rwlock_t acrn_vm_list_lock;
  * @ioreq_page:        The page of the I/O request shared buffer
  * @pci_conf_addr:     Address of a PCI configuration access emulation
  * @monitor_page:      Page of interrupt statistics of User VM
+ * @ioeventfds_lock:   Lock to protect ioeventfds list
+ * @ioeventfds:        List to link all hsm_ioeventfd
+ * @ioeventfd_client:  I/O client for ioeventfds of the VM
  */
 struct acrn_vm {
    struct list_head            list;
@@ -174,6 +177,9 @@ struct acrn_vm {
    struct page                 *ioreq_page;
    u32                         pci_conf_addr;
    struct page                 *monitor_page;
+   struct mutex                ioeventfds_lock;
+   struct list_head            ioeventfds;
+   struct acrn_ioreq_client    *ioeventfd_client;
 };
 };
 
 struct acrn_vm *acrn_vm_create(struct acrn_vm *vm,
@@ -206,4 +212,8 @@ void acrn_ioreq_range_del(struct acrn_ioreq_client *client,
 
 int acrn_msi_inject(struct acrn_vm *vm, u64 msi_addr, u64 msi_data);
 
+int acrn_ioeventfd_init(struct acrn_vm *vm);
+int acrn_ioeventfd_config(struct acrn_vm *vm, struct acrn_ioeventfd *args);
+void acrn_ioeventfd_deinit(struct acrn_vm *vm);
+
 #endif /* __ACRN_HSM_DRV_H */
diff --git a/drivers/virt/acrn/hsm.c b/drivers/virt/acrn/hsm.c
index 9565b4d64ab7..16f0280148f3 100644
--- a/drivers/virt/acrn/hsm.c
+++ b/drivers/virt/acrn/hsm.c
@@ -111,6 +111,7 @@ static long acrn_dev_ioctl(struct file *filp, unsigned int cmd,
    struct acrn_vcpu_regs *cpu_regs;
    struct acrn_ioreq_notify notify;
    struct acrn_ptdev_irq *irq_info;
+   struct acrn_ioeventfd ioeventfd;
    struct acrn_vm_memmap memmap;
    struct acrn_msi_entry *msi;
    struct acrn_pcidev *pcidev;
@@ -309,6 +310,13 @@ static long acrn_dev_ioctl(struct file *filp, unsigned int cmd,
 
        ret = pmcmd_ioctl(cstate_cmd, (void __user *)ioctl_param);
        break;
+   case ACRN_IOCTL_IOEVENTFD:
+       if (copy_from_user(&ioeventfd, (void __user *)ioctl_param,
+                  sizeof(ioeventfd)))
+           return -EFAULT;
+
+       ret = acrn_ioeventfd_config(vm, &ioeventfd);
+       break;
    default:
        dev_dbg(acrn_dev.this_device, "Unknown IOCTL 0x%x!\n", cmd);
        ret = -ENOTTY;
diff --git a/drivers/virt/acrn/ioeventfd.c b/drivers/virt/acrn/ioeventfd.c
new file mode 100644
index ..ac4037e9f947
--- /dev/null
+++ b/drivers/virt/acrn/ioeventfd.c
@@ -0,0 +1,273 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * ACRN HSM eventfd - use eventfd objects

[PATCH v5 17/17] virt: acrn: Introduce an interface for Service VM to control vCPU

2020-10-18 Thread shuo . a . liu

From: Shuo Liu 

ACRN supports partition mode to achieve real-time requirements. In
partition mode, a CPU core can be dedicated to a vCPU of User VM. The
local APIC of the dedicated CPU core can be passed through to the User VM.
The Service VM controls the assignment of the CPU cores.

Introduce an interface for the Service VM to remove the control of CPU
core from hypervisor perspective so that the CPU core can be a dedicated
CPU core of User VM.

Signed-off-by: Shuo Liu 
Reviewed-by: Zhi Wang 
Reviewed-by: Reinette Chatre 
Cc: Zhi Wang 
Cc: Zhenyu Wang 
Cc: Yu Wang 
Cc: Reinette Chatre 
Cc: Greg Kroah-Hartman 
---
 drivers/virt/acrn/hsm.c   | 48 +++
 drivers/virt/acrn/hypercall.h | 14 ++
 2 files changed, 62 insertions(+)

diff --git a/drivers/virt/acrn/hsm.c b/drivers/virt/acrn/hsm.c
index c7290e177b1e..8c3799cc0313 100644
--- a/drivers/virt/acrn/hsm.c
+++ b/drivers/virt/acrn/hsm.c
@@ -9,6 +9,7 @@
  * Yakui Zhao 
  */
 
+#include 
 #include 
 #include 
 #include 
@@ -341,6 +342,52 @@ static int acrn_dev_release(struct inode *inode, struct file *filp)
    return 0;
 }
 
+static ssize_t remove_cpu_store(struct device *dev,
+               struct device_attribute *attr,
+               const char *buf, size_t count)
+{
+   u64 cpu, lapicid;
+   int ret;
+
+   if (kstrtoull(buf, 0, &cpu) < 0)
+       return -EINVAL;
+
+   if (cpu >= num_possible_cpus() || cpu == 0 || !cpu_is_hotpluggable(cpu))
+       return -EINVAL;
+
+   if (cpu_online(cpu))
+       remove_cpu(cpu);
+
+   lapicid = cpu_data(cpu).apicid;
+   dev_dbg(dev, "Try to remove cpu %lld with lapicid %lld\n", cpu, lapicid);
+   ret = hcall_sos_remove_cpu(lapicid);
+   if (ret < 0) {
+       dev_err(dev, "Failed to remove cpu %lld!\n", cpu);
+       goto fail_remove;
+   }
+
+   return count;
+
+fail_remove:
+   add_cpu(cpu);
+   return ret;
+}
+static DEVICE_ATTR_WO(remove_cpu);
+
+static struct attribute *acrn_attrs[] = {
+   &dev_attr_remove_cpu.attr,
+   NULL
+};
+
+static struct attribute_group acrn_attr_group = {
+   .attrs = acrn_attrs,
+};
+
+static const struct attribute_group *acrn_attr_groups[] = {
+   &acrn_attr_group,
+   NULL
+};
+
 static const struct file_operations acrn_fops = {
    .owner  = THIS_MODULE,
    .open   = acrn_dev_open,
@@ -352,6 +399,7 @@ struct miscdevice acrn_dev = {
    .minor  = MISC_DYNAMIC_MINOR,
    .name   = "acrn_hsm",
    .fops   = &acrn_fops,
+   .groups = acrn_attr_groups,
 };
 };
 
 static int __init hsm_init(void)
diff --git a/drivers/virt/acrn/hypercall.h b/drivers/virt/acrn/hypercall.h
index e640632366f0..0cfad05bd1a9 100644
--- a/drivers/virt/acrn/hypercall.h
+++ b/drivers/virt/acrn/hypercall.h
@@ -13,6 +13,9 @@
 
 #define HC_ID 0x80UL
 
+#define HC_ID_GEN_BASE 0x0UL
+#define HC_SOS_REMOVE_CPU  _HC_ID(HC_ID, HC_ID_GEN_BASE + 0x01)
+
 #define HC_ID_VM_BASE  0x10UL
 #define HC_CREATE_VM   _HC_ID(HC_ID, HC_ID_VM_BASE + 0x00)
 #define HC_DESTROY_VM  _HC_ID(HC_ID, HC_ID_VM_BASE + 0x01)
@@ -42,6 +45,17 @@
 #define HC_ID_PM_BASE  0x80UL
 #define HC_PM_GET_CPU_STATE    _HC_ID(HC_ID, HC_ID_PM_BASE + 0x00)
 
+/**
+ * hcall_sos_remove_cpu() - Remove a vCPU of Service VM
+ * @cpu: The vCPU to be removed
+ *
+ * Return: 0 on success, <0 on failure
+ */
+static inline long hcall_sos_remove_cpu(u64 cpu)
+{
+   return acrn_hypercall1(HC_SOS_REMOVE_CPU, cpu);
+}
+
 /**
  * hcall_create_vm() - Create a User VM
  * @vminfo:    Service VM GPA of info of User VM creation
-- 
2.28.0
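
The input checks in `remove_cpu_store()` can be approximated in user space. The sketch below is an assumption-laden model, not the kernel code: `parse_removable_cpu()` is a hypothetical name, `strtoull()` stands in for `kstrtoull()`, the CPU count is passed in explicitly, and the `cpu_is_hotpluggable()` test is left out. It keeps the two rules visible in the patch: CPU 0 is never accepted, and the index must be below the number of possible CPUs.

```c
#include <errno.h>
#include <stdlib.h>

/*
 * Parse a CPU index written to the sysfs-style attribute and apply the
 * range checks. Returns 0 and stores the index on success, -EINVAL
 * otherwise.
 */
static int parse_removable_cpu(const char *buf, unsigned long long nr_cpus,
			       unsigned long long *cpu)
{
	char *end;
	unsigned long long val;

	errno = 0;
	val = strtoull(buf, &end, 0);	/* base 0: decimal, 0x hex, 0 octal */
	if (errno || end == buf)
		return -EINVAL;
	if (*end == '\n')		/* tolerate one trailing newline */
		end++;
	if (*end != '\0')
		return -EINVAL;
	if (val == 0 || val >= nr_cpus)
		return -EINVAL;

	*cpu = val;
	return 0;
}
```

A write such as `echo 3 > remove_cpu` would reach the handler as the string "3\n", which is why the trailing newline is accepted.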

[PATCH v5 09/17] virt: acrn: Introduce I/O request management

2020-10-18 Thread shuo . a . liu

From: Shuo Liu 

An I/O request of a User VM, which is constructed by the hypervisor, is
distributed by the ACRN Hypervisor Service Module to an I/O client
corresponding to the address range of the I/O request.

For each User VM, there is a shared 4-KByte memory region used for I/O
requests communication between the hypervisor and Service VM. An I/O
request is a 256-byte structure buffer, which is 'struct
acrn_io_request', that is filled by an I/O handler of the hypervisor
when a trapped I/O access happens in a User VM. ACRN userspace in the
Service VM first allocates a 4-KByte page and passes the GPA (Guest
Physical Address) of the buffer to the hypervisor. The buffer is used as
an array of 16 I/O request slots with each I/O request slot being 256
bytes. This array is indexed by vCPU ID.

An I/O client, which is 'struct acrn_ioreq_client', is responsible for
handling User VM I/O requests whose accessed GPA falls in a certain
range. Multiple I/O clients can be associated with each User VM. There
is a special client associated with each User VM, called the default
client, that handles all I/O requests that do not fit into the range of
any other I/O clients. The ACRN userspace acts as the default client for
each User VM.

The state transitions of an ACRN I/O request are as follows.

   FREE -> PENDING -> PROCESSING -> COMPLETE -> FREE -> ...

FREE: this I/O request slot is empty
PENDING: a valid I/O request is pending in this slot
PROCESSING: the I/O request is being processed
COMPLETE: the I/O request has been processed

An I/O request in COMPLETE or FREE state is owned by the hypervisor. HSM
and ACRN userspace are in charge of processing the others.
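
The slot layout and state cycle described above can be written down as a small model. The identifiers below (`MODEL_*`, `model_transition_ok()`, `model_owned_by_hypervisor()`) are hypothetical and chosen for this sketch; they are not the names used in the ACRN headers.

```c
#include <stdbool.h>

/* One 4-KByte shared page divided into 256-byte request slots. */
enum {
	MODEL_IOREQ_PAGE_SIZE = 4096,
	MODEL_IOREQ_SLOT_SIZE = 256,
	MODEL_IOREQ_SLOTS = MODEL_IOREQ_PAGE_SIZE / MODEL_IOREQ_SLOT_SIZE,
};

enum model_ioreq_state {
	MODEL_FREE,
	MODEL_PENDING,
	MODEL_PROCESSING,
	MODEL_COMPLETE,
};

/* Only the cycle FREE -> PENDING -> PROCESSING -> COMPLETE -> FREE is valid. */
static bool model_transition_ok(enum model_ioreq_state from,
				enum model_ioreq_state to)
{
	switch (from) {
	case MODEL_FREE:
		return to == MODEL_PENDING;
	case MODEL_PENDING:
		return to == MODEL_PROCESSING;
	case MODEL_PROCESSING:
		return to == MODEL_COMPLETE;
	case MODEL_COMPLETE:
		return to == MODEL_FREE;
	}
	return false;
}

/* FREE and COMPLETE slots belong to the hypervisor, the rest to HSM/userspace. */
static bool model_owned_by_hypervisor(enum model_ioreq_state state)
{
	return state == MODEL_FREE || state == MODEL_COMPLETE;
}
```

The slot count works out to 4096 / 256 = 16, matching the 16 slots indexed by vCPU ID.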

The processing flow of I/O requests is as follows:

a) The I/O handler of the hypervisor will fill an I/O request with
   PENDING state when a trapped I/O access happens in a User VM.
b) The hypervisor makes an upcall, which is a notification interrupt, to
   the Service VM.
c) The upcall handler schedules a tasklet to dispatch I/O requests.
d) The tasklet looks for the PENDING I/O requests, assigns them to
   different registered clients based on the address of the I/O accesses,
   updates their state to PROCESSING, and notifies the corresponding
   client to handle.
e) The notified client handles the assigned I/O requests.
f) The HSM updates I/O requests states to COMPLETE and notifies the
   hypervisor of the completion via hypercalls.

Signed-off-by: Shuo Liu 
Reviewed-by: Zhi Wang 
Reviewed-by: Reinette Chatre 
Cc: Zhi Wang 
Cc: Zhenyu Wang 
Cc: Yu Wang 
Cc: Reinette Chatre 
Cc: Greg Kroah-Hartman 
---
 drivers/virt/acrn/Makefile|   2 +-
 drivers/virt/acrn/acrn_drv.h  |  82 ++
 drivers/virt/acrn/hsm.c   |  32 ++-
 drivers/virt/acrn/hypercall.h |  28 ++
 drivers/virt/acrn/ioreq.c | 509 ++
 drivers/virt/acrn/vm.c|  27 +-
 include/uapi/linux/acrn.h | 134 +
 7 files changed, 804 insertions(+), 10 deletions(-)
 create mode 100644 drivers/virt/acrn/ioreq.c

diff --git a/drivers/virt/acrn/Makefile b/drivers/virt/acrn/Makefile
index 38bc44b6edcd..21721cbf6a80 100644
--- a/drivers/virt/acrn/Makefile
+++ b/drivers/virt/acrn/Makefile
@@ -1,3 +1,3 @@
 # SPDX-License-Identifier: GPL-2.0
 obj-$(CONFIG_ACRN_HSM) := acrn.o
-acrn-y := hsm.o vm.o mm.o
+acrn-y := hsm.o vm.o mm.o ioreq.o
diff --git a/drivers/virt/acrn/acrn_drv.h b/drivers/virt/acrn/acrn_drv.h
index 4e86b97371ba..cf9143cf760d 100644
--- a/drivers/virt/acrn/acrn_drv.h
+++ b/drivers/virt/acrn/acrn_drv.h
@@ -12,10 +12,15 @@
 
 extern struct miscdevice acrn_dev;
 
+#define ACRN_NAME_LEN  16
 #define ACRN_MEM_MAPPING_MAX   256
 
 #define ACRN_MEM_REGION_ADD    0
 #define ACRN_MEM_REGION_DEL    2
+
+struct acrn_vm;
+struct acrn_ioreq_client;
+
 /**
  * struct vm_memory_region_op - Hypervisor memory operation
  * @type:  Operation type (ACRN_MEM_REGION_*)
@@ -77,9 +82,63 @@ struct vm_memory_mapping {
size_t  size;
 };
 
+/**
+ * struct acrn_ioreq_buffer - Data for setting the ioreq buffer of User VM
+ * @ioreq_buf: The GPA of the IO request shared buffer of a VM
+ *
+ * The parameter for the HC_SET_IOREQ_BUFFER hypercall used to set up
+ * the shared I/O request buffer between Service VM and ACRN hypervisor.
+ */
+struct acrn_ioreq_buffer {
+   u64 ioreq_buf;
+};
+
+struct acrn_ioreq_range {
+   struct list_head        list;
+   u32 type;
+   u64 start;
+   u64 end;
+};
+
+#define ACRN_IOREQ_CLIENT_DESTROYING   0U
+typedef int (*ioreq_handler_t)(struct acrn_ioreq_client *client,
+  struct acrn_io_request *req);
+/**
+ * struct acrn_ioreq_client - Structure of I/O client.
+ * @name:  Client name
+ * @vm:    The VM that the client belongs to
+ * @list:  List node for this acrn_ioreq_client
+ * @is_default:    If this client is the default one
+ * @flags: Flags (A

[PATCH v5 14/17] virt: acrn: Introduce I/O ranges operation interfaces

2020-10-18 Thread shuo . a . liu

From: Shuo Liu 

An I/O request of a User VM, which is constructed by the hypervisor, is
distributed by the ACRN Hypervisor Service Module to the I/O client
that corresponds to the address range of the I/O request.

Each I/O client maintains a list of address ranges. Introduce
acrn_ioreq_range_{add,del}() to manage these address ranges.
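
As a rough userspace illustration of the per-client range list, the
sketch below keeps ranges in a singly linked list and adds a lookup
helper. It is an assumption-laden stand-in, not the kernel code: the
sk_* names are hypothetical, malloc/free replace kzalloc/kfree, there is
no locking, and sk_range_covers() is an invented helper showing how a
request address might be matched against the list.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

struct sk_range {
	struct sk_range *next;
	uint32_t type;
	uint64_t start, end;	/* inclusive bounds */
};

struct sk_client {
	struct sk_range *ranges;
};

/* Add [start, end] of the given type; returns 0 or a negative errno. */
static int sk_range_add(struct sk_client *c, uint32_t type,
			uint64_t start, uint64_t end)
{
	struct sk_range *r;

	assert(c != NULL);
	if (end < start)
		return -EINVAL;

	r = calloc(1, sizeof(*r));
	if (!r)
		return -ENOMEM;

	r->type = type;
	r->start = start;
	r->end = end;
	r->next = c->ranges;
	c->ranges = r;
	return 0;
}

/* Remove the first range that matches type, start and end exactly. */
static void sk_range_del(struct sk_client *c, uint32_t type,
			 uint64_t start, uint64_t end)
{
	struct sk_range **pp = &c->ranges;

	while (*pp) {
		struct sk_range *r = *pp;

		if (r->type == type && r->start == start && r->end == end) {
			*pp = r->next;
			free(r);
			return;
		}
		pp = &r->next;
	}
}

/* Hypothetical lookup: does any range of this type cover addr? */
static int sk_range_covers(const struct sk_client *c, uint32_t type,
			   uint64_t addr)
{
	for (const struct sk_range *r = c->ranges; r; r = r->next)
		if (r->type == type && addr >= r->start && addr <= r->end)
			return 1;
	return 0;
}
```

Deletion uses exact matching on all three fields, the same rule
acrn_ioreq_range_del() applies in the diff below.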

Signed-off-by: Shuo Liu 
Reviewed-by: Reinette Chatre 
Cc: Zhi Wang 
Cc: Zhenyu Wang 
Cc: Yu Wang 
Cc: Reinette Chatre 
Cc: Greg Kroah-Hartman 
---
 drivers/virt/acrn/acrn_drv.h |  4 +++
 drivers/virt/acrn/ioreq.c| 60 
 2 files changed, 64 insertions(+)

diff --git a/drivers/virt/acrn/acrn_drv.h b/drivers/virt/acrn/acrn_drv.h
index 701c83319115..5b824fa1ee57 100644
--- a/drivers/virt/acrn/acrn_drv.h
+++ b/drivers/virt/acrn/acrn_drv.h
@@ -199,6 +199,10 @@ struct acrn_ioreq_client *acrn_ioreq_client_create(struct acrn_vm *vm,
   void *data, bool is_default,
   const char *name);
 void acrn_ioreq_client_destroy(struct acrn_ioreq_client *client);
+int acrn_ioreq_range_add(struct acrn_ioreq_client *client,
+u32 type, u64 start, u64 end);
+void acrn_ioreq_range_del(struct acrn_ioreq_client *client,
+ u32 type, u64 start, u64 end);
 
 int acrn_msi_inject(struct acrn_vm *vm, u64 msi_addr, u64 msi_data);
 
diff --git a/drivers/virt/acrn/ioreq.c b/drivers/virt/acrn/ioreq.c
index b2dbac205078..aa21c8ccc63a 100644
--- a/drivers/virt/acrn/ioreq.c
+++ b/drivers/virt/acrn/ioreq.c
@@ -101,6 +101,66 @@ int acrn_ioreq_request_default_complete(struct acrn_vm *vm, u16 vcpu)
return ret;
 }
 
+/**
+ * acrn_ioreq_range_add() - Add an iorange monitored by an ioreq client
+ * @client: The ioreq client
+ * @type:  Type (ACRN_IOREQ_TYPE_MMIO or ACRN_IOREQ_TYPE_PORTIO)
+ * @start: Start address of iorange
+ * @end:   End address of iorange
+ *
+ * Return: 0 on success, <0 on error
+ */
+int acrn_ioreq_range_add(struct acrn_ioreq_client *client,
+u32 type, u64 start, u64 end)
+{
+   struct acrn_ioreq_range *range;
+
+   if (end < start) {
+   dev_err(acrn_dev.this_device,
+   "Invalid IO range [0x%llx,0x%llx]\n", start, end);
+   return -EINVAL;
+   }
+
+   range = kzalloc(sizeof(*range), GFP_KERNEL);
+   if (!range)
+   return -ENOMEM;
+
+   range->type = type;
+   range->start = start;
+   range->end = end;
+
+   write_lock_bh(&client->range_lock);
+   list_add(&range->list, &client->range_list);
+   write_unlock_bh(&client->range_lock);
+
+   return 0;
+}
+
+/**
+ * acrn_ioreq_range_del() - Del an iorange monitored by an ioreq client
+ * @client: The ioreq client
+ * @type:  Type (ACRN_IOREQ_TYPE_MMIO or ACRN_IOREQ_TYPE_PORTIO)
+ * @start: Start address of iorange
+ * @end:   End address of iorange
+ */
+void acrn_ioreq_range_del(struct acrn_ioreq_client *client,
+ u32 type, u64 start, u64 end)
+{
+   struct acrn_ioreq_range *range;
+
+   write_lock_bh(&client->range_lock);
+   list_for_each_entry(range, &client->range_list, list) {
+   if (type == range->type &&
+   start == range->start &&
+   end == range->end) {
+   list_del(&range->list);
+   kfree(range);
+   break;
+   }
+   }
+   write_unlock_bh(&client->range_lock);
+}
+
 /*
  * ioreq_task() is the execution entity of handler thread of an I/O client.
  * The handler callback of the I/O client is called within the handler thread.
-- 
2.28.0

[PATCH v5 12/17] virt: acrn: Introduce interrupt injection interfaces

2020-10-18 Thread shuo . a . liu

From: Shuo Liu 

ACRN userspace needs to inject virtual interrupts into a User VM for
device emulation.

HSM needs to provide interfaces to do so.

Introduce the following interrupt injection interfaces:

ioctl ACRN_IOCTL_SET_IRQLINE:
  Pass data from userspace to the hypervisor, and inform the hypervisor
  to inject a virtual IOAPIC GSI interrupt to a User VM.

ioctl ACRN_IOCTL_INJECT_MSI:
  Pass data struct acrn_msi_entry from userspace to the hypervisor, and
  inform the hypervisor to inject a virtual MSI to a User VM.

ioctl ACRN_IOCTL_VM_INTR_MONITOR:
  Set a 4-Kbyte aligned shared page for statistics information of
  interrupts of a User VM.

Signed-off-by: Shuo Liu 
Reviewed-by: Zhi Wang 
Reviewed-by: Reinette Chatre 
Cc: Zhi Wang 
Cc: Zhenyu Wang 
Cc: Yu Wang 
Cc: Reinette Chatre 
Cc: Greg Kroah-Hartman 
---
 drivers/virt/acrn/acrn_drv.h  |  4 
 drivers/virt/acrn/hsm.c   | 39 +
 drivers/virt/acrn/hypercall.h | 41 +++
 drivers/virt/acrn/vm.c| 36 ++
 include/uapi/linux/acrn.h | 17 +++
 5 files changed, 137 insertions(+)

diff --git a/drivers/virt/acrn/acrn_drv.h b/drivers/virt/acrn/acrn_drv.h
index 97d2aab8b70a..701c83319115 100644
--- a/drivers/virt/acrn/acrn_drv.h
+++ b/drivers/virt/acrn/acrn_drv.h
@@ -157,6 +157,7 @@ extern rwlock_t acrn_vm_list_lock;
  * @ioreq_buf: I/O request shared buffer
  * @ioreq_page:    The page of the I/O request shared buffer
  * @pci_conf_addr: Address of a PCI configuration access emulation
+ * @monitor_page:  Page of interrupt statistics of User VM
  */
 struct acrn_vm {
    struct list_head        list;
@@ -172,6 +173,7 @@ struct acrn_vm {
struct acrn_io_request_buffer   *ioreq_buf;
struct page *ioreq_page;
u32 pci_conf_addr;
+   struct page *monitor_page;
 };
 
 struct acrn_vm *acrn_vm_create(struct acrn_vm *vm,
@@ -198,4 +200,6 @@ struct acrn_ioreq_client *acrn_ioreq_client_create(struct acrn_vm *vm,
   const char *name);
 void acrn_ioreq_client_destroy(struct acrn_ioreq_client *client);
 
+int acrn_msi_inject(struct acrn_vm *vm, u64 msi_addr, u64 msi_data);
+
 #endif /* __ACRN_HSM_DRV_H */
diff --git a/drivers/virt/acrn/hsm.c b/drivers/virt/acrn/hsm.c
index e72837adba21..5c38aa841b63 100644
--- a/drivers/virt/acrn/hsm.c
+++ b/drivers/virt/acrn/hsm.c
@@ -51,7 +51,9 @@ static long acrn_dev_ioctl(struct file *filp, unsigned int cmd,
struct acrn_ioreq_notify notify;
struct acrn_ptdev_irq *irq_info;
struct acrn_vm_memmap memmap;
+   struct acrn_msi_entry *msi;
struct acrn_pcidev *pcidev;
+   struct page *page;
int ret = 0;
 
if (vm->vmid == ACRN_INVALID_VMID && cmd != ACRN_IOCTL_CREATE_VM) {
@@ -178,6 +180,43 @@ static long acrn_dev_ioctl(struct file *filp, unsigned int cmd,
"Failed to reset intr for ptdev!\n");
kfree(irq_info);
break;
+   case ACRN_IOCTL_SET_IRQLINE:
+   ret = hcall_set_irqline(vm->vmid, ioctl_param);
+   if (ret < 0)
+   dev_dbg(acrn_dev.this_device,
+   "Failed to set interrupt line!\n");
+   break;
+   case ACRN_IOCTL_INJECT_MSI:
+   msi = memdup_user((void __user *)ioctl_param,
+ sizeof(struct acrn_msi_entry));
+   if (IS_ERR(msi))
+   return PTR_ERR(msi);
+
+   ret = hcall_inject_msi(vm->vmid, virt_to_phys(msi));
+   if (ret < 0)
+   dev_dbg(acrn_dev.this_device,
+   "Failed to inject MSI!\n");
+   kfree(msi);
+   break;
+   case ACRN_IOCTL_VM_INTR_MONITOR:
+   ret = get_user_pages_fast(ioctl_param, 1, FOLL_WRITE, &page);
+   if (unlikely(ret != 1)) {
+   dev_dbg(acrn_dev.this_device,
+   "Failed to pin intr hdr buffer!\n");
+   return -EFAULT;
+   }
+
+   ret = hcall_vm_intr_monitor(vm->vmid, page_to_phys(page));
+   if (ret < 0) {
+   put_page(page);
+   dev_dbg(acrn_dev.this_device,
+   "Failed to monitor intr data!\n");
+   return ret;
+   }
+   if (vm->monitor_page)
+   put_page(vm->monitor_page);
+   vm->monitor_page = page;
+   break;
case ACRN_IOCTL_CREATE_IOREQ_CLIENT:
if (vm->default_client)
return -EEXIST;
diff --git a/drivers/virt/acrn/hypercall.h b/drivers/virt/acrn/hypercall.h
index f448301832cf.

[PATCH v5 08/17] virt: acrn: Introduce EPT mapping management

2020-10-18 Thread shuo . a . liu

From: Shuo Liu 

The HSM provides hypervisor services to the ACRN userspace. While
launching a User VM, ACRN userspace needs to allocate memory and request
the ACRN Hypervisor to set up the EPT mapping for the VM.

A mapping cache is introduced for accelerating the translation between
the Service VM kernel virtual address and User VM physical address.

From the perspective of the hypervisor, the GPA ranges of a User VM fall
into the following types:
   1) RAM region, which is used by the User VM as system RAM.
   2) MMIO region, which is recognized by the User VM as MMIO. MMIO regions
      are used for device emulation.

Generally, the mappings of User VM RAM regions are set up before the VM
starts and are released when the User VM is destroyed. The mappings of MMIO
regions may be set and unset dynamically while the User VM is running.

To achieve this, ioctls ACRN_IOCTL_SET_MEMSEG and ACRN_IOCTL_UNSET_MEMSEG
are introduced in HSM.
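
To illustrate what the mapping cache buys, here is a minimal userspace
sketch of translating a User VM GPA to a Service VM virtual address by
scanning cached mappings. It assumes a flat array of entries and a
hypothetical helper name (sk_gpa_to_va); the field names follow struct
vm_memory_mapping from this patch, but the lookup itself is a stand-in
and not the code in mm.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Trimmed, userspace-only stand-in for struct vm_memory_mapping. */
struct sk_mapping {
	void *service_vm_va;	/* start of the region in the Service VM */
	uint64_t user_vm_pa;	/* start of the region in the User VM */
	size_t size;		/* region size in bytes */
};

/*
 * Translate [gpa, gpa + len) to a Service VM virtual address.
 * Returns NULL unless one cached mapping covers the whole span.
 */
static void *sk_gpa_to_va(const struct sk_mapping *maps, size_t n,
			  uint64_t gpa, size_t len)
{
	assert(maps != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		const struct sk_mapping *m = &maps[i];
		uint64_t off;

		if (gpa < m->user_vm_pa)
			continue;
		off = gpa - m->user_vm_pa;
		/* Written to avoid overflow in off + len. */
		if (off >= m->size || len > m->size - off)
			continue;
		return (char *)m->service_vm_va + off;
	}
	return NULL;
}
```

A linear scan is enough for a sketch, since the patch caps the cache at
ACRN_MEM_MAPPING_MAX (256) entries.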

Signed-off-by: Shuo Liu 
Reviewed-by: Zhi Wang 
Reviewed-by: Reinette Chatre 
Cc: Zhi Wang 
Cc: Zhenyu Wang 
Cc: Yu Wang 
Cc: Reinette Chatre 
Cc: Greg Kroah-Hartman 
---
 drivers/virt/acrn/Makefile|   2 +-
 drivers/virt/acrn/acrn_drv.h  |  98 ++-
 drivers/virt/acrn/hsm.c   |  15 ++
 drivers/virt/acrn/hypercall.h |  14 ++
 drivers/virt/acrn/mm.c| 305 ++
 drivers/virt/acrn/vm.c|   4 +
 include/uapi/linux/acrn.h |  51 ++
 7 files changed, 479 insertions(+), 10 deletions(-)
 create mode 100644 drivers/virt/acrn/mm.c

diff --git a/drivers/virt/acrn/Makefile b/drivers/virt/acrn/Makefile
index cf8b4ed5e74e..38bc44b6edcd 100644
--- a/drivers/virt/acrn/Makefile
+++ b/drivers/virt/acrn/Makefile
@@ -1,3 +1,3 @@
 # SPDX-License-Identifier: GPL-2.0
 obj-$(CONFIG_ACRN_HSM) := acrn.o
-acrn-y := hsm.o vm.o
+acrn-y := hsm.o vm.o mm.o
diff --git a/drivers/virt/acrn/acrn_drv.h b/drivers/virt/acrn/acrn_drv.h
index e5aba86cad8c..4e86b97371ba 100644
--- a/drivers/virt/acrn/acrn_drv.h
+++ b/drivers/virt/acrn/acrn_drv.h
@@ -12,26 +12,106 @@
 
 extern struct miscdevice acrn_dev;
 
+#define ACRN_MEM_MAPPING_MAX   256
+
+#define ACRN_MEM_REGION_ADD    0
+#define ACRN_MEM_REGION_DEL    2
+/**
+ * struct vm_memory_region_op - Hypervisor memory operation
+ * @type:  Operation type (ACRN_MEM_REGION_*)
+ * @attr:  Memory attribute (ACRN_MEM_TYPE_* | ACRN_MEM_ACCESS_*)
+ * @user_vm_pa:    Physical address of User VM to be mapped.
+ * @service_vm_pa: Physical address of Service VM to be mapped.
+ * @size:  Size of this region.
+ *
+ * Structure containing needed information that is provided to ACRN Hypervisor
+ * to manage the EPT mappings of a single memory region of the User VM. Several
+ * &struct vm_memory_region_op can be batched to ACRN Hypervisor, see &struct
+ * vm_memory_region_batch.
+ */
+struct vm_memory_region_op {
+   u32 type;
+   u32 attr;
+   u64 user_vm_pa;
+   u64 service_vm_pa;
+   u64 size;
+};
+
+/**
+ * struct vm_memory_region_batch - A batch of vm_memory_region_op.
+ * @vmid:  A User VM ID.
+ * @reserved:  Reserved.
+ * @regions_num:   The number of vm_memory_region_op.
+ * @reserved1: Reserved.
+ * @regions_gpa:   Physical address of a vm_memory_region_op array.
+ *
+ * HC_VM_SET_MEMORY_REGIONS uses this structure to manage EPT mappings of
+ * multiple memory regions of a User VM. A &struct vm_memory_region_batch
+ * contains multiple &struct vm_memory_region_op for batch processing in the
+ * ACRN Hypervisor.
+ */
+struct vm_memory_region_batch {
+   u16 vmid;
+   u16 reserved[3];
+   u32 regions_num;
+   u32 reserved1;
+   u64 regions_gpa;
+};
+
+/**
+ * struct vm_memory_mapping - Memory map between a User VM and the Service VM
+ * @pages: Pages in Service VM kernel.
+ * @npages:        Number of pages.
+ * @service_vm_va: Virtual address in Service VM kernel.
+ * @user_vm_pa:    Physical address in User VM.
+ * @size:  Size of this memory region.
+ *
+ * HSM maintains memory mappings between a User VM GPA and the Service VM
+ * kernel VA for accelerating the User VM GPA translation.
+ */
+struct vm_memory_mapping {
+   struct page **pages;
+   int npages;
+   void    *service_vm_va;
+   u64 user_vm_pa;
+   size_t  size;
+};
+
 #define ACRN_INVALID_VMID (0xffffU)
 
 #define ACRN_VM_FLAG_DESTROYED 0U
 /**
  * struct acrn_vm - Properties of ACRN User VM.
- * @list:  Entry within global list of all VMs
- * @vmid:  User VM ID
- * @vcpu_num:  Number of virtual CPUs in the VM
- * @flags: Flags (ACRN_VM_FLAG_*) of the VM. This is VM flag management
- * in HSM which is different from the &acrn_vm_creation.vm_flag.
+ * @list:  Entry within global list of all VMs.
+ * @vmid:  User VM ID.
+ * @vcpu_num:

[PATCH v5 13/17] virt: acrn: Introduce interfaces to query C-states and P-states allowed by hypervisor

2020-10-18 Thread shuo . a . liu

From: Shuo Liu 

The C-states and P-states data are used to support CPU power management.
The hypervisor controls C-states and P-states for a User VM.

ACRN userspace needs to query the data from the hypervisor to build ACPI
tables for a User VM.

HSM provides ioctls for ACRN userspace to query C-states and P-states
data obtained from the hypervisor.

Signed-off-by: Shuo Liu 
Reviewed-by: Zhi Wang 
Reviewed-by: Reinette Chatre 
Cc: Zhi Wang 
Cc: Zhenyu Wang 
Cc: Yu Wang 
Cc: Reinette Chatre 
Cc: Greg Kroah-Hartman 
---
 drivers/virt/acrn/hsm.c   | 69 +++
 drivers/virt/acrn/hypercall.h | 12 ++
 include/uapi/linux/acrn.h | 35 ++
 3 files changed, 116 insertions(+)

diff --git a/drivers/virt/acrn/hsm.c b/drivers/virt/acrn/hsm.c
index 5c38aa841b63..9565b4d64ab7 100644
--- a/drivers/virt/acrn/hsm.c
+++ b/drivers/virt/acrn/hsm.c
@@ -38,6 +38,67 @@ static int acrn_dev_open(struct inode *inode, struct file *filp)
return 0;
 }
 
+static int pmcmd_ioctl(u64 cmd, void __user *uptr)
+{
+   struct acrn_pstate_data *px_data;
+   struct acrn_cstate_data *cx_data;
+   u64 *pm_info;
+   int ret = 0;
+
+   switch (cmd & PMCMD_TYPE_MASK) {
+   case ACRN_PMCMD_GET_PX_CNT:
+   case ACRN_PMCMD_GET_CX_CNT:
+   pm_info = kmalloc(sizeof(u64), GFP_KERNEL);
+   if (!pm_info)
+   return -ENOMEM;
+
+   ret = hcall_get_cpu_state(cmd, virt_to_phys(pm_info));
+   if (ret < 0) {
+   kfree(pm_info);
+   break;
+   }
+
+   if (copy_to_user(uptr, pm_info, sizeof(u64)))
+   ret = -EFAULT;
+   kfree(pm_info);
+   break;
+   case ACRN_PMCMD_GET_PX_DATA:
+   px_data = kmalloc(sizeof(*px_data), GFP_KERNEL);
+   if (!px_data)
+   return -ENOMEM;
+
+   ret = hcall_get_cpu_state(cmd, virt_to_phys(px_data));
+   if (ret < 0) {
+   kfree(px_data);
+   break;
+   }
+
+   if (copy_to_user(uptr, px_data, sizeof(*px_data)))
+   ret = -EFAULT;
+   kfree(px_data);
+   break;
+   case ACRN_PMCMD_GET_CX_DATA:
+   cx_data = kmalloc(sizeof(*cx_data), GFP_KERNEL);
+   if (!cx_data)
+   return -ENOMEM;
+
+   ret = hcall_get_cpu_state(cmd, virt_to_phys(cx_data));
+   if (ret < 0) {
+   kfree(cx_data);
+   break;
+   }
+
+   if (copy_to_user(uptr, cx_data, sizeof(*cx_data)))
+   ret = -EFAULT;
+   kfree(cx_data);
+   break;
+   default:
+   break;
+   }
+
+   return ret;
+}
+
 /*
  * HSM relies on hypercall layer of the ACRN hypervisor to do the
  * sanity check against the input parameters.
@@ -54,6 +115,7 @@ static long acrn_dev_ioctl(struct file *filp, unsigned int cmd,
struct acrn_msi_entry *msi;
struct acrn_pcidev *pcidev;
struct page *page;
+   u64 cstate_cmd;
int ret = 0;
 
if (vm->vmid == ACRN_INVALID_VMID && cmd != ACRN_IOCTL_CREATE_VM) {
@@ -240,6 +302,13 @@ static long acrn_dev_ioctl(struct file *filp, unsigned int cmd,
case ACRN_IOCTL_CLEAR_VM_IOREQ:
acrn_ioreq_request_clear(vm);
break;
+   case ACRN_IOCTL_PM_GET_CPU_STATE:
+   if (copy_from_user(&cstate_cmd, (void __user *)ioctl_param,
+  sizeof(cstate_cmd)))
+   return -EFAULT;
+
+   ret = pmcmd_ioctl(cstate_cmd, (void __user *)ioctl_param);
+   break;
default:
dev_dbg(acrn_dev.this_device, "Unknown IOCTL 0x%x!\n", cmd);
ret = -ENOTTY;
diff --git a/drivers/virt/acrn/hypercall.h b/drivers/virt/acrn/hypercall.h
index a8813397a3fe..e640632366f0 100644
--- a/drivers/virt/acrn/hypercall.h
+++ b/drivers/virt/acrn/hypercall.h
@@ -39,6 +39,9 @@
 #define HC_ASSIGN_PCIDEV   _HC_ID(HC_ID, HC_ID_PCI_BASE + 0x05)
 #define HC_DEASSIGN_PCIDEV _HC_ID(HC_ID, HC_ID_PCI_BASE + 0x06)
 
+#define HC_ID_PM_BASE  0x80UL
+#define HC_PM_GET_CPU_STATE    _HC_ID(HC_ID, HC_ID_PM_BASE + 0x00)
+
 /**
  * hcall_create_vm() - Create a User VM
  * @vminfo:Service VM GPA of info of User VM creation
@@ -225,4 +228,13 @@ static inline long hcall_reset_ptdev_intr(u64 vmid, u64 irq)
return acrn_hypercall2(HC_RESET_PTDEV_INTR, vmid, irq);
 }
 
+/*
+ * hcall_get_cpu_state() - Get P-states and C-states info from the hypervisor
+ * @state: Service VM GPA of buffer of P-states and C-states
+ */
+static inline long hcall_get_cpu_state(u64 cmd, u64 state)
+{
+   return acrn_hypercall2(HC_PM_GET_CPU_STATE, cmd, state);
+}
+
 #end

[PATCH v5 10/17] virt: acrn: Introduce PCI configuration space PIO accesses combiner

2020-10-18 Thread shuo . a . liu

From: Shuo Liu 

A User VM can access its virtual PCI configuration space via port I/O,
which takes two steps:
 1) write the address into port 0xCF8
 2) read or write the data via port 0xCFC

To distribute a complete PCI configuration space access in one go, HSM
needs to combine the two accesses.

Combine the two paired PIO I/O requests into one PCI I/O request and
continue the I/O request distribution.
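
The combining step amounts to decoding the address latched at 0xCF8
when the data port is touched. The sketch below shows that decoding in
userspace, assuming the conventional layout with bit 31 as the enable
bit, bus in bits 23:16, device in bits 15:11 and function in bits 10:8;
the register arithmetic mirrors handle_cf8cfc() in the diff. The sk_*
names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SK_CONF1_ENABLE 0x80000000u	/* assumed: bit 31 enables the access */

struct sk_pci_access {
	uint8_t bus, dev, func;
	uint16_t reg;
};

/*
 * Decode the address previously written to 0xCF8 for an access that
 * hits data_port (0xCFC..0xCFF). Returns false when the enable bit is
 * clear, in which case there is nothing to forward as a PCI request.
 */
static bool sk_decode_cf8(uint32_t conf_addr, uint16_t data_port,
			  struct sk_pci_access *out)
{
	assert(data_port >= 0xcfc && data_port < 0xcfc + 4);

	if (!(conf_addr & SK_CONF1_ENABLE))
		return false;

	out->bus = (conf_addr >> 16) & 0xff;
	out->dev = (conf_addr >> 11) & 0x1f;
	out->func = (conf_addr >> 8) & 0x7;
	/* Low register bits plus the high bits, as in the patch. */
	out->reg = (uint16_t)((conf_addr & 0xfc) +
			      ((conf_addr >> 16) & 0xf00) +
			      (data_port - 0xcfc));
	return true;
}
```

For instance, a latched address of 0x80011310 followed by an access to
port 0xCFE decodes to bus 1, device 2, function 3, register 0x12.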

Signed-off-by: Shuo Liu 
Reviewed-by: Reinette Chatre 
Cc: Greg Kroah-Hartman 
---
 drivers/virt/acrn/acrn_drv.h |  2 +
 drivers/virt/acrn/ioreq.c| 76 
 include/uapi/linux/acrn.h| 15 +++
 3 files changed, 93 insertions(+)

diff --git a/drivers/virt/acrn/acrn_drv.h b/drivers/virt/acrn/acrn_drv.h
index cf9143cf760d..97d2aab8b70a 100644
--- a/drivers/virt/acrn/acrn_drv.h
+++ b/drivers/virt/acrn/acrn_drv.h
@@ -156,6 +156,7 @@ extern rwlock_t acrn_vm_list_lock;
  * @default_client:The default I/O request client
  * @ioreq_buf: I/O request shared buffer
  * @ioreq_page:    The page of the I/O request shared buffer
+ * @pci_conf_addr: Address of a PCI configuration access emulation
  */
 struct acrn_vm {
    struct list_head        list;
@@ -170,6 +171,7 @@ struct acrn_vm {
    struct acrn_ioreq_client        *default_client;
struct acrn_io_request_buffer   *ioreq_buf;
struct page *ioreq_page;
+   u32 pci_conf_addr;
 };
 
 struct acrn_vm *acrn_vm_create(struct acrn_vm *vm,
diff --git a/drivers/virt/acrn/ioreq.c b/drivers/virt/acrn/ioreq.c
index 95df44049cca..b2dbac205078 100644
--- a/drivers/virt/acrn/ioreq.c
+++ b/drivers/virt/acrn/ioreq.c
@@ -221,6 +221,80 @@ int acrn_ioreq_client_wait(struct acrn_ioreq_client *client)
return 0;
 }
 
+static bool is_cfg_addr(struct acrn_io_request *req)
+{
+   return ((req->type == ACRN_IOREQ_TYPE_PORTIO) &&
+   (req->reqs.pio_request.address == 0xcf8));
+}
+
+static bool is_cfg_data(struct acrn_io_request *req)
+{
+   return ((req->type == ACRN_IOREQ_TYPE_PORTIO) &&
+   ((req->reqs.pio_request.address >= 0xcfc) &&
+(req->reqs.pio_request.address < (0xcfc + 4))));
+}
+
+/* The low 8-bit of supported pci_reg addr.*/
+#define PCI_LOWREG_MASK  0xFC
+/* The high 4-bit of supported pci_reg addr */
+#define PCI_HIGHREG_MASK 0xF00
+/* Max number of supported functions */
+#define PCI_FUNCMAX    7
+/* Max number of supported slots */
+#define PCI_SLOTMAX    31
+/* Max number of supported buses */
+#define PCI_BUSMAX     255
+#define CONF1_ENABLE   0x80000000UL
+/*
+ * A PCI configuration space access via PIO 0xCF8 and 0xCFC normally has two
+ * following steps:
+ *   1) writes address into 0xCF8 port
+ *   2) accesses data in/from 0xCFC
+ * This function combines such paired PCI configuration space I/O requests into
+ * one ACRN_IOREQ_TYPE_PCICFG type I/O request and continues the processing.
+ */
+static bool handle_cf8cfc(struct acrn_vm *vm,
+ struct acrn_io_request *req, u16 vcpu)
+{
+   int offset, pci_cfg_addr, pci_reg;
+   bool is_handled = false;
+
+   if (is_cfg_addr(req)) {
+   WARN_ON(req->reqs.pio_request.size != 4);
+   if (req->reqs.pio_request.direction == ACRN_IOREQ_DIR_WRITE)
+   vm->pci_conf_addr = req->reqs.pio_request.value;
+   else
+   req->reqs.pio_request.value = vm->pci_conf_addr;
+   is_handled = true;
+   } else if (is_cfg_data(req)) {
+   if (!(vm->pci_conf_addr & CONF1_ENABLE)) {
+   if (req->reqs.pio_request.direction ==
+   ACRN_IOREQ_DIR_READ)
+   req->reqs.pio_request.value = 0xffffffff;
+   is_handled = true;
+   } else {
+   offset = req->reqs.pio_request.address - 0xcfc;
+
+   req->type = ACRN_IOREQ_TYPE_PCICFG;
+   pci_cfg_addr = vm->pci_conf_addr;
+   req->reqs.pci_request.bus =
+   (pci_cfg_addr >> 16) & PCI_BUSMAX;
+   req->reqs.pci_request.dev =
+   (pci_cfg_addr >> 11) & PCI_SLOTMAX;
+   req->reqs.pci_request.func =
+   (pci_cfg_addr >> 8) & PCI_FUNCMAX;
+   pci_reg = (pci_cfg_addr & PCI_LOWREG_MASK) +
+  ((pci_cfg_addr >> 16) & PCI_HIGHREG_MASK);
+   req->reqs.pci_request.reg = pci_reg + offset;
+   }
+   }
+
+   if (is_handled)
+   ioreq_complete_request(vm, vcpu, req);
+
+   return is_handled;
+}
+
 static bool in_range(struct acrn_ioreq_range *range,
 struct acrn_io_request *req)
 {
@

[PATCH v5 11/17] virt: acrn: Introduce interfaces for PCI device passthrough

2020-10-18 Thread shuo . a . liu

From: Shuo Liu 

PCI device passthrough enables an OS in a virtual machine to directly
access a PCI device in the host. It promises near-native performance,
which is required in performance-critical scenarios of ACRN.

HSM provides the following ioctls:
 - Assign - ACRN_IOCTL_ASSIGN_PCIDEV
   Pass data struct acrn_pcidev from userspace to the hypervisor, and
   inform the hypervisor to assign a PCI device to a User VM.

 - De-assign - ACRN_IOCTL_DEASSIGN_PCIDEV
   Pass data struct acrn_pcidev from userspace to the hypervisor, and
   inform the hypervisor to de-assign a PCI device from a User VM.

 - Set an interrupt of a passthrough device - ACRN_IOCTL_SET_PTDEV_INTR
   Pass data struct acrn_ptdev_irq from userspace to the hypervisor,
   and inform the hypervisor to map an INTx interrupt of a passthrough
   device of a User VM.

 - Reset passthrough device interrupt - ACRN_IOCTL_RESET_PTDEV_INTR
   Pass data struct acrn_ptdev_irq from userspace to the hypervisor,
   and inform the hypervisor to unmap an INTx interrupt of a passthrough
   device of a User VM.

Signed-off-by: Shuo Liu 
Reviewed-by: Zhi Wang 
Reviewed-by: Reinette Chatre 
Cc: Zhi Wang 
Cc: Zhenyu Wang 
Cc: Yu Wang 
Cc: Reinette Chatre 
Cc: Greg Kroah-Hartman 
---
 drivers/virt/acrn/hsm.c   | 50 +++
 drivers/virt/acrn/hypercall.h | 54 ++
 include/uapi/linux/acrn.h | 63 +++
 3 files changed, 167 insertions(+)

diff --git a/drivers/virt/acrn/hsm.c b/drivers/virt/acrn/hsm.c
index 15cfb4fb5f1b..e72837adba21 100644
--- a/drivers/virt/acrn/hsm.c
+++ b/drivers/virt/acrn/hsm.c
@@ -49,7 +49,9 @@ static long acrn_dev_ioctl(struct file *filp, unsigned int cmd,
struct acrn_vm_creation *vm_param;
struct acrn_vcpu_regs *cpu_regs;
struct acrn_ioreq_notify notify;
+   struct acrn_ptdev_irq *irq_info;
struct acrn_vm_memmap memmap;
+   struct acrn_pcidev *pcidev;
int ret = 0;
 
if (vm->vmid == ACRN_INVALID_VMID && cmd != ACRN_IOCTL_CREATE_VM) {
@@ -128,6 +130,54 @@ static long acrn_dev_ioctl(struct file *filp, unsigned int cmd,
 
ret = acrn_vm_memseg_unmap(vm, &memmap);
break;
+   case ACRN_IOCTL_ASSIGN_PCIDEV:
+   pcidev = memdup_user((void __user *)ioctl_param,
+sizeof(struct acrn_pcidev));
+   if (IS_ERR(pcidev))
+   return PTR_ERR(pcidev);
+
+   ret = hcall_assign_pcidev(vm->vmid, virt_to_phys(pcidev));
+   if (ret < 0)
+   dev_dbg(acrn_dev.this_device,
+   "Failed to assign pci device!\n");
+   kfree(pcidev);
+   break;
+   case ACRN_IOCTL_DEASSIGN_PCIDEV:
+   pcidev = memdup_user((void __user *)ioctl_param,
+sizeof(struct acrn_pcidev));
+   if (IS_ERR(pcidev))
+   return PTR_ERR(pcidev);
+
+   ret = hcall_deassign_pcidev(vm->vmid, virt_to_phys(pcidev));
+   if (ret < 0)
+   dev_dbg(acrn_dev.this_device,
+   "Failed to deassign pci device!\n");
+   kfree(pcidev);
+   break;
+   case ACRN_IOCTL_SET_PTDEV_INTR:
+   irq_info = memdup_user((void __user *)ioctl_param,
+  sizeof(struct acrn_ptdev_irq));
+   if (IS_ERR(irq_info))
+   return PTR_ERR(irq_info);
+
+   ret = hcall_set_ptdev_intr(vm->vmid, virt_to_phys(irq_info));
+   if (ret < 0)
+   dev_dbg(acrn_dev.this_device,
+   "Failed to configure intr for ptdev!\n");
+   kfree(irq_info);
+   break;
+   case ACRN_IOCTL_RESET_PTDEV_INTR:
+   irq_info = memdup_user((void __user *)ioctl_param,
+  sizeof(struct acrn_ptdev_irq));
+   if (IS_ERR(irq_info))
+   return PTR_ERR(irq_info);
+
+   ret = hcall_reset_ptdev_intr(vm->vmid, virt_to_phys(irq_info));
+   if (ret < 0)
+   dev_dbg(acrn_dev.this_device,
+   "Failed to reset intr for ptdev!\n");
+   kfree(irq_info);
+   break;
case ACRN_IOCTL_CREATE_IOREQ_CLIENT:
if (vm->default_client)
return -EEXIST;
diff --git a/drivers/virt/acrn/hypercall.h b/drivers/virt/acrn/hypercall.h
index 5eba29e3ed38..f448301832cf 100644
--- a/drivers/virt/acrn/hypercall.h
+++ b/drivers/virt/acrn/hypercall.h
@@ -28,6 +28,12 @@
 #define HC_ID_MEM_BASE 0x40UL
 #define HC_VM_SET_MEMORY_REGIONS   _HC_ID(HC_ID, HC_ID_MEM_BASE + 0x02)
 
+#define HC_ID_PCI_BASE 0x50UL
+#define HC_SET_PTDEV_INTR  _HC_ID(HC_ID, HC_ID_PCI_BA

[PATCH v5 07/17] virt: acrn: Introduce an ioctl to set vCPU registers state

2020-10-18 Thread shuo . a . liu

From: Shuo Liu 

Each virtual CPU of a User VM has its own context, defined by its
register state. ACRN userspace needs to set the virtual CPU register
state (e.g. giving an initial register state to the virtual BSP of a
User VM).

HSM provides an ioctl ACRN_IOCTL_SET_VCPU_REGS to set the virtual CPU
register state. The ioctl passes the register state from ACRN
userspace to the hypervisor directly.

Signed-off-by: Shuo Liu 
Reviewed-by: Zhi Wang 
Reviewed-by: Reinette Chatre 
Cc: Zhi Wang 
Cc: Zhenyu Wang 
Cc: Yu Wang 
Cc: Reinette Chatre 
Cc: Greg Kroah-Hartman 
---
 drivers/virt/acrn/hsm.c   | 15 
 drivers/virt/acrn/hypercall.h | 13 +++
 include/uapi/linux/acrn.h | 71 +++
 3 files changed, 99 insertions(+)

diff --git a/drivers/virt/acrn/hsm.c b/drivers/virt/acrn/hsm.c
index cbda67d4eb89..58ceb02e82db 100644
--- a/drivers/virt/acrn/hsm.c
+++ b/drivers/virt/acrn/hsm.c
@@ -9,6 +9,7 @@
  * Yakui Zhao 
  */
 
+#include 
 #include 
 #include 
 #include 
@@ -46,6 +47,7 @@ static long acrn_dev_ioctl(struct file *filp, unsigned int cmd,
 {
struct acrn_vm *vm = filp->private_data;
struct acrn_vm_creation *vm_param;
+   struct acrn_vcpu_regs *cpu_regs;
int ret = 0;
 
if (vm->vmid == ACRN_INVALID_VMID && cmd != ACRN_IOCTL_CREATE_VM) {
@@ -97,6 +99,19 @@ static long acrn_dev_ioctl(struct file *filp, unsigned int cmd,
case ACRN_IOCTL_DESTROY_VM:
ret = acrn_vm_destroy(vm);
break;
+   case ACRN_IOCTL_SET_VCPU_REGS:
+   cpu_regs = memdup_user((void __user *)ioctl_param,
+  sizeof(struct acrn_vcpu_regs));
+   if (IS_ERR(cpu_regs))
+   return PTR_ERR(cpu_regs);
+
+   ret = hcall_set_vcpu_regs(vm->vmid, virt_to_phys(cpu_regs));
+   if (ret < 0)
+   dev_dbg(acrn_dev.this_device,
+   "Failed to set regs state of VM%u!\n",
+   vm->vmid);
+   kfree(cpu_regs);
+   break;
default:
dev_dbg(acrn_dev.this_device, "Unknown IOCTL 0x%x!\n", cmd);
ret = -ENOTTY;
diff --git a/drivers/virt/acrn/hypercall.h b/drivers/virt/acrn/hypercall.h
index 426b66cadb1f..f29cfae08862 100644
--- a/drivers/virt/acrn/hypercall.h
+++ b/drivers/virt/acrn/hypercall.h
@@ -19,6 +19,7 @@
 #define HC_START_VM    _HC_ID(HC_ID, HC_ID_VM_BASE + 0x02)
 #define HC_PAUSE_VM    _HC_ID(HC_ID, HC_ID_VM_BASE + 0x03)
 #define HC_RESET_VM    _HC_ID(HC_ID, HC_ID_VM_BASE + 0x05)
+#define HC_SET_VCPU_REGS   _HC_ID(HC_ID, HC_ID_VM_BASE + 0x06)
 
 /**
  * hcall_create_vm() - Create a User VM
@@ -75,4 +76,16 @@ static inline long hcall_reset_vm(u64 vmid)
return acrn_hypercall1(HC_RESET_VM, vmid);
 }
 
+/**
+ * hcall_set_vcpu_regs() - Set up registers of virtual CPU of a User VM
+ * @vmid:  User VM ID
+ * @regs_state:    Service VM GPA of registers state
+ *
+ * Return: 0 on success, <0 on failure
+ */
+static inline long hcall_set_vcpu_regs(u64 vmid, u64 regs_state)
+{
+   return acrn_hypercall2(HC_SET_VCPU_REGS, vmid, regs_state);
+}
+
 #endif /* __ACRN_HSM_HYPERCALL_H */
diff --git a/include/uapi/linux/acrn.h b/include/uapi/linux/acrn.h
index 364b1a783074..1d5b82e154fb 100644
--- a/include/uapi/linux/acrn.h
+++ b/include/uapi/linux/acrn.h
@@ -36,6 +36,75 @@ struct acrn_vm_creation {
    __u8    reserved2[8];
 } __attribute__((aligned(8)));
 
+struct acrn_gp_regs {
+   __u64   rax;
+   __u64   rcx;
+   __u64   rdx;
+   __u64   rbx;
+   __u64   rsp;
+   __u64   rbp;
+   __u64   rsi;
+   __u64   rdi;
+   __u64   r8;
+   __u64   r9;
+   __u64   r10;
+   __u64   r11;
+   __u64   r12;
+   __u64   r13;
+   __u64   r14;
+   __u64   r15;
+};
+
+struct acrn_descriptor_ptr {
+   __u16   limit;
+   __u64   base;
+   __u16   reserved[3];
+} __attribute__ ((__packed__));
+
+struct acrn_regs {
+   struct acrn_gp_regs gprs;
+   struct acrn_descriptor_ptr  gdt;
+   struct acrn_descriptor_ptr  idt;
+
+   __u64   rip;
+   __u64   cs_base;
+   __u64   cr0;
+   __u64   cr4;
+   __u64   cr3;
+   __u64   ia32_efer;
+   __u64   rflags;
+   __u64   reserved_64[4];
+
+   __u32   cs_ar;
+   __u32   cs_limit;
+   __u32   reserved_32[3];
+
+   __u16   cs_sel;
+   __u16   ss_sel;
+   __u16   ds_sel;
+   __u16   es_sel;
+

[PATCH v5 02/17] x86/acrn: Introduce acrn_{setup, remove}_intr_handler()

2020-10-18 Thread shuo . a . liu

From: Shuo Liu 

The ACRN Hypervisor builds an I/O request when a trapped I/O access
happens in User VM. Then, ACRN Hypervisor issues an upcall by sending
a notification interrupt to the Service VM. HSM in the Service VM needs
to hook the notification interrupt to handle I/O requests.

Notification interrupts from the ACRN Hypervisor are already supported,
and a callback, currently uninitialized, is called when one arrives.

Export two APIs for HSM to set up and remove its callback.
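
The pattern is a single function pointer with a setter and a clearer.
A minimal userspace sketch follows; the sk_* names are hypothetical,
sk_hv_callback() merely stands in for the interrupt entry point, and
the counter-based handler exists only to make the effect observable.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the callback invoked from the notification interrupt. */
static void (*sk_intr_handler)(void);

static void sk_setup_intr_handler(void (*handler)(void))
{
	assert(handler != NULL);	/* use sk_remove_intr_handler() to clear */
	sk_intr_handler = handler;
}

static void sk_remove_intr_handler(void)
{
	sk_intr_handler = NULL;
}

/* Stand-in for the interrupt path: call the handler if one is set. */
static void sk_hv_callback(void)
{
	if (sk_intr_handler)
		sk_intr_handler();
}

/* Example handler: count how many upcalls were delivered. */
static int sk_upcalls;

static void sk_count_upcall(void)
{
	sk_upcalls++;
}
```

An upcall that arrives before setup, or after removal, is simply
dropped in this sketch.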

Originally-by: Yakui Zhao 
Signed-off-by: Shuo Liu 
Reviewed-by: Zhi Wang 
Reviewed-by: Reinette Chatre 
Cc: Dave Hansen 
Cc: Sean Christopherson 
Cc: Dan Williams 
Cc: Fengwei Yin 
Cc: Zhi Wang 
Cc: Zhenyu Wang 
Cc: Yu Wang 
Cc: Reinette Chatre 
Cc: Greg Kroah-Hartman 
---
 arch/x86/include/asm/acrn.h |  8 
 arch/x86/kernel/cpu/acrn.c  | 14 ++
 2 files changed, 22 insertions(+)
 create mode 100644 arch/x86/include/asm/acrn.h

diff --git a/arch/x86/include/asm/acrn.h b/arch/x86/include/asm/acrn.h
new file mode 100644
index 000000000000..ff259b69cde7
--- /dev/null
+++ b/arch/x86/include/asm/acrn.h
@@ -0,0 +1,8 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_X86_ACRN_H
+#define _ASM_X86_ACRN_H
+
+void acrn_setup_intr_handler(void (*handler)(void));
+void acrn_remove_intr_handler(void);
+
+#endif /* _ASM_X86_ACRN_H */
diff --git a/arch/x86/kernel/cpu/acrn.c b/arch/x86/kernel/cpu/acrn.c
index 0b2c03943ac6..e0c181781905 100644
--- a/arch/x86/kernel/cpu/acrn.c
+++ b/arch/x86/kernel/cpu/acrn.c
@@ -10,6 +10,8 @@
  */
 
 #include 
+
+#include 
 #include 
 #include 
 #include 
@@ -55,6 +57,18 @@ DEFINE_IDTENTRY_SYSVEC(sysvec_acrn_hv_callback)
set_irq_regs(old_regs);
 }
 
+void acrn_setup_intr_handler(void (*handler)(void))
+{
+   acrn_intr_handler = handler;
+}
+EXPORT_SYMBOL_GPL(acrn_setup_intr_handler);
+
+void acrn_remove_intr_handler(void)
+{
+   acrn_intr_handler = NULL;
+}
+EXPORT_SYMBOL_GPL(acrn_remove_intr_handler);
+
 const __initconst struct hypervisor_x86 x86_hyper_acrn = {
.name   = "ACRN",
.detect = acrn_detect,
-- 
2.28.0

[PATCH v5 04/17] x86/acrn: Introduce hypercall interfaces

2020-10-18 Thread shuo . a . liu

From: Shuo Liu 

The Service VM communicates with the hypervisor via conventional
hypercalls. The VMCALL instruction is used to make the hypercalls.

ACRN hypercall ABI:
  * Hypercall number is in R8 register.
  * Up to 2 parameters are in RDI and RSI registers.
  * Return value is in RAX register.

Introduce the ACRN hypercall interfaces. Because GCC doesn't support the R8
register as a direct register constraint, use a supported constraint as
input with an explicit MOV to R8 at the beginning of the asm.

Originally-by: Yakui Zhao 
Signed-off-by: Shuo Liu 
Reviewed-by: Reinette Chatre 
Cc: Dave Hansen 
Cc: Sean Christopherson 
Cc: Dan Williams 
Cc: Fengwei Yin 
Cc: Zhi Wang 
Cc: Zhenyu Wang 
Cc: Yu Wang 
Cc: Reinette Chatre 
Cc: Greg Kroah-Hartman 
Cc: Borislav Petkov 
Cc: Arvind Sankar 
Cc: Peter Zijlstra 
Cc: Nick Desaulniers 
Cc: Segher Boessenkool 
---
 arch/x86/include/asm/acrn.h | 54 +
 1 file changed, 54 insertions(+)

diff --git a/arch/x86/include/asm/acrn.h b/arch/x86/include/asm/acrn.h
index a2d4aea3a80d..03e420245505 100644
--- a/arch/x86/include/asm/acrn.h
+++ b/arch/x86/include/asm/acrn.h
@@ -14,4 +14,58 @@ void acrn_setup_intr_handler(void (*handler)(void));
 void acrn_remove_intr_handler(void);
 bool acrn_is_privileged_vm(void);
 
+/*
+ * Hypercalls for ACRN
+ *
+ * - VMCALL instruction is used to implement ACRN hypercalls.
+ * - ACRN hypercall ABI:
+ *   - Hypercall number is passed in R8 register.
+ *   - Up to 2 arguments are passed in RDI, RSI.
+ *   - Return value will be placed in RAX.
+ *
+ * Because GCC doesn't support R8 register as direct register constraints, use
+ * supported constraint as input with a explicit MOV to R8 in beginning of asm.
+ */
+static inline long acrn_hypercall0(unsigned long hcall_id)
+{
+   long result;
+
+   asm volatile("movl %1, %%r8d\n\t"
+"vmcall\n\t"
+: "=a" (result)
+: "ir" (hcall_id)
+: "r8", "memory");
+
+   return result;
+}
+
+static inline long acrn_hypercall1(unsigned long hcall_id,
+  unsigned long param1)
+{
+   long result;
+
+   asm volatile("movl %1, %%r8d\n\t"
+"vmcall\n\t"
+: "=a" (result)
+: "ir" (hcall_id), "D" (param1)
+: "r8", "memory");
+
+   return result;
+}
+
+static inline long acrn_hypercall2(unsigned long hcall_id,
+  unsigned long param1,
+  unsigned long param2)
+{
+   long result;
+
+   asm volatile("movl %1, %%r8d\n\t"
+"vmcall\n\t"
+: "=a" (result)
+: "ir" (hcall_id), "D" (param1), "S" (param2)
+: "r8", "memory");
+
+   return result;
+}
+
 #endif /* _ASM_X86_ACRN_H */
-- 
2.28.0

[PATCH v5 05/17] virt: acrn: Introduce ACRN HSM basic driver

2020-10-18 Thread shuo . a . liu

From: Shuo Liu 

ACRN Hypervisor Service Module (HSM) is a kernel module in Service VM
which communicates with ACRN userspace through ioctls and talks to ACRN
Hypervisor through hypercalls.

Add a basic HSM driver which allows Service VM userspace to communicate
with ACRN. The following patches will add more ioctls, guest VM memory
mapping caching, I/O request processing, ioeventfd and irqfd into this
module. HSM exports a char device interface (/dev/acrn_hsm) to userspace.

Signed-off-by: Shuo Liu 
Reviewed-by: Reinette Chatre 
Cc: Dave Hansen 
Cc: Zhi Wang 
Cc: Zhenyu Wang 
Cc: Yu Wang 
Cc: Reinette Chatre 
Cc: Greg Kroah-Hartman 
---
 MAINTAINERS  |  1 +
 drivers/virt/Kconfig |  2 +
 drivers/virt/Makefile|  1 +
 drivers/virt/acrn/Kconfig| 14 ++
 drivers/virt/acrn/Makefile   |  3 ++
 drivers/virt/acrn/acrn_drv.h | 18 
 drivers/virt/acrn/hsm.c  | 87 
 7 files changed, 126 insertions(+)
 create mode 100644 drivers/virt/acrn/Kconfig
 create mode 100644 drivers/virt/acrn/Makefile
 create mode 100644 drivers/virt/acrn/acrn_drv.h
 create mode 100644 drivers/virt/acrn/hsm.c

diff --git a/MAINTAINERS b/MAINTAINERS
index e0fea5e464b4..3030d0e93d02 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -442,6 +442,7 @@ L:  acrn-...@lists.projectacrn.org
 S: Supported
 W: https://projectacrn.org
 F: Documentation/virt/acrn/
+F: drivers/virt/acrn/
 
 AD1889 ALSA SOUND DRIVER
 L: linux-par...@vger.kernel.org
diff --git a/drivers/virt/Kconfig b/drivers/virt/Kconfig
index cbc1f25c79ab..d9484a2e9b46 100644
--- a/drivers/virt/Kconfig
+++ b/drivers/virt/Kconfig
@@ -32,4 +32,6 @@ config FSL_HV_MANAGER
 partition shuts down.
 
 source "drivers/virt/vboxguest/Kconfig"
+
+source "drivers/virt/acrn/Kconfig"
 endif
diff --git a/drivers/virt/Makefile b/drivers/virt/Makefile
index fd331247c27a..f0491bbf0d4d 100644
--- a/drivers/virt/Makefile
+++ b/drivers/virt/Makefile
@@ -5,3 +5,4 @@
 
 obj-$(CONFIG_FSL_HV_MANAGER)   += fsl_hypervisor.o
 obj-y  += vboxguest/
+obj-$(CONFIG_ACRN_HSM) += acrn/
diff --git a/drivers/virt/acrn/Kconfig b/drivers/virt/acrn/Kconfig
new file mode 100644
index ..36c80378c30c
--- /dev/null
+++ b/drivers/virt/acrn/Kconfig
@@ -0,0 +1,14 @@
+# SPDX-License-Identifier: GPL-2.0
+config ACRN_HSM
+   tristate "ACRN Hypervisor Service Module"
+   depends on ACRN_GUEST
+   help
+ ACRN Hypervisor Service Module (HSM) is a kernel module which
+ communicates with ACRN userspace through ioctls and talks to
+ the ACRN Hypervisor through hypercalls. HSM will only run in
+ a privileged management VM, called Service VM, to manage User
+ VMs and do I/O emulation. Not required for simply running
+ under ACRN as a User VM.
+
+ To compile as a module, choose M, the module will be called
+ acrn. If unsure, say N.
diff --git a/drivers/virt/acrn/Makefile b/drivers/virt/acrn/Makefile
new file mode 100644
index ..6920ed798aaf
--- /dev/null
+++ b/drivers/virt/acrn/Makefile
@@ -0,0 +1,3 @@
+# SPDX-License-Identifier: GPL-2.0
+obj-$(CONFIG_ACRN_HSM) := acrn.o
+acrn-y := hsm.o
diff --git a/drivers/virt/acrn/acrn_drv.h b/drivers/virt/acrn/acrn_drv.h
new file mode 100644
index ..29eedd696327
--- /dev/null
+++ b/drivers/virt/acrn/acrn_drv.h
@@ -0,0 +1,18 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#ifndef __ACRN_HSM_DRV_H
+#define __ACRN_HSM_DRV_H
+
+#include 
+
+#define ACRN_INVALID_VMID (0xU)
+
+/**
+ * struct acrn_vm - Properties of ACRN User VM.
+ * @vmid:  User VM ID
+ */
+struct acrn_vm {
+   u16 vmid;
+};
+
+#endif /* __ACRN_HSM_DRV_H */
diff --git a/drivers/virt/acrn/hsm.c b/drivers/virt/acrn/hsm.c
new file mode 100644
index ..28a3052ffa55
--- /dev/null
+++ b/drivers/virt/acrn/hsm.c
@@ -0,0 +1,87 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * ACRN Hypervisor Service Module (HSM)
+ *
+ * Copyright (C) 2020 Intel Corporation. All rights reserved.
+ *
+ * Authors:
+ * Fengwei Yin 
+ * Yakui Zhao 
+ */
+
+#include 
+#include 
+#include 
+#include 
+
+#include 
+#include 
+
+#include "acrn_drv.h"
+
+/*
+ * When /dev/acrn_hsm is opened, a 'struct acrn_vm' object is created to
+ * represent a VM instance and continues to be associated with the opened file
+ * descriptor. All ioctl operations on this file descriptor will be targeted to
+ * the VM instance. Release of this file descriptor will destroy the object.
+ */
+static int acrn_dev_open(struct inode *inode, struct file *filp)
+{
+   struct acrn_vm *vm;
+
+   vm = kzalloc(sizeof(*vm), GFP_KERNEL);
+   if (!vm)
+   return -ENOMEM;
+
+   vm->vmid = ACRN_INVALID_VMID;
+   filp->private_data = vm;
+   return 0;
+}
+
+static int acrn_dev_release(struct inode *inode, struct file *filp)
+{
+   struct acrn_vm *vm = filp->private_data;
+
+   kfree(vm);
+   r

Re: WARNING in md_ioctl

2020-10-18 Thread Song Liu

On Sat, Oct 17, 2020 at 4:13 AM Dae R. Jeong  wrote:
>
> Hi,
>
> I looked into the warning "WARNING in md_ioctl" found by Syzkaller.
> (https://syzkaller.appspot.com/bug?id=fbf9eaea2e65bfcabb4e2750c3ab0892867edea1)
> I suspect that it is caused by a race between two concurrent ioctl()s, as
> shown below.
>
> CPU1 (md_ioctl())                      CPU2 (md_ioctl())
> -----------------                      -----------------
> set_bit(MD_CLOSING, &mddev->flags);
> did_set_md_closing = true;
>                                        WARN_ON_ONCE(test_bit(MD_CLOSING,
>                                                     &mddev->flags));
>
> if(did_set_md_closing)
>         clear_bit(MD_CLOSING, &mddev->flags);
>
> If the above is correct, this warning was introduced by
> commit 065e519e ("md: MD_CLOSING needs to be cleared after called
> md_set_readonly or do_md_stop").
> Could you please take a look at this?

This is an interesting case. We try to protect against concurrent
ioctl via mddev->openers:

if (mddev->pers && atomic_read(&mddev->openers) > 1) {
mutex_unlock(&mddev->open_mutex);
err = -EBUSY;
goto out;
}

But in this case, we are sending multiple ioctls from the same fd, so
openers == 1.

We can probably do something like:

diff --git i/drivers/md/md.c w/drivers/md/md.c
index 6072782070230..49442a3f4605b 100644
--- i/drivers/md/md.c
+++ w/drivers/md/md.c
@@ -7591,8 +7591,10 @@ static int md_ioctl(struct block_device *bdev, fmode_t mode,
err = -EBUSY;
goto out;
}
-   WARN_ON_ONCE(test_bit(MD_CLOSING, &mddev->flags));
-   set_bit(MD_CLOSING, &mddev->flags);
+   if (test_and_set_bit(MD_CLOSING, &mddev->flags)) {
+   err = -EBUSY;
+   goto out;
+   }
did_set_md_closing = true;
mutex_unlock(&mddev->open_mutex);
sync_blockdev(bdev);

Could you please test whether this fixes the issue?

Thanks,
Song
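
For readers following the thread, the difference between "test, then set" and
an atomic test-and-set can be sketched in userspace C11 as below. This is an
illustration under assumptions, not the md code: `closing`, begin_close() and
end_close() are hypothetical names, and atomic_flag stands in for the
MD_CLOSING bit.

```c
#include <errno.h>
#include <stdatomic.h>

/* Stand-in for the MD_CLOSING bit in mddev->flags. */
static atomic_flag closing = ATOMIC_FLAG_INIT;

/*
 * Try to become the single caller that owns the "closing" state.
 * The test and the set happen as one atomic step, so two racing
 * callers cannot both see the flag as clear.
 * Returns 0 for the caller that set the flag, -EBUSY otherwise.
 */
int begin_close(void)
{
	if (atomic_flag_test_and_set(&closing))
		return -EBUSY;
	return 0;
}

/* Only the caller for which begin_close() returned 0 should call this. */
void end_close(void)
{
	atomic_flag_clear(&closing);
}
```

With a separate test followed by a set, a second caller can run between the
two steps; with the combined operation the second caller gets -EBUSY instead.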

[PATCH v5 06/17] virt: acrn: Introduce VM management interfaces

2020-10-18 Thread shuo . a . liu

From: Shuo Liu 

The VM management interfaces expose several VM operations to ACRN
userspace via ioctls, for example creating, starting and destroying a
VM.

The ACRN Hypervisor needs to exchange data with the ACRN userspace
during the VM operations. HSM provides VM operation ioctls to the ACRN
userspace and communicates with the ACRN Hypervisor for VM operations
via hypercalls.

HSM maintains a list of User VMs. Each User VM will be bound to an
existing file descriptor of /dev/acrn_hsm. The User VM will be
destroyed when the file descriptor is closed.

Signed-off-by: Shuo Liu 
Reviewed-by: Zhi Wang 
Reviewed-by: Reinette Chatre 
Cc: Zhi Wang 
Cc: Zhenyu Wang 
Cc: Yu Wang 
Cc: Reinette Chatre 
Cc: Greg Kroah-Hartman 
---
 .../userspace-api/ioctl/ioctl-number.rst  |  1 +
 MAINTAINERS   |  1 +
 drivers/virt/acrn/Makefile|  2 +-
 drivers/virt/acrn/acrn_drv.h  | 21 -
 drivers/virt/acrn/hsm.c   | 73 -
 drivers/virt/acrn/hypercall.h | 78 +++
 drivers/virt/acrn/vm.c| 68 
 include/uapi/linux/acrn.h | 56 +
 8 files changed, 296 insertions(+), 4 deletions(-)
 create mode 100644 drivers/virt/acrn/hypercall.h
 create mode 100644 drivers/virt/acrn/vm.c
 create mode 100644 include/uapi/linux/acrn.h

diff --git a/Documentation/userspace-api/ioctl/ioctl-number.rst b/Documentation/userspace-api/ioctl/ioctl-number.rst
index 2a198838fca9..ac60efedb104 100644
--- a/Documentation/userspace-api/ioctl/ioctl-number.rst
+++ b/Documentation/userspace-api/ioctl/ioctl-number.rst
@@ -319,6 +319,7 @@ Code  Seq#    Include File            Comments
 0xA0  all    linux/sdp/sdp.h         Industrial Device Project
 
 0xA1  0      linux/vtpm_proxy.h      TPM Emulator Proxy Driver
+0xA2  all    uapi/linux/acrn.h       ACRN hypervisor
 0xA3  80-8F                          Port ACL  in development:
 
 0xA3  90-9F  linux/dtlk.h
diff --git a/MAINTAINERS b/MAINTAINERS
index 3030d0e93d02..d4c1ef303c2d 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -443,6 +443,7 @@ S:  Supported
 W: https://projectacrn.org
 F: Documentation/virt/acrn/
 F: drivers/virt/acrn/
+F: include/uapi/linux/acrn.h
 
 AD1889 ALSA SOUND DRIVER
 L: linux-par...@vger.kernel.org
diff --git a/drivers/virt/acrn/Makefile b/drivers/virt/acrn/Makefile
index 6920ed798aaf..cf8b4ed5e74e 100644
--- a/drivers/virt/acrn/Makefile
+++ b/drivers/virt/acrn/Makefile
@@ -1,3 +1,3 @@
 # SPDX-License-Identifier: GPL-2.0
 obj-$(CONFIG_ACRN_HSM) := acrn.o
-acrn-y := hsm.o
+acrn-y := hsm.o vm.o
diff --git a/drivers/virt/acrn/acrn_drv.h b/drivers/virt/acrn/acrn_drv.h
index 29eedd696327..e5aba86cad8c 100644
--- a/drivers/virt/acrn/acrn_drv.h
+++ b/drivers/virt/acrn/acrn_drv.h
@@ -3,16 +3,35 @@
 #ifndef __ACRN_HSM_DRV_H
 #define __ACRN_HSM_DRV_H
 
+#include 
+#include 
+#include 
 #include 
 
+#include "hypercall.h"
+
+extern struct miscdevice acrn_dev;
+
 #define ACRN_INVALID_VMID (0xU)
 
+#define ACRN_VM_FLAG_DESTROYED 0U
 /**
  * struct acrn_vm - Properties of ACRN User VM.
+ * @list:  Entry within global list of all VMs
  * @vmid:  User VM ID
+ * @vcpu_num:  Number of virtual CPUs in the VM
+ * @flags: Flags (ACRN_VM_FLAG_*) of the VM. This is VM flag management
+ * in HSM which is different from the &acrn_vm_creation.vm_flag.
  */
 struct acrn_vm {
-   u16 vmid;
+   struct list_head list;
+   u16 vmid;
+   int vcpu_num;
+   unsigned long   flags;
 };
 
+struct acrn_vm *acrn_vm_create(struct acrn_vm *vm,
+  struct acrn_vm_creation *vm_param);
+int acrn_vm_destroy(struct acrn_vm *vm);
+
 #endif /* __ACRN_HSM_DRV_H */
diff --git a/drivers/virt/acrn/hsm.c b/drivers/virt/acrn/hsm.c
index 28a3052ffa55..cbda67d4eb89 100644
--- a/drivers/virt/acrn/hsm.c
+++ b/drivers/virt/acrn/hsm.c
@@ -9,7 +9,6 @@
  * Yakui Zhao 
  */
 
-#include 
 #include 
 #include 
 #include 
@@ -38,10 +37,79 @@ static int acrn_dev_open(struct inode *inode, struct file *filp)
return 0;
 }
 
+/*
+ * HSM relies on hypercall layer of the ACRN hypervisor to do the
+ * sanity check against the input parameters.
+ */
+static long acrn_dev_ioctl(struct file *filp, unsigned int cmd,
+  unsigned long ioctl_param)
+{
+   struct acrn_vm *vm = filp->private_data;
+   struct acrn_vm_creation *vm_param;
+   int ret = 0;
+
+   if (vm->vmid == ACRN_INVALID_VMID && c

[PATCH v5 01/17] docs: acrn: Introduce ACRN

2020-10-18 Thread shuo . a . liu

From: Shuo Liu 

Add documentation on the following aspects of ACRN:

  1) A brief introduction on the architecture of ACRN.
  2) I/O request handling in ACRN.

To learn more about ACRN, please go to ACRN project website
https://projectacrn.org, or the documentation page
https://projectacrn.github.io/.

Signed-off-by: Shuo Liu 
Reviewed-by: Zhi Wang 
Reviewed-by: Reinette Chatre 
Cc: Dave Hansen 
Cc: Sean Christopherson 
Cc: Dan Williams 
Cc: Fengwei Yin 
Cc: Zhi Wang 
Cc: Zhenyu Wang 
Cc: Yu Wang 
Cc: Reinette Chatre 
Cc: Greg Kroah-Hartman 
Cc: Randy Dunlap 
---
 Documentation/virt/acrn/index.rst| 11 +++
 Documentation/virt/acrn/introduction.rst | 40 ++
 Documentation/virt/acrn/io-request.rst   | 97 
 Documentation/virt/index.rst |  1 +
 MAINTAINERS  |  7 ++
 5 files changed, 156 insertions(+)
 create mode 100644 Documentation/virt/acrn/index.rst
 create mode 100644 Documentation/virt/acrn/introduction.rst
 create mode 100644 Documentation/virt/acrn/io-request.rst

diff --git a/Documentation/virt/acrn/index.rst b/Documentation/virt/acrn/index.rst
new file mode 100644
index ..e3cf99033bdb
--- /dev/null
+++ b/Documentation/virt/acrn/index.rst
@@ -0,0 +1,11 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+===============
+ACRN Hypervisor
+===============
+
+.. toctree::
+   :maxdepth: 1
+
+   introduction
+   io-request
diff --git a/Documentation/virt/acrn/introduction.rst b/Documentation/virt/acrn/introduction.rst
new file mode 100644
index ..6b44924d5c0e
--- /dev/null
+++ b/Documentation/virt/acrn/introduction.rst
@@ -0,0 +1,40 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+ACRN Hypervisor Introduction
+============================
+
+The ACRN Hypervisor is a Type 1 hypervisor, running directly on the bare-metal
+hardware. It has a privileged management VM, called Service VM, to manage User
+VMs and do I/O emulation.
+
+ACRN userspace is an application running in the Service VM that emulates
+devices for a User VM based on command line configurations. ACRN Hypervisor
+Service Module (HSM) is a kernel module in the Service VM which provides
+hypervisor services to the ACRN userspace.
+
+Below figure shows the architecture.
+
+::
+
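+              Service VM                     User VM
+    +----------------------------+  |  +------------------+
+    |        +--------------+    |  |  |                  |
+    |        |ACRN userspace|    |  |  |                  |
+    |        +--------------+    |  |  |                  |
+    |-----------------ioctl------|  |  |                  |   ...
+    |kernel space   +----------+ |  |  |                  |
+    |               |   HSM    | |  |  | Drivers          |
+    |               +----------+ |  |  |                  |
+    +--------------------|-------+  |  +------------------+
+  +---------------------hypercall----------------------------+
+  |                      ACRN Hypervisor                     |
+  +----------------------------------------------------------+
+  |                         Hardware                         |
+  +----------------------------------------------------------+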
+
+ACRN userspace allocates memory for the User VM, configures and initializes the
+devices used by the User VM, loads the virtual bootloader, initializes the
+virtual CPU state and handles I/O request accesses from the User VM. It uses
+ioctls to communicate with the HSM. HSM implements hypervisor services by
+interacting with the ACRN Hypervisor via hypercalls. HSM exports a char device
+interface (/dev/acrn_hsm) to userspace.
diff --git a/Documentation/virt/acrn/io-request.rst b/Documentation/virt/acrn/io-request.rst
new file mode 100644
index ..79022a671ea7
--- /dev/null
+++ b/Documentation/virt/acrn/io-request.rst
@@ -0,0 +1,97 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+I/O request handling
+====================
+
+An I/O request of a User VM, which is constructed by the hypervisor, is
+distributed by the ACRN Hypervisor Service Module to an I/O client
+corresponding to the address range of the I/O request. Details of I/O request
+handling are described in the following sections.
+
+1. I/O request
+--------------
+
+For each User VM, there is a shared 4-KByte memory region used for I/O requests
+communication between the hypervisor and Service VM. An I/O request is a
+256-byte structure buffer, which is 'struct acrn_io_request', that is filled by
+an I/O handler of the hypervisor when a trapped I/O access happens in a User
+VM. ACRN userspace in the Service VM first allocates a 4-KByte page and passes
+the GPA (Guest Physical Address) of the buffer to the hypervisor. The buffer is
+used as an array of 16 I/O request slots with each I/O request slot being 256
+bytes. This array is indexed by vCPU ID.
+
+2. I/O clients
+--------------
+
+An I/O client is responsible for handling Use
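
The layout of the shared I/O request page described in the documentation
above (a 4-KByte page holding 16 slots of 256 bytes, indexed by vCPU ID) can
be sketched as follows. This is a hedged illustration: the struct and macro
names are hypothetical, and the slot body is an opaque byte array rather than
the real 'struct acrn_io_request' fields.

```c
#include <stdint.h>

#define IO_REQ_PAGE_SIZE   4096u
#define IO_REQ_SLOT_SIZE   256u
#define IO_REQ_SLOT_COUNT  (IO_REQ_PAGE_SIZE / IO_REQ_SLOT_SIZE)

/* Placeholder for one I/O request; only its size is modelled here. */
struct io_request_slot {
	uint8_t raw[IO_REQ_SLOT_SIZE];
};

/* The shared page viewed as an array of slots, one per vCPU. */
struct io_request_page {
	struct io_request_slot slots[IO_REQ_SLOT_COUNT];
};

/* The array must fill the page exactly. */
_Static_assert(sizeof(struct io_request_page) == IO_REQ_PAGE_SIZE,
	       "slot array must cover exactly one page");

/* Byte offset of the slot for a vCPU, or -1 if the ID is out of range. */
long io_req_slot_offset(unsigned int vcpu_id)
{
	if (vcpu_id >= IO_REQ_SLOT_COUNT)
		return -1;
	return (long)(vcpu_id * IO_REQ_SLOT_SIZE);
}
```

Because the array is indexed by vCPU ID, this layout implies at most 16
vCPUs per User VM can have a slot in one page.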

[PATCH v5 00/17] HSM driver for ACRN hypervisor

2020-10-18 Thread shuo . a . liu

From: Shuo Liu 

ACRN is a Type 1 reference hypervisor stack, running directly on the bare-metal
hardware, and is suitable for a variety of IoT and embedded device solutions.

ACRN implements a hybrid VMM architecture, using a privileged Service VM. The
Service VM manages the system resources (CPU, memory, etc.) and I/O devices of
User VMs. Multiple User VMs are supported, with each of them running Linux,
Android OS or Windows. Both Service VM and User VMs are guest VMs.

Below figure shows the architecture.

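              Service VM                     User VM
    +----------------------------+  |  +------------------+
    |        +--------------+    |  |  |                  |
    |        |ACRN userspace|    |  |  |                  |
    |        +--------------+    |  |  |                  |
    |-----------------ioctl------|  |  |                  |   ...
    |kernel space   +----------+ |  |  |                  |
    |               |   HSM    | |  |  | Drivers          |
    |               +----------+ |  |  |                  |
    +--------------------|-------+  |  +------------------+
  +---------------------hypercall----------------------------+
  |                      ACRN Hypervisor                     |
  +----------------------------------------------------------+
  |                         Hardware                         |
  +----------------------------------------------------------+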

There is only one Service VM, which can run Linux as its OS.

In a typical case, the Service VM will be auto started when ACRN Hypervisor is
booted. Then the ACRN userspace (an application running in Service VM) could be
used to start/stop User VMs by communicating with ACRN Hypervisor Service
Module (HSM).

ACRN Hypervisor Service Module (HSM) is a middle layer that allows the ACRN
userspace and Service VM OS kernel to communicate with ACRN Hypervisor
and manage different User VMs. This middle layer provides the following
functionality:
  - Issues hypercalls to the hypervisor to manage User VMs:
      * VM/vCPU management
      * Memory management
      * Device passthrough
      * Interrupt injection
  - Handles I/O requests from User VMs.
  - Exports ioctls through the HSM char device.
  - Exports function calls for other kernel modules.

ACRN focuses on embedded systems, so it does not support some features,
e.g.:
  - ACRN doesn't support VM migration.
  - ACRN doesn't support vCPU migration.

This patch set adds the HSM to the Linux kernel.

The basic ACRN support was already merged upstream.
https://lore.kernel.org/lkml/1559108037-18813-3-git-send-email-yakui.z...@intel.com/

ChangeLog:
v5:
  - Corrected typo in documentation.
  - Removed unused pr_fmt().
  - Used a supported constraint with an explicit MOV to R8 at the beginning of
    the asm for the hypercall interface.
  - Used dev_dbg() to replace dev_err() in places which might cause a DoS.
  - Introduced acrn_vm_list_lock as a mutex for friendly review.
  - Changed to use default attribute group list to add attribute files.

v4:
  - Used acrn_dev.this_device directly for dev_*() (Reinette)
  - Removed the odd usage of {get|put}_device() on &acrn_dev->this_device (Greg)
  - Removed unused log code. (Greg)
  - Corrected the return error values. (Greg)
  - Mentioned that HSM relies on the hypervisor for sanity checks in
    acrn_dev_ioctl() comments (Greg)

v3:
  - Used {get|put}_device() helpers on &acrn_dev->this_device
  - Moved unused code from front patches to later ones.
  - Removed self-defined pr_fmt() and dev_fmt()
  - Provided comments for acrn_vm_list_lock.

v2:
  - Removed API version related code. (Dave)
  - Replaced pr_*() by dev_*(). (Greg)
  - Used -ENOTTY as the error code of unsupported ioctl. (Greg)

Shuo Liu (16):
  docs: acrn: Introduce ACRN
  x86/acrn: Introduce acrn_{setup, remove}_intr_handler()
  x86/acrn: Introduce hypercall interfaces
  virt: acrn: Introduce ACRN HSM basic driver
  virt: acrn: Introduce VM management interfaces
  virt: acrn: Introduce an ioctl to set vCPU registers state
  virt: acrn: Introduce EPT mapping management
  virt: acrn: Introduce I/O request management
  virt: acrn: Introduce PCI configuration space PIO accesses combiner
  virt: acrn: Introduce interfaces for PCI device passthrough
  virt: acrn: Introduce interrupt injection interfaces
  virt: acrn: Introduce interfaces to query C-states and P-states
allowed by hypervisor
  virt: acrn: Introduce I/O ranges operation interfaces
  virt: acrn: Introduce ioeventfd
  virt: acrn: Introduce irqfd
  virt: acrn: Introduce an interface for Service VM to control vCPU

Yin Fengwei (1):
  x86/acrn: Introduce an API to check if a VM is privileged

 .../userspace-api/ioctl/ioctl-number.rst  |   1 +
 Documentation/virt/acrn/index.rst |  11 +
 Documentation/virt/acrn/introduction.rst  |  40 ++
 Documentation/virt/acrn/io-request.rst|  97 +++
 Documentation/virt/index.rst

[PATCH v5 03/17] x86/acrn: Introduce an API to check if a VM is privileged

2020-10-18 Thread shuo . a . liu

From: Yin Fengwei 

ACRN Hypervisor reports hypervisor features via CPUID leaf 0x4001
which is similar to KVM. A VM can check if it's the privileged VM using
the feature bits. The Service VM is the only privileged VM by design.

Signed-off-by: Yin Fengwei 
Signed-off-by: Shuo Liu 
Reviewed-by: Reinette Chatre 
Cc: Dave Hansen 
Cc: Sean Christopherson 
Cc: Dan Williams 
Cc: Fengwei Yin 
Cc: Zhi Wang 
Cc: Zhenyu Wang 
Cc: Yu Wang 
Cc: Reinette Chatre 
Cc: Greg Kroah-Hartman 
---
 arch/x86/include/asm/acrn.h |  9 +
 arch/x86/kernel/cpu/acrn.c  | 19 ++-
 2 files changed, 27 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/acrn.h b/arch/x86/include/asm/acrn.h
index ff259b69cde7..a2d4aea3a80d 100644
--- a/arch/x86/include/asm/acrn.h
+++ b/arch/x86/include/asm/acrn.h
@@ -2,7 +2,16 @@
 #ifndef _ASM_X86_ACRN_H
 #define _ASM_X86_ACRN_H
 
+/*
+ * This CPUID returns feature bitmaps in EAX.
+ * Guest VM uses this to detect the appropriate feature bit.
+ */
+#define ACRN_CPUID_FEATURES            0x4001
+/* Bit 0 indicates whether guest VM is privileged */
+#define ACRN_FEATURE_PRIVILEGED_VM     BIT(0)
+
 void acrn_setup_intr_handler(void (*handler)(void));
 void acrn_remove_intr_handler(void);
+bool acrn_is_privileged_vm(void);
 
 #endif /* _ASM_X86_ACRN_H */
diff --git a/arch/x86/kernel/cpu/acrn.c b/arch/x86/kernel/cpu/acrn.c
index e0c181781905..5528bfc913ea 100644
--- a/arch/x86/kernel/cpu/acrn.c
+++ b/arch/x86/kernel/cpu/acrn.c
@@ -19,9 +19,26 @@
 #include 
 #include 
 
+static u32 acrn_cpuid_base(void)
+{
+   static u32 acrn_cpuid_base;
+
+   if (!acrn_cpuid_base && boot_cpu_has(X86_FEATURE_HYPERVISOR))
+   acrn_cpuid_base = hypervisor_cpuid_base("ACRNACRNACRN", 0);
+
+   return acrn_cpuid_base;
+}
+
+bool acrn_is_privileged_vm(void)
+{
+   return cpuid_eax(acrn_cpuid_base() | ACRN_CPUID_FEATURES) &
+ACRN_FEATURE_PRIVILEGED_VM;
+}
+EXPORT_SYMBOL_GPL(acrn_is_privileged_vm);
+
 static u32 __init acrn_detect(void)
 {
-   return hypervisor_cpuid_base("ACRNACRNACRN", 0);
+   return acrn_cpuid_base();
 }
 
 static void __init acrn_init_platform(void)
-- 
2.28.0
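
The feature-bit test in the patch above reduces to masking bit 0 of the EAX
value returned by the ACRN feature CPUID leaf. A minimal sketch, assuming the
caller has already obtained that EAX value; FEATURE_PRIVILEGED_VM and
is_privileged_vm() are hypothetical names, and the CPUID instruction itself
is deliberately left out so the logic can run anywhere:

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 0 of the feature bitmap marks a privileged VM (per the patch). */
#define FEATURE_PRIVILEGED_VM (UINT32_C(1) << 0)

/*
 * 'features_eax' is the EAX value read from the hypervisor's feature
 * leaf. Other bits are ignored by this check.
 */
bool is_privileged_vm(uint32_t features_eax)
{
	return (features_eax & FEATURE_PRIVILEGED_VM) != 0;
}
```

Returning a bool from the masked value keeps callers from depending on which
bit position happens to carry the flag.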

[PATCH] PCI: export pci_find_host_bridge() to fix MFD build error

2020-10-18 Thread Randy Dunlap

Fix a build error in drivers/mfd/ioc3.o by exporting
pci_find_host_bridge().

ERROR: modpost: "pci_find_host_bridge" [drivers/mfd/ioc3.ko] undefined!

Reported-by: kernel test robot 
Signed-off-by: Randy Dunlap 
Cc: Thomas Bogendoerfer 
Cc: Paul Burton 
Cc: linux-m...@vger.kernel.org
Cc: Bjorn Helgaas 
Cc: linux-...@vger.kernel.org
---
 drivers/pci/host-bridge.c |2 ++
 1 file changed, 2 insertions(+)

--- linux-next-20201016.orig/drivers/pci/host-bridge.c
+++ linux-next-20201016/drivers/pci/host-bridge.c
@@ -4,6 +4,7 @@
  */
 
 #include 
+#include 
 #include 
 #include 
 
@@ -23,6 +24,7 @@ struct pci_host_bridge *pci_find_host_br
 
return to_pci_host_bridge(root_bus->bridge);
 }
+EXPORT_SYMBOL(pci_find_host_bridge);
 
 struct device *pci_get_host_bridge_device(struct pci_dev *dev)
 {

[PATCH] mips: export spurious_interrupt() to fix MFD build error

2020-10-18 Thread Randy Dunlap

Fix a build error in drivers/mfd/ioc3.o by exporting
spurious_interrupt().

ERROR: modpost: "spurious_interrupt" [drivers/mfd/ioc3.ko] undefined!

Reported-by: kernel test robot 
Signed-off-by: Randy Dunlap 
Cc: Thomas Bogendoerfer 
Cc: Paul Burton 
Cc: linux-m...@vger.kernel.org
---
 arch/mips/kernel/irq.c |2 ++
 1 file changed, 2 insertions(+)

--- linux-next-20201016.orig/arch/mips/kernel/irq.c
+++ linux-next-20201016/arch/mips/kernel/irq.c
@@ -10,6 +10,7 @@
  */
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -48,6 +49,7 @@ asmlinkage void spurious_interrupt(void)
 {
atomic_inc(&irq_err_count);
 }
+EXPORT_SYMBOL(spurious_interrupt);
 
 void __init init_IRQ(void)
 {

Re: [PATCH] [v5] wireless: Initial driver submission for pureLiFi STA devices

2020-10-18 Thread Srinivasan Raju


Mostly trivial comments:

>> Ok Thanks, I will address them

[PATCH] drm/msm: Remove redundant null check

2020-10-18 Thread Tian Tao

clk_prepare_enable() and clk_disable_unprepare() already check for a
NULL clock parameter, so the additional checks are unnecessary.

Signed-off-by: Tian Tao 
---
 drivers/gpu/drm/msm/msm_gpu.c | 7 ++-
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c
index 57ddc94..25bc654 100644
--- a/drivers/gpu/drm/msm/msm_gpu.c
+++ b/drivers/gpu/drm/msm/msm_gpu.c
@@ -175,15 +175,12 @@ static int disable_clk(struct msm_gpu *gpu)
 
 static int enable_axi(struct msm_gpu *gpu)
 {
-   if (gpu->ebi1_clk)
-   clk_prepare_enable(gpu->ebi1_clk);
-   return 0;
+   return clk_prepare_enable(gpu->ebi1_clk);
 }
 
 static int disable_axi(struct msm_gpu *gpu)
 {
-   if (gpu->ebi1_clk)
-   clk_disable_unprepare(gpu->ebi1_clk);
+   clk_disable_unprepare(gpu->ebi1_clk);
return 0;
 }
 
-- 
2.7.4
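
Why the caller-side check becomes redundant can be shown with a small mock.
This is only a sketch: struct demo_clk and the demo_* functions are
hypothetical and merely imitate the NULL-tolerant behaviour the commit
message attributes to the clk helpers; they are not the kernel clk API.

```c
#include <stddef.h>

/* Hypothetical mock of a clock handle; not the kernel's struct clk. */
struct demo_clk {
	int enable_count;
};

/* Mock enable: a NULL clock is accepted and treated as a no-op. */
int demo_clk_prepare_enable(struct demo_clk *clk)
{
	if (!clk)
		return 0;
	clk->enable_count++;
	return 0;
}

/* Mock disable: a NULL clock is likewise ignored. */
void demo_clk_disable_unprepare(struct demo_clk *clk)
{
	if (!clk)
		return;
	clk->enable_count--;
}

/* The caller can pass the pointer straight through, NULL or not. */
int demo_enable_axi(struct demo_clk *ebi1_clk)
{
	return demo_clk_prepare_enable(ebi1_clk);
}
```

Note that passing the return value through, as the patch does for
enable_axi(), also lets a real enable failure reach the caller instead of
being replaced by 0.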

Re: [PATCH] PCI: dwc: Use ATU regions to map memory regions

2020-10-18 Thread Vidya Sagar


Hi Lorenzo, Rob, Gustavo,
Could you please review this change?

Thanks,
Vidya Sagar

On 10/5/2020 5:43 PM, Vidya Sagar wrote:

Use ATU region-3 and region-0 to setup mapping for prefetchable and
non-prefetchable memory regions respectively only if their respective CPU
and bus addresses are different.

Signed-off-by: Vidya Sagar 
---
  .../pci/controller/dwc/pcie-designware-host.c | 44 ---
  drivers/pci/controller/dwc/pcie-designware.c  | 12 ++---
  drivers/pci/controller/dwc/pcie-designware.h  |  4 +-
  3 files changed, 48 insertions(+), 12 deletions(-)

diff --git a/drivers/pci/controller/dwc/pcie-designware-host.c b/drivers/pci/controller/dwc/pcie-designware-host.c
index 317ff512f8df..cefde8e813e9 100644
--- a/drivers/pci/controller/dwc/pcie-designware-host.c
+++ b/drivers/pci/controller/dwc/pcie-designware-host.c
@@ -515,9 +515,40 @@ static struct pci_ops dw_pcie_ops = {
.write = pci_generic_config_write,
  };
  
+static void dw_pcie_setup_mem_atu(struct pcie_port *pp,
+                                 struct resource_entry *win)
+{
+   struct dw_pcie *pci = to_dw_pcie_from_pp(pp);
+
+   if (win->res->flags & IORESOURCE_PREFETCH && pci->num_viewport >= 4 &&
+   win->offset) {
+   dw_pcie_prog_outbound_atu(pci,
+ PCIE_ATU_REGION_INDEX3,
+ PCIE_ATU_TYPE_MEM,
+ win->res->start,
+ win->res->start - win->offset,
+ resource_size(win->res));
+   } else if (win->res->flags & IORESOURCE_PREFETCH &&
+  pci->num_viewport < 4) {
+   dev_warn(pci->dev,
+"Insufficient ATU regions to map Prefetchable memory\n");
+   } else if (win->offset) {
+   if (upper_32_bits(resource_size(win->res)))
+   dev_warn(pci->dev,
+"Memory resource size exceeds max for 32 bits\n");
+   dw_pcie_prog_outbound_atu(pci,
+ PCIE_ATU_REGION_INDEX0,
+ PCIE_ATU_TYPE_MEM,
+ win->res->start,
+ win->res->start - win->offset,
+ resource_size(win->res));
+   }
+}
+
  void dw_pcie_setup_rc(struct pcie_port *pp)
  {
u32 val, ctrl, num_ctrls;
+   struct resource_entry *win;
struct dw_pcie *pci = to_dw_pcie_from_pp(pp);
  
  	/*

@@ -572,13 +603,14 @@ void dw_pcie_setup_rc(struct pcie_port *pp)
 * ATU, so we should not program the ATU here.
 */
if (pp->bridge->child_ops == &dw_child_pcie_ops) {
-   struct resource_entry *entry =
-   resource_list_first_type(&pp->bridge->windows, IORESOURCE_MEM);
+   resource_list_for_each_entry(win, &pp->bridge->windows) {
+   switch (resource_type(win->res)) {
+   case IORESOURCE_MEM:
+   dw_pcie_setup_mem_atu(pp, win);
+   break;
+   }
+   }
  
-		dw_pcie_prog_outbound_atu(pci, PCIE_ATU_REGION_INDEX0,

- PCIE_ATU_TYPE_MEM, entry->res->start,
- entry->res->start - entry->offset,
- resource_size(entry->res));
if (pci->num_viewport > 2)
dw_pcie_prog_outbound_atu(pci, PCIE_ATU_REGION_INDEX2,
  PCIE_ATU_TYPE_IO, pp->io_base,
diff --git a/drivers/pci/controller/dwc/pcie-designware.c b/drivers/pci/controller/dwc/pcie-designware.c
index 3c1f17c78241..6033689abb15 100644
--- a/drivers/pci/controller/dwc/pcie-designware.c
+++ b/drivers/pci/controller/dwc/pcie-designware.c
@@ -227,7 +227,7 @@ static void dw_pcie_writel_ob_unroll(struct dw_pcie *pci, u32 index, u32 reg,
  static void dw_pcie_prog_outbound_atu_unroll(struct dw_pcie *pci, u8 func_no,
 int index, int type,
 u64 cpu_addr, u64 pci_addr,
-u32 size)
+u64 size)
  {
u32 retries, val;
u64 limit_addr = cpu_addr + size - 1;
@@ -244,8 +244,10 @@ static void dw_pcie_prog_outbound_atu_unroll(struct dw_pcie *pci, u8 func_no,
 lower_32_bits(pci_addr));
dw_pcie_writel_ob_unroll(pci, index, PCIE_ATU_UNR_UPPER_TARGET,
 upper_32_bits(pci_addr));
-   dw_pcie_writel_ob_unroll(pci, index, PCIE_ATU_UNR_REGION_CTRL1,
-type | PCIE_ATU_FUNC_NUM(func_no));
+   val = type | PCIE_ATU_FUNC_NUM
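
The address arithmetic used by the quoted patch (bus address = CPU address
minus the window offset, limit = start + size - 1, and no ATU programming
when the offset is zero) can be sketched as below. struct demo_window and the
demo_* helpers are hypothetical simplifications for illustration and do not
touch any ATU registers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified view of one bridge memory window. */
struct demo_window {
	uint64_t cpu_start;  /* CPU address where the window begins */
	uint64_t offset;     /* cpu_start minus bus address; 0 = identical */
	uint64_t size;       /* window size in bytes */
};

/* Translation is only needed when CPU and bus addresses differ. */
bool demo_needs_atu(const struct demo_window *w)
{
	return w->offset != 0;
}

/* Bus (PCI) address that the start of the window maps to. */
uint64_t demo_bus_addr(const struct demo_window *w)
{
	return w->cpu_start - w->offset;
}

/* Last CPU address covered by the window. */
uint64_t demo_limit_addr(const struct demo_window *w)
{
	return w->cpu_start + w->size - 1;
}

/* True if the size does not fit in 32 bits. */
bool demo_size_exceeds_32bit(const struct demo_window *w)
{
	return (w->size >> 32) != 0;
}
```

Keeping the size as a 64-bit value matters here: truncating it to 32 bits
before computing the limit address would silently shrink a large window.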

Re: [PATCH] asm-generic: Force inlining of get_order() to work around gcc10 poor decision

2020-10-18 Thread Christophe Leroy





On 19/10/2020 at 06:55, Joel Stanley wrote:

On Sat, 17 Oct 2020 at 15:55, Christophe Leroy
 wrote:


When building mpc885_ads_defconfig with gcc 10.1,
the function get_order() appears 50 times in vmlinux:

[linux]# ppc-linux-objdump -x vmlinux | grep get_order | wc -l
50

[linux]# size vmlinux
   text    data     bss     dec     hex filename
3842620  675624  135160 4653404  47015c vmlinux

In the old days, marking a function 'static inline' was forcing
GCC to inline, but since commit ac7c3e4ff401 ("compiler: enable
CONFIG_OPTIMIZE_INLINING forcibly") GCC may decide to not inline
a function.

It looks like GCC 10 is making poor decisions on this.

get_order() compiles into the following tiny function,
occupying 20 bytes of text.

007c :
   7c:   38 63 ff ff addi    r3,r3,-1
   80:   54 63 a3 3e rlwinm  r3,r3,20,12,31
   84:   7c 63 00 34 cntlzw  r3,r3
   88:   20 63 00 20 subfic  r3,r3,32
   8c:   4e 80 00 20 blr

By forcing get_order() to be __always_inline, the size of text is
reduced by 1940 bytes, which is almost twice the space occupied by the
50 copies of get_order().

[linux-powerpc]# size vmlinux
   text    data     bss     dec     hex filename
3840680  675588  135176 4651444  46f9b4 vmlinux


I see similar results with GCC 10.2 building for arm32. There are 143
instances of get_order with aspeed_g5_defconfig.

Before:
  9071838 2630138  186468 11888444 b5673c vmlinux
After:
  9069886 2630126  186468 11886480 b55f90 vmlinux

1952 bytes smaller with your patch applied. Did you raise this with
anyone from GCC?


Yes I did, see https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97445

For the time being, it's at a standstill.

Christophe



Reviewed-by: Joel Stanley 




Signed-off-by: Christophe Leroy 
---
  include/asm-generic/getorder.h | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/asm-generic/getorder.h b/include/asm-generic/getorder.h
index e9f20b813a69..f2979e3a96b6 100644
--- a/include/asm-generic/getorder.h
+++ b/include/asm-generic/getorder.h
@@ -26,7 +26,7 @@
   *
   * The result is undefined if the size is 0.
   */
-static inline __attribute_const__ int get_order(unsigned long size)
+static __always_inline __attribute_const__ int get_order(unsigned long size)
  {
 if (__builtin_constant_p(size)) {
 if (!size)
--
2.25.0
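
For reference, the arithmetic behind the five-instruction listing above can be written out in plain C. This is a minimal userspace sketch, not the kernel header: PAGE_SHIFT_SKETCH (12, i.e. 4 KiB pages, which matches the 12-bit right shift performed by the rlwinm in the disassembly), fls32_sketch() and get_order_sketch() are names made up for illustration, and the always_inline attribute is spelled out where the kernel would use __always_inline.

```c
#include <stdint.h>

/* Assumption: 4 KiB pages, as implied by the 12-bit shift in the listing. */
#define PAGE_SHIFT_SKETCH 12

/* Find last set bit, 1-based; returns 0 for x == 0 (like cntlzw + subfic). */
static inline __attribute__((always_inline)) int fls32_sketch(uint32_t x)
{
	return x ? 32 - __builtin_clz(x) : 0;
}

/*
 * Smallest order such that (1 << order) pages cover 'size' bytes.
 * As in the original comment, the result is undefined for size == 0.
 */
static inline __attribute__((always_inline)) int get_order_sketch(uint32_t size)
{
	size--;                      /* addi   r3,r3,-1        */
	size >>= PAGE_SHIFT_SKETCH;  /* rlwinm r3,r3,20,12,31  */
	return fls32_sketch(size);   /* cntlzw + subfic        */
}
```

Because the body is this small, an out-of-line copy plus its call overhead can plausibly cost more text than the inlined arithmetic, which would be consistent with the size numbers reported above.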

Re: [PATCH] PCI: dwc: Added link up check in map_bus of dw_child_pcie_ops

2020-10-18 Thread Kishon Vijay Abraham I

Hi Hou,

On 19/10/20 10:54 am, Z.q. Hou wrote:
> Hello Bjorn,
> 
> Thanks a lot for your comments!
> 
>> -Original Message-
>> From: Bjorn Helgaas 
> >> Sent: October 16, 2020 6:48
>> To: Z.q. Hou 
>> Cc: linux-kernel@vger.kernel.org; linux-...@vger.kernel.org;
>> r...@kernel.org; lorenzo.pieral...@arm.com; bhelg...@google.com;
>> gustavo.pimen...@synopsys.com
>> Subject: Re: [PATCH] PCI: dwc: Added link up check in map_bus of
>> dw_child_pcie_ops
>>
>> On Wed, Sep 16, 2020 at 01:41:30PM +0800, Zhiqiang Hou wrote:
>>> From: Hou Zhiqiang 
>>>
>>> On NXP Layerscape platforms, it results in SError in the enumeration
>>> of the PCIe controller, which is not connecting with an Endpoint
>>> device. And it doesn't make sense to enumerate the Endpoints when the
>>> PCIe link is down. So this patch added the link up check to avoid to
>>> fire configuration transactions on link down bus.
>>
>> Lorenzo already applied this, but a couple questions:
>>
>> You call out NXP Layerscape specifically, but doesn't this affect other
>> DWC-based platforms, too?  You later mentioned imx6, Kishon mentioned
>> dra7xx, Michael mentioned ls1028a, Naresh mentioned ls2088 (probably
>> both the same as your "NXP Layerscape").
> 
> For NXP Layerscape platforms (the ls1028a and ls2088a are also NXP Layerscape 
> platforms), as the error response to AXI/AHB was enabled, it will get a UR error 
> and trigger an SError on the AXI bus when it accesses a non-existent BDF on a 
> link-down bus. I'm not clear about how it happens on dra7xx and imx6, since they 
> don't enable the error response to AXI/AHB.

That's exactly the case with DRA7xx as the error response is enabled by
default in the platform integration.

Thanks
Kishon

> 
>>
>> The backtrace below contains a bunch of irrelevant info.  The timestamps
>> are pointless.  The backtrace past
>> pci_scan_single_device+0x80/0x100 or so really doesn't add anything either.
>>
>> It'd be nice to have a comment in the code because the code *looks* wrong
>> and racy.  Without a hint, everybody who sees it will have to dig through
>> the history to see why we tolerate the race.
> 
> Yes, agreed, but it seems the cause of the SError on dra7xx and imx6 is different 
> from that on Layerscape platforms; we need to clarify that first.
> 
> Thanks,
> Zhiqiang
>  
>>
>>> [0.807773] SError Interrupt on CPU2, code 0xbf02 -- SError
>>> [0.807775] CPU: 2 PID: 1 Comm: swapper/0 Not tainted
>> 5.9.0-rc5-next-20200914-1-gf965d3ec86fa #67
>>> [0.807776] Hardware name: LS1046A RDB Board (DT)
>>> [0.80] pstate: 2085 (nzCv daIf -PAN -UAO BTYPE=--)
>>> [0.807778] pc : pci_generic_config_read+0x3c/0xe0
>>> [0.807778] lr : pci_generic_config_read+0x24/0xe0
>>> [0.807779] sp : 80001003b7b0
>>> [0.807780] x29: 80001003b7b0 x28: 80001003ba74
>>> [0.807782] x27: 000971d96800 x26: 00096e77e0a8
>>> [0.807784] x25: 80001003b874 x24: 80001003b924
>>> [0.807786] x23: 0004 x22: 
>>> [0.807788] x21:  x20: 80001003b874
>>> [0.807790] x19: 0004 x18: 
>>> [0.807791] x17: 00c0 x16: fe0025981840
>>> [0.807793] x15: b94c75b69948 x14: 62203a383634203a
>>> [0.807795] x13: 666e6f635f726568 x12: 202c31203d207265
>>> [0.807797] x11: 626d756e3e2d7375 x10: 656877202c307830
>>> [0.807799] x9 : 203d206e66766564 x8 : 0908
>>> [0.807801] x7 : 0908 x6 : 80001090
>>> [0.807802] x5 : 00096e77e080 x4 : 
>>> [0.807804] x3 : 0003 x2 : 84fa3440ff7e7000
>>> [0.807806] x1 :  x0 : 800010034000
>>> [0.807808] Kernel panic - not syncing: Asynchronous SError Interrupt
>>> [0.807809] CPU: 2 PID: 1 Comm: swapper/0 Not tainted
>> 5.9.0-rc5-next-20200914-1-gf965d3ec86fa #67
>>> [0.807810] Hardware name: LS1046A RDB Board (DT)
>>> [0.807811] Call trace:
>>> [0.807812]  dump_backtrace+0x0/0x1c0
>>> [0.807813]  show_stack+0x18/0x28
>>> [0.807814]  dump_stack+0xd8/0x134
>>> [0.807814]  panic+0x180/0x398
>>> [0.807815]  add_taint+0x0/0xb0
>>> [0.807816]  arm64_serror_panic+0x78/0x88
>>> [0.807817]  do_serror+0x68/0x180
>>> [0.807818]  el1_error+0x84/0x100
>>> [0.807818]  pci_generic_config_read+0x3c/0xe0
>>> [0.807819]  dw_pcie_rd_other_conf+0x78/0x110
>>> [0.807820]  pci_bus_read_config_dword+0x88/0xe8
>>> [0.807821]  pci_bus_generic_read_dev_vendor_id+0x30/0x1b0
>>> [0.807822]  pci_bus_read_dev_vendor_id+0x4c/0x78
>>> [0.807823]  pci_scan_single_device+0x80/0x100
>>> [0.807824]  pci_scan_slot+0x38/0x130
>>> [0.807825]  pci_scan_child_bus_extend+0x54/0x2a0
>>> [0.807826]  pci_scan_child_bus+0x14/0x20
>>> [0.807827]  pci_scan_bridge_extend+0x230/0x570
>>> [0.807828]  pci_scan_child_bus_extend+0x134/0x2a0
>>> [0.807829]  pci_scan_root_bus_bridge+0x64/0xf0
>>> [0.807829]  p

Re: [PATCH RFC V3 6/9] x86/entry: Pass irqentry_state_t by reference

2020-10-18 Thread Ira Weiny

On Fri, Oct 16, 2020 at 02:55:21PM +0200, Thomas Gleixner wrote:
> On Fri, Oct 16 2020 at 13:45, Peter Zijlstra wrote:
> > On Fri, Oct 09, 2020 at 12:42:55PM -0700, ira.we...@intel.com wrote:
> >> @@ -238,7 +236,7 @@ noinstr void idtentry_exit_nmi(struct pt_regs *regs, 
> >> bool restore)
> >>  
> >>rcu_nmi_exit();
> >>lockdep_hardirq_exit();
> >> -  if (restore)
> >> +  if (irq_state->exit_rcu)
> >>lockdep_hardirqs_on(CALLER_ADDR0);
> >>__nmi_exit();
> >>  }
> >
> > That's not nice.. The NMI path is different from the IRQ path and has a
> > different variable. Yes, this works, but *groan*.
> >
> > Maybe union them if you want to avoid bloating the structure, but the
> > above makes it really hard to read.
> 
> Right, and also that nmi entry thing should not be in x86. Something
> like the untested below as first cleanup.

Ok, I see what Peter was talking about.  I've added this patch to the series.

> 
> Thanks,
> 
> tglx
> 
> Subject: x86/entry: Move nmi entry/exit into common code
> From: Thomas Gleixner 
> Date: Fri, 11 Sep 2020 10:09:56 +0200
> 
> Add blurb here.

How about:

To prepare for saving PKRS values across NMIs we lift the
idtentry_[enter|exit]_nmi() to the common code.  Rename them to
irqentry_nmi_[enter|exit]() to reflect the new generic nature and store the
state in the same irqentry_state_t structure as the other irqentry_*()
functions.  Finally, differentiate the state being stored between the NMI and
IRQ path by adding 'lockdep' to irqentry_state_t.

?

> 
> Signed-off-by: Thomas Gleixner 
> ---
>  arch/x86/entry/common.c |   34 --
>  arch/x86/include/asm/idtentry.h |3 ---
>  arch/x86/kernel/cpu/mce/core.c  |6 +++---
>  arch/x86/kernel/nmi.c   |6 +++---
>  arch/x86/kernel/traps.c |   13 +++--
>  include/linux/entry-common.h|   20 
>  kernel/entry/common.c   |   36 
>  7 files changed, 69 insertions(+), 49 deletions(-)
> 

[snip]

> --- a/include/linux/entry-common.h
> +++ b/include/linux/entry-common.h
> @@ -343,6 +343,7 @@ void irqentry_exit_to_user_mode(struct p
>  #ifndef irqentry_state
>  typedef struct irqentry_state {
>   bool    exit_rcu;
> + bool    lockdep;
>  } irqentry_state_t;

Building on what Peter said, do you agree this should be made into a union?

It may not be strictly necessary in this patch but I think it would reflect the
mutual exclusivity better and could be changed easily enough in the follow-on
patch which adds the pkrs state.
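
A minimal sketch of what the union variant could look like, written as standalone C so it compiles outside the kernel. The field comments describe how the two members are assumed to be used (exit_rcu on the IRQ path, lockdep on the NMI path); this is an illustration of the idea, not the patch that was eventually applied.

```c
#include <stdbool.h>

/*
 * Sketch only: both members share storage because, by assumption, a given
 * entry is either an IRQ-style entry or an NMI entry, never both.
 */
typedef struct irqentry_state {
	union {
		bool exit_rcu;	/* IRQ path: RCU exit needed on return */
		bool lockdep;	/* NMI path: lockdep hardirq state at entry */
	};
} irqentry_state_t;

/* The union keeps the structure at the size of a single bool. */
_Static_assert(sizeof(irqentry_state_t) == sizeof(bool),
	       "irqentry_state_t should not grow");
```

The anonymous union keeps callers writing irq_state.exit_rcu or irq_state.lockdep directly, so the NMI exit path can test the member that actually describes it.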

Ira

Re: arm64: dropping prevent_bootmem_remove_notifier

2020-10-18 Thread Anshuman Khandual

Hello Sudarshan,

On 10/17/2020 04:41 AM, Sudarshan Rajagopalan wrote:
> 
> Hello Anshuman,
> 
> In the patch that enables memory hot-remove (commit bbd6ec605c0f ("arm64/mm: 
> Enable memory hot remove")) for arm64, there’s a notifier put in place that 
> prevents boot memory from being offlined and removed. Also commit text 
> mentions that boot memory on arm64 cannot be removed. We wanted to understand 
> more about the reasoning for this. X86 and other archs don’t seem to do 
> this prevention. There’s also a comment in the code that this notifier could be 
> dropped in future if and when boot memory can be removed.

Right, and until then the notifier cannot be dropped. There were a lot of
discussions around this topic during multiple iterations of the memory hot
remove series. Hence, I would just request you to please go through them first.
This list here is from one such series (https://lwn.net/Articles/809179/) but
might not be exhaustive.

-
On arm64 platform, it is essential to ensure that the boot time discovered
memory couldn't be hot-removed so that,

1. FW data structures used across kexec are idempotent
   e.g. the EFI memory map.

2. linear map or vmemmap would not have to be dynamically split, and can
   map boot memory at a large granularity

3. Avoid penalizing paths that have to walk page tables, where we can be
   certain that the memory is not hot-removable
-

The primary reason being kexec which would need substantial rework otherwise.

> 
> The current logic is that only “new” memory blocks which are hot-added can 
> later be offlined and removed. The memory that system booted up with cannot 
> be offlined and removed. But there could be many use cases such as inter-VM 
> memory sharing where a primary VM could offline and hot-remove a 
> block/section of memory and lend it to secondary VM where it could hot-add 
> it. And after the use case is done, the reverse happens where secondary VM 
> hot-removes and gives it back to primary which can hot-add it back. In such 
> cases, the present logic for arm64 doesn’t allow this hot-remove in primary 
> to happen.

That is not true. Each VM could just boot with a minimum of boot memory which
cannot be offlined or removed, but then a possibly larger portion of memory can
be hot added during the boot process itself, making it available for any future
inter-VM sharing purpose. Hence this problem could easily be solved in user
space itself.

> 
> Also, on systems with movable zone that sort of guarantees pages to be 
> migrated and isolated so that blocks can be offlined, this logic also defeats 
> the purpose of having a movable zone which system can rely on memory 
> hot-plugging, which say virt-io mem also relies on for fully plugged memory 
> blocks.

ZONE_MOVABLE does not really guarantee migration, isolation and removal. There
are reasons an offline request might just fail. I agree that those reasons are
normally not platform related but core memory gives platform an opportunity to
decline an offlining request via a notifier. Hence ZONE_MOVABLE offline can be
denied. Semantics wise we are still okay.

This might look a bit inconsistent: with movablecore/kernelcore/movable_node and
firmware sending in 'hot pluggable' memory (IIRC arm64 does not really support
this yet), the system might end up with ZONE_MOVABLE marked boot memory which
cannot be offlined or removed. But an offline notifier action is orthogonal.
Hence I did not block those kernel command line paths that create ZONE_MOVABLE
during boot, to preserve existing behavior.

> 
> I understand that some region of boot RAM shouldn’t be allowed to be removed, 
> but such regions won’t be allowed to be offlined in first place since pages 
> cannot be migrated and isolated, example reserved pages.
> 
> So we’re trying to understand the reasoning for such a prevention put in 
> place for arm64 arch alone.

The primary reason is kexec. During kexec on arm64, the next kernel's memory map
is derived from firmware and not from the currently running kernel. So the next
kernel will crash if it accesses memory that has been removed in the running
kernel. Until kexec on arm64 changes substantially and takes into account the
real available memory on the current kernel, boot memory cannot be removed.
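
The veto itself is simple: on a going-offline notification, walk the sections in the requested range and refuse if any of them is boot memory. The following is a standalone sketch of that logic under stated assumptions; struct mock_section, the MOCK_NOTIFY_* values and mock_prevent_bootmem_offline() are hypothetical stand-ins for the kernel's section bookkeeping and notifier return codes, not the arm64 implementation.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the notifier return codes. */
enum { MOCK_NOTIFY_OK = 1, MOCK_NOTIFY_BAD = 2 };

/* Hypothetical per-section state: true if the section was present at boot. */
struct mock_section {
	bool early;
};

/*
 * Decline the offline request as soon as one section in the range is boot
 * memory; only ranges made up entirely of hot-added sections are allowed.
 */
static int mock_prevent_bootmem_offline(const struct mock_section *sections,
					 size_t nr_sections)
{
	for (size_t i = 0; i < nr_sections; i++) {
		if (sections[i].early)
			return MOCK_NOTIFY_BAD;
	}
	return MOCK_NOTIFY_OK;
}
```

Since the check only looks at how a section came into existence, it is independent of which zone the memory landed in, which is why a ZONE_MOVABLE range can still be declined.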

> 
> One possible way to solve this is by marking the required sections as 
> “non-early” by removing the SECTION_IS_EARLY bit in its section_mem_map.

That is too intrusive from core memory perspective.

> This puts these sections in the context of “memory hotpluggable” which can be 
> offlined-removed and added-onlined which are part of boot RAM itself and 
> doesn’t need any extra blocks to be hot added. This way of marking certain 
> sections as “non-early” could be exported so that module drivers can set the 
> required number of sections as “memory hotpluggable”. This could have certain 
> checks put in place to see which sections are allowed, example only movable 
> zone sections can be mar

Re: [PATCH 1/1] usb: serial: option: add Quectel EC200T module support

2020-10-18 Thread Greg Kroah-Hartman

On Mon, Oct 19, 2020 at 01:07:10AM +0800, septs wrote:
> Add usb product id of the Quectel EC200T module.
> 
> Signed-off-by: septs 

Also, this address doesn't match your "From:" line, which means we can't
take it anyway.

Re: [PATCH 1/1] usb: serial: option: add Quectel EC200T module support

2020-10-18 Thread Greg Kroah-Hartman

On Mon, Oct 19, 2020 at 01:07:10AM +0800, septs wrote:
> Add usb product id of the Quectel EC200T module.
> 
> Signed-off-by: septs 

As my bot said before, you need to use your "legal name" here.  Is this
how you sign documents?  If so, that's fine, but I have to ask.

thanks,

greg k-h

Re: [PATCH v3] PCI: cadence: Retrain Link to work around Gen2 training defect.

2020-10-18 Thread Kishon Vijay Abraham I

Hi Nadeem,

On 30/09/20 11:51 pm, Nadeem Athani wrote:
> Cadence controller will not initiate autonomous speed change if strapped
> as Gen2. The Retrain Link bit is set as quirk to enable this speed change.
> 
> Signed-off-by: Nadeem Athani 
> ---
> Changes in v3:
> - To set retrain link bit,checking device capability & link status.
> - 32bit read in place of 8bit.
> - Minor correction in patch comment.
> - Change in variable & macro name.
> Changes in v2:
> - 16bit read in place of 8bit.
>  drivers/pci/controller/cadence/pcie-cadence-host.c | 31 
> ++
>  drivers/pci/controller/cadence/pcie-cadence.h  |  9 ++-
>  2 files changed, 39 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/pci/controller/cadence/pcie-cadence-host.c 
> b/drivers/pci/controller/cadence/pcie-cadence-host.c
> index 4550e0d469ca..2b2ae4e18032 100644
> --- a/drivers/pci/controller/cadence/pcie-cadence-host.c
> +++ b/drivers/pci/controller/cadence/pcie-cadence-host.c
> @@ -77,6 +77,36 @@ static struct pci_ops cdns_pcie_host_ops = {
>   .write  = pci_generic_config_write,
>  };
>  
> +static void cdns_pcie_retrain(struct cdns_pcie *pcie)
> +{
> + u32 lnk_cap_sls, pcie_cap_off = CDNS_PCIE_RP_CAP_OFFSET;
> + u16 lnk_stat, lnk_ctl;
> +
> + if (!cdns_pcie_link_up(pcie))
> + return;
> +

Is there an IP version that can be used to check if that quirk is applicable?
> + /*
> +  * Set retrain bit if current speed is 2.5 GB/s,
> +  * but the PCIe root port support is > 2.5 GB/s.
> +  */
> +
> + lnk_cap_sls = cdns_pcie_readl(pcie, (CDNS_PCIE_RP_BASE + pcie_cap_off +
> +   PCI_EXP_LNKCAP));
> + if ((lnk_cap_sls & PCI_EXP_LNKCAP_SLS) <= PCI_EXP_LNKCAP_SLS_2_5GB)
> + return;
> +
> + lnk_stat = cdns_pcie_rp_readw(pcie, pcie_cap_off + PCI_EXP_LNKSTA);
> + if ((lnk_stat & PCI_EXP_LNKSTA_CLS) == PCI_EXP_LNKSTA_CLS_2_5GB) {
> + lnk_ctl = cdns_pcie_rp_readw(pcie,
> +  pcie_cap_off + PCI_EXP_LNKCTL);
> + lnk_ctl |= PCI_EXP_LNKCTL_RL;
> + cdns_pcie_rp_writew(pcie, pcie_cap_off + PCI_EXP_LNKCTL,
> + lnk_ctl);
> +
> + if (!cdns_pcie_link_up(pcie))

Should this rather be a cdns_pcie_host_wait_for_link()?

Thanks
Kishon
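
The difference between a single link-up check and waiting for the link is that retraining takes time, so a check made immediately after setting the Retrain Link bit may still see the link down. A generic bounded-poll pattern is sketched below in standalone C; struct mock_link, mock_link_up(), mock_wait_for_link() and the retry count are hypothetical and do not describe the actual cdns_pcie_host_wait_for_link() implementation.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical link model: reports "down" for a number of polls, then "up". */
struct mock_link {
	int polls_until_up;
};

static bool mock_link_up(struct mock_link *link)
{
	if (link->polls_until_up > 0) {
		link->polls_until_up--;
		return false;
	}
	return true;
}

/*
 * Poll the link state up to max_retries times. A real driver would sleep
 * between polls; the delay is omitted here to keep the sketch self-contained.
 */
static int mock_wait_for_link(struct mock_link *link, int max_retries)
{
	for (int i = 0; i < max_retries; i++) {
		if (mock_link_up(link))
			return 0;
	}
	return -ETIMEDOUT;
}
```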

> + return;
> + }
> +}
>  
>  static int cdns_pcie_host_init_root_port(struct cdns_pcie_rc *rc)
>  {
> @@ -115,6 +145,7 @@ static int cdns_pcie_host_init_root_port(struct 
> cdns_pcie_rc *rc)
>   cdns_pcie_rp_writeb(pcie, PCI_CLASS_PROG, 0);
>   cdns_pcie_rp_writew(pcie, PCI_CLASS_DEVICE, PCI_CLASS_BRIDGE_PCI);
>  
> + cdns_pcie_retrain(pcie);
>   return 0;
>  }
>  
> diff --git a/drivers/pci/controller/cadence/pcie-cadence.h 
> b/drivers/pci/controller/cadence/pcie-cadence.h
> index feed1e3038f4..5f1cf032ae15 100644
> --- a/drivers/pci/controller/cadence/pcie-cadence.h
> +++ b/drivers/pci/controller/cadence/pcie-cadence.h
> @@ -119,7 +119,7 @@
>   * Root Port Registers (PCI configuration space for the root port function)
>   */
>  #define CDNS_PCIE_RP_BASE0x0020
> -
> +#define CDNS_PCIE_RP_CAP_OFFSET 0xc0
>  
>  /*
>   * Address Translation Registers
> @@ -413,6 +413,13 @@ static inline void cdns_pcie_rp_writew(struct cdns_pcie 
> *pcie,
>   cdns_pcie_write_sz(addr, 0x2, value);
>  }
>  
> +static inline u16 cdns_pcie_rp_readw(struct cdns_pcie *pcie, u32 reg)
> +{
> + void __iomem *addr = pcie->reg_base + CDNS_PCIE_RP_BASE + reg;
> +
> + return cdns_pcie_read_sz(addr, 0x2);
> +}
> +
>  /* Endpoint Function register access */
>  static inline void cdns_pcie_ep_fn_writeb(struct cdns_pcie *pcie, u8 fn,
> u32 reg, u8 value)
>

RE: [PATCH] PCI: dwc: Added link up check in map_bus of dw_child_pcie_ops

2020-10-18 Thread Z.q. Hou

Hello Bjorn,

Thanks a lot for your comments!

> -Original Message-
> From: Bjorn Helgaas 
> Sent: October 16, 2020 6:48
> To: Z.q. Hou 
> Cc: linux-kernel@vger.kernel.org; linux-...@vger.kernel.org;
> r...@kernel.org; lorenzo.pieral...@arm.com; bhelg...@google.com;
> gustavo.pimen...@synopsys.com
> Subject: Re: [PATCH] PCI: dwc: Added link up check in map_bus of
> dw_child_pcie_ops
> 
> On Wed, Sep 16, 2020 at 01:41:30PM +0800, Zhiqiang Hou wrote:
> > From: Hou Zhiqiang 
> >
> > On NXP Layerscape platforms, it results in SError in the enumeration
> > of the PCIe controller, which is not connecting with an Endpoint
> > device. And it doesn't make sense to enumerate the Endpoints when the
> > PCIe link is down. So this patch added the link up check to avoid to
> > fire configuration transactions on link down bus.
> 
> Lorenzo already applied this, but a couple questions:
> 
> You call out NXP Layerscape specifically, but doesn't this affect other
> DWC-based platforms, too?  You later mentioned imx6, Kishon mentioned
> dra7xx, Michael mentioned ls1028a, Naresh mentioned ls2088 (probably
> both the same as your "NXP Layerscape").

For NXP Layerscape platforms (the ls1028a and ls2088a are also NXP Layerscape 
platforms), as the error response to AXI/AHB was enabled, it will get a UR error 
and trigger an SError on the AXI bus when it accesses a non-existent BDF on a 
link-down bus. I'm not clear about how it happens on dra7xx and imx6, since they 
don't enable the error response to AXI/AHB.

> 
> The backtrace below contains a bunch of irrelevant info.  The timestamps
> are pointless.  The backtrace past
> pci_scan_single_device+0x80/0x100 or so really doesn't add anything either.
> 
> It'd be nice to have a comment in the code because the code *looks* wrong
> and racy.  Without a hint, everybody who sees it will have to dig through
> the history to see why we tolerate the race.

Yes, agreed, but it seems the cause of the SError on dra7xx and imx6 is different 
from that on Layerscape platforms; we need to clarify that first.

Thanks,
Zhiqiang
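
To make the shape of the check concrete, here is a standalone sketch of a map_bus-style helper that refuses to hand out a config-space address while the link is down, together with the kind of comment requested above. Everything prefixed mock_ is hypothetical and only models the idea; the all-ones return value mirrors how a failed config read is conventionally reported, which is an assumption of this sketch rather than something shown in the thread.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical controller model: a link state plus a tiny config space. */
struct mock_pcie {
	bool link_up;
	uint8_t cfg[256];
};

static void *mock_map_bus(struct mock_pcie *pci, unsigned int devfn, int where)
{
	(void)devfn;	/* single-device model: devfn is not used */

	/*
	 * Racy by design: the link can still drop between this check and the
	 * actual access. The check only avoids the common case of issuing
	 * config transactions on a bus whose link never came up.
	 */
	if (!pci->link_up)
		return NULL;

	return &pci->cfg[where];
}

/* Reads the 32-bit word at offset 0; all ones means "no device". */
static uint32_t mock_read_id(struct mock_pcie *pci, unsigned int devfn)
{
	uint8_t *addr = mock_map_bus(pci, devfn, 0);

	if (!addr)
		return 0xffffffffu;

	return (uint32_t)addr[0] | ((uint32_t)addr[1] << 8) |
	       ((uint32_t)addr[2] << 16) | ((uint32_t)addr[3] << 24);
}
```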
 
> 
> > [0.807773] SError Interrupt on CPU2, code 0xbf02 -- SError
> > [0.807775] CPU: 2 PID: 1 Comm: swapper/0 Not tainted
> 5.9.0-rc5-next-20200914-1-gf965d3ec86fa #67
> > [0.807776] Hardware name: LS1046A RDB Board (DT)
> > [0.80] pstate: 2085 (nzCv daIf -PAN -UAO BTYPE=--)
> > [0.807778] pc : pci_generic_config_read+0x3c/0xe0
> > [0.807778] lr : pci_generic_config_read+0x24/0xe0
> > [0.807779] sp : 80001003b7b0
> > [0.807780] x29: 80001003b7b0 x28: 80001003ba74
> > [0.807782] x27: 000971d96800 x26: 00096e77e0a8
> > [0.807784] x25: 80001003b874 x24: 80001003b924
> > [0.807786] x23: 0004 x22: 
> > [0.807788] x21:  x20: 80001003b874
> > [0.807790] x19: 0004 x18: 
> > [0.807791] x17: 00c0 x16: fe0025981840
> > [0.807793] x15: b94c75b69948 x14: 62203a383634203a
> > [0.807795] x13: 666e6f635f726568 x12: 202c31203d207265
> > [0.807797] x11: 626d756e3e2d7375 x10: 656877202c307830
> > [0.807799] x9 : 203d206e66766564 x8 : 0908
> > [0.807801] x7 : 0908 x6 : 80001090
> > [0.807802] x5 : 00096e77e080 x4 : 
> > [0.807804] x3 : 0003 x2 : 84fa3440ff7e7000
> > [0.807806] x1 :  x0 : 800010034000
> > [0.807808] Kernel panic - not syncing: Asynchronous SError Interrupt
> > [0.807809] CPU: 2 PID: 1 Comm: swapper/0 Not tainted
> 5.9.0-rc5-next-20200914-1-gf965d3ec86fa #67
> > [0.807810] Hardware name: LS1046A RDB Board (DT)
> > [0.807811] Call trace:
> > [0.807812]  dump_backtrace+0x0/0x1c0
> > [0.807813]  show_stack+0x18/0x28
> > [0.807814]  dump_stack+0xd8/0x134
> > [0.807814]  panic+0x180/0x398
> > [0.807815]  add_taint+0x0/0xb0
> > [0.807816]  arm64_serror_panic+0x78/0x88
> > [0.807817]  do_serror+0x68/0x180
> > [0.807818]  el1_error+0x84/0x100
> > [0.807818]  pci_generic_config_read+0x3c/0xe0
> > [0.807819]  dw_pcie_rd_other_conf+0x78/0x110
> > [0.807820]  pci_bus_read_config_dword+0x88/0xe8
> > [0.807821]  pci_bus_generic_read_dev_vendor_id+0x30/0x1b0
> > [0.807822]  pci_bus_read_dev_vendor_id+0x4c/0x78
> > [0.807823]  pci_scan_single_device+0x80/0x100
> > [0.807824]  pci_scan_slot+0x38/0x130
> > [0.807825]  pci_scan_child_bus_extend+0x54/0x2a0
> > [0.807826]  pci_scan_child_bus+0x14/0x20
> > [0.807827]  pci_scan_bridge_extend+0x230/0x570
> > [0.807828]  pci_scan_child_bus_extend+0x134/0x2a0
> > [0.807829]  pci_scan_root_bus_bridge+0x64/0xf0
> > [0.807829]  pci_host_probe+0x18/0xc8
> > [0.807830]  dw_pcie_host_init+0x220/0x378
> > [0.807831]  ls_pcie_probe+0x104/0x140
> > [0.807832]  platform_drv_probe+0x54/0xa8
> > [0.807833]  really_probe+0x118/0x3e0
> > [0.807834]

[git pull] drm fixes for 5.10-rc1

2020-10-18 Thread Dave Airlie

Hi Linus,

Some fixes queued up already for i915 and amdgpu, I've also included
the fix for the clang warning you've seen.

Dave.

drm-next-2020-10-19:
drm fixes for 5.10-rc1

i915:
- Set all unused color plane offsets to ~0xfff again (Ville)
- Fix TGL DKL PHY DP vswing handling (Ville)

amdgpu:
- DCN clang warning fix
- eDP fix
- BACO fix
- Kernel documentation fixes
- SMU7 mclk fix
- VCN1 hw bug workaround

amdkfd:
- kvfree vs kfree fix

The following changes since commit 640eee067d9aae0bb98d8706001976ff1affaf00:

  Merge tag 'drm-misc-next-fixes-2020-10-13' of
git://anongit.freedesktop.org/drm/drm-misc into drm-next (2020-10-14
07:31:53 +1000)

are available in the Git repository at:

  git://anongit.freedesktop.org/drm/drm tags/drm-next-2020-10-19

for you to fetch changes up to 40b99050455b9a6cb8faf15dcd41888312184720:

  Merge tag 'drm-intel-next-fixes-2020-10-15' of
git://anongit.freedesktop.org/drm/drm-intel into drm-next (2020-10-19
09:21:59 +1000)


drm fixes for 5.10-rc1

i915:
- Set all unused color plane offsets to ~0xfff again (Ville)
- Fix TGL DKL PHY DP vswing handling (Ville)

amdgpu:
- DCN clang warning fix
- eDP fix
- BACO fix
- Kernel documentation fixes
- SMU7 mclk fix
- VCN1 hw bug workaround

amdkfd:
- kvfree vs kfree fix


Alex Deucher (1):
  drm/amdgpu/swsmu: init the baco mutex in early_init

Dave Airlie (2):
  Merge tag 'amd-drm-fixes-5.10-2020-10-14' of
git://people.freedesktop.org/~agd5f/linux into drm-next
  Merge tag 'drm-intel-next-fixes-2020-10-15' of
git://anongit.freedesktop.org/drm/drm-intel into drm-next

Eryk Brol (1):
  drm/amd/display: Fix incorrect dsc force enable logic

Evan Quan (1):
  drm/amd/pm: increase mclk switch threshold to 200 us

Kent Russell (1):
  drm/amdkfd: Use kvfree in destroy_crat_image

Mauro Carvalho Chehab (2):
  drm/amd/display: kernel-doc: document force_timing_sync
  docs: amdgpu: fix a warning when building the documentation

Rodrigo Siqueira (1):
  drm/amd/display: Fix module load hangs when connected to an eDP

Veerabadhran G (1):
  drm/amdgpu: vcn and jpeg ring synchronization

Ville Syrjälä (2):
  drm/i915: Fix TGL DKL PHY DP vswing handling
  drm/i915: Set all unused color plane offsets to ~0xfff again

 Documentation/gpu/amdgpu.rst   |  4 ++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c|  2 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.h|  1 +
 drivers/gpu/drm/amd/amdgpu/jpeg_v1_0.c | 24 +--
 drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c  | 28 ++
 drivers/gpu/drm/amd/amdgpu/vcn_v1_0.h  |  3 ++-
 drivers/gpu/drm/amd/amdkfd/kfd_crat.c  |  2 +-
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h  |  2 ++
 .../amd/display/amdgpu_dm/amdgpu_dm_mst_types.c|  2 +-
 drivers/gpu/drm/amd/display/dc/core/dc.c   | 10 
 .../gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c|  2 +-
 drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c  |  7 +++---
 drivers/gpu/drm/i915/display/intel_ddi.c   |  2 +-
 drivers/gpu/drm/i915/display/intel_display.c   | 17 -
 14 files changed, 72 insertions(+), 34 deletions(-)

Re: [PATCH] sched: cpufreq_schedutil: restore cached freq when next_f is not changed

2020-10-18 Thread Viresh Kumar

On 16-10-20, 11:17, Wei Wang wrote:
> We have the raw cached freq to reduce the chance in calling cpufreq
> driver where it could be costly in some arch/SoC.
> 
> Currently, the raw cached freq will be reset when next_f is changed for
> correctness. This patch changes it to maintain the cached value instead
> of dropping it to honor the purpose of the cached value.
> 
> This is adapted from https://android-review.googlesource.com/1352810/
> 
> Signed-off-by: Wei Wang 
> ---
>  kernel/sched/cpufreq_schedutil.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/kernel/sched/cpufreq_schedutil.c 
> b/kernel/sched/cpufreq_schedutil.c
> index 5ae7b4e6e8d6..e254745a82cb 100644
> --- a/kernel/sched/cpufreq_schedutil.c
> +++ b/kernel/sched/cpufreq_schedutil.c
> @@ -441,6 +441,7 @@ static void sugov_update_single(struct update_util_data 
> *hook, u64 time,
>   unsigned long util, max;
>   unsigned int next_f;
>   bool busy;
> + unsigned int cached_freq = sg_policy->cached_raw_freq;
>  
>   sugov_iowait_boost(sg_cpu, time, flags);
>   sg_cpu->last_update = time;
> @@ -464,8 +465,8 @@ static void sugov_update_single(struct update_util_data 
> *hook, u64 time,
>   if (busy && next_f < sg_policy->next_freq) {
>   next_f = sg_policy->next_freq;
>  
> - /* Reset cached freq as next_freq has changed */
> - sg_policy->cached_raw_freq = 0;
> + /* Restore cached freq as next_freq has changed */
> + sg_policy->cached_raw_freq = cached_freq;
>   }
>  
>   /*

Acked-by: Viresh Kumar 
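
The effect of the change can be modelled in a few lines of standalone C. This is a simplified sketch: struct mock_policy, mock_get_next_freq() and mock_update_single() are hypothetical, the raw-to-driver frequency mapping is treated as the identity, and the real governor has additional conditions that are left out here.

```c
#include <stdbool.h>

/* Hypothetical, reduced view of the governor's per-policy state. */
struct mock_policy {
	unsigned int next_freq;		/* last frequency handed to the driver */
	unsigned int cached_raw_freq;	/* raw value that produced next_freq */
};

/* Returns the cached result when the raw request has not changed. */
static unsigned int mock_get_next_freq(struct mock_policy *p, unsigned int raw)
{
	if (raw == p->cached_raw_freq)
		return p->next_freq;

	p->cached_raw_freq = raw;
	return raw;	/* assumption: raw maps 1:1 to a driver frequency */
}

static unsigned int mock_update_single(struct mock_policy *p,
				       unsigned int raw, bool busy)
{
	unsigned int cached_freq = p->cached_raw_freq;
	unsigned int next_f = mock_get_next_freq(p, raw);

	/* Do not lower the frequency of a CPU that has not been idle. */
	if (busy && next_f < p->next_freq) {
		next_f = p->next_freq;
		/* Restore rather than reset, so the cache still matches next_freq. */
		p->cached_raw_freq = cached_freq;
	}

	p->next_freq = next_f;
	return next_f;
}
```

With the restore, a later request carrying the same raw value as before the override still hits the cache instead of going through the lookup again.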

-- 
viresh

Re: [PATCH V2 1/2] opp: Allow dev_pm_opp_get_opp_table() to return -EPROBE_DEFER

2020-10-18 Thread Viresh Kumar

On 16-10-20, 12:12, Sudeep Holla wrote:
> On Fri, Oct 16, 2020 at 07:00:21AM +0100, Sudeep Holla wrote:
> > On Fri, Oct 16, 2020 at 09:54:34AM +0530, Viresh Kumar wrote:
> > > On 15-10-20, 19:05, Sudeep Holla wrote:
> > > > OK, this breaks with SCMI which doesn't provide clocks but manages OPPs
> > > > directly. Before this change clk_get(dev..) was allowed to fail and
> > > > -EPROBE_DEFER was not an error.
> > >
> > > I think the change in itself is fine. We should be returning from
> > > there if we get EPROBE_DEFER. The question is rather why are you
> > > getting EPROBE_DEFER here ?
> > >
> >
> > Ah OK, I didn't spend too much time, saw -EPROBE_DEFER, just reverted
> > this patch and it worked. I need to check it in detail yet.
> >
> 
> You confused me earlier. As I said there will be no clock provider
> registered for SCMI CPU/Dev DVFS.
>   opp_table->clk = clk_get(dev, NULL);
> will always return -EPROBE_DEFER as there is no clock provider for dev.
> But this change now propagates that error to caller of dev_pm_opp_add
> which means we can't add opp to a device if there are no clock providers.
> This breaks for DVFS which don't operate separately with clocks and
> regulators.

The CPU's DT node shouldn't have a clock property in such a case and I
would expect an error instead of EPROBE_DEFER then. Isn't that so?

-- 
viresh

Re: [PATCH] asm-generic: Force inlining of get_order() to work around gcc10 poor decision

2020-10-18 Thread Joel Stanley

On Sat, 17 Oct 2020 at 15:55, Christophe Leroy
 wrote:
>
> When building mpc885_ads_defconfig with gcc 10.1,
> the function get_order() appears 50 times in vmlinux:
>
> [linux]# ppc-linux-objdump -x vmlinux | grep get_order | wc -l
> 50
>
> [linux]# size vmlinux
>    text    data     bss     dec     hex filename
> 3842620  675624  135160 4653404  47015c vmlinux
>
> In the old days, marking a function 'static inline' was forcing
> GCC to inline, but since commit ac7c3e4ff401 ("compiler: enable
> CONFIG_OPTIMIZE_INLINING forcibly") GCC may decide to not inline
> a function.
>
> It looks like GCC 10 is taking poor decisions on this.
>
> get_order() compiles into the following tiny function,
> occupying 20 bytes of text.
>
> 007c :
>   7c:   38 63 ff ff addi    r3,r3,-1
>   80:   54 63 a3 3e rlwinm  r3,r3,20,12,31
>   84:   7c 63 00 34 cntlzw  r3,r3
>   88:   20 63 00 20 subfic  r3,r3,32
>   8c:   4e 80 00 20 blr
>
> By forcing get_order() to be __always_inline, the size of text is
> reduced by 1940 bytes, that is almost twice the space occupied by
> 50 times get_order()
>
> [linux-powerpc]# size vmlinux
>    text    data     bss     dec     hex filename
> 3840680  675588  135176 4651444  46f9b4 vmlinux

I see similar results with GCC 10.2 building for arm32. There are 143
instances of get_order with aspeed_g5_defconfig.

Before:
 9071838 2630138  186468 11888444 b5673c vmlinux
After:
 9069886 2630126  186468 11886480 b55f90 vmlinux

1952 bytes smaller with your patch applied. Did you raise this with
anyone from GCC?

Reviewed-by: Joel Stanley 



> Signed-off-by: Christophe Leroy 
> ---
>  include/asm-generic/getorder.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/include/asm-generic/getorder.h b/include/asm-generic/getorder.h
> index e9f20b813a69..f2979e3a96b6 100644
> --- a/include/asm-generic/getorder.h
> +++ b/include/asm-generic/getorder.h
> @@ -26,7 +26,7 @@
>   *
>   * The result is undefined if the size is 0.
>   */
> -static inline __attribute_const__ int get_order(unsigned long size)
> +static __always_inline __attribute_const__ int get_order(unsigned long size)
>  {
> if (__builtin_constant_p(size)) {
> if (!size)
> --
> 2.25.0
>

Re: [PATCH] [v5] wireless: Initial driver submission for pureLiFi STA devices

2020-10-18 Thread Joe Perches

On Mon, 2020-10-19 at 08:47 +0530, Srinivasan Raju wrote:
> This introduces the pureLiFi LiFi driver for LiFi-X, LiFi-XC
> and LiFi-XL USB devices.

Mostly trivial comments:

> diff --git a/drivers/net/wireless/purelifi/chip.c 
> b/drivers/net/wireless/purelifi/chip.c
[]
> +int purelifi_chip_set_rate(struct purelifi_chip *chip, u8 rate)
> +{
> + int r;
> + static struct purelifi_chip *chip_p;

Isn't chip_p pointless?

> +
> + if (chip)
> + chip_p = chip;
> + if (!chip_p)
> + return -EINVAL;
> +

chip_p is otherwise unused.

> diff --git a/drivers/net/wireless/purelifi/mac.c 
> b/drivers/net/wireless/purelifi/mac.c
[]
> +int purelifi_mac_init_hw(struct ieee80211_hw *hw)
> +{
> + int r;
> + struct purelifi_mac *mac = purelifi_hw_mac(hw);
> + struct purelifi_chip *chip = &mac->chip;
> +
> + r = purelifi_chip_init_hw(chip);
> + if (r) {
> + dev_warn(purelifi_mac_dev(mac), "init hw failed (%d)\n", r);
> + goto out;
> + }
> +
> + dev_dbg(purelifi_mac_dev(mac), "irq_disabled %d\n", irqs_disabled());
> +
> + r = regulatory_hint(hw->wiphy, "CA");
> +out:
> + return r;
> +}

Simpler without the out: label and a direct return:

	if (r) {
		dev_warn(...);
		return r;
	}

	...

	return regulatory_hint(hw->wiphy, "CA");
}

> +static int download_fpga(struct usb_interface *intf)
> +{
[]
> + if ((le16_to_cpu(udev->descriptor.idVendor) ==
> + PURELIFI_X_VENDOR_ID_0) &&
> + (le16_to_cpu(udev->descriptor.idProduct) ==
> + PURELIFI_X_PRODUCT_ID_0)) {
> + fw_name = "purelifi/li_fi_x/fpga.bit";
> + dev_info(&intf->dev, "bit file for X selected.\n");
> +
> + } else if ((le16_to_cpu(udev->descriptor.idVendor)) ==
> + PURELIFI_XC_VENDOR_ID_0 &&
> +(le16_to_cpu(udev->descriptor.idProduct) ==
> + PURELIFI_XC_PRODUCT_ID_0)) {
> + fw_name = "purelifi/li_fi_x/fpga_xc.bit";
> + dev_info(&intf->dev, "bit filefor XC selected.\n");

Inconsistent dev_info spacing: "file for" vs "filefor"

> + for (fw_data_i = 0; fw_data_i < fw->size;) {
> + int tbuf_idx;
> +
> + if ((fw->size - fw_data_i) < blk_tran_len)
> + blk_tran_len = fw->size - fw_data_i;
> +
> + fw_data = kmalloc(blk_tran_len, GFP_KERNEL);
> +
> + memcpy(fw_data, &fw->data[fw_data_i], blk_tran_len);
> +
> + for (tbuf_idx = 0; tbuf_idx < blk_tran_len; tbuf_idx++) {
> + fw_data[tbuf_idx] =
> + ((fw_data[tbuf_idx] & 128) >> 7) |
> + ((fw_data[tbuf_idx] &  64) >> 5) |
> + ((fw_data[tbuf_idx] &  32) >> 3) |
> + ((fw_data[tbuf_idx] &  16) >> 1) |
> + ((fw_data[tbuf_idx] &   8) << 1) |
> + ((fw_data[tbuf_idx] &   4) << 3) |
> + ((fw_data[tbuf_idx] &   2) << 5) |
> + ((fw_data[tbuf_idx] &   1) << 7);
> + }

pity there isn't an u8_bit_reverse function/mechanism
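
For what it's worth, the kernel does have bitrev8() in <linux/bitrev.h>,
which looks like a fit for the open-coded shifts above. A rough userspace
sketch of the same per-byte reversal follows; the names u8_bit_reverse and
reverse_bits_in_place are hypothetical (the first is borrowed from the
comment above) and are not existing kernel APIs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: reverse the bit order of one byte (bit 7 <-> bit 0). */
static uint8_t u8_bit_reverse(uint8_t b)
{
	b = (uint8_t)(((b & 0xF0) >> 4) | ((b & 0x0F) << 4)); /* swap nibbles */
	b = (uint8_t)(((b & 0xCC) >> 2) | ((b & 0x33) << 2)); /* swap bit pairs */
	b = (uint8_t)(((b & 0xAA) >> 1) | ((b & 0x55) << 1)); /* swap neighbours */
	return b;
}

/* Reverse the bits of every byte in a buffer, in place. */
static void reverse_bits_in_place(uint8_t *buf, size_t len)
{
	size_t i;

	assert(buf || len == 0);
	for (i = 0; i < len; i++)
		buf[i] = u8_bit_reverse(buf[i]);
}
```

In the driver itself the inner loop would presumably shrink to a single
bitrev8() call per byte, assuming the firmware really needs a plain bit
reversal as the shifts suggest.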

> +static void pretty_print_mac(struct usb_interface *intf, char *serial_number,
> +  unsigned char *hw_address
> + )
> +{
> + unsigned char i;
> +
> + for (i = 0; i < ETH_ALEN; i++)
> + dev_info(&intf->dev, "hw_address[%d]=%x\n", i, hw_address[i]);

I don't think this is prettier than %pM
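
With %pM the whole loop could become one dev_info() call that prints the
address as colon-separated hex. As a userspace illustration of the output
that format produces, here is a minimal sketch; format_mac and MAC_STR_LEN
are hypothetical names used only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define MAC_STR_LEN 18	/* "xx:xx:xx:xx:xx:xx" plus the NUL terminator */

/* Hypothetical helper: format a 6-byte address as colon-separated hex. */
static int format_mac(char *out, size_t out_len, const unsigned char *addr)
{
	assert(out && addr);
	return snprintf(out, out_len, "%02x:%02x:%02x:%02x:%02x:%02x",
			addr[0], addr[1], addr[2], addr[3], addr[4], addr[5]);
}
```

One line per device in the log is also easier to grep than six lines of
hw_address[i]=... output.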

> +}
> +
> +static int upload_mac_and_serial_number(struct usb_interface *intf,
> + unsigned char *hw_address,
> + unsigned char *serial_number)
> +{
[]
> + do {
> + unsigned char buf[8];
> +
> + msleep(200);
> +
> + send_vendor_request(udev, 0x40, buf, sizeof(buf));
> + flash.enabled = buf[0] & 0xFF;
> +
> + if (flash.enabled) {
> + flash.sectors = ((buf[6] & 255) << 24) |

buf is unsigned char[], rather pointless using & 255

> diff --git a/drivers/net/wireless/purelifi/usb.h 
> b/drivers/net/wireless/purelifi/usb.h
[]
> +struct station_t {
> +   //  7...3|2 | 1 | 0 |
> +   // Reserved  | Hearbeat | FIFO full | Connected |

heartbeat

Re: [PATCH v4 09/15] ASoC: dt-bindings: audio-graph: Convert bindings to json-schema

2020-10-18 Thread Kuninori Morimoto



Hi Sameer

> >> Convert device tree bindings of audio graph card to YAML format. Also
> >> expose some common definitions which can be used by similar graph based
> >> audio sound cards.
> >> 
> >> Signed-off-by: Sameer Pujar 
> >> Cc: Kuninori Morimoto 
> >> ---
> > I'm posting this patch to Rob & DT ML.
> > Not yet accepted, though..
> 
> Thanks for letting me know. I guess below is your patch,
> http://patchwork.ozlabs.org/project/devicetree-bindings/patch/877dtlvsxk.wl-kuninori.morimoto...@renesas.com/
> Do you have plans to resend this or send next revision?
> 
> I can drop my patch once yours is merged and refer the same for Tegra
> audio graph card.

I'm waiting for a response from Rob.
It is the merge window now. I will re-post it, even without his response,
once -rc1 is released.

Thank you for your help !!

Best regards
---
Kuninori Morimoto

Re: [PATCH v4 09/15] ASoC: dt-bindings: audio-graph: Convert bindings to json-schema

2020-10-18 Thread Sameer Pujar


Hi Morimoto-san,


>> Convert device tree bindings of audio graph card to YAML format. Also
>> expose some common definitions which can be used by similar graph based
>> audio sound cards.
>>
>> Signed-off-by: Sameer Pujar 
>> Cc: Kuninori Morimoto 
>> ---
>
> I'm posting this patch to Rob & DT ML.
> Not yet accepted, though..


Thanks for letting me know. I guess below is your patch,
http://patchwork.ozlabs.org/project/devicetree-bindings/patch/877dtlvsxk.wl-kuninori.morimoto...@renesas.com/
Do you have plans to resend this or send next revision?

I can drop my patch once yours is merged and refer the same for Tegra 
audio graph card.


Thanks,
Sameer.

[PATCH 2/2] powerpc/smp: Use GFP_ATOMIC while allocating tmp mask

2020-10-18 Thread Srikar Dronamraju

Qian Cai reported a regression where CPU hotplug fails with the latest
powerpc/next:

BUG: sleeping function called from invalid context at mm/slab.h:494
in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 0, name: swapper/88
no locks held by swapper/88/0.
irq event stamp: 18074448
hardirqs last  enabled at (18074447): [] 
tick_nohz_idle_enter+0x9c/0x110
hardirqs last disabled at (18074448): [] do_idle+0x138/0x3b0
do_idle at kernel/sched/idle.c:253 (discriminator 1)
softirqs last  enabled at (18074440): [] 
irq_enter_rcu+0x94/0xa0
softirqs last disabled at (18074439): [] 
irq_enter_rcu+0x70/0xa0
CPU: 88 PID: 0 Comm: swapper/88 Tainted: GW 
5.9.0-rc8-next-20201007 #1
Call Trace:
[c0002a4bfcf0] [c0649e98] dump_stack+0xec/0x144 (unreliable)
[c0002a4bfd30] [c00f6c34] ___might_sleep+0x2f4/0x310
[c0002a4bfdb0] [c0354f94] 
slab_pre_alloc_hook.constprop.82+0x124/0x190
[c0002a4bfe00] [c035e9e8] __kmalloc_node+0x88/0x3a0
slab_alloc_node at mm/slub.c:2817
(inlined by) __kmalloc_node at mm/slub.c:4013
[c0002a4bfe80] [c06494d8] alloc_cpumask_var_node+0x38/0x80
kmalloc_node at include/linux/slab.h:577
(inlined by) alloc_cpumask_var_node at lib/cpumask.c:116
[c0002a4bfef0] [c003eedc] start_secondary+0x27c/0x800
update_mask_by_l2 at arch/powerpc/kernel/smp.c:1267
(inlined by) add_cpu_to_masks at arch/powerpc/kernel/smp.c:1387
(inlined by) start_secondary at arch/powerpc/kernel/smp.c:1420
[c0002a4bff90] [c000c468] start_secondary_resume+0x10/0x14

Allocating a temporary mask while performing a CPU hotplug operation
with CONFIG_CPUMASK_OFFSTACK enabled leads to calling a sleepable
function from an atomic context. Fix this by allocating the temporary
mask with the GFP_ATOMIC flag. Also, instead of allocating twice,
allocate the mask in the caller so that we only have to allocate once.
If the allocation fails, assume the mask to be the same as the sibling mask,
which will make the scheduler drop this domain for this CPU.

Fixes: 70a94089d7f7 ("powerpc/smp: Optimize update_coregroup_mask")
Fixes: 3ab33d6dc3e9 ("powerpc/smp: Optimize update_mask_by_l2")
Reported-by: Qian Cai 
Signed-off-by: Srikar Dronamraju 
Cc: linuxppc-dev 
Cc: LKML 
Cc: Michael Ellerman 
Cc: Nathan Lynch 
Cc: Gautham R Shenoy 
Cc: Ingo Molnar 
Cc: Peter Zijlstra 
Cc: Valentin Schneider 
Cc: Qian Cai 
---
Changelog v1->v2:
https://lore.kernel.org/linuxppc-dev/20201008034240.34059-1-sri...@linux.vnet.ibm.com/t/#u
Updated 2nd patch based on comments from Michael Ellerman
- Remove the WARN_ON.
- Handle allocation failures in a more subtle fashion
- Allocate in the caller so that we allocate once.
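
The "allocate once in the caller, degrade on failure" idea above can be
sketched in userspace as follows. This is only an illustration under heavy
simplification: plain 32-bit masks stand in for cpumasks, malloc() stands in
for the GFP_ATOMIC allocation, and every name ending in _sketch is
hypothetical and not part of the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Helper uses the caller's scratch mask; NULL means the allocation failed. */
static bool update_mask_sketch(uint32_t siblings, uint32_t online,
			       uint32_t *scratch, uint32_t *out)
{
	assert(out);
	if (!scratch) {
		*out = siblings;	/* fallback: same as the sibling mask */
		return false;
	}
	*scratch = online & ~siblings;	/* candidates not yet covered */
	*out = siblings | *scratch;
	return true;
}

/* Caller allocates once and shares the scratch mask between both helpers. */
static void add_cpu_sketch(uint32_t siblings, uint32_t online, bool fail_alloc,
			   uint32_t *l2_mask, uint32_t *coregroup_mask)
{
	uint32_t *scratch = fail_alloc ? NULL : malloc(sizeof(*scratch));

	update_mask_sketch(siblings, online, scratch, l2_mask);
	update_mask_sketch(siblings, online, scratch, coregroup_mask);
	free(scratch);	/* free(NULL) is a no-op */
}
```

The point of the shape is that a failed allocation is visible to both
helpers as a NULL scratch pointer, so neither has to allocate or warn on
its own.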

 arch/powerpc/kernel/smp.c | 57 +--
 1 file changed, 31 insertions(+), 26 deletions(-)

diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index a864b9b3228c..028479e9b66b 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -1257,38 +1257,33 @@ static struct device_node *cpu_to_l2cache(int cpu)
return cache;
 }
 
-static bool update_mask_by_l2(int cpu)
+static bool update_mask_by_l2(int cpu, cpumask_var_t *mask)
 {
struct cpumask *(*submask_fn)(int) = cpu_sibling_mask;
struct device_node *l2_cache, *np;
-   cpumask_var_t mask;
int i;
 
if (has_big_cores)
submask_fn = cpu_smallcore_mask;
 
l2_cache = cpu_to_l2cache(cpu);
-   if (!l2_cache) {
-   /*
-* If no l2cache for this CPU, assume all siblings to share
-* cache with this CPU.
-*/
+   if (!l2_cache || !*mask) {
+   /* Assume only core siblings share cache with this CPU */
for_each_cpu(i, submask_fn(cpu))
set_cpus_related(cpu, i, cpu_l2_cache_mask);
 
return false;
}
 
-   alloc_cpumask_var_node(&mask, GFP_KERNEL, cpu_to_node(cpu));
-   cpumask_and(mask, cpu_online_mask, cpu_cpu_mask(cpu));
+   cpumask_and(*mask, cpu_online_mask, cpu_cpu_mask(cpu));
 
/* Update l2-cache mask with all the CPUs that are part of submask */
or_cpumasks_related(cpu, cpu, submask_fn, cpu_l2_cache_mask);
 
/* Skip all CPUs already part of current CPU l2-cache mask */
-   cpumask_andnot(mask, mask, cpu_l2_cache_mask(cpu));
+   cpumask_andnot(*mask, *mask, cpu_l2_cache_mask(cpu));
 
-   for_each_cpu(i, mask) {
+   for_each_cpu(i, *mask) {
/*
 * when updating the marks the current CPU has not been marked
 * online, but we need to update the cache masks
@@ -1298,15 +1293,14 @@ static bool update_mask_by_l2(int cpu)
/* Skip all CPUs already part of current CPU l2-cache */
if (np == l2_cache) {
or_cpumasks_related(cpu, i, submask_fn, 
cpu_l2_cache_mask);
-   cpumask_andnot(mask, mask, submask_f

fs/btrfs/volumes.c:888:50: sparse: sparse: incorrect type in argument 1 (different address spaces)

2020-10-18 Thread kernel test robot

tree:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 
master
head:   7cf726a59435301046250c42131554d9ccc566b8
commit: 8d1a7aae89dc0c41ffb76fe1007dbba59d13881b btrfs: annotate device name 
rcu_string with __rcu
date:   12 days ago
config: s390-randconfig-s032-20201019 (attached as .config)
compiler: s390-linux-gcc (GCC) 9.3.0
reproduce:
wget 
https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
# apt-get install sparse
# sparse version: v0.6.3-rc1-2-g368fd9ce-dirty
# 
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=8d1a7aae89dc0c41ffb76fe1007dbba59d13881b
git remote add linus 
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
git fetch --no-tags linus master
git checkout 8d1a7aae89dc0c41ffb76fe1007dbba59d13881b
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross C=1 
CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' ARCH=s390 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 


"sparse warnings: (new ones prefixed by >>)"
   fs/btrfs/volumes.c:374:31: sparse: sparse: incorrect type in argument 1 
(different address spaces) @@ expected struct rcu_string *str @@ got 
struct rcu_string [noderef] __rcu *name @@
   fs/btrfs/volumes.c:374:31: sparse: expected struct rcu_string *str
   fs/btrfs/volumes.c:374:31: sparse: got struct rcu_string [noderef] __rcu 
*name
   fs/btrfs/volumes.c:631:43: sparse: sparse: incorrect type in argument 1 
(different address spaces) @@ expected char const *device_path @@ got 
char [noderef] __rcu * @@
   fs/btrfs/volumes.c:631:43: sparse: expected char const *device_path
   fs/btrfs/volumes.c:631:43: sparse: got char [noderef] __rcu *
>> fs/btrfs/volumes.c:888:50: sparse: sparse: incorrect type in argument 1 
>> (different address spaces) @@ expected char const *s1 @@ got char 
>> [noderef] __rcu * @@
>> fs/btrfs/volumes.c:888:50: sparse: expected char const *s1
   fs/btrfs/volumes.c:888:50: sparse: got char [noderef] __rcu *
   fs/btrfs/volumes.c:963:39: sparse: sparse: incorrect type in argument 1 
(different address spaces) @@ expected struct rcu_string *str @@ got 
struct rcu_string [noderef] __rcu *name @@
   fs/btrfs/volumes.c:963:39: sparse: expected struct rcu_string *str
   fs/btrfs/volumes.c:963:39: sparse: got struct rcu_string [noderef] __rcu 
*name
   fs/btrfs/volumes.c:1018:58: sparse: sparse: incorrect type in argument 1 
(different address spaces) @@ expected char const *src @@ got char 
[noderef] __rcu * @@
   fs/btrfs/volumes.c:1018:58: sparse: expected char const *src
   fs/btrfs/volumes.c:1018:58: sparse: got char [noderef] __rcu *
   fs/btrfs/volumes.c:2165:49: sparse: sparse: incorrect type in argument 3 
(different address spaces) @@ expected char const *device_path @@ got 
char [noderef] __rcu * @@
   fs/btrfs/volumes.c:2165:49: sparse: expected char const *device_path
   fs/btrfs/volumes.c:2165:49: sparse: got char [noderef] __rcu *
   fs/btrfs/volumes.c:2273:41: sparse: sparse: incorrect type in argument 3 
(different address spaces) @@ expected char const *device_path @@ got 
char [noderef] __rcu * @@
   fs/btrfs/volumes.c:2273:41: sparse: expected char const *device_path
   fs/btrfs/volumes.c:2273:41: sparse: got char [noderef] __rcu *

vim +888 fs/btrfs/volumes.c

1362089d2ad7e20 Nikolay Borisov   2020-01-10  745  
1362089d2ad7e20 Nikolay Borisov   2020-01-10  746  static struct 
btrfs_fs_devices *find_fsid_reverted_metadata(
1362089d2ad7e20 Nikolay Borisov   2020-01-10  747   
struct btrfs_super_block *disk_super)
1362089d2ad7e20 Nikolay Borisov   2020-01-10  748  {
1362089d2ad7e20 Nikolay Borisov   2020-01-10  749   struct 
btrfs_fs_devices *fs_devices;
1362089d2ad7e20 Nikolay Borisov   2020-01-10  750  
1362089d2ad7e20 Nikolay Borisov   2020-01-10  751   /*
1362089d2ad7e20 Nikolay Borisov   2020-01-10  752* Handle the 
case where the scanned device is part of an fs whose last
1362089d2ad7e20 Nikolay Borisov   2020-01-10  753* metadata 
UUID change reverted it to the original FSID. At the same
1362089d2ad7e20 Nikolay Borisov   2020-01-10  754* time * 
fs_devices was first created by another constitutent device
1362089d2ad7e20 Nikolay Borisov   2020-01-10  755* which didn't 
fully observe the operation. This results in an
1362089d2ad7e20 Nikolay Borisov   2020-01-10  756* 
btrfs_fs_devices created with metadata/fsid different AND
1362089d2ad7e20 Nikolay Borisov   2020-01-10  757* 
btrfs_fs_devices::fsid_change set AND the metadata_uuid of the
1362089

[PATCH 1/2] powerpc/smp: Remove unnecessary variable

2020-10-18 Thread Srikar Dronamraju

Commit 3ab33d6dc3e9 ("powerpc/smp: Optimize update_mask_by_l2")
introduced submask_fn in update_mask_by_l2 to track the right submask.
However commit f6606cfdfbcd ("powerpc/smp: Dont assume l2-cache to be
superset of sibling") introduced sibling_mask in update_mask_by_l2 to
track the same submask. Remove sibling_mask in favour of submask_fn.

Signed-off-by: Srikar Dronamraju 
Cc: linuxppc-dev 
Cc: LKML 
Cc: Michael Ellerman 
Cc: Nathan Lynch 
Cc: Gautham R Shenoy 
Cc: Ingo Molnar 
Cc: Peter Zijlstra 
Cc: Valentin Schneider 
Cc: Qian Cai 
---
 arch/powerpc/kernel/smp.c | 13 -
 1 file changed, 4 insertions(+), 9 deletions(-)

diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index 8d1c401f4617..a864b9b3228c 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -1264,18 +1264,16 @@ static bool update_mask_by_l2(int cpu)
cpumask_var_t mask;
int i;
 
+   if (has_big_cores)
+   submask_fn = cpu_smallcore_mask;
+
l2_cache = cpu_to_l2cache(cpu);
if (!l2_cache) {
-   struct cpumask *(*sibling_mask)(int) = cpu_sibling_mask;
-
/*
 * If no l2cache for this CPU, assume all siblings to share
 * cache with this CPU.
 */
-   if (has_big_cores)
-   sibling_mask = cpu_smallcore_mask;
-
-   for_each_cpu(i, sibling_mask(cpu))
+   for_each_cpu(i, submask_fn(cpu))
set_cpus_related(cpu, i, cpu_l2_cache_mask);
 
return false;
@@ -1284,9 +1282,6 @@ static bool update_mask_by_l2(int cpu)
alloc_cpumask_var_node(&mask, GFP_KERNEL, cpu_to_node(cpu));
cpumask_and(mask, cpu_online_mask, cpu_cpu_mask(cpu));
 
-   if (has_big_cores)
-   submask_fn = cpu_smallcore_mask;
-
/* Update l2-cache mask with all the CPUs that are part of submask */
or_cpumasks_related(cpu, cpu, submask_fn, cpu_l2_cache_mask);
 
-- 
2.18.2

[PATCH v2 0/2] Fixes for coregroup

2020-10-18 Thread Srikar Dronamraju

These patches fix problems introduced by the coregroup patches.
The first patch removes a redundant variable.
The second patch allows booting with CONFIG_CPUMASK_OFFSTACK enabled.

Changelog v1->v2:
https://lore.kernel.org/linuxppc-dev/20201008034240.34059-1-sri...@linux.vnet.ibm.com/t/#u
1. 1st patch was not part of previous posting.
2. Updated 2nd patch based on comments from Michael Ellerman

Cc: linuxppc-dev 
Cc: LKML 
Cc: Michael Ellerman 
Cc: Nathan Lynch 
Cc: Gautham R Shenoy 
Cc: Ingo Molnar 
Cc: Peter Zijlstra 
Cc: Valentin Schneider 
Cc: Qian Cai 

Srikar Dronamraju (2):
  powerpc/smp: Remove unnecessary variable
  powerpc/smp: Use GFP_ATOMIC while allocating tmp mask

 arch/powerpc/kernel/smp.c | 70 +++
 1 file changed, 35 insertions(+), 35 deletions(-)

-- 
2.18.2

Re: [PATCH 0/2] UIO support for dfl devices

2020-10-18 Thread Xu Yilun

On Fri, Oct 16, 2020 at 09:40:03AM -0700, Tom Rix wrote:
> 
> On 10/15/20 11:02 PM, Xu Yilun wrote:
> > This patchset supports some dfl device drivers written in userspace.
> >
> > The usage is like:
> >
> >  # echo dfl_dev.1 > /sys/bus/dfl/drivers//unbind
> >  # echo dfl-uio-pdev > /sys/bus/dfl/devices/dfl_dev.1/driver_override
> >  # echo dfl_dev.1 > /sys/bus/dfl/drivers_probe
> >
> >
> > Xu Yilun (2):
> >   fpga: dfl: add driver_override support
> >   fpga: dfl: add the userspace I/O device support for DFL devices
> >
> >  Documentation/ABI/testing/sysfs-bus-dfl | 28 +--
> >  drivers/fpga/Kconfig| 10 
> >  drivers/fpga/Makefile   |  1 +
> >  drivers/fpga/dfl-uio-pdev.c | 83 
> > +
> >  drivers/fpga/dfl.c  | 54 -
> >  include/linux/dfl.h |  2 +
> >  6 files changed, 173 insertions(+), 5 deletions(-)
> >  create mode 100644 drivers/fpga/dfl-uio-pdev.c
> 
> This is a neat new feature.
> 
> Should something be added to Documentation/fpga/dfl.rst ?

Yes, I could add the Doc.

> 
> Overall, patchset looks good.
> 
> Tom

Re: [PATCH 2/2] fpga: dfl: add the userspace I/O device support for DFL devices

2020-10-18 Thread Xu Yilun

On Fri, Oct 16, 2020 at 09:36:00AM -0700, Tom Rix wrote:
> 
> On 10/15/20 11:02 PM, Xu Yilun wrote:
> > This patch supports the DFL drivers be written in userspace. This is
> > realized by exposing the userspace I/O device interfaces. The driver
> > leverages the uio_pdrv_genirq, it adds the uio_pdrv_genirq platform
> > device with the DFL device's resources, and let the generic UIO platform
> > device driver provide support to userspace access to kernel interrupts
> > and memory locations.
> >
> > Signed-off-by: Xu Yilun 
> > ---
> >  drivers/fpga/Kconfig| 10 ++
> >  drivers/fpga/Makefile   |  1 +
> >  drivers/fpga/dfl-uio-pdev.c | 83 
> > +
> >  3 files changed, 94 insertions(+)
> >  create mode 100644 drivers/fpga/dfl-uio-pdev.c
> >
> > diff --git a/drivers/fpga/Kconfig b/drivers/fpga/Kconfig
> > index 5d7f0ae..e054722 100644
> > --- a/drivers/fpga/Kconfig
> > +++ b/drivers/fpga/Kconfig
> > @@ -202,6 +202,16 @@ config FPGA_DFL_NIOS_INTEL_PAC_N3000
> >   the card. It also instantiates the SPI master (spi-altera) for
> >   the card's BMC (Board Management Controller).
> >  
> > +config FPGA_DFL_UIO_PDEV
> > +   tristate "FPGA DFL Driver for Userspace I/O platform devices"
> > +   depends on FPGA_DFL && UIO_PDRV_GENIRQ
> > +   help
> > + Enable this to allow some DFL drivers be written in userspace. It
> > + adds the uio_pdrv_genirq platform device with the DFL device's
> > + resources, and let the generic UIO platform device driver provide
> 'and lets the'

Yes.

> > + support to userspace access to kernel interrupts and memory
> > + locations.
> > +
> >  config FPGA_DFL_PCI
> > tristate "FPGA DFL PCIe Device Driver"
> > depends on PCI && FPGA_DFL
> > diff --git a/drivers/fpga/Makefile b/drivers/fpga/Makefile
> > index 18dc9885..e07b3d5 100644
> > --- a/drivers/fpga/Makefile
> > +++ b/drivers/fpga/Makefile
> > @@ -45,6 +45,7 @@ dfl-afu-objs := dfl-afu-main.o dfl-afu-region.o 
> > dfl-afu-dma-region.o
> >  dfl-afu-objs += dfl-afu-error.o
> >  
> >  obj-$(CONFIG_FPGA_DFL_NIOS_INTEL_PAC_N3000)+= dfl-n3000-nios.o
> > +obj-$(CONFIG_FPGA_DFL_UIO_PDEV)+= dfl-uio-pdev.o
> >  
> >  # Drivers for FPGAs which implement DFL
> >  obj-$(CONFIG_FPGA_DFL_PCI) += dfl-pci.o
> > diff --git a/drivers/fpga/dfl-uio-pdev.c b/drivers/fpga/dfl-uio-pdev.c
> > new file mode 100644
> > index 000..d35b846
> > --- /dev/null
> > +++ b/drivers/fpga/dfl-uio-pdev.c
> > @@ -0,0 +1,83 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > + * DFL driver for Userspace I/O platform devices
> > + *
> > + * Copyright (C) 2020 Intel Corporation, Inc.
> > + */
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +
> > +#define DRIVER_NAME "dfl-uio-pdev"
> > +
> > +static int dfl_uio_pdev_probe(struct dfl_device *ddev)
> > +{
> > +   struct device *dev = &ddev->dev;
> > +   struct platform_device_info pdevinfo = { 0 };
> > +   struct uio_info uio_pdata = { 0 };
> > +   struct platform_device *uio_pdev;
> > +   struct resource *res;
> > +   int i, idx = 0;
> 
> idx is not needed.

I could remove the idx. But I think I could ++res during the resource
filling.
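
A rough sketch of what filling the array with a moving pointer could look
like, written as standalone userspace C. struct res_sketch, SKETCH_MEM,
SKETCH_IRQ and fill_resources_sketch are hypothetical stand-ins for struct
resource, the IORESOURCE_* flags and the probe code; they are assumptions
made for illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the kernel's struct resource. */
struct res_sketch {
	unsigned long start;
	unsigned long end;
	unsigned int flags;
};

#define SKETCH_MEM 0x1u	/* stands in for IORESOURCE_MEM */
#define SKETCH_IRQ 0x2u	/* stands in for IORESOURCE_IRQ */

/*
 * Fill one memory entry followed by one entry per interrupt, advancing a
 * cursor pointer instead of keeping an index. Returns the entry count.
 * The caller must provide room for num_irqs + 1 entries.
 */
static size_t fill_resources_sketch(struct res_sketch *res,
				    unsigned long mmio_start,
				    unsigned long mmio_end,
				    const int *irqs, size_t num_irqs)
{
	struct res_sketch *r = res;
	size_t i;

	assert(res && (irqs || num_irqs == 0));

	r->flags = SKETCH_MEM;
	r->start = mmio_start;
	r->end = mmio_end;
	r++;

	for (i = 0; i < num_irqs; i++, r++) {
		r->flags = SKETCH_IRQ;
		r->start = (unsigned long)irqs[i];
		r->end = (unsigned long)irqs[i];
	}

	return (size_t)(r - res);
}
```

The cursor is kept separate from the base pointer on purpose: the probe
function still needs the original res for pdevinfo.res and for kfree().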

Thanks,
Yilun

> 
> > +
> > +   pdevinfo.name = "uio_pdrv_genirq";
> > +
> > +   res = kcalloc(ddev->num_irqs + 1, sizeof(*res), GFP_KERNEL);
> > +   if (!res)
> > +   return -ENOMEM;
> > +
> > +   res[idx].parent = &ddev->mmio_res;
> res[0].parent =
> > +   res[idx].flags = IORESOURCE_MEM;
> > +   res[idx].start = ddev->mmio_res.start;
> > +   res[idx].end = ddev->mmio_res.end;
> > +   ++idx;
> > +
> > +   /* then add irq resource */
> > +   for (i = 0; i < ddev->num_irqs; i++) {
> > +   res[idx].flags = IORESOURCE_IRQ;
> 
> res[i+1].flags =
> 
> Tom
> 
> > +   res[idx].start = ddev->irqs[i];
> > +   res[idx].end = ddev->irqs[i];
> > +   ++idx;
> > +   }
> > +
> > +   uio_pdata.name = DRIVER_NAME;
> > +   uio_pdata.version = "0";
> > +
> > +   pdevinfo.res = res;
> > +   pdevinfo.num_res = idx;
> > +   pdevinfo.parent = &ddev->dev;
> > +   pdevinfo.id = PLATFORM_DEVID_AUTO;
> > +   pdevinfo.data = &uio_pdata;
> > +   pdevinfo.size_data = sizeof(uio_pdata);
> > +
> > +   uio_pdev = platform_device_register_full(&pdevinfo);
> > +   if (!IS_ERR(uio_pdev))
> > +   dev_set_drvdata(dev, uio_pdev);
> > +
> > +   kfree(res);
> > +
> > +   return PTR_ERR_OR_ZERO(uio_pdev);
> > +}
> > +
> > +static void dfl_uio_pdev_remove(struct dfl_device *ddev)
> > +{
> > +   struct platform_device *uio_pdev = dev_get_drvdata(&ddev->dev);
> > +
> > +   platform_device_unregister(uio_pdev);
> > +}
> > +
> > +static struct dfl_driver dfl_uio_pdev_driver = {
> > +   .drv= {
> > +   .name   = DRIVER_NAME,
> > +   },
> > +   .probe  = dfl_uio_pdev_probe,
> > +   .remove = dfl_uio_pdev_remove,
> > +};
> > +module_dfl_driver(dfl_uio_pdev_driver);
> > +
> > +MODULE_DESCRIPTION("DFL drive

[PATCH v4 1/4] venus: core: change clk enable and disable order in resume and suspend

2020-10-18 Thread Mansur Alisha Shaik

Currently the video driver votes after clk enable and unvotes
before clk disable. This is incorrect: the video driver should vote
before clk enable and unvote after clk disable.

Correct this by changing the order of clk enable and clk disable.

Fixes: 07f8f22a33a9e ("media: venus: core: remove CNOC voting while device suspend")
Signed-off-by: Mansur Alisha Shaik 
Reviewed-by: Stephen Boyd 
---
 drivers/media/platform/qcom/venus/core.c | 17 ++---
 1 file changed, 10 insertions(+), 7 deletions(-)

diff --git a/drivers/media/platform/qcom/venus/core.c 
b/drivers/media/platform/qcom/venus/core.c
index 6103aaf..52a3886 100644
--- a/drivers/media/platform/qcom/venus/core.c
+++ b/drivers/media/platform/qcom/venus/core.c
@@ -355,13 +355,16 @@ static __maybe_unused int venus_runtime_suspend(struct 
device *dev)
if (ret)
return ret;
 
+   if (pm_ops->core_power) {
+   ret = pm_ops->core_power(dev, POWER_OFF);
+   if (ret)
+   return ret;
+   }
+
ret = icc_set_bw(core->cpucfg_path, 0, 0);
if (ret)
return ret;
 
-   if (pm_ops->core_power)
-   ret = pm_ops->core_power(dev, POWER_OFF);
-
return ret;
 }
 
@@ -371,16 +374,16 @@ static __maybe_unused int venus_runtime_resume(struct 
device *dev)
const struct venus_pm_ops *pm_ops = core->pm_ops;
int ret;
 
+   ret = icc_set_bw(core->cpucfg_path, 0, kbps_to_icc(1000));
+   if (ret)
+   return ret;
+
if (pm_ops->core_power) {
ret = pm_ops->core_power(dev, POWER_ON);
if (ret)
return ret;
}
 
-   ret = icc_set_bw(core->cpucfg_path, 0, kbps_to_icc(1000));
-   if (ret)
-   return ret;
-
return hfi_core_resume(core, false);
 }
 
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member 
of Code Aurora Forum, hosted by The Linux Foundation

[PATCH v4 2/4] venus: core: vote for video-mem path

2020-10-18 Thread Mansur Alisha Shaik

Currently the video driver votes for the venus0-ebi path during buffer
processing with the average bandwidth of all the instances and
unvotes during session release.

When we try to do XO-SD during video streaming using the command
"echo mem > /sys/power/state", the device does not enter the
suspend state, and the interconnect summary still shows votes for venus0-ebi.

Correct this by voting for the venus0-ebi path in venus_runtime_resume()
and unvoting in venus_runtime_suspend().

Fixes: 07f8f22a33a9e ("media: venus: core: remove CNOC voting while device suspend")
Signed-off-by: Mansur Alisha Shaik 
Reviewed-by: Stephen Boyd 
---
Changes in v4:
- As per Stanimir's comments, corrected fixes tag

 drivers/media/platform/qcom/venus/core.c | 17 -
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/drivers/media/platform/qcom/venus/core.c 
b/drivers/media/platform/qcom/venus/core.c
index 52a3886..fa363b8 100644
--- a/drivers/media/platform/qcom/venus/core.c
+++ b/drivers/media/platform/qcom/venus/core.c
@@ -363,7 +363,18 @@ static __maybe_unused int venus_runtime_suspend(struct 
device *dev)
 
ret = icc_set_bw(core->cpucfg_path, 0, 0);
if (ret)
-   return ret;
+   goto err_cpucfg_path;
+
+   ret = icc_set_bw(core->video_path, 0, 0);
+   if (ret)
+   goto err_video_path;
+
+   return ret;
+
+err_video_path:
+   icc_set_bw(core->cpucfg_path, kbps_to_icc(1000), 0);
+err_cpucfg_path:
+   pm_ops->core_power(dev, POWER_ON);
 
return ret;
 }
@@ -374,6 +385,10 @@ static __maybe_unused int venus_runtime_resume(struct 
device *dev)
const struct venus_pm_ops *pm_ops = core->pm_ops;
int ret;
 
+   ret = icc_set_bw(core->video_path, 0, kbps_to_icc(1000));
+   if (ret)
+   return ret;
+
ret = icc_set_bw(core->cpucfg_path, 0, kbps_to_icc(1000));
if (ret)
return ret;
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member 
of Code Aurora Forum, hosted by The Linux Foundation

[PATCH v4 3/4] venus: core: vote with average bandwidth and peak bandwidth as zero

2020-10-18 Thread Mansur Alisha Shaik

As per the bandwidth table, vote with average bandwidth for the
"video-mem" and "cpu-cfg" paths, since the peak bandwidth is zero
in the bandwidth table.

Fixes: 07f8f22a33a9e ("media: venus: core: remove CNOC voting while device suspend")
Signed-off-by: Mansur Alisha Shaik 
Reviewed-by: Stephen Boyd 
---
Changes in v4:
- As per Stanimir's comments, corrected fixes tag

 drivers/media/platform/qcom/venus/core.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/media/platform/qcom/venus/core.c 
b/drivers/media/platform/qcom/venus/core.c
index fa363b8..d5bfd6f 100644
--- a/drivers/media/platform/qcom/venus/core.c
+++ b/drivers/media/platform/qcom/venus/core.c
@@ -385,11 +385,11 @@ static __maybe_unused int venus_runtime_resume(struct 
device *dev)
const struct venus_pm_ops *pm_ops = core->pm_ops;
int ret;
 
-   ret = icc_set_bw(core->video_path, 0, kbps_to_icc(1000));
+   ret = icc_set_bw(core->video_path, kbps_to_icc(2), 0);
if (ret)
return ret;
 
-   ret = icc_set_bw(core->cpucfg_path, 0, kbps_to_icc(1000));
+   ret = icc_set_bw(core->cpucfg_path, kbps_to_icc(1000), 0);
if (ret)
return ret;
 
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member 
of Code Aurora Forum, hosted by The Linux Foundation

[PATCH v4 0/4] Venus - change clk enable, disable order and change bw values

2020-10-18 Thread Mansur Alisha Shaik

The intention of this patchset is to correct the clock enable and disable
order and to vote for the venus-ebi and cpucfg paths with average bandwidth
instead of peak bandwidth, since with the current implementation we are seeing
clock-related warnings during XO-SD and when suspending the device during
video playback.

---
Resending v4 series with the Fixes tag corrected for all patches in the series.
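
The ordering the series aims for (vote before powering on, unvote after
powering off) can be sketched as a small userspace model. All names here
are hypothetical; the real driver uses icc_set_bw() and
pm_ops->core_power() and also has to handle errors, which this sketch
leaves out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical step labels for the four operations discussed in the series. */
enum step_sketch { VOTE_BW, POWER_ON, POWER_OFF, UNVOTE_BW };

/* Records the order in which steps run so the sequence can be checked. */
struct trace_sketch {
	enum step_sketch steps[4];
	size_t n;
};

static void record_sketch(struct trace_sketch *t, enum step_sketch s)
{
	assert(t);
	if (t->n < 4)
		t->steps[t->n++] = s;
}

/* Resume: place the bandwidth vote first, then enable clocks/power. */
static void resume_sketch(struct trace_sketch *t)
{
	record_sketch(t, VOTE_BW);
	record_sketch(t, POWER_ON);
}

/* Suspend: disable clocks/power first, then drop the bandwidth vote. */
static void suspend_sketch(struct trace_sketch *t)
{
	record_sketch(t, POWER_OFF);
	record_sketch(t, UNVOTE_BW);
}
```

Suspend is simply the mirror image of resume, so the vote is in place for
the whole time the clocks are running.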

Mansur Alisha Shaik (4):
  venus: core: change clk enable and disable order in resume and suspend
  venus: core: vote for video-mem path
  venus: core: vote with average bandwidth and peak bandwidth as zero
  venus: put dummy vote on video-mem path after last session release

 drivers/media/platform/qcom/venus/core.c   | 32 --
 drivers/media/platform/qcom/venus/pm_helpers.c | 10 
 2 files changed, 35 insertions(+), 7 deletions(-)

-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member 
of Code Aurora Forum, hosted by The Linux Foundation

[PATCH v4 4/4] venus: put dummy vote on video-mem path after last session release

2020-10-18 Thread Mansur Alisha Shaik

As per the current implementation, the video driver unvotes the "video-mem"
path for the last video session during vdec_session_release().
When we try to suspend the device during video playback, we see video clock
warnings since the votes were already removed during vdec_session_release().

Correct this by putting a dummy vote on "video-mem" after the last video
session release and unvoting it during suspend.

Fixes: 07f8f22a33a9e ("media: venus: core: remove CNOC voting while device suspend")
Signed-off-by: Mansur Alisha Shaik 
Reviewed-by: Stephen Boyd 
---
Changes in v4:
- As per Stanimir's comments, corrected fixes tag

 drivers/media/platform/qcom/venus/pm_helpers.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/drivers/media/platform/qcom/venus/pm_helpers.c 
b/drivers/media/platform/qcom/venus/pm_helpers.c
index 57877ea..0ebba8e 100644
--- a/drivers/media/platform/qcom/venus/pm_helpers.c
+++ b/drivers/media/platform/qcom/venus/pm_helpers.c
@@ -212,6 +212,16 @@ static int load_scale_bw(struct venus_core *core)
}
mutex_unlock(&core->lock);
 
+   /*
+* keep minimum bandwidth vote for "video-mem" path,
+* so that clks can be disabled during vdec_session_release().
+* Actual bandwidth drop will be done during device suspend
+* so that device can power down without any warnings.
+*/
+
+   if (!total_avg && !total_peak)
+   total_avg = kbps_to_icc(1000);
+
dev_dbg(core->dev, VDBGL "total: avg_bw: %u, peak_bw: %u\n",
total_avg, total_peak);
 
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member 
of Code Aurora Forum, hosted by The Linux Foundation

Re: [PATCH 1/2] fpga: dfl: add driver_override support

2020-10-18 Thread Xu Yilun

On Fri, Oct 16, 2020 at 09:21:50AM -0700, Tom Rix wrote:
> 
> On 10/15/20 11:02 PM, Xu Yilun wrote:
> > Add support for overriding the default matching of a dfl device to a dfl
> > driver. It follows the same way that can be used for PCI and platform
> > devices. This patch adds the 'driver_override' sysfs file.
> >
> > Signed-off-by: Xu Yilun 
> > ---
> >  Documentation/ABI/testing/sysfs-bus-dfl | 28 ++---
> >  drivers/fpga/dfl.c  | 54 
> > -
> >  include/linux/dfl.h |  2 ++
> >  3 files changed, 79 insertions(+), 5 deletions(-)
> >
> > diff --git a/Documentation/ABI/testing/sysfs-bus-dfl 
> > b/Documentation/ABI/testing/sysfs-bus-dfl
> > index 23543be..db7e8d3 100644
> > --- a/Documentation/ABI/testing/sysfs-bus-dfl
> > +++ b/Documentation/ABI/testing/sysfs-bus-dfl
> > @@ -1,15 +1,35 @@
> >  What:  /sys/bus/dfl/devices/dfl_dev.X/type
> > -Date:  Aug 2020
> > -KernelVersion: 5.10
> > +Date:  Oct 2020
> > +KernelVersion: 5.11
> >  Contact:   Xu Yilun 
> >  Description:   Read-only. It returns type of DFL FIU of the device. 
> > Now DFL
> > supports 2 FIU types, 0 for FME, 1 for PORT.
> > Format: 0x%x
> >  
> >  What:  /sys/bus/dfl/devices/dfl_dev.X/feature_id
> > -Date:  Aug 2020
> > -KernelVersion: 5.10
> > +Date:  Oct 2020
> > +KernelVersion: 5.11
> >  Contact:   Xu Yilun 
> >  Description:   Read-only. It returns feature identifier local to its 
> > DFL FIU
> > type.
> > Format: 0x%x
> 
> These updates, do not match the comment.
> 
> Consider splitting this out.

I'm sorry it's a typo. The above code should not be changed.

> 
> > +
> > +What:   /sys/bus/dfl/devices/.../driver_override
> > +Date:   Oct 2020
> > +KernelVersion:  5.11
> > +Contact:Xu Yilun 
> I am looking at description and trying to make it consistent with 
> sysfs-bus-pci
> > +Description:This file allows the driver for a device to be specified.
> 
> 'to be specified which will override the standard dfl bus feature id to 
> driver mapping.'

Yes, it could be improved.

Actually, matching is currently done on "type" and "feature id"; these 2
fields are defined for dfl_driver.id_table. In the future, for dfl v1, it may
be GUID matching, which will be added to id_table. So how about we make it
more generic:

'to be specified which will override the standard ID table matching.'
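
To make the intended semantics concrete, here is a minimal userspace sketch
of "override first, ID table second" matching. It is not the dfl bus code:
the sketch_* structures and function are simplified, hypothetical stand-ins,
and only the ordering of the two checks mirrors the hunk quoted further down.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct sketch_id {
	unsigned int type;
	unsigned int feature_id;	/* 0 terminates the table */
};

struct sketch_driver {
	const char *name;
	const struct sketch_id *id_table;	/* may be NULL */
};

struct sketch_device {
	unsigned int type;
	unsigned int feature_id;
	const char *driver_override;	/* NULL when no override is set */
};

/* Returns 1 if drv may bind to dev, 0 otherwise. */
int sketch_bus_match(const struct sketch_device *dev,
		     const struct sketch_driver *drv)
{
	const struct sketch_id *id;

	assert(dev && drv && drv->name);

	/* An override short-circuits ID table matching entirely. */
	if (dev->driver_override)
		return strcmp(dev->driver_override, drv->name) == 0;

	/* No override: fall back to the standard ID table walk. */
	for (id = drv->id_table; id && id->feature_id; id++) {
		if (id->type == dev->type &&
		    id->feature_id == dev->feature_id)
			return 1;
	}
	return 0;
}
```

With an override of "none" (or any name no loaded driver has), the function
returns 0 for every driver, which is the opt-out behaviour the ABI text
describes.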

> 
> 
> >  When
> > +specified, only a driver with a name matching the value 
> > written
> > +to driver_override will have an opportunity to bind to the
> > +device. The override is specified by writing a string to 
> > the
> > +driver_override file (echo dfl-uio-pdev > driver_override) 
> > and
> > +may be cleared with an empty string (echo > 
> > driver_override).
> > +This returns the device to standard matching rules binding.
> > +Writing to driver_override does not automatically unbind 
> > the
> > +device from its current driver or make any attempt to
> > +automatically load the specified driver.  If no driver 
> > with a
> > +matching name is currently loaded in the kernel, the device
> > +will not bind to any driver.  This also allows devices to
> > +opt-out of driver binding using a driver_override name 
> > such as
> > +"none".  Only a single driver may be specified in the 
> > override,
> > +there is no support for parsing delimiters.
> > diff --git a/drivers/fpga/dfl.c b/drivers/fpga/dfl.c
> > index 511b20f..bc35750 100644
> > --- a/drivers/fpga/dfl.c
> > +++ b/drivers/fpga/dfl.c
> > @@ -262,6 +262,10 @@ static int dfl_bus_match(struct device *dev, struct 
> > device_driver *drv)
> > struct dfl_driver *ddrv = to_dfl_drv(drv);
> > const struct dfl_device_id *id_entry;
> >  
> > +   /* When driver_override is set, only bind to the matching driver */
> > +   if (ddev->driver_override)
> > +   return !strcmp(ddev->driver_override, drv->name);
> > +
> > id_entry = ddrv->id_table;
> > if (id_entry) {
> > while (id_entry->feature_id) {
> > @@ -303,6 +307,53 @@ static int dfl_bus_uevent(struct device *dev, struct 
> > kobj_uevent_env *env)
> >   ddev->type, ddev->feature_id);
> >  }
> >  
> 
> I am looking at other implementations of driver_override* and looking for 
> consistency.
> 
> > +static ssize_t driver_override_show(struct device *dev,
> > +   struct device_attribute *attr, char *buf)
> > +{
> > +   struct dfl_device *ddev = to_dfl_dev(dev);
> > +   ssize_t len;
> > +
> > +   device_lock(dev);
> > +   len = sprintf(buf, "%s\n", ddev->driver_override);
> len = snprintf(buf, PAGE_SIZE ...

It is good to me.

Som

Re: [PATCH] arm64: dts: allwinner: beelink-gs1: Enable both RGMII RX/TX delay

2020-10-18 Thread Chen-Yu Tsai

On Mon, Oct 19, 2020 at 1:57 AM Clément Péron  wrote:
>
> Hi,
>
> On Sun, 18 Oct 2020 at 19:24, Clément Péron  wrote:
> >
> > Before the commit:
> > net: phy: realtek: fix rtl8211e rx/tx delay config
> bbc4d71d6354 ("net: phy: realtek: fix rtl8211e rx/tx delay config")
>
> With the hash for reference it's better :)
> Clement
>
> >
> > The software overwrite for RX/TX delays of the RTL8211e were not
> > working properly and the Beelink GS1 had both RX/TX delay of RGMII
> > interface set using pull-up on the TXDLY and RXDLY pins.
> >
> > Now that these delays are working properly they overwrite the HW
> > config and set this to 'rgmii' meaning no delay on both RX/TX.
> > This makes the ethernet of this board not working anymore.
> >
> > Set the phy-mode to 'rgmii-id' meaning RGMII with RX/TX delays
> > in the device-tree to keep the correct configuration.
> >
> > Fixes: 089bee8dd119 ("arm64: dts: allwinner: h6: Introduce Beelink GS1 
> > board")
> > Signed-off-by: Clément Péron 

Acked-by: Chen-Yu Tsai 

For reference, the driver fix for dwmac enabling the other RGMII modes

f1239d8aa84d ("net: stmmac: dwmac-sun8i: Allow all RGMII modes")

was merged in v5.5 and was backported to relevant stable kernels.
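
As a reminder of what the four RGMII phy-mode strings ask of the PHY, here
is a small standalone sketch. The struct and function names are made up for
illustration and do not exist in the kernel; only the mode strings and their
meaning (which internal delays the PHY should add) come from the device tree
binding.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Which internal clock delays the PHY is asked to insert. */
struct sketch_rgmii_delays {
	bool rx;
	bool tx;
};

/* Returns 0 and fills *d for an RGMII mode string, -1 for anything else. */
int sketch_phy_mode_delays(const char *mode, struct sketch_rgmii_delays *d)
{
	assert(mode && d);

	if (!strcmp(mode, "rgmii")) {
		d->rx = false;		/* no PHY-internal delay at all */
		d->tx = false;
	} else if (!strcmp(mode, "rgmii-id")) {
		d->rx = true;		/* both delays added by the PHY */
		d->tx = true;
	} else if (!strcmp(mode, "rgmii-rxid")) {
		d->rx = true;
		d->tx = false;
	} else if (!strcmp(mode, "rgmii-txid")) {
		d->rx = false;
		d->tx = true;
	} else {
		return -1;
	}
	return 0;
}
```

For the Beelink GS1 the pin strapping enabled both delays, so "rgmii-id" is
the mode that keeps the board in the state it was already running in.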

> > ---
> >  arch/arm64/boot/dts/allwinner/sun50i-h6-beelink-gs1.dts | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/arch/arm64/boot/dts/allwinner/sun50i-h6-beelink-gs1.dts 
> > b/arch/arm64/boot/dts/allwinner/sun50i-h6-beelink-gs1.dts
> > index a364cb4e5b3f..6ab53860e447 100644
> > --- a/arch/arm64/boot/dts/allwinner/sun50i-h6-beelink-gs1.dts
> > +++ b/arch/arm64/boot/dts/allwinner/sun50i-h6-beelink-gs1.dts
> > @@ -99,7 +99,7 @@ &ehci0 {
> >  &emac {
> > pinctrl-names = "default";
> > pinctrl-0 = <&ext_rgmii_pins>;
> > -   phy-mode = "rgmii";
> > +   phy-mode = "rgmii-id";
> > phy-handle = <&ext_rgmii_phy>;
> > phy-supply = <&reg_aldo2>;
> > status = "okay";
> > --
> > 2.25.1
> >

[PATCH] rtl8188ee: avoid accessing the data mapped to streaming DMA

2020-10-18 Thread Jia-Ju Bai

In rtl88ee_tx_fill_cmddesc(), skb->data is mapped to streaming DMA on
line 677:
  dma_addr_t mapping = dma_map_single(..., skb->data, ...);

On line 680, skb->data is assigned to hdr after cast:
  struct ieee80211_hdr *hdr = (struct ieee80211_hdr *)(skb->data);

Then hdr->frame_control is accessed on line 681:
  __le16 fc = hdr->frame_control;

This DMA access may cause data inconsistency between CPU and hardware.

To fix this bug, hdr->frame_control is accessed before the DMA mapping.

Signed-off-by: Jia-Ju Bai 
---
 drivers/net/wireless/realtek/rtlwifi/rtl8188ee/trx.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/wireless/realtek/rtlwifi/rtl8188ee/trx.c 
b/drivers/net/wireless/realtek/rtlwifi/rtl8188ee/trx.c
index b9775eec4c54..c948dafa0c80 100644
--- a/drivers/net/wireless/realtek/rtlwifi/rtl8188ee/trx.c
+++ b/drivers/net/wireless/realtek/rtlwifi/rtl8188ee/trx.c
@@ -674,12 +674,12 @@ void rtl88ee_tx_fill_cmddesc(struct ieee80211_hw *hw,
u8 fw_queue = QSLT_BEACON;
__le32 *pdesc = (__le32 *)pdesc8;
 
-   dma_addr_t mapping = dma_map_single(&rtlpci->pdev->dev, skb->data,
-   skb->len, DMA_TO_DEVICE);
-
struct ieee80211_hdr *hdr = (struct ieee80211_hdr *)(skb->data);
__le16 fc = hdr->frame_control;
 
+   dma_addr_t mapping = dma_map_single(&rtlpci->pdev->dev, skb->data,
+   skb->len, DMA_TO_DEVICE);
+
if (dma_mapping_error(&rtlpci->pdev->dev, mapping)) {
rtl_dbg(rtlpriv, COMP_SEND, DBG_TRACE,
"DMA mapping error\n");
-- 
2.17.1
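
The ordering rule behind the patch above can be modelled in plain userspace
C. This is only a sketch under loud assumptions: struct sketch_buf and the
sketch_* helpers are hypothetical and merely track who "owns" the buffer;
they do not call or imitate the real DMA API beyond the map/unmap idea.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a buffer that gets handed to a device. */
struct sketch_buf {
	uint8_t data[64];
	size_t len;
	bool mapped;	/* true while the "device" owns the buffer */
	unsigned int cpu_reads_while_mapped;	/* counts rule violations */
};

void sketch_map(struct sketch_buf *b)
{
	b->mapped = true;
}

void sketch_unmap(struct sketch_buf *b)
{
	b->mapped = false;
}

/* CPU-side read of the first two bytes (frame control, little endian). */
uint16_t sketch_read_fc(struct sketch_buf *b)
{
	assert(b && b->len >= 2);
	if (b->mapped)
		b->cpu_reads_while_mapped++;
	return (uint16_t)(b->data[0] | (b->data[1] << 8));
}

/* Ordering used by the patch: read the header first, map afterwards. */
uint16_t sketch_fill_desc(struct sketch_buf *b)
{
	uint16_t fc = sketch_read_fc(b);

	sketch_map(b);
	return fc;
}
```

Calling sketch_fill_desc() leaves the violation counter at zero, while any
later sketch_read_fc() before sketch_unmap() bumps it, which is the kind of
access the patch moves out of the mapped window.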

[PATCH] rtl8723ae: avoid accessing the data mapped to streaming DMA

2020-10-18 Thread Jia-Ju Bai

In rtl8723e_tx_fill_cmddesc(), skb->data is mapped to streaming DMA on
line 531:
  dma_addr_t mapping = dma_map_single(..., skb->data, ...);

On line 534, skb->data is assigned to hdr after cast:
  struct ieee80211_hdr *hdr = (struct ieee80211_hdr *)(skb->data);

Then hdr->frame_control is accessed on line 535:
  __le16 fc = hdr->frame_control;

This DMA access may cause data inconsistency between CPU and hardware.

To fix this bug, hdr->frame_control is accessed before the DMA mapping.

Signed-off-by: Jia-Ju Bai 
---
 drivers/net/wireless/realtek/rtlwifi/rtl8723ae/trx.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/wireless/realtek/rtlwifi/rtl8723ae/trx.c 
b/drivers/net/wireless/realtek/rtlwifi/rtl8723ae/trx.c
index e3ee91b7ea8d..340b3d68a54e 100644
--- a/drivers/net/wireless/realtek/rtlwifi/rtl8723ae/trx.c
+++ b/drivers/net/wireless/realtek/rtlwifi/rtl8723ae/trx.c
@@ -528,12 +528,12 @@ void rtl8723e_tx_fill_cmddesc(struct ieee80211_hw *hw,
u8 fw_queue = QSLT_BEACON;
__le32 *pdesc = (__le32 *)pdesc8;
 
-   dma_addr_t mapping = dma_map_single(&rtlpci->pdev->dev, skb->data,
-   skb->len, DMA_TO_DEVICE);
-
struct ieee80211_hdr *hdr = (struct ieee80211_hdr *)(skb->data);
__le16 fc = hdr->frame_control;
 
+   dma_addr_t mapping = dma_map_single(&rtlpci->pdev->dev, skb->data,
+   skb->len, DMA_TO_DEVICE);
+
if (dma_mapping_error(&rtlpci->pdev->dev, mapping)) {
rtl_dbg(rtlpriv, COMP_SEND, DBG_TRACE,
"DMA mapping error\n");
-- 
2.17.1

Re: [PATCH v3 2/2] PM / devfreq: Add governor attribute flag for specifc sysfs nodes

2020-10-18 Thread Chanwoo Choi

On 10/19/20 9:39 AM, Dmitry Osipenko wrote:
> ...
>> @@ -1361,6 +1373,9 @@ static ssize_t governor_store(struct device *dev, 
>> struct device_attribute *attr,
>>  goto out;
>>  }
>>  
>> +remove_sysfs_files(df, df->governor);
>> +create_sysfs_files(df, governor);
>> +
>>  prev_governor = df->governor;
>>  df->governor = governor;
>>  strncpy(df->governor_name, governor->name, DEVFREQ_NAME_LEN);
>> @@ -1460,39 +1475,6 @@ static ssize_t target_freq_show(struct device *dev,
>>  }
> 
> The further code may revert df->governor to the prev_governor or set it

prev_governor is better. I'll change it.

> to NULL. The create_sysfs_files(df->governor) should be invoked at the
> very end of the governor_store() and only in a case of success.

OK. I'll add more exception handling code.
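
One possible shape for that handling, as a rough userspace sketch only. All
sketch_* names are hypothetical, the fake start hook just returns a canned
value, and recreating the files for the restored previous governor on
failure is my assumption here, not something settled in this thread.

```c
#include <assert.h>
#include <stddef.h>

struct sketch_governor {
	const char *name;
	int start_ret;	/* value the fake start hook returns */
};

struct sketch_devfreq {
	const struct sketch_governor *governor;
	/* Governor whose attribute files currently exist, or NULL. */
	const struct sketch_governor *files_for;
};

int sketch_start(const struct sketch_governor *g)
{
	return g->start_ret;
}

/*
 * Switch governors. Files are removed up front but created only once the
 * final governor is known, so they always match df->governor.
 */
int sketch_governor_store(struct sketch_devfreq *df,
			  const struct sketch_governor *next)
{
	const struct sketch_governor *prev;
	int ret;

	assert(df && next);
	prev = df->governor;

	/* Stands in for remove_sysfs_files(df, prev). */
	df->files_for = NULL;

	df->governor = next;
	ret = sketch_start(next);
	if (ret) {
		/* Fall back to the previous governor, or to none at all. */
		df->governor = prev;
		if (prev && sketch_start(prev))
			df->governor = NULL;
	}

	/* Stands in for create_sysfs_files(df, df->governor), run last. */
	df->files_for = df->governor;
	return ret;
}
```

The point of the ordering is that no path leaves attribute files behind for
a governor that is not the one actually in effect.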


-- 
Best Regards,
Chanwoo Choi
Samsung Electronics

[PATCH v2 1/1] acpi-cpufreq: Honor _PSD table setting in CPU frequency control

2020-10-18 Thread Wei Huang

acpi-cpufreq has an old quirk that overrides the _PSD table supplied by
BIOS on AMD CPUs. However, the _PSD table of new AMD CPUs (Family 19h+)
now accurately reports the P-state dependency of CPU cores. Hence this
quirk needs to be fixed in order to support new CPUs' frequency control.

Fixes: acd316248205 ("acpi-cpufreq: Add quirk to disable _PSD usage on all AMD 
CPUs")
Signed-off-by: Wei Huang 
---
 drivers/cpufreq/acpi-cpufreq.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/cpufreq/acpi-cpufreq.c b/drivers/cpufreq/acpi-cpufreq.c
index e4ff681f..1e4fbb002a31 100644
--- a/drivers/cpufreq/acpi-cpufreq.c
+++ b/drivers/cpufreq/acpi-cpufreq.c
@@ -691,7 +691,8 @@ static int acpi_cpufreq_cpu_init(struct cpufreq_policy 
*policy)
cpumask_copy(policy->cpus, topology_core_cpumask(cpu));
}
 
-   if (check_amd_hwpstate_cpu(cpu) && !acpi_pstate_strict) {
+   if (check_amd_hwpstate_cpu(cpu) && boot_cpu_data.x86 < 0x19 &&
+   !acpi_pstate_strict) {
cpumask_clear(policy->cpus);
cpumask_set_cpu(cpu, policy->cpus);
cpumask_copy(data->freqdomain_cpus,
-- 
2.24.1
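
The effect of the new condition can be shown with a small standalone model.
This is a sketch, not the driver code: sketch_policy_cpus() and its
parameters are hypothetical, and a 64-bit mask stands in for the cpumask.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Returns the CPUs that share a frequency policy with 'cpu'.
 * psd_mask is what the firmware _PSD table reported. When the quirk
 * applies, the table is ignored and the CPU gets a policy of its own.
 */
uint64_t sketch_policy_cpus(uint64_t psd_mask, unsigned int cpu,
			    bool amd_hwpstate, unsigned int family,
			    bool pstate_strict)
{
	assert(cpu < 64);

	/* Quirk is limited to AMD families older than 19h. */
	if (amd_hwpstate && family < 0x19 && !pstate_strict)
		return UINT64_C(1) << cpu;

	return psd_mask;
}
```

For a Family 17h part the result is a single-CPU mask as before, while a
Family 19h part keeps whatever _PSD reported.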

Re: [PATCH v2] IPv6: sr: Fix End.X nexthop to use oif.

2020-10-18 Thread Reji Thomas

Hi,

Please find my replies inline below.

Regards
Reji

On Mon, Oct 19, 2020 at 4:31 AM Jakub Kicinski  wrote:
>
> On Thu, 15 Oct 2020 13:51:19 +0530 Reji Thomas wrote:
> > Currently End.X action doesn't consider the outgoing interface
> > while looking up the nexthop. This breaks packet path functionality
> > specifically while using link local address as the End.X nexthop.
> > The patch fixes this by enforcing End.X action to have both nh6 and
> > oif and using oif in lookup. It seems this is a day one issue.
> >
> > Fixes: 140f04c33bbc ("ipv6: sr: implement several seg6local actions")
> > Signed-off-by: Reji Thomas 
>
> David, Mathiey - any comments?
>
> > @@ -239,6 +250,8 @@ static int input_action_end(struct sk_buff *skb, struct 
> > seg6_local_lwt *slwt)
> >  static int input_action_end_x(struct sk_buff *skb, struct seg6_local_lwt 
> > *slwt)
> >  {
> >   struct ipv6_sr_hdr *srh;
> > + struct net_device *odev;
> > + struct net *net = dev_net(skb->dev);
>
> Order longest to shortest.
Sorry. Will fix it.

>
>
> >
> >   srh = get_and_validate_srh(skb);
> >   if (!srh)
> > @@ -246,7 +259,11 @@ static int input_action_end_x(struct sk_buff *skb, 
> > struct seg6_local_lwt *slwt)
> >
> >   advance_nextseg(srh, &ipv6_hdr(skb)->daddr);
> >
> > - seg6_lookup_nexthop(skb, &slwt->nh6, 0);
> > + odev = dev_get_by_index_rcu(net, slwt->oif);
> > + if (!odev)
> > + goto drop;
>
> Are you doing this lookup just to make sure that oif exists?
> Looks a little wasteful for fast path, but more importantly
> it won't be backward compatible, right? See below..
>
Please see reply below.

> > +
> > + seg6_strict_lookup_nexthop(skb, &slwt->nh6, odev->ifindex, 0);
> >
> >   return dst_input(skb);
> >
>
> > @@ -566,7 +583,8 @@ static struct seg6_action_desc seg6_action_table[] = {
> >   },
> >   {
> >   .action = SEG6_LOCAL_ACTION_END_X,
> > - .attrs  = (1 << SEG6_LOCAL_NH6),
> > + .attrs  = ((1 << SEG6_LOCAL_NH6) |
> > +(1 << SEG6_LOCAL_OIF)),
> >   .input  = input_action_end_x,
> >   },
> >   {
>
> If you set this parse_nla_action() will reject all
> SEG6_LOCAL_ACTION_END_X without OIF.
>
> As you say the OIF is only required for using link local addresses,
> so this change breaks perfectly legitimate configurations.
>
> Can we instead only warn about the missing OIF, and only do that when
> nh is link local?
>
End.X is defined as an adjacency SID and is used to select a specific link to
a neighbor for both global and link-local addresses. The intention was to drop
the packet, even for global addresses, if no route via the specific interface
is found. Alternatively (and I believe this is semantically correct for the
End.X definition), I could do a neighbor lookup for the nexthop address over
the specific interface and send the packet out.
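
For comparison, the relaxed check suggested in the quoted review (require
oif only for a link-local nexthop) could be sketched as below. This is a
standalone illustration with hypothetical sketch_* names, not seg6local
code; the only fact relied on is that link-local unicast is fe80::/10.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* fe80::/10: the first ten bits are 1111 1110 10. */
bool sketch_is_link_local(const uint8_t addr[16])
{
	assert(addr);
	return addr[0] == 0xfe && (addr[1] & 0xc0) == 0x80;
}

/*
 * Relaxed validation: an End.X nexthop needs an outgoing interface
 * index (non-zero) only when the nexthop is link-local.
 */
bool sketch_endx_config_ok(const uint8_t nh6[16], int oif)
{
	if (sketch_is_link_local(nh6))
		return oif > 0;
	return true;
}
```

Under this variant existing configurations with a global nexthop and no oif
keep working, at the cost of not pinning the adjacency to one link.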

> Also doesn't SEG6_LOCAL_ACTION_END_DX6 need a similar treatment?

Yes. I will update the patch for End.DX6 based on the patch finalized for End.X.

Re: [PATCH v1 28/29] virtio-mem: Big Block Mode (BBM) - basic memory hotunplug

2020-10-18 Thread Wei Yang

On Mon, Oct 12, 2020 at 02:53:22PM +0200, David Hildenbrand wrote:
>Let's try to unplug completely offline big blocks first. Then, (if
>enabled via unplug_offline) try to offline and remove whole big blocks.
>
>No locking necessary - we can deal with concurrent onlining/offlining
>just fine.
>
>Note1: This is sub-optimal and might be dangerous in some environments: we
>could end up in an infinite loop when offlining (e.g., long-term pinnings),
>similar as with DIMMs. We'll introduce safe memory hotunplug via
>fake-offlining next, and use this basic mode only when explicitly enabled.
>
>Note2: Without ZONE_MOVABLE, memory unplug will be extremely unreliable
>with bigger block sizes.
>
>Cc: "Michael S. Tsirkin" 
>Cc: Jason Wang 
>Cc: Pankaj Gupta 
>Cc: Michal Hocko 
>Cc: Oscar Salvador 
>Cc: Wei Yang 
>Cc: Andrew Morton 
>Signed-off-by: David Hildenbrand 
>---
> drivers/virtio/virtio_mem.c | 156 +++-
> 1 file changed, 155 insertions(+), 1 deletion(-)
>
>diff --git a/drivers/virtio/virtio_mem.c b/drivers/virtio/virtio_mem.c
>index 94cf44b15cbf..6bcd0acbff32 100644
>--- a/drivers/virtio/virtio_mem.c
>+++ b/drivers/virtio/virtio_mem.c
>@@ -388,6 +388,12 @@ static int 
>virtio_mem_bbm_bb_states_prepare_next_bb(struct virtio_mem *vm)
>_bb_id++) \
>   if (virtio_mem_bbm_get_bb_state(_vm, _bb_id) == _state)
> 
>+#define virtio_mem_bbm_for_each_bb_rev(_vm, _bb_id, _state) \
>+  for (_bb_id = vm->bbm.next_bb_id - 1; \
>+   _bb_id >= vm->bbm.first_bb_id && _vm->bbm.bb_count[_state]; \
>+   _bb_id--) \
>+  if (virtio_mem_bbm_get_bb_state(_vm, _bb_id) == _state)
>+
> /*
>  * Set the state of a memory block, taking care of the state counter.
>  */
>@@ -685,6 +691,18 @@ static int virtio_mem_sbm_remove_mb(struct virtio_mem 
>*vm, unsigned long mb_id)
>   return virtio_mem_remove_memory(vm, addr, size);
> }
> 
>+/*
>+ * See virtio_mem_remove_memory(): Try to remove all Linux memory blocks 
>covered
>+ * by the big block.
>+ */
>+static int virtio_mem_bbm_remove_bb(struct virtio_mem *vm, unsigned long 
>bb_id)
>+{
>+  const uint64_t addr = virtio_mem_bb_id_to_phys(vm, bb_id);
>+  const uint64_t size = vm->bbm.bb_size;
>+
>+  return virtio_mem_remove_memory(vm, addr, size);
>+}
>+
> /*
>  * Try offlining and removing memory from Linux.
>  *
>@@ -731,6 +749,19 @@ static int virtio_mem_sbm_offline_and_remove_mb(struct 
>virtio_mem *vm,
>   return virtio_mem_offline_and_remove_memory(vm, addr, size);
> }
> 
>+/*
>+ * See virtio_mem_offline_and_remove_memory(): Try to offline and remove a
>+ * all Linux memory blocks covered by the big block.
>+ */
>+static int virtio_mem_bbm_offline_and_remove_bb(struct virtio_mem *vm,
>+  unsigned long bb_id)
>+{
>+  const uint64_t addr = virtio_mem_bb_id_to_phys(vm, bb_id);
>+  const uint64_t size = vm->bbm.bb_size;
>+
>+  return virtio_mem_offline_and_remove_memory(vm, addr, size);
>+}
>+
> /*
>  * Trigger the workqueue so the device can perform its magic.
>  */
>@@ -1928,6 +1959,129 @@ static int virtio_mem_sbm_unplug_request(struct 
>virtio_mem *vm, uint64_t diff)
>   return rc;
> }
> 
>+/*
>+ * Try to offline and remove a big block from Linux and unplug it. Will fail
>+ * with -EBUSY if some memory is busy and cannot get unplugged.
>+ *
>+ * Will modify the state of the memory block. Might temporarily drop the
>+ * hotplug_mutex.
>+ */
>+static int virtio_mem_bbm_offline_remove_and_unplug_bb(struct virtio_mem *vm,
>+ unsigned long bb_id)
>+{
>+  int rc;
>+
>+  if (WARN_ON_ONCE(virtio_mem_bbm_get_bb_state(vm, bb_id) !=
>+   VIRTIO_MEM_BBM_BB_ADDED))
>+  return -EINVAL;
>+
>+  rc = virtio_mem_bbm_offline_and_remove_bb(vm, bb_id);
>+  if (rc)
>+  return rc;
>+
>+  rc = virtio_mem_bbm_unplug_bb(vm, bb_id);
>+  if (rc)
>+  virtio_mem_bbm_set_bb_state(vm, bb_id,
>+  VIRTIO_MEM_BBM_BB_PLUGGED);
>+  else
>+  virtio_mem_bbm_set_bb_state(vm, bb_id,
>+  VIRTIO_MEM_BBM_BB_UNUSED);
>+  return rc;
>+}
>+
>+/*
>+ * Try to remove a big block from Linux and unplug it. Will fail with
>+ * -EBUSY if some memory is online.
>+ *
>+ * Will modify the state of the memory block.
>+ */
>+static int virtio_mem_bbm_remove_and_unplug_bb(struct virtio_mem *vm,
>+ unsigned long bb_id)
>+{
>+  int rc;
>+
>+  if (WARN_ON_ONCE(virtio_mem_bbm_get_bb_state(vm, bb_id) !=
>+   VIRTIO_MEM_BBM_BB_ADDED))
>+  return -EINVAL;
>+
>+  rc = virtio_mem_bbm_remove_bb(vm, bb_id);
>+  if (rc)
>+  return -EBUSY;
>+
>+  rc = virtio_mem_bbm_unplug_bb(vm, bb_id);
>+  if (rc)
>+  virtio_mem_bbm_set_bb_state(vm, bb_id,
>+

Re: [PATCH v2] drm/of: Consider the state in which the ep is disabled

2020-10-18 Thread Kever Yang


Hi Daniel,

On 2020/10/15 下午11:23, Daniel Vetter wrote:

On Wed, Oct 14, 2020 at 09:48:43AM +0800, Kever Yang wrote:

Hi Maintainers,

     Is this patch ready to merge?

Would maybe be good to get some acks from other drivers using this, then
Sandy can push to drm-misc-next.


Thanks for your reply. I understand that more acks would be better, but there
have been no objections to this patch and no NAK for more than 3 months, so
the maintainers should move on to the next step.



Thanks,

- Kever


-Daniel

On 2020/7/7 下午7:25, Sandy Huang wrote:

don't mask possible_crtcs if remote-point is disabled.

Signed-off-by: Sandy Huang 
---
   drivers/gpu/drm/drm_of.c | 3 +++
   1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/drm_of.c b/drivers/gpu/drm/drm_of.c
index fdb05fbf72a0..565f05f5f11b 100644
--- a/drivers/gpu/drm/drm_of.c
+++ b/drivers/gpu/drm/drm_of.c
@@ -66,6 +66,9 @@ uint32_t drm_of_find_possible_crtcs(struct drm_device *dev,
uint32_t possible_crtcs = 0;
for_each_endpoint_of_node(port, ep) {
+   if (!of_device_is_available(ep))
+   continue;
+
remote_port = of_graph_get_remote_port(ep);
if (!remote_port) {
of_node_put(ep);

Looks good to me.


Reviewed-by: Kever Yang

Re: [PATCH v3 2/2] PM / devfreq: Add governor attribute flag for specifc sysfs nodes

2020-10-18 Thread Chanwoo Choi

On 10/19/20 9:38 AM, Dmitry Osipenko wrote:
> ...
>> diff --git a/Documentation/ABI/testing/sysfs-class-devfreq 
>> b/Documentation/ABI/testing/sysfs-class-devfreq
>> index deefffb3bbe4..67af3f31e17c 100644
>> --- a/Documentation/ABI/testing/sysfs-class-devfreq
>> +++ b/Documentation/ABI/testing/sysfs-class-devfreq
>> @@ -37,20 +37,6 @@ Description:
>>  The /sys/class/devfreq/.../target_freq shows the next governor
>>  predicted target frequency of the corresponding devfreq object.
>>  
>> -What:   /sys/class/devfreq/.../polling_interval
>> -Date:   September 2011
>> -Contact:MyungJoo Ham 
>> -Description:
>> -The /sys/class/devfreq/.../polling_interval shows and sets
>> -the requested polling interval of the corresponding devfreq
>> -object. The values are represented in ms. If the value is
>> -less than 1 jiffy, it is considered to be 0, which means
>> -no polling. This value is meaningless if the governor is
>> -not polling; thus. If the governor is not using
>> -devfreq-provided central polling
>> -(/sys/class/devfreq/.../central_polling is 0), this value
>> -may be useless.
>> -
>>  What:   /sys/class/devfreq/.../trans_stat
>>  Date:   October 2012
>>  Contact:MyungJoo Ham 
>> @@ -65,14 +51,6 @@ Description:
>>  as following:
>>  echo 0 > /sys/class/devfreq/.../trans_stat
>>  
>> -What:   /sys/class/devfreq/.../userspace/set_freq
>> -Date:   September 2011
>> -Contact:MyungJoo Ham 
>> -Description:
>> -The /sys/class/devfreq/.../userspace/set_freq shows and
>> -sets the requested frequency for the devfreq object if
>> -userspace governor is in effect.
>> -
>>  What:   /sys/class/devfreq/.../available_frequencies
>>  Date:   October 2012
>>  Contact:Nishanth Menon 
>> @@ -109,6 +87,35 @@ Description:
>>  The max_freq overrides min_freq because max_freq may be
>>  used to throttle devices to avoid overheating.
>>  
>> +What:   /sys/class/devfreq/.../polling_interval
>> +Date:   September 2011
>> +Contact:MyungJoo Ham 
>> +Description:
>> +The /sys/class/devfreq/.../polling_interval shows and sets
>> +the requested polling interval of the corresponding devfreq
>> +object. The values are represented in ms. If the value is
>> +less than 1 jiffy, it is considered to be 0, which means
>> +no polling. This value is meaningless if the governor is
>> +not polling; thus. If the governor is not using
>> +devfreq-provided central polling
>> +(/sys/class/devfreq/.../central_polling is 0), this value
>> +may be useless.
>> +
>> +A list of governors that support the node:
>> +- simple_ondmenad
>> +- tegra_actmon
>> +
>> +What:   /sys/class/devfreq/.../userspace/set_freq
>> +Date:   September 2011
>> +Contact:MyungJoo Ham 
>> +Description:
>> +The /sys/class/devfreq/.../userspace/set_freq shows and
>> +sets the requested frequency for the devfreq object if
>> +userspace governor is in effect.
>> +
>> +A list of governors that support the node:
>> +- userspace
>> +
>>  What:   /sys/class/devfreq/.../timer
>>  Date:   July 2020
>>  Contact:Chanwoo Choi 
>> @@ -120,3 +127,6 @@ Description:
>>  as following:
>>  echo deferrable > /sys/class/devfreq/.../timer
>>  echo delayed > /sys/class/devfreq/.../timer
>> +
>> +A list of governors that support the node:
>> +- simple_ondemand
> 
> Hello, Chanwoo!
> 
> Could you please explain the reason of changing the doc? It looks like
> you only added the lists of governors, but is it a really useful change?
> Are you going to keep these lists up-to-date?

I think that it is useful, because the user cannot otherwise know why a
specific sysfs node (like 'timer') is absent for a given governor. So, in
order to remove the user's confusion, it is better to add the information to
the documentation.

-- 
Best Regards,
Chanwoo Choi
Samsung Electronics

Re: [PATCH v3 1/2] PM / devfreq: Add governor feature flag

2020-10-18 Thread Chanwoo Choi

On 10/19/20 9:57 AM, Dmitry Osipenko wrote:
> 07.10.2020 08:07, Chanwoo Choi пишет:
>> The devfreq governor is able to have the specific flag as follows
>> in order to implement the specific feature. For example, devfreq allows
>> user to change the governors on runtime via sysfs interface.
>> But, if devfreq device uses 'passive' governor, don't allow user to change
>> the governor. For this case, define the DEVFREQ_GOV_FLAT_IMMUTABLE
> 
> s/DEVFREQ_GOV_FLAT/DEVFREQ_GOV_FLAG/
> 
> ...
>>  /**
>>   * struct devfreq_governor - Devfreq policy governor
>>   * @node:   list node - contains registered devfreq governors
>>   * @name:   Governor's name
>> - * @immutable:  Immutable flag for governor. If the value is 1,
>> - *  this governor is never changeable to other governor.
>> - * @interrupt_driven:   Devfreq core won't schedule polling work for 
>> this
>> - *  governor if value is set to 1.
>> + * @flag:   Governor's feature flag
>>   * @get_target_freq:Returns desired operating frequency for the 
>> device.
>>   *  Basically, get_target_freq will run
>>   *  devfreq_dev_profile.get_dev_status() to get the
>> @@ -50,8 +57,7 @@ struct devfreq_governor {
>>  struct list_head node;
>>  
>>  const char name[DEVFREQ_NAME_LEN];
>> -const unsigned int immutable;
>> -const unsigned int interrupt_driven;
>> +const u64 flag;
> A plural form of flag(s) is more common, IMO.

When more feature flags need to be added, I prefer to add a new
definition instead of changing the structure.
I think that is better.

> 
> It's also possible to use a single bit:1 for the struct members. Thus,
> could you please explain what are the benefits of the "flag"?

I think that anyone might add some optional feature later.
So, I used 'flag' for extensibility.
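
A minimal standalone sketch of that idea follows. Only the name
DEVFREQ_GOV_FLAG_IMMUTABLE comes from this thread; SKETCH_BIT, the second
flag, struct sketch_governor and sketch_gov_has() are hypothetical and exist
only to show that a new feature becomes one more definition rather than a
new struct member.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_BIT(n)			(UINT64_C(1) << (n))

#define DEVFREQ_GOV_FLAG_IMMUTABLE	SKETCH_BIT(0)
/* Hypothetical second flag, added without touching the struct. */
#define SKETCH_GOV_FLAG_IRQ_DRIVEN	SKETCH_BIT(1)

struct sketch_governor {
	const char *name;
	uint64_t flag;	/* bitwise OR of the feature flags above */
};

/* True if every bit in 'flag' is set for this governor. */
bool sketch_gov_has(const struct sketch_governor *gov, uint64_t flag)
{
	assert(gov);
	return (gov->flag & flag) == flag;
}
```

The trade-off against bit:1 members is that callers test a mask instead of
a named field, but the structure layout stays the same as flags are added.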



-- 
Best Regards,
Chanwoo Choi
Samsung Electronics

Re: [GIT PULL] RCU changes for v5.10

2020-10-18 Thread Paul E. McKenney

On Sun, Oct 18, 2020 at 02:39:56PM -0700, Linus Torvalds wrote:
> On Mon, Oct 12, 2020 at 7:14 AM Ingo Molnar  wrote:
> >
> > Please pull the latest core/rcu git tree from:
> >
> >git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git 
> > core-rcu-2020-10-12
> 
> I've pulled everything but that last merge and the PREEMPT_COUNT stuff
> that came with it.
> 
> When Paul asked whether it was ok for RCU to use preempt_count() and I
> answered in the affirmative, I didn't mean it in the sense of "RCU
> wants to force it on everybody else too".
> 
> I'm pretty convinced that the proper fix is to simply make sure that
> rcu_free() and friends aren't run under any raw spinlocks. So even if
> the cost of preempt-count isn't that noticeable, there just isn't a
> reason for RCU to say "screw everybody else, I want this" when there
> are other alternatives.

Thank you for pulling the other branches.

On CONFIG_PREEMPT_COUNT, got it.  It would be OK for RCU to use
preempt_count() for some debugging or specialty kernel, but not across
the board.  Thank you for bearing with me on this one.

There is more to it than just raw spinlocks, but regardless we will go
back to the drawing board and come up with a less intrusive fix for the
v5.11 merge window.

Thanx, Paul

Re: [PATCH v1 27/29] mm/memory_hotplug: extend offline_and_remove_memory() to handle more than one memory block

2020-10-18 Thread Wei Yang

On Mon, Oct 12, 2020 at 02:53:21PM +0200, David Hildenbrand wrote:
>virtio-mem soon wants to use offline_and_remove_memory() memory that
>exceeds a single Linux memory block (memory_block_size_bytes()). Let's
>remove that restriction.
>
>Let's remember the old state and try to restore that if anything goes
>wrong. While re-onlining can, in general, fail, it's highly unlikely to
>happen (usually only when a notifier fails to allocate memory, and these
>are rather rare).
>
>This will be used by virtio-mem to offline+remove memory ranges that are
>bigger than a single memory block - for example, with a device block
>size of 1 GiB (e.g., gigantic pages in the hypervisor) and a Linux memory
>block size of 128MB.
>
>While we could compress the state into 2 bit, using 8 bit is much
>easier.
>
>This handling is similar, but different to acpi_scan_try_to_offline():
>
>a) We don't try to offline twice. I am not sure if this CONFIG_MEMCG
>optimization is still relevant - it should only apply to ZONE_NORMAL
>(where we have no guarantees). If relevant, we can always add it.
>
>b) acpi_scan_try_to_offline() simply onlines all memory in case
>something goes wrong. It doesn't restore previous online type. Let's do
>that, so we won't overwrite what e.g., user space configured.
>
>Cc: "Michael S. Tsirkin" 
>Cc: Jason Wang 
>Cc: Pankaj Gupta 
>Cc: Michal Hocko 
>Cc: Oscar Salvador 
>Cc: Wei Yang 
>Cc: Andrew Morton 
>Signed-off-by: David Hildenbrand 

Looks good to me.

Reviewed-by: Wei Yang 
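
For readers skimming the thread, the "remember the old state, restore on
failure" pattern described in the commit message can be modelled in
userspace roughly as follows. This is a sketch under assumptions: the
sketch_* types and helpers are hypothetical, and the error values are
written out as plain numbers rather than taken from kernel headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

enum { SK_OFFLINE = 0, SK_ONLINE_KERNEL = 1, SK_ONLINE_MOVABLE = 2 };

/* Hypothetical memory block: current state plus a "busy" marker. */
struct sketch_block {
	uint8_t state;
	int busy;	/* non-zero: offlining this block fails */
};

int sketch_offline(struct sketch_block *b)
{
	if (b->state == SK_OFFLINE)
		return 0;	/* already offline, nothing to do */
	if (b->busy)
		return -16;	/* models -EBUSY */
	b->state = SK_OFFLINE;
	return 0;
}

/* Offline every block, or leave all of them as they were. */
int sketch_offline_all(struct sketch_block *blocks, size_t count)
{
	uint8_t *old;
	size_t i;
	int rc = 0;

	assert(blocks && count);

	/* Zeroed array: every entry starts out as SK_OFFLINE. */
	old = calloc(count, sizeof(*old));
	if (!old)
		return -12;	/* models -ENOMEM */

	for (i = 0; i < count; i++) {
		uint8_t prev = blocks[i].state;

		rc = sketch_offline(&blocks[i]);
		if (rc)
			break;
		old[i] = prev;	/* record only what we really offlined */
	}

	if (rc) {
		/* Restore each touched block to its previous online type. */
		for (i = 0; i < count; i++)
			if (old[i] != SK_OFFLINE)
				blocks[i].state = old[i];
	}

	free(old);
	return rc;
}
```

One byte per block mirrors the "8 bit is much easier" remark: the state
would fit in 2 bits, but a plain array keeps the rollback loop trivial.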

>---
> mm/memory_hotplug.c | 105 +---
> 1 file changed, 89 insertions(+), 16 deletions(-)
>
>diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
>index b44d4c7ba73b..217080ca93e5 100644
>--- a/mm/memory_hotplug.c
>+++ b/mm/memory_hotplug.c
>@@ -1806,39 +1806,112 @@ int remove_memory(int nid, u64 start, u64 size)
> }
> EXPORT_SYMBOL_GPL(remove_memory);
> 
>+static int try_offline_memory_block(struct memory_block *mem, void *arg)
>+{
>+  uint8_t online_type = MMOP_ONLINE_KERNEL;
>+  uint8_t **online_types = arg;
>+  struct page *page;
>+  int rc;
>+
>+  /*
>+   * Sense the online_type via the zone of the memory block. Offlining
>+   * with multiple zones within one memory block will be rejected
>+   * by offlining code ... so we don't care about that.
>+   */
>+  page = pfn_to_online_page(section_nr_to_pfn(mem->start_section_nr));
>+  if (page && zone_idx(page_zone(page)) == ZONE_MOVABLE)
>+  online_type = MMOP_ONLINE_MOVABLE;
>+
>+  rc = device_offline(&mem->dev);
>+  /*
>+   * Default is MMOP_OFFLINE - change it only if offlining succeeded,
>+   * so try_reonline_memory_block() can do the right thing.
>+   */
>+  if (!rc)
>+  **online_types = online_type;
>+
>+  (*online_types)++;
>+  /* Ignore if already offline. */
>+  return rc < 0 ? rc : 0;
>+}
>+
>+static int try_reonline_memory_block(struct memory_block *mem, void *arg)
>+{
>+  uint8_t **online_types = arg;
>+  int rc;
>+
>+  if (**online_types != MMOP_OFFLINE) {
>+  mem->online_type = **online_types;
>+  rc = device_online(&mem->dev);
>+  if (rc < 0)
>+  pr_warn("%s: Failed to re-online memory: %d",
>+  __func__, rc);
>+  }
>+
>+  /* Continue processing all remaining memory blocks. */
>+  (*online_types)++;
>+  return 0;
>+}
>+
> /*
>- * Try to offline and remove a memory block. Might take a long time to
>- * finish in case memory is still in use. Primarily useful for memory devices
>- * that logically unplugged all memory (so it's no longer in use) and want to
>- * offline + remove the memory block.
>+ * Try to offline and remove memory. Might take a long time to finish in case
>+ * memory is still in use. Primarily useful for memory devices that logically
>+ * unplugged all memory (so it's no longer in use) and want to offline + 
>remove
>+ * that memory.
>  */
> int offline_and_remove_memory(int nid, u64 start, u64 size)
> {
>-  struct memory_block *mem;
>-  int rc = -EINVAL;
>+  const unsigned long mb_count = size / memory_block_size_bytes();
>+  uint8_t *online_types, *tmp;
>+  int rc;
> 
>   if (!IS_ALIGNED(start, memory_block_size_bytes()) ||
>-  size != memory_block_size_bytes())
>-  return rc;
>+  !IS_ALIGNED(size, memory_block_size_bytes()) || !size)
>+  return -EINVAL;
>+
>+  /*
>+   * We'll remember the old online type of each memory block, so we can
>+   * try to revert whatever we did when offlining one memory block fails
>+   * after offlining some others succeeded.
>+   */
>+  online_types = kmalloc_array(mb_count, sizeof(*online_types),
>+   GFP_KERNEL);
>+  if (!online_types)
>+  return -ENOMEM;
>+  /*
>+   * Initialize all states to MMOP_OFFLINE, so when we abort p

[PATCH] [v5] wireless: Initial driver submission for pureLiFi STA devices

2020-10-18 Thread Srinivasan Raju

This introduces the pureLiFi LiFi driver for LiFi-X, LiFi-XC
and LiFi-XL USB devices.

This driver implementation has been based on the zd1211rw driver.

The driver is based on the 802.11 softMAC architecture and uses
native 802.11 for configuration and management.

The driver has been compiled and tested on ARM and x86 architectures, and
compile-tested on the powerpc architecture.

Reported-by: kernel test robot 
Signed-off-by: Srinivasan Raju 
---
 MAINTAINERS|5 +
 drivers/net/wireless/Kconfig   |1 +
 drivers/net/wireless/Makefile  |1 +
 drivers/net/wireless/purelifi/Kconfig  |   27 +
 drivers/net/wireless/purelifi/Makefile |3 +
 drivers/net/wireless/purelifi/chip.c   |   97 ++
 drivers/net/wireless/purelifi/chip.h   |   82 ++
 drivers/net/wireless/purelifi/intf.h   |   38 +
 drivers/net/wireless/purelifi/mac.c|  861 +
 drivers/net/wireless/purelifi/mac.h|  180 +++
 drivers/net/wireless/purelifi/usb.c| 1637 
 drivers/net/wireless/purelifi/usb.h|  148 +++
 12 files changed, 3080 insertions(+)
 create mode 100644 drivers/net/wireless/purelifi/Kconfig
 create mode 100644 drivers/net/wireless/purelifi/Makefile
 create mode 100644 drivers/net/wireless/purelifi/chip.c
 create mode 100644 drivers/net/wireless/purelifi/chip.h
 create mode 100644 drivers/net/wireless/purelifi/intf.h
 create mode 100644 drivers/net/wireless/purelifi/mac.c
 create mode 100644 drivers/net/wireless/purelifi/mac.h
 create mode 100644 drivers/net/wireless/purelifi/usb.c
 create mode 100644 drivers/net/wireless/purelifi/usb.h

diff --git a/MAINTAINERS b/MAINTAINERS
index c80f87d7258c..150f592fb6e4 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -14108,6 +14108,11 @@ T: git git://linuxtv.org/media_tree.git
 F: Documentation/admin-guide/media/pulse8-cec.rst
 F: drivers/media/cec/usb/pulse8/
 
+PURELIFI USB DRIVER
+M: Srinivasan Raju 
+S: Maintained
+F: drivers/net/wireless/purelifi
+
 PVRUSB2 VIDEO4LINUX DRIVER
 M: Mike Isely 
 L: pvru...@isely.net   (subscribers-only)
diff --git a/drivers/net/wireless/Kconfig b/drivers/net/wireless/Kconfig
index 170a64e67709..b87da3139f94 100644
--- a/drivers/net/wireless/Kconfig
+++ b/drivers/net/wireless/Kconfig
@@ -48,6 +48,7 @@ source "drivers/net/wireless/st/Kconfig"
 source "drivers/net/wireless/ti/Kconfig"
 source "drivers/net/wireless/zydas/Kconfig"
 source "drivers/net/wireless/quantenna/Kconfig"
+source "drivers/net/wireless/purelifi/Kconfig"
 
 config PCMCIA_RAYCS
tristate "Aviator/Raytheon 2.4GHz wireless support"
diff --git a/drivers/net/wireless/Makefile b/drivers/net/wireless/Makefile
index 80b324499786..e9fc770026f0 100644
--- a/drivers/net/wireless/Makefile
+++ b/drivers/net/wireless/Makefile
@@ -20,6 +20,7 @@ obj-$(CONFIG_WLAN_VENDOR_ST) += st/
 obj-$(CONFIG_WLAN_VENDOR_TI) += ti/
 obj-$(CONFIG_WLAN_VENDOR_ZYDAS) += zydas/
 obj-$(CONFIG_WLAN_VENDOR_QUANTENNA) += quantenna/
+obj-$(CONFIG_WLAN_VENDOR_PURELIFI) += purelifi/
 
 # 16-bit wireless PCMCIA client drivers
 obj-$(CONFIG_PCMCIA_RAYCS) += ray_cs.o
diff --git a/drivers/net/wireless/purelifi/Kconfig b/drivers/net/wireless/purelifi/Kconfig
new file mode 100644
index ..f6630791df9d
--- /dev/null
+++ b/drivers/net/wireless/purelifi/Kconfig
@@ -0,0 +1,27 @@
+# SPDX-License-Identifier: GPL-2.0
+config WLAN_VENDOR_PURELIFI
+   bool "pureLiFi devices"
+   default y
+   help
+ If you have a pureLiFi device, say Y.
+
+ Note that the answer to this question doesn't directly affect the
+ kernel: saying N will just cause the configurator to skip all the
+ questions about these cards. If you say Y, you will be asked for
+ your specific card in the following questions.
+
+if WLAN_VENDOR_PURELIFI
+
+config PURELIFI
+
+   tristate "pureLiFi device support"
+   depends on CFG80211 && MAC80211 && USB
+   help
+  This driver makes the adapter appear as a normal WLAN interface.
+
+  The pureLiFi device requires external STA firmware to be loaded.
+
+  To compile this driver as a module, choose M here: the module will
+  be called purelifi.
+
+endif # WLAN_VENDOR_PURELIFI
diff --git a/drivers/net/wireless/purelifi/Makefile b/drivers/net/wireless/purelifi/Makefile
new file mode 100644
index ..1f20055e741f
--- /dev/null
+++ b/drivers/net/wireless/purelifi/Makefile
@@ -0,0 +1,3 @@
+# SPDX-License-Identifier: GPL-2.0
+obj-$(CONFIG_PURELIFI) := purelifi.o
+purelifi-objs  += chip.o usb.o mac.o
diff --git a/drivers/net/wireless/purelifi/chip.c b/drivers/net/wireless/purelifi/chip.c
new file mode 100644
index ..cca03697cb06
--- /dev/null
+++ b/drivers/net/wireless/purelifi/chip.c
@@ -0,0 +1,97 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include 
+#include 
+
+#include "chip.h"
+#include "mac.h"
+#include "usb.h"
+
+void purelifi_chip_init(struct purelifi_chip *chip,
+

Re: gssapi, crypto and afs/rxrpc

2020-10-18 Thread Herbert Xu

On Fri, Oct 16, 2020 at 05:18:26PM +0100, David Howells wrote:
>
> If I do this, should I create a "kerberos" crypto API for the data wrapping
> functions?  I'm not sure that it quite matches the existing APIs because the
> size of the input data will likely not match the size of the output data and
> it's "one shot" as it needs to deal with a checksum.

Generally it makes sense to create a Crypto API for an algorithm
if there are going to be at least two implementations of it.  In
particular, if there is hardware acceleration available then it'd
make sense.

Otherwise a library helper would be more appropriate.

Cheers,
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

[PATCH] rtl8192de: avoid accessing the data mapped to streaming DMA

2020-10-18 Thread Jia-Ju Bai

In rtl92de_tx_fill_cmddesc(), skb->data is mapped to streaming DMA on
line 667:
  dma_addr_t mapping = dma_map_single(..., skb->data, ...);

On line 669, skb->data is assigned to hdr after cast:
  struct ieee80211_hdr *hdr = (struct ieee80211_hdr *)(skb->data);

Then hdr->frame_control is accessed on line 670:
  __le16 fc = hdr->frame_control;

This DMA access may cause data inconsistency between CPU and hardware.

To fix this bug, hdr->frame_control is accessed before the DMA mapping.

Signed-off-by: Jia-Ju Bai 
---
 drivers/net/wireless/realtek/rtlwifi/rtl8192de/trx.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/net/wireless/realtek/rtlwifi/rtl8192de/trx.c b/drivers/net/wireless/realtek/rtlwifi/rtl8192de/trx.c
index 8944712274b5..c02813fba934 100644
--- a/drivers/net/wireless/realtek/rtlwifi/rtl8192de/trx.c
+++ b/drivers/net/wireless/realtek/rtlwifi/rtl8192de/trx.c
@@ -664,12 +664,14 @@ void rtl92de_tx_fill_cmddesc(struct ieee80211_hw *hw,
struct rtl_ps_ctl *ppsc = rtl_psc(rtlpriv);
struct rtl_hal *rtlhal = rtl_hal(rtlpriv);
u8 fw_queue = QSLT_BEACON;
-   dma_addr_t mapping = dma_map_single(&rtlpci->pdev->dev, skb->data,
-   skb->len, DMA_TO_DEVICE);
+
struct ieee80211_hdr *hdr = (struct ieee80211_hdr *)(skb->data);
__le16 fc = hdr->frame_control;
__le32 *pdesc = (__le32 *)pdesc8;
 
+   dma_addr_t mapping = dma_map_single(&rtlpci->pdev->dev, skb->data,
+   skb->len, DMA_TO_DEVICE);
+
if (dma_mapping_error(&rtlpci->pdev->dev, mapping)) {
rtl_dbg(rtlpriv, COMP_SEND, DBG_TRACE,
"DMA mapping error\n");
-- 
2.17.1
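
The ordering described in the commit message above can be illustrated
outside the kernel. What follows is a minimal user-space sketch, not driver
code: struct mock_buf, mock_map() and mock_read_fc() are hypothetical
stand-ins for the skb, dma_map_single() and the hdr->frame_control read,
and the ownership flag only models the rule that the CPU should not touch
a buffer once it has been mapped for streaming DMA.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for skb->data plus a flag that tracks ownership. */
struct mock_buf {
	uint8_t data[64];
	int owned_by_device;	/* nonzero once the buffer has been "mapped" */
};

/* Stand-in for dma_map_single(): hands the buffer over to the device. */
static void mock_map(struct mock_buf *buf)
{
	buf->owned_by_device = 1;
}

/* Stand-in for reading hdr->frame_control (first two bytes, little endian). */
static uint16_t mock_read_fc(const struct mock_buf *buf)
{
	/* The CPU may only read while it still owns the buffer. */
	assert(!buf->owned_by_device);
	return (uint16_t)(buf->data[0] | (buf->data[1] << 8));
}

/* Order used by the patch: read the header first, map afterwards. */
static uint16_t fill_cmddesc_sketch(struct mock_buf *buf)
{
	uint16_t fc = mock_read_fc(buf);

	mock_map(buf);
	return fc;
}
```

Swapping the two calls in fill_cmddesc_sketch() would trip the assertion,
which models the access pattern the patch moves away from.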

[PATCHv5] selftests: rtnetlink: load fou module for kci_test_encap_fou() test

2020-10-18 Thread Po-Hsu Lin

The kci_test_encap_fou() test from kci_test_encap() in rtnetlink.sh
needs the fou module to work. Otherwise it will fail with:

  $ ip netns exec "$testns" ip fou add port  ipproto 47
  RTNETLINK answers: No such file or directory
  Error talking to the kernel

Also add CONFIG_NET_FOU to the config file; it needs to be set at
least as a loadable module.

Signed-off-by: Po-Hsu Lin 
---
 tools/testing/selftests/net/config   | 1 +
 tools/testing/selftests/net/rtnetlink.sh | 5 +
 2 files changed, 6 insertions(+)

diff --git a/tools/testing/selftests/net/config b/tools/testing/selftests/net/config
index 4364924..4d5df8e 100644
--- a/tools/testing/selftests/net/config
+++ b/tools/testing/selftests/net/config
@@ -33,3 +33,4 @@ CONFIG_KALLSYMS=y
 CONFIG_TRACEPOINTS=y
 CONFIG_NET_DROP_MONITOR=m
 CONFIG_NETDEVSIM=m
+CONFIG_NET_FOU=m
diff --git a/tools/testing/selftests/net/rtnetlink.sh b/tools/testing/selftests/net/rtnetlink.sh
index 8a2fe6d..c9ce3df 100755
--- a/tools/testing/selftests/net/rtnetlink.sh
+++ b/tools/testing/selftests/net/rtnetlink.sh
@@ -520,6 +520,11 @@ kci_test_encap_fou()
return $ksft_skip
fi
 
+   if ! /sbin/modprobe -q -n fou; then
+   echo "SKIP: module fou is not found"
+   return $ksft_skip
+   fi
+   /sbin/modprobe -q fou
ip -netns "$testns" fou add port  ipproto 47 2>/dev/null
if [ $? -ne 0 ];then
echo "FAIL: can't add fou port , skipping test"
-- 
2.7.4

[PATCH] rtl8192ce: avoid accessing the data mapped to streaming DMA

2020-10-18 Thread Jia-Ju Bai

In rtl92ce_tx_fill_cmddesc(), skb->data is mapped to streaming DMA on
line 530:
  dma_addr_t mapping = dma_map_single(..., skb->data, ...);

On line 533, skb->data is assigned to hdr after cast:
  struct ieee80211_hdr *hdr = (struct ieee80211_hdr *)(skb->data);

Then hdr->frame_control is accessed on line 534:
  __le16 fc = hdr->frame_control;

This DMA access may cause data inconsistency between CPU and hardware.

To fix this bug, hdr->frame_control is accessed before the DMA mapping.

Signed-off-by: Jia-Ju Bai 
---
 drivers/net/wireless/realtek/rtlwifi/rtl8192ce/trx.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/wireless/realtek/rtlwifi/rtl8192ce/trx.c b/drivers/net/wireless/realtek/rtlwifi/rtl8192ce/trx.c
index c0635309a92d..4165175cf5c0 100644
--- a/drivers/net/wireless/realtek/rtlwifi/rtl8192ce/trx.c
+++ b/drivers/net/wireless/realtek/rtlwifi/rtl8192ce/trx.c
@@ -527,12 +527,12 @@ void rtl92ce_tx_fill_cmddesc(struct ieee80211_hw *hw,
u8 fw_queue = QSLT_BEACON;
__le32 *pdesc = (__le32 *)pdesc8;
 
-   dma_addr_t mapping = dma_map_single(&rtlpci->pdev->dev, skb->data,
-   skb->len, DMA_TO_DEVICE);
-
struct ieee80211_hdr *hdr = (struct ieee80211_hdr *)(skb->data);
__le16 fc = hdr->frame_control;
 
+   dma_addr_t mapping = dma_map_single(&rtlpci->pdev->dev, skb->data,
+   skb->len, DMA_TO_DEVICE);
+
if (dma_mapping_error(&rtlpci->pdev->dev, mapping)) {
rtl_dbg(rtlpriv, COMP_SEND, DBG_TRACE,
"DMA mapping error\n");
-- 
2.17.1

Re: [PATCH RFC 0/5] ubifs: Prevent memory oob accessing while dumping node

2020-10-18 Thread Zhihao Cheng


On 2020/6/16 15:11, Zhihao Cheng wrote:

We use ubifs_dump_node() to dump a bad node, which can result from
several causes (such as bit flipping due to a hardware error, writes that
bypass ubifs, or unknown bugs in ubifs). The node content can no longer be
trusted, so we should prevent out-of-bounds memory accesses while
dumping the node in the following situations:

1. bad node_len: Dumping data according to 'ch->len', which may exceed
the size of the memory allocated for the node.
2. bad node content: Some kinds of node can record additional data, e.g.
index nodes and orphan nodes; make sure the size of the additional data
does not go beyond the node length.
3. node_type changes: Data is read according to type A, but type B was
expected, so the node was allocated according to type B's size, and the
length of a type A node is greater than that of a type B node.

Commit acc5af3efa303d5f3 ("ubifs: Fix out-of-bounds memory access caused
by abnormal value of node_len") handles situation 1 for data nodes only;
it would be better to solve the problems in the above situations for all
kinds of nodes.

Patch 1 adds a new parameter 'node_len' (the size of the memory allocated
for the node) to ubifs_dump_node(). The safe dumping length of the
node should be: minimum(ch->len, c->ranges[node_type].max_len, node_len).
Besides, c->ranges[node_type].min_len must not be greater than the safe
dumping length, which may happen when node_type changes (situation 3).
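
The length rule for patch 1 can be written down as a small helper. This is
only a user-space sketch of the arithmetic, not the code from the series:
safe_dump_len() and min_size() are hypothetical names, the parameters stand
in for ch->len, c->ranges[node_type].min_len/max_len and node_len, and
returning 0 when even min_len does not fit is an assumption about how a
caller might signal "do not dump the type-specific part".

```c
#include <assert.h>
#include <stddef.h>

static size_t min_size(size_t a, size_t b)
{
	return a < b ? a : b;
}

/*
 * ch_len:   length claimed by the (untrusted) common header
 * min_len:  smallest valid length for this node type
 * max_len:  largest valid length for this node type
 * node_len: bytes actually allocated for the node buffer
 *
 * Returns the number of bytes that may be dumped, or 0 if the node type's
 * minimum length already exceeds the safe length (situation 3).
 */
static size_t safe_dump_len(size_t ch_len, size_t min_len,
			    size_t max_len, size_t node_len)
{
	size_t safe;

	assert(min_len <= max_len);

	/* minimum(ch->len, max_len, node_len) from the cover letter */
	safe = min_size(ch_len, min_size(max_len, node_len));
	if (min_len > safe)
		return 0;
	return safe;
}
```

With a corrupted header claiming 4096 bytes and a 160-byte buffer, the
helper caps the dump at 160 bytes; if the type's minimum length is larger
than the buffer, it returns 0.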

Patch 2 reverts commit acc5af3efa303d5f ("ubifs: Fix out-of-bounds memory
access caused by abnormal value of node_len") to prepare for patch 3.

Patch 3 replaces modified function ubifs_dump_node() in all node dumping
places except for ubifs_dump_sleb().

Patch 4 removes the unused function ubifs_dump_sleb().

Patch 5 allows ubifs_dump_node() to dump all branches of the index node.

Some tests after patchset applied:
https://bugzilla.kernel.org/show_bug.cgi?id=208203

Zhihao Cheng (5):
   ubifs: Limit dumping length by size of memory which is allocated for
 the node
   Revert "ubifs: Fix out-of-bounds memory access caused by abnormal
 value of node_len"
   ubifs: Pass node length in all node dumping callers
   ubifs: ubifs_dump_sleb: Remove unused function
   ubifs: ubifs_dump_node: Dump all branches of the index node

  fs/ubifs/commit.c   |   4 +-
  fs/ubifs/debug.c| 111 ++--
  fs/ubifs/debug.h|   5 +-
  fs/ubifs/file.c |   2 +-
  fs/ubifs/io.c   |  37 +--
  fs/ubifs/journal.c  |   3 +-
  fs/ubifs/master.c   |   4 +-
  fs/ubifs/orphan.c   |   6 ++-
  fs/ubifs/recovery.c |   6 +--
  fs/ubifs/replay.c   |   4 +-
  fs/ubifs/sb.c   |   2 +-
  fs/ubifs/scan.c |   4 +-
  fs/ubifs/super.c|   2 +-
  fs/ubifs/tnc.c  |   8 ++--
  fs/ubifs/tnc_misc.c |   4 +-
  fs/ubifs/ubifs.h|   4 +-
  16 files changed, 108 insertions(+), 98 deletions(-)

Ping. Although this is not a serious problem for ubifs, dumping extra
memory by formatting a specific ubifs image may cause a security problem.

Re: [PATCH] mm/util.c: Add error logs for commitment overflow

2020-10-18 Thread pintu

On 2020-10-05 12:50, Michal Hocko wrote:

On Fri 02-10-20 21:53:37, pi...@codeaurora.org wrote:

On 2020-10-02 17:47, Michal Hocko wrote:

> > __vm_enough_memory: commitment overflow: ppid:150, pid:164,
> > pages:62451
> > fork failed[count:0]: Cannot allocate memory
>
> While I understand that fork failing due to overcommit heuristic is non
> intuitive and I have seen people scratching heads due to this in the
> past I am not convinced this is a right approach to tackle the problem.

Dear Michal,
First, thank you so much for your review and comments.
I totally agree with you.

> First off, referencing pids is not really going to help much if process
> is short lived.

Yes, I agree with you.
But I think this matters most for short-lived processes themselves,
because when this situation occurs, no one knows who the culprit is.

Pid will not tell you much for those processes, right?

However, the user keeps dumping "ps" or "top" in the background to
reproduce it once again.

I do not think this would be an effective way to catch the problem.
Especially with _once reporting. My experience with these reports is
that a reporter notices a malfunctioning (usually more complex)
workload. In some cases ENOMEM from fork is reported into the log by
userspace.

For others it is strace -f that tells us that fork is failing and a
test with OVERCOMMIT_ALWAYS usually confirms the theory that this is
the culprit. But a rule of thumb is that it is almost always overcommit
to blame. Why? An undocumented secret is that ENOMEM resulting from an
actual memory allocation in the fork/clone path is close to impossible
because kernel does all it can to satisfy them (and even invokes OOM
killer). There are exceptions (e.g. like high order allocation) but
those should be very rare in that path.

At this time, we can easily match the pid, process-name (at least in
most cases).

Maybe our definitions of short lived processes differ but in my book
those are pretty hard to catch in flight.

> Secondly, __vm_enough_memory is about any address space
> allocation. Why would you be interested in parent when doing mmap?
>

Yes agree, we can remove ppid from here.
I thought it might be useful at least in case of fork (or short lived
process).

I suspect you have missed my point here. Let me clarify a bit more.
__vm_enough_memory is called from many more places than fork.
Essentially any mmap, brk etc. goes through this. This is where the
parent pid certainly doesn't make any sense. In fork this is a different
case because your forked process pid on its own doesn't make much sense
as it is going to die very quickly anyway. This is when the parent is
likely the more important information.

That being said, the content really depends on the specific path, and that
suggests that you are trying to log at the wrong layer.

Another question is whether we really need logging done by the kernel.
Ratelimiting would be tricky to get right and we do not want to allow an
easy way to swamp logs either.
As already mentioned, ENOMEM usually means overcommit failure. Maybe we
want to be more explicit about this in the man page?

Thank you so much for your feedback.
First of all I am sorry for my delayed response.

As I understand, the conclusion here is that:
a) pr_err_once is not helpful, and neither is pr_err_ratelimited?
Even with the log below:
+ pr_err_ratelimited("vm memory overflow: pid:%d, name: %s, pages:%ld\n",
+                    current->pid, current->comm, pages);

b) This can be invoked from many places, so we are adding the logging at
the wrong layer?
If so, are there any better places that can be explored?

c) Adding logging at the kernel layer is not the right approach to tackle
this problem?

d) Another thing we can do is update the man page with more detailed
information about this commitment overflow?

e) Maybe returning ENOMEM (Cannot allocate memory) from the VM path is
slightly misleading for user space folks when there is actually enough
memory?
=> Either we can introduce ENOVMEM (Cannot create virtual memory mapping)
=> Or update the documentation with an approach to further debug this
issue?

If there are any more suggestions to easily catch this issue, please let
us know and we can explore further.

Thanks,
Pintu

RE: [PATCH v1] hv_balloon: disable warning when floor reached

2020-10-18 Thread Michael Kelley

From: Olaf Hering  Sent: Thursday, October 8, 2020 12:12 AM
> 
> It is not an error if a the host requests to balloon down, but the VM

Spurious word "a"

> refuses to do so. Without this change a warning is logged in dmesg
> every five minutes.
> 
> Fixes commit b3bb97b8a49f3

This "Fixes" line isn't formatted correctly.  Should be:

Fixes: b3bb97b8a49f3 ("Drivers: hv: balloon: Add logging for dynamic memory operations")

> 
> Signed-off-by: Olaf Hering 
> ---
>  drivers/hv/hv_balloon.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/hv/hv_balloon.c b/drivers/hv/hv_balloon.c
> index 32e3bc0aa665..0f50295d0214 100644
> --- a/drivers/hv/hv_balloon.c
> +++ b/drivers/hv/hv_balloon.c
> @@ -1275,7 +1275,7 @@ static void balloon_up(struct work_struct *dummy)
> 
>   /* Refuse to balloon below the floor. */
>   if (avail_pages < num_pages || avail_pages - num_pages < floor) {
> - pr_warn("Balloon request will be partially fulfilled. %s\n",
> + pr_info("Balloon request will be partially fulfilled. %s\n",
>   avail_pages < num_pages ? "Not enough memory." :
>   "Balloon floor reached.");
> 

Above nits notwithstanding,

Reviewed-by: Michael Kelley

RE: [PATCH v1] hv_balloon: disable warning when floor reached

2020-10-18 Thread Michael Kelley

From: Wei Liu 
> 
> On Tue, Oct 13, 2020 at 11:19:21AM +0200, Olaf Hering wrote:
> > Am Tue, 13 Oct 2020 09:17:17 +
> > schrieb Wei Liu :
> >
> > > So ... this patch is not needed anymore?
> >
> > Why? A message is generated every 5 minutes. Unclear why this remained
> > unnoticed since at least v5.3. I have added debug to my distro kernel
> > to see what the involved variable values are. More info later today.
> 
> What I mean is you seem to have found a way to configure the system to
> get what you want it to do. It was unclear to me whether this patch is
> absolutely necessary from your last reply.
> 
> Some may consider the log informational (like you), some may think it
> warrants a warning (because not enough memory).  People also don't seem
> to be particularly bothered by it since its introduction in 4.10.
> 
> I have no objection to applying this patch. I can pick it up if I don't
> hear objection in the next couple of days.
> 

I think we should take the patch.  We've had a complaint about the noisy
output from another source as well, so it was on my list of small things
to clean up.  Thanks for doing it Olaf.

I'll send a separate Reviewed-by:

Michael

[PATCH] rtl8180: avoid accessing the data mapped to streaming DMA

2020-10-18 Thread Jia-Ju Bai

In rtl8180_tx(), skb->data is mapped to streaming DMA on line 476:
  mapping = dma_map_single(..., skb->data, ...);

On line 459, skb->data is assigned to hdr after cast:
  struct ieee80211_hdr *hdr = (struct ieee80211_hdr *)skb->data;

Then hdr->seq_ctrl is accessed on lines 540 and 541:
  hdr->seq_ctrl &= cpu_to_le16(IEEE80211_SCTL_FRAG);
  hdr->seq_ctrl |= cpu_to_le16(priv->seqno);

These DMA accesses may cause data inconsistency between CPU and hardware.

To fix this problem, hdr->seq_ctrl is accessed before the DMA mapping.

Signed-off-by: Jia-Ju Bai 
---
 drivers/net/wireless/realtek/rtl818x/rtl8180/dev.c | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/net/wireless/realtek/rtl818x/rtl8180/dev.c b/drivers/net/wireless/realtek/rtl818x/rtl8180/dev.c
index 2477e18c7cae..cc73014aa5f3 100644
--- a/drivers/net/wireless/realtek/rtl818x/rtl8180/dev.c
+++ b/drivers/net/wireless/realtek/rtl818x/rtl8180/dev.c
@@ -473,6 +473,13 @@ static void rtl8180_tx(struct ieee80211_hw *dev,
prio = skb_get_queue_mapping(skb);
ring = &priv->tx_ring[prio];
 
+   if (info->flags & IEEE80211_TX_CTL_ASSIGN_SEQ) {
+   if (info->flags & IEEE80211_TX_CTL_FIRST_FRAGMENT)
+   priv->seqno += 0x10;
+   hdr->seq_ctrl &= cpu_to_le16(IEEE80211_SCTL_FRAG);
+   hdr->seq_ctrl |= cpu_to_le16(priv->seqno);
+   }
+
mapping = dma_map_single(&priv->pdev->dev, skb->data, skb->len,
 DMA_TO_DEVICE);
 
@@ -534,13 +541,6 @@ static void rtl8180_tx(struct ieee80211_hw *dev,
 
spin_lock_irqsave(&priv->lock, flags);
 
-   if (info->flags & IEEE80211_TX_CTL_ASSIGN_SEQ) {
-   if (info->flags & IEEE80211_TX_CTL_FIRST_FRAGMENT)
-   priv->seqno += 0x10;
-   hdr->seq_ctrl &= cpu_to_le16(IEEE80211_SCTL_FRAG);
-   hdr->seq_ctrl |= cpu_to_le16(priv->seqno);
-   }
-
idx = (ring->idx + skb_queue_len(&ring->queue)) % ring->entries;
entry = &ring->desc[idx];
 
-- 
2.17.1

Re: [PATCH v4 09/15] ASoC: dt-bindings: audio-graph: Convert bindings to json-schema

2020-10-18 Thread Kuninori Morimoto



Hi Sameer


> Convert device tree bindings of audio graph card to YAML format. Also
> expose some common definitions which can be used by similar graph based
> audio sound cards.
> 
> Signed-off-by: Sameer Pujar 
> Cc: Kuninori Morimoto 
> ---

I'm posting this patch to Rob & DT ML.
Not yet accepted, though..

>  .../devicetree/bindings/sound/audio-graph-card.txt | 337 -
>  .../bindings/sound/audio-graph-card.yaml   | 548 
> +
>  2 files changed, 548 insertions(+), 337 deletions(-)
>  delete mode 100644 
> Documentation/devicetree/bindings/sound/audio-graph-card.txt
>  create mode 100644 
> Documentation/devicetree/bindings/sound/audio-graph-card.yaml
> 
> diff --git a/Documentation/devicetree/bindings/sound/audio-graph-card.txt 
> b/Documentation/devicetree/bindings/sound/audio-graph-card.txt
> deleted file mode 100644
> index d5f6919..000
> --- a/Documentation/devicetree/bindings/sound/audio-graph-card.txt
> +++ /dev/null
> @@ -1,337 +0,0 @@
> -Audio Graph Card:
> -
> -Audio Graph Card specifies audio DAI connections of SoC <-> codec.
> -It is based on common bindings for device graphs.
> -see ${LINUX}/Documentation/devicetree/bindings/graph.txt
> -
> -Basically, Audio Graph Card property is same as Simple Card.
> -see ${LINUX}/Documentation/devicetree/bindings/sound/simple-card.yaml
> -
> -Below are same as Simple-Card.
> -
> -- label
> -- widgets
> -- routing
> -- dai-format
> -- frame-master
> -- bitclock-master
> -- bitclock-inversion
> -- frame-inversion
> -- mclk-fs
> -- hp-det-gpio
> -- mic-det-gpio
> -- dai-tdm-slot-num
> -- dai-tdm-slot-width
> -- clocks / system-clock-frequency
> -
> -Required properties:
> -
> -- compatible : "audio-graph-card";
> -- dais   : list of CPU DAI port{s}
> -
> -Optional properties:
> -- pa-gpios: GPIO used to control external amplifier.
> -
> 
> -Example: Single DAI case
> 
> -
> - sound_card {
> - compatible = "audio-graph-card";
> -
> - dais = <&cpu_port>;
> - };
> -
> - dai-controller {
> - ...
> - cpu_port: port {
> - cpu_endpoint: endpoint {
> - remote-endpoint = <&codec_endpoint>;
> -
> - dai-format = "left_j";
> - ...
> - };
> - };
> - };
> -
> - audio-codec {
> - ...
> - port {
> - codec_endpoint: endpoint {
> - remote-endpoint = <&cpu_endpoint>;
> - };
> - };
> - };
> -
> 
> -Example: Multi DAI case
> 
> -
> - sound-card {
> - compatible = "audio-graph-card";
> -
> - label = "sound-card";
> -
> - dais = <&cpu_port0
> - &cpu_port1
> - &cpu_port2>;
> - };
> -
> - audio-codec@0 {
> - ...
> - port {
> - codec0_endpoint: endpoint {
> - remote-endpoint = <&cpu_endpoint0>;
> - };
> - };
> - };
> -
> - audio-codec@1 {
> - ...
> - port {
> - codec1_endpoint: endpoint {
> - remote-endpoint = <&cpu_endpoint1>;
> - };
> - };
> - };
> -
> - audio-codec@2 {
> - ...
> - port {
> - codec2_endpoint: endpoint {
> - remote-endpoint = <&cpu_endpoint2>;
> - };
> - };
> - };
> -
> - dai-controller {
> - ...
> - ports {
> - cpu_port0: port@0 {
> - cpu_endpoint0: endpoint {
> - remote-endpoint = <&codec0_endpoint>;
> -
> - dai-format = "left_j";
> - ...
> - };
> - };
> - cpu_port1: port@1 {
> - cpu_endpoint1: endpoint {
> - remote-endpoint = <&codec1_endpoint>;
> -
> - dai-format = "i2s";
> - ...
> - };
> - };
> - cpu_port2: port@2 {
> - cpu_endpoint2: endpoint {
> - remote-endpoint = <&codec2_endpoint>;
> -
> - dai-format = "i2s";
> - ...
> - };
> - };
> - };
> - };
> -
> -
> 
> -Example: Samplin

Re: [PATCHv4] selftests: rtnetlink: load fou module for kci_test_encap_fou() test

2020-10-18 Thread Po-Hsu Lin

On Sat, Oct 17, 2020 at 7:32 AM Jakub Kicinski  wrote:
>
> On Fri, 16 Oct 2020 12:12:11 +0800 Po-Hsu Lin wrote:
> > The kci_test_encap_fou() test from kci_test_encap() in rtnetlink.sh
> > needs the fou module to work. Otherwise it will fail with:
> >
> >   $ ip netns exec "$testns" ip fou add port  ipproto 47
> >   RTNETLINK answers: No such file or directory
> >   Error talking to the kernel
> >
> > Add the CONFIG_NET_FOU into the config file as well. Which needs at
> > least to be set as a loadable module.
> >
> > Signed-off-by: Po-Hsu Lin 
>
> Doesn't apply :( Could you rebase on top of:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net.git/
Ah OK, I was using the next branch in kselftest, will resend another one.
Thanks.

Re: [PATCH v3 7/7] zram: Use local lock to protect per-CPU data

2020-10-18 Thread Mike Galbraith

On Sun, 2020-10-18 at 19:52 -0600, Yu Zhao wrote:
> On Wed, May 27, 2020 at 10:11:19PM +0200, Sebastian Andrzej Siewior wrote:
> > From: Mike Galbraith 
> > 
> > The zcomp driver uses per-CPU compression. The per-CPU data pointer is
> > acquired with get_cpu_ptr() which implicitly disables preemption.
> > It allocates memory inside the preempt disabled region which conflicts
> > with the PREEMPT_RT semantics.
> > 
> > Replace the implicit preemption control with an explicit local lock.
> > This allows RT kernels to substitute it with a real per CPU lock, which
> > serializes the access but keeps the code section preemptible. On non RT
> > kernels this maps to preempt_disable() as before, i.e. no functional
> > change.
> 
> Hi,
> 
> This change seems to have introduced a potential deadlock. Can you
> please take a look?

Hm, looks like I'm getting undeserved credit for uncovering a locking
bug.  In reality, Sebastian was generous with attribution of derivative
work, so he should get credit.. and it looks like peterz fixed it.

Date: Fri, 16 Oct 2020 14:40:09 +0200
From: Peter Zijlstra 

---

diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index 9100ac36670a..c1e2c2e1cde8 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -1216,10 +1216,11 @@ static void zram_free_page(struct zram *zram, size_t index)
 static int __zram_bvec_read(struct zram *zram, struct page *page, u32 index,
struct bio *bio, bool partial_io)
 {
-   int ret;
+   struct zcomp_strm *zstrm;
unsigned long handle;
unsigned int size;
void *src, *dst;
+   int ret;
 
zram_slot_lock(zram, index);
if (zram_test_flag(zram, index, ZRAM_WB)) {
@@ -1250,6 +1251,9 @@ static int __zram_bvec_read(struct zram *zram, struct page *page, u32 index,
 
size = zram_get_obj_size(zram, index);
 
+   if (size != PAGE_SIZE)
+   zstrm = zcomp_stream_get(zram->comp);
+
src = zs_map_object(zram->mem_pool, handle, ZS_MM_RO);
if (size == PAGE_SIZE) {
dst = kmap_atomic(page);
@@ -1257,8 +1261,6 @@ static int __zram_bvec_read(struct zram *zram, struct page *page, u32 index,
kunmap_atomic(dst);
ret = 0;
} else {
-   struct zcomp_strm *zstrm = zcomp_stream_get(zram->comp);
-
dst = kmap_atomic(page);
ret = zcomp_decompress(zstrm, src, size, dst);
kunmap_atomic(dst);

Re: [PATCH 2/2] net: dsa: mv88e6xxx: Support serdes ports on MV88E6097

2020-10-18 Thread Chris Packham


On 19/10/20 11:31 am, Chris Packham wrote:
>
> On 19/10/20 11:08 am, Andrew Lunn wrote:
>> On Sun, Oct 18, 2020 at 09:15:52PM +, Chris Packham wrote:
>>> On 19/10/20 9:25 am, Andrew Lunn wrote:
>>>>> I assume you're talking about the PHY Control Register 0 bit 11. If so
>>>>> that's for the internal PHYs on ports 0-7. Ports 8, 9 and 10 don't have
>>>>> PHYs.
>>>> Hi Chris
>>>>
>>>> I have a datasheet for the 6122/6121, from some corner of the web,
>>>> Part 3 of 3, Gigabit PHYs and SERDES.
>>>>
>>>> http://www.image.micros.com.pl/_dane_techniczne_auto/ui88e6122b2lkj1i0.pdf
>>>>
>>>> Section 5 of this document talks
>>>> about the SERDES registers. Register 0 is Control, register 1 is
>>>> Status - Fiber, register 2 and 3 are the usual ID, 4 is auto-neg
>>>> advertisement etc.
>>>>
>>>> Where these registers appear in the address space is not clear from
>>>> this document. It is normally in document part 2 of 3, which my
>>>> searching of the web did not find.
>>>>
>>>>    Andrew
>>> I have got the 88E6122 datasheet(s) and can see the SERDES registers
>>> you're talking about (I think they're in the same register space as the
>>> built-in PHYs). It looks like the 88E6097 is different in that there are
>>> no SERDES registers exposed (at least not in a documented way). Looking
>>> at the 88E6185 it's the same as the 88E6097.
>> Hi Chris
>>
>> I find it odd there are no SERDES registers.  Can you poke around the
>> register space and look for ID registers? See if there are any with
>> Marvells OUI, but different to the chip ID found in the port
>> registers?
> From my experience with Marvell I don't think it's that odd. 
> Particularly for a 1G SERDES there's really not much that needs 
> configuring (although power up/down would be nice). I'll poke around 
> that register space to see if anything is there.

I poked around what I thought would be the relevant register space and 
couldn't find anything responding to the reads.

>>> So how do you want to move this series forward? I can test it on the
>>> 88E6097 (and have restricted it to just that chip for now), I'm pretty
>>> sure it'll work on the 88E6185. I doubt it'll work on the 88E6122 but
>>> maybe it would with a different serdes_power function (or even the
>>> mv88e6352_serdes_power() as you suggested).
>> Make your best guess for what you cannot test.
> Will do. I'll expand out at least to cover the 88E6185 in v2. I can 
> probably guess at the 88E6122 aside from the ability to power up/down 
> the rest looks the same from glancing the datasheets.

[PATCH v2 1/3] net: dsa: mv88e6xxx: Don't force link when using in-band-status

2020-10-18 Thread Chris Packham

When a port is configured with 'managed = "in-band-status"', don't force
the link up; the switch MAC will detect the link status correctly.

Signed-off-by: Chris Packham 
Reviewed-by: Andrew Lunn 
---

Changes in v2:
- Add review from Andrew

 drivers/net/dsa/mv88e6xxx/chip.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c
index f0dbc05e30a4..1ef392ee52c5 100644
--- a/drivers/net/dsa/mv88e6xxx/chip.c
+++ b/drivers/net/dsa/mv88e6xxx/chip.c
@@ -767,8 +767,11 @@ static void mv88e6xxx_mac_link_up(struct dsa_switch *ds, int port,
goto error;
}
 
-   if (ops->port_set_link)
-   err = ops->port_set_link(chip, port, LINK_FORCED_UP);
+   if (ops->port_set_link) {
+   int link = mode == MLO_AN_INBAND ? LINK_UNFORCED : LINK_FORCED_UP;
+
+   err = ops->port_set_link(chip, port, link);
+   }
}
 error:
mv88e6xxx_reg_unlock(chip);
-- 
2.28.0
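
The decision made in the hunk above can be shown in isolation. This is a
user-space sketch under stated assumptions, not the driver code: the enums
below are hypothetical stand-ins for the kernel's MLO_AN_* modes and the
driver's LINK_* settings (whose real values may differ), and
pick_link_setting() is an illustrative name.

```c
#include <assert.h>

/* Hypothetical stand-ins for the negotiation modes and link settings. */
enum an_mode { AN_PHY, AN_FIXED, AN_INBAND };
enum link_setting { LINK_SETTING_UNFORCED, LINK_SETTING_FORCED_UP };

/* Choose how the MAC link should be configured when the link comes up. */
static enum link_setting pick_link_setting(enum an_mode mode)
{
	switch (mode) {
	case AN_INBAND:
		/* In-band status: let the switch MAC track the link itself. */
		return LINK_SETTING_UNFORCED;
	case AN_PHY:
	case AN_FIXED:
		/* No in-band signalling: force the link up as before. */
		return LINK_SETTING_FORCED_UP;
	}

	/* Not reached for valid modes. */
	assert(0 && "unknown negotiation mode");
	return LINK_SETTING_FORCED_UP;
}
```

Only the in-band case changes behaviour; every other mode keeps the forced
link that the code used before the patch.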

[PATCH v2 0/3] net: dsa: mv88e6xxx: serdes link without phy

2020-10-18 Thread Chris Packham

This small series gets my hardware into a working state. The key points are to
make sure we don't force the link and that we ask the MAC for the link status.
I also have updated my dts to say `phy-mode = "1000base-x";` and `managed =
"in-band-status";`

I've included patch #3 in this series but I don't have anything to test it on.
It's just a guess based on the datasheets.

Chris Packham (3):
  net: dsa: mv88e6xxx: Don't force link when using in-band-status
  net: dsa: mv88e6xxx: Support serdes ports on MV88E6097/6095/6185
  net: dsa: mv88e6xxx: Support serdes ports on MV88E6123/6131

 drivers/net/dsa/mv88e6xxx/chip.c   |  26 +++-
 drivers/net/dsa/mv88e6xxx/serdes.c | 102 +
 drivers/net/dsa/mv88e6xxx/serdes.h |   9 +++
 3 files changed, 135 insertions(+), 2 deletions(-)

-- 
2.28.0

[PATCH v2 3/3] net: dsa: mv88e6xxx: Support serdes ports on MV88E6123/6131

2020-10-18 Thread Chris Packham

Implement serdes_power, serdes_get_lane and serdes_pcs_get_state ops for
the MV88E6123 so that the ports without a built-in PHY can be supported as
serdes ports and directly connected to other network interfaces or to
SFPs. Also implement serdes_get_regs_len and serdes_get_regs to aid
future debugging.

Signed-off-by: Chris Packham 
---

This is untested (apart from compilation). It assumes the SERDES "phy"
address corresponds to the port number, but I'm not confident that is a
valid assumption.

Changes in v2:
- new

 drivers/net/dsa/mv88e6xxx/chip.c   | 10 +++
 drivers/net/dsa/mv88e6xxx/serdes.c | 44 ++
 drivers/net/dsa/mv88e6xxx/serdes.h |  4 +++
 3 files changed, 58 insertions(+)

diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c
index 62d4d7b5d9ac..5344fc84b03e 100644
--- a/drivers/net/dsa/mv88e6xxx/chip.c
+++ b/drivers/net/dsa/mv88e6xxx/chip.c
@@ -3574,6 +3574,11 @@ static const struct mv88e6xxx_ops mv88e6123_ops = {
.set_egress_port = mv88e6095_g1_set_egress_port,
.watchdog_ops = &mv88e6097_watchdog_ops,
.mgmt_rsvd2cpu = mv88e6352_g2_mgmt_rsvd2cpu,
+   .serdes_power = mv88e6123_serdes_power,
+   .serdes_get_lane = mv88e6185_serdes_get_lane,
+   .serdes_pcs_get_state = mv88e6185_serdes_pcs_get_state,
+   .serdes_get_regs_len = mv88e6123_serdes_get_regs_len,
+   .serdes_get_regs = mv88e6123_serdes_get_regs,
.pot_clear = mv88e6xxx_g2_pot_clear,
.reset = mv88e6352_g1_reset,
.atu_get_hash = mv88e6165_g1_atu_get_hash,
@@ -3613,6 +3618,11 @@ static const struct mv88e6xxx_ops mv88e6131_ops = {
.set_egress_port = mv88e6095_g1_set_egress_port,
.watchdog_ops = &mv88e6097_watchdog_ops,
.mgmt_rsvd2cpu = mv88e6185_g2_mgmt_rsvd2cpu,
+   .serdes_power = mv88e6123_serdes_power,
+   .serdes_get_lane = mv88e6185_serdes_get_lane,
+   .serdes_pcs_get_state = mv88e6185_serdes_pcs_get_state,
+   .serdes_get_regs_len = mv88e6123_serdes_get_regs_len,
+   .serdes_get_regs = mv88e6123_serdes_get_regs,
.ppu_enable = mv88e6185_g1_ppu_enable,
.set_cascade_port = mv88e6185_g1_set_cascade_port,
.ppu_disable = mv88e6185_g1_ppu_disable,
diff --git a/drivers/net/dsa/mv88e6xxx/serdes.c b/drivers/net/dsa/mv88e6xxx/serdes.c
index 2d52c8ede943..1f649a661720 100644
--- a/drivers/net/dsa/mv88e6xxx/serdes.c
+++ b/drivers/net/dsa/mv88e6xxx/serdes.c
@@ -428,6 +428,50 @@ u8 mv88e6341_serdes_get_lane(struct mv88e6xxx_chip *chip, int port)
return lane;
 }
 
+int mv88e6123_serdes_power(struct mv88e6xxx_chip *chip, int port, u8 lane,
+  bool up)
+{
+   u16 val, new_val;
+   int err;
+
+   err = mv88e6xxx_phy_read(chip, port, MII_BMCR, &val);
+   if (err)
+   return err;
+
+   if (up)
+   new_val = val & ~BMCR_PDOWN;
+   else
+   new_val = val | BMCR_PDOWN;
+
+   if (val != new_val)
+   err = mv88e6xxx_phy_write(chip, port, MII_BMCR, new_val);
+
+   return err;
+}
+
+int mv88e6123_serdes_get_regs_len(struct mv88e6xxx_chip *chip, int port)
+{
+   if (mv88e6xxx_serdes_get_lane(chip, port) == 0)
+   return 0;
+
+   return 26 * sizeof(u16);
+}
+
+void mv88e6123_serdes_get_regs(struct mv88e6xxx_chip *chip, int port, void *_p)
+{
+   u16 *p = _p;
+   u16 reg;
+   int i;
+
+   if (mv88e6xxx_serdes_get_lane(chip, port) == 0)
+   return;
+
+   for (i = 0; i < 26; i++) {
+   mv88e6xxx_phy_read(chip, port, i, &reg);
+   p[i] = reg;
+   }
+}
+
 int mv88e6185_serdes_power(struct mv88e6xxx_chip *chip, int port, u8 lane,
   bool up)
 {
diff --git a/drivers/net/dsa/mv88e6xxx/serdes.h b/drivers/net/dsa/mv88e6xxx/serdes.h
index c24ec4122c9e..b573139928c4 100644
--- a/drivers/net/dsa/mv88e6xxx/serdes.h
+++ b/drivers/net/dsa/mv88e6xxx/serdes.h
@@ -104,6 +104,8 @@ unsigned int mv88e6352_serdes_irq_mapping(struct mv88e6xxx_chip *chip,
  int port);
 unsigned int mv88e6390_serdes_irq_mapping(struct mv88e6xxx_chip *chip,
  int port);
+int mv88e6123_serdes_power(struct mv88e6xxx_chip *chip, int port, u8 lane,
+  bool up);
 int mv88e6185_serdes_power(struct mv88e6xxx_chip *chip, int port, u8 lane,
   bool up);
 int mv88e6352_serdes_power(struct mv88e6xxx_chip *chip, int port, u8 lane,
@@ -129,6 +131,8 @@ int mv88e6390_serdes_get_strings(struct mv88e6xxx_chip *chip,
 int mv88e6390_serdes_get_stats(struct mv88e6xxx_chip *chip, int port,
   uint64_t *data);
 
+int mv88e6123_serdes_get_regs_len(struct mv88e6xxx_chip *chip, int port);
+void mv88e6123_serdes_get_regs(struct mv88e6xxx_chip *chip, int port, void *_p);
 int mv88e6352_serdes_get_regs_len(struct mv88e6xxx_chip *chip, int port);
 void mv88e6352_serdes_get_regs(struct mv
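
The read-modify-write in mv88e6123_serdes_power() can be exercised in isolation. The sketch below assumes the standard MII power-down bit (bit 11 of the Basic Mode Control Register); the function name and the `changed` out-parameter are hypothetical helpers for the example. Note that the value to write back is the updated one, not the value originally read.

```c
#include <stdint.h>

/* Power-down bit of the standard MII Basic Mode Control Register (bit 11). */
#define SKETCH_BMCR_PDOWN 0x0800u

/*
 * Compute the BMCR value to write for the requested power state.
 * Sets *changed to 1 when the register actually needs writing, else 0.
 */
uint16_t sketch_bmcr_power(uint16_t val, int up, int *changed)
{
	uint16_t new_val = up ? (uint16_t)(val & ~SKETCH_BMCR_PDOWN)
			      : (uint16_t)(val | SKETCH_BMCR_PDOWN);

	*changed = (new_val != val);
	return new_val;
}
```

A caller would issue the PHY write only when `changed` is set, passing the returned value.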

[PATCH v2 2/3] net: dsa: mv88e6xxx: Support serdes ports on MV88E6097/6095/6185

2020-10-18 Thread Chris Packham

Implement serdes_power, serdes_get_lane and serdes_pcs_get_state ops for
the MV88E6097/6095/6185 so that ports 8 & 9 can be supported as serdes
ports and directly connected to other network interfaces or to SFPs
without a PHY.

Signed-off-by: Chris Packham 
---

Changes in v2:
- expand support to cover 6095 and 6185
- move serdes related code to serdes.c

 drivers/net/dsa/mv88e6xxx/chip.c   |  9 +
 drivers/net/dsa/mv88e6xxx/serdes.c | 58 ++
 drivers/net/dsa/mv88e6xxx/serdes.h |  5 +++
 3 files changed, 72 insertions(+)

diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c
index 1ef392ee52c5..62d4d7b5d9ac 100644
--- a/drivers/net/dsa/mv88e6xxx/chip.c
+++ b/drivers/net/dsa/mv88e6xxx/chip.c
@@ -3496,6 +3496,9 @@ static const struct mv88e6xxx_ops mv88e6095_ops = {
.stats_get_strings = mv88e6095_stats_get_strings,
.stats_get_stats = mv88e6095_stats_get_stats,
.mgmt_rsvd2cpu = mv88e6185_g2_mgmt_rsvd2cpu,
+   .serdes_power = mv88e6185_serdes_power,
+   .serdes_get_lane = mv88e6185_serdes_get_lane,
+   .serdes_pcs_get_state = mv88e6185_serdes_pcs_get_state,
.ppu_enable = mv88e6185_g1_ppu_enable,
.ppu_disable = mv88e6185_g1_ppu_disable,
.reset = mv88e6185_g1_reset,
@@ -3534,6 +3537,9 @@ static const struct mv88e6xxx_ops mv88e6097_ops = {
.set_egress_port = mv88e6095_g1_set_egress_port,
.watchdog_ops = &mv88e6097_watchdog_ops,
.mgmt_rsvd2cpu = mv88e6352_g2_mgmt_rsvd2cpu,
+   .serdes_power = mv88e6185_serdes_power,
+   .serdes_get_lane = mv88e6185_serdes_get_lane,
+   .serdes_pcs_get_state = mv88e6185_serdes_pcs_get_state,
.pot_clear = mv88e6xxx_g2_pot_clear,
.reset = mv88e6352_g1_reset,
.rmu_disable = mv88e6085_g1_rmu_disable,
@@ -3958,6 +3964,9 @@ static const struct mv88e6xxx_ops mv88e6185_ops = {
.set_egress_port = mv88e6095_g1_set_egress_port,
.watchdog_ops = &mv88e6097_watchdog_ops,
.mgmt_rsvd2cpu = mv88e6185_g2_mgmt_rsvd2cpu,
+   .serdes_power = mv88e6185_serdes_power,
+   .serdes_get_lane = mv88e6185_serdes_get_lane,
+   .serdes_pcs_get_state = mv88e6185_serdes_pcs_get_state,
.set_cascade_port = mv88e6185_g1_set_cascade_port,
.ppu_enable = mv88e6185_g1_ppu_enable,
.ppu_disable = mv88e6185_g1_ppu_disable,
diff --git a/drivers/net/dsa/mv88e6xxx/serdes.c b/drivers/net/dsa/mv88e6xxx/serdes.c
index 9c07b4f3d345..2d52c8ede943 100644
--- a/drivers/net/dsa/mv88e6xxx/serdes.c
+++ b/drivers/net/dsa/mv88e6xxx/serdes.c
@@ -428,6 +428,64 @@ u8 mv88e6341_serdes_get_lane(struct mv88e6xxx_chip *chip, int port)
return lane;
 }
 
+int mv88e6185_serdes_power(struct mv88e6xxx_chip *chip, int port, u8 lane,
+  bool up)
+{
+   /* The serdes power can't be controlled on this switch chip but we need
+* to supply this function to avoid returning -EOPNOTSUPP in
+* mv88e6xxx_serdes_power_up/mv88e6xxx_serdes_power_down
+*/
+   return 0;
+}
+
+u8 mv88e6185_serdes_get_lane(struct mv88e6xxx_chip *chip, int port)
+{
+   switch (chip->ports[port].cmode) {
+   case MV88E6185_PORT_STS_CMODE_SERDES:
+   case MV88E6185_PORT_STS_CMODE_1000BASE_X:
+   return 0xff; /* Unused */
+   default:
+   return 0;
+   }
+}
+
+int mv88e6185_serdes_pcs_get_state(struct mv88e6xxx_chip *chip, int port,
+  u8 lane, struct phylink_link_state *state)
+{
+   int err;
+   u16 status;
+
+   err = mv88e6xxx_port_read(chip, port, MV88E6XXX_PORT_STS, &status);
+   if (err)
+   return err;
+
+   state->link = !!(status & MV88E6XXX_PORT_STS_LINK);
+
+   if (state->link) {
+   state->duplex = status & MV88E6XXX_PORT_STS_DUPLEX ? DUPLEX_FULL : DUPLEX_HALF;
+
+   switch (status & MV88E6XXX_PORT_STS_SPEED_MASK) {
+   case MV88E6XXX_PORT_STS_SPEED_1000:
+   state->speed = SPEED_1000;
+   break;
+   case MV88E6XXX_PORT_STS_SPEED_100:
+   state->speed = SPEED_100;
+   break;
+   case MV88E6XXX_PORT_STS_SPEED_10:
+   state->speed = SPEED_10;
+   break;
+   default:
+   dev_err(chip->dev, "invalid PHY speed\n");
+   return -EINVAL;
+   }
+   } else {
+   state->duplex = DUPLEX_UNKNOWN;
+   state->speed = SPEED_UNKNOWN;
+   }
+
+   return 0;
+}
+
 u8 mv88e6390_serdes_get_lane(struct mv88e6xxx_chip *chip, int port)
 {
u8 cmode = chip->ports[port].cmode;
diff --git a/drivers/net/dsa/mv88e6xxx/serdes.h b/drivers/net/dsa/mv88e6xxx/serdes.h
index 14315f26228a..c24ec4122c9e 100644
--- a/drivers/net/dsa/mv88e6xxx/serdes.h
+++ b/drivers/net/dsa/mv88e6xxx/serdes.h
@@ -73,6 +73,7 @@
 #define MV88E
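
The decoding done by mv88e6185_serdes_pcs_get_state() (link bit first, then duplex and speed only when the link is up) can be sketched as a pure function. The bit layout, the struct and the function name below are assumptions made for illustration; the real masks live in the driver's headers and should be checked against the datasheet.

```c
#include <stdint.h>

/* Illustrative bit layout for a port status word (assumed, not authoritative). */
#define SKETCH_STS_LINK        0x0800u
#define SKETCH_STS_DUPLEX      0x0400u
#define SKETCH_STS_SPEED_MASK  0x0300u
#define SKETCH_STS_SPEED_10    0x0000u
#define SKETCH_STS_SPEED_100   0x0100u
#define SKETCH_STS_SPEED_1000  0x0200u

struct sketch_link_state {
	int link;    /* 1 if the link is up */
	int speed;   /* Mb/s, or -1 when unknown */
	int duplex;  /* 1 full, 0 half, -1 unknown */
};

/* Decode a status word; returns 0 on success, -1 on a reserved speed encoding. */
int sketch_decode_status(uint16_t status, struct sketch_link_state *st)
{
	st->link = !!(status & SKETCH_STS_LINK);
	st->speed = -1;
	st->duplex = -1;

	/* With the link down, speed and duplex are left as unknown. */
	if (!st->link)
		return 0;

	st->duplex = (status & SKETCH_STS_DUPLEX) ? 1 : 0;

	switch (status & SKETCH_STS_SPEED_MASK) {
	case SKETCH_STS_SPEED_1000:
		st->speed = 1000;
		break;
	case SKETCH_STS_SPEED_100:
		st->speed = 100;
		break;
	case SKETCH_STS_SPEED_10:
		st->speed = 10;
		break;
	default:
		return -1;
	}
	return 0;
}
```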

Re: [PATCH v12 0/9] support reserving crashkernel above 4G on arm64 kdump

2020-10-18 Thread chenzhou

Hi Bhupesh,


On 2020/10/7 15:07, Bhupesh Sharma wrote:
> Hi Catalin,
>
> On Tue, Oct 6, 2020 at 11:30 PM Catalin Marinas wrote:
>> On Mon, Oct 05, 2020 at 11:12:10PM +0530, Bhupesh Sharma wrote:
>>> I think my earlier email with the test results on this series bounced
>>> off the mailing list server (for some weird reason), but I still see
>>> several issues with this patchset. I will add specific issues in the
>>> review comments for each patch again, but overall, with a crashkernel
>>> size of say 786M, I see the following issue:
>>>
>>> # cat /proc/cmdline
>>> BOOT_IMAGE=(hd7,gpt2)/vmlinuz-5.9.0-rc7+ root=<..snip..> rd.lvm.lv=<..snip..> crashkernel=786M
>>>
>>> I see two regions of size 786M and 256M reserved in low and high
>>> regions respectively, So we reserve a total of 1042M of memory, which
>>> is an incorrect behaviour:
>>>
>>> # dmesg | grep -i crash
>>> [0.00] Reserving 256MB of low memory at 2816MB for crashkernel (System low RAM: 768MB)
>>> [0.00] Reserving 786MB of memory at 654158MB for crashkernel (System RAM: 130816MB)
>>> [0.00] Kernel command line: BOOT_IMAGE=(hd2,gpt2)/vmlinuz-5.9.0-rc7+ root=/dev/mapper/rhel_ampere--hr330a--03-root ro rd.lvm.lv=rhel_ampere-hr330a-03/root rd.lvm.lv=rhel_ampere-hr330a-03/swap crashkernel=786M cma=1024M
>>>
>>> # cat /proc/iomem | grep -i crash
>>>   b000-bfff : Crash kernel (low)
>>>   bfcbe0-bffcff : Crash kernel
>> As Chen said, that's the intended behaviour and how x86 works. The
>> requested 768M goes in the high range if there's not enough low memory
>> and an additional buffer for swiotlb is allocated, hence the low 256M.
> I understand, but why 256M (as low) for arm64? x86_64 setups usually
> have more system memory available as compared to several commercially
> available arm64 setups. So is the intent, just to keep the behavior
> similar between arm64 and x86_64?
>
> Should we have a CONFIG option / bootarg to help one select the max
> 'low_size'? Currently the 'low_size' value is calculated as:
>
> /*
>  * two parts from kernel/dma/swiotlb.c:
>  * -swiotlb size: user-specified with swiotlb= or default.
>  *
>  * -swiotlb overflow buffer: now hardcoded to 32k. We round it
>  * to 8M for other buffers that may need to stay low too. Also
>  * make sure we allocate enough extra low memory so that we
>  * don't run out of DMA buffers for 32-bit devices.
>  */
> low_size = max(swiotlb_size_or_default() + (8UL << 20), 256UL << 20);
>
> Since many arm64 boards ship with swiotlb=0 (turned off) via kernel
> bootargs, the low_size, still ends up being 256M in such cases,
> whereas this 256M can be used for some other purposes - so should we
> be limiting this to 64M and failing the crash kernel allocation
> request (gracefully) otherwise?
>
>> We could (as an additional patch), subtract the 256M from the high
>> allocation so that you'd get a low 256M and a high 512M, not sure it's
>> worth it. Note that with a "crashkernel=768M,high" option, you still get
>> the additional low 256M, otherwise the crashkernel won't be able to
>> boot as there's no memory in ZONE_DMA. In the explicit ",high" request
>> case, I'm not sure subtracting the 256M is more intuitive.
>> In 5.11, we also hope to fix the ZONE_DMA layout for non-RPi4 platforms
>> to cover the entire 32-bit address space (i.e. identical to the current
>> ZONE_DMA32).
>>
>>> IMO, we should test this feature more before including this in 5.11
>> Definitely. That's one of the reasons we haven't queued it yet. So any
>> help with testing here is appreciated.
> Sure, I am running more checks on this series. I will be soon back
> with more updates.

Sorry to bother you. I am looking forward to your review comments.


Thanks,
Chen Zhou
>
> Regards,
> Bhupesh
>
> .
>
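
The low_size expression quoted above can be evaluated on its own to see why the low reservation never drops below 256M. In this sketch `swiotlb_bytes` stands in for the result of swiotlb_size_or_default(); the helper and macro names are hypothetical.

```c
#include <stdint.h>

#define SKETCH_MIB(x) ((uint64_t)(x) << 20)

/*
 * Mirror of the quoted expression:
 *   low_size = max(swiotlb size + 8 MiB, 256 MiB)
 */
uint64_t sketch_crash_low_size(uint64_t swiotlb_bytes)
{
	uint64_t want = swiotlb_bytes + SKETCH_MIB(8);
	uint64_t minimum = SKETCH_MIB(256);

	return want > minimum ? want : minimum;
}
```

With a swiotlb size of 0, or with an example size of 64 MiB, the result is still 256 MiB; only a swiotlb larger than 248 MiB raises it. That is the behaviour Bhupesh is questioning for boards that boot with swiotlb disabled.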

[PATCH] random: Fix missing-prototypes in random.c

2020-10-18 Thread Tian Tao

Fix the following warnings.
drivers/char/random.c:2297:6: warning: no previous prototype for
‘add_hwgenerator_randomness’ [-Wmissing-prototypes]

Signed-off-by: Tian Tao 
---
 drivers/char/random.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/char/random.c b/drivers/char/random.c
index d20ba1b..1bf06d6 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -321,6 +321,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
-- 
2.7.4
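
For readers unfamiliar with the warning: -Wmissing-prototypes fires when a non-static function is defined without a prior declaration in scope, and including the header that declares it is the usual fix. A minimal sketch with hypothetical names (in the kernel the declarations would sit in a shared header rather than at the top of the same file):

```c
/*
 * Declarations that would normally come from a shared header included by
 * both the callers and the defining file. All names are hypothetical.
 */
void sketch_add_randomness(const char *buf, unsigned long len);
unsigned long sketch_total(void);

static unsigned long sketch_bytes_seen;

/* A prototype is already in scope here, so -Wmissing-prototypes stays quiet. */
void sketch_add_randomness(const char *buf, unsigned long len)
{
	(void)buf;
	sketch_bytes_seen += len;
}

/* Accessor so the effect of the calls can be observed. */
unsigned long sketch_total(void)
{
	return sketch_bytes_seen;
}
```

Removing the two declarations at the top and compiling with -Wmissing-prototypes would reproduce the class of warning quoted in the commit message.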

[PATCH v2] usb: dwc3: core: fix a issue about clear connect state

2020-10-18 Thread Dejin Zheng

According to the Synopsys Programming Guide, chapter 2.2 "Register Resets",
a core soft reset triggered by setting DCTL.CSFTRST does not reset the DCTL
register itself. If the DWC3 controller acts as a slave device and stays
connected to a USB host, and the controller is not powered off while Linux
reboots, reinitializing dwc3 as a slave device fails because the connection
status is incorrect. So also clear the DCTL.RUN_STOP bit to disconnect when
doing a core soft reset.

Fixes: f59dcab176293b6 ("usb: dwc3: core: improve reset sequence")
Signed-off-by: Dejin Zheng 
---
v1 -> v2:
* modify the commit message per Sergei's suggestion. Thanks
  very much for Sergei's help!

 drivers/usb/dwc3/core.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/usb/dwc3/core.c b/drivers/usb/dwc3/core.c
index 2eb34c8b4065..239636c454c2 100644
--- a/drivers/usb/dwc3/core.c
+++ b/drivers/usb/dwc3/core.c
@@ -256,6 +256,7 @@ static int dwc3_core_soft_reset(struct dwc3 *dwc)
 
reg = dwc3_readl(dwc->regs, DWC3_DCTL);
reg |= DWC3_DCTL_CSFTRST;
+   reg &= ~DWC3_DCTL_RUN_STOP;
dwc3_writel(dwc->regs, DWC3_DCTL, reg);
 
/*
-- 
2.25.0
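
The register update in this hunk amounts to setting one bit and clearing another in the same write. A minimal sketch of that bit manipulation follows; the bit positions and names are assumptions for illustration and should be checked against the driver's core.h and the databook.

```c
#include <stdint.h>

/* Assumed bit positions, for illustration only. */
#define SKETCH_DCTL_RUN_STOP (1u << 31)
#define SKETCH_DCTL_CSFTRST  (1u << 30)

/* Return the DCTL value to write when starting a core soft reset. */
uint32_t sketch_dctl_soft_reset(uint32_t reg)
{
	reg |= SKETCH_DCTL_CSFTRST;    /* request the core soft reset */
	reg &= ~SKETCH_DCTL_RUN_STOP;  /* also drop the connection to the host */
	return reg;
}
```

All other bits of the register value pass through unchanged.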

Re: [PATCH v3 7/7] zram: Use local lock to protect per-CPU data

2020-10-18 Thread Hugh Dickins

On Sun, Oct 18, 2020 at 6:53 PM Yu Zhao  wrote:
>
> On Wed, May 27, 2020 at 10:11:19PM +0200, Sebastian Andrzej Siewior wrote:
> > From: Mike Galbraith 
> >
> > The zcomp driver uses per-CPU compression. The per-CPU data pointer is
> > acquired with get_cpu_ptr() which implicitly disables preemption.
> > It allocates memory inside the preempt disabled region which conflicts
> > with the PREEMPT_RT semantics.
> >
> > Replace the implicit preemption control with an explicit local lock.
> > This allows RT kernels to substitute it with a real per CPU lock, which
> > serializes the access but keeps the code section preemptible. On non RT
> > kernels this maps to preempt_disable() as before, i.e. no functional
> > change.
>
> Hi,
>
> This change seems to have introduced a potential deadlock. Can you
> please take a look?

Probably needs Peter's fix
https://lore.kernel.org/lkml/20201016124009.gq2...@hirez.programming.kicks-ass.net/

>
> Thank you.
>
> [   40.030778] ==
> [   40.037706] WARNING: possible circular locking dependency detected
> [   40.044637] 5.9.0-74216-g5c9472ed6825 #1 Tainted: GW
> [   40.051759] --
> [   40.058685] swapon/586 is trying to acquire lock:
> [   40.063950] e8c0ee60 (&zstrm->lock){+.+.}-{2:2}, at: local_lock_acquire+0x5/0x70 [zram]
> [   40.073739]
> [   40.073739] but task is already holding lock:
> [   40.080277] 888101a1f438 (&zspage->lock){.+.+}-{2:2}, at: zs_map_object+0x73/0x28d
> [   40.089182]
> [   40.089182] which lock already depends on the new lock.
> [   40.089182]
> [   40.098344]
> [   40.098344] the existing dependency chain (in reverse order) is:
> [   40.106715]
> [   40.106715] -> #1 (&zspage->lock){.+.+}-{2:2}:
> [   40.113386]lock_acquire+0x1cd/0x2c3
> [   40.118083]_raw_read_lock+0x44/0x78
> [   40.122781]zs_map_object+0x73/0x28d
> [   40.127479]zram_bvec_rw+0x42e/0x75d [zram]
> [   40.132855]zram_submit_bio+0x1fc/0x2d7 [zram]
> [   40.138526]submit_bio_noacct+0x11b/0x372
> [   40.143709]submit_bio+0xfd/0x1b5
> [   40.148113]__block_write_full_page+0x302/0x56f
> [   40.153877]__writepage+0x1e/0x74
> [   40.158281]write_cache_pages+0x404/0x59a
> [   40.163461]generic_writepages+0x53/0x82
> [   40.168545]do_writepages+0x33/0x74
> [   40.173145]__filemap_fdatawrite_range+0x91/0xac
> [   40.179005]file_write_and_wait_range+0x39/0x87
> [   40.184769]blkdev_fsync+0x19/0x3e
> [   40.189272]do_fsync+0x39/0x5c
> [   40.193384]__x64_sys_fsync+0x13/0x17
> [   40.198178]do_syscall_64+0x37/0x45
> [   40.202776]entry_SYSCALL_64_after_hwframe+0x44/0xa9
> [   40.209022]
> [   40.209022] -> #0 (&zstrm->lock){+.+.}-{2:2}:
> [   40.215589]validate_chain+0x1966/0x21a8
> [   40.220673]__lock_acquire+0x941/0xbba
> [   40.225552]lock_acquire+0x1cd/0x2c3
> [   40.230250]local_lock_acquire+0x21/0x70 [zram]
> [   40.236015]zcomp_stream_get+0x33/0x4d [zram]
> [   40.241585]zram_bvec_rw+0x476/0x75d [zram]
> [   40.246963]zram_rw_page+0xd8/0x17c [zram]
> [   40.252240]bdev_read_page+0x7a/0x9d
> [   40.256933]do_mpage_readpage+0x6b2/0x860
> [   40.262101]mpage_readahead+0x136/0x245
> [   40.267089]read_pages+0x60/0x1f9
> [   40.271492]page_cache_ra_unbounded+0x211/0x27b
> [   40.277251]generic_file_buffered_read+0x188/0xd4d
> [   40.283296]new_sync_read+0x10c/0x143
> [   40.288088]vfs_read+0xf4/0x1a5
> [   40.292285]ksys_read+0x73/0xd3
> [   40.296483]do_syscall_64+0x37/0x45
> [   40.301072]entry_SYSCALL_64_after_hwframe+0x44/0xa9
> [   40.307319]
> [   40.307319] other info that might help us debug this:
> [   40.307319]
> [   40.316285]  Possible unsafe locking scenario:
> [   40.316285]
> [   40.322907]        CPU0                    CPU1
> [   40.327972]        ----                    ----
> [   40.333041]   lock(&zspage->lock);
> [   40.336874]                                lock(&zstrm->lock);
> [   40.343424]                                lock(&zspage->lock);
> [   40.350071]   lock(&zstrm->lock);
> [   40.353803]
> [   40.353803]  *** DEADLOCK ***
>
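
The report above describes a lock-order inversion: one path takes zstrm->lock and then zspage->lock, another takes them in the opposite order. The toy tracker below shows the idea of recording acquisition order and flagging the reverse order when it appears. It is a deliberately simplified sketch with hypothetical names; it only catches direct two-lock inversions, whereas lockdep validates whole dependency chains.

```c
#include <string.h>

enum { SKETCH_LOCK_ZSPAGE, SKETCH_LOCK_ZSTRM, SKETCH_LOCK_CLASSES };

/* seen[a][b] != 0 means "b has been acquired while a was held". */
static unsigned char sketch_seen[SKETCH_LOCK_CLASSES][SKETCH_LOCK_CLASSES];

/* Forget all recorded orderings. */
void sketch_order_reset(void)
{
	memset(sketch_seen, 0, sizeof(sketch_seen));
}

/*
 * Record that lock class `next` is taken while `held` is already held.
 * Returns 1 if the opposite order was recorded earlier (a possible
 * ABBA deadlock), 0 otherwise.
 */
int sketch_order_acquire(int held, int next)
{
	int inversion = sketch_seen[next][held];

	sketch_seen[held][next] = 1;
	return inversion;
}
```

Feeding it the two orders from the trace (zstrm then zspage, later zspage then zstrm) makes the second call report an inversion.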

Re: [PATCH v1 25/29] virtio-mem: Big Block Mode (BBM) memory hotplug

2020-10-18 Thread Wei Yang

On Mon, Oct 12, 2020 at 02:53:19PM +0200, David Hildenbrand wrote:
>Currently, we do not support device block sizes that exceed the Linux
>memory block size. For example, having a device block size of 1 GiB (e.g.,
>gigantic pages in the hypervisor) won't work with 128 MiB Linux memory
>blocks.
>
>Let's implement Big Block Mode (BBM), whereby we add/remove at least
>one Linux memory block at a time. With a 1 GiB device block size, a Big
>Block (BB) will cover 8 Linux memory blocks.
>
>We'll keep registering the online_page_callback machinery, it will be used
>for safe memory hotunplug in BBM next.
>
>Note: BBM is properly prepared for variable-sized Linux memory
>blocks that we might see in the future. So we won't care how many Linux
>memory blocks a big block actually spans, and how the memory notifier is
>called.
>
>Cc: "Michael S. Tsirkin" 
>Cc: Jason Wang 
>Cc: Pankaj Gupta 
>Cc: Michal Hocko 
>Cc: Oscar Salvador 
>Cc: Wei Yang 
>Cc: Andrew Morton 
>Signed-off-by: David Hildenbrand 
>---
> drivers/virtio/virtio_mem.c | 484 ++--
> 1 file changed, 402 insertions(+), 82 deletions(-)
>
>diff --git a/drivers/virtio/virtio_mem.c b/drivers/virtio/virtio_mem.c
>index e68d0d99590c..4d396ef98a92 100644
>--- a/drivers/virtio/virtio_mem.c
>+++ b/drivers/virtio/virtio_mem.c
>@@ -30,12 +30,18 @@ MODULE_PARM_DESC(unplug_online, "Try to unplug online memory");
> /*
>  * virtio-mem currently supports the following modes of operation:
>  *
>- * * Sub Block Mode (SBM): A Linux memory block spans 1..X subblocks (SB). The
>+ * * Sub Block Mode (SBM): A Linux memory block spans 2..X subblocks (SB). The
>  *   size of a Sub Block (SB) is determined based on the device block size, the
>  *   pageblock size, and the maximum allocation granularity of the buddy.
>  *   Subblocks within a Linux memory block might either be plugged or unplugged.
>  *   Memory is added/removed to Linux MM in Linux memory block granularity.
>  *
>+ * * Big Block Mode (BBM): A Big Block (BB) spans 1..X Linux memory blocks.
>+ *   Memory is added/removed to Linux MM in Big Block granularity.
>+ *
>+ * The mode is determined automatically based on the Linux memory block size
>+ * and the device block size.
>+ *
>  * User space / core MM (auto onlining) is responsible for onlining added
>  * Linux memory blocks - and for selecting a zone. Linux Memory Blocks are
>  * always onlined separately, and all memory within a Linux memory block is
>@@ -61,6 +67,19 @@ enum virtio_mem_sbm_mb_state {
>   VIRTIO_MEM_SBM_MB_COUNT
> };
> 
>+/*
>+ * State of a Big Block (BB) in BBM, covering 1..X Linux memory blocks.
>+ */
>+enum virtio_mem_bbm_bb_state {
>+  /* Unplugged, not added to Linux. Can be reused later. */
>+  VIRTIO_MEM_BBM_BB_UNUSED = 0,
>+  /* Plugged, not added to Linux. Error on add_memory(). */
>+  VIRTIO_MEM_BBM_BB_PLUGGED,
>+  /* Plugged and added to Linux. */
>+  VIRTIO_MEM_BBM_BB_ADDED,
>+  VIRTIO_MEM_BBM_BB_COUNT
>+};
>+
> struct virtio_mem {
>   struct virtio_device *vdev;
> 
>@@ -113,6 +132,9 @@ struct virtio_mem {
>   atomic64_t offline_size;
>   uint64_t offline_threshold;
> 
>+  /* If set, the driver is in SBM, otherwise in BBM. */
>+  bool in_sbm;
>+
>   struct {
>   /* Id of the first memory block of this device. */
>   unsigned long first_mb_id;
>@@ -151,9 +173,27 @@ struct virtio_mem {
>   unsigned long *sb_states;
>   } sbm;
> 
>+  struct {
>+  /* Id of the first big block of this device. */
>+  unsigned long first_bb_id;
>+  /* Id of the last usable big block of this device. */
>+  unsigned long last_usable_bb_id;
>+  /* Id of the next device block to prepare when needed. */
>+  unsigned long next_bb_id;
>+
>+  /* Summary of all big block states. */
>+  unsigned long bb_count[VIRTIO_MEM_BBM_BB_COUNT];
>+
>+  /* One byte state per big block. See sbm.mb_states. */
>+  uint8_t *bb_states;
>+
>+  /* The block size used for (un)plugging, adding/removing. */
>+  uint64_t bb_size;
>+  } bbm;
>+
>   /*
>-   * Mutex that protects the sbm.mb_count, sbm.mb_states, and
>-   * sbm.sb_states.
>+   * Mutex that protects the sbm.mb_count, sbm.mb_states,
>+   * sbm.sb_states, bbm.bb_count, and bbm.bb_states
>*
>* When this lock is held the pointers can't change, ONLINE and
>* OFFLINE blocks can't change the state and no subblocks will get
>@@ -247,6 +287,24 @@ static unsigned long virtio_mem_mb_id_to_phys(unsigned long mb_id)
>   return mb_id * memory_block_size_bytes();
> }
> 
>+/*
>+ * Calculate the big block id of a given address.
>+ */
>+static unsigned long virtio_mem_phys_to_bb_id(struct virtio_mem *vm,
>+uint64_t addr)
>+{
>+  return addr / vm->bbm.bb_size;
>+}
>+
>+/*
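
The arithmetic behind Big Block Mode is simple enough to check by hand: the big block id is the physical address divided by the big block size, and a 1 GiB big block spans eight 128 MiB Linux memory blocks. A minimal sketch, with hypothetical helper names and the assumption that the big block size is a multiple of the memory block size:

```c
#include <stdint.h>

#define SKETCH_MIB (1024ULL * 1024ULL)

/* Big block id of a physical address, mirroring virtio_mem_phys_to_bb_id(). */
uint64_t sketch_phys_to_bb_id(uint64_t addr, uint64_t bb_size)
{
	return addr / bb_size;
}

/* Number of Linux memory blocks one big block spans (bb_size % mb_size == 0 assumed). */
uint64_t sketch_mbs_per_bb(uint64_t bb_size, uint64_t mb_size)
{
	return bb_size / mb_size;
}
```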
>+

Re: [PATCH 8/9] dma-mapping: move large parts of <linux/dma-direct.h> to kernel/dma

2020-10-18 Thread Alexey Kardashevskiy





On 30/09/2020 18:55, Christoph Hellwig wrote:

Most of the dma_direct symbols should only be used by direct.c and
mapping.c, so move them to kernel/dma.  In fact more of dma-direct.h
should eventually move, but that will require more coordination with
other subsystems.


Because of this change, 
http://patchwork.ozlabs.org/project/linuxppc-dev/patch/20200713062348.100552-1-...@ozlabs.ru/ 
does not work anymore.


Should I send a patch moving
dma_direct_map_sg/dma_direct_map_page/+unmap back to include/, or is
there a better idea? Thanks,





Signed-off-by: Christoph Hellwig 
---
  include/linux/dma-direct.h | 106 -
  kernel/dma/direct.c|   2 +-
  kernel/dma/direct.h| 119 +
  kernel/dma/mapping.c   |   2 +-
  4 files changed, 121 insertions(+), 108 deletions(-)
  create mode 100644 kernel/dma/direct.h

diff --git a/include/linux/dma-direct.h b/include/linux/dma-direct.h
index 38ed3b55034d50..a2d6640c42c04e 100644
--- a/include/linux/dma-direct.h
+++ b/include/linux/dma-direct.h
@@ -120,114 +120,8 @@ struct page *dma_direct_alloc_pages(struct device *dev, size_t size,
  void dma_direct_free_pages(struct device *dev, size_t size,
struct page *page, dma_addr_t dma_addr,
enum dma_data_direction dir);
-int dma_direct_get_sgtable(struct device *dev, struct sg_table *sgt,
-   void *cpu_addr, dma_addr_t dma_addr, size_t size,
-   unsigned long attrs);
-bool dma_direct_can_mmap(struct device *dev);
-int dma_direct_mmap(struct device *dev, struct vm_area_struct *vma,
-   void *cpu_addr, dma_addr_t dma_addr, size_t size,
-   unsigned long attrs);
  int dma_direct_supported(struct device *dev, u64 mask);
-bool dma_direct_need_sync(struct device *dev, dma_addr_t dma_addr);
-int dma_direct_map_sg(struct device *dev, struct scatterlist *sgl, int nents,
-   enum dma_data_direction dir, unsigned long attrs);
  dma_addr_t dma_direct_map_resource(struct device *dev, phys_addr_t paddr,
size_t size, enum dma_data_direction dir, unsigned long attrs);
-size_t dma_direct_max_mapping_size(struct device *dev);
  
-#if defined(CONFIG_ARCH_HAS_SYNC_DMA_FOR_DEVICE) || \
-defined(CONFIG_SWIOTLB)
-void dma_direct_sync_sg_for_device(struct device *dev, struct scatterlist *sgl,
-   int nents, enum dma_data_direction dir);
-#else
-static inline void dma_direct_sync_sg_for_device(struct device *dev,
-   struct scatterlist *sgl, int nents, enum dma_data_direction dir)
-{
-}
-#endif
-
-#if defined(CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU) || \
-defined(CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU_ALL) || \
-defined(CONFIG_SWIOTLB)
-void dma_direct_unmap_sg(struct device *dev, struct scatterlist *sgl,
-   int nents, enum dma_data_direction dir, unsigned long attrs);
-void dma_direct_sync_sg_for_cpu(struct device *dev,
-   struct scatterlist *sgl, int nents, enum dma_data_direction dir);
-#else
-static inline void dma_direct_unmap_sg(struct device *dev,
-   struct scatterlist *sgl, int nents, enum dma_data_direction dir,
-   unsigned long attrs)
-{
-}
-static inline void dma_direct_sync_sg_for_cpu(struct device *dev,
-   struct scatterlist *sgl, int nents, enum dma_data_direction dir)
-{
-}
-#endif
-
-static inline void dma_direct_sync_single_for_device(struct device *dev,
-   dma_addr_t addr, size_t size, enum dma_data_direction dir)
-{
-   phys_addr_t paddr = dma_to_phys(dev, addr);
-
-   if (unlikely(is_swiotlb_buffer(paddr)))
-   swiotlb_tbl_sync_single(dev, paddr, size, dir, SYNC_FOR_DEVICE);
-
-   if (!dev_is_dma_coherent(dev))
-   arch_sync_dma_for_device(paddr, size, dir);
-}
-
-static inline void dma_direct_sync_single_for_cpu(struct device *dev,
-   dma_addr_t addr, size_t size, enum dma_data_direction dir)
-{
-   phys_addr_t paddr = dma_to_phys(dev, addr);
-
-   if (!dev_is_dma_coherent(dev)) {
-   arch_sync_dma_for_cpu(paddr, size, dir);
-   arch_sync_dma_for_cpu_all();
-   }
-
-   if (unlikely(is_swiotlb_buffer(paddr)))
-   swiotlb_tbl_sync_single(dev, paddr, size, dir, SYNC_FOR_CPU);
-
-   if (dir == DMA_FROM_DEVICE)
-   arch_dma_mark_clean(paddr, size);
-}
-
-static inline dma_addr_t dma_direct_map_page(struct device *dev,
-   struct page *page, unsigned long offset, size_t size,
-   enum dma_data_direction dir, unsigned long attrs)
-{
-   phys_addr_t phys = page_to_phys(page) + offset;
-   dma_addr_t dma_addr = phys_to_dma(dev, phys);
-
-   if (unlikely(swiotlb_force == SWIOTLB_FORCE))
-   return swiotlb_map(dev, phys, size, dir, attrs);
-
-   if (unlikely(!dma_capable(dev, dma_addr, size, true))) {
-   if (swiotlb_force != SWIOTLB_NO_FORCE)
-

Re: [PATCH v2 1/7] staging: qlge: replace ql_* with qlge_* to avoid namespace clashes with other qlogic drivers

2020-10-18 Thread Coiby Xu

On Sun, Oct 18, 2020 at 08:02:53PM +0900, Benjamin Poirier wrote:

On 2020-10-17 07:16 +0800, Coiby Xu wrote:

On Thu, Oct 15, 2020 at 10:01:36AM +0900, Benjamin Poirier wrote:
> On 2020-10-14 18:43 +0800, Coiby Xu wrote:
> > To avoid namespace clashes with other qlogic drivers and also for the
> > sake of naming consistency, use the "qlge_" prefix as suggested in
> > drivers/staging/qlge/TODO.
> >
> > Suggested-by: Benjamin Poirier 
> > Signed-off-by: Coiby Xu 
> > ---
> >  drivers/staging/qlge/TODO   |4 -
> >  drivers/staging/qlge/qlge.h |  190 ++--
> >  drivers/staging/qlge/qlge_dbg.c | 1073 ---
> >  drivers/staging/qlge/qlge_ethtool.c |  231 ++---
> >  drivers/staging/qlge/qlge_main.c| 1257 +--
> >  drivers/staging/qlge/qlge_mpi.c |  352 
> >  6 files changed, 1551 insertions(+), 1556 deletions(-)
> >
> > diff --git a/drivers/staging/qlge/TODO b/drivers/staging/qlge/TODO
> > index f93f7428f5d5..5ac55664c3e2 100644
> > --- a/drivers/staging/qlge/TODO
> > +++ b/drivers/staging/qlge/TODO
> > @@ -28,10 +28,6 @@
> >  * the driver has a habit of using runtime checks where compile time checks are
> >possible (ex. ql_free_rx_buffers(), ql_alloc_rx_buffers())
> >  * reorder struct members to avoid holes if it doesn't impact performance
> > -* in terms of namespace, the driver uses either qlge_, ql_ (used by
> > -  other qlogic drivers, with clashes, ex: ql_sem_spinlock) or nothing (with
> > -  clashes, ex: struct ob_mac_iocb_req). Rename everything to use the "qlge_"
> > -  prefix.
>
> You only renamed ql -> qlge. The prefix needs to be added where there is
> currently none like the second example of that text.

On second thought, these structs like ob_mac_iocb_req are defined in
local headers and there is no mixed usage. So even when we want to
build this driver and other qlogic drivers into the kernel instead of
as separate modules, it won't lead to real problems, is that right?

Using cscope or ctags and searching for ob_mac_iocb_req will yield
ambiguous results. It might also make things more difficult if using a
debugger.

Even if I have been using gtags, I didn't realize it. Thank you for
explaining it to me.

Looking at other drivers (ex. ice, mlx5), they use a prefix for their
private types, just like for their static functions, even though it's
not absolutely necessary. I think it's helpful when reading the code
because it quickly shows that it is something that was defined in the
driver, not in some subsystem.

Good point!

I didn't think about it earlier but it would have been better to leave
out this renaming to a subsequent patchset, since this change is
unrelated to the debugging features.

It seems it's more reasonable to do renaming first. So in a sense, the
renaming is a preparatory step for the debugging features.

--
Best regards,
Coiby

[PATCH v4 1/2] exfat: add exfat_update_inode()

2020-10-18 Thread Tetsuhiro Kohada

Integrate exfat_sync_inode() and mark_inode_dirty() as exfat_update_inode().
Also, return the result of __exfat_write_inode() when sync is specified.

Signed-off-by: Tetsuhiro Kohada 
---
Changes in v4
 - no change
Changes in v3
 - no change
Changes in v2
 - no change

 fs/exfat/exfat_fs.h |  2 +-
 fs/exfat/file.c |  5 +
 fs/exfat/inode.c|  9 +++--
 fs/exfat/namei.c| 35 +++
 4 files changed, 16 insertions(+), 35 deletions(-)

diff --git a/fs/exfat/exfat_fs.h b/fs/exfat/exfat_fs.h
index b8f0e829ecbd..ec0ee516aee2 100644
--- a/fs/exfat/exfat_fs.h
+++ b/fs/exfat/exfat_fs.h
@@ -466,7 +466,7 @@ int exfat_count_dir_entries(struct super_block *sb, struct exfat_chain *p_dir);
 
 /* inode.c */
 extern const struct inode_operations exfat_file_inode_operations;
-void exfat_sync_inode(struct inode *inode);
+int exfat_update_inode(struct inode *inode);
 struct inode *exfat_build_inode(struct super_block *sb,
struct exfat_dir_entry *info, loff_t i_pos);
 void exfat_hash_inode(struct inode *inode, loff_t i_pos);
diff --git a/fs/exfat/file.c b/fs/exfat/file.c
index a92478eabfa4..e510b95dbf77 100644
--- a/fs/exfat/file.c
+++ b/fs/exfat/file.c
@@ -245,10 +245,7 @@ void exfat_truncate(struct inode *inode, loff_t size)
goto write_size;
 
inode->i_ctime = inode->i_mtime = current_time(inode);
-   if (IS_DIRSYNC(inode))
-   exfat_sync_inode(inode);
-   else
-   mark_inode_dirty(inode);
+   exfat_update_inode(inode);
 
inode->i_blocks = ((i_size_read(inode) + (sbi->cluster_size - 1)) &
~(sbi->cluster_size - 1)) >> inode->i_blkbits;
diff --git a/fs/exfat/inode.c b/fs/exfat/inode.c
index 730373e0965a..5a55303e1f65 100644
--- a/fs/exfat/inode.c
+++ b/fs/exfat/inode.c
@@ -91,10 +91,15 @@ int exfat_write_inode(struct inode *inode, struct writeback_control *wbc)
return ret;
 }
 
-void exfat_sync_inode(struct inode *inode)
+int exfat_update_inode(struct inode *inode)
 {
lockdep_assert_held(&EXFAT_SB(inode->i_sb)->s_lock);
-   __exfat_write_inode(inode, 1);
+
+   if (IS_DIRSYNC(inode))
+   return __exfat_write_inode(inode, 1);
+
+   mark_inode_dirty(inode);
+   return 0;
 }
 
 /*
diff --git a/fs/exfat/namei.c b/fs/exfat/namei.c
index 2932b23a3b6c..1f5f72eb5baf 100644
--- a/fs/exfat/namei.c
+++ b/fs/exfat/namei.c
@@ -561,10 +561,7 @@ static int exfat_create(struct inode *dir, struct dentry *dentry, umode_t mode,
 
inode_inc_iversion(dir);
dir->i_ctime = dir->i_mtime = current_time(dir);
-   if (IS_DIRSYNC(dir))
-   exfat_sync_inode(dir);
-   else
-   mark_inode_dirty(dir);
+   exfat_update_inode(dir);
 
i_pos = exfat_make_i_pos(&info);
inode = exfat_build_inode(sb, &info, i_pos);
@@ -812,10 +809,7 @@ static int exfat_unlink(struct inode *dir, struct dentry *dentry)
inode_inc_iversion(dir);
dir->i_mtime = dir->i_atime = current_time(dir);
exfat_truncate_atime(&dir->i_atime);
-   if (IS_DIRSYNC(dir))
-   exfat_sync_inode(dir);
-   else
-   mark_inode_dirty(dir);
+   exfat_update_inode(dir);
 
clear_nlink(inode);
inode->i_mtime = inode->i_atime = current_time(inode);
@@ -846,10 +840,7 @@ static int exfat_mkdir(struct inode *dir, struct dentry *dentry, umode_t mode)
 
inode_inc_iversion(dir);
dir->i_ctime = dir->i_mtime = current_time(dir);
-   if (IS_DIRSYNC(dir))
-   exfat_sync_inode(dir);
-   else
-   mark_inode_dirty(dir);
+   exfat_update_inode(dir);
inc_nlink(dir);
 
i_pos = exfat_make_i_pos(&info);
@@ -976,10 +967,7 @@ static int exfat_rmdir(struct inode *dir, struct dentry *dentry)
inode_inc_iversion(dir);
dir->i_mtime = dir->i_atime = current_time(dir);
exfat_truncate_atime(&dir->i_atime);
-   if (IS_DIRSYNC(dir))
-   exfat_sync_inode(dir);
-   else
-   mark_inode_dirty(dir);
+   exfat_update_inode(dir);
drop_nlink(dir);
 
clear_nlink(inode);
@@ -1347,19 +1335,13 @@ static int exfat_rename(struct inode *old_dir, struct dentry *old_dentry,
new_dir->i_ctime = new_dir->i_mtime = new_dir->i_atime =
EXFAT_I(new_dir)->i_crtime = current_time(new_dir);
exfat_truncate_atime(&new_dir->i_atime);
-   if (IS_DIRSYNC(new_dir))
-   exfat_sync_inode(new_dir);
-   else
-   mark_inode_dirty(new_dir);
+   exfat_update_inode(new_dir);
 
i_pos = ((loff_t)EXFAT_I(old_inode)->dir.dir << 32) |
(EXFAT_I(old_inode)->entry & 0x);
exfat_unhash_inode(old_inode);
exfat_hash_inode(old_inode, i_pos);
-   if (IS_DIRSYNC(new_dir))
-   exfat_sync_inode(old_inode);
-   else
-   mark_inode_dirty(old_inode);
+   exfat_update_inode(
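
The consolidation in this patch can be illustrated outside the kernel. The
following is only a userspace sketch: struct mock_inode and every mock_*
helper are hypothetical stand-ins for struct inode, IS_DIRSYNC(),
__exfat_write_inode() and mark_inode_dirty(), and none of it is the actual
exfat code. It shows the pattern the diff applies: one helper decides between
a synchronous write-back (propagating its result) and marking the inode dirty.

```c
/* Userspace model only; all names below are hypothetical stand-ins. */
#include <assert.h>
#include <stdbool.h>

struct mock_inode {
	bool dirsync;   /* stands in for IS_DIRSYNC(inode) */
	bool dirty;     /* set by the mock of mark_inode_dirty() */
	int  writes;    /* number of synchronous write-backs performed */
	int  write_err; /* value the mock write-back returns (0 = success) */
};

/* Stand-in for __exfat_write_inode(inode, sync). */
int mock_write_inode(struct mock_inode *inode, int sync)
{
	(void)sync;
	inode->writes++;
	return inode->write_err;
}

/* Stand-in for mark_inode_dirty(inode). */
void mock_mark_inode_dirty(struct mock_inode *inode)
{
	inode->dirty = true;
}

/*
 * Mirrors the shape of exfat_update_inode() in the diff above:
 * write through and return the result when the inode is DIRSYNC,
 * otherwise defer by marking the inode dirty and report success.
 */
int mock_update_inode(struct mock_inode *inode)
{
	if (inode->dirsync)
		return mock_write_inode(inode, 1);

	mock_mark_inode_dirty(inode);
	return 0;
}
```

With this shape each caller shrinks from a four-line if/else to a single
call, and a failed synchronous write is no longer silently dropped because
the helper returns int instead of void.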

[PATCH v4 2/2] exfat: aggregate dir-entry updates into __exfat_write_inode().

2020-10-18 Thread Tetsuhiro Kohada

The following functions each write the updated inode information to the
dir-entry by themselves:
 - __exfat_truncate()
 - exfat_map_cluster()
 - exfat_find_empty_entry()
Aggregate these writes into __exfat_write_inode().

In exfat_map_cluster(), the value obtained from i_size_read() is set to
stream.valid_size and stream.size.
However, in the context of get_block(), inode->i_size has not been set yet,
so the same value as current will be set, which is a meaningless update.
Furthermore, if it is called with previous size=0, the newly allocated
cluster number will be set to stream.start_clu, and stream.valid_size/size
will be 0, which is illegal.
Update stream.valid_size/size and stream.start_clu when __exfat_write_inode
is called after i_size is set, to prevent meaningless/illegal updates.

Others:
 - Remove double inode-update in __exfat_truncate() and exfat_truncate().
 - In __exfat_write_inode(), rename 'on_disk_size' to 'filesize' and
   add adjustment when filesize is 0.

Reported-by: kernel test robot 
Signed-off-by: Tetsuhiro Kohada 
---
Changes in v4
 - Remove debug message
Changes in v3
 - Remove update_inode() in exfat_map_cluster()/exfat_truncate()
 - Update commit-message
Changes in v2
 - Fix endian issue

 fs/exfat/file.c  | 52 +---
 fs/exfat/inode.c | 42 +++---
 fs/exfat/namei.c | 26 +---
 3 files changed, 17 insertions(+), 103 deletions(-)

diff --git a/fs/exfat/file.c b/fs/exfat/file.c
index e510b95dbf77..211fb947747a 100644
--- a/fs/exfat/file.c
+++ b/fs/exfat/file.c
@@ -100,7 +100,7 @@ int __exfat_truncate(struct inode *inode, loff_t new_size)
struct super_block *sb = inode->i_sb;
struct exfat_sb_info *sbi = EXFAT_SB(sb);
struct exfat_inode_info *ei = EXFAT_I(inode);
-   int evict = (ei->dir.dir == DIR_DELETED) ? 1 : 0;
+   int ret;
 
/* check if the given file ID is opened */
if (ei->type != TYPE_FILE && ei->type != TYPE_DIR)
@@ -150,49 +150,10 @@ int __exfat_truncate(struct inode *inode, loff_t new_size)
ei->attr |= ATTR_ARCHIVE;
 
/* update the directory entry */
-   if (!evict) {
-   struct timespec64 ts;
-   struct exfat_dentry *ep, *ep2;
-   struct exfat_entry_set_cache *es;
-   int err;
-
-   es = exfat_get_dentry_set(sb, &(ei->dir), ei->entry,
-   ES_ALL_ENTRIES);
-   if (!es)
-   return -EIO;
-   ep = exfat_get_dentry_cached(es, 0);
-   ep2 = exfat_get_dentry_cached(es, 1);
-
-   ts = current_time(inode);
-   exfat_set_entry_time(sbi, &ts,
-   &ep->dentry.file.modify_tz,
-   &ep->dentry.file.modify_time,
-   &ep->dentry.file.modify_date,
-   &ep->dentry.file.modify_time_cs);
-   ep->dentry.file.attr = cpu_to_le16(ei->attr);
-
-   /* File size should be zero if there is no cluster allocated */
-   if (ei->start_clu == EXFAT_EOF_CLUSTER) {
-   ep2->dentry.stream.valid_size = 0;
-   ep2->dentry.stream.size = 0;
-   } else {
-   ep2->dentry.stream.valid_size = cpu_to_le64(new_size);
-   ep2->dentry.stream.size = ep2->dentry.stream.valid_size;
-   }
-
-   if (new_size == 0) {
-   /* Any directory can not be truncated to zero */
-   WARN_ON(ei->type != TYPE_FILE);
-
-   ep2->dentry.stream.flags = ALLOC_FAT_CHAIN;
-   ep2->dentry.stream.start_clu = EXFAT_FREE_CLUSTER;
-   }
-
-   exfat_update_dir_chksum_with_entry_set(es);
-   err = exfat_free_dentry_set(es, inode_needs_sync(inode));
-   if (err)
-   return err;
-   }
+   inode->i_ctime = inode->i_mtime = current_time(inode);
+   ret = exfat_update_inode(inode);
+   if (ret)
+   return ret;
 
/* cut off from the FAT chain */
if (ei->flags == ALLOC_FAT_CHAIN && last_clu != EXFAT_FREE_CLUSTER &&
@@ -244,9 +205,6 @@ void exfat_truncate(struct inode *inode, loff_t size)
if (err)
goto write_size;
 
-   inode->i_ctime = inode->i_mtime = current_time(inode);
-   exfat_update_inode(inode);
-
inode->i_blocks = ((i_size_read(inode) + (sbi->cluster_size - 1)) &
~(sbi->cluster_size - 1)) >> inode->i_blkbits;
 write_size:
diff --git a/fs/exfat/inode.c b/fs/exfat/inode.c
index 5a55303e1f65..3870f5a1d8cd 100644
--- a/fs/exfat/inode.c
+++ b/fs/exfat/inode.c
@@ -19,7 +19,7 @@
 
 static int __exfat_write_inode(struct inode *inode, int sync)
 {
-   unsigned long long on_disk_size;
+   unsigned long long filesize;
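
The size rules applied by the block removed from __exfat_truncate() above can
be modelled in userspace as follows. This is a sketch under assumptions, not
the code that the patch adds to __exfat_write_inode(): struct mock_stream, the
MOCK_* constants (with placeholder values) and mock_fill_stream() are all
hypothetical, and the little-endian conversions of the real dentry fields are
omitted.

```c
/* Userspace model only; names and constant values are placeholders. */
#include <assert.h>
#include <stdint.h>

#define MOCK_EOF_CLUSTER     0xFFFFFFFFu /* "no cluster allocated" marker */
#define MOCK_FREE_CLUSTER    0u
#define MOCK_ALLOC_FAT_CHAIN 0x01u

struct mock_stream {
	uint8_t  flags;
	uint32_t start_clu;
	uint64_t valid_size;
	uint64_t size;
};

/*
 * Fill the stream fields from the in-memory state:
 *  - with no cluster allocated, both sizes must be zero;
 *  - otherwise both sizes follow new_size;
 *  - a zero-length file additionally resets flags and start_clu.
 */
void mock_fill_stream(struct mock_stream *st, uint32_t start_clu,
		      uint8_t flags, uint64_t new_size)
{
	st->flags = flags;
	st->start_clu = start_clu;

	if (start_clu == MOCK_EOF_CLUSTER) {
		st->valid_size = 0;
		st->size = 0;
	} else {
		st->valid_size = new_size;
		st->size = st->valid_size;
	}

	if (new_size == 0) {
		st->flags = MOCK_ALLOC_FAT_CHAIN;
		st->start_clu = MOCK_FREE_CLUSTER;
	}
}
```

Keeping these rules in one place is what lets the dir-entry never record a
start cluster together with a zero size, the combination the commit message
calls illegal.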

