from:"Oleksandr Tyshchenko"

[PATCH] docs: fusa: Add requirements for Device Passthrough

2024-10-07 Thread Oleksandr Tyshchenko

From: Oleksandr Tyshchenko 

Add common requirements for a physical device assignment to Arm64
and AMD64 PVH domains.

Signed-off-by: Oleksandr Tyshchenko 
---
Based on:
[PATCH] docs: fusa: Replace VM with domain
https://patchew.org/Xen/20241007182603.826807-1-ayan.kumar.hal...@amd.com/
---
---
 .../reqs/design-reqs/common/passthrough.rst   | 365 ++
 docs/fusa/reqs/index.rst  |   1 +
 docs/fusa/reqs/market-reqs/reqs.rst   |  33 ++
 docs/fusa/reqs/product-reqs/common/reqs.rst   |  29 ++
 4 files changed, 428 insertions(+)
 create mode 100644 docs/fusa/reqs/design-reqs/common/passthrough.rst
 create mode 100644 docs/fusa/reqs/product-reqs/common/reqs.rst

diff --git a/docs/fusa/reqs/design-reqs/common/passthrough.rst b/docs/fusa/reqs/design-reqs/common/passthrough.rst
new file mode 100644
index 0000000000..a1d6676f65
--- /dev/null
+++ b/docs/fusa/reqs/design-reqs/common/passthrough.rst
@@ -0,0 +1,365 @@
+.. SPDX-License-Identifier: CC-BY-4.0
+
+Device Passthrough
+==================
+
+The following are the requirements related to a physical device assignment
+[1], [2] to Arm64 and AMD64 PVH domains.
+
+Requirements for both Arm64 and AMD64 PVH
+=========================================
+
+Hide IOMMU from a domain
+------------------------
+
+`XenSwdgn~passthrough_hide_iommu_from_domain~1`
+
+Description:
+Xen shall not expose the IOMMU device to the domain even if I/O virtualization
+is disabled. The IOMMU shall be under hypervisor control only.
+
+Rationale:
+
+Comments:
+
+Covers:
+ - `XenProd~device_passthrough~1`
+
+Discover PCI devices from hardware domain
+-----------------------------------------
+
+`XenSwdgn~passthrough_discover_pci_devices_from_hwdom~1`
+
+Description:
+The hardware domain shall enumerate and discover PCI devices and inform Xen
+about their appearance and disappearance.
+
+Rationale:
+
+Comments:
+
+Covers:
+ - `XenProd~device_passthrough~1`
+
+Discover PCI devices from Xen
+-----------------------------
+
+`XenSwdgn~passthrough_discover_pci_devices_from_xen~1`
+
+Description:
+Xen shall discover PCI devices (enumerated by the firmware beforehand) during
+boot if the hardware domain is not present.
+
+Rationale:
+
+Comments:
+
+Covers:
+ - `XenProd~device_passthrough~1`
+
+Assign PCI device to domain (with IOMMU)
+----------------------------------------
+
+`XenSwdgn~passthrough_assign_pci_device_with_iommu~1`
+
+Description:
+Xen shall assign a specified PCI device (always implied as DMA-capable) to
+a domain during its creation using passthrough (partial) device tree on Arm64
+and Hyperlaunch device tree on AMD-x86. The physical device to be assigned is
+protected by the IOMMU.
+
+Rationale:
+
+Comments:
+
+Covers:
+ - `XenProd~device_passthrough~1`
+
+Deassign PCI device from domain (with IOMMU)
+--------------------------------------------
+
+`XenSwdgn~passthrough_deassign_pci_device_with_iommu~1`
+
+Description:
+Xen shall deassign a specified PCI device from a domain during its destruction.
+The physical device to be deassigned is protected by the IOMMU.
+
+Rationale:
+
+Comments:
+
+Covers:
+ - `XenProd~device_passthrough~1`
+
+Forbid the same PCI device assignment to multiple domains
+---------------------------------------------------------
+
+`XenSwdgn~passthrough_forbid_same_pci_device_assignment~1`
+
+Description:
+Xen shall not assign the same PCI device to multiple domains by failing to
+create a new domain if the device to be passed through is already assigned
+to an existing domain. Also, different PCI devices which share some resources
+(interrupts, IOMMU connections) can be assigned only to the same domain.
+
+Rationale:
+
+Comments:
+
+Covers:
+ - `XenProd~device_passthrough~1`
+
+Requirements for Arm64 only
+===========================
+
+Assign interrupt-less platform device to domain
+-----------------------------------------------
+
+`XenSwdgn~passthrough_assign_interrupt_less_platform_device~1`
+
+Description:
+Xen shall assign a specified platform device that has only an MMIO region
+(does not have any interrupts) to a domain during its creation using passthrough
+device tree.
+An example of an interrupt-less device is a PWM or clock controller.
+
+Rationale:
+
+Comments:
+
+Covers:
+ - `XenProd~device_passthrough~1`
+
+Deassign interrupt-less platform device from domain
+---------------------------------------------------
+
+`XenSwdgn~passthrough_deassign_interrupt_less_platform_device~1`
+
+Description:
+Xen shall deassign a specified platform device that has only an MMIO region
+(does not have any interrupts) from a domain during its destruction.
+
+Rationale:
+
+Comments:
+
+Covers:
+ - `XenProd~device_passthrough~1`
+
+Assign non-DMA-capable platform device to domain
+------------------------------------------------
+
+`XenSwdgn~passthrough_assign_non_dma_platform_device~1`
+
+Description:
+Xen shall assign a specified non-DMA-capable platform device to a domain during
+its creation using passthrough device tree

Re: [PATCH 2/2] xen/arm: Add i.MX UART driver

2024-04-18 Thread Oleksandr Tyshchenko





On 07.04.24 05:43, Peng Fan wrote:

Hi Oleksandr,


Hello Peng




Subject: [PATCH 2/2] xen/arm: Add i.MX UART driver

From: Oleksandr Tyshchenko 

The i.MX UART Documentation:
https://www.nxp.com/webapp/Download?colCode=IMX8MMRM
Chapter 16.2 Universal Asynchronous Receiver/Transmitter (UART)

Tested on i.MX 8M Mini only, but I guess, it should be suitable for other
i.MX8M* SoCs (those UART device tree nodes contain "fsl,imx6q-uart"
compatible string).


Good to see people are interested in XEN on 8M.
I had an implementation back in 2015, you could take a look.


Thanks.


When I searched for what was publicly available for Xen on the i.MX 8M Mini
specifically (before I started writing this driver), I didn't find that
implementation.


Interesting to compare


[snip]

Re: [PATCH 2/2] xen/arm: Add i.MX UART driver

2024-04-18 Thread Oleksandr Tyshchenko





On 04.04.24 09:54, Michal Orzel wrote:

Hi Oleksandr,


Hello Michal

sorry for the late response



On 02/04/2024 14:05, Oleksandr Tyshchenko wrote:



From: Oleksandr Tyshchenko 

The i.MX UART Documentation:
https://www.nxp.com/webapp/Download?colCode=IMX8MMRM
Chapter 16.2 Universal Asynchronous Receiver/Transmitter (UART)

Tested on i.MX 8M Mini only, but I guess, it should be

imperative mood


ok




suitable for other i.MX8M* SoCs (those UART device tree nodes
contain "fsl,imx6q-uart" compatible string).

Signed-off-by: Oleksandr Tyshchenko 
---
I used the "earlycon=ec_imx6q,0x30890000" cmd arg and
selected CONFIG_SERIAL_IMX_EARLYCON in Linux for enabling vUART.
---
---
  MAINTAINERS |   1 +
  xen/drivers/char/Kconfig|   7 +
  xen/drivers/char/Makefile   |   1 +
  xen/drivers/char/imx-uart.c | 299 
  4 files changed, 308 insertions(+)
  create mode 100644 xen/drivers/char/imx-uart.c

diff --git a/MAINTAINERS b/MAINTAINERS
index 1bd22fd75f..bd4084fd20 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -249,6 +249,7 @@ F:  xen/drivers/char/arm-uart.c
  F: xen/drivers/char/cadence-uart.c
  F: xen/drivers/char/exynos4210-uart.c
  F: xen/drivers/char/imx-lpuart.c
+F: xen/drivers/char/imx-uart.c
  F: xen/drivers/char/meson-uart.c
  F: xen/drivers/char/mvebu-uart.c
  F: xen/drivers/char/omap-uart.c
diff --git a/xen/drivers/char/Kconfig b/xen/drivers/char/Kconfig
index e18ec3788c..f51a1f596a 100644
--- a/xen/drivers/char/Kconfig
+++ b/xen/drivers/char/Kconfig
@@ -20,6 +20,13 @@ config HAS_IMX_LPUART
 help
   This selects the i.MX LPUART. If you have i.MX8QM based board, say Y.

+config HAS_IMX_UART
+   bool "i.MX UART driver"
+   default y
+   depends on ARM_64
+   help
+ This selects the i.MX UART. If you have i.MX8M* based board, say Y.
+
  config HAS_MVEBU
 bool "Marvell MVEBU UART driver"
 default y
diff --git a/xen/drivers/char/Makefile b/xen/drivers/char/Makefile
index e7e374775d..147530a1ed 100644
--- a/xen/drivers/char/Makefile
+++ b/xen/drivers/char/Makefile
@@ -10,6 +10,7 @@ obj-$(CONFIG_HAS_SCIF) += scif-uart.o
  obj-$(CONFIG_HAS_EHCI) += ehci-dbgp.o
  obj-$(CONFIG_XHCI) += xhci-dbc.o
  obj-$(CONFIG_HAS_IMX_LPUART) += imx-lpuart.o
+obj-$(CONFIG_HAS_IMX_UART) += imx-uart.o
  obj-$(CONFIG_ARM) += arm-uart.o
  obj-y += serial.o
  obj-$(CONFIG_XEN_GUEST) += xen_pv_console.o
diff --git a/xen/drivers/char/imx-uart.c b/xen/drivers/char/imx-uart.c
new file mode 100644
index 0000000000..13bb189063
--- /dev/null
+++ b/xen/drivers/char/imx-uart.c
@@ -0,0 +1,299 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */

Can it be GPL-2.0-only? Some companies are worried about v3.


I guess, it can, will change





+/*
+ * xen/drivers/char/imx-uart.c
+ *
+ * Driver for i.MX UART.
+ *
+ * Based on Linux's drivers/tty/serial/imx.c
+ *
+ * Copyright (C) 2024 EPAM Systems Inc.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define imx_uart_read(uart, off)  readl((uart)->regs + (off))
+#define imx_uart_write(uart, off, val)writel((val), (uart)->regs + (off))
+
+static struct imx_uart {
+uint32_t baud, clock_hz, data_bits, parity, stop_bits, fifo_size;

What's the use of these variables? AFAICT they are set but never used.


right, this is a copy paste from other UART driver, will drop




+uint32_t irq;

unsigned int


+char __iomem *regs;
+struct irqaction irqaction;
+struct vuart_info vuart;
+} imx_com;
+
+static void imx_uart_interrupt(int irq, void *data)
+{
+struct serial_port *port = data;
+struct imx_uart *uart = port->uart;
+uint32_t usr1, usr2;
+
+usr1 = imx_uart_read(uart, USR1);
+usr2 = imx_uart_read(uart, USR2);
+
+if ( usr1 & (USR1_RRDY | USR1_AGTIM) )
+{
+imx_uart_write(uart, USR1, USR1_AGTIM);
+serial_rx_interrupt(port);
+}
+
+if ( (usr1 & USR1_TRDY) || (usr2 & USR2_TXDC) )
+serial_tx_interrupt(port);
+}
+
+static void imx_uart_clear_rx_errors(struct serial_port *port)
+{
+struct imx_uart *uart = port->uart;
+uint32_t usr1, usr2;
+
+usr1 = imx_uart_read(uart, USR1);
+usr2 = imx_uart_read(uart, USR2);
+
+if ( usr2 & USR2_BRCD )
+imx_uart_write(uart, USR2, USR2_BRCD);
+else if ( usr1 & USR1_FRAMERR )
+imx_uart_write(uart, USR1, USR1_FRAMERR);
+else if ( usr1 & USR1_PARITYERR )
+imx_uart_write(uart, USR1, USR1_PARITYERR);
+
+if ( usr2 & USR2_ORE )
+imx_uart_write(uart, USR2, USR2_ORE);
+}
+
+static void __init imx_uart_init_preirq(struct serial_port *port)
+{
+struct imx_uart *uart = port->uart;
+uint32_t reg;
+
+/*
+ * Wait for the transmission to complete. This is needed for a smooth
+ * transition when we come from early printk.
+ */
+wh

Re: [PATCH 1/2] xen/arm: Add i.MX UART early printk support

2024-04-18 Thread Oleksandr Tyshchenko





On 03.04.24 13:11, Michal Orzel wrote:

Hi Oleksandr,


Hello Michal

sorry for the late response



On 02/04/2024 14:05, Oleksandr Tyshchenko wrote:



From: Oleksandr Tyshchenko 

Tested on i.MX 8M Mini only, but I guess, it should be
suitable for other i.MX8M* SoCs (those UART device tree nodes
contain "fsl,imx6q-uart" compatible string).

Please use imperative mood in commit msg.


ok



I would mention also that you are adding
macros that will be used by the runtime driver.



will do





Signed-off-by: Oleksandr Tyshchenko 
---
I selected the following configs for enabling early printk:

  CONFIG_EARLY_UART_CHOICE_IMX_UART=y
  CONFIG_EARLY_UART_IMX_UART=y
  CONFIG_EARLY_PRINTK=y
  CONFIG_EARLY_UART_BASE_ADDRESS=0x30890000
  CONFIG_EARLY_PRINTK_INC="debug-imx-uart.inc"
---
---
  xen/arch/arm/Kconfig.debug| 14 +
  xen/arch/arm/arm64/debug-imx-uart.inc | 38 ++
  xen/arch/arm/include/asm/imx-uart.h   | 76 +++
  3 files changed, 128 insertions(+)
  create mode 100644 xen/arch/arm/arm64/debug-imx-uart.inc
  create mode 100644 xen/arch/arm/include/asm/imx-uart.h

diff --git a/xen/arch/arm/Kconfig.debug b/xen/arch/arm/Kconfig.debug
index eec860e88e..a15d08f214 100644
--- a/xen/arch/arm/Kconfig.debug
+++ b/xen/arch/arm/Kconfig.debug
@@ -68,6 +68,16 @@ choice
 provide the parameters for the i.MX LPUART rather than
 selecting one of the platform specific options below if
 you know the parameters for the port.
+   config EARLY_UART_CHOICE_IMX_UART
+   select EARLY_UART_IMX_UART
+   depends on ARM_64
+   bool "Early printk via i.MX UART"
+   help
+   Say Y here if you wish the early printk to direct their

Do not take the surrounding code as an example. The help text should be
indented by 2 tabs and 2 spaces here.


ok




+   output to a i.MX UART. You can use this option to
+   provide the parameters for the i.MX UART rather than
+   selecting one of the platform specific options below if
+   you know the parameters for the port.
 config EARLY_UART_CHOICE_MESON
 select EARLY_UART_MESON
 depends on ARM_64
@@ -199,6 +209,9 @@ config EARLY_UART_EXYNOS4210
  config EARLY_UART_IMX_LPUART
 select EARLY_PRINTK
 bool
+config EARLY_UART_IMX_UART
+   select EARLY_PRINTK
+   bool
  config EARLY_UART_MESON
 select EARLY_PRINTK
 bool
@@ -304,6 +317,7 @@ config EARLY_PRINTK_INC
 default "debug-cadence.inc" if EARLY_UART_CADENCE
 default "debug-exynos4210.inc" if EARLY_UART_EXYNOS4210
 default "debug-imx-lpuart.inc" if EARLY_UART_IMX_LPUART
+   default "debug-imx-uart.inc" if EARLY_UART_IMX_UART
 default "debug-meson.inc" if EARLY_UART_MESON
 default "debug-mvebu.inc" if EARLY_UART_MVEBU
 default "debug-pl011.inc" if EARLY_UART_PL011
diff --git a/xen/arch/arm/arm64/debug-imx-uart.inc b/xen/arch/arm/arm64/debug-imx-uart.inc
new file mode 100644
index 0000000000..27a68b1ed5
--- /dev/null
+++ b/xen/arch/arm/arm64/debug-imx-uart.inc
@@ -0,0 +1,38 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/*
+ * xen/arch/arm/arm64/debug-imx-uart.inc
+ *
+ * i.MX8M* specific debug code
+ *
+ * Copyright (C) 2024 EPAM Systems Inc.
+ */
+
+#include 
+
+/*
+ * Wait for the UART to be ready to transmit
+ * xb: register which contains the UART base address
+ * c: scratch register number
+ */
+.macro early_uart_ready xb, c
+1:
+ldr   w\c, [\xb, #IMX21_UTS] /* <- Test register */
+tst   w\c, #UTS_TXFULL   /* Check TxFIFO FULL bit */
+bne   1b /* Wait for the UART to be ready */
+.endm
+
+/*
+ * UART transmit character
+ * xb: register which contains the UART base address
+ * wt: register which contains the character to transmit
+ */
+.macro early_uart_transmit xb, wt
+str   \wt, [\xb, #URTX0] /* -> Transmitter Register */
+.endm
+
+/*
+ * Local variables:
+ * mode: ASM
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/arch/arm/include/asm/imx-uart.h b/xen/arch/arm/include/asm/imx-uart.h
new file mode 100644
index 0000000000..413a81dd44
--- /dev/null
+++ b/xen/arch/arm/include/asm/imx-uart.h
@@ -0,0 +1,76 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/*
+ * xen/arch/arm/include/asm/imx-uart.h
+ *
+ * Common constant definition between early printk and the UART driver
+ *
+ * Copyright (C) 2024 EPAM Systems Inc.
+ */
+
+#ifndef __ASM_ARM_IMX_UART_H__
+#define __ASM_ARM_IMX_UART_H__
+
+/* 32-bit register definition */
+#define URXD0(0x00) /* Receiver Register */

There is no need to surround these values with parentheses.


ok




+#define URTX0(0x40) /* Transmitter Register */
+#define U

[ImageBuilder 5/5] uboot-script-gen: Add ability to specify "nr_spis"

2024-04-17 Thread Oleksandr Tyshchenko

From: Oleksandr Tyshchenko 

This makes it possible to assign a specified number of shared peripheral
interrupts (SPIs) to a domain.

Signed-off-by: Oleksandr Tyshchenko 
Signed-off-by: Stefano Stabellini 
---
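The minimum-value rule described in the README hunk of this patch can be
sketched as a small check. This is only an illustration under stated
assumptions: `nr_spis_is_valid` is a hypothetical helper, it is not part of
uboot-script-gen, and the script in this patch does not validate the value
itself.

```shell
# Hypothetical check, not part of uboot-script-gen.
# $1: requested DOMU_NR_SPIS value
# $2: DOMU_VPL011 value (empty means "not set", i.e. vpl011 enabled)
nr_spis_is_valid() {
    min=1
    # The minimum drops to 0 only when the vPL011 is explicitly disabled.
    if [ "$2" = "0" ]; then
        min=0
    fi
    [ "$1" -ge "$min" ]
}

nr_spis_is_valid 0 0 && echo "0 SPIs, vpl011 disabled: accepted"
nr_spis_is_valid 0 "" || echo "0 SPIs, vpl011 left at default: rejected"
nr_spis_is_valid 32 1 && echo "32 SPIs, vpl011 enabled: accepted"
```

In a config file the corresponding settings would be written with the array
syntax used by the other options, for example DOMU_NR_SPIS[0]=32.
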
 README.md| 5 +
 scripts/uboot-script-gen | 4 
 2 files changed, 9 insertions(+)

diff --git a/README.md b/README.md
index 63c4708..7683492 100644
--- a/README.md
+++ b/README.md
@@ -237,6 +237,11 @@ Where:
   PL011 UART for domain. The default is 1. If explicitly set to 0, then
   "console=ttyAMA0" is not used as a default DOMU_CMD[number].
 
+- DOMU_NR_SPIS[number] is optional. It specifies a number of shared peripheral
+  interrupts (SPIs) to be assigned to domain (depending on the underlying
+  hardware platform). The minimum possible value is 0, if DOMU_VPL011[number]
+  is also explicitly set to 0. Otherwise the minimum value is 1.
+
 - DOMU_CPUPOOL[number] specifies the id of the cpupool (created using
   CPUPOOL[number] option, where number == id) that will be assigned to domU.
 
diff --git a/scripts/uboot-script-gen b/scripts/uboot-script-gen
index fd37e18..50b6a59 100755
--- a/scripts/uboot-script-gen
+++ b/scripts/uboot-script-gen
@@ -348,6 +348,10 @@ function xen_device_tree_editing()
 then
 dt_set "/chosen/domU$i" "vpl011" "hex" "0x1"
 fi
+if test -n "${DOMU_NR_SPIS[$i]}"
+then
+dt_set "/chosen/domU$i" "nr_spis" "int" "${DOMU_NR_SPIS[$i]}"
+fi
 if [[ "${DOMU_ENHANCED[$i]}" == 1 || ("$DOM0_KERNEL" && "${DOMU_ENHANCED[$i]}" != 0) ]]
 then
 dt_set "/chosen/domU$i" "xen,enhanced" "str" "enabled"
-- 
2.34.1

[ImageBuilder 1/5] uboot-script-gen: Update to deal with uImage which is not executable

2024-04-17 Thread Oleksandr Tyshchenko

From: Oleksandr Tyshchenko 

uImage is an Image with a U-Boot wrapper. The output of "file" for it does
not contain the "executable" string that the subsequent check looks for.

The output below shows this:

otyshchenko@EPUAKYIW03DD:~/work/xen_tests/input$ file -L binaries/uImage.gz
binaries/uImage.gz: u-boot legacy uImage, Linux Kernel Image, Linux/ARM 64-bit,
OS Kernel Image (gzip), 9822180 bytes, Fri Sep 29 15:39:42 2023, Load Address: 0X4000,
Entry Point: 0X4000, Header CRC: 0XE1EF21BF, Data CRC: 0XC418025

otyshchenko@EPUAKYIW03DD:~/work/xen_tests/input$ file -L binaries/uImage
binaries/uImage: u-boot legacy uImage, Linux Kernel Image, Linux/ARM 64-bit,
OS Kernel Image (Not compressed), 23269888 bytes, Fri Sep 29 15:40:19 2023,
Load Address: 0X4000, Entry Point: 0X4000, Header CRC: 0XA0B7D051,
Data CRC: 0X42083F51

Suggested-by: Stefano Stabellini 
Signed-off-by: Oleksandr Tyshchenko 
---
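A minimal sketch of the matching change made in check_file_type(), assuming
the expected type arrives as a plain string. `normalize_type` is a
hypothetical helper used only for illustration. It uses a POSIX `case`
pattern, which performs the same prefix match as the
`[[ "$type" = "executable"* ]]` test in the patch.

```shell
# Hypothetical helper mirroring the patched branch of check_file_type();
# it is an illustration only, not a function in uboot-script-gen.
normalize_type() {
    type="$1"
    case "$type" in
        # Prefix match: accepts "executable" and "executable\|uImage" alike.
        executable*)
            type="$type\|data\|ARM OpenFirmware"
            ;;
    esac
    printf '%s\n' "$type"
}

old_style=$(normalize_type "executable")
new_style=$(normalize_type "executable\|uImage")
printf '%s\n' "$old_style" "$new_style"
```

With the previous exact comparison, the second call would have left the
string unchanged, so the "data" and "ARM OpenFirmware" alternatives would
have been lost for kernels checked with "executable\|uImage".
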
 scripts/uboot-script-gen | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/scripts/uboot-script-gen b/scripts/uboot-script-gen
index 3cc6b47..7cb8c6d 100755
--- a/scripts/uboot-script-gen
+++ b/scripts/uboot-script-gen
@@ -505,9 +505,9 @@ function check_file_type()
 
 # if file doesn't know what it is, it outputs data, so include that
 # since some executables aren't recongnized
-if [ "$type" = "executable" ]
+if [[ "$type" = "executable"* ]]
 then
-type="executable\|data\|ARM OpenFirmware"
+type="$type\|data\|ARM OpenFirmware"
 # file in older distros (ex: RHEL 7.4) just output data for device
 # tree blobs
 elif [ "$type" = "Device Tree Blob" ]
@@ -712,7 +712,7 @@ xen_file_loading()
 {
 if test "$DOM0_KERNEL"
 then
-check_compressed_file_type $DOM0_KERNEL "executable"
+check_compressed_file_type $DOM0_KERNEL "executable\|uImage"
 dom0_kernel_addr=$memaddr
 load_file $DOM0_KERNEL "dom0_linux"
 dom0_kernel_size=$filesize
@@ -747,7 +747,7 @@ xen_file_loading()
 cleanup_and_return_err
 fi
 
-check_compressed_file_type ${DOMU_KERNEL[$i]} "executable"
+check_compressed_file_type ${DOMU_KERNEL[$i]} "executable\|uImage"
 domU_kernel_addr[$i]=$memaddr
 load_file ${DOMU_KERNEL[$i]} "domU${i}_kernel"
 domU_kernel_size[$i]=$filesize
-- 
2.34.1

[ImageBuilder 3/5] uboot-script-gen: Add ability to specify grant table params

2024-04-17 Thread Oleksandr Tyshchenko

From: Oleksandr Tyshchenko 

Use DOMU_GRANT_VER to set "max_grant_version" dt property.
Use DOMU_GRANT_FRAMES to set "max_grant_frames" dt property.
Use DOMU_MAPTRACK_FRAMES to set "max_maptrack_frames" dt property.

Signed-off-by: Oleksandr Tyshchenko 
---
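The mapping from the three options to device tree properties can be sketched
as follows. `emit_prop` is a hypothetical stand-in for dt_set that only
prints what would be written, and the scalar variables stand in for the
DOMU_GRANT_VER[0], DOMU_GRANT_FRAMES[0] and DOMU_MAPTRACK_FRAMES[0] entries
of a config file. The values are examples, not recommendations.

```shell
# Hypothetical stand-in for dt_set: prints the property instead of
# editing a device tree.
emit_prop() {
    # Skip options the user left unset, as the patch does with "test -n".
    if [ -n "$2" ]; then
        printf '%s = <%s>\n' "$1" "$2"
    fi
}

# Example values; an empty string models an option that is not set.
grant_ver=1
grant_frames=64
maptrack_frames=""

out=$(
    emit_prop max_grant_version "$grant_ver"
    emit_prop max_grant_frames "$grant_frames"
    emit_prop max_maptrack_frames "$maptrack_frames"
)
printf '%s\n' "$out"
```

Only the options that are set produce a property, so a domain without these
settings keeps whatever defaults Xen applies.
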
 README.md| 10 ++
 scripts/uboot-script-gen | 13 +
 2 files changed, 23 insertions(+)

diff --git a/README.md b/README.md
index 97db7aa..b2459fd 100644
--- a/README.md
+++ b/README.md
@@ -222,6 +222,16 @@ Where:
   kernels might break. If set to 2, "no-xenstore" is specified, see Xen
   documentation about dom0less "no-xenstore" option.
 
+- DOMU_GRANT_VER[number] is optional but specifies the maximum version
+  of grant table shared structure (the maximum security supported version
+  by Xen on Arm64 is 1)
+
+- DOMU_GRANT_FRAMES[number] is optional but specifies the maximum number
+  of grant table frames (the default value used by Xen on Arm64 is 64)
+
+- DOMU_MAPTRACK_FRAMES[number] is optional but specifies the maximum number
+  of grant maptrack frames (the default value used by Xen on Arm64 is 1024)
+
 - DOMU_CPUPOOL[number] specifies the id of the cpupool (created using
   CPUPOOL[number] option, where number == id) that will be assigned to domU.
 
diff --git a/scripts/uboot-script-gen b/scripts/uboot-script-gen
index 98a64d6..adec6f9 100755
--- a/scripts/uboot-script-gen
+++ b/scripts/uboot-script-gen
@@ -353,6 +353,19 @@ function xen_device_tree_editing()
 dt_set "/chosen/domU$i" "xen,enhanced" "str" "no-xenstore"
 fi
 
+if test -n "${DOMU_GRANT_VER[i]}"
+then
+dt_set "/chosen/domU$i" "max_grant_version" "int" "${DOMU_GRANT_VER[i]}"
+fi
+if test -n "${DOMU_GRANT_FRAMES[i]}"
+then
+dt_set "/chosen/domU$i" "max_grant_frames" "int" "${DOMU_GRANT_FRAMES[i]}"
+fi
+if test -n "${DOMU_MAPTRACK_FRAMES[i]}"
+then
+dt_set "/chosen/domU$i" "max_maptrack_frames" "int" "${DOMU_MAPTRACK_FRAMES[i]}"
+fi
+
 if test -n "${DOMU_SHARED_MEM[i]}"
 then
 add_device_tree_static_shared_mem "/chosen/domU${i}" "${DOMU_SHARED_MEM[i]}"
-- 
2.34.1

[ImageBuilder 4/5] uboot-script-gen: Add ability to unselect "vpl011"

2024-04-17 Thread Oleksandr Tyshchenko

From: Oleksandr Tyshchenko 

Introduce new option DOMU_VPL011[nr] that can be set to 0
or 1 (default).

Also align "console=ttyAMA0" Linux cmd arg setting with "vpl011" presence.

Suggested-by: Michal Orzel 
Signed-off-by: Oleksandr Tyshchenko 
---
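The two decisions changed by this patch can be sketched with scalar
arguments. `vpl011_enabled` and `default_cmd` are hypothetical helper names
used only for illustration. In the script itself the values come from
DOMU_VPL011[$i] and DOMU_CMD[$i].

```shell
# $1: DOMU_VPL011 value; empty means "not set", which defaults to enabled.
vpl011_enabled() {
    [ -z "$1" ] || [ "$1" -eq 1 ]
}

# $1: DOMU_CMD value (may be empty), $2: DOMU_VPL011 value (may be empty).
# Prints the command line the domain would end up with.
default_cmd() {
    if [ -z "$1" ] && vpl011_enabled "$2"; then
        echo "console=ttyAMA0"
    else
        echo "$1"
    fi
}

echo "unset cmd, unset vpl011 -> '$(default_cmd "" "")'"
echo "unset cmd, vpl011=0     -> '$(default_cmd "" 0)'"
echo "explicit cmd, vpl011=1  -> '$(default_cmd "root=/dev/ram0" 1)'"
```

An explicitly set DOMU_CMD is never overridden. Only the fallback to
"console=ttyAMA0" depends on the vpl011 setting.
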
 README.md| 7 ++-
 scripts/uboot-script-gen | 7 +--
 2 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/README.md b/README.md
index b2459fd..63c4708 100644
--- a/README.md
+++ b/README.md
@@ -151,7 +151,8 @@ Where:
 - DOMU_KERNEL[number] specifies the DomU kernel to use.
 
 - DOMU_CMD[number] specifies the command line arguments for DomU's Linux
-  kernel. If not set, then "console=ttyAMA0" is used.
+  kernel. If not set and DOMU_VPL011[number] is not set to 0, then
+  "console=ttyAMA0" is used.
 
 - DOMU_RAMDISK[number] specifies the DomU ramdisk to use.
 
@@ -232,6 +233,10 @@ Where:
 - DOMU_MAPTRACK_FRAMES[number] is optional but specifies the maximum number
   of grant maptrack frames (the default value used by Xen on Arm64 is 1024)
 
+- DOMU_VPL011[number] is optional but used to enable (1)/disable (0) a virtual
+  PL011 UART for domain. The default is 1. If explicitly set to 0, then
+  "console=ttyAMA0" is not used as a default DOMU_CMD[number].
+
 - DOMU_CPUPOOL[number] specifies the id of the cpupool (created using
   CPUPOOL[number] option, where number == id) that will be assigned to domU.
 
diff --git a/scripts/uboot-script-gen b/scripts/uboot-script-gen
index adec6f9..fd37e18 100755
--- a/scripts/uboot-script-gen
+++ b/scripts/uboot-script-gen
@@ -344,7 +344,10 @@ function xen_device_tree_editing()
 add_device_tree_static_mem "/chosen/domU$i" "${DOMU_STATIC_MEM[$i]}"
 dt_set "/chosen/domU$i" "direct-map" "bool" "${DOMU_DIRECT_MAP[$i]}"
 fi
-dt_set "/chosen/domU$i" "vpl011" "hex" "0x1"
+if test -z "${DOMU_VPL011[$i]}" || test "${DOMU_VPL011[$i]}" -eq "1"
+then
+dt_set "/chosen/domU$i" "vpl011" "hex" "0x1"
+fi
 if [[ "${DOMU_ENHANCED[$i]}" == 1 || ("$DOM0_KERNEL" && "${DOMU_ENHANCED[$i]}" != 0) ]]
 then
 dt_set "/chosen/domU$i" "xen,enhanced" "str" "enabled"
@@ -677,7 +680,7 @@ function xen_config()
 then
 DOMU_VCPUS[$i]=1
 fi
-if test -z "${DOMU_CMD[$i]}"
+if test -z "${DOMU_CMD[$i]}" && (test -z "${DOMU_VPL011[$i]}" || test "${DOMU_VPL011[$i]}" -eq "1")
 then
 DOMU_CMD[$i]="console=ttyAMA0"
 fi
-- 
2.34.1

[ImageBuilder 0/5] Misc updates for the dom0less support

2024-04-17 Thread Oleksandr Tyshchenko

From: Oleksandr Tyshchenko 

Hello all,

this is a collection of patches (#2-5) for improving the dom0less support
and a patch (#1) for dealing with uImage.  

Oleksandr Tyshchenko (5):
  uboot-script-gen: Update to deal with uImage which is not executable
  uboot-script-gen: Extend DOMU_ENHANCED to specify "no-xenstore"
  uboot-script-gen: Add ability to specify grant table params
  uboot-script-gen: Add ability to unselect "vpl011"
  uboot-script-gen: Add ability to specify "nr_spis"

 README.md| 29 +
 scripts/uboot-script-gen | 35 +--
 2 files changed, 54 insertions(+), 10 deletions(-)

-- 
2.34.1

[ImageBuilder 2/5] uboot-script-gen: Extend DOMU_ENHANCED to specify "no-xenstore"

2024-04-17 Thread Oleksandr Tyshchenko

From: Oleksandr Tyshchenko 

We need some Xen services to be available within a single dom0less DomU.
Just using "enabled" leads to a Xen panic because there is no Dom0:

(XEN) 
(XEN) Panic on CPU 0:
(XEN) At the moment, Xenstore support requires dom0 to be present
(XEN) 

Signed-off-by: Oleksandr Tyshchenko 
---
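The branch order in the patched condition can be sketched as a function of
the two inputs. `xen_enhanced_value` is a hypothetical helper used only for
illustration. It mirrors the if/elif in this patch with plain string
comparisons.

```shell
# $1: DOMU_ENHANCED value (may be empty)
# $2: DOM0_KERNEL path (empty means there is no Dom0)
# Prints the "xen,enhanced" string, or nothing if the property is not set.
xen_enhanced_value() {
    if [ "$1" = "1" ] || { [ -n "$2" ] && [ "$1" != "0" ]; }; then
        echo "enabled"
    elif [ "$1" = "2" ]; then
        echo "no-xenstore"
    fi
}

echo "enhanced=2, no Dom0   -> '$(xen_enhanced_value 2 "")'"
echo "enhanced unset, Dom0  -> '$(xen_enhanced_value "" Image)'"
echo "enhanced=0, Dom0      -> '$(xen_enhanced_value 0 Image)'"
echo "enhanced=2, with Dom0 -> '$(xen_enhanced_value 2 Image)'"
```

Because the first branch is tested first, a value of 2 selects
"no-xenstore" only when no Dom0 kernel is configured. With a Dom0 kernel
present it still results in "enabled".
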
 README.md| 7 ---
 scripts/uboot-script-gen | 3 +++
 2 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/README.md b/README.md
index 3b4b16f..97db7aa 100644
--- a/README.md
+++ b/README.md
@@ -217,9 +217,10 @@ Where:
   If set to 1, the VM is direct mapped. The default is 1.
   This is only applicable when DOMU_STATIC_MEM is specified.
 
-- DOMU_ENHANCED[number] can be set to 1 or 0, default is 1 when Dom0 is
-  present. If set to 1, the VM can use PV drivers. Older Linux kernels
-  might break.
+- DOMU_ENHANCED[number] can be set to 0, 1, or 2. Default is 1 when Dom0
+  is present. If set to 1, the VM can use PV drivers. Older Linux
+  kernels might break. If set to 2, "no-xenstore" is specified, see Xen
+  documentation about dom0less "no-xenstore" option.
 
 - DOMU_CPUPOOL[number] specifies the id of the cpupool (created using
   CPUPOOL[number] option, where number == id) that will be assigned to domU.
diff --git a/scripts/uboot-script-gen b/scripts/uboot-script-gen
index 7cb8c6d..98a64d6 100755
--- a/scripts/uboot-script-gen
+++ b/scripts/uboot-script-gen
@@ -348,6 +348,9 @@ function xen_device_tree_editing()
 if [[ "${DOMU_ENHANCED[$i]}" == 1 || ("$DOM0_KERNEL" && "${DOMU_ENHANCED[$i]}" != 0) ]]
 then
 dt_set "/chosen/domU$i" "xen,enhanced" "str" "enabled"
+elif [ "${DOMU_ENHANCED[$i]}" == 2 ]
+then
+dt_set "/chosen/domU$i" "xen,enhanced" "str" "no-xenstore"
 fi
 
 if test -n "${DOMU_SHARED_MEM[i]}"
-- 
2.34.1

Re: [amd-xen-safety] [PATCH V4] Add requirements for Device Passthrough

2024-04-08 Thread Oleksandr Tyshchenko



On 08.04.24 20:38, Hildebrand, Stewart via groups.io wrote:
> On 4/8/24 09:19, Oleksandr Tyshchenko wrote:
>> From: Oleksandr Tyshchenko 
>>
>> Please refer to chapter "Device Passthrough":
>> https://groups.io/g/amd-xen-safety/message/1300
>>
>> Create corresponding directory and README file.
>>
>> Signed-off-by: Oleksandr Tyshchenko 

Hello,

this was sent to @xen-devel by mistake, please ignore.
sorry for the inconvenience

[PATCH V4] Add requirements for Device Passthrough

2024-04-08 Thread Oleksandr Tyshchenko

From: Oleksandr Tyshchenko 

Please refer to chapter "Device Passthrough":
https://groups.io/g/amd-xen-safety/message/1300

Create corresponding directory and README file.

Signed-off-by: Oleksandr Tyshchenko 
---

  V2:
   - add R-b
   - update README
   - lower case for platform, s/simple/non-DMA-capable, other misc
 updates
   - add "Allowed for the safe direct mapped VMs only"
 to reqs for DMA-capable devices without IOMMU protection
   - add dom0less passthrough details where needed
   - add reqs for PCI devices discovering

  V3:
   - move common reqs "Assign PCI device to domain (without IOMMU)" and
 "Deassign PCI device from domain (without IOMMU)" to Arm64 only
   - clarify the DMA-capable device assignment w/o IOMMU,
 add more details
   - drop R-b

  V4:
   - add the following reqs:
 - Assign interrupt-less platform device to domain
 - Deassign interrupt-less platform device from domain
 - Map platform device MMIO region identity
 - Map platform device MMIO region non-identity
   - add more details
   - repeat the relevant info in all assign reqs
---
---
 .../physical_resources/README.rst |  16 +
 .../physical_resources/passthrough.rst| 477 ++
 2 files changed, 493 insertions(+)
 create mode 100644 domain_creation_and_runtime/physical_resources/README.rst
 create mode 100644 domain_creation_and_runtime/physical_resources/passthrough.rst

diff --git a/domain_creation_and_runtime/physical_resources/README.rst b/domain_creation_and_runtime/physical_resources/README.rst
new file mode 100644
index 0000000..0eb4dd4
--- /dev/null
+++ b/domain_creation_and_runtime/physical_resources/README.rst
@@ -0,0 +1,16 @@
+Physical resources
+==================
+
+This section lists the requirements related to physical resources directly
+accessible from the domain as well as physical resources entirely controlled
+by Xen and invisible to a domain. The latter group of resources, although being
+invisible to a domain, has an impact on it.
+
+Examples of domain physical resources:
+| 1. PCI device
+| 2. Platform device
+| 3. MMU stage 1
+
+Examples of Xen physical resources:
+| 1. IOMMU stage 2
+| 2. MMU stage 2
diff --git a/domain_creation_and_runtime/physical_resources/passthrough.rst b/domain_creation_and_runtime/physical_resources/passthrough.rst
new file mode 100644
index 0000000..f619730
--- /dev/null
+++ b/domain_creation_and_runtime/physical_resources/passthrough.rst
@@ -0,0 +1,477 @@
+Device Passthrough
+==================
+
+The following are the requirements related to a physical device
+assignment [1], [2] to Arm64 and AMD64 PVH domains.
+
+Requirements for both Arm64 and AMD64 PVH
+=========================================
+
+Hide IOMMU from a domain
+------------------------
+
+`XenSSR~hide_iommu_from_domain~1`
+
+Description:
+Xen should not expose the IOMMU device to the domain even if I/O virtualization
+is disabled. The IOMMU should be under hypervisor control only.
+
+Rationale:
+
+Covers:
+ - `XenPRQ~device_passthrough~1`
+
+Needs:
+ - XenValTestCase
+
+Discover PCI devices from hardware domain
+-----------------------------------------
+
+`XenSSR~discover_pci_devices_from_hwdom~1`
+
+Description:
+The hardware domain shall be able to enumerate and discover PCI devices and
+inform Xen about their appearance and disappearance
+
+Rationale:
+
+Covers:
+ - `XenPRQ~device_passthrough~1`
+
+Needs:
+ - XenValTestCase
+
+Discover PCI devices from Xen
+-----------------------------
+
+`XenSSR~discover_pci_devices_from_xen~1`
+
+Description:
+Xen shall be able to discover PCI devices (enumerated by the firmware
+beforehand) during boot if the hardware domain is not meant to be used
+
+Rationale:
+
+Covers:
+ - `XenPRQ~device_passthrough~1`
+
+Needs:
+ - XenValTestCase
+
+Assign PCI device to domain (with IOMMU)
+----------------------------------------
+
+`XenSSR~assign_pci_device_with_iommu~1`
+
+Description:
+Xen shall be able to assign a specified PCI device (always implied as
+DMA-capable) to a domain during its creation using passthrough (partial)
+device tree. The physical device to be assigned is protected by the IOMMU.
+
+Rationale:
+
+ - The passthrough device tree is specified using a device tree module node
+   with compatible ("multiboot,device-tree") in the host device tree
+ - The PCI device to be passed through is specified using device tree property
+   ("xen,pci-assigned") in the "passthrough" node described in the passthrough
+   device tree
+
+Covers:
+ - `XenPRQ~device_passthrough~1`
+
+Needs:
+ - XenValTestCase
+
+Deassign PCI device from domain (with IOMMU)
+--------------------------------------------
+
+`XenSSR~deassign_pci_device_with_iommu~1`
+
+Description:
+Xen shall be able to deassign a specified PCI device from a domain during its
+destruction. The physical device to be deassigned is protected by the IOMM

[PATCH 2/2] xen/arm: Add i.MX UART driver

2024-04-02 Thread Oleksandr Tyshchenko

From: Oleksandr Tyshchenko 

The i.MX UART Documentation:
https://www.nxp.com/webapp/Download?colCode=IMX8MMRM
Chapter 16.2 Universal Asynchronous Receiver/Transmitter (UART)

Tested on i.MX 8M Mini only, but I guess, it should be
suitable for other i.MX8M* SoCs (those UART device tree nodes
contain "fsl,imx6q-uart" compatible string).

Signed-off-by: Oleksandr Tyshchenko 
---
I used the "earlycon=ec_imx6q,0x30890000" cmd arg and
selected CONFIG_SERIAL_IMX_EARLYCON in Linux for enabling vUART.
---
---
 MAINTAINERS |   1 +
 xen/drivers/char/Kconfig|   7 +
 xen/drivers/char/Makefile   |   1 +
 xen/drivers/char/imx-uart.c | 299 
 4 files changed, 308 insertions(+)
 create mode 100644 xen/drivers/char/imx-uart.c

diff --git a/MAINTAINERS b/MAINTAINERS
index 1bd22fd75f..bd4084fd20 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -249,6 +249,7 @@ F:  xen/drivers/char/arm-uart.c
 F: xen/drivers/char/cadence-uart.c
 F: xen/drivers/char/exynos4210-uart.c
 F: xen/drivers/char/imx-lpuart.c
+F: xen/drivers/char/imx-uart.c
 F: xen/drivers/char/meson-uart.c
 F: xen/drivers/char/mvebu-uart.c
 F: xen/drivers/char/omap-uart.c
diff --git a/xen/drivers/char/Kconfig b/xen/drivers/char/Kconfig
index e18ec3788c..f51a1f596a 100644
--- a/xen/drivers/char/Kconfig
+++ b/xen/drivers/char/Kconfig
@@ -20,6 +20,13 @@ config HAS_IMX_LPUART
help
  This selects the i.MX LPUART. If you have i.MX8QM based board, say Y.
 
+config HAS_IMX_UART
+   bool "i.MX UART driver"
+   default y
+   depends on ARM_64
+   help
+ This selects the i.MX UART. If you have i.MX8M* based board, say Y.
+
 config HAS_MVEBU
bool "Marvell MVEBU UART driver"
default y
diff --git a/xen/drivers/char/Makefile b/xen/drivers/char/Makefile
index e7e374775d..147530a1ed 100644
--- a/xen/drivers/char/Makefile
+++ b/xen/drivers/char/Makefile
@@ -10,6 +10,7 @@ obj-$(CONFIG_HAS_SCIF) += scif-uart.o
 obj-$(CONFIG_HAS_EHCI) += ehci-dbgp.o
 obj-$(CONFIG_XHCI) += xhci-dbc.o
 obj-$(CONFIG_HAS_IMX_LPUART) += imx-lpuart.o
+obj-$(CONFIG_HAS_IMX_UART) += imx-uart.o
 obj-$(CONFIG_ARM) += arm-uart.o
 obj-y += serial.o
 obj-$(CONFIG_XEN_GUEST) += xen_pv_console.o
diff --git a/xen/drivers/char/imx-uart.c b/xen/drivers/char/imx-uart.c
new file mode 100644
index 00..13bb189063
--- /dev/null
+++ b/xen/drivers/char/imx-uart.c
@@ -0,0 +1,299 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/*
+ * xen/drivers/char/imx-uart.c
+ *
+ * Driver for i.MX UART.
+ *
+ * Based on Linux's drivers/tty/serial/imx.c
+ *
+ * Copyright (C) 2024 EPAM Systems Inc.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define imx_uart_read(uart, off)  readl((uart)->regs + (off))
+#define imx_uart_write(uart, off, val)  writel((val), (uart)->regs + (off))
+
+static struct imx_uart {
+uint32_t baud, clock_hz, data_bits, parity, stop_bits, fifo_size;
+uint32_t irq;
+char __iomem *regs;
+struct irqaction irqaction;
+struct vuart_info vuart;
+} imx_com;
+
+static void imx_uart_interrupt(int irq, void *data)
+{
+struct serial_port *port = data;
+struct imx_uart *uart = port->uart;
+uint32_t usr1, usr2;
+
+usr1 = imx_uart_read(uart, USR1);
+usr2 = imx_uart_read(uart, USR2);
+
+if ( usr1 & (USR1_RRDY | USR1_AGTIM) )
+{
+imx_uart_write(uart, USR1, USR1_AGTIM);
+serial_rx_interrupt(port);
+}
+
+if ( (usr1 & USR1_TRDY) || (usr2 & USR2_TXDC) )
+serial_tx_interrupt(port);
+}
+
+static void imx_uart_clear_rx_errors(struct serial_port *port)
+{
+struct imx_uart *uart = port->uart;
+uint32_t usr1, usr2;
+
+usr1 = imx_uart_read(uart, USR1);
+usr2 = imx_uart_read(uart, USR2);
+
+if ( usr2 & USR2_BRCD )
+imx_uart_write(uart, USR2, USR2_BRCD);
+else if ( usr1 & USR1_FRAMERR )
+imx_uart_write(uart, USR1, USR1_FRAMERR);
+else if ( usr1 & USR1_PARITYERR )
+imx_uart_write(uart, USR1, USR1_PARITYERR);
+
+if ( usr2 & USR2_ORE )
+imx_uart_write(uart, USR2, USR2_ORE);
+}
+
+static void __init imx_uart_init_preirq(struct serial_port *port)
+{
+struct imx_uart *uart = port->uart;
+uint32_t reg;
+
+/*
+ * Wait for the transmission to complete. This is needed for a smooth
+ * transition when we come from early printk.
+ */
+while ( !(imx_uart_read(uart, USR2) & USR2_TXDC) )
+cpu_relax();
+
+/* Set receiver/transmitter trigger level */
+reg = imx_uart_read(uart, UFCR);
+reg &= (UFCR_RFDIV | UFCR_DCEDTE);
+reg |= TXTL_DEFAULT << UFCR_TXTL_SHF | RXTL_DEFAULT;
+imx_uart_write(uart, UFCR, reg);
+
+/* Enable UART and disable interrupts/DMA */
+reg = imx_uart_read(uart, UCR1);
+reg |= UCR1_UARTEN;
+reg &= ~(UCR1_TRD

[PATCH 0/2] Add UART support for i.MX 8M Mini EVKB

2024-04-02 Thread Oleksandr Tyshchenko

From: Oleksandr Tyshchenko 

Hello all.

This small series contains early printk and full UART support
for i.MX 8M Mini EVKB [1].

Tested only on the i.MX 8M Mini, the one SoC I had access to, but as far as I
understand, this UART support should also be suitable for other i.MX8M* SoCs
(those whose UART device tree nodes contain the "fsl,imx6q-uart" compatible
string).

[1] 
https://www.nxp.com/document/guide/getting-started-with-the-i-mx-8m-mini-evkb:GS-iMX-8M-Mini-EVK

Oleksandr Tyshchenko (2):
  xen/arm: Add i.MX UART early printk support
  xen/arm: Add i.MX UART driver

 MAINTAINERS   |   1 +
 xen/arch/arm/Kconfig.debug|  14 ++
 xen/arch/arm/arm64/debug-imx-uart.inc |  38 
 xen/arch/arm/include/asm/imx-uart.h   |  76 +++
 xen/drivers/char/Kconfig  |   7 +
 xen/drivers/char/Makefile |   1 +
 xen/drivers/char/imx-uart.c   | 299 ++
 7 files changed, 436 insertions(+)
 create mode 100644 xen/arch/arm/arm64/debug-imx-uart.inc
 create mode 100644 xen/arch/arm/include/asm/imx-uart.h
 create mode 100644 xen/drivers/char/imx-uart.c

-- 
2.34.1

[PATCH 1/2] xen/arm: Add i.MX UART early printk support

2024-04-02 Thread Oleksandr Tyshchenko

From: Oleksandr Tyshchenko 

Tested on i.MX 8M Mini only, but I guess, it should be
suitable for other i.MX8M* SoCs (those UART device tree nodes
contain "fsl,imx6q-uart" compatible string).

Signed-off-by: Oleksandr Tyshchenko 
---
I selected the following configs for enabling early printk:

 CONFIG_EARLY_UART_CHOICE_IMX_UART=y
 CONFIG_EARLY_UART_IMX_UART=y
 CONFIG_EARLY_PRINTK=y
 CONFIG_EARLY_UART_BASE_ADDRESS=0x3089
 CONFIG_EARLY_PRINTK_INC="debug-imx-uart.inc"
---
---
 xen/arch/arm/Kconfig.debug| 14 +
 xen/arch/arm/arm64/debug-imx-uart.inc | 38 ++
 xen/arch/arm/include/asm/imx-uart.h   | 76 +++
 3 files changed, 128 insertions(+)
 create mode 100644 xen/arch/arm/arm64/debug-imx-uart.inc
 create mode 100644 xen/arch/arm/include/asm/imx-uart.h

diff --git a/xen/arch/arm/Kconfig.debug b/xen/arch/arm/Kconfig.debug
index eec860e88e..a15d08f214 100644
--- a/xen/arch/arm/Kconfig.debug
+++ b/xen/arch/arm/Kconfig.debug
@@ -68,6 +68,16 @@ choice
provide the parameters for the i.MX LPUART rather than
selecting one of the platform specific options below if
you know the parameters for the port.
+   config EARLY_UART_CHOICE_IMX_UART
+   select EARLY_UART_IMX_UART
+   depends on ARM_64
+   bool "Early printk via i.MX UART"
+   help
+   Say Y here if you wish the early printk to direct their
+   output to a i.MX UART. You can use this option to
+   provide the parameters for the i.MX UART rather than
+   selecting one of the platform specific options below if
+   you know the parameters for the port.
config EARLY_UART_CHOICE_MESON
select EARLY_UART_MESON
depends on ARM_64
@@ -199,6 +209,9 @@ config EARLY_UART_EXYNOS4210
 config EARLY_UART_IMX_LPUART
select EARLY_PRINTK
bool
+config EARLY_UART_IMX_UART
+   select EARLY_PRINTK
+   bool
 config EARLY_UART_MESON
select EARLY_PRINTK
bool
@@ -304,6 +317,7 @@ config EARLY_PRINTK_INC
default "debug-cadence.inc" if EARLY_UART_CADENCE
default "debug-exynos4210.inc" if EARLY_UART_EXYNOS4210
default "debug-imx-lpuart.inc" if EARLY_UART_IMX_LPUART
+   default "debug-imx-uart.inc" if EARLY_UART_IMX_UART
default "debug-meson.inc" if EARLY_UART_MESON
default "debug-mvebu.inc" if EARLY_UART_MVEBU
default "debug-pl011.inc" if EARLY_UART_PL011
diff --git a/xen/arch/arm/arm64/debug-imx-uart.inc 
b/xen/arch/arm/arm64/debug-imx-uart.inc
new file mode 100644
index 00..27a68b1ed5
--- /dev/null
+++ b/xen/arch/arm/arm64/debug-imx-uart.inc
@@ -0,0 +1,38 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/*
+ * xen/arch/arm/arm64/debug-imx-uart.inc
+ *
+ * i.MX8M* specific debug code
+ *
+ * Copyright (C) 2024 EPAM Systems Inc.
+ */
+
+#include 
+
+/*
+ * Wait for the UART to be ready to transmit
+ * xb: register which contains the UART base address
+ * c: scratch register number
+ */
+.macro early_uart_ready xb, c
+1:
+ldr   w\c, [\xb, #IMX21_UTS] /* <- Test register */
+tst   w\c, #UTS_TXFULL   /* Check TxFIFO FULL bit */
+bne   1b /* Wait for the UART to be ready */
+.endm
+
+/*
+ * UART transmit character
+ * xb: register which contains the UART base address
+ * wt: register which contains the character to transmit
+ */
+.macro early_uart_transmit xb, wt
+str   \wt, [\xb, #URTX0] /* -> Transmitter Register */
+.endm
+
+/*
+ * Local variables:
+ * mode: ASM
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/arch/arm/include/asm/imx-uart.h 
b/xen/arch/arm/include/asm/imx-uart.h
new file mode 100644
index 00..413a81dd44
--- /dev/null
+++ b/xen/arch/arm/include/asm/imx-uart.h
@@ -0,0 +1,76 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/*
+ * xen/arch/arm/include/asm/imx-uart.h
+ *
+ * Common constant definition between early printk and the UART driver
+ *
+ * Copyright (C) 2024 EPAM Systems Inc.
+ */
+
+#ifndef __ASM_ARM_IMX_UART_H__
+#define __ASM_ARM_IMX_UART_H__
+
+/* 32-bit register definition */
+#define URXD0        (0x00) /* Receiver Register */
+#define URTX0        (0x40) /* Transmitter Register */
+#define UCR1         (0x80) /* Control Register 1 */
+#define UCR2         (0x84) /* Control Register 2 */
+#define UCR3         (0x88) /* Control Register 3 */
+#define UCR4         (0x8c) /* Control Register 4 */
+#define UFCR         (0x90) /* FIFO Control Register */
+#define USR1         (0x94) /* Status Register 1 */
+#define USR2         (0x98) /* Status Register 2 */
+#define IMX21_UTS    (0xb4) /* Test Register */
+
+#define URXD_ERR     BIT(14, UL) /* Error detect */
+#define URXD_RX_DATAGEN

Re: [PATCH 2/2] xen/events: increment refcnt only if event channel is refcounted

2024-03-17 Thread Oleksandr Tyshchenko



On 13.03.24 09:14, Juergen Gross wrote:


Hello Juergen

> In bind_evtchn_to_irq_chip() don't increment the refcnt of the event
> channel blindly. In case the event channel is NOT refcounted, issue a
> warning instead.
> 
> Add an additional safety net by doing the refcnt increment only if the
> caller has specified IRQF_SHARED in the irqflags parameter.
> 
> Fixes: 9e90e58c11b7 ("xen: evtchn: Allow shared registration of IRQ handers")
> Signed-off-by: Juergen Gross 


Reviewed-by: Oleksandr Tyshchenko 


> ---
>   drivers/xen/events/events_base.c | 22 +-
>   1 file changed, 13 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/xen/events/events_base.c 
> b/drivers/xen/events/events_base.c
> index 2faa4bf78c7a..81effbd53dc5 100644
> --- a/drivers/xen/events/events_base.c
> +++ b/drivers/xen/events/events_base.c
> @@ -1190,7 +1190,7 @@ int xen_pirq_from_irq(unsigned irq)
>   EXPORT_SYMBOL_GPL(xen_pirq_from_irq);
>   
>   static int bind_evtchn_to_irq_chip(evtchn_port_t evtchn, struct irq_chip 
> *chip,
> -struct xenbus_device *dev)
> +struct xenbus_device *dev, bool shared)
>   {
>   int ret = -ENOMEM;
>   struct irq_info *info;
> @@ -1224,7 +1224,8 @@ static int bind_evtchn_to_irq_chip(evtchn_port_t 
> evtchn, struct irq_chip *chip,
>*/
>   bind_evtchn_to_cpu(info, 0, false);
>   } else if (!WARN_ON(info->type != IRQT_EVTCHN)) {
> - info->refcnt++;
> + if (shared && !WARN_ON(info->refcnt < 0))
> + info->refcnt++;
>   }
>   
>   ret = info->irq;
> @@ -1237,13 +1238,13 @@ static int bind_evtchn_to_irq_chip(evtchn_port_t 
> evtchn, struct irq_chip *chip,
>   
>   int bind_evtchn_to_irq(evtchn_port_t evtchn)
>   {
> - return bind_evtchn_to_irq_chip(evtchn, &xen_dynamic_chip, NULL);
> + return bind_evtchn_to_irq_chip(evtchn, &xen_dynamic_chip, NULL, false);
>   }
>   EXPORT_SYMBOL_GPL(bind_evtchn_to_irq);
>   
>   int bind_evtchn_to_irq_lateeoi(evtchn_port_t evtchn)
>   {
> - return bind_evtchn_to_irq_chip(evtchn, &xen_lateeoi_chip, NULL);
> + return bind_evtchn_to_irq_chip(evtchn, &xen_lateeoi_chip, NULL, false);
>   }
>   EXPORT_SYMBOL_GPL(bind_evtchn_to_irq_lateeoi);
>   
> @@ -1295,7 +1296,8 @@ static int bind_ipi_to_irq(unsigned int ipi, unsigned 
> int cpu)
>   
>   static int bind_interdomain_evtchn_to_irq_chip(struct xenbus_device *dev,
>  evtchn_port_t remote_port,
> -struct irq_chip *chip)
> +struct irq_chip *chip,
> +bool shared)
>   {
>   struct evtchn_bind_interdomain bind_interdomain;
>   int err;
> @@ -1307,14 +1309,14 @@ static int bind_interdomain_evtchn_to_irq_chip(struct 
> xenbus_device *dev,
> &bind_interdomain);
>   
>   return err ? : bind_evtchn_to_irq_chip(bind_interdomain.local_port,
> -chip, dev);
> +chip, dev, shared);
>   }
>   
>   int bind_interdomain_evtchn_to_irq_lateeoi(struct xenbus_device *dev,
>  evtchn_port_t remote_port)
>   {
>   return bind_interdomain_evtchn_to_irq_chip(dev, remote_port,
> -&xen_lateeoi_chip);
> +&xen_lateeoi_chip, false);
>   }
>   EXPORT_SYMBOL_GPL(bind_interdomain_evtchn_to_irq_lateeoi);
>   
> @@ -1430,7 +1432,8 @@ static int bind_evtchn_to_irqhandler_chip(evtchn_port_t 
> evtchn,
>   {
>   int irq, retval;
>   
> - irq = bind_evtchn_to_irq_chip(evtchn, chip, NULL);
> + irq = bind_evtchn_to_irq_chip(evtchn, chip, NULL,
> +   irqflags & IRQF_SHARED);
>   if (irq < 0)
>   return irq;
>   retval = request_irq(irq, handler, irqflags, devname, dev_id);
> @@ -1471,7 +1474,8 @@ static int bind_interdomain_evtchn_to_irqhandler_chip(
>   {
>   int irq, retval;
>   
> - irq = bind_interdomain_evtchn_to_irq_chip(dev, remote_port, chip);
> + irq = bind_interdomain_evtchn_to_irq_chip(dev, remote_port, chip,
> +   irqflags & IRQF_SHARED);
>   if (irq < 0)
>   return irq;
>
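
For readers following the review, the rule the patch introduces for an already-bound event channel can be sketched as a small user-space model. This is only an illustration under stated assumptions: `struct irq_info_model`, `rebind_existing_model()` and the `warned` out-parameter are hypothetical stand-ins (the kernel uses `struct irq_info` and `WARN_ON()`), and only the refcount decision is modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct irq_info; only refcnt is kept. */
struct irq_info_model {
    int refcnt; /* a negative value means "not refcounted", as in the patch */
};

/*
 * Models the branch taken when the event channel already has an IRQ bound.
 * Returns true if the reference count was incremented; *warned is set when
 * the kernel code would have hit WARN_ON().
 */
bool rebind_existing_model(struct irq_info_model *info, bool shared,
                           bool *warned)
{
    assert(info != NULL && warned != NULL);
    *warned = false;

    if (!shared)
        return false;       /* caller did not pass IRQF_SHARED */

    if (info->refcnt < 0) {
        *warned = true;     /* stands in for WARN_ON(info->refcnt < 0) */
        return false;
    }

    info->refcnt++;
    return true;
}
```

In this model a non-shared caller never touches the count, and a shared caller that hits a non-refcounted channel gets a warning instead of an increment.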

Re: [PATCH 1/2] xen/evtchn: avoid WARN() when unbinding an event channel

2024-03-17 Thread Oleksandr Tyshchenko



On 13.03.24 09:14, Juergen Gross wrote:

Hello Juergen


> When unbinding a user event channel, the related handler might be
> called a last time in case the kernel was built with
> CONFIG_DEBUG_SHIRQ. This might cause a WARN() in the handler.
> 
> Avoid that by adding an "unbinding" flag to struct user_event which
> will short circuit the handler.
> 
> Fixes: 9e90e58c11b7 ("xen: evtchn: Allow shared registration of IRQ handers")
> Reported-by: Demi Marie Obenour 
> Tested-by: Demi Marie Obenour 
> Signed-off-by: Juergen Gross 


Reviewed-by: Oleksandr Tyshchenko 


> ---
>   drivers/xen/evtchn.c | 6 ++
>   1 file changed, 6 insertions(+)
> 
> diff --git a/drivers/xen/evtchn.c b/drivers/xen/evtchn.c
> index 59717628ca42..f6a2216c2c87 100644
> --- a/drivers/xen/evtchn.c
> +++ b/drivers/xen/evtchn.c
> @@ -85,6 +85,7 @@ struct user_evtchn {
>   struct per_user_data *user;
>   evtchn_port_t port;
>   bool enabled;
> + bool unbinding;
>   };
>   
>   static void evtchn_free_ring(evtchn_port_t *ring)
> @@ -164,6 +165,10 @@ static irqreturn_t evtchn_interrupt(int irq, void *data)
>   struct per_user_data *u = evtchn->user;
>   unsigned int prod, cons;
>   
> + /* Handler might be called when tearing down the IRQ. */
> + if (evtchn->unbinding)
> + return IRQ_HANDLED;
> +
>   WARN(!evtchn->enabled,
>"Interrupt for port %u, but apparently not enabled; per-user %p\n",
>evtchn->port, u);
> @@ -421,6 +426,7 @@ static void evtchn_unbind_from_user(struct per_user_data 
> *u,
>   
>   BUG_ON(irq < 0);
>   
> + evtchn->unbinding = true;
>   unbind_from_irqhandler(irq, evtchn);
>   
>   del_evtchn(u, evtchn);
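
The effect of the new flag can be illustrated with a minimal user-space model. It is a sketch, not the driver: `struct user_evtchn_model`, the `warnings` counter (standing in for `WARN()`) and both function names are hypothetical, and the extra handler call made during teardown with CONFIG_DEBUG_SHIRQ is simulated by calling the handler directly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of struct user_evtchn; warnings counts would-be WARN()s. */
struct user_evtchn_model {
    bool enabled;
    bool unbinding;
    unsigned int warnings;
};

/* Models the start of the interrupt handler, including the new early return. */
void evtchn_interrupt_model(struct user_evtchn_model *evtchn)
{
    assert(evtchn != NULL);

    /* The handler might be called while the IRQ is being torn down. */
    if (evtchn->unbinding)
        return;

    if (!evtchn->enabled)
        evtchn->warnings++; /* stands in for WARN(!evtchn->enabled, ...) */

    /* Event delivery to the user ring is omitted from this model. */
}

/* Models the unbind path: set the flag first, then the final handler call. */
void evtchn_unbind_model(struct user_evtchn_model *evtchn)
{
    assert(evtchn != NULL);
    evtchn->unbinding = true;
    evtchn_interrupt_model(evtchn); /* simulated CONFIG_DEBUG_SHIRQ call */
}
```

Because the flag is set before the simulated final call, a disabled channel no longer produces a warning while it is being unbound.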

Re: question about virtio-vsock on xen

2024-02-26 Thread Oleksandr Tyshchenko



On 26.02.24 05:09, Peng Fan wrote:
> Hi Oleksandr,

Hello Peng


[snip]

>>
>> ... Peng, we have vhost-vsock (and vhost-net) Xen PoC. Although it is
>> non-upstreamable in its current shape (based on old Linux version, requires some
>> rework and proper integration, most likely requires involving Qemu and
>> protocol changes to pass an additional info to vhost), it works with Linux
>> v5.10 + patched Qemu v7.0, so you can refer to the Yocto meta layer which
>> contains kernel patches for the details [1].
> 
> Thanks for the pointer, I am reading the code.
> 
>>
>> In a nutshell, before accessing the guest data the host module needs to map
>> descriptors in virtio rings which contain either guest grant based DMA
>> addresses (by using Xen grant mappings) or guest pseudo-physical addresses
>> (by using Xen foreign mappings). After accessing the guest data the host
>> module needs to unmap them.
> 
> Ok, I thought the current Xen virtio code already maps everything.
> 

It does, as you said virtio-blk-pci worked in your environment. But 
vhost(-vsock) is a special case: unlike virtio-blk-pci, where the 
whole backend resides in Qemu, here we have a split model. As I 
understand it, Qemu performs only the initial setup/configuration and then 
offloads the I/O processing to a separate entity, which is the Linux 
module in this particular case.

Re: question about virtio-vsock on xen

2024-02-24 Thread Oleksandr Tyshchenko



On 23.02.24 23:42, Stefano Stabellini wrote:
> Hi Peng,

Hello Peng, Stefano


> 
> We haven't tried to setup virtio-vsock yet.
> 
> In general, I am very supportive of using QEMU for virtio backends. We
> use QEMU to provide virtio-net, virtio-block, virtio-console and more.
> 
> However, typically virtio-vsock comes into play for VM-to-VM
> communication, which is different. Going via QEMU in Dom0 just to have 1
> VM communicate with another VM is not an ideal design: it adds latency
> and uses resources in Dom0 when actually we could do without it.
> 
> A better model for VM-to-VM communication would be to have the VM talk
> to each other directly via grant table or pre-shared memory (see the
> static shared memory feature) or via Xen hypercalls (see Argo.)
> 
> For a good Xen design, I think the virtio-vsock backend would need to be
> in Xen itself (the hypervisor).
> 
> Of course that is more work and it doesn't help you with the specific
> question you had below :-)
> 
> For that, I don't have a pointer to help you but maybe others in CC
> have.


Yes, I will try to provide some info ...


> 
> Cheers,
> 
> Stefano
> 
> 
> On Fri, 23 Feb 2024, Peng Fan wrote:
>> Hi All,
>>
>> Has anyone made virtio-vsock work on Xen? My dm args are as below:
>>
>> virtio = [
>> 'backend=0,type=virtio,device,transport=pci,bdf=05:00.0,backend_type=qemu,grant_usage=true'
>> ]
>> device_model_args = [
>> '-D', '/home/root/qemu_log.txt',
>> '-d', 
>> 'trace:*vsock*,trace:*vhost*,trace:*virtio*,trace:*pci_update*,trace:*pci_route*,trace:*handle_ioreq*,trace:*xen*',
>> '-device', 
>> 'vhost-vsock-pci,iommu_platform=false,id=vhost-vsock-pci0,bus=pcie.0,addr=5.0,guest-cid=3']
>>
>> During my test, it always returns failure in the dom0 kernel in the code below:
>>
>> vhost_transport_do_send_pkt {
>> ...
>> nbytes = copy_to_iter(hdr, sizeof(*hdr), &iov_iter);
>>  if (nbytes != sizeof(*hdr)) {
>>  vq_err(vq, "Faulted on copying pkt hdr %x %x %x 
>> %px\n", nbytes, sizeof(*hdr),
>> __builtin_object_size(hdr, 0), &iov_iter);
>>  kfree_skb(skb);
>>  break;
>>  }
>> }
>>
>> I checked copy_to_iter: it copies data to a __user addr, but it never
>> succeeds; the copy to the __user addr always returns 0 bytes copied.
>>
>> The asm code "sttr x7, [x6]" triggers a data abort and the kernel runs
>> into do_page_fault, but lock_mm_and_find_vma reports VM_FAULT_BADMAP,
>> which means the __user addr is not mapped: no vma contains this addr.
>>
>> I am not sure what may cause this. I would appreciate any comments.


   ... Peng, we have vhost-vsock (and vhost-net) Xen PoC. Although it is 
non-upstreamable in its current shape (based on old Linux version, 
requires some rework and proper integration, most likely requires 
involving Qemu and protocol changes to pass an additional info to 
vhost), it works with Linux v5.10 + patched Qemu v7.0, so you can refer 
to the Yocto meta layer which contains kernel patches for the details [1].

In a nutshell, before accessing the guest data the host module needs to 
map descriptors in virtio rings which contain either guest grant based 
DMA addresses (by using Xen grant mappings) or guest pseudo-physical 
addresses (by using Xen foreign mappings). After accessing the guest 
data the host module needs to unmap them.

Also note, in that PoC the target mapping scheme is controlled via a 
module param and the guest domain id is retrieved from the device-model 
specific part of the Xenstore (so Qemu/protocol are unmodified). But you 
might want to look at [2] as an example of vhost-user protocol changes 
that pass this additional info.
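
The map/access/unmap sequence described above can be sketched as a small user-space model. Everything in it is hypothetical and for illustration only: guest memory is a plain array, `map_guest_model()` and `unmap_guest_model()` stand in for the real Xen grant or foreign mapping calls (which the PoC selects via a module param), and none of the names come from the PoC itself.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical selector, playing the role of the PoC's module param. */
enum map_scheme_model { MAP_GRANT, MAP_FOREIGN };

/* Guest memory is modelled as a small array; live_mappings tracks leaks. */
struct guest_model {
    unsigned char mem[64];
    unsigned int live_mappings;
};

/* Stand-in for a grant/foreign mapping: returns a host pointer or NULL. */
void *map_guest_model(struct guest_model *g, enum map_scheme_model scheme,
                      size_t off, size_t len)
{
    (void)scheme; /* both schemes yield a host pointer in this model */
    if (off > sizeof(g->mem) || len > sizeof(g->mem) - off)
        return NULL;
    g->live_mappings++;
    return g->mem + off;
}

/* Stand-in for the matching unmap operation. */
void unmap_guest_model(struct guest_model *g, void *va)
{
    (void)va;
    g->live_mappings--;
}

/* Map, access the guest data, then unmap: the order the host module needs. */
int copy_to_guest_model(struct guest_model *g, enum map_scheme_model scheme,
                        size_t off, const void *src, size_t len)
{
    void *va = map_guest_model(g, scheme, off, len);

    if (va == NULL)
        return -1; /* nothing mapped, so the access must not happen */

    memcpy(va, src, len);
    unmap_guest_model(g, va);
    return 0;
}
```

The point of the sketch is only the ordering: the access is bracketed by a successful map and a matching unmap, and an unmapped address is reported as an error rather than touched.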

Hope that helps.

[1] https://github.com/xen-troops/meta-xt-vhost/commits/main/
[2] https://www.mail-archive.com/qemu-devel@nongnu.org/msg948327.html

P.S. I may answer with a delay.


>>
>> BTW: I tested blk pci, it works, so the virtio pci should work on my setup.
>>
>> Thanks,
>> Peng.
>>

[PATCH V3] libxl: Add "grant_usage" parameter for virtio disk devices

2024-02-15 Thread Oleksandr Tyshchenko

From: Oleksandr Tyshchenko 

Allow administrators to control whether Xen grant mappings for
the virtio disk devices should be used. By default (when new
parameter is not specified), the existing behavior is retained
(we enable grants if backend-domid != 0).

Signed-off-by: Oleksandr Tyshchenko 
---
In addition to "libxl: arm: Add grant_usage parameter for virtio devices"
https://github.com/xen-project/xen/commit/c14254065ff4826e34f714e1790eab5217368c38

 V2:
  - clarify documentation to match the implementation
  - apply a default value if "grant_usage" is missing in the Xenstore
in libxl__disk_from_xenstore()

 V3:
  - include autogenerated changes to tools/libs/util/libxlu_disk_l.c(h)
  - remove debug log from libxl__disk_from_xenstore(),
correct coding style
---
 docs/man/xl-disk-configuration.5.pod.in |   24 +
 tools/golang/xenlight/helpers.gen.go|6 +
 tools/golang/xenlight/types.gen.go  |1 +
 tools/include/libxl.h   |7 +
 tools/libs/light/libxl_arm.c|4 +-
 tools/libs/light/libxl_disk.c   |   13 +
 tools/libs/light/libxl_types.idl|1 +
 tools/libs/util/libxlu_disk_l.c | 1001 ---
 tools/libs/util/libxlu_disk_l.h |9 +-
 tools/libs/util/libxlu_disk_l.l |3 +
 10 files changed, 590 insertions(+), 479 deletions(-)

diff --git a/docs/man/xl-disk-configuration.5.pod.in 
b/docs/man/xl-disk-configuration.5.pod.in
index cb442bd5b4..98ebf8 100644
--- a/docs/man/xl-disk-configuration.5.pod.in
+++ b/docs/man/xl-disk-configuration.5.pod.in
@@ -406,6 +406,30 @@ Virtio frontend driver (virtio-blk) to be used. Please 
note, the virtual
 device (vdev) is not passed to the guest in that case, but it still must be
 specified for the internal purposes.
 
+=item B
+
+=over 4
+
+=item Description
+
+Specifies the usage of Xen grants for accessing guest memory. Only applicable
+to specification "virtio".
+
+=item Supported values
+
+1, 0
+
+=item Mandatory
+
+No
+
+=item Default value
+
+If this option is missing, then the default grant setting will be used,
+i.e. "grant_usage=1" if backend-domid != 0 or "grant_usage=0" otherwise.
+
+=back
+
 =back
 
 =head1 COLO Parameters
diff --git a/tools/golang/xenlight/helpers.gen.go 
b/tools/golang/xenlight/helpers.gen.go
index 0f8e23773c..acdf1c1820 100644
--- a/tools/golang/xenlight/helpers.gen.go
+++ b/tools/golang/xenlight/helpers.gen.go
@@ -1885,6 +1885,9 @@ x.ActiveDisk = C.GoString(xc.active_disk)
 x.HiddenDisk = C.GoString(xc.hidden_disk)
 if err := x.Trusted.fromC(&xc.trusted);err != nil {
 return fmt.Errorf("converting field Trusted: %v", err)
+}
+if err := x.GrantUsage.fromC(&xc.grant_usage);err != nil {
+return fmt.Errorf("converting field GrantUsage: %v", err)
 }
 
  return nil}
@@ -1933,6 +1936,9 @@ if x.HiddenDisk != "" {
 xc.hidden_disk = C.CString(x.HiddenDisk)}
 if err := x.Trusted.toC(&xc.trusted); err != nil {
 return fmt.Errorf("converting field Trusted: %v", err)
+}
+if err := x.GrantUsage.toC(&xc.grant_usage); err != nil {
+return fmt.Errorf("converting field GrantUsage: %v", err)
 }
 
  return nil
diff --git a/tools/golang/xenlight/types.gen.go 
b/tools/golang/xenlight/types.gen.go
index 9c8b7b81f6..76b4ed991b 100644
--- a/tools/golang/xenlight/types.gen.go
+++ b/tools/golang/xenlight/types.gen.go
@@ -741,6 +741,7 @@ ColoExport string
 ActiveDisk string
 HiddenDisk string
 Trusted Defbool
+GrantUsage Defbool
 }
 
 type DeviceNic struct {
diff --git a/tools/include/libxl.h b/tools/include/libxl.h
index 46bc774126..a370528ba1 100644
--- a/tools/include/libxl.h
+++ b/tools/include/libxl.h
@@ -578,6 +578,13 @@
  */
 #define LIBXL_HAVE_DEVICE_DISK_SPECIFICATION 1
 
+/*
+ * LIBXL_HAVE_DISK_GRANT_USAGE indicates that the libxl_device_disk
+ * has 'grant_usage' field to specify the usage of Xen grants for
+ * the specification 'virtio'.
+ */
+#define LIBXL_HAVE_DISK_GRANT_USAGE 1
+
 /*
  * LIBXL_HAVE_CONSOLE_ADD_XENSTORE indicates presence of the function
  * libxl_console_add_xenstore() in libxl.
diff --git a/tools/libs/light/libxl_arm.c b/tools/libs/light/libxl_arm.c
index 1539191774..1cb89fa584 100644
--- a/tools/libs/light/libxl_arm.c
+++ b/tools/libs/light/libxl_arm.c
@@ -1372,12 +1372,12 @@ next_resize:
 libxl_device_disk *disk = &d_config->disks[i];
 
 if (disk->specification == LIBXL_DISK_SPECIFICATION_VIRTIO) {
-if (disk->backend_domid != LIBXL_TOOLSTACK_DOMID)
+if (libxl_defbool_val(disk->grant_usage))
 iommu_needed = true;
 
 FDT( make_virtio_mmio_node(gc, fdt, disk->base, disk->irq,
disk->backend_domid,
-   disk->backend_domid != 
LIBXL_TOOLSTACK_DOMID) );
+

Re: [PATCH V2] libxl: Add "grant_usage" parameter for virtio disk devices

2024-02-15 Thread Oleksandr Tyshchenko



On 13.02.24 14:14, Anthony PERARD wrote:

Hello Anthony

> On Tue, Feb 06, 2024 at 02:38:14PM +0200, Oleksandr Tyshchenko wrote:
>> From: Oleksandr Tyshchenko 
>>
>> Allow administrators to control whether Xen grant mappings for
>> the virtio disk devices should be used. By default (when new
>> parameter is not specified), the existing behavior is retained
>> (we enable grants if backend-domid != 0).
>>
>> Signed-off-by: Oleksandr Tyshchenko 
>> ---
>> In addition to "libxl: arm: Add grant_usage parameter for virtio devices"
>> https://github.com/xen-project/xen/commit/c14254065ff4826e34f714e1790eab5217368c38
>>
>> I wonder, whether I had to also include autogenerated changes to:
>>   - tools/libs/util/libxlu_disk_l.c
>>   - tools/libs/util/libxlu_disk_l.h
> 
> Well, that could be done on commit. The changes are going to be needed
> to be committed as they may not be regenerated to include the new feature
> in a build.

Thanks. As V3 is needed anyway, I will include them.


> 
>> ---
>> diff --git a/tools/libs/light/libxl_disk.c b/tools/libs/light/libxl_disk.c
>> index ea3623dd6f..ed02b655a3 100644
>> --- a/tools/libs/light/libxl_disk.c
>> +++ b/tools/libs/light/libxl_disk.c
>> @@ -623,6 +628,15 @@ static int libxl__disk_from_xenstore(libxl__gc *gc, 
>> const char *libxl_path,
>>   goto cleanup;
>>   }
>>   disk->irq = strtoul(tmp, NULL, 10);
>> +
>> +tmp = libxl__xs_read(gc, XBT_NULL,
>> + GCSPRINTF("%s/grant_usage", libxl_path));
>> +if (!tmp) {
>> +LOG(DEBUG, "Missing xenstore node %s/grant_usage, using default 
>> value", libxl_path);
> 
> Is this information useful for debugging?
> 
> It should be easy to find out if the grant_usage node is present or not
> by looking at xenstore, and I don't think libxl is going to make use of
> that information after this point, so I don't think that's going to be
> very useful.

It is not very useful, will drop the log.


> 
>> +libxl_defbool_set(&disk->grant_usage,
>> +  disk->backend_domid != LIBXL_TOOLSTACK_DOMID);
>> +} else
>> +libxl_defbool_set(&disk->grant_usage, strtoul(tmp, NULL, 0));
> 
> Per coding style, it's better to have both side of an if..else to have
> {}-block or none of them. So could you add a {} block in the else, or
> remove the {} from the true side if we remove the LOG()?


Will do.
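
For reference, the defaulting behaviour discussed in this review (no debug log, a default applied when the Xenstore node is absent) can be sketched as a standalone helper. This is a minimal sketch under stated assumptions: `resolve_grant_usage_model()` and `TOOLSTACK_DOMID_MODEL` are hypothetical names, the value 0 follows the documented "backend-domid != 0" rule, and the real code stores the result via `libxl_defbool_set()` rather than returning a bool.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Stand-in for LIBXL_TOOLSTACK_DOMID; 0 per the documented default rule. */
#define TOOLSTACK_DOMID_MODEL 0u

/*
 * xs_value is the string read from the "grant_usage" Xenstore node, or NULL
 * if the node is missing. Returns whether Xen grants should be used.
 */
bool resolve_grant_usage_model(const char *xs_value, unsigned int backend_domid)
{
    if (!xs_value) {
        /* Node missing: keep the pre-existing behaviour as the default. */
        return backend_domid != TOOLSTACK_DOMID_MODEL;
    } else {
        /* Node present: any non-zero number enables grants. */
        return strtoul(xs_value, NULL, 0) != 0;
    }
}
```

Both branches carry braces, matching the coding-style remark above, and the missing-node case silently falls back to the default.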

> 
> Thanks,
>

[PATCH V2] libxl: Add "grant_usage" parameter for virtio disk devices

2024-02-06 Thread Oleksandr Tyshchenko

From: Oleksandr Tyshchenko 

Allow administrators to control whether Xen grant mappings for
the virtio disk devices should be used. By default (when new
parameter is not specified), the existing behavior is retained
(we enable grants if backend-domid != 0).

Signed-off-by: Oleksandr Tyshchenko 
---
In addition to "libxl: arm: Add grant_usage parameter for virtio devices"
https://github.com/xen-project/xen/commit/c14254065ff4826e34f714e1790eab5217368c38

I wonder, whether I had to also include autogenerated changes to:
 - tools/libs/util/libxlu_disk_l.c
 - tools/libs/util/libxlu_disk_l.h

 V2:
  - clarify documentation to match the implementation
  - apply a default value if "grant_usage" is missing in the Xenstore
in libxl__disk_from_xenstore()
---
 docs/man/xl-disk-configuration.5.pod.in | 24 
 tools/golang/xenlight/helpers.gen.go|  6 ++
 tools/golang/xenlight/types.gen.go  |  1 +
 tools/include/libxl.h   |  7 +++
 tools/libs/light/libxl_arm.c|  4 ++--
 tools/libs/light/libxl_disk.c   | 14 ++
 tools/libs/light/libxl_types.idl|  1 +
 tools/libs/util/libxlu_disk_l.l |  3 +++
 8 files changed, 58 insertions(+), 2 deletions(-)

diff --git a/docs/man/xl-disk-configuration.5.pod.in 
b/docs/man/xl-disk-configuration.5.pod.in
index bc945cc517..55e78005cf 100644
--- a/docs/man/xl-disk-configuration.5.pod.in
+++ b/docs/man/xl-disk-configuration.5.pod.in
@@ -404,6 +404,30 @@ Virtio frontend driver (virtio-blk) to be used. Please 
note, the virtual
 device (vdev) is not passed to the guest in that case, but it still must be
 specified for the internal purposes.
 
+=item B
+
+=over 4
+
+=item Description
+
+Specifies the usage of Xen grants for accessing guest memory. Only applicable
+to specification "virtio".
+
+=item Supported values
+
+1, 0
+
+=item Mandatory
+
+No
+
+=item Default value
+
+If this option is missing, then the default grant setting will be used,
+i.e. "grant_usage=1" if backend-domid != 0 or "grant_usage=0" otherwise.
+
+=back
+
 =back
 
 =head1 COLO Parameters
diff --git a/tools/golang/xenlight/helpers.gen.go 
b/tools/golang/xenlight/helpers.gen.go
index 35e209ff1b..768ab0f566 100644
--- a/tools/golang/xenlight/helpers.gen.go
+++ b/tools/golang/xenlight/helpers.gen.go
@@ -1879,6 +1879,9 @@ x.ActiveDisk = C.GoString(xc.active_disk)
 x.HiddenDisk = C.GoString(xc.hidden_disk)
 if err := x.Trusted.fromC(&xc.trusted);err != nil {
 return fmt.Errorf("converting field Trusted: %v", err)
+}
+if err := x.GrantUsage.fromC(&xc.grant_usage);err != nil {
+return fmt.Errorf("converting field GrantUsage: %v", err)
 }
 
  return nil}
@@ -1927,6 +1930,9 @@ if x.HiddenDisk != "" {
 xc.hidden_disk = C.CString(x.HiddenDisk)}
 if err := x.Trusted.toC(&xc.trusted); err != nil {
 return fmt.Errorf("converting field Trusted: %v", err)
+}
+if err := x.GrantUsage.toC(&xc.grant_usage); err != nil {
+return fmt.Errorf("converting field GrantUsage: %v", err)
 }
 
  return nil
diff --git a/tools/golang/xenlight/types.gen.go 
b/tools/golang/xenlight/types.gen.go
index 7907aa8999..0b712d2aa4 100644
--- a/tools/golang/xenlight/types.gen.go
+++ b/tools/golang/xenlight/types.gen.go
@@ -740,6 +740,7 @@ ColoExport string
 ActiveDisk string
 HiddenDisk string
 Trusted Defbool
+GrantUsage Defbool
 }
 
 type DeviceNic struct {
diff --git a/tools/include/libxl.h b/tools/include/libxl.h
index f1652b1664..2b69e08466 100644
--- a/tools/include/libxl.h
+++ b/tools/include/libxl.h
@@ -578,6 +578,13 @@
  */
 #define LIBXL_HAVE_DEVICE_DISK_SPECIFICATION 1
 
+/*
+ * LIBXL_HAVE_DISK_GRANT_USAGE indicates that the libxl_device_disk
+ * has 'grant_usage' field to specify the usage of Xen grants for
+ * the specification 'virtio'.
+ */
+#define LIBXL_HAVE_DISK_GRANT_USAGE 1
+
 /*
  * LIBXL_HAVE_CONSOLE_ADD_XENSTORE indicates presence of the function
  * libxl_console_add_xenstore() in libxl.
diff --git a/tools/libs/light/libxl_arm.c b/tools/libs/light/libxl_arm.c
index 1539191774..1cb89fa584 100644
--- a/tools/libs/light/libxl_arm.c
+++ b/tools/libs/light/libxl_arm.c
@@ -1372,12 +1372,12 @@ next_resize:
 libxl_device_disk *disk = &d_config->disks[i];
 
 if (disk->specification == LIBXL_DISK_SPECIFICATION_VIRTIO) {
-if (disk->backend_domid != LIBXL_TOOLSTACK_DOMID)
+if (libxl_defbool_val(disk->grant_usage))
 iommu_needed = true;
 
 FDT( make_virtio_mmio_node(gc, fdt, disk->base, disk->irq,
disk->backend_domid,
-   disk->backend_domid != 
LIBXL_TOOLSTACK_DOMID) );
+   
libxl_defbool_val(disk->grant_usage)) );
 }
 }
 
diff --git a/tools/libs/light/li

Re: [PATCH] libxl: Add "grant_usage" parameter for virtio disk devices

2024-02-06 Thread Oleksandr Tyshchenko



On 06.02.24 12:27, Anthony PERARD wrote:


Hello Anthony

[snip]

 diff --git a/docs/man/xl-disk-configuration.5.pod.in 
 b/docs/man/xl-disk-configuration.5.pod.in
 index bc945cc517..3c035456d5 100644
 --- a/docs/man/xl-disk-configuration.5.pod.in
 +++ b/docs/man/xl-disk-configuration.5.pod.in
 @@ -404,6 +404,31 @@ Virtio frontend driver (virtio-blk) to be used. 
 Please note, the virtual
 +=item B

 +=over 4
 +
 +=item Description
 +
 +Specifies the usage of Xen grants for accessing guest memory. Only 
 applicable
 +to specification "virtio".
 +
 +=item Supported values
 +
 +If this option is B<true>, the Xen grants are always enabled.
 +If this option is B<false>, the Xen grants are always disabled.
>>>
>>> Unfortunately, this is wrong, the implementation in the patch only
>>> support two values: 1 / 0, nothing else, and trying to write "true" or
>>> "false" would lead to an error. (Well actually it's "grant_usage=1" or
>>> "grant_usage=0", there's nothing that cut that string at the '='.)
>>
>>
>> You are right, only 1 / 0 can be set, unlike for virtio=[...], which seems
>> happy with false/true.
>>
>>
>>>
>>> Also, do we really need the extra verbal description of each value here?
>>> Is simply having the following would be enough?
>>>
>>>   =item Supported values
>>>
>>>   1, 0
>>>
>>> The description in "Description" section would hopefully be enough.
>>
>>
>> I think this makes sense.
>>
>> So, shall I leave "grant_usage=1/grant_usage=0" or use the proposed option
>> "use-grant/no-use-grant"?
> 
> Let's go with "grant_usage=*", at least this will be consistent with the
> option for "virtio".


thanks for the confirmation, will do


> 
> Cheers,
>
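
For readers following the thread, the point about accepted values can be
illustrated with a minimal sketch. It is not the flex rule from
libxlu_disk_l.l; parse_grant_usage() and grant_usage_t are hypothetical names
used only here. It assumes, as discussed above, that only the exact spellings
"grant_usage=1" and "grant_usage=0" (optionally followed by a comma) are
accepted and that anything else, including "true"/"false", is an error.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical tri-state for this sketch (not the libxl type). */
typedef enum { GU_DEFAULT = 0, GU_FALSE, GU_TRUE } grant_usage_t;

/*
 * Parse one disk-spec token. Returns 0 and stores the value on success,
 * -1 if the token is not one of the two accepted spellings.
 */
int parse_grant_usage(const char *token, grant_usage_t *out)
{
    static const char on[]  = "grant_usage=1";
    static const char off[] = "grant_usage=0";
    size_t len;

    assert(token != NULL && out != NULL);
    len = strlen(token);

    /* The quoted rules end in ",?", so tolerate one trailing comma. */
    if (len > 0 && token[len - 1] == ',')
        len--;

    if (len == strlen(on) && strncmp(token, on, len) == 0) {
        *out = GU_TRUE;
        return 0;
    }
    if (len == strlen(off) && strncmp(token, off, len) == 0) {
        *out = GU_FALSE;
        return 0;
    }
    return -1; /* e.g. "grant_usage=true" is rejected */
}
```

Keeping the accepted set this narrow is what makes "Supported values: 1, 0"
an accurate description for the man page.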

Re: [PATCH] libxl: Add "grant_usage" parameter for virtio disk devices

2024-02-05 Thread Oleksandr Tyshchenko



On 05.02.24 17:10, Anthony PERARD wrote:

Hello Anthony


> On Fri, Feb 02, 2024 at 12:49:03PM +0200, Oleksandr Tyshchenko wrote:
>> diff --git a/tools/libs/util/libxlu_disk_l.l 
>> b/tools/libs/util/libxlu_disk_l.l
>> index 6d53c093a3..f37dd443bd 100644
>> --- a/tools/libs/util/libxlu_disk_l.l
>> +++ b/tools/libs/util/libxlu_disk_l.l
>> @@ -220,6 +220,9 @@ hidden-disk=[^,]*,?  { STRIP(','); 
>> SAVESTRING("hidden-disk", hidden_disk, FROMEQU
>>   trusted,?  { libxl_defbool_set(&DPC->disk->trusted, true); }
>>   untrusted,?{ libxl_defbool_set(&DPC->disk->trusted, 
>> false); }
>>   
>> +grant_usage=1,? { libxl_defbool_set(&DPC->disk->grant_usage, 
>> true); }
>> +grant_usage=0,? { libxl_defbool_set(&DPC->disk->grant_usage, 
>> false); }
> 
> For other boolean type for the disk, we have "trusted/untrusted",
> "discard/no-discard", "direct-io-save/", but you are adding
> "grant_usage=1/grant_usage=0". Is that fine? But I guess having the new
> option spelled "grant_usage" might be better, so it match the other
> virtio devices and the implementation. 


Yes, I noticed how booleans are described for the disk. I decided 
to use the same representation of this option as was already used for 
virtio=[...]. But I would be OK with other variants ...


But maybe
> "use-grant/no-use-grant" might be ok?

   ... like that, but preferably leaving libxl_device_disk's field 
named "grant_usage" (if there is no objection).


> 
> In any case, the implementation need to match the documentation, and
> vice versa. See below.


Sure.


> 
>> diff --git a/docs/man/xl-disk-configuration.5.pod.in 
>> b/docs/man/xl-disk-configuration.5.pod.in
>> index bc945cc517..3c035456d5 100644
>> --- a/docs/man/xl-disk-configuration.5.pod.in
>> +++ b/docs/man/xl-disk-configuration.5.pod.in
>> @@ -404,6 +404,31 @@ Virtio frontend driver (virtio-blk) to be used. Please 
>> note, the virtual
>> +=item B
>>
>> +=over 4
>> +
>> +=item Description
>> +
>> +Specifies the usage of Xen grants for accessing guest memory. Only 
>> applicable
>> +to specification "virtio".
>> +
>> +=item Supported values
>> +
>> +If this option is B<true>, the Xen grants are always enabled.
>> +If this option is B<false>, the Xen grants are always disabled.
> 
> Unfortunately, this is wrong, the implementation in the patch only
> support two values: 1 / 0, nothing else, and trying to write "true" or
> "false" would lead to an error. (Well actually it's "grant_usage=1" or
> "grant_usage=0", there's nothing that cut that string at the '='.)


You are right, only 1 / 0 can be set, unlike for virtio=[...], which seems 
happy with false/true.


> 
> Also, do we really need the extra verbal description of each value here?
> Is simply having the following would be enough?
> 
>  =item Supported values
> 
>  1, 0
> 
> The description in "Description" section would hopefully be enough.


I think this makes sense.

So, shall I leave "grant_usage=1/grant_usage=0" or use the proposed option 
"use-grant/no-use-grant"?


> 
>> +=item Mandatory
>> +
>> +No
>> +
>> +=item Default value
>> +
>> +If this option is missing, then the default grant setting will be used,
>> +i.e. enable grants if backend-domid != 0.
>> +
>> +=back
>> +
>> diff --git a/tools/libs/light/libxl_disk.c b/tools/libs/light/libxl_disk.c
>> index ea3623dd6f..f39f427091 100644
>> --- a/tools/libs/light/libxl_disk.c
>> +++ b/tools/libs/light/libxl_disk.c
>> @@ -181,6 +181,9 @@ static int libxl__device_disk_setdefault(libxl__gc *gc, 
>> uint32_t domid,
>>   return ERROR_INVAL;
>>   }
>>   disk->transport = LIBXL_DISK_TRANSPORT_MMIO;
>> +
>> +libxl_defbool_setdefault(&disk->grant_usage,
>> + disk->backend_domid != 
>> LIBXL_TOOLSTACK_DOMID);
>>   }
>>   
>>   if (hotplug && disk->specification == LIBXL_DISK_SPECIFICATION_VIRTIO) 
>> {
>> @@ -429,6 +432,8 @@ static void device_disk_add(libxl__egc *egc, uint32_t 
>> domid,
>>   flexarray_append(back, 
>> libxl__device_disk_string_of_transport(disk->transport));
>>   flexarray_append_pair(back, "base", GCSPRINTF("%"PRIu64, 
>> disk->base));
>>
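
The default handling in the quoted libxl_disk.c hunk can be sketched in
isolation. This is a simplified model, not libxl code: defbool_t and the
demo_* helpers are hypothetical stand-ins for the libxl_defbool helpers, and
the toolstack domid is assumed to be 0, matching the "backend-domid != 0"
wording in the documentation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_TOOLSTACK_DOMID 0u /* assumption: toolstack runs in domain 0 */

/* Hypothetical tri-state: 0 = unset, 1 = true, -1 = false. */
typedef struct { int val; } defbool_t;

void demo_defbool_set(defbool_t *db, bool b)
{
    db->val = b ? 1 : -1;
}

/* Only takes effect if the administrator did not set the option. */
void demo_defbool_setdefault(defbool_t *db, bool b)
{
    if (db->val == 0)
        demo_defbool_set(db, b);
}

bool demo_defbool_val(defbool_t db)
{
    assert(db.val != 0); /* must be resolved before it is read */
    return db.val > 0;
}

/* Mirrors the setdefault step: grants default to "backend is not dom0". */
void demo_resolve_grant_usage(defbool_t *grant_usage, uint32_t backend_domid)
{
    demo_defbool_setdefault(grant_usage,
                            backend_domid != DEMO_TOOLSTACK_DOMID);
}
```

With this shape an explicit grant_usage=0 survives even when the backend runs
in a non-zero domain, while an unset option keeps the previous behaviour.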

Re: [PATCH] libxl: Add "grant_usage" parameter for virtio disk devices

2024-02-02 Thread Oleksandr Tyshchenko



On 02.02.24 13:03, Viresh Kumar wrote:

Hello Viresh


> On 02-02-24, 12:49, Oleksandr Tyshchenko wrote:
>> diff --git a/docs/man/xl-disk-configuration.5.pod.in 
>> b/docs/man/xl-disk-configuration.5.pod.in
>> index bc945cc517..3c035456d5 100644
>> --- a/docs/man/xl-disk-configuration.5.pod.in
>> +++ b/docs/man/xl-disk-configuration.5.pod.in
>> @@ -404,6 +404,31 @@ Virtio frontend driver (virtio-blk) to be used. Please 
>> note, the virtual
>>   device (vdev) is not passed to the guest in that case, but it still must be
>>   specified for the internal purposes.
>>   
>> +=item B
>> +
>> +=over 4
>> +
>> +=item Description
>> +
>> +Specifies the usage of Xen grants for accessing guest memory. Only 
>> applicable
>> +to specification "virtio".
>> +
>> +=item Supported values
>> +
>> +If this option is B<true>, the Xen grants are always enabled.
>> +If this option is B<false>, the Xen grants are always disabled.
>> +
>> +=item Mandatory
>> +
>> +No
>> +
>> +=item Default value
>> +
>> +If this option is missing, then the default grant setting will be used,
>> +i.e. enable grants if backend-domid != 0.
>> +
>> +=back
>> +
>>   =back
>>   
>>   =head1 COLO Parameters
> 
> I wonder if there is a way to avoid the duplication here and use the 
> definition
> from: docs/man/xl.cfg.5.pod.in somehow ?


That's a good point. I am not 100% sure, but if we could use something 
like that it would be really nice. Let's see what other reviewers say.


=item B

=over 4

Specifies the usage of Xen grants for accessing guest memory. Only 
applicable to specification "virtio". Please see B in 
L for more information on this option.

=back

[PATCH] libxl: Add "grant_usage" parameter for virtio disk devices

2024-02-02 Thread Oleksandr Tyshchenko

From: Oleksandr Tyshchenko 

Allow administrators to control whether Xen grant mappings should be
used for virtio disk devices. By default (when the new parameter is
not specified), the existing behavior is retained (we enable grants
if backend-domid != 0).

Signed-off-by: Oleksandr Tyshchenko 
---
In addition to "libxl: arm: Add grant_usage parameter for virtio devices"
https://github.com/xen-project/xen/commit/c14254065ff4826e34f714e1790eab5217368c38
---
 docs/man/xl-disk-configuration.5.pod.in | 25 +
 tools/golang/xenlight/helpers.gen.go|  6 ++
 tools/golang/xenlight/types.gen.go  |  1 +
 tools/include/libxl.h   |  7 +++
 tools/libs/light/libxl_arm.c|  4 ++--
 tools/libs/light/libxl_disk.c   | 13 +
 tools/libs/light/libxl_types.idl|  1 +
 tools/libs/util/libxlu_disk_l.l |  3 +++
 8 files changed, 58 insertions(+), 2 deletions(-)

diff --git a/docs/man/xl-disk-configuration.5.pod.in 
b/docs/man/xl-disk-configuration.5.pod.in
index bc945cc517..3c035456d5 100644
--- a/docs/man/xl-disk-configuration.5.pod.in
+++ b/docs/man/xl-disk-configuration.5.pod.in
@@ -404,6 +404,31 @@ Virtio frontend driver (virtio-blk) to be used. Please 
note, the virtual
 device (vdev) is not passed to the guest in that case, but it still must be
 specified for the internal purposes.
 
+=item B
+
+=over 4
+
+=item Description
+
+Specifies the usage of Xen grants for accessing guest memory. Only applicable
+to specification "virtio".
+
+=item Supported values
+
+If this option is B<true>, the Xen grants are always enabled.
+If this option is B<false>, the Xen grants are always disabled.
+
+=item Mandatory
+
+No
+
+=item Default value
+
+If this option is missing, then the default grant setting will be used,
+i.e. enable grants if backend-domid != 0.
+
+=back
+
 =back
 
 =head1 COLO Parameters
diff --git a/tools/golang/xenlight/helpers.gen.go 
b/tools/golang/xenlight/helpers.gen.go
index 35e209ff1b..768ab0f566 100644
--- a/tools/golang/xenlight/helpers.gen.go
+++ b/tools/golang/xenlight/helpers.gen.go
@@ -1879,6 +1879,9 @@ x.ActiveDisk = C.GoString(xc.active_disk)
 x.HiddenDisk = C.GoString(xc.hidden_disk)
 if err := x.Trusted.fromC(&xc.trusted);err != nil {
 return fmt.Errorf("converting field Trusted: %v", err)
+}
+if err := x.GrantUsage.fromC(&xc.grant_usage);err != nil {
+return fmt.Errorf("converting field GrantUsage: %v", err)
 }
 
  return nil}
@@ -1927,6 +1930,9 @@ if x.HiddenDisk != "" {
 xc.hidden_disk = C.CString(x.HiddenDisk)}
 if err := x.Trusted.toC(&xc.trusted); err != nil {
 return fmt.Errorf("converting field Trusted: %v", err)
+}
+if err := x.GrantUsage.toC(&xc.grant_usage); err != nil {
+return fmt.Errorf("converting field GrantUsage: %v", err)
 }
 
  return nil
diff --git a/tools/golang/xenlight/types.gen.go 
b/tools/golang/xenlight/types.gen.go
index 7907aa8999..0b712d2aa4 100644
--- a/tools/golang/xenlight/types.gen.go
+++ b/tools/golang/xenlight/types.gen.go
@@ -740,6 +740,7 @@ ColoExport string
 ActiveDisk string
 HiddenDisk string
 Trusted Defbool
+GrantUsage Defbool
 }
 
 type DeviceNic struct {
diff --git a/tools/include/libxl.h b/tools/include/libxl.h
index f1652b1664..2b69e08466 100644
--- a/tools/include/libxl.h
+++ b/tools/include/libxl.h
@@ -578,6 +578,13 @@
  */
 #define LIBXL_HAVE_DEVICE_DISK_SPECIFICATION 1
 
+/*
+ * LIBXL_HAVE_DISK_GRANT_USAGE indicates that the libxl_device_disk
+ * has 'grant_usage' field to specify the usage of Xen grants for
+ * the specification 'virtio'.
+ */
+#define LIBXL_HAVE_DISK_GRANT_USAGE 1
+
 /*
  * LIBXL_HAVE_CONSOLE_ADD_XENSTORE indicates presence of the function
  * libxl_console_add_xenstore() in libxl.
diff --git a/tools/libs/light/libxl_arm.c b/tools/libs/light/libxl_arm.c
index 1539191774..1cb89fa584 100644
--- a/tools/libs/light/libxl_arm.c
+++ b/tools/libs/light/libxl_arm.c
@@ -1372,12 +1372,12 @@ next_resize:
 libxl_device_disk *disk = &d_config->disks[i];
 
 if (disk->specification == LIBXL_DISK_SPECIFICATION_VIRTIO) {
-if (disk->backend_domid != LIBXL_TOOLSTACK_DOMID)
+if (libxl_defbool_val(disk->grant_usage))
 iommu_needed = true;
 
 FDT( make_virtio_mmio_node(gc, fdt, disk->base, disk->irq,
disk->backend_domid,
-   disk->backend_domid != 
LIBXL_TOOLSTACK_DOMID) );
+   
libxl_defbool_val(disk->grant_usage)) );
 }
 }
 
diff --git a/tools/libs/light/libxl_disk.c b/tools/libs/light/libxl_disk.c
index ea3623dd6f..f39f427091 100644
--- a/tools/libs/light/libxl_disk.c
+++ b/tools/libs/light/libxl_disk.c
@@ -181,6 +181,9 @@ static int libxl__device_disk_setdefault(libxl__gc *gc, 
uint32_t domid,
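
The LIBXL_HAVE_DISK_GRANT_USAGE define added to libxl.h above is meant to let
callers compile against both older and newer libxl. A minimal sketch of that
pattern follows. It does not use the real headers: struct demo_disk and
demo_request_grants() are hypothetical, and the macro is defined locally only
so the example builds standalone (real code would get it from <libxl.h>).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the define that <libxl.h> would provide (assumption). */
#ifndef LIBXL_HAVE_DISK_GRANT_USAGE
#define LIBXL_HAVE_DISK_GRANT_USAGE 1
#endif

/* Hypothetical, trimmed-down model of a disk description. */
struct demo_disk {
    int backend_domid;
#ifdef LIBXL_HAVE_DISK_GRANT_USAGE
    int grant_usage; /* 0 = unset, 1 = true, -1 = false */
#endif
};

/*
 * Apply the caller's choice if the field exists in this build.
 * Returns true if the setting was applied, false if it is unsupported.
 */
bool demo_request_grants(struct demo_disk *disk, bool enable)
{
    assert(disk != NULL);
#ifdef LIBXL_HAVE_DISK_GRANT_USAGE
    disk->grant_usage = enable ? 1 : -1;
    return true;
#else
    (void)enable; /* older library: fall back to the built-in default */
    return false;
#endif
}
```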

Re: [PATCH] xen/arm: Properly clean update to init_ttbr and smp_up_cpu

2024-01-30 Thread Oleksandr Tyshchenko



On 30.01.24 19:29, Julien Grall wrote:

Hello Julien


> From: Julien Grall 
> 
> Recent rework to the secondary boot code modified how init_ttbr and
> smp_up_cpu are accessed. Rather than directly accessing them, we
> are using a pointer to them.
> 
> The helper clean_dcache() is expected to take the variable itself as its
> parameter and then clean its content. As we now pass a pointer to the
> variable, we clean the area storing the address rather than the content itself.
> 
> Switch to clean_dcache_va_range() to avoid casting the pointer.
> 
> Fixes: a5ed59e62c6f ("arm/mmu: Move init_ttbr to a new section .data.idmap")
> Fixes: 9a5114074b04 ("arm/smpboot: Move smp_up_cpu to a new section .data.idmap")
> 
> Reported-by: Oleksandr Tyshchenko 
> Signed-off-by: Julien Grall 


[on Renesas R-Car Gen3 SoC with 8 cores (Arm64)]
Tested-by: Oleksandr Tyshchenko 


> ---
>   xen/arch/arm/mmu/smpboot.c | 2 +-
>   xen/arch/arm/smpboot.c | 2 +-
>   2 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/xen/arch/arm/mmu/smpboot.c b/xen/arch/arm/mmu/smpboot.c
> index bc91fdfe3331..4ffc8254a44b 100644
> --- a/xen/arch/arm/mmu/smpboot.c
> +++ b/xen/arch/arm/mmu/smpboot.c
> @@ -88,7 +88,7 @@ static void set_init_ttbr(lpae_t *root)
>* init_ttbr will be accessed with the MMU off, so ensure the update
>* is visible by cleaning the cache.
>*/
> -clean_dcache(ptr);
> +clean_dcache_va_range(ptr, sizeof(uint64_t));
>   
>   unmap_domain_page(ptr);
>   }
> diff --git a/xen/arch/arm/smpboot.c b/xen/arch/arm/smpboot.c
> index 119bfa3160ad..a84e706d77da 100644
> --- a/xen/arch/arm/smpboot.c
> +++ b/xen/arch/arm/smpboot.c
> @@ -449,7 +449,7 @@ static void set_smp_up_cpu(unsigned long mpidr)
>* smp_up_cpu will be accessed with the MMU off, so ensure the update
>* is visible by cleaning the cache.
>*/
> -clean_dcache(ptr);
> +clean_dcache_va_range(ptr, sizeof(unsigned long));
>   
>   unmap_domain_page(ptr);
>
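
The difference described in the commit message can be reproduced with a small
userspace model. This is not Xen's clean_dcache(): demo_clean_obj() is a
hypothetical macro that, like the helper described above, operates on the
object it is given, and demo_clean_va_range() only records which range it was
asked to clean.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Records the range covered by the most recent "clean" request. */
struct clean_record {
    const void *addr;
    size_t size;
};

struct clean_record last_clean;

/* Range-style helper: the caller passes the address and size explicitly. */
void demo_clean_va_range(const void *p, size_t size)
{
    assert(p != NULL);
    last_clean.addr = p;
    last_clean.size = size;
}

/* Variable-style helper: cleans the object named by x, i.e. &x. */
#define demo_clean_obj(x) demo_clean_va_range(&(x), sizeof(x))
```

Given "uint64_t value; uint64_t *ptr = &value;", demo_clean_obj(ptr) records
&ptr, the storage of the pointer itself, whereas
demo_clean_va_range(ptr, sizeof(uint64_t)) records &value, which is the
behaviour the patch switches to.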

[PATCH v3] xen/gntdev: Fix the abuse of underlying struct page in DMA-buf import

2024-01-09 Thread Oleksandr Tyshchenko

From: Oleksandr Tyshchenko 

DO NOT access the underlying struct page of an sg table exported
by DMA-buf in dmabuf_imp_to_refs(), this is not allowed.
Please see drivers/dma-buf/dma-buf.c:mangle_sg_table() for details.

Fortunately, here (for special Xen device) we can avoid using
pages and calculate gfns directly from dma addresses provided by
the sg table.

Suggested-by: Daniel Vetter 
Signed-off-by: Oleksandr Tyshchenko 
Acked-by: Christian König 
Acked-by: Daniel Vetter 
Reviewed-by: Stefano Stabellini 
---
Please note, I didn't manage to test the patch against the latest master branch
on real HW (the patch was only build-tested there). The patch was tested on Arm64
guests using Linux v5.10.41 from the vendor's BSP; this is the environment where
running this use case is possible and to which I have access (Xen PV display
with zero-copy and backend domain as a buffer provider - be-alloc=1, so the dma-buf
import part was involved). It is a little bit old, but the dma-buf import code
in gntdev-dmabuf.c hasn't changed much since then; all context
remains almost the same according to my code inspection.

  V2:
   - add R-b and A-b
   - fix build warning noticed by kernel test robot by initializing
 "ret" in case of error
 https://lore.kernel.org/oe-kbuild-all/202401062122.it6zvlg0-...@intel.com/

  V3:
   - add A-b
   - add in-code comment
---
---
 drivers/xen/gntdev-dmabuf.c | 50 ++---
 1 file changed, 25 insertions(+), 25 deletions(-)

diff --git a/drivers/xen/gntdev-dmabuf.c b/drivers/xen/gntdev-dmabuf.c
index 4440e626b797..42adc2c1e06b 100644
--- a/drivers/xen/gntdev-dmabuf.c
+++ b/drivers/xen/gntdev-dmabuf.c
@@ -11,6 +11,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -50,7 +51,7 @@ struct gntdev_dmabuf {
 
/* Number of pages this buffer has. */
int nr_pages;
-   /* Pages of this buffer. */
+   /* Pages of this buffer (only for dma-buf export). */
struct page **pages;
 };
 
@@ -484,7 +485,7 @@ static int dmabuf_exp_from_refs(struct gntdev_priv *priv, 
int flags,
 /* DMA buffer import support. */
 
 static int
-dmabuf_imp_grant_foreign_access(struct page **pages, u32 *refs,
+dmabuf_imp_grant_foreign_access(unsigned long *gfns, u32 *refs,
int count, int domid)
 {
grant_ref_t priv_gref_head;
@@ -507,7 +508,7 @@ dmabuf_imp_grant_foreign_access(struct page **pages, u32 
*refs,
}
 
gnttab_grant_foreign_access_ref(cur_ref, domid,
-   xen_page_to_gfn(pages[i]), 0);
+   gfns[i], 0);
refs[i] = cur_ref;
}
 
@@ -529,7 +530,6 @@ static void dmabuf_imp_end_foreign_access(u32 *refs, int 
count)
 
 static void dmabuf_imp_free_storage(struct gntdev_dmabuf *gntdev_dmabuf)
 {
-   kfree(gntdev_dmabuf->pages);
kfree(gntdev_dmabuf->u.imp.refs);
kfree(gntdev_dmabuf);
 }
@@ -549,12 +549,6 @@ static struct gntdev_dmabuf *dmabuf_imp_alloc_storage(int 
count)
if (!gntdev_dmabuf->u.imp.refs)
goto fail;
 
-   gntdev_dmabuf->pages = kcalloc(count,
-  sizeof(gntdev_dmabuf->pages[0]),
-  GFP_KERNEL);
-   if (!gntdev_dmabuf->pages)
-   goto fail;
-
gntdev_dmabuf->nr_pages = count;
 
for (i = 0; i < count; i++)
@@ -576,7 +570,8 @@ dmabuf_imp_to_refs(struct gntdev_dmabuf_priv *priv, struct 
device *dev,
struct dma_buf *dma_buf;
struct dma_buf_attachment *attach;
struct sg_table *sgt;
-   struct sg_page_iter sg_iter;
+   struct sg_dma_page_iter sg_iter;
+   unsigned long *gfns;
int i;
 
dma_buf = dma_buf_get(fd);
@@ -624,26 +619,31 @@ dmabuf_imp_to_refs(struct gntdev_dmabuf_priv *priv, 
struct device *dev,
 
gntdev_dmabuf->u.imp.sgt = sgt;
 
-   /* Now convert sgt to array of pages and check for page validity. */
+   gfns = kcalloc(count, sizeof(*gfns), GFP_KERNEL);
+   if (!gfns) {
+   ret = ERR_PTR(-ENOMEM);
+   goto fail_unmap;
+   }
+
+   /*
+* Now convert sgt to array of gfns without accessing underlying pages.
+* It is not allowed to access the underlying struct page of an sg table
+* exported by DMA-buf, but since we deal with special Xen dma device 
here
+* (not a normal physical one) look at the dma addresses in the sg table
+* and then calculate gfns directly from them.
+*/
i = 0;
-   for_each_sgtable_page(sgt, &sg_iter, 0) {
-   struct page *page = sg_page_iter_page(&sg_iter);
-   /*
-* Check if page is valid: this can happen if we are given
-* a page from VRAM or other resources which are not backed
-

Re: [PATCH v2] xen/gntdev: Fix the abuse of underlying struct page in DMA-buf import

2024-01-09 Thread Oleksandr Tyshchenko



On 08.01.24 14:05, Daniel Vetter wrote:

Hello Daniel


> On Sun, 7 Jan 2024 at 11:35, Oleksandr Tyshchenko  wrote:
>>
>> From: Oleksandr Tyshchenko 
>>
>> DO NOT access the underlying struct page of an sg table exported
>> by DMA-buf in dmabuf_imp_to_refs(), this is not allowed.
>> Please see drivers/dma-buf/dma-buf.c:mangle_sg_table() for details.
>>
>> Fortunately, here (for special Xen device) we can avoid using
>> pages and calculate gfns directly from dma addresses provided by
>> the sg table.
>>
>> Suggested-by: Daniel Vetter 
>> Signed-off-by: Oleksandr Tyshchenko 
>> Acked-by: Christian König 
>> Reviewed-by: Stefano Stabellini 
>> ---
>> Please note, I didn't manage to test the patch against the latest master branch
>> on real HW (the patch was only build-tested there). The patch was tested on Arm64
>> guests using Linux v5.10.41 from the vendor's BSP; this is the environment where
>> running this use case is possible and to which I have access (Xen PV display
>> with zero-copy and backend domain as a buffer provider - be-alloc=1, so the dma-buf
>> import part was involved). It is a little bit old, but the dma-buf import code
>> in gntdev-dmabuf.c hasn't changed much since then; all context
>> remains almost the same according to my code inspection.
>>
>>v2:
>> - add R-b and A-b
>> - fix build warning noticed by kernel test robot by initializing
>>   "ret" in case of error
>>   https://lore.kernel.org/oe-kbuild-all/202401062122.it6zvlg0-...@intel.com/
>> ---
>> ---
>>   drivers/xen/gntdev-dmabuf.c | 44 -
>>   1 file changed, 19 insertions(+), 25 deletions(-)
>>
>> diff --git a/drivers/xen/gntdev-dmabuf.c b/drivers/xen/gntdev-dmabuf.c
>> index 4440e626b797..272c0ab01ef5 100644
>> --- a/drivers/xen/gntdev-dmabuf.c
>> +++ b/drivers/xen/gntdev-dmabuf.c
>> @@ -11,6 +11,7 @@
>>   #include 
>>   #include 
>>   #include 
>> +#include 
>>   #include 
>>   #include 
>>   #include 
>> @@ -50,7 +51,7 @@ struct gntdev_dmabuf {
>>
>>  /* Number of pages this buffer has. */
>>  int nr_pages;
>> -   /* Pages of this buffer. */
>> +   /* Pages of this buffer (only for dma-buf export). */
>>  struct page **pages;
>>   };
>>
>> @@ -484,7 +485,7 @@ static int dmabuf_exp_from_refs(struct gntdev_priv 
>> *priv, int flags,
>>   /* DMA buffer import support. */
>>
>>   static int
>> -dmabuf_imp_grant_foreign_access(struct page **pages, u32 *refs,
>> +dmabuf_imp_grant_foreign_access(unsigned long *gfns, u32 *refs,
>>  int count, int domid)
>>   {
>>  grant_ref_t priv_gref_head;
>> @@ -507,7 +508,7 @@ dmabuf_imp_grant_foreign_access(struct page **pages, u32 
>> *refs,
>>  }
>>
>>  gnttab_grant_foreign_access_ref(cur_ref, domid,
>> -   xen_page_to_gfn(pages[i]), 
>> 0);
>> +   gfns[i], 0);
>>  refs[i] = cur_ref;
>>  }
>>
>> @@ -529,7 +530,6 @@ static void dmabuf_imp_end_foreign_access(u32 *refs, int 
>> count)
>>
>>   static void dmabuf_imp_free_storage(struct gntdev_dmabuf *gntdev_dmabuf)
>>   {
>> -   kfree(gntdev_dmabuf->pages);
>>  kfree(gntdev_dmabuf->u.imp.refs);
>>  kfree(gntdev_dmabuf);
>>   }
>> @@ -549,12 +549,6 @@ static struct gntdev_dmabuf 
>> *dmabuf_imp_alloc_storage(int count)
>>  if (!gntdev_dmabuf->u.imp.refs)
>>  goto fail;
>>
>> -   gntdev_dmabuf->pages = kcalloc(count,
>> -  sizeof(gntdev_dmabuf->pages[0]),
>> -  GFP_KERNEL);
>> -   if (!gntdev_dmabuf->pages)
>> -   goto fail;
>> -
>>  gntdev_dmabuf->nr_pages = count;
>>
>>  for (i = 0; i < count; i++)
>> @@ -576,7 +570,8 @@ dmabuf_imp_to_refs(struct gntdev_dmabuf_priv *priv, 
>> struct device *dev,
>>  struct dma_buf *dma_buf;
>>  struct dma_buf_attachment *attach;
>>  struct sg_table *sgt;
>> -   struct sg_page_iter

[PATCH v2] xen/gntdev: Fix the abuse of underlying struct page in DMA-buf import

2024-01-07 Thread Oleksandr Tyshchenko

From: Oleksandr Tyshchenko 

DO NOT access the underlying struct page of an sg table exported
by DMA-buf in dmabuf_imp_to_refs(), this is not allowed.
Please see drivers/dma-buf/dma-buf.c:mangle_sg_table() for details.

Fortunately, here (for special Xen device) we can avoid using
pages and calculate gfns directly from dma addresses provided by
the sg table.

Suggested-by: Daniel Vetter 
Signed-off-by: Oleksandr Tyshchenko 
Acked-by: Christian König 
Reviewed-by: Stefano Stabellini 
---
Please note, I didn't manage to test the patch against the latest master branch
on real HW (the patch was only build-tested there). The patch was tested on Arm64
guests using Linux v5.10.41 from the vendor's BSP; this is the environment where
running this use case is possible and to which I have access (Xen PV display
with zero-copy and backend domain as a buffer provider - be-alloc=1, so the dma-buf
import part was involved). It is a little bit old, but the dma-buf import code
in gntdev-dmabuf.c hasn't changed much since then; all context
remains almost the same according to my code inspection.

  v2:
   - add R-b and A-b
   - fix build warning noticed by kernel test robot by initializing
 "ret" in case of error
 https://lore.kernel.org/oe-kbuild-all/202401062122.it6zvlg0-...@intel.com/
---
---
 drivers/xen/gntdev-dmabuf.c | 44 -
 1 file changed, 19 insertions(+), 25 deletions(-)

diff --git a/drivers/xen/gntdev-dmabuf.c b/drivers/xen/gntdev-dmabuf.c
index 4440e626b797..272c0ab01ef5 100644
--- a/drivers/xen/gntdev-dmabuf.c
+++ b/drivers/xen/gntdev-dmabuf.c
@@ -11,6 +11,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -50,7 +51,7 @@ struct gntdev_dmabuf {
 
/* Number of pages this buffer has. */
int nr_pages;
-   /* Pages of this buffer. */
+   /* Pages of this buffer (only for dma-buf export). */
struct page **pages;
 };
 
@@ -484,7 +485,7 @@ static int dmabuf_exp_from_refs(struct gntdev_priv *priv, 
int flags,
 /* DMA buffer import support. */
 
 static int
-dmabuf_imp_grant_foreign_access(struct page **pages, u32 *refs,
+dmabuf_imp_grant_foreign_access(unsigned long *gfns, u32 *refs,
int count, int domid)
 {
grant_ref_t priv_gref_head;
@@ -507,7 +508,7 @@ dmabuf_imp_grant_foreign_access(struct page **pages, u32 
*refs,
}
 
gnttab_grant_foreign_access_ref(cur_ref, domid,
-   xen_page_to_gfn(pages[i]), 0);
+   gfns[i], 0);
refs[i] = cur_ref;
}
 
@@ -529,7 +530,6 @@ static void dmabuf_imp_end_foreign_access(u32 *refs, int 
count)
 
 static void dmabuf_imp_free_storage(struct gntdev_dmabuf *gntdev_dmabuf)
 {
-   kfree(gntdev_dmabuf->pages);
kfree(gntdev_dmabuf->u.imp.refs);
kfree(gntdev_dmabuf);
 }
@@ -549,12 +549,6 @@ static struct gntdev_dmabuf *dmabuf_imp_alloc_storage(int 
count)
if (!gntdev_dmabuf->u.imp.refs)
goto fail;
 
-   gntdev_dmabuf->pages = kcalloc(count,
-  sizeof(gntdev_dmabuf->pages[0]),
-  GFP_KERNEL);
-   if (!gntdev_dmabuf->pages)
-   goto fail;
-
gntdev_dmabuf->nr_pages = count;
 
for (i = 0; i < count; i++)
@@ -576,7 +570,8 @@ dmabuf_imp_to_refs(struct gntdev_dmabuf_priv *priv, struct 
device *dev,
struct dma_buf *dma_buf;
struct dma_buf_attachment *attach;
struct sg_table *sgt;
-   struct sg_page_iter sg_iter;
+   struct sg_dma_page_iter sg_iter;
+   unsigned long *gfns;
int i;
 
dma_buf = dma_buf_get(fd);
@@ -624,26 +619,25 @@ dmabuf_imp_to_refs(struct gntdev_dmabuf_priv *priv, 
struct device *dev,
 
gntdev_dmabuf->u.imp.sgt = sgt;
 
-   /* Now convert sgt to array of pages and check for page validity. */
+   gfns = kcalloc(count, sizeof(*gfns), GFP_KERNEL);
+   if (!gfns) {
+   ret = ERR_PTR(-ENOMEM);
+   goto fail_unmap;
+   }
+
+   /* Now convert sgt to array of gfns without accessing underlying pages. 
*/
i = 0;
-   for_each_sgtable_page(sgt, &sg_iter, 0) {
-   struct page *page = sg_page_iter_page(&sg_iter);
-   /*
-* Check if page is valid: this can happen if we are given
-* a page from VRAM or other resources which are not backed
-* by a struct page.
-*/
-   if (!pfn_valid(page_to_pfn(page))) {
-   ret = ERR_PTR(-EINVAL);
-   goto fail_unmap;
-   }
+   for_each_sgtable_dma_page(sgt, &sg_iter, 0) {
+   dma_addr_t addr = sg_page_iter_dma_address(&sg_iter);
+   unsigned long pfn = bfn_to_pfn(XEN_PFN_DOWN

[PATCH] xen/gntdev: Fix the abuse of underlying struct page in DMA-buf import

2024-01-04 Thread Oleksandr Tyshchenko

From: Oleksandr Tyshchenko 

DO NOT access the underlying struct page of an sg table exported
by DMA-buf in dmabuf_imp_to_refs(), this is not allowed.
Please see drivers/dma-buf/dma-buf.c:mangle_sg_table() for details.

Fortunately, here (for special Xen device) we can avoid using
pages and calculate gfns directly from dma addresses provided by
the sg table.

Suggested-by: Daniel Vetter 
Signed-off-by: Oleksandr Tyshchenko 
---
Please note, I didn't manage to test the patch against the latest master branch
on real HW (the patch was only build-tested there). The patch was tested on Arm64
guests using Linux v5.10.41 from the vendor's BSP; this is the environment where
running this use case is possible and to which I have access (Xen PV display
with zero-copy and backend domain as a buffer provider - be-alloc=1, so the dma-buf
import part was involved). It is a little bit old, but the dma-buf import code
in gntdev-dmabuf.c hasn't changed much since then; all context
remains almost the same according to my code inspection.
---
---
 drivers/xen/gntdev-dmabuf.c | 42 +++--
 1 file changed, 17 insertions(+), 25 deletions(-)

diff --git a/drivers/xen/gntdev-dmabuf.c b/drivers/xen/gntdev-dmabuf.c
index 4440e626b797..0dde49fca9a5 100644
--- a/drivers/xen/gntdev-dmabuf.c
+++ b/drivers/xen/gntdev-dmabuf.c
@@ -11,6 +11,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -50,7 +51,7 @@ struct gntdev_dmabuf {
 
/* Number of pages this buffer has. */
int nr_pages;
-   /* Pages of this buffer. */
+   /* Pages of this buffer (only for dma-buf export). */
struct page **pages;
 };
 
@@ -484,7 +485,7 @@ static int dmabuf_exp_from_refs(struct gntdev_priv *priv, 
int flags,
 /* DMA buffer import support. */
 
 static int
-dmabuf_imp_grant_foreign_access(struct page **pages, u32 *refs,
+dmabuf_imp_grant_foreign_access(unsigned long *gfns, u32 *refs,
int count, int domid)
 {
grant_ref_t priv_gref_head;
@@ -507,7 +508,7 @@ dmabuf_imp_grant_foreign_access(struct page **pages, u32 
*refs,
}
 
gnttab_grant_foreign_access_ref(cur_ref, domid,
-   xen_page_to_gfn(pages[i]), 0);
+   gfns[i], 0);
refs[i] = cur_ref;
}
 
@@ -529,7 +530,6 @@ static void dmabuf_imp_end_foreign_access(u32 *refs, int 
count)
 
 static void dmabuf_imp_free_storage(struct gntdev_dmabuf *gntdev_dmabuf)
 {
-   kfree(gntdev_dmabuf->pages);
kfree(gntdev_dmabuf->u.imp.refs);
kfree(gntdev_dmabuf);
 }
@@ -549,12 +549,6 @@ static struct gntdev_dmabuf *dmabuf_imp_alloc_storage(int 
count)
if (!gntdev_dmabuf->u.imp.refs)
goto fail;
 
-   gntdev_dmabuf->pages = kcalloc(count,
-  sizeof(gntdev_dmabuf->pages[0]),
-  GFP_KERNEL);
-   if (!gntdev_dmabuf->pages)
-   goto fail;
-
gntdev_dmabuf->nr_pages = count;
 
for (i = 0; i < count; i++)
@@ -576,7 +570,8 @@ dmabuf_imp_to_refs(struct gntdev_dmabuf_priv *priv, struct 
device *dev,
struct dma_buf *dma_buf;
struct dma_buf_attachment *attach;
struct sg_table *sgt;
-   struct sg_page_iter sg_iter;
+   struct sg_dma_page_iter sg_iter;
+   unsigned long *gfns;
int i;
 
dma_buf = dma_buf_get(fd);
@@ -624,26 +619,23 @@ dmabuf_imp_to_refs(struct gntdev_dmabuf_priv *priv, 
struct device *dev,
 
gntdev_dmabuf->u.imp.sgt = sgt;
 
-   /* Now convert sgt to array of pages and check for page validity. */
+   gfns = kcalloc(count, sizeof(*gfns), GFP_KERNEL);
+   if (!gfns)
+   goto fail_unmap;
+
+   /* Now convert sgt to array of gfns without accessing underlying pages. 
*/
i = 0;
-   for_each_sgtable_page(sgt, &sg_iter, 0) {
-   struct page *page = sg_page_iter_page(&sg_iter);
-   /*
-* Check if page is valid: this can happen if we are given
-* a page from VRAM or other resources which are not backed
-* by a struct page.
-*/
-   if (!pfn_valid(page_to_pfn(page))) {
-   ret = ERR_PTR(-EINVAL);
-   goto fail_unmap;
-   }
+   for_each_sgtable_dma_page(sgt, &sg_iter, 0) {
+   dma_addr_t addr = sg_page_iter_dma_address(&sg_iter);
+   unsigned long pfn = bfn_to_pfn(XEN_PFN_DOWN(dma_to_phys(dev, 
addr)));
 
-   gntdev_dmabuf->pages[i++] = page;
+   gfns[i++] = pfn_to_gfn(pfn);
}
 
-   ret = ERR_PTR(dmabuf_imp_grant_foreign_access(gntdev_dmabuf->pages,
+   ret = ERR_PTR(dmabuf_imp_grant_foreign_access(gfns,
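
The address arithmetic behind the new loop can be sketched outside the kernel.
This is only a model under stated assumptions: 4 KiB Xen pages, and an identity
mapping where the kernel code uses dma_to_phys(), bfn_to_pfn() and
pfn_to_gfn(). struct demo_segment and demo_segments_to_gfns() are hypothetical
names and do not exist in gntdev-dmabuf.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_XEN_PAGE_SHIFT 12 /* assumption: 4 KiB Xen pages */
#define DEMO_XEN_PAGE_SIZE  ((uint64_t)1 << DEMO_XEN_PAGE_SHIFT)

/* One DMA-mapped chunk, as an sg table entry would describe it. */
struct demo_segment {
    uint64_t dma_addr;
    uint64_t len;
};

/*
 * Derive one frame number per page from the DMA addresses alone, without
 * touching any page structure. Returns the number of frames written, or
 * -1 if the output array is too small.
 */
long demo_segments_to_gfns(const struct demo_segment *segs, size_t nsegs,
                           uint64_t *gfns, size_t max)
{
    size_t n = 0;

    assert(segs != NULL && gfns != NULL);

    for (size_t s = 0; s < nsegs; s++) {
        uint64_t first = segs[s].dma_addr >> DEMO_XEN_PAGE_SHIFT;
        uint64_t off = segs[s].dma_addr & (DEMO_XEN_PAGE_SIZE - 1);
        uint64_t npages = (off + segs[s].len + DEMO_XEN_PAGE_SIZE - 1)
                          >> DEMO_XEN_PAGE_SHIFT;

        for (uint64_t p = 0; p < npages; p++) {
            if (n == max)
                return -1;
            gfns[n++] = first + p; /* identity bfn -> pfn -> gfn assumed */
        }
    }
    return (long)n;
}
```

For example, segments {0x10000, 0x2000} and {0x40000, 0x1000} yield the frames
0x10, 0x11 and 0x40.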

Re: [RFC PATCH 2/6] xen/public: arch-arm: reserve resources for virtio-pci

2023-11-17 Thread Oleksandr Tyshchenko



On 17.11.23 05:31, Stewart Hildebrand wrote:

Hello Stewart

[answering only for the virtio-pci bits; as for vPCI, I am only familiar with 
the code responsible for trapping config space accesses]

[snip]

>>
>>
>> Let me start by saying that if we can get away with it, I think that a
>> single PCI Root Complex in Xen would be best because it requires less
>> complexity. Why emulate 2/3 PCI Root Complexes if we can emulate only
>> one?
>>
>> Stewart, you are deep into vPCI, what's your thinking?
> 
> First allow me to explain the moving pieces in a bit more detail (skip ahead to 
> "Back to the question: " if you don't want to be bored with the details). I 
> played around with this series, and I passed through a PCI device (with vPCI) 
> and enabled virtio-pci:
> 
> virtio = [ 
> "type=virtio,device,transport=pci,bdf=:00:00.0,backend_type=qemu" ]
> device_model_args = [ "-device", "virtio-serial-pci" ]
> pci = [ "01:00.0" ]
> 
> Indeed we get two root complexes (2 ECAM ranges, 2 sets of interrupts, etc.) 
> from the domU point of view:
> 
>  pcie@1000 {
>  compatible = "pci-host-ecam-generic";
>  device_type = "pci";
>  reg = <0x00 0x1000 0x00 0x1000>;
>  bus-range = <0x00 0xff>;
>  #address-cells = <0x03>;
>  #size-cells = <0x02>;
>  status = "okay";
>  ranges = <0x200 0x00 0x2300 0x00 0x2300 0x00 0x1000 
> 0x4200 0x01 0x00 0x01 0x00 0x01 0x00>;
>  #interrupt-cells = <0x01>;
>  interrupt-map = <0x00 0x00 0x00 0x01 0xfde8 0x00 0x74 0x04>;
>  interrupt-map-mask = <0x00 0x00 0x00 0x07>;


I am wondering how you got interrupt-map here. AFAIR the upstream toolstack 
doesn't add that property to the vPCI DT node.

>  };
> 
>  pcie@3300 {
>  compatible = "pci-host-ecam-generic";
>  device_type = "pci";
>  reg = <0x00 0x3300 0x00 0x20>;
>  bus-range = <0x00 0x01>;
>  #address-cells = <0x03>;
>  #size-cells = <0x02>;
>  status = "okay";
>  ranges = <0x200 0x00 0x3400 0x00 0x3400 0x00 0x80 
> 0x4200 0x00 0x3a00 0x00 0x3a00 0x00 0x80>;
>  dma-coherent;
>  #interrupt-cells = <0x01>;
>  interrupt-map = <0x00 0x00 0x00 0x01 0xfde8 0x00 0x0c 0x04 0x00 0x00 
> 0x00 0x02 0xfde8 0x00 0x0d 0x04 0x00 0x00 0x00 0x03 0xfde8 0x00 0x0e 0x04 
> 0x00 0x00 0x00 0x04 0xfde8 0x00 0x0f 0x04 0x800 0x00 0x00 0x01 0xfde8 0x00 
> 0x0d 0x04 0x800 0x00 0x00 0x02 0xfde8 0x00 0x0e 0x04 0x800 0x00 0x00 0x03 
> 0xfde8 0x00 0x0f 0x04 0x800 0x00 0x00 0x04 0xfde8 0x00 0x0c 0x04 0x1000 0x00 
> 0x00 0x01 0xfde8 0x00 0x0e 0x04 0x1000 0x00 0x00 0x02 0xfde8 0x00 0x0f 0x04 
> 0x1000 0x00 0x00 0x03 0xfde8 0x00 0x0c 0x04 0x1000 0x00 0x00 0x04 0xfde8 0x00 
> 0x0d 0x04 0x1800 0x00 0x00 0x01 0xfde8 0x00 0x0f 0x04 0x1800 0x00 0x00 0x02 
> 0xfde8 0x00 0x0c 0x04 0x1800 0x00 0x00 0x03 0xfde8 0x00 0x0d 0x04 0x1800 0x00 
> 0x00 0x04 0xfde8 0x00 0x0e 0x04>;
>  interrupt-map-mask = <0x1800 0x00 0x00 0x07>;


That is a correct dump.

BTW, if you added "grant_usage=1" (it is disabled by default for dom0) 
to the virtio configuration, you would get an iommu-map property here as 
well [1]. This is another point to think about when considering the 
combined approach (single PCI Host bridge node -> single virtual root 
complex). I guess a usual PCI device doesn't want grant-based DMA 
addresses, correct? If so, it shouldn't be specified in the property.


>  };
> 
> Xen vPCI doesn't currently expose a host bridge (i.e. a device with base 
> class 0x06). As an aside, we may eventually want to expose a virtual/emulated 
> host bridge in vPCI, because Linux's x86 PCI probe expects one [0].
> 
> Qemu exposes an emulated host bridge, along with any requested emulated 
> devices.
> 
> Running lspci -v in the domU yields the following:
> 
> :00:00.0 Network controller: Ralink corp. RT2790 Wireless 802.11n 1T/2R 
> PCIe
>  Subsystem: ASUSTeK Computer Inc. RT2790 Wireless 802.11n 1T/2R PCIe
>  Flags: bus master, fast devsel, latency 0, IRQ 13
>  Memory at 2300 (32-bit, non-prefetchable) [size=64K]
>  Capabilities: [50] MSI: Enable- Count=1/128 Maskable- 64bit+
>  Kernel driver in use: rt2800pci
> 
> 0001:00:00.0 Host bridge: Red Hat, Inc. QEMU PCIe Host bridge
>  Subsystem: Red Hat, Inc. QEMU PCIe Host bridge
>  Flags: fast devsel
> 
> 0001:00:01.0 Communication controller: Red Hat, Inc. Virtio console
>  Subsystem: Red Hat, Inc. Virtio console
>  Flags: bus master, fast devsel, latency 0, IRQ 14
>  Memory at 3a00 (64-bit, prefetchable) [size=16K]
>  Capabilities: [84] Vendor Specific Information: VirtIO: 
>  Capabilities: [70] Vendor Specific Information: VirtIO: Notify
>  Capabilities: [60] Vendor Specific Information: VirtIO: DeviceCfg
>  Capabilities: [50] Vendor Specific Information: VirtIO: ISR
>  Capab

Re: [RFC PATCH 2/6] xen/public: arch-arm: reserve resources for virtio-pci

2023-11-15 Thread Oleksandr Tyshchenko



On 15.11.23 20:33, Julien Grall wrote:
> Hi Oleksandr,

Hello Julien


> 
> On 15/11/2023 18:14, Oleksandr Tyshchenko wrote:
>> On 15.11.23 19:31, Julien Grall wrote:
>>> On 15/11/2023 16:51, Oleksandr Tyshchenko wrote:
>>>> On 15.11.23 14:33, Julien Grall wrote:
>>>>> Thanks for adding support for virtio-pci in Xen. I have some 
>>>>> questions.
>>>>>
>>>>> On 15/11/2023 11:26, Sergiy Kibrik wrote:
>>>>>> From: Oleksandr Tyshchenko 
>>>>>>
>>>>>> In order to enable more use-cases such as having multiple
>>>>>> device-models (Qemu) running in different backend domains which 
>>>>>> provide
>>>>>> virtio-pci devices for the same guest, we allocate and expose one
>>>>>> PCI host bridge for every virtio backend domain for that guest.
>>>>>
>>>>> OOI, why do you need to expose one PCI host bridge for every 
>>>>> stubdomain?
>>>>>
>>>>> In fact looking at the next patch, it seems you are handling some 
>>>>> of the
>>>>> hostbridge request in Xen. This adds a bit more confusion.
>>>>>
>>>>> I was expecting the virtual PCI device would be in the vPCI and each
>>>>> Device emulator would advertise which BDF they are covering.
>>>>
>>>>
>>>> This patch series only covers use-cases where the device emulator
>>>> handles the *entire* PCI Host bridge and PCI (virtio-pci) devices 
>>>> behind
>>>> it (i.e. Qemu). Also this patch series doesn't touch vPCI/PCI
>>>> pass-through resources, handling, accounting, nothing.
>>>
>>> I understood you want one Device Emulator to handle the entire PCI
>>> host bridge. But...
>>>
>>>   From the
>>>> hypervisor we only need a help to intercept the config space accesses
>>>> happen in a range [GUEST_VIRTIO_PCI_ECAM_BASE ...
>>>> GUEST_VIRTIO_PCI_ECAM_BASE + GUEST_VIRTIO_PCI_TOTAL_ECAM_SIZE] and
>>>> forward them to the linked device emulator (if any), that's all.
>>>
>>> ... I really don't see why you need to add code in Xen to trap the
>>> region. If QEMU is dealing with the hostbridge, then it should be able
>>> to register the MMIO region and then do the translation.
>>
>>
>> Hmm, sounds surprising I would say. Are you saying that unmodified Qemu
>> will work if we drop #5?
> 
> I don't know if an unmodified QEMU will work. My point is I don't view 
> the patch in Xen necessary. You should be able to tell QEMU "here is the 
> ECAM region, please emulate an hostbridge". QEMU will then register the 
> region to Xen and all the accesses will be forwarded.
>
> In the future we may need a patch similar to #5 if we want to have 
> multiple DM using the same hostbridge. But this is a different 
> discussion and the patch would need some rework.


ok

> 
> The ioreq.c code was always meant to be generic and is always for every 
> emulated MMIO. So you want to limit any change in it. Checking the MMIO 
> region belongs to the hostbridge and doing the translation is IMHO not a 
> good idea to do in ioreq.c. Instead you want to do the conversion from 
> MMIO to (sbdf, offset) in virtio_pci_mmio{read, write}(). So the job of 
> ioreq.c is to simply find the correct Device Model and forward it.



Are you referring to virtio_pci_ioreq_server_get_addr() called from 
arch_ioreq_server_get_type_addr()? If so, and if I am not mistaken, x86 
also checks which PCI device is targeted there.

But, I am not against the suggestion, I agree with it.
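
For illustration only, the MMIO to (sbdf, offset) conversion discussed above could look roughly like the sketch below. It assumes the standard ECAM layout (1 MB per bus, 32 KB per device, 4 KB per function) and 2 buses per host bridge as in this series. The names ecam_decode(), struct ecam_target and the ECAM_* macros are hypothetical and are not taken from the patches.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout, for illustration only. */
#define ECAM_BUSES_PER_BRIDGE 2u
#define ECAM_BUS_SHIFT  20u   /* 1 MB of config space per bus */
#define ECAM_DEV_SHIFT  15u   /* 32 KB per device */
#define ECAM_FN_SHIFT   12u   /* 4 KB per function */
#define ECAM_BRIDGE_SIZE ((uint64_t)ECAM_BUSES_PER_BRIDGE << ECAM_BUS_SHIFT)

struct ecam_target {
    unsigned int host_id; /* which emulated host bridge is addressed */
    uint8_t bus, dev, fn;
    uint16_t reg;         /* offset into the function's 4 KB config space */
};

/* Split an offset from the start of the whole ECAM window into its parts. */
static struct ecam_target ecam_decode(uint64_t offset)
{
    struct ecam_target t;
    uint64_t in_bridge = offset % ECAM_BRIDGE_SIZE;

    t.host_id = (unsigned int)(offset / ECAM_BRIDGE_SIZE);
    t.bus = (uint8_t)(in_bridge >> ECAM_BUS_SHIFT);
    t.dev = (uint8_t)((in_bridge >> ECAM_DEV_SHIFT) & 0x1f);
    t.fn  = (uint8_t)((in_bridge >> ECAM_FN_SHIFT) & 0x7);
    t.reg = (uint16_t)(in_bridge & 0xfff);
    return t;
}
```

With a decode step like this living in the virtio-pci MMIO handlers, ioreq.c would only have to pick the Device Model and forward the request.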


> 
> I also don't see why the feature is gated by has_vpci(). They are two 
> distinct features (at least in your current model).

Yes, you are correct. In #5 the virtio-pci MMIO handlers are still 
registered in domain_vpci_init() (which is gated by has_vpci()), etc.


> 
> Cheers,
>

Re: [RFC PATCH 2/6] xen/public: arch-arm: reserve resources for virtio-pci

2023-11-15 Thread Oleksandr Tyshchenko



On 15.11.23 19:31, Julien Grall wrote:
> Hi Oleksandr,


Hello Julien

> 
> On 15/11/2023 16:51, Oleksandr Tyshchenko wrote:
>>
>>
>> On 15.11.23 14:33, Julien Grall wrote:
>>> Hi,
>>
>>
>> Hello Julien
>>
>> Let me please try to explain some bits.
>>
>>
>>>
>>> Thanks for adding support for virtio-pci in Xen. I have some questions.
>>>
>>> On 15/11/2023 11:26, Sergiy Kibrik wrote:
>>>> From: Oleksandr Tyshchenko 
>>>>
>>>> In order to enable more use-cases such as having multiple
>>>> device-models (Qemu) running in different backend domains which provide
>>>> virtio-pci devices for the same guest, we allocate and expose one
>>>> PCI host bridge for every virtio backend domain for that guest.
>>>
>>> OOI, why do you need to expose one PCI host bridge for every stubdomain?
>>>
>>> In fact looking at the next patch, it seems you are handling some of the
>>> hostbridge request in Xen. This adds a bit more confusion.
>>>
>>> I was expecting the virtual PCI device would be in the vPCI and each
>>> Device emulator would advertise which BDF they are covering.
>>
>>
>> This patch series only covers use-cases where the device emulator
>> handles the *entire* PCI Host bridge and PCI (virtio-pci) devices behind
>> it (i.e. Qemu). Also this patch series doesn't touch vPCI/PCI
>> pass-through resources, handling, accounting, nothing. 
> 
> I understood you want one Device Emulator to handle the entire PCI 
> host bridge. But...
> 
>  From the
>> hypervisor we only need a help to intercept the config space accesses
>> happen in a range [GUEST_VIRTIO_PCI_ECAM_BASE ...
>> GUEST_VIRTIO_PCI_ECAM_BASE + GUEST_VIRTIO_PCI_TOTAL_ECAM_SIZE] and
>> forward them to the linked device emulator (if any), that's all.
> 
> ... I really don't see why you need to add code in Xen to trap the 
> region. If QEMU is dealing with the hostbridge, then it should be able 
> to register the MMIO region and then do the translation.


Hmm, sounds surprising I would say. Are you saying that unmodified Qemu 
will work if we drop #5? I think this wants to be re-checked (@Sergiy 
can you please investigate?). If indeed so, then #5 will of course be 
dropped from that series (I would say, postponed until there are more 
use-cases).



> 
>>
>> It is not possible (with current series) to run device emulators what
>> emulate only separate PCI (virtio-pci) devices. For it to be possible, I
>> think, much more changes are required than current patch series does.
>> There at least should be special PCI Host bridge emulation in Xen (or
>> reuse vPCI) for the integration. Also Xen should be in charge of forming
>> resulting PCI interrupt based on each PCI device level signaling (if we
>> use legacy interrupts), some kind of x86's XEN_DMOP_set_pci_intx_level,
>> etc. Please note, I am not saying this is not possible in general,
>> likely it is possible, but initial patch series doesn't cover these
>> use-cases)
>>
>> We expose one PCI host bridge per virtio backend domain. This is a
>> separate PCI host bridge to combine all virtio-pci devices running in
>> the same backend domain (in the same device emulator currently).
>> The examples:
>> - if only one domain runs Qemu which servers virtio-blk, virtio-net,
>> virtio-console devices for DomU - only single PCI Host bridge will be
>> exposed for DomU
>> - if we add another domain to run Qemu to serve additionally virtio-gpu,
>> virtio-input and virtio-snd for the *same* DomU - we expose second PCI
>> Host bridge for DomU
>>
>> I am afraid, we cannot end up exposing only single PCI Host bridge with
>> current model (if we use device emulators running in different domains
>> that handles the *entire* PCI Host bridges), this won't work.
> 
> That makes sense and it is fine. But see above, I think only the #2 is 
> necessary for the hypervisor. Patch #5 should not be necessary at all.


Good, it should be re-checked without #5 for sure.


> 
> [...]
> 
>>>> Signed-off-by: Oleksandr Tyshchenko 
>>>> Signed-off-by: Sergiy Kibrik 
>>>> ---
>>>>    xen/include/public/arch-arm.h | 21 +
>>>>    1 file changed, 21 insertions(+)
>>>>
>>>> diff --git a/xen/include/public/arch-arm.h
>>>> b/xen/include/public/arch-arm.h
>>>> index a25e87dbda..e6c9cd5335 100644
>>>> --- a/xen/include/public/arch-arm.h
>>>> +++ b/xen/include/public/arch-a

Re: [RFC PATCH 2/6] xen/public: arch-arm: reserve resources for virtio-pci

2023-11-15 Thread Oleksandr Tyshchenko

On 15.11.23 14:33, Julien Grall wrote:
> Hi,

Hello Julien

Let me please try to explain some bits.

> 
> Thanks for adding support for virtio-pci in Xen. I have some questions.
> 
> On 15/11/2023 11:26, Sergiy Kibrik wrote:
>> From: Oleksandr Tyshchenko 
>>
>> In order to enable more use-cases such as having multiple
>> device-models (Qemu) running in different backend domains which provide
>> virtio-pci devices for the same guest, we allocate and expose one
>> PCI host bridge for every virtio backend domain for that guest.
> 
> OOI, why do you need to expose one PCI host bridge for every stubdomain?
> 
> In fact looking at the next patch, it seems you are handling some of the 
> hostbridge request in Xen. This adds a bit more confusion.
> 
> I was expecting the virtual PCI device would be in the vPCI and each 
> Device emulator would advertise which BDF they are covering.

This patch series only covers use-cases where the device emulator 
handles the *entire* PCI Host bridge and PCI (virtio-pci) devices behind 
it (i.e. Qemu). Also this patch series doesn't touch vPCI/PCI 
pass-through resources, handling, accounting, nothing. From the 
hypervisor we only need help to intercept the config space accesses 
that happen in the range [GUEST_VIRTIO_PCI_ECAM_BASE ... 
GUEST_VIRTIO_PCI_ECAM_BASE + GUEST_VIRTIO_PCI_TOTAL_ECAM_SIZE] and 
forward them to the linked device emulator (if any), that's all.

It is not possible (with the current series) to run device emulators that
emulate only separate PCI (virtio-pci) devices. For it to be possible, I 
think, many more changes are required than the current patch series makes. 
There should at least be a special PCI Host bridge emulation in Xen (or 
reuse of vPCI) for the integration. Also Xen should be in charge of forming 
the resulting PCI interrupt based on each PCI device's level signaling (if 
we use legacy interrupts), some kind of x86's XEN_DMOP_set_pci_intx_level, 
etc. Please note, I am not saying this is not possible in general, 
likely it is possible, but the initial patch series doesn't cover these 
use-cases.

We expose one PCI host bridge per virtio backend domain. This is a 
separate PCI host bridge to combine all virtio-pci devices running in 
the same backend domain (in the same device emulator currently).
The examples:
- if only one domain runs Qemu which serves virtio-blk, virtio-net, 
virtio-console devices for DomU - only a single PCI Host bridge will be 
exposed for DomU
- if we add another domain to run Qemu to additionally serve virtio-gpu, 
virtio-input and virtio-snd for the *same* DomU - we expose a second PCI 
Host bridge for DomU

I am afraid we cannot end up exposing only a single PCI Host bridge with 
the current model (if we use device emulators running in different domains 
that handle the *entire* PCI Host bridges), this won't work.
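
As a rough sketch of the "one host bridge per backend domain" rule described above, the bookkeeping could look like the code below. This is not code from the series or the toolstack. The limit of 8 comes from the patch description, while host_table, host_id_for_backend() and the DOMID_NONE marker are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_VIRTIO_PCI_HOSTS 8            /* mirrors the 8-bridge limit */
#define DOMID_NONE ((uint16_t)0xffff)     /* hypothetical "slot is free" marker */

struct host_table {
    uint16_t backend[MAX_VIRTIO_PCI_HOSTS]; /* backend domid owning each bridge */
};

static void host_table_init(struct host_table *t)
{
    for (int i = 0; i < MAX_VIRTIO_PCI_HOSTS; i++)
        t->backend[i] = DOMID_NONE;
}

/*
 * Return the host_id used by 'backend_domid', allocating the lowest free
 * slot on first use. Returns -1 when all bridges are taken.
 * The caller must not pass DOMID_NONE as a domain ID.
 */
static int host_id_for_backend(struct host_table *t, uint16_t backend_domid)
{
    int free_slot = -1;

    for (int i = 0; i < MAX_VIRTIO_PCI_HOSTS; i++) {
        if (t->backend[i] == backend_domid)
            return i;                     /* same backend reuses its bridge */
        if (t->backend[i] == DOMID_NONE && free_slot < 0)
            free_slot = i;
    }
    if (free_slot >= 0)
        t->backend[free_slot] = backend_domid;
    return free_slot;
}
```

All virtio-pci devices served from the same backend domain end up behind the same host_id, and a second backend domain gets a second bridge, which matches the two examples given above.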

Please note, I might be missing some bits since this is enabling work.

> 
>>
>> For that purpose, reserve separate virtio-pci resources (memory and 
>> SPI range
>> for Legacy PCI interrupts) up to 8 possible PCI hosts (to be aligned with
> 
> Do you mean host bridge rather than host?

yes

> 
>> MAX_NR_IOREQ_SERVERS) and allocate a host per backend domain. The PCI 
>> host
>> details including its host_id to be written to dedicated Xenstore node 
>> for
>> the device-model to retrieve.
> 
> So with this approach, who decides which BDF will be used for a given 
> virtio PCI device?

The toolstack (via the configuration file).

> 
>>
>> Signed-off-by: Oleksandr Tyshchenko 
>> Signed-off-by: Sergiy Kibrik 
>> ---
>>   xen/include/public/arch-arm.h | 21 +
>>   1 file changed, 21 insertions(+)
>>
>> diff --git a/xen/include/public/arch-arm.h 
>> b/xen/include/public/arch-arm.h
>> index a25e87dbda..e6c9cd5335 100644
>> --- a/xen/include/public/arch-arm.h
>> +++ b/xen/include/public/arch-arm.h
>> @@ -466,6 +466,19 @@ typedef uint64_t xen_callback_t;
>>   #define GUEST_VPCI_MEM_ADDR xen_mk_ullong(0x2300)
>>   #define GUEST_VPCI_MEM_SIZE xen_mk_ullong(0x1000)
>> +/*
>> + * 16 MB is reserved for virtio-pci configuration space based on 
>> calculation
>> + * 8 bridges * 2 buses x 32 devices x 8 functions x 4 KB = 16 MB
> 
> Can you explain how you decided the "2"?

Good question. We have limited free space available in the memory layout 
(we had difficulties finding suitable holes), and we don't expect a 
lot of virtio-pci devices, so the "256" used by vPCI would be too much. It 
was decided to reduce it significantly, but to select the maximum that 
fits into the free space; with "2" buses we still fit into the chosen holes.
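
The arithmetic behind the 16 MB figure can be checked with a small sketch like the one below. The per-bus, per-device and per-function sizes are the standard ECAM values; ecam_bytes() and the PCI_* macro names are hypothetical and only used for this illustration.

```c
#include <assert.h>
#include <stdint.h>

#define PCI_DEVS_PER_BUS    32u
#define PCI_FNS_PER_DEV      8u
#define PCI_CFG_SPACE_SIZE   4096u  /* 4 KB of ECAM config space per function */

/* Bytes of ECAM space needed for the given number of bridges and buses. */
static uint64_t ecam_bytes(unsigned int bridges, unsigned int buses_per_bridge)
{
    return (uint64_t)bridges * buses_per_bridge *
           PCI_DEVS_PER_BUS * PCI_FNS_PER_DEV * PCI_CFG_SPACE_SIZE;
}
```

ecam_bytes(8, 2) gives 16 MB, as in the comment under review, whereas ecam_bytes(8, 256) would need 2 GB, which is why the full 256 buses per bridge do not fit into the available holes.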

> 
>> + */
>> +#define

Re: [PATCH 5/7] xen/events: drop xen_allocate_irqs_dynamic()

2023-11-14 Thread Oleksandr Tyshchenko



On 14.11.23 10:35, Juergen Gross wrote:


Hello Juergen


> On 14.11.23 09:20, Oleksandr Tyshchenko wrote:
>>
>>
>> On 16.10.23 09:28, Juergen Gross wrote:
>>
>>
>> Hello Juergen
>>
>>> Instead of having a common function for allocating a single IRQ or a
>>> consecutive number of IRQs, split up the functionality into the callers
>>> of xen_allocate_irqs_dynamic().
>>>
>>> This allows to handle any allocation error in xen_irq_init() gracefully
>>> instead of panicking the system. Let xen_irq_init() return the irq_info
>>> pointer or NULL in case of an allocation error.
>>>
>>> Additionally set the IRQ into irq_info already at allocation time, as
>>> otherwise the IRQ would be '0' (which is a valid IRQ number) until
>>> being set.
>>>
>>> Signed-off-by: Juergen Gross 
>>> ---
>>>    drivers/xen/events/events_base.c | 74 
>>> +++-
>>>    1 file changed, 44 insertions(+), 30 deletions(-)
>>>
>>
>> [snip]
>>
>>> @@ -1725,6 +1738,7 @@ void rebind_evtchn_irq(evtchn_port_t evtchn, 
>>> int irq)
>>>   so there should be a proper type */
>>>    BUG_ON(info->type == IRQT_UNBOUND);
>>> +    info->irq = irq;
>>
>>
>> I failed to understand why this is added here. Doesn't irq remain the
>> same, and info->irq remains valid? Could you please clarify.
> 
> The IRQ remains the same, but the event channel could change.
> 
> This setting of info->irq compensates for the related removal in
> xen_irq_info_common_setup().


Thanks for the clarification, you can add my

Reviewed-by: Oleksandr Tyshchenko 

> 
>>
>> Other changes lgtm.
> 
> 
> Juergen
>

Re: [PATCH 7/7] xen/events: remove some info_for_irq() calls in pirq handling

2023-11-14 Thread Oleksandr Tyshchenko



On 16.10.23 09:28, Juergen Gross wrote:

Hello Juergen


> Instead of the IRQ number use the struct irq_info pointer as parameter
> in the internal pirq related functions. This allows to drop some calls
> of info_for_irq().
> 
> Signed-off-by: Juergen Gross 


Looks good, so

Reviewed-by: Oleksandr Tyshchenko 


Just one NIT below ...


[snip]

>   
> -static void pirq_query_unmask(int irq)
> +static void pirq_query_unmask(struct irq_info *info)
>   {
>   struct physdev_irq_status_query irq_status;
> - struct irq_info *info = info_for_irq(irq);
>   
>   BUG_ON(info->type != IRQT_PIRQ);
>   
> - irq_status.irq = pirq_from_irq(irq);
> + irq_status.irq = info->u.pirq.pirq;


  ... what is the reason to open-code pirq_from_irq() here?
For example, __startup_pirq() continues to use helper in almost the same 
situation ...


[snip]

>   
> -static unsigned int __startup_pirq(unsigned int irq)
> +static unsigned int __startup_pirq(struct irq_info *info)
>   {
>   struct evtchn_bind_pirq bind_pirq;
> - struct irq_info *info = info_for_irq(irq);
> - evtchn_port_t evtchn = evtchn_from_irq(irq);
> + evtchn_port_t evtchn = info->evtchn;
>   int rc;
>   
>   BUG_ON(info->type != IRQT_PIRQ);
> @@ -851,20 +868,20 @@ static unsigned int __startup_pirq(unsigned int irq)
>   if (VALID_EVTCHN(evtchn))
>   goto out;
>   
> - bind_pirq.pirq = pirq_from_irq(irq);
> + bind_pirq.pirq = pirq_from_irq(info);

... here



[snip]

Re: [PATCH 6/7] xen/events: modify internal [un]bind interfaces

2023-11-14 Thread Oleksandr Tyshchenko

On 16.10.23 09:28, Juergen Gross wrote:

Hello Juergen

> Modify the internal bind- and unbind-interfaces to take a struct
> irq_info parameter. When allocating a new IRQ pass the pointer from
> the allocating function further up.
> 
> This will reduce the number of info_for_irq() calls and make the code
> more efficient.
> 
> Signed-off-by: Juergen Gross 

I didn't spot any obvious issues with the current patch, other than the 
fact that the patch needs rebasing (some hunks cannot be applied because
"e64e7c74b99e xen/events: avoid using info_for_irq() in 
xen_send_IPI_one()" went in).

I was going to ask why "pirq_query_unmask()/pirq_from_irq()" weren't
converted to take a struct irq_info parameter as well, but looking at 
the rest I noticed this was already done in a subsequent commit.

With proper rebasing:
Reviewed-by: Oleksandr Tyshchenko 

[snip]

Re: [PATCH 5/7] xen/events: drop xen_allocate_irqs_dynamic()

2023-11-14 Thread Oleksandr Tyshchenko



On 16.10.23 09:28, Juergen Gross wrote:


Hello Juergen

> Instead of having a common function for allocating a single IRQ or a
> consecutive number of IRQs, split up the functionality into the callers
> of xen_allocate_irqs_dynamic().
> 
> This allows to handle any allocation error in xen_irq_init() gracefully
> instead of panicking the system. Let xen_irq_init() return the irq_info
> pointer or NULL in case of an allocation error.
> 
> Additionally set the IRQ into irq_info already at allocation time, as
> otherwise the IRQ would be '0' (which is a valid IRQ number) until
> being set.
> 
> Signed-off-by: Juergen Gross 
> ---
>   drivers/xen/events/events_base.c | 74 +++-
>   1 file changed, 44 insertions(+), 30 deletions(-)
> 

[snip]

> @@ -1725,6 +1738,7 @@ void rebind_evtchn_irq(evtchn_port_t evtchn, int irq)
>  so there should be a proper type */
>   BUG_ON(info->type == IRQT_UNBOUND);
>   
> + info->irq = irq;


I failed to understand why this is added here. Doesn't irq remain the 
same, and doesn't info->irq remain valid? Could you please clarify?

Other changes lgtm.


>   (void)xen_irq_info_evtchn_setup(irq, evtchn, NULL);
>   
>   mutex_unlock(&irq_mapping_update_lock);

Re: [PATCH 4/7] xen/events: remove some simple helpers from events_base.c

2023-11-13 Thread Oleksandr Tyshchenko



On 16.10.23 09:28, Juergen Gross wrote:


Hello Juergen.


> The helper functions type_from_irq() and cpu_from_irq() are just one
> line functions used only internally.
> 
> Open code them where needed. At the same time modify and rename
> get_evtchn_to_irq() to return a struct irq_info instead of the IRQ
> number.
> 
> Signed-off-by: Juergen Gross 



[snip]



> 
> @@ -1181,15 +1172,16 @@ static int bind_evtchn_to_irq_chip(evtchn_port_t 
> evtchn, struct irq_chip *chip,
>   {
>   int irq;
>   int ret;
> + struct irq_info *info;
>   
>   if (evtchn >= xen_evtchn_max_channels())
>   return -ENOMEM;


I assume this check is done here (*before* taking the lock) 
intentionally, as evtchn_to_info() below contains the same check.

>   
>   mutex_lock(&irq_mapping_update_lock);
>   
> - irq = get_evtchn_to_irq(evtchn);
> + info = evtchn_to_info(evtchn);
>   
> - if (irq == -1) {
> + if (!info) {
>   irq = xen_allocate_irq_dynamic();
>   if (irq < 0)
>   goto out;
> @@ -1212,8 +1204,8 @@ static int bind_evtchn_to_irq_chip(evtchn_port_t 
> evtchn, struct irq_chip *chip,
>*/
>   bind_evtchn_to_cpu(evtchn, 0, false);
>   } else {
> - struct irq_info *info = info_for_irq(irq);
> - WARN_ON(info == NULL || info->type != IRQT_EVTCHN);
> + WARN_ON(info->type != IRQT_EVTCHN);
> + irq = info->irq;
>   }


This hunk doesn't apply cleanly to the latest state, because 
"9e90e58c11b7 xen: evtchn: Allow shared registration of IRQ handers" 
went in. Please rebase.


With that:
Reviewed-by: Oleksandr Tyshchenko 


Also checkpatch.pl warns about BUG_ON usage in several places, but again 
you didn't introduce them in current patch, just touched their args.


[snip]

Re: [PATCH 3/7] xen/events: reduce externally visible helper functions

2023-11-13 Thread Oleksandr Tyshchenko



On 16.10.23 09:28, Juergen Gross wrote:


Hello Juergen

> get_evtchn_to_irq() has only one external user while irq_from_evtchn()
> provides the same functionality and is exported for a wider user base.
> Modify the only external user of get_evtchn_to_irq() to use
> irq_from_evtchn() instead and make get_evtchn_to_irq() static.
> 
> evtchn_from_irq() and irq_from_virq() have a single external user and
> can easily be combined to a new helper irq_evtchn_from_virq() allowing
> to drop irq_from_virq() and to make evtchn_from_irq() static.
> 
> Signed-off-by: Juergen Gross 


Reviewed-by: Oleksandr Tyshchenko 

Two NITs *NOT* directly related to current patch below.


> ---
>   drivers/xen/events/events_2l.c   |  8 
>   drivers/xen/events/events_base.c | 13 +
>   drivers/xen/events/events_internal.h |  1 -
>   include/xen/events.h |  4 ++--
>   4 files changed, 15 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/xen/events/events_2l.c b/drivers/xen/events/events_2l.c
> index b8f2f971c2f0..e3585330cf98 100644
> --- a/drivers/xen/events/events_2l.c
> +++ b/drivers/xen/events/events_2l.c
> @@ -171,11 +171,11 @@ static void evtchn_2l_handle_events(unsigned cpu, 
> struct evtchn_loop_ctrl *ctrl)
>   int i;
>   struct shared_info *s = HYPERVISOR_shared_info;
>   struct vcpu_info *vcpu_info = __this_cpu_read(xen_vcpu);
> + evtchn_port_t evtchn;
>   
>   /* Timer interrupt has highest priority. */
> - irq = irq_from_virq(cpu, VIRQ_TIMER);
> + irq = irq_evtchn_from_virq(cpu, VIRQ_TIMER, &evtchn);
>   if (irq != -1) {
> - evtchn_port_t evtchn = evtchn_from_irq(irq);

Most users of evtchn_from_irq() check returned evtchn via VALID_EVTCHN() 
as it might be 0. But this user doesn't.


>   word_idx = evtchn / BITS_PER_LONG;
>   bit_idx = evtchn % BITS_PER_LONG;
>   if (active_evtchns(cpu, s, word_idx) & (1ULL << bit_idx))
> @@ -328,9 +328,9 @@ irqreturn_t xen_debug_interrupt(int irq, void *dev_id)
>   for (i = 0; i < EVTCHN_2L_NR_CHANNELS; i++) {
>   if (sync_test_bit(i, BM(sh->evtchn_pending))) {
>   int word_idx = i / BITS_PER_EVTCHN_WORD;
> - printk("  %d: event %d -> irq %d%s%s%s\n",
> + printk("  %d: event %d -> irq %u%s%s%s\n",

checkpatch.pl says:

WARNING: printk() should include KERN_ facility level
#37: FILE: drivers/xen/events/events_2l.c:331:
+   printk("  %d: event %d -> irq %u%s%s%s\n",


>  cpu_from_evtchn(i), i,
> -get_evtchn_to_irq(i),
> +irq_from_evtchn(i),
>  sync_test_bit(word_idx, 
> BM(&v->evtchn_pending_sel))
>  ? "" : " l2-clear",
>  !sync_test_bit(i, BM(sh->evtchn_mask))
> 

[snip]

Re: [PATCH 2/7] xen/events: remove unused functions

2023-11-13 Thread Oleksandr Tyshchenko



On 16.10.23 09:28, Juergen Gross wrote:

Hello Juergen


> There are no users of xen_irq_from_pirq() and xen_set_irq_pending().
> 
> Remove those functions.
> 
> Signed-off-by: Juergen Gross 


Reviewed-by: Oleksandr Tyshchenko 


> ---
>   drivers/xen/events/events_base.c | 30 --
>   include/xen/events.h |  4 
>   2 files changed, 34 deletions(-)
> 
> diff --git a/drivers/xen/events/events_base.c 
> b/drivers/xen/events/events_base.c
> index 0e458b1c0c8c..1d797dd85d0e 100644
> --- a/drivers/xen/events/events_base.c
> +++ b/drivers/xen/events/events_base.c
> @@ -1165,29 +1165,6 @@ int xen_destroy_irq(int irq)
>   return rc;
>   }
>   
> -int xen_irq_from_pirq(unsigned pirq)
> -{
> - int irq;
> -
> - struct irq_info *info;
> -
> - mutex_lock(&irq_mapping_update_lock);
> -
> - list_for_each_entry(info, &xen_irq_list_head, list) {
> - if (info->type != IRQT_PIRQ)
> - continue;
> - irq = info->irq;
> - if (info->u.pirq.pirq == pirq)
> - goto out;
> - }
> - irq = -1;
> -out:
> - mutex_unlock(&irq_mapping_update_lock);
> -
> - return irq;
> -}
> -
> -
>   int xen_pirq_from_irq(unsigned irq)
>   {
>   return pirq_from_irq(irq);
> @@ -2026,13 +2003,6 @@ void xen_clear_irq_pending(int irq)
>   event_handler_exit(info);
>   }
>   EXPORT_SYMBOL(xen_clear_irq_pending);
> -void xen_set_irq_pending(int irq)
> -{
> - evtchn_port_t evtchn = evtchn_from_irq(irq);
> -
> - if (VALID_EVTCHN(evtchn))
> - set_evtchn(evtchn);
> -}
>   
>   bool xen_test_irq_pending(int irq)
>   {
> diff --git a/include/xen/events.h b/include/xen/events.h
> index 23932b0673dc..a129cafa80ed 100644
> --- a/include/xen/events.h
> +++ b/include/xen/events.h
> @@ -88,7 +88,6 @@ void xen_irq_resume(void);
>   
>   /* Clear an irq's pending state, in preparation for polling on it */
>   void xen_clear_irq_pending(int irq);
> -void xen_set_irq_pending(int irq);
>   bool xen_test_irq_pending(int irq);
>   
>   /* Poll waiting for an irq to become pending.  In the usual case, the
> @@ -122,9 +121,6 @@ int xen_bind_pirq_msi_to_irq(struct pci_dev *dev, struct 
> msi_desc *msidesc,
>   /* De-allocates the above mentioned physical interrupt. */
>   int xen_destroy_irq(int irq);
>   
> -/* Return irq from pirq */
> -int xen_irq_from_pirq(unsigned pirq);
> -
>   /* Return the pirq allocated to the irq. */
>   int xen_pirq_from_irq(unsigned irq);
>

Re: [PATCH 1/7] xen/events: fix delayed eoi list handling

2023-11-13 Thread Oleksandr Tyshchenko





On 16.10.23 09:28, Juergen Gross wrote:


Hello Juergen


When delaying eoi handling of events, the related elements are queued
into the percpu lateeoi list. In case the list isn't empty, the
elements should be sorted by the time when eoi handling is to happen.

Unfortunately a new element will never be queued at the start of the
list, even if it has a handling time lower than all other list
elements.

Fix that by handling that case the same way as for an empty list.

Fixes: e99502f76271 ("xen/events: defer eoi in case of excessive number of 
events")
Reported-by: Jan Beulich 
Signed-off-by: Juergen Gross 



Reviewed-by: Oleksandr Tyshchenko 


---
  drivers/xen/events/events_base.c | 4 +++-
  1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/xen/events/events_base.c b/drivers/xen/events/events_base.c
index 1b2136fe0fa5..0e458b1c0c8c 100644
--- a/drivers/xen/events/events_base.c
+++ b/drivers/xen/events/events_base.c
@@ -601,7 +601,9 @@ static void lateeoi_list_add(struct irq_info *info)
  
  	spin_lock_irqsave(&eoi->eoi_list_lock, flags);
  
-	if (list_empty(&eoi->eoi_list)) {

+   elem = list_first_entry_or_null(&eoi->eoi_list, struct irq_info,
+   eoi_list);
+   if (!elem || info->eoi_time < elem->eoi_time) {
list_add(&info->eoi_list, &eoi->eoi_list);
mod_delayed_work_on(info->eoi_cpu, system_wq,
&eoi->delayed, delay);
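
To illustrate the case this fix covers, here is a simplified, self-contained sketch of a list kept sorted by handling time. It uses a plain singly linked list rather than the kernel's list.h, so it is not the code from events_base.c; struct eoi_entry and eoi_list_add() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct eoi_entry {
    uint64_t eoi_time;          /* when EOI handling is due */
    struct eoi_entry *next;
};

/*
 * Insert 'e' so the list stays sorted by ascending eoi_time.
 * Returns 1 if 'e' became the new head (the point where the delayed
 * work would have to be re-armed), 0 otherwise.
 */
static int eoi_list_add(struct eoi_entry **head, struct eoi_entry *e)
{
    struct eoi_entry **pos = head;

    /* Empty list, or due earlier than the current first element. */
    if (!*head || e->eoi_time < (*head)->eoi_time) {
        e->next = *head;
        *head = e;
        return 1;
    }

    /* Otherwise skip every element that is due no later than 'e'. */
    while (*pos && (*pos)->eoi_time <= e->eoi_time)
        pos = &(*pos)->next;

    e->next = *pos;
    *pos = e;
    return 0;
}
```

The first branch is the point of the fix: an element that is due before everything already queued is handled exactly like an insertion into an empty list.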

Re: Issue with shared information page on Xen/ARM 4.17

2023-10-04 Thread Oleksandr Tyshchenko



On 04.10.23 15:59, Roger Pau Monné wrote:

Hello Roger



> On Wed, Oct 04, 2023 at 11:42:32AM +0000, Oleksandr Tyshchenko wrote:
>>
>>
>> On 04.10.23 13:55, Julien Grall wrote:
>>
>> Hello all.
>>
>>> Hi Roger,
>>>
>>> On 04/10/2023 09:13, Roger Pau Monné wrote:
>>>> On Tue, Oct 03, 2023 at 12:18:35PM -0700, Elliott Mitchell wrote:
>>>>> On Tue, Oct 03, 2023 at 10:26:28AM +0200, Roger Pau Monné wrote:
>>>>>> On Thu, Sep 28, 2023 at 07:49:18PM -0700, Elliott Mitchell wrote:
>>>>>>> I'm trying to get FreeBSD/ARM operational on Xen/ARM.  Current
>>>>>>> issue is
>>>>>>> the changes with the handling of the shared information page appear to
>>>>>>> have broken things for me.
>>>>>>>
>>>>>>> With a pre-4.17 build of Xen/ARM things worked fine.  Yet with a build
>>>>>>> of the 4.17 release, mapping the shared information page doesn't work.
>>>>>>
>>>>>> This is due to 71320946d5edf AFAICT.
>>>>>
>>>>> Yes.  While the -EBUSY line may be the one triggering, I'm unsure why.
>>>>> This seems a fairly reasonable change, so I had no intention of asking
>>>>> for a revert (which likely would have been rejected).  There is also a
>>>>> real possibility the -EBUSY comes from elsewhere.  Could also be
>>>>> 71320946d5edf caused a bug elsewhere to be exposed.
>>>>
>>>> A good way to know would be to attempt to revert 71320946d5edf and see
>>>> if that fixes your issue.
>>>>
>>>> Alternatively you can try (or similar):
>>>>
>>>> diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
>>>> index 6ccffeaea57d..105ef3faecfd 100644
>>>> --- a/xen/arch/arm/mm.c
>>>> +++ b/xen/arch/arm/mm.c
>>>> @@ -1424,6 +1424,8 @@ int xenmem_add_to_physmap_one(
>>>>    page_set_xenheap_gfn(mfn_to_page(mfn), gfn);
>>>>    }
>>>>    else
>>>> +    {
>>>> +    printk("%u already mapped\n", space);
>>>>    /*
>>>>     * Mandate the caller to first unmap the page before
>>>> mapping it
>>>>     * again. This is to prevent Xen creating an unwanted
>>>> hole in
>>>> @@ -1432,6 +1434,7 @@ int xenmem_add_to_physmap_one(
>>>>     * to unmap it afterwards.
>>>>     */
>>>>    rc = -EBUSY;
>>>> +    }
>>>>    p2m_write_unlock(p2m);
>>>>    }
>>>>
>>>>>>> I'm using Tianocore as the first stage loader.  This continues to work
>>>>>>> fine.  The build is using tag "edk2-stable202211", commit fff6d81270.
>>>>>>> While Tianocore does map the shared information page, my reading of
>>>>>>> their
>>>>>>> source is that it properly unmaps the page and therefore shouldn't
>>>>>>> cause
>>>>>>> trouble.
>>>>>>>
>>>>>>> Notes on the actual call is gpfn was 0x00040072.  This is
>>>>>>> outside
>>>>>>> the recommended address range, but my understanding is this is
>>>>>>> supposed
>>>>>>> to be okay.
>>>>>>>
>>>>>>> The return code is -16, which is EBUSY.
>>>>>>>
>>>>>>> Ideas?
>>>>>>
>>>>>> I think the issue is that you are mapping the shared info page over a
>>>>>> guest RAM page, and in order to do that you would first need to create
>>>>>> a hole and then map the shared info page.  IOW: the issue is not with
>>>>>> edk2 not having unmapped the page, but with FreeBSD trying to map the
>>>>>> shared_info over a RAM page instead of a hole in the p2m.  x86
>>>>>> behavior is different here, and does allow mapping the shared_info
>>>>>> page over a RAM gfn (by first removing the backing RAM page on the
>>>>>> gfn).
>>>>>
>>>>> An interesting thought.  I thought I'd tried this, but since I didn't
>>>>> see
>>>>> such in my experiments list.  What I had tried was removing all the

Re: Issue with shared information page on Xen/ARM 4.17

2023-10-04 Thread Oleksandr Tyshchenko



On 04.10.23 13:55, Julien Grall wrote:

Hello all.

> Hi Roger,
> 
> On 04/10/2023 09:13, Roger Pau Monné wrote:
>> On Tue, Oct 03, 2023 at 12:18:35PM -0700, Elliott Mitchell wrote:
>>> On Tue, Oct 03, 2023 at 10:26:28AM +0200, Roger Pau Monné wrote:
>>>> On Thu, Sep 28, 2023 at 07:49:18PM -0700, Elliott Mitchell wrote:
>>>>> I'm trying to get FreeBSD/ARM operational on Xen/ARM.  Current
>>>>> issue is
>>>>> the changes with the handling of the shared information page appear to
>>>>> have broken things for me.
>>>>>
>>>>> With a pre-4.17 build of Xen/ARM things worked fine.  Yet with a build
>>>>> of the 4.17 release, mapping the shared information page doesn't work.
>>>>
>>>> This is due to 71320946d5edf AFAICT.
>>>
>>> Yes.  While the -EBUSY line may be the one triggering, I'm unsure why.
>>> This seems a fairly reasonable change, so I had no intention of asking
>>> for a revert (which likely would have been rejected).  There is also a
>>> real possibility the -EBUSY comes from elsewhere.  Could also be
>>> 71320946d5edf caused a bug elsewhere to be exposed.
>>
>> A good way to know would be to attempt to revert 71320946d5edf and see
>> if that fixes your issue.
>>
>> Alternatively you can try (or similar):
>>
>> diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
>> index 6ccffeaea57d..105ef3faecfd 100644
>> --- a/xen/arch/arm/mm.c
>> +++ b/xen/arch/arm/mm.c
>> @@ -1424,6 +1424,8 @@ int xenmem_add_to_physmap_one(
>>   page_set_xenheap_gfn(mfn_to_page(mfn), gfn);
>>   }
>>   else
>> +    {
>> +    printk("%u already mapped\n", space);
>>   /*
>>    * Mandate the caller to first unmap the page before 
>> mapping it
>>    * again. This is to prevent Xen creating an unwanted 
>> hole in
>> @@ -1432,6 +1434,7 @@ int xenmem_add_to_physmap_one(
>>    * to unmap it afterwards.
>>    */
>>   rc = -EBUSY;
>> +    }
>>   p2m_write_unlock(p2m);
>>   }
>>
>>>>> I'm using Tianocore as the first stage loader.  This continues to work
>>>>> fine.  The build is using tag "edk2-stable202211", commit fff6d81270.
>>>>> While Tianocore does map the shared information page, my reading of
>>>>> their
>>>>> source is that it properly unmaps the page and therefore shouldn't
>>>>> cause
>>>>> trouble.
>>>>>
>>>>> Notes on the actual call is gpfn was 0x00040072.  This is
>>>>> outside
>>>>> the recommended address range, but my understanding is this is
>>>>> supposed
>>>>> to be okay.
>>>>>
>>>>> The return code is -16, which is EBUSY.
>>>>>
>>>>> Ideas?
>>>>
>>>> I think the issue is that you are mapping the shared info page over a
>>>> guest RAM page, and in order to do that you would first need to create
>>>> a hole and then map the shared info page.  IOW: the issue is not with
>>>> edk2 not having unmapped the page, but with FreeBSD trying to map the
>>>> shared_info over a RAM page instead of a hole in the p2m.  x86
>>>> behavior is different here, and does allow mapping the shared_info
>>>> page over a RAM gfn (by first removing the backing RAM page on the
>>>> gfn).
>>>
>>> An interesting thought.  I thought I'd tried this, but since I didn't 
>>> see
>>> such in my experiments list.  What I had tried was removing all the 
>>> pages
>>> in the suggested mapping range.  Yet this failed.
>>
>> Yeah, I went too fast and didn't read the code correctly, it is not
>> checking that the provided gfn is already populated, but whether the
>> mfn intended to be mapped is already mapped at a different location.
>>
>>> Since this seemed reasonable, I've now tried and found it fails.  The
>>> XENMEM_remove_from_physmap call returns 0.
>>
>> XENMEM_remove_from_physmap returning 0 is fine, but it seems to me
>> like edk2 hasn't unmapped the shared_info page.  The OS has no idea
>> at which position the shared_info page is currently mapped, and hence
>> can't do anything to attempt to unmap it in order to cover up for
>> buggy firmware.
>>
>> edk2 should be the entity to issue the XENMEM_remove_from_physmap
>> against the gfn where it has the shared_info page mapped.  Likely
>> needs to be done as part of ExitBootServices() method.
>>
>> FWIW, 71320946d5edf is an ABI change, and as desirable as such
>> behavior might be, a new hypercall should have introduced that had the
>> behavior that the change intended to retrofit into
>> XENMEM_add_to_physmap.
> I can see how you think this is an ABI change but the previous behavior 
> was incorrect. Before this patch, on Arm, we would allow the shared page 
> to be mapped twice. As we don't know where the firmware had mapped it 
> this could result to random corruption.
> 
> Now, we could surely decide to remove the page as x86 did. But this 
> could leave a hole in the RAM. As the OS would not know where the hole 
> is, this could lead to page fault randomly during runtime.


+1.

In addition to what Julien has already said, I would like to say the 
same i

Re: [PATCH] xenbus: fix error exit in xenbus_init()

2023-08-27 Thread Oleksandr Tyshchenko



On 22.08.23 12:11, Juergen Gross wrote:


Hello Juergen

> In case an error occurs in xenbus_init(), xen_store_domain_type should
> be set to XS_UNKNOWN.
> 
> Fix one instance where this action is missing.
> 
> Fixes: 5b3353949e89 ("xen: add support for initializing xenstore later as HVM domain")
> Reported-by: kernel test robot 
> Reported-by: Dan Carpenter 
> Link: https://lore.kernel.org/r/202304200845.w7m4kxzr-...@intel.com/
> Signed-off-by: Juergen Gross 


Reviewed-by: Oleksandr Tyshchenko 


> ---
>   drivers/xen/xenbus/xenbus_probe.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/xen/xenbus/xenbus_probe.c 
> b/drivers/xen/xenbus/xenbus_probe.c
> index 639bf628389b..3205e5d724c8 100644
> --- a/drivers/xen/xenbus/xenbus_probe.c
> +++ b/drivers/xen/xenbus/xenbus_probe.c
> @@ -1025,7 +1025,7 @@ static int __init xenbus_init(void)
>   if (err < 0) {
>   pr_err("xenstore_late_init couldn't bind irq 
> err=%d\n",
>  err);
> - return err;
> + goto out_error;
>   }
>   
>   xs_init_irq = err;
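
The point of replacing the early return with a jump to the common error label can be sketched as follows. This is a minimal model, assuming hypothetical names: `init_sketch()`, `store_domain_type` and the `enum store_type` values only mimic `xenbus_init()` and `xen_store_domain_type`; the real function does considerably more.

```c
#include <errno.h>

/* Hypothetical model of the xenstore connection type. */
enum store_type { XS_UNKNOWN, XS_PV, XS_HVM, XS_LOCAL };

static enum store_type store_domain_type = XS_UNKNOWN;

/* 'bind_irq_result' models the return value of the IRQ binding step. */
static int init_sketch(int bind_irq_result)
{
    int err;

    store_domain_type = XS_HVM;

    err = bind_irq_result;
    if (err < 0)
        goto out_error;   /* a plain 'return err' would skip the reset */

    return 0;

out_error:
    /* Every failure leaves the type as "unknown" for later checks. */
    store_domain_type = XS_UNKNOWN;
    return err;
}
```

With a single exit label, any code that later tests the type sees XS_UNKNOWN after every failed initialization, not only after some of them.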

Re: [PATCH] libxl: Add missing libxl__virtio_devtype to device_type_tbl array

2023-07-27 Thread Oleksandr Tyshchenko



On 27.07.23 17:33, Jan Beulich wrote:

Hello Jan

> On 26.07.2023 17:13, Oleksandr Tyshchenko wrote:
>> On 26.07.23 17:50, Jan Beulich wrote:
>>> On 26.07.2023 16:14, Oleksandr Tyshchenko wrote:
>>>> From: Oleksandr Tyshchenko 
>>>>
>>>> Without it being present it won't be possible to use some
>>>> libxl__device_type's callbacks for virtio devices as the common code
>>>> can only invoke these callbacks (by dereferencing a pointer) for valid
>>>> libxl__device_type's elements when iterating over device_type_tbl[].
>>>>
>>>> Signed-off-by: Oleksandr Tyshchenko 
>>>> ---
>>>>tools/libs/light/libxl_create.c | 1 +
>>>>1 file changed, 1 insertion(+)
>>>>
>>>> diff --git a/tools/libs/light/libxl_create.c 
>>>> b/tools/libs/light/libxl_create.c
>>>> index 393c535579..c91059d713 100644
>>>> --- a/tools/libs/light/libxl_create.c
>>>> +++ b/tools/libs/light/libxl_create.c
>>>> @@ -1887,6 +1887,7 @@ const libxl__device_type *device_type_tbl[] = {
>>>>&libxl__dtdev_devtype,
>>>>&libxl__vdispl_devtype,
>>>>&libxl__vsnd_devtype,
>>>> +&libxl__virtio_devtype,
>>>>NULL
>>>>};
>>>
>>>   From description and nature of the change this looks like a Fixes:
>>> tag would be warranted.
>>
>> Looks like, yes. Thanks.
>>
>> I guess, this should point to the commit that introduced
>> libxl__virtio_devtype
>>
>> Fixes: 43ba5202e2ee ('libxl: add support for generic virtio device')
> 
> In light of Anthony's feedback I'm now thinking that no Fixes: tag
> should be here, as is being clarified by the addition to the
> description 

I was about to send V2 with the addition + Fixes tag and noticed your reply.

Basically, I agree to not append Fixes tag, there is nothing broken 
within current code base regarding that, an addition clarifies the state 
and describes what/how may be broken.

I should have mentioned that from the very beginning.


> (which I guess can be folded in while committing).

It would be really good.




> 
> Jan

Re: [PATCH] libxl: Add missing libxl__virtio_devtype to device_type_tbl array

2023-07-27 Thread Oleksandr Tyshchenko



On 27.07.23 16:45, Anthony PERARD wrote:


Hello Anthony

> On Thu, Jul 27, 2023 at 10:38:03AM +0000, Oleksandr Tyshchenko wrote:
>>
>>
>> On 27.07.23 12:50, Anthony PERARD wrote:
>>
>> Hello Anthony
>>
>>> On Wed, Jul 26, 2023 at 05:14:59PM +0300, Oleksandr Tyshchenko wrote:
>>>> From: Oleksandr Tyshchenko 
>>>>
>>>> Without it being present it won't be possible to use some
>>>> libxl__device_type's callbacks for virtio devices as the common code
>>>> can only invoke these callbacks (by dereferencing a pointer) for valid
>>>> libxl__device_type's elements when iterating over device_type_tbl[].
>>>
>>> Did you notice an issue with it being missing from device_type_tbl[] ?
>>> Because to me it looks like all the functions that are using
>>> device_type_tbl will just skip over virtio devtype.
>>>
>>> domcreate_attach_devices:
>>>   skip virtio because ".skip_attach = 1"
>>>
>>> libxl__need_xenpv_qemu:
>>>   skip virtio because "dm_needed" is NULL
>>>
>>> retrieve_domain_configuration_end:
>>>   skip because "compare" is "libxl_device_virtio_compare" which is NULL
>>>
>>> libxl__update_domain_configuration:
>>>   skip because "update_config" is NULL.
>>>
>>> So, I think the patch is fine, adding virtio to the device_type_tbl
>>> array is good for completeness, but the patch description may be
>>> misleading.
>>>
>>> Did I miss something?
>>
>> No, you didn't.
>>
>> Just to be clear, there is no issue within the *current* code base, I am
>> experimenting with using device-model bits, so I implemented
>> libxl__device_virtio_dm_needed() locally and noticed that it didn't get
>> called at all, the reason was the absence of libxl__virtio_devtype in the
>> said array.
>>
>> Do you agree with the following addition to the commit description?
>>
>> "Please note, there is no issue within the current code base as virtio
>> devices don't use callbacks that depend on libxl__virtio_devtype
>> presence in device_type_tbl[]. The issue will appear as soon as we start
>> using these callbacks (for example, dm_needed)."
> 
> Yes, that would be fine. With that addition:
> Acked-by: Anthony PERARD 


Thanks for the clarification and A-b.

> 
> Thanks,
>

Re: [PATCH] libxl: Add missing libxl__virtio_devtype to device_type_tbl array

2023-07-27 Thread Oleksandr Tyshchenko

On 27.07.23 12:50, Anthony PERARD wrote:

Hello Anthony

> On Wed, Jul 26, 2023 at 05:14:59PM +0300, Oleksandr Tyshchenko wrote:
>> From: Oleksandr Tyshchenko 
>>
>> Without it being present it won't be possible to use some
>> libxl__device_type's callbacks for virtio devices as the common code
>> can only invoke these callbacks (by dereferencing a pointer) for valid
>> libxl__device_type's elements when iterating over device_type_tbl[].
> 
> Did you notice an issue with it being missing from device_type_tbl[] ?
> Because to me it looks like all the functions that are using
> device_type_tbl will just skip over virtio devtype.
> 
> domcreate_attach_devices:
>  skip virtio because ".skip_attach = 1"
> 
> libxl__need_xenpv_qemu:
>  skip virtio because "dm_needed" is NULL
> 
> retrieve_domain_configuration_end:
>  skip because "compare" is "libxl_device_virtio_compare" which is NULL
> 
> libxl__update_domain_configuration:
>  skip because "update_config" is NULL.
> 
> So, I think the patch is fine, adding virtio to the device_type_tbl
> array is good for completeness, but the patch description may be
> misleading.
> 
> Did I miss something?

No, you didn't.

Just to be clear, there is no issue within the *current* code base, I am
experimenting with using device-model bits, so I implemented
libxl__device_virtio_dm_needed() locally and noticed that it didn't get
called at all, the reason was the absence of libxl__virtio_devtype in the
said array.

Do you agree with the following addition to the commit description?

"Please note, there is no issue within the current code base as virtio
devices don't use callbacks that depend on libxl__virtio_devtype
presence in device_type_tbl[]. The issue will appear as soon as we start
using these callbacks (for example, dm_needed)."

> 
> Thanks,
>

Re: [PATCH] libxl: Add missing libxl__virtio_devtype to device_type_tbl array

2023-07-26 Thread Oleksandr Tyshchenko



On 26.07.23 17:50, Jan Beulich wrote:

Hello Jan


> On 26.07.2023 16:14, Oleksandr Tyshchenko wrote:
>> From: Oleksandr Tyshchenko 
>>
>> Without it being present it won't be possible to use some
>> libxl__device_type's callbacks for virtio devices as the common code
>> can only invoke these callbacks (by dereferencing a pointer) for valid
>> libxl__device_type's elements when iterating over device_type_tbl[].
>>
>> Signed-off-by: Oleksandr Tyshchenko 
>> ---
>>   tools/libs/light/libxl_create.c | 1 +
>>   1 file changed, 1 insertion(+)
>>
>> diff --git a/tools/libs/light/libxl_create.c 
>> b/tools/libs/light/libxl_create.c
>> index 393c535579..c91059d713 100644
>> --- a/tools/libs/light/libxl_create.c
>> +++ b/tools/libs/light/libxl_create.c
>> @@ -1887,6 +1887,7 @@ const libxl__device_type *device_type_tbl[] = {
>>   &libxl__dtdev_devtype,
>>   &libxl__vdispl_devtype,
>>   &libxl__vsnd_devtype,
>> +&libxl__virtio_devtype,
>>   NULL
>>   };
> 
>  From description and nature of the change this looks like a Fixes:
> tag would be warranted.

Looks like, yes. Thanks.

I guess, this should point to the commit that introduced 
libxl__virtio_devtype

Fixes: 43ba5202e2ee ('libxl: add support for generic virtio device')


> 
> Jan

[PATCH] libxl: Add missing libxl__virtio_devtype to device_type_tbl array

2023-07-26 Thread Oleksandr Tyshchenko

From: Oleksandr Tyshchenko 

Without it being present it won't be possible to use some
libxl__device_type's callbacks for virtio devices as the common code
can only invoke these callbacks (by dereferencing a pointer) for valid
libxl__device_type's elements when iterating over device_type_tbl[].

Signed-off-by: Oleksandr Tyshchenko 
---
 tools/libs/light/libxl_create.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tools/libs/light/libxl_create.c b/tools/libs/light/libxl_create.c
index 393c535579..c91059d713 100644
--- a/tools/libs/light/libxl_create.c
+++ b/tools/libs/light/libxl_create.c
@@ -1887,6 +1887,7 @@ const libxl__device_type *device_type_tbl[] = {
 &libxl__dtdev_devtype,
 &libxl__vdispl_devtype,
 &libxl__vsnd_devtype,
+&libxl__virtio_devtype,
 NULL
 };
 
-- 
2.34.1
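
To illustrate why the table entry matters, here is a small sketch of how common code walks a NULL-terminated array of device types and invokes optional callbacks. It is written under assumptions: `struct device_type`, `need_dm()` and the two tables are hypothetical simplifications, not the libxl definitions, and `virtio_dm_needed()` stands for a callback that does not exist in the current code base.

```c
#include <stddef.h>

/* Hypothetical, reduced version of a device type descriptor. */
struct device_type {
    const char *name;
    int (*dm_needed)(void);   /* optional callback, may be NULL */
};

/* Stand-in for a future virtio callback. */
static int virtio_dm_needed(void) { return 1; }

static const struct device_type disk_devtype   = { "disk",   NULL };
static const struct device_type virtio_devtype = { "virtio", virtio_dm_needed };

/* NULL-terminated tables, without and with the virtio entry. */
static const struct device_type *tbl_without[] = { &disk_devtype, NULL };
static const struct device_type *tbl_with[]    = { &disk_devtype,
                                                   &virtio_devtype, NULL };

/* Returns 1 if any listed type reports that a device model is needed. */
static int need_dm(const struct device_type **tbl)
{
    for (size_t i = 0; tbl[i]; i++) {
        if (tbl[i]->dm_needed && tbl[i]->dm_needed())
            return 1;
    }
    return 0;
}
```

In this model `need_dm(tbl_without)` never reaches the virtio callback, because the walker only sees types that are listed in the table, while `need_dm(tbl_with)` does.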

Re: [PATCH V2 1/2] xen: Update dm_op.h from Xen public header

2023-07-22 Thread Oleksandr Tyshchenko

On 20.07.23 12:30, Viresh Kumar wrote:

Hello Viresh

> Update the definitions in dm_op.h from Xen public header.

I think, it would be good to mention exact Xen version (commit) we are 
based on.

In general patch looks good to me, just a note.

I compared with Xen's public/hvm/dm_op.h and noticed differences. I 
understand, this cannot be 100% verbatim copy, because of headers 
location, emacs magics, GUEST_HANDLE vs XEN_GUEST_HANDLE. The Linux 
header doesn't contain any aliases the Xen header has for each "struct 
xen_dm_op_xxx", for example ...

[snip]

>
> +/*
> + * XEN_DMOP_create_ioreq_server: Instantiate a new IOREQ Server for a
> + *   secondary emulator.
> + *
> + * The <id> handed back is unique for target domain. The value of
> + * <handle_bufioreq> should be one of HVM_IOREQSRV_BUFIOREQ_* defined in
> + * hvm_op.h. If the value is HVM_IOREQSRV_BUFIOREQ_OFF then the buffered
> + * ioreq ring will not be allocated and hence all emulation requests to
> + * this server will be synchronous.
> + */
> +#define XEN_DMOP_create_ioreq_server 1
> +
> +struct xen_dm_op_create_ioreq_server {
> +/* IN - should server handle buffered ioreqs */
> +uint8_t handle_bufioreq;
> +uint8_t pad[3];
> +/* OUT - server id */
> +ioservid_t id;
> +};

... this one:

typedef struct xen_dm_op_create_ioreq_server 
xen_dm_op_create_ioreq_server_t;

And "struct xen_dm_op" down the file uses these aliases inside a union.

I assume, we have to diverge here in order to follow a recommendation
to avoid typedef'ing structs at [1], am I correct? Or is there another
reason?

I think, it would be good to mention a reason in the description.

[1] https://www.kernel.org/doc/html/v6.4/process/coding-style.html#typedefs
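
The two header styles being compared can be sketched side by side. This is only an illustration: the wrapper types `dm_op_xen_style` and `dm_op_linux_style` are hypothetical names, and `ioservid_t` is assumed here to be a 16-bit integer. The sketch shows that dropping the typedef alias changes spelling only, not layout.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint16_t ioservid_t;   /* assumption for this sketch */

struct xen_dm_op_create_ioreq_server {
    uint8_t handle_bufioreq;   /* IN - should server handle buffered ioreqs */
    uint8_t pad[3];
    ioservid_t id;             /* OUT - server id */
};

/* Xen header style: an alias for each "struct xen_dm_op_xxx". */
typedef struct xen_dm_op_create_ioreq_server xen_dm_op_create_ioreq_server_t;

struct dm_op_xen_style {
    uint32_t op;
    uint32_t pad;
    union {
        xen_dm_op_create_ioreq_server_t create_ioreq_server;
    } u;
};

/* Linux header style: the struct tag is used directly, no typedef. */
struct dm_op_linux_style {
    uint32_t op;
    uint32_t pad;
    union {
        struct xen_dm_op_create_ioreq_server create_ioreq_server;
    } u;
};
```

Both variants have the same size and the same offset for the union member, so the divergence is purely a coding-style matter.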

Re: [PATCH] xenbus: check xen_domain in xenbus_probe_initcall

2023-07-22 Thread Oleksandr Tyshchenko



On 22.07.23 02:13, Stefano Stabellini wrote:

Hello Stefano


> The same way we already do in xenbus_init.
> Fixes the following warning:
> 
> [  352.175563] Trying to free already-free IRQ 0
> [  352.177355] WARNING: CPU: 1 PID: 88 at kernel/irq/manage.c:1893 
> free_irq+0xbf/0x350
> [...]
> [  352.213951] Call Trace:
> [  352.214390]  <TASK>
> [  352.214717]  ? __warn+0x81/0x170
> [  352.215436]  ? free_irq+0xbf/0x350
> [  352.215906]  ? report_bug+0x10b/0x200
> [  352.216408]  ? prb_read_valid+0x17/0x20
> [  352.216926]  ? handle_bug+0x44/0x80
> [  352.217409]  ? exc_invalid_op+0x13/0x60
> [  352.217932]  ? asm_exc_invalid_op+0x16/0x20
> [  352.218497]  ? free_irq+0xbf/0x350
> [  352.218979]  ? __pfx_xenbus_probe_thread+0x10/0x10
> [  352.219600]  xenbus_probe+0x7a/0x80
> [  352.221030]  xenbus_probe_thread+0x76/0xc0
> 
> Signed-off-by: Stefano Stabellini 
> Tested-by: Petr Mladek 


Reviewed-by: Oleksandr Tyshchenko 

I guess this wants to gain the Fixes tag:

Fixes: 5b3353949e89 ("xen: add support for initializing xenstore later as HVM domain")



> 
> diff --git a/drivers/xen/xenbus/xenbus_probe.c 
> b/drivers/xen/xenbus/xenbus_probe.c
> index 58b732dcbfb8..e9bd3ed70108 100644
> --- a/drivers/xen/xenbus/xenbus_probe.c
> +++ b/drivers/xen/xenbus/xenbus_probe.c
> @@ -811,6 +812,9 @@ static int xenbus_probe_thread(void *unused)
>   
>   static int __init xenbus_probe_initcall(void)
>   {
> + if (!xen_domain())
> + return -ENODEV;
> +
>   /*
>* Probe XenBus here in the XS_PV case, and also XS_HVM unless we
>* need to wait for the platform PCI device to come up or

Re: [PATCH] xen: privcmd: Add support for irqfd

2023-07-21 Thread Oleksandr Tyshchenko



On 20.07.23 12:41, Viresh Kumar wrote:

Hello Viresh

> On 13-07-23, 14:40, Oleksandr Tyshchenko wrote:
>> Viresh, great work!
> 
> Thanks Oleksandr.
> 
>> Do you perhaps have a corresponding user-space (virtio backend) example
>> adapted for that feature (I would like to take a look at it if possible)?
> 
> This is taken care by the xen-vhost-frontend Rust crate in our case
> (which was initially designed based on virtio-disk but has deviated a
> lot from it now). 

I see

> Here is the commit of interest. The backends remain
> unmodified though.
> 
> https://github.com/vireshk/xen-vhost-frontend/commit/d79c419f14c1f54240b3147c342894998c274364
> 
> And I have updated the commit with CONFIG_ARM64 thingy..

Thank you for the information!

Re: [PATCH v3] xen/evtchn: Introduce new IOCTL to bind static evtchn

2023-07-18 Thread Oleksandr Tyshchenko



On 18.07.23 14:31, Rahul Singh wrote:


Hello Rahul


> Xen 4.17 supports the creation of static evtchns. To allow user space
> application to bind static evtchns introduce new ioctl
> "IOCTL_EVTCHN_BIND_STATIC". Existing IOCTL doing more than binding
> that’s why we need to introduce the new IOCTL to only bind the static
> event channels.
> 
> Static evtchns to be available for use during the lifetime of the
> guest. When the application exits, __unbind_from_irq() ends up being
> called from release() file operations because of that static evtchns
> are getting closed. To avoid closing the static event channel, add the
> new bool variable "is_static" in "struct irq_info" to mark the event
> channel static when creating the event channel to avoid closing the
> static evtchn.
> 
> Also, take this opportunity to remove the open-coded version of the
> evtchn close in drivers/xen/evtchn.c file and use xen_evtchn_close().
> 
> Signed-off-by: Rahul Singh 
> ---
> v3:
>   * Remove the open-coded version of the evtchn close in drivers/xen/evtchn.c

Thanks!

Looks like there is one unmentioned change in change-log since v2:
* Make sure that evtchn hasn't been added yet before binding it in 
evtchn_ioctl():case IOCTL_EVTCHN_BIND_STATIC

Reviewed-by: Oleksandr Tyshchenko 

> v2:
>   * Use bool in place u8 to define is_static variable.
>   * Avoid closing the static evtchns in error path.
> ---

[snip]
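
The effect of the "is_static" flag described in the commit message can be sketched as below. This is a simplified model under assumptions: `struct chan_sketch`, `close_sketch()` and `unbind_sketch()` are hypothetical and only mimic the relationship between `struct irq_info`, the event channel close and `__unbind_from_irq()`.

```c
#include <stdbool.h>

/* Hypothetical, reduced model of the per-channel state. */
struct chan_sketch {
    unsigned int evtchn;
    bool is_static;   /* set when the channel was bound as static */
    bool open;        /* whether the event channel is still open */
};

static int close_calls;   /* counts how often a channel was closed */

static void close_sketch(struct chan_sketch *c)
{
    c->open = false;
    close_calls++;
}

/* Models the unbind path that runs when the application exits. */
static void unbind_sketch(struct chan_sketch *c)
{
    /* Static channels must stay usable for the lifetime of the guest. */
    if (c->open && !c->is_static)
        close_sketch(c);
}
```

Unbinding a dynamic channel closes it, while unbinding a static one leaves it open for the next user.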

Re: [PATCH] xen: privcmd: Add support for irqfd

2023-07-13 Thread Oleksandr Tyshchenko



On 13.07.23 10:44, Juergen Gross wrote:

Hello all.


> On 12.07.23 10:48, Viresh Kumar wrote:
>> Xen provides support for injecting interrupts to the guests via the
>> HYPERVISOR_dm_op() hypercall. The same is used by the Virtio based
>> device backend implementations, in an inefficient manner currently.
>>
>> Generally, the Virtio backends are implemented to work with the Eventfd
>> based mechanism. In order to make such backends work with Xen, another
>> software layer needs to poll the Eventfds and raise an interrupt to the
>> guest using the Xen based mechanism. This results in an extra context
>> switch.
>>
>> This is not a new problem in Linux though. It is present with other
>> hypervisors like KVM, etc. as well. The generic solution implemented in
>> the kernel for them is to provide an IOCTL call to pass the interrupt
>> details and eventfd, which lets the kernel take care of polling the
>> eventfd and raising of the interrupt, instead of handling this in user
>> space (which involves an extra context switch).
>>
>> This patch adds support to inject a specific interrupt to guest using
>> the eventfd mechanism, by preventing the extra context switch.
>>
>> Inspired by existing implementations for KVM, etc..


Viresh, great work!

Do you perhaps have a corresponding user-space (virtio backend) example
adapted for that feature (I would like to take a look at it if possible)?



>>
>> Signed-off-by: Viresh Kumar 
>> ---
>>   drivers/xen/privcmd.c  | 285 -
>>   include/uapi/xen/privcmd.h |  14 ++
>>   2 files changed, 297 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/xen/privcmd.c b/drivers/xen/privcmd.c
>> index e2f580e30a86..e8096b09c113 100644
>> --- a/drivers/xen/privcmd.c
>> +++ b/drivers/xen/privcmd.c
>> @@ -9,11 +9,16 @@
>>   #define pr_fmt(fmt) "xen:" KBUILD_MODNAME ": " fmt
>> +#include 
>> +#include 
>>   #include 
>>   #include 
>> +#include 
>> +#include 
>>   #include 
>>   #include 
>>   #include 
>> +#include 
>>   #include 
>>   #include 
>>   #include 
>> @@ -833,6 +838,266 @@ static long privcmd_ioctl_mmap_resource(struct 
>> file *file,
>>   return rc;
>>   }
>> +/* Irqfd support */
>> +static struct workqueue_struct *irqfd_cleanup_wq;
>> +static DEFINE_MUTEX(irqfds_lock);
>> +static LIST_HEAD(irqfds_list);
>> +
>> +struct privcmd_kernel_irqfd {
>> +    domid_t dom;
>> +    u8 level;
>> +    u32 irq;
>> +    struct eventfd_ctx *eventfd;
>> +    struct work_struct shutdown;
>> +    wait_queue_entry_t wait;
>> +    struct list_head list;
>> +    poll_table pt;
>> +};
>> +
>> +/* From xen/include/public/hvm/dm_op.h */
>> +#define XEN_DMOP_set_irq_level 19
>> +
>> +struct xen_dm_op_set_irq_level {
>> +    u32 irq;
>> +    /* IN - Level: 0 -> deasserted, 1 -> asserted */
>> +    u8 level;
>> +    u8 pad[3];
>> +};
>> +
>> +struct xen_dm_op {
>> +    u32 op;
>> +    u32 pad;
>> +    union {
>> +    /*
>> + * There are more structures here, we won't be using them, so
>> + * can skip adding them here.
>> + */
>> +    struct xen_dm_op_set_irq_level set_irq_level;
>> +    } u;
>> +};
> 
> Instead of copying definitions over from Xen into privcmd.c, please just 
> update
> the related linux header include/xen/interface/dm_op.h from the Xen public
> header.
> 
>> +
>> +static void irqfd_deactivate(struct privcmd_kernel_irqfd *kirqfd)
>> +{
>> +    lockdep_assert_held(&irqfds_lock);
>> +
>> +    list_del_init(&kirqfd->list);
>> +    queue_work(irqfd_cleanup_wq, &kirqfd->shutdown);
>> +}
>> +
>> +static void irqfd_shutdown(struct work_struct *work)
>> +{
>> +    struct privcmd_kernel_irqfd *kirqfd =
>> +    container_of(work, struct privcmd_kernel_irqfd, shutdown);
>> +    u64 cnt;
>> +
>> +    eventfd_ctx_remove_wait_queue(kirqfd->eventfd, &kirqfd->wait, &cnt);
>> +    eventfd_ctx_put(kirqfd->eventfd);
>> +    kfree(kirqfd);
>> +}
>> +
>> +static void irqfd_inject(struct privcmd_kernel_irqfd *kirqfd)
>> +{
>> +    struct xen_dm_op dm_op = {
>> +    .op = XEN_DMOP_set_irq_level,
>> +    .u.set_irq_level.irq = kirqfd->irq,
>> +    .u.set_irq_level.level = kirqfd->level,
>> +    };
>> +    struct xen_dm_op_buf xbufs = {
>> +    .size = sizeof(dm_op),
>> +    };
>> +    u64 cnt;
>> +
>> +    eventfd_ctx_do_read(kirqfd->eventfd, &cnt);
>> +    set_xen_guest_handle(xbufs.h, &dm_op);
>> +
>> +    xen_preemptible_hcall_begin();
>> +    HYPERVISOR_dm_op(kirqfd->dom, 1, &xbufs);
> 
> Please add some error handling, e.g. by issuing a message in case this 
> hypercall
> was failing. Adding a bool "error" to struct privcmd_kernel_irqfd in 
> order to
> avoid multiple error messages for the same device might be a good idea.


In addition to provided comments, I would like to mention that this 
particular dm_op has Arm implementation only in vanilla hypervisor.

So this feature cannot be immediately reused on x86 because of 
XEN_DMOP_set_irq_level at least. As I understand, the x86's variant is 
XEN_DMO
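
The error handling requested in the review (report a failing hypercall, but only once per device) could look roughly like this. It is a sketch under assumptions: `struct irqfd_sketch`, `inject_sketch()` and the `hcall_rc` parameter are hypothetical, and `fprintf()` stands in for a kernel log call.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical, reduced model of the per-irqfd state. */
struct irqfd_sketch {
    unsigned int irq;
    bool error;   /* set once a failure has been reported */
};

static int messages_logged;   /* counts emitted error messages */

/* 'hcall_rc' models the return value of the injection hypercall. */
static void inject_sketch(struct irqfd_sketch *k, int hcall_rc)
{
    if (hcall_rc < 0 && !k->error) {
        fprintf(stderr, "irq %u: injection failed: %d\n", k->irq, hcall_rc);
        messages_logged++;
        k->error = true;   /* suppress repeats for this device */
    }
}
```

Repeated failures on the same device then produce a single message instead of flooding the log.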

Re: [PATCH v7 10/12] vpci: add initial support for virtual PCI bus topology

2023-07-07 Thread Oleksandr Tyshchenko





On 21.06.23 15:06, Jan Beulich wrote:

Hello all



> On 13.06.2023 12:32, Volodymyr Babchuk wrote:
>> @@ -121,6 +124,62 @@ int vpci_add_handlers(struct pci_dev *pdev)
>>   }
>>   
>>   #ifdef CONFIG_HAS_VPCI_GUEST_SUPPORT
>> +static int add_virtual_device(struct pci_dev *pdev)
>> +{
>> +struct domain *d = pdev->domain;
>> +pci_sbdf_t sbdf = { 0 };
>> +unsigned long new_dev_number;
>> +
>> +if ( is_hardware_domain(d) )
>> +return 0;
>> +
>> +ASSERT(pcidevs_locked());
>> +
>> +/*
>> + * Each PCI bus supports 32 devices/slots at max or up to 256 when
>> + * there are multi-function ones which are not yet supported.
>> + */
>> +if ( pdev->info.is_extfn )
>> +{
>> +gdprintk(XENLOG_ERR, "%pp: only function 0 passthrough supported\n",
>> + &pdev->sbdf);
>> +return -EOPNOTSUPP;
>> +}
>> +
>> +new_dev_number = find_first_zero_bit(d->vpci_dev_assigned_map,
>> + VPCI_MAX_VIRT_DEV);
>> +if ( new_dev_number >= VPCI_MAX_VIRT_DEV )
>> +return -ENOSPC;
>> +
>> +__set_bit(new_dev_number, &d->vpci_dev_assigned_map);
>
> Since the find-and-set can't easily be atomic, the lock used here (
> asserted to be held above) needs to be the same as ...
>
>> +/*
>> + * Both segment and bus number are 0:
>> + *  - we emulate a single host bridge for the guest, e.g. segment 0
>> + *  - with bus 0 the virtual devices are seen as embedded
>> + *endpoints behind the root complex
>> + *
>> + * TODO: add support for multi-function devices.
>> + */
>> +sbdf.devfn = PCI_DEVFN(new_dev_number, 0);
>> +pdev->vpci->guest_sbdf = sbdf;
>> +
>> +return 0;
>> +
>> +}
>> +
>> +static void vpci_remove_virtual_device(const struct pci_dev *pdev)
>> +{
>> +write_lock(&pdev->domain->vpci_rwlock);
>> +if ( pdev->vpci )
>> +{
>> +__clear_bit(pdev->vpci->guest_sbdf.dev,
>> +&pdev->domain->vpci_dev_assigned_map);
>> +pdev->vpci->guest_sbdf.sbdf = ~0;
>> +}
>> +write_unlock(&pdev->domain->vpci_rwlock);
>
> ... the one used here.



I think, it makes sense, yes.
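
The locking concern can be sketched in user-space C: the search for a free slot and the marking of that slot must happen under the same lock that the release path takes. This is an illustration under assumptions: `struct dom_sketch`, `alloc_slot()`, `free_slot()` and `MAX_VIRT_DEV` are hypothetical, and a pthread mutex stands in for the hypervisor lock.

```c
#include <errno.h>
#include <pthread.h>

#define MAX_VIRT_DEV 32   /* hypothetical slot limit for this sketch */

struct dom_sketch {
    pthread_mutex_t lock;         /* protects assigned_map */
    unsigned long assigned_map;   /* one bit per virtual device slot */
};

/* Find the first free slot and mark it used, all under one lock. */
static int alloc_slot(struct dom_sketch *d)
{
    int slot = -ENOSPC;

    pthread_mutex_lock(&d->lock);
    for (int i = 0; i < MAX_VIRT_DEV; i++) {
        if (!(d->assigned_map & (1UL << i))) {
            d->assigned_map |= 1UL << i;
            slot = i;
            break;
        }
    }
    pthread_mutex_unlock(&d->lock);
    return slot;
}

/* Release a slot under the same lock used by alloc_slot(). */
static void free_slot(struct dom_sketch *d, int slot)
{
    pthread_mutex_lock(&d->lock);
    d->assigned_map &= ~(1UL << slot);
    pthread_mutex_unlock(&d->lock);
}
```

If the two paths used different locks, a release could run between the search and the set, or two allocations could pick the same slot.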

***

There is one more thing. As far as I remember, there were some requests 
provided for the previous version (also v7) [1]. At least one of them, I 
assume, is still applicable here. I am speaking about a request to 
consider moving "cleaning up guest_sbdf / vpci_dev_assigned_map" into 
vpci_remove_device() here and aliasing of vpci_deassign_device() to 
vpci_remove_device() in commit #03/12.


The diff below (to be applied on top of current patch) is my 
understanding (not even build tested):


diff --git a/xen/drivers/vpci/vpci.c b/xen/drivers/vpci/vpci.c
index a61282cc5b..c3e6c153bc 100644
--- a/xen/drivers/vpci/vpci.c
+++ b/xen/drivers/vpci/vpci.c
@@ -51,6 +51,15 @@ void vpci_remove_device(struct pci_dev *pdev)
         return;
     }

+#ifdef CONFIG_HAS_VPCI_GUEST_SUPPORT
+    if ( pdev->vpci->guest_sbdf.sbdf != ~0 )
+    {
+        __clear_bit(pdev->vpci->guest_sbdf.dev,
+                    &pdev->domain->vpci_dev_assigned_map);
+        pdev->vpci->guest_sbdf.sbdf = ~0;
+    }
+#endif
+
     vpci = pdev->vpci;
     pdev->vpci = NULL;
     write_unlock(&pdev->domain->vpci_rwlock);
@@ -152,10 +161,14 @@ static int add_virtual_device(struct pci_dev *pdev)
         return -EOPNOTSUPP;
     }

+    write_lock(&pdev->domain->vpci_rwlock);
     new_dev_number = find_first_zero_bit(d->vpci_dev_assigned_map,
                                          VPCI_MAX_VIRT_DEV);
     if ( new_dev_number >= VPCI_MAX_VIRT_DEV )
+    {
+        write_unlock(&pdev->domain->vpci_rwlock);
         return -ENOSPC;
+    }

     __set_bit(new_dev_number, &d->vpci_dev_assigned_map);

@@ -169,23 +182,12 @@ static int add_virtual_device(struct pci_dev *pdev)
      */
     sbdf.devfn = PCI_DEVFN(new_dev_number, 0);
     pdev->vpci->guest_sbdf = sbdf;
+    write_unlock(&pdev->domain->vpci_rwlock);

     return 0;

 }

-static void vpci_remove_virtual_device(const struct pci_dev *pdev)
-{
-    write_lock(&pdev->domain->vpci_rwlock);
-    if ( pdev->vpci )
-    {
-        __clear_bit(pdev->vpci->guest_sbdf.dev,
-                    &pdev->domain->vpci_dev_assigned_map);
-        pdev->vpci->guest_sbdf.sbdf = ~0;
-    }
-    write_unlock(&pdev->domain->vpci_rwlock);
-}
-
 /* Notify vPCI that device is assigned to guest. */
 int vpci_assign_device(struct pci_dev *pdev)
 {
@@ -215,7 +217,6 @@ void vpci_deassign_device(struct pci_dev *pdev)
     if ( !has_vpci(pdev->domain) )
         return;

-    vpci_remove_virtual_device(pdev);
     vpci_remove_device(pdev);
 }
 #endif /* CONFIG_HAS_VPCI_GUEST_SUPPORT */
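For illustration only, the bookkeeping that the diff above moves around can be shown in isolation. This is a minimal standalone sketch, not Xen code: MAX_VIRT_DEV, struct slot_map, slot_alloc() and slot_free() are hypothetical names, a plain uint32_t stands in for vpci_dev_assigned_map, and locking is left out (in the diff all of this is expected to run with vpci_rwlock held for writing).

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define MAX_VIRT_DEV 32U   /* hypothetical stand-in for VPCI_MAX_VIRT_DEV */
#define SLOT_NONE    (~0U) /* "no guest device number assigned" marker */

struct slot_map {
    uint32_t assigned;     /* bit N set => virtual device number N in use */
};

/* Find the first free device number and mark it used; -ENOSPC if none left. */
static int slot_alloc(struct slot_map *map, unsigned int *slot)
{
    for (unsigned int i = 0; i < MAX_VIRT_DEV; i++) {
        if (!(map->assigned & (UINT32_C(1) << i))) {
            map->assigned |= UINT32_C(1) << i;
            *slot = i;
            return 0;
        }
    }
    return -ENOSPC;
}

/* Release a device number; a no-op for a device that never got one. */
static void slot_free(struct slot_map *map, unsigned int *slot)
{
    if (*slot == SLOT_NONE)
        return;
    map->assigned &= ~(UINT32_C(1) << *slot);
    *slot = SLOT_NONE;
}
```

The check against SLOT_NONE mirrors the "guest_sbdf.sbdf != ~0" test in the first hunk: doing the cleanup inside the common removal path is only safe if it tolerates devices that were never given a virtual slot.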



[1] 
https://lore.kernel.org/xen-devel/20220719174253.541965-10-olekst...@gmail.com/

https://lore.kernel.org/xen-devel/20220719174253.541965-3-olekst...@gmail.com/



Jan

Re: [PATCH 2/2] xen/virtio: Avoid use of the dom0 backend in dom0

2023-07-07 Thread Oleksandr Tyshchenko



On 07.07.23 11:11, Juergen Gross wrote:

Hello Juergen


> On 07.07.23 10:00, Oleksandr Tyshchenko wrote:
>>
>>
>> On 07.07.23 10:04, Juergen Gross wrote:
>>
>> Hello Juergen
>>
>>
>>> Re-reading the whole thread again ...
>>>
>>> On 29.06.23 03:00, Stefano Stabellini wrote:
>>>> On Wed, 21 Jun 2023, Oleksandr Tyshchenko wrote:
>>>>> On 21.06.23 16:12, Petr Pavlu wrote:
>>>>>
>>>>>
>>>>> Hello Petr
>>>>>
>>>>>
>>>>>> When attempting to run Xen on a QEMU/KVM virtual machine with virtio
>>>>>> devices (all x86_64), dom0 tries to establish a grant for itself 
>>>>>> which
>>>>>> eventually results in a hang during the boot.
>>>>>>
>>>>>> The backtrace looks as follows, the while loop in 
>>>>>> __send_control_msg()
>>>>>> makes no progress:
>>>>>>
>>>>>>  #0  virtqueue_get_buf_ctx (_vq=_vq@entry=0x8880074a8400,
>>>>>> len=len@entry=0xc9413c94, ctx=ctx@entry=0x0
>>>>>> ) at ../drivers/virtio/virtio_ring.c:2326
>>>>>>  #1  0x817086b7 in virtqueue_get_buf
>>>>>> (_vq=_vq@entry=0x8880074a8400, len=len@entry=0xc9413c94)
>>>>>> at ../drivers/virtio/virtio_ring.c:2333
>>>>>>  #2  0x8175f6b2 in __send_control_msg (portdev=>>>>> out>, port_id=0x, event=0x0, value=0x1) at
>>>>>> ../drivers/char/virtio_console.c:562
>>>>>>  #3  0x8175f6ee in __send_control_msg (portdev=>>>>> out>, port_id=, event=,
>>>>>> value=) at ../drivers/char/virtio_console.c:569
>>>>>>  #4  0x817618b1 in virtcons_probe
>>>>>> (vdev=0x88800585e800) at ../drivers/char/virtio_console.c:2098
>>>>>>  #5  0x81707117 in virtio_dev_probe
>>>>>> (_d=0x88800585e810) at ../drivers/virtio/virtio.c:305
>>>>>>  #6  0x8198e348 in call_driver_probe
>>>>>> (drv=0x82be40c0 , drv=0x82be40c0
>>>>>> , dev=0x88800585e810) at ../drivers/base/dd.c:579
>>>>>>  #7  really_probe (dev=dev@entry=0x88800585e810,
>>>>>> drv=drv@entry=0x82be40c0 ) at
>>>>>> ../drivers/base/dd.c:658
>>>>>>  #8  0x8198e58f in __driver_probe_device
>>>>>> (drv=drv@entry=0x82be40c0 ,
>>>>>> dev=dev@entry=0x88800585e810) at ../drivers/base/dd.c:800
>>>>>>  #9  0x8198e65a in driver_probe_device
>>>>>> (drv=drv@entry=0x82be40c0 ,
>>>>>> dev=dev@entry=0x88800585e810) at ../drivers/base/dd.c:830
>>>>>>  #10 0x8198e832 in __driver_attach
>>>>>> (dev=0x88800585e810, data=0x82be40c0 )
>>>>>> at ../drivers/base/dd.c:1216
>>>>>>  #11 0x8198bfb2 in bus_for_each_dev (bus=,
>>>>>> start=start@entry=0x0 ,
>>>>>> data=data@entry=0x82be40c0 ,
>>>>>>  fn=fn@entry=0x8198e7b0 <__driver_attach>) at
>>>>>> ../drivers/base/bus.c:368
>>>>>>  #12 0x8198db65 in driver_attach
>>>>>> (drv=drv@entry=0x82be40c0 ) at
>>>>>> ../drivers/base/dd.c:1233
>>>>>>  #13 0x8198d207 in bus_add_driver
>>>>>> (drv=drv@entry=0x82be40c0 ) at
>>>>>> ../drivers/base/bus.c:673
>>>>>>  #14 0x8198f550 in driver_register
>>>>>> (drv=drv@entry=0x82be40c0 ) at
>>>>>> ../drivers/base/driver.c:246
>>>>>>  #15 0x81706b47 in register_virtio_driver
>>>>>> (driver=driver@entry=0x82be40c0 ) at
>>>>>> ../drivers/virtio/virtio.c:357
>>>>>>  #16 0x832cd34b in virtio_console_init () at
>>>>>> ../drivers/char/virtio_console.c:2258
>>>>>>  #17 0x8100105c in do_one_initcall (fn=0x832cd2e0
>>>>>> ) at ../init/main.c:1246
>>>>>>  #18 0x83277293 in do_initcall_level
>>>>>> (command_line=0x888003e2f900 "root", level=0x6) at
>>>>>> ../init/main.c:13

Re: [PATCH 2/2] xen/virtio: Avoid use of the dom0 backend in dom0

2023-07-07 Thread Oleksandr Tyshchenko



On 07.07.23 10:04, Juergen Gross wrote:

Hello Juergen


> Re-reading the whole thread again ...
> 
> On 29.06.23 03:00, Stefano Stabellini wrote:
>> On Wed, 21 Jun 2023, Oleksandr Tyshchenko wrote:
>>> On 21.06.23 16:12, Petr Pavlu wrote:
>>>
>>>
>>> Hello Petr
>>>
>>>
>>>> When attempting to run Xen on a QEMU/KVM virtual machine with virtio
>>>> devices (all x86_64), dom0 tries to establish a grant for itself which
>>>> eventually results in a hang during the boot.
>>>>
>>>> The backtrace looks as follows, the while loop in __send_control_msg()
>>>> makes no progress:
>>>>
>>>>     #0  virtqueue_get_buf_ctx (_vq=_vq@entry=0x8880074a8400, 
>>>> len=len@entry=0xc9413c94, ctx=ctx@entry=0x0 
>>>> ) at ../drivers/virtio/virtio_ring.c:2326
>>>>     #1  0x817086b7 in virtqueue_get_buf 
>>>> (_vq=_vq@entry=0x8880074a8400, len=len@entry=0xc9413c94) 
>>>> at ../drivers/virtio/virtio_ring.c:2333
>>>>     #2  0x8175f6b2 in __send_control_msg (portdev=>>> out>, port_id=0x, event=0x0, value=0x1) at 
>>>> ../drivers/char/virtio_console.c:562
>>>>     #3  0x8175f6ee in __send_control_msg (portdev=>>> out>, port_id=, event=, 
>>>> value=) at ../drivers/char/virtio_console.c:569
>>>>     #4  0x817618b1 in virtcons_probe 
>>>> (vdev=0x88800585e800) at ../drivers/char/virtio_console.c:2098
>>>>     #5  0x81707117 in virtio_dev_probe 
>>>> (_d=0x88800585e810) at ../drivers/virtio/virtio.c:305
>>>>     #6  0x8198e348 in call_driver_probe 
>>>> (drv=0x82be40c0 , drv=0x82be40c0 
>>>> , dev=0x88800585e810) at ../drivers/base/dd.c:579
>>>>     #7  really_probe (dev=dev@entry=0x88800585e810, 
>>>> drv=drv@entry=0x82be40c0 ) at 
>>>> ../drivers/base/dd.c:658
>>>>     #8  0x8198e58f in __driver_probe_device 
>>>> (drv=drv@entry=0x82be40c0 , 
>>>> dev=dev@entry=0x88800585e810) at ../drivers/base/dd.c:800
>>>>     #9  0x8198e65a in driver_probe_device 
>>>> (drv=drv@entry=0x82be40c0 , 
>>>> dev=dev@entry=0x88800585e810) at ../drivers/base/dd.c:830
>>>>     #10 0x8198e832 in __driver_attach 
>>>> (dev=0x88800585e810, data=0x82be40c0 ) 
>>>> at ../drivers/base/dd.c:1216
>>>>     #11 0x8198bfb2 in bus_for_each_dev (bus=, 
>>>> start=start@entry=0x0 , 
>>>> data=data@entry=0x82be40c0 ,
>>>>     fn=fn@entry=0x8198e7b0 <__driver_attach>) at 
>>>> ../drivers/base/bus.c:368
>>>>     #12 0x8198db65 in driver_attach 
>>>> (drv=drv@entry=0x82be40c0 ) at 
>>>> ../drivers/base/dd.c:1233
>>>>     #13 0x8198d207 in bus_add_driver 
>>>> (drv=drv@entry=0x82be40c0 ) at 
>>>> ../drivers/base/bus.c:673
>>>>     #14 0x8198f550 in driver_register 
>>>> (drv=drv@entry=0x82be40c0 ) at 
>>>> ../drivers/base/driver.c:246
>>>>     #15 0x81706b47 in register_virtio_driver 
>>>> (driver=driver@entry=0x82be40c0 ) at 
>>>> ../drivers/virtio/virtio.c:357
>>>>     #16 0x832cd34b in virtio_console_init () at 
>>>> ../drivers/char/virtio_console.c:2258
>>>>     #17 0x8100105c in do_one_initcall (fn=0x832cd2e0 
>>>> ) at ../init/main.c:1246
>>>>     #18 0x83277293 in do_initcall_level 
>>>> (command_line=0x888003e2f900 "root", level=0x6) at 
>>>> ../init/main.c:1319
>>>>     #19 do_initcalls () at ../init/main.c:1335
>>>>     #20 do_basic_setup () at ../init/main.c:1354
>>>>     #21 kernel_init_freeable () at ../init/main.c:1571
>>>>     #22 0x81f64be1 in kernel_init (unused=) 
>>>> at ../init/main.c:1462
>>>>     #23 0x81001f49 in ret_from_fork () at 
>>>> ../arch/x86/entry/entry_64.S:308
>>>>     #24 0x in ?? ()
>>>>
>>>> Fix the problem by preventing xen_grant_init_backend_domid() from
>>>> setting dom0 as a backend when running in dom0.
>>>>
>>>> Fixes: 035e3a4321f7 ("xen/virtio: Optimize the setup of 
>>>> "xen-gran

[PATCH] iommu/ipmmu-vmsa: Add missing 'U' in IMTTLBR0_TTBR_MASK for shifted constant

2023-07-04 Thread Oleksandr Tyshchenko

From: Oleksandr Tyshchenko 

With both CONFIG_UBSAN and CONFIG_IPMMU_VMSA enabled, I got the following
splat when the IOMMU driver tried to set up page tables:

(XEN) ipmmu: /soc/iommu@e67b: d1: Set IPMMU context 1 (pgd 0x77fe9)
(XEN) 

(XEN) UBSAN: Undefined behaviour in drivers/passthrough/arm/ipmmu-vmsa.c:558:51
(XEN) left shift of 1048575 by 12 places cannot be represented in type 'int'
(XEN) Xen WARN at common/ubsan/ubsan.c:172
(XEN) ---[ Xen-4.18-unstable  arm64  debug=y ubsan=y  Tainted:  S ]
...

This points to the shifted constant in IMTTLBR0_TTBR_MASK. Fix that by adding
the missing 'U' suffix to it.

This should also address MISRA Rule 7.2:

A "u" or "U" suffix shall be applied to all integer constants that
are represented in an unsigned type.

Signed-off-by: Oleksandr Tyshchenko 
---
 xen/drivers/passthrough/arm/ipmmu-vmsa.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/xen/drivers/passthrough/arm/ipmmu-vmsa.c 
b/xen/drivers/passthrough/arm/ipmmu-vmsa.c
index 24b9e09a6b..0ccfa53255 100644
--- a/xen/drivers/passthrough/arm/ipmmu-vmsa.c
+++ b/xen/drivers/passthrough/arm/ipmmu-vmsa.c
@@ -201,7 +201,7 @@ static DEFINE_SPINLOCK(ipmmu_devices_lock);
 #define IMTTBCR_TSZ0_SHIFT 0
 
 #define IMTTLBR0                  0x0010
-#define IMTTLBR0_TTBR_MASK        (0xfffff << 12)
+#define IMTTLBR0_TTBR_MASK        (0xfffffU << 12)
 #define IMTTUBR0                  0x0014
 #define IMTTUBR0_TTBR_MASK        (0xff << 0)
 
-- 
2.34.1
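
As a side note, the arithmetic behind the UBSAN report can be checked with a small standalone sketch. It assumes a 32-bit int (true for the usual arm64 and x86_64 ABIs); TTBR_MASK_FIXED, ttbr_mask_wide() and fits_in_int() are illustrative names, not part of the driver. The value 1048575 from the splat is 0xfffff, and shifting it left by 12 yields 0xfffff000, which is larger than INT_MAX but still fits in an unsigned int.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Mask as written after the fix: the U suffix makes the constant unsigned. */
#define TTBR_MASK_FIXED (0xfffffU << 12)

/* Mathematical result of 1048575 << 12, computed in 64 bits so that the
 * sketch itself never performs the overflowing signed shift. */
static uint64_t ttbr_mask_wide(void)
{
    return (uint64_t)0xfffff << 12;
}

/* Returns 1 if the value is representable in a (signed) int, 0 otherwise. */
static int fits_in_int(uint64_t v)
{
    return v <= (uint64_t)INT_MAX;
}
```

With the signed constant the result is not representable in int, which is exactly what UBSAN complains about; with the U suffix the shift is performed in unsigned int and is well defined.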

Re: [PATCH 2/2] xen/virtio: Avoid use of the dom0 backend in dom0

2023-07-04 Thread Oleksandr Tyshchenko

On Tue, Jul 4, 2023 at 5:49 PM Roger Pau Monné  wrote:

Hello all.

[sorry for the possible format issues]


On Tue, Jul 04, 2023 at 01:43:46PM +0200, Marek Marczykowski-Górecki wrote:
> > Hi,
> >
> > FWIW, I have ran into this issue some time ago too. I run Xen on top of
> > KVM and then passthrough some of the virtio devices (network one
> > specifically) into a (PV) guest. So, I hit both cases, the dom0 one and
> > domU one. As a temporary workaround I needed to disable
> > CONFIG_XEN_VIRTIO completely (just disabling
> > CONFIG_XEN_VIRTIO_FORCE_GRANT was not enough to fix it).
> > With that context in place, the actual response below.
> >
> > On Tue, Jul 04, 2023 at 12:39:40PM +0200, Juergen Gross wrote:
> > > On 04.07.23 09:48, Roger Pau Monné wrote:
> > > > On Thu, Jun 29, 2023 at 03:44:04PM -0700, Stefano Stabellini wrote:
> > > > > On Thu, 29 Jun 2023, Oleksandr Tyshchenko wrote:
> > > > > > On 29.06.23 04:00, Stefano Stabellini wrote:
> > > > > > > I think we need to add a second way? It could be anything that
> can help
> > > > > > > us distinguish between a non-grants-capable virtio backend and
> a
> > > > > > > grants-capable virtio backend, such as:
> > > > > > > - a string on xenstore
> > > > > > > - a xen param
> > > > > > > - a special PCI configuration register value
> > > > > > > - something in the ACPI tables
> > > > > > > - the QEMU machine type
> > > > > >
> > > > > >
> > > > > > Yes, I remember there was a discussion regarding that. The point
> is to
> > > > > > choose a solution to be functional for both PV and HVM *and* to
> be able
> > > > > > to support a hotplug. IIRC, the xenstore could be a possible
> candidate.
> > > > >
> > > > > xenstore would be among the easiest to make work. The only
> downside is
> > > > > the dependency on xenstore which otherwise virtio+grants doesn't
> have.
> > > >
> > > > I would avoid introducing a dependency on xenstore, if nothing else
> we
> > > > know it's a performance bottleneck.
> > > >
> > > > We would also need to map the virtio device topology into xenstore,
> so
> > > > that we can pass different options for each device.
> > >
> > > This aspect (different options) is important. How do you want to pass
> virtio
> > > device configuration parameters from dom0 to the virtio backend
> domain? You
> > > probably need something like Xenstore (a virtio based alternative like
> virtiofs
> > > would work, too) for that purpose.
> > >
> > > Mapping the topology should be rather easy via the PCI-Id, e.g.:
> > >
> > > /local/domain/42/device/virtio/:00:1c.0/backend
> >
> > While I agree this would probably be the simplest to implement, I don't
> > like introducing xenstore dependency into virtio frontend either.
> > Toolstack -> backend communication is probably easier to solve, as it's
> > much more flexible (could use qemu cmdline, QMP, other similar
> > mechanisms for non-qemu backends etc).
>
> I also think features should be exposed uniformly for devices, it's at
> least weird to have certain features exposed in the PCI config space
> while other features exposed in xenstore.
>
> For virtio-mmio this might get a bit confusing, are we going to add
> xenstore entries based on the position of the device config mmio
> region?
>
> I think on Arm PCI enumeration is not (usually?) done by the firmware,
> at which point the SBDF expected by the tools/backend might be
> different than the value assigned by the guest OS.
>
> I think there are two slightly different issues, one is how to pass
> information to virtio backends, I think doing this initially based on
> xenstore is not that bad, because it's an internal detail of the
> backend implementation. However passing information to virtio
> frontends using xenstore is IMO a bad idea, there's already a way to
> negotiate features between virtio frontends and backends, and Xen
> should just expand and use that.
>
>

On Arm with device tree we have special bindings whose purpose is to tell
us whether we need to use grants for virtio, and which backend domid to use,
for a particular device. Here on x86 we don't have a device tree, so we cannot
(easily?) reuse this logic.

I have just recollected one idea suggested by Stefano some time ago [1].
The context of discu

Re: [PATCH 2/2] xen/virtio: Avoid use of the dom0 backend in dom0

2023-06-29 Thread Oleksandr Tyshchenko



On 29.06.23 04:00, Stefano Stabellini wrote:

Hello Stefano

> On Wed, 21 Jun 2023, Oleksandr Tyshchenko wrote:
>> On 21.06.23 16:12, Petr Pavlu wrote:
>>
>>
>> Hello Petr
>>
>>
>>> When attempting to run Xen on a QEMU/KVM virtual machine with virtio
>>> devices (all x86_64), dom0 tries to establish a grant for itself which
>>> eventually results in a hang during the boot.
>>>
>>> The backtrace looks as follows, the while loop in __send_control_msg()
>>> makes no progress:
>>>
>>> #0  virtqueue_get_buf_ctx (_vq=_vq@entry=0x8880074a8400, 
>>> len=len@entry=0xc9413c94, ctx=ctx@entry=0x0 ) at 
>>> ../drivers/virtio/virtio_ring.c:2326
>>> #1  0x817086b7 in virtqueue_get_buf 
>>> (_vq=_vq@entry=0x8880074a8400, len=len@entry=0xc9413c94) at 
>>> ../drivers/virtio/virtio_ring.c:2333
>>> #2  0x8175f6b2 in __send_control_msg (portdev=, 
>>> port_id=0x, event=0x0, value=0x1) at 
>>> ../drivers/char/virtio_console.c:562
>>> #3  0x8175f6ee in __send_control_msg (portdev=, 
>>> port_id=, event=, value=) at 
>>> ../drivers/char/virtio_console.c:569
>>> #4  0x817618b1 in virtcons_probe (vdev=0x88800585e800) at 
>>> ../drivers/char/virtio_console.c:2098
>>> #5  0x81707117 in virtio_dev_probe (_d=0x88800585e810) at 
>>> ../drivers/virtio/virtio.c:305
>>> #6  0x8198e348 in call_driver_probe (drv=0x82be40c0 
>>> , drv=0x82be40c0 , 
>>> dev=0x88800585e810) at ../drivers/base/dd.c:579
>>> #7  really_probe (dev=dev@entry=0x88800585e810, 
>>> drv=drv@entry=0x82be40c0 ) at 
>>> ../drivers/base/dd.c:658
>>> #8  0x8198e58f in __driver_probe_device 
>>> (drv=drv@entry=0x82be40c0 , 
>>> dev=dev@entry=0x88800585e810) at ../drivers/base/dd.c:800
>>> #9  0x8198e65a in driver_probe_device 
>>> (drv=drv@entry=0x82be40c0 , 
>>> dev=dev@entry=0x88800585e810) at ../drivers/base/dd.c:830
>>> #10 0x8198e832 in __driver_attach (dev=0x88800585e810, 
>>> data=0x82be40c0 ) at ../drivers/base/dd.c:1216
>>> #11 0x8198bfb2 in bus_for_each_dev (bus=, 
>>> start=start@entry=0x0 , 
>>> data=data@entry=0x82be40c0 ,
>>> fn=fn@entry=0x8198e7b0 <__driver_attach>) at 
>>> ../drivers/base/bus.c:368
>>> #12 0x8198db65 in driver_attach 
>>> (drv=drv@entry=0x82be40c0 ) at 
>>> ../drivers/base/dd.c:1233
>>> #13 0x8198d207 in bus_add_driver 
>>> (drv=drv@entry=0x82be40c0 ) at 
>>> ../drivers/base/bus.c:673
>>> #14 0x8198f550 in driver_register 
>>> (drv=drv@entry=0x82be40c0 ) at 
>>> ../drivers/base/driver.c:246
>>> #15 0x81706b47 in register_virtio_driver 
>>> (driver=driver@entry=0x82be40c0 ) at 
>>> ../drivers/virtio/virtio.c:357
>>> #16 0x832cd34b in virtio_console_init () at 
>>> ../drivers/char/virtio_console.c:2258
>>> #17 0x8100105c in do_one_initcall (fn=0x832cd2e0 
>>> ) at ../init/main.c:1246
>>> #18 0x83277293 in do_initcall_level 
>>> (command_line=0x888003e2f900 "root", level=0x6) at ../init/main.c:1319
>>> #19 do_initcalls () at ../init/main.c:1335
>>> #20 do_basic_setup () at ../init/main.c:1354
>>> #21 kernel_init_freeable () at ../init/main.c:1571
>>> #22 0x81f64be1 in kernel_init (unused=) at 
>>> ../init/main.c:1462
>>> #23 0x81001f49 in ret_from_fork () at 
>>> ../arch/x86/entry/entry_64.S:308
>>> #24 0x in ?? ()
>>>
>>> Fix the problem by preventing xen_grant_init_backend_domid() from
>>> setting dom0 as a backend when running in dom0.
>>>
>>> Fixes: 035e3a4321f7 ("xen/virtio: Optimize the setup of "xen-grant-dma" 
>>> devices")
>>
>>
>> I am not 100% sure whether the Fixes tag points to precise commit. If I
>> am not mistaken, the said commit just moves the code in the context
>> without changing the logic of CONFIG_XEN_VIRTIO_FORCE_GRANT, this was
>> introduced before.
>>
>>
>>> Signed-off-by: Petr Pavlu 
>>> ---
>>>drivers/xen/grant-dma-ops.c | 4 +++-
>>>1 file changed, 3 insertions(+)

Re: [PATCH v2] xen/evtchn: Introduce new IOCTL to bind static evtchn

2023-06-29 Thread Oleksandr Tyshchenko



On 29.06.23 18:46, Rahul Singh wrote:

Hello Rahul


> Xen 4.17 supports the creation of static evtchns. To allow user space
> application to bind static evtchns introduce new ioctl
> "IOCTL_EVTCHN_BIND_STATIC". Existing IOCTL doing more than binding
> that’s why we need to introduce the new IOCTL to only bind the static
> event channels.
> 
> Also, static evtchns to be available for use during the lifetime of the
> guest. When the application exits, __unbind_from_irq() ends up being
> called from release() file operations because of that static evtchns
> are getting closed. To avoid closing the static event channel, add the
> new bool variable "is_static" in "struct irq_info" to mark the event
> channel static when creating the event channel to avoid closing the
> static evtchn.
> 
> Signed-off-by: Rahul Singh 
> ---
> v2:
>   * Use bool in place u8 to define is_static variable.
>   * Avoid closing the static evtchns in error path.


Patch looks good to me, just a nit (question) below.


> ---
>   drivers/xen/events/events_base.c |  7 +--
>   drivers/xen/evtchn.c | 30 ++
>   include/uapi/xen/evtchn.h|  9 +
>   include/xen/events.h |  2 +-
>   4 files changed, 37 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/xen/events/events_base.c 
> b/drivers/xen/events/events_base.c
> index c7715f8bd452..5d3b5c7cfe64 100644
> --- a/drivers/xen/events/events_base.c
> +++ b/drivers/xen/events/events_base.c
> @@ -112,6 +112,7 @@ struct irq_info {
>   unsigned int irq_epoch; /* If eoi_cpu valid: irq_epoch of event */
>   u64 eoi_time;   /* Time in jiffies when to EOI. */
>   raw_spinlock_t lock;
> + bool is_static;   /* Is event channel static */
>   
>   union {
>   unsigned short virq;
> @@ -982,7 +983,8 @@ static void __unbind_from_irq(unsigned int irq)
>   unsigned int cpu = cpu_from_irq(irq);
>   struct xenbus_device *dev;
>   
> - xen_evtchn_close(evtchn);
> + if (!info->is_static)
> + xen_evtchn_close(evtchn);
>   
>   switch (type_from_irq(irq)) {
>   case IRQT_VIRQ:
> @@ -1574,7 +1576,7 @@ int xen_set_irq_priority(unsigned irq, unsigned 
> priority)
>   }
>   EXPORT_SYMBOL_GPL(xen_set_irq_priority);
>   
> -int evtchn_make_refcounted(evtchn_port_t evtchn)
> +int evtchn_make_refcounted(evtchn_port_t evtchn, bool is_static)
>   {
>   int irq = get_evtchn_to_irq(evtchn);
>   struct irq_info *info;
> @@ -1590,6 +1592,7 @@ int evtchn_make_refcounted(evtchn_port_t evtchn)
>   WARN_ON(info->refcnt != -1);
>   
>   info->refcnt = 1;
> + info->is_static = is_static;
>   
>   return 0;
>   }
> diff --git a/drivers/xen/evtchn.c b/drivers/xen/evtchn.c
> index c99415a70051..e6d2303478b2 100644
> --- a/drivers/xen/evtchn.c
> +++ b/drivers/xen/evtchn.c
> @@ -366,7 +366,8 @@ static int evtchn_resize_ring(struct per_user_data *u)
>   return 0;
>   }
>   
> -static int evtchn_bind_to_user(struct per_user_data *u, evtchn_port_t port)
> +static int evtchn_bind_to_user(struct per_user_data *u, evtchn_port_t port,
> + bool is_static)
>   {
>   struct user_evtchn *evtchn;
>   struct evtchn_close close;
> @@ -402,14 +403,16 @@ static int evtchn_bind_to_user(struct per_user_data *u, 
> evtchn_port_t port)
>   if (rc < 0)
>   goto err;
>   
> - rc = evtchn_make_refcounted(port);
> + rc = evtchn_make_refcounted(port, is_static);
>   return rc;
>   
>   err:
>   /* bind failed, should close the port now */
> - close.port = port;
> - if (HYPERVISOR_event_channel_op(EVTCHNOP_close, &close) != 0)
> - BUG();
> + if (!is_static) {


I think "struct evtchn_close close;" can now be placed here, as it is not
used outside of this block.

Also, this block looks like an open-coded version of xen_evtchn_close()
defined in events_base.c, so maybe it is worth making xen_evtchn_close()
static inline and placing it into events.h, then calling the helper here?
Please note, I will be ok either way.


> + close.port = port;
> + if (HYPERVISOR_event_channel_op(EVTCHNOP_close, &close) != 0)
> + BUG();
> + }
>   del_evtchn(u, evtchn);
>   return rc;
>   }

[snip]
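
To illustrate the helper idea from the comment above, here is a minimal standalone sketch. It is not kernel code: port_t, fake_evtchn_close() and close_unless_static() are hypothetical names, and the stub only counts calls where the real code would issue the EVTCHNOP_close hypercall.

```c
#include <assert.h>
#include <stdbool.h>

typedef unsigned int port_t;     /* stand-in for evtchn_port_t */

static unsigned int close_calls; /* counts simulated close operations */

/* Stub standing in for the real close path (a hypercall in the kernel). */
static void fake_evtchn_close(port_t port)
{
    (void)port;
    close_calls++;
}

/* Close the port only if it was not set up statically. */
static void close_unless_static(port_t port, bool is_static)
{
    if (!is_static)
        fake_evtchn_close(port);
}
```

Having a single helper of this shape would let both the error path in evtchn_bind_to_user() and the unbind path share one "skip static channels" rule instead of open-coding it twice.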

Re: [PATCH 2/2] xen/virtio: Avoid use of the dom0 backend in dom0

2023-06-21 Thread Oleksandr Tyshchenko



On 21.06.23 16:12, Petr Pavlu wrote:


Hello Petr


> When attempting to run Xen on a QEMU/KVM virtual machine with virtio
> devices (all x86_64), dom0 tries to establish a grant for itself which
> eventually results in a hang during the boot.
> 
> The backtrace looks as follows, the while loop in __send_control_msg()
> makes no progress:
> 
>#0  virtqueue_get_buf_ctx (_vq=_vq@entry=0x8880074a8400, 
> len=len@entry=0xc9413c94, ctx=ctx@entry=0x0 ) at 
> ../drivers/virtio/virtio_ring.c:2326
>#1  0x817086b7 in virtqueue_get_buf 
> (_vq=_vq@entry=0x8880074a8400, len=len@entry=0xc9413c94) at 
> ../drivers/virtio/virtio_ring.c:2333
>#2  0x8175f6b2 in __send_control_msg (portdev=, 
> port_id=0x, event=0x0, value=0x1) at 
> ../drivers/char/virtio_console.c:562
>#3  0x8175f6ee in __send_control_msg (portdev=, 
> port_id=, event=, value=) at 
> ../drivers/char/virtio_console.c:569
>#4  0x817618b1 in virtcons_probe (vdev=0x88800585e800) at 
> ../drivers/char/virtio_console.c:2098
>#5  0x81707117 in virtio_dev_probe (_d=0x88800585e810) at 
> ../drivers/virtio/virtio.c:305
>#6  0x8198e348 in call_driver_probe (drv=0x82be40c0 
> , drv=0x82be40c0 , 
> dev=0x88800585e810) at ../drivers/base/dd.c:579
>#7  really_probe (dev=dev@entry=0x88800585e810, 
> drv=drv@entry=0x82be40c0 ) at ../drivers/base/dd.c:658
>#8  0x8198e58f in __driver_probe_device 
> (drv=drv@entry=0x82be40c0 , 
> dev=dev@entry=0x88800585e810) at ../drivers/base/dd.c:800
>#9  0x8198e65a in driver_probe_device 
> (drv=drv@entry=0x82be40c0 , 
> dev=dev@entry=0x88800585e810) at ../drivers/base/dd.c:830
>#10 0x8198e832 in __driver_attach (dev=0x88800585e810, 
> data=0x82be40c0 ) at ../drivers/base/dd.c:1216
>#11 0x8198bfb2 in bus_for_each_dev (bus=, 
> start=start@entry=0x0 , data=data@entry=0x82be40c0 
> ,
>fn=fn@entry=0x8198e7b0 <__driver_attach>) at 
> ../drivers/base/bus.c:368
>#12 0x8198db65 in driver_attach (drv=drv@entry=0x82be40c0 
> ) at ../drivers/base/dd.c:1233
>#13 0x8198d207 in bus_add_driver (drv=drv@entry=0x82be40c0 
> ) at ../drivers/base/bus.c:673
>#14 0x8198f550 in driver_register 
> (drv=drv@entry=0x82be40c0 ) at 
> ../drivers/base/driver.c:246
>#15 0x81706b47 in register_virtio_driver 
> (driver=driver@entry=0x82be40c0 ) at 
> ../drivers/virtio/virtio.c:357
>#16 0x832cd34b in virtio_console_init () at 
> ../drivers/char/virtio_console.c:2258
>#17 0x8100105c in do_one_initcall (fn=0x832cd2e0 
> ) at ../init/main.c:1246
>#18 0x83277293 in do_initcall_level 
> (command_line=0x888003e2f900 "root", level=0x6) at ../init/main.c:1319
>#19 do_initcalls () at ../init/main.c:1335
>#20 do_basic_setup () at ../init/main.c:1354
>#21 kernel_init_freeable () at ../init/main.c:1571
>#22 0x81f64be1 in kernel_init (unused=) at 
> ../init/main.c:1462
>#23 0x81001f49 in ret_from_fork () at 
> ../arch/x86/entry/entry_64.S:308
>#24 0x in ?? ()
> 
> Fix the problem by preventing xen_grant_init_backend_domid() from
> setting dom0 as a backend when running in dom0.
> 
> Fixes: 035e3a4321f7 ("xen/virtio: Optimize the setup of "xen-grant-dma" 
> devices")


I am not 100% sure whether the Fixes tag points to the precise commit. If I
am not mistaken, the said commit just moves the code around without
changing the logic of CONFIG_XEN_VIRTIO_FORCE_GRANT, which was
introduced earlier.


> Signed-off-by: Petr Pavlu 
> ---
>   drivers/xen/grant-dma-ops.c | 4 +++-
>   1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/xen/grant-dma-ops.c b/drivers/xen/grant-dma-ops.c
> index 76f6f26265a3..29ed27ac450e 100644
> --- a/drivers/xen/grant-dma-ops.c
> +++ b/drivers/xen/grant-dma-ops.c
> @@ -362,7 +362,9 @@ static int xen_grant_init_backend_domid(struct device 
> *dev,
>   if (np) {
>   ret = xen_dt_grant_init_backend_domid(dev, np, backend_domid);
>   of_node_put(np);
> - } else if (IS_ENABLED(CONFIG_XEN_VIRTIO_FORCE_GRANT) || 
> xen_pv_domain()) {
> + } else if ((IS_ENABLED(CONFIG_XEN_VIRTIO_FORCE_GRANT) ||
> + xen_pv_domain()) &&
> +!xen_initial_domain()) {

The commit lgtm, just one note:


I would even bail out early in xen_virtio_restricted_mem_acc() instead,
as I assume the same issue could happen on Arm with DT (although there
we don't guess the backend's domid, we read it from DT, and it is quite
unlikely that Dom0 ends up as its own backend with a correct DT).

Something like:

@@ -416,6 +421,10 @@ bool xen_virtio_restricted_mem_acc(struct 
virtio_device *dev)
  {
 domid_t backend_domid;

+   /* Xen grant DMA ops are not used when running as initial
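
To make the two variants under discussion concrete, here is a small standalone sketch with plain booleans in place of the kernel predicates. The function names are hypothetical; the first one condenses the condition from the quoted hunk, the second shows one possible shape of an early bail-out and is only an assumption about how it could look.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Condition from the quoted hunk: fall back to "backend is dom0" only when
 * grants are forced or the domain is PV, and never when running as the
 * initial domain itself.
 */
static bool may_default_backend_to_dom0(bool force_grant, bool pv_domain,
                                        bool initial_domain)
{
    return (force_grant || pv_domain) && !initial_domain;
}

/*
 * Early bail-out variant: the initial domain never uses grant DMA ops,
 * regardless of how the backend domid would have been discovered.
 */
static bool use_grant_dma_ops(bool initial_domain, bool backend_found)
{
    if (initial_domain)
        return false;
    return backend_found;
}
```

The second form also covers the device-tree path, because the decision is taken before any backend lookup happens.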

Re: [PATCH 1/2] xen/virtio: Fix NULL deref when a bridge of PCI root bus has no parent

2023-06-21 Thread Oleksandr Tyshchenko



On 21.06.23 16:12, Petr Pavlu wrote:


Hello Petr


> When attempting to run Xen on a QEMU/KVM virtual machine with virtio
> devices (all x86_64), function xen_dt_get_node() crashes on accessing
> bus->bridge->parent->of_node because a bridge of the PCI root bus has no
> parent set:
> 
> [1.694192][T1] BUG: kernel NULL pointer dereference, address: 
> 0288
> [1.695688][T1] #PF: supervisor read access in kernel mode
> [1.696297][T1] #PF: error_code(0x) - not-present page
> [1.696297][T1] PGD 0 P4D 0
> [1.696297][T1] Oops:  [#1] PREEMPT SMP NOPTI
> [1.696297][T1] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 
> 6.3.7-1-default #1 openSUSE Tumbleweed 
> a577eae57964bb7e83477b5a5645a1781df990f0
> [1.696297][T1] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), 
> BIOS rel-1.15.0-0-g2dd4b9b-rebuilt.opensuse.org 04/01/2014
> [1.696297][T1] RIP: e030:xen_virtio_restricted_mem_acc+0xd9/0x1c0
> [1.696297][T1] Code: 45 0c 83 e8 c9 a3 ea ff 31 c0 eb d7 48 8b 87 40 
> ff ff ff 48 89 c2 48 8b 40 10 48 85 c0 75 f4 48 8b 82 10 01 00 00 48 8b 40 40 
> <48> 83 b8 88 02 00 00 00 0f 84 45 ff ff ff 66 90 31 c0 eb a5 48 89
> [1.696297][T1] RSP: e02b:c90040013cc8 EFLAGS: 00010246
> [1.696297][T1] RAX:  RBX: 888006c75000 RCX: 
> 0029
> [1.696297][T1] RDX: 888005ed1000 RSI: c900400f100c RDI: 
> 888005ee30d0
> [1.696297][T1] RBP: 888006c75010 R08: 0001 R09: 
> 00033006
> [1.696297][T1] R10: 888005850028 R11: 0002 R12: 
> 830439a0
> [1.696297][T1] R13:  R14: 888005657900 R15: 
> 888006e3e1e8
> [1.696297][T1] FS:  () GS:88804a00() 
> knlGS:
> [1.696297][T1] CS:  e030 DS:  ES:  CR0: 80050033
> [1.696297][T1] CR2: 0288 CR3: 02e36000 CR4: 
> 00050660
> [1.696297][T1] Call Trace:
> [1.696297][T1]  
> [1.696297][T1]  virtio_features_ok+0x1b/0xd0
> [1.696297][T1]  virtio_dev_probe+0x19c/0x270
> [1.696297][T1]  really_probe+0x19b/0x3e0
> [1.696297][T1]  __driver_probe_device+0x78/0x160
> [1.696297][T1]  driver_probe_device+0x1f/0x90
> [1.696297][T1]  __driver_attach+0xd2/0x1c0
> [1.696297][T1]  bus_for_each_dev+0x74/0xc0
> [1.696297][T1]  bus_add_driver+0x116/0x220
> [1.696297][T1]  driver_register+0x59/0x100
> [1.696297][T1]  virtio_console_init+0x7f/0x110
> [1.696297][T1]  do_one_initcall+0x47/0x220
> [1.696297][T1]  kernel_init_freeable+0x328/0x480
> [1.696297][T1]  kernel_init+0x1a/0x1c0
> [1.696297][T1]  ret_from_fork+0x29/0x50
> [1.696297][T1]  
> [1.696297][T1] Modules linked in:
> [1.696297][T1] CR2: 0288
> [1.696297][T1] ---[ end trace  ]---
> 
> The PCI root bus is in this case created from ACPI description via
> acpi_pci_root_add() -> pci_acpi_scan_root() -> acpi_pci_root_create() ->
> pci_create_root_bus() where the last function is called with
> parent=NULL. It indicates that no parent is present and then
> bus->bridge->parent is NULL too.
> 
> Fix the problem by checking bus->bridge->parent in xen_dt_get_node() for
> NULL first.
> 
> Fixes: ef8ae384b4c9 ("xen/virtio: Handle PCI devices which Host controller is 
> described in DT")

Oops, sorry. I have to admit I checked with DT only.


> Signed-off-by: Petr Pavlu 


Reviewed-by: Oleksandr Tyshchenko 



> ---
>   drivers/xen/grant-dma-ops.c | 2 ++
>   1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/xen/grant-dma-ops.c b/drivers/xen/grant-dma-ops.c
> index 9784a77fa3c9..76f6f26265a3 100644
> --- a/drivers/xen/grant-dma-ops.c
> +++ b/drivers/xen/grant-dma-ops.c
> @@ -303,6 +303,8 @@ static struct device_node *xen_dt_get_node(struct device 
> *dev)
>   while (!pci_is_root_bus(bus))
>   bus = bus->parent;
>   
> + if (!bus->bridge->parent)
> + return NULL;
>   return of_node_get(bus->bridge->parent->of_node);
>   }
>
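
For reference, the pattern of the fix can be reproduced outside the kernel. The sketch below uses hypothetical minimal structures (struct node, struct device, struct bus) in place of device_node, device and pci_bus, and root_bus_dt_node() is an illustrative name; it only shows the walk to the root bus followed by the NULL check on the bridge's parent.

```c
#include <assert.h>
#include <stddef.h>

struct node   { int id; };                       /* stand-in for device_node */
struct device { struct device *parent; struct node *of_node; };
struct bus    { struct bus *parent; struct device *bridge; };

/* Return the DT node of the host controller, or NULL if there is none. */
static struct node *root_bus_dt_node(struct bus *bus)
{
    /* Walk up until the root bus (the one without a parent bus). */
    while (bus->parent)
        bus = bus->parent;

    /* An ACPI-created root bridge has no parent device: nothing to return. */
    if (!bus->bridge->parent)
        return NULL;

    return bus->bridge->parent->of_node;
}
```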

Re: [PATCH] libxl: arm: Allow grant mappings for backends running on Dom0

2023-05-05 Thread Oleksandr Tyshchenko

Hello Viresh

[sorry for the possible format issues]

On Fri, May 5, 2023 at 9:19 AM Viresh Kumar  wrote:

> On 05-04-23, 05:12, Viresh Kumar wrote:
> > On 04-04-23, 21:16, Oleksandr Tyshchenko wrote:
> > > ok, probably makes sense
> >
> > While testing both foreign and grant mappings I stumbled upon another
> > related problem. How do I control the creation of iommu node from
> > guest configuration file, irrespective of the domain backend is
> > running at ? This is what we have right now:
> >
> > - always create iommu nodes if backend-dom != 0
> > - always create iommu nodes if forced_grant == 1
> >
> > what I need to cover is
> > - don't create iommu nodes irrespective of the domain
> >
> > This is required if you want to test both foreign and grant memory
> > allocations, with different guests kernels. i.e. one guest kernel for
> > device with grant mappings and another guest for device with foreign
> > mappings. There is no way, that I know of, to disable the creation of
> > iommu nodes. Of course we would want to use the same images for kernel
> > and other stuff, so this needs to be controlled from guest
> > configuration file.
>
> Any input on this please ?
>


I was going to propose an idea, but I have just realized that you already
voiced it here [1] ))
So what you proposed there sounds reasonable to me.

I will just rephrase it according to my understanding:

We probably need to consider transforming your "forced_grant" to something
three-state, for example
"grant_usage" (or "use_grant" as you suggested) which could be "default
behaviour" or "always disabled", or "always enabled".

With "grant_usage=default" we will get exactly what we have at the moment
(only create iommu nodes if backend-domid != 0).
With "grant_usage=disabled" we will force grants to be always disabled
(don't create iommu nodes irrespective of the domain).
With "grant_usage=enabled" we will force grants to be always enabled
(always create iommu nodes irrespective of the domain).
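
Just to show the three states side by side, a minimal sketch of the resulting decision could look as follows. The enum and the want_iommu_node() helper are hypothetical names used for illustration only; they are not existing libxl identifiers.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical tri-state replacing the boolean "forced_grant". */
enum grant_usage {
    GRANT_USAGE_DEFAULT,   /* current behaviour: depends on backend domid */
    GRANT_USAGE_DISABLED,  /* never create iommu nodes */
    GRANT_USAGE_ENABLED,   /* always create iommu nodes */
};

/* Decide whether the iommu nodes should be created for a virtio device. */
static bool want_iommu_node(enum grant_usage usage, unsigned int backend_domid)
{
    switch (usage) {
    case GRANT_USAGE_ENABLED:
        return true;
    case GRANT_USAGE_DISABLED:
        return false;
    case GRANT_USAGE_DEFAULT:
    default:
        return backend_domid != 0;
    }
}
```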


[1]
https://lore.kernel.org/xen-devel/20230505093835.jcbwo6zjk5hcjvsm@vireshk-i7/


>
> --
> viresh
>


-- 
Regards,

Oleksandr Tyshchenko

Re: [PATCH V2 2/2] libxl: fix matching of generic virtio device

2023-04-05 Thread Oleksandr Tyshchenko





On 05.04.23 03:12, Viresh Kumar wrote:


Hello Viresh


The strings won't be an exact match, as we are only looking to match the
prefix here, i.e. "virtio,device". This is already done properly in
libxl_virtio.c file, let's do the same here too.

Fixes: 43ba5202e2ee ("libxl: add support for generic virtio device")
Signed-off-by: Viresh Kumar 
---
V1->V2: Add the missing fixes tag.


Reviewed-by: Oleksandr Tyshchenko 



  tools/libs/light/libxl_arm.c | 12 
  1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/tools/libs/light/libxl_arm.c b/tools/libs/light/libxl_arm.c
index ddc7b2a15975..97c80d7ed0fa 100644
--- a/tools/libs/light/libxl_arm.c
+++ b/tools/libs/light/libxl_arm.c
@@ -1033,10 +1033,14 @@ static int make_virtio_mmio_node_device(libxl__gc *gc, 
void *fdt, uint64_t base,
  } else if (!strcmp(type, VIRTIO_DEVICE_TYPE_GPIO)) {
  res = make_virtio_mmio_node_gpio(gc, fdt);
  if (res) return res;
-} else if (strcmp(type, VIRTIO_DEVICE_TYPE_GENERIC)) {
-/* Doesn't match generic virtio device */
-LOG(ERROR, "Invalid type for virtio device: %s", type);
-return -EINVAL;
+} else {
+int len = sizeof(VIRTIO_DEVICE_TYPE_GENERIC) - 1;
+
+if (strncmp(type, VIRTIO_DEVICE_TYPE_GENERIC, len)) {
+/* Doesn't match generic virtio device */
+LOG(ERROR, "Invalid type for virtio device: %s", type);
+return -EINVAL;
+}
  }
  
  return fdt_end_node(fdt);
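
For reference, the prefix check used in the patch above can be exercised in
isolation. This is a minimal sketch: the macro mirrors the string named in the
commit message, while the helper name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Mirrors the string the commit message refers to. */
#define VIRTIO_DEVICE_TYPE_GENERIC "virtio,device"

/* Hypothetical helper: true if 'type' starts with "virtio,device". */
bool is_generic_virtio_type(const char *type)
{
    /* sizeof() of a string literal counts the trailing NUL, hence the - 1. */
    size_t len = sizeof(VIRTIO_DEVICE_TYPE_GENERIC) - 1;

    assert(type != NULL);
    return strncmp(type, VIRTIO_DEVICE_TYPE_GENERIC, len) == 0;
}
```

Note that a pure prefix match accepts any suffix, so validating what follows
"virtio,device" would have to be done separately.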

Re: [PATCH V2 1/2] docs: Allow generic virtio device types to contain device-id

2023-04-05 Thread Oleksandr Tyshchenko





On 05.04.23 03:12, Viresh Kumar wrote:


Hello Viresh


For generic virtio devices, where we don't need to add compatible or
other special DT properties, the type field is set to "virtio,device".

But this misses the case where the user sets the type with a valid
virtio device id as well, like "virtio,device26" for file system device.



ok. For the record, valid virtio device ids can be found at:

https://docs.oasis-open.org/virtio/virtio/v1.2/cs01/virtio-v1.2-cs01.html#x1-2160005

I don't know, maybe it is worth adding that link to the commit description.


Also a NIT, is this example "like "virtio,device26" for file system 
device" precise?


According to
https://www.kernel.org/doc/Documentation/devicetree/bindings/virtio/virtio-device.yaml

the virtio device id should be in hex, so for file system device it
should be "virtio,device1a", or I really missed something?
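
To illustrate the decimal vs hex point, here is a small sketch that builds the
type string from a numeric device id. The helper name is hypothetical; the
only assumption taken from the binding is that the id is written in lowercase
hex.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: formats "virtio,device<N>" with N in lowercase hex.
 * Returns what snprintf() returns, i.e. the length of the full string. */
int format_virtio_type(char *buf, size_t size, unsigned int device_id)
{
    assert(buf != NULL && size > 0);
    return snprintf(buf, size, "virtio,device%x", device_id);
}
```

With this, device id 26 (file system) gives "virtio,device1a", and device id
34 (I2C adapter) gives "virtio,device22", which matches the I2C example
already present in the documentation.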

With the description updated if the NIT is correct (I don't know, maybe this
could be done on commit):

Reviewed-by: Oleksandr Tyshchenko 





Update documentation to support that as well.

Fixes: dd54ea500be8 ("docs: add documentation for generic virtio devices")
Signed-off-by: Viresh Kumar 
---
V1->V2: New patch.

  docs/man/xl.cfg.5.pod.in | 5 +++--
  1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/docs/man/xl.cfg.5.pod.in b/docs/man/xl.cfg.5.pod.in
index 10f37990be57..ea20eac0ba32 100644
--- a/docs/man/xl.cfg.5.pod.in
+++ b/docs/man/xl.cfg.5.pod.in
@@ -1608,8 +1608,9 @@ example, "type=virtio,device22" for the I2C device, whose 
device-tree binding is
  
  L<https://www.kernel.org/doc/Documentation/devicetree/bindings/i2c/i2c-virtio.yaml>
  
-For generic virtio devices, where we don't need to set special or compatible

-properties in the Device Tree, the type field must be set to "virtio,device".
+For other generic virtio devices, where we don't need to set special or
+compatible properties in the Device Tree, the type field must be set to
+"virtio,device" or "virtio,device<N>", where "N" is the virtio device id.




  
  =item B

Re: [PATCH] libxl: arm: Allow grant mappings for backends running on Dom0

2023-04-04 Thread Oleksandr Tyshchenko





On 30.03.23 11:43, Viresh Kumar wrote:

Hello Viresh


Currently, we add grant mapping related device tree properties if the
backend domain is not Dom0. While Dom0 is privileged and can do foreign
mapping for the entire guest memory, it is still okay for Dom0 to access
guest's memory via grant mappings and hence map only what is required.


ok, probably makes sense



This commit adds another parameter for virtio devices, with which they
can do forced grant mappings irrespective of the backend domain id.

Signed-off-by: Viresh Kumar 



In general patch lgtm, just a few comments below



---
  docs/man/xl.cfg.5.pod.in |  4 
  tools/libs/light/libxl_arm.c | 21 -
  tools/libs/light/libxl_types.idl |  1 +
  tools/libs/light/libxl_virtio.c  | 11 +++
  tools/xl/xl_parse.c  |  2 ++
  5 files changed, 30 insertions(+), 9 deletions(-)

diff --git a/docs/man/xl.cfg.5.pod.in b/docs/man/xl.cfg.5.pod.in
index 10f37990be57..4879f136aab8 100644
--- a/docs/man/xl.cfg.5.pod.in
+++ b/docs/man/xl.cfg.5.pod.in
@@ -1616,6 +1616,10 @@ properties in the Device Tree, the type field must be set to 
"virtio,device".
  Specifies the transport mechanism for the Virtio device, only "mmio" is
  supported for now.
  
+=item B

+
+Allows Xen Grant memory mapping to be done from Dom0.



Assuming it is disabled by default, I would add the following:

The default is (0) false.


+
  =back
  
  =item B

diff --git a/tools/libs/light/libxl_arm.c b/tools/libs/light/libxl_arm.c
index 97c80d7ed0fa..ec2f1844e9b3 100644
--- a/tools/libs/light/libxl_arm.c
+++ b/tools/libs/light/libxl_arm.c
@@ -922,7 +922,8 @@ static int make_xen_iommu_node(libxl__gc *gc, void *fdt)
  
  /* The caller is responsible to complete / close the fdt node */

  static int make_virtio_mmio_node_common(libxl__gc *gc, void *fdt, uint64_t 
base,
-uint32_t irq, uint32_t backend_domid)
+uint32_t irq, uint32_t backend_domid,
+bool forced_grant)
  {
  int res;
  gic_interrupt intr;
@@ -945,7 +946,7 @@ static int make_virtio_mmio_node_common(libxl__gc *gc, void 
*fdt, uint64_t base,
  res = fdt_property(fdt, "dma-coherent", NULL, 0);
  if (res) return res;
  
-if (backend_domid != LIBXL_TOOLSTACK_DOMID) {

+if (forced_grant || backend_domid != LIBXL_TOOLSTACK_DOMID) {
  uint32_t iommus_prop[2];
  
  iommus_prop[0] = cpu_to_fdt32(GUEST_PHANDLE_IOMMU);

@@ -959,11 +960,12 @@ static int make_virtio_mmio_node_common(libxl__gc *gc, 
void *fdt, uint64_t base,
  }
  
  static int make_virtio_mmio_node(libxl__gc *gc, void *fdt, uint64_t base,

- uint32_t irq, uint32_t backend_domid)
+ uint32_t irq, uint32_t backend_domid,
+ bool forced_grant)
  {
  int res;
  
-res = make_virtio_mmio_node_common(gc, fdt, base, irq, backend_domid);

+res = make_virtio_mmio_node_common(gc, fdt, base, irq, backend_domid, 
forced_grant);
  if (res) return res;
  
  return fdt_end_node(fdt);

@@ -1019,11 +1021,11 @@ static int make_virtio_mmio_node_gpio(libxl__gc *gc, 
void *fdt)
  
  static int make_virtio_mmio_node_device(libxl__gc *gc, void *fdt, uint64_t base,

  uint32_t irq, const char *type,
-uint32_t backend_domid)
+uint32_t backend_domid, bool 
forced_grant)
  {
  int res;
  
-res = make_virtio_mmio_node_common(gc, fdt, base, irq, backend_domid);

+res = make_virtio_mmio_node_common(gc, fdt, base, irq, backend_domid, 
forced_grant);
  if (res) return res;
  
  /* Add device specific nodes */

@@ -1363,7 +1365,7 @@ static int libxl__prepare_dtb(libxl__gc *gc, 
libxl_domain_config *d_config,
  iommu_needed = true;
  
  FDT( make_virtio_mmio_node(gc, fdt, disk->base, disk->irq,

-   disk->backend_domid) );
+   disk->backend_domid, false) );
  }
  }
  
@@ -1373,12 +1375,13 @@ static int libxl__prepare_dtb(libxl__gc *gc, libxl_domain_config *d_config,

  if (virtio->transport != LIBXL_VIRTIO_TRANSPORT_MMIO)
  continue;
  
-if (virtio->backend_domid != LIBXL_TOOLSTACK_DOMID)

+if (virtio->forced_grant || virtio->backend_domid != 
LIBXL_TOOLSTACK_DOMID)
  iommu_needed = true;
  
  FDT( make_virtio_mmio_node_device(gc, fdt, virtio->base,

virtio->irq, virtio->type,
-  virtio->backend_domid) );
+  virtio->backend_domid,
+  virtio->forced_grant) );

Re: [PATCH] libxl: fix matching of generic virtio device

2023-04-04 Thread Oleksandr Tyshchenko





On 30.03.23 10:35, Viresh Kumar wrote:


Hello Viresh



The strings won't be an exact match, and we are only looking to match
the prefix here, i.e. "virtio,device". This is already done properly in
libxl_virtio.c file, let's do the same here too.

Signed-off-by: Viresh Kumar 



It feels to me this patch wants to gain the following tag:

Fixes: 43ba5202e2ee ("libxl: add support for generic virtio device")




---
  tools/libs/light/libxl_arm.c | 12 
  1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/tools/libs/light/libxl_arm.c b/tools/libs/light/libxl_arm.c
index ddc7b2a15975..97c80d7ed0fa 100644
--- a/tools/libs/light/libxl_arm.c
+++ b/tools/libs/light/libxl_arm.c
@@ -1033,10 +1033,14 @@ static int make_virtio_mmio_node_device(libxl__gc *gc, 
void *fdt, uint64_t base,
  } else if (!strcmp(type, VIRTIO_DEVICE_TYPE_GPIO)) {
  res = make_virtio_mmio_node_gpio(gc, fdt);
  if (res) return res;
-} else if (strcmp(type, VIRTIO_DEVICE_TYPE_GENERIC)) {
-/* Doesn't match generic virtio device */
-LOG(ERROR, "Invalid type for virtio device: %s", type);
-return -EINVAL;
+} else {
+int len = sizeof(VIRTIO_DEVICE_TYPE_GENERIC) - 1;
+
+if (strncmp(type, VIRTIO_DEVICE_TYPE_GENERIC, len)) {
+/* Doesn't match generic virtio device */
+LOG(ERROR, "Invalid type for virtio device: %s", type);
+return -EINVAL;
+}



I agree that the code is now aligned with what we have in the libxl_virtio.c
file, but I am afraid I cannot connect the sentence from the commit
description:

"The strings won't be an exact match, and we are only looking to match
the prefix here, i.e. "virtio,device"."

with the sentence from docs/man/xl.cfg.5.pod.in:

"For generic virtio devices, where we don't need to set special or 
compatible properties in the Device Tree, the type field must be set to 
"virtio,device"."


I might be missing something, but shouldn't we clarify the documentation?




  }
  
  return fdt_end_node(fdt);

Re: [PATCH v2] xen/pvcalls: don't call bind_evtchn_to_irqhandler() under lock

2023-04-04 Thread Oleksandr Tyshchenko



On 03.04.23 12:27, Juergen Gross wrote:


Hello Juergen

> bind_evtchn_to_irqhandler() shouldn't be called under spinlock, as it
> can sleep.
> 
> This requires to move the calls of create_active() out of the locked
> regions. This is no problem, as the worst which could happen would be
> a spurious call of the interrupt handler, causing a spurious wake_up().
> 
> Reported-by: Dan Carpenter 
> Link: https://lore.kernel.org/lkml/Y+JUIl64UDmdkboh@kadam/
> Signed-off-by: Juergen Gross 
> ---
> V2:
> - remove stale spin_unlock() (Oleksandr Tyshchenko)


Reviewed-by: Oleksandr Tyshchenko 



> ---
>   drivers/xen/pvcalls-front.c | 46 +
>   1 file changed, 26 insertions(+), 20 deletions(-)
> 
> diff --git a/drivers/xen/pvcalls-front.c b/drivers/xen/pvcalls-front.c
> index d5d589bda243..b72ee9379d77 100644
> --- a/drivers/xen/pvcalls-front.c
> +++ b/drivers/xen/pvcalls-front.c
> @@ -227,22 +227,30 @@ static irqreturn_t pvcalls_front_event_handler(int irq, 
> void *dev_id)
>   
>   static void free_active_ring(struct sock_mapping *map);
>   
> -static void pvcalls_front_free_map(struct pvcalls_bedata *bedata,
> -struct sock_mapping *map)
> +static void pvcalls_front_destroy_active(struct pvcalls_bedata *bedata,
> +  struct sock_mapping *map)
>   {
>   int i;
>   
>   unbind_from_irqhandler(map->active.irq, map);
>   
> - spin_lock(&bedata->socket_lock);
> - if (!list_empty(&map->list))
> - list_del_init(&map->list);
> - spin_unlock(&bedata->socket_lock);
> + if (bedata) {
> + spin_lock(&bedata->socket_lock);
> + if (!list_empty(&map->list))
> + list_del_init(&map->list);
> + spin_unlock(&bedata->socket_lock);
> + }
>   
>   for (i = 0; i < (1 << PVCALLS_RING_ORDER); i++)
>   gnttab_end_foreign_access(map->active.ring->ref[i], NULL);
>   gnttab_end_foreign_access(map->active.ref, NULL);
>   free_active_ring(map);
> +}
> +
> +static void pvcalls_front_free_map(struct pvcalls_bedata *bedata,
> +struct sock_mapping *map)
> +{
> + pvcalls_front_destroy_active(bedata, map);
>   
>   kfree(map);
>   }
> @@ -433,19 +441,18 @@ int pvcalls_front_connect(struct socket *sock, struct 
> sockaddr *addr,
>   pvcalls_exit_sock(sock);
>   return ret;
>   }
> -
> - spin_lock(&bedata->socket_lock);
> - ret = get_request(bedata, &req_id);
> + ret = create_active(map, &evtchn);
>   if (ret < 0) {
> - spin_unlock(&bedata->socket_lock);
>   free_active_ring(map);
>   pvcalls_exit_sock(sock);
>   return ret;
>   }
> - ret = create_active(map, &evtchn);
> +
> + spin_lock(&bedata->socket_lock);
> + ret = get_request(bedata, &req_id);
>   if (ret < 0) {
>   spin_unlock(&bedata->socket_lock);
> - free_active_ring(map);
> + pvcalls_front_destroy_active(NULL, map);
>   pvcalls_exit_sock(sock);
>   return ret;
>   }
> @@ -821,28 +828,27 @@ int pvcalls_front_accept(struct socket *sock, struct 
> socket *newsock, int flags)
>   pvcalls_exit_sock(sock);
>   return ret;
>   }
> - spin_lock(&bedata->socket_lock);
> - ret = get_request(bedata, &req_id);
> + ret = create_active(map2, &evtchn);
>   if (ret < 0) {
> - clear_bit(PVCALLS_FLAG_ACCEPT_INFLIGHT,
> -   (void *)&map->passive.flags);
> - spin_unlock(&bedata->socket_lock);
>   free_active_ring(map2);
>   kfree(map2);
> + clear_bit(PVCALLS_FLAG_ACCEPT_INFLIGHT,
> +   (void *)&map->passive.flags);
>   pvcalls_exit_sock(sock);
>   return ret;
>   }
>   
> - ret = create_active(map2, &evtchn);
> + spin_lock(&bedata->socket_lock);
> + ret = get_request(bedata, &req_id);
>   if (ret < 0) {
> - free_active_ring(map2);
> - kfree(map2);
>   clear_bit(PVCALLS_FLAG_ACCEPT_INFLIGHT,
> (void *)&map->passive.flags);
>   spin_unlock(&bedata->socket_lock);
> + pvcalls_front_free_map(bedata, map2);
>   pvcalls_exit_sock(sock);
>   return ret;
>   }
> +
>   list_add_tail(&map2->list, &bedata->socket_mappings);
>   
>   req = RING_GET_REQUEST(&bedata->ring, req_id);
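
The reordering done by the patch above (do the work that may sleep first, then
take the lock only for the short bookkeeping step, and undo the first step
after dropping the lock on failure) can be sketched with a toy model. All
names below are hypothetical, and a plain flag stands in for the spinlock so
that a "may sleep" call made while it is held trips an assertion.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model, not driver code. */
struct conn_state {
    bool locked;      /* stand-in for bedata->socket_lock */
    bool active;      /* set once the sleeping setup step has run */
    int  free_slots;  /* stand-in for free ring request slots */
};

void state_lock(struct conn_state *s)   { assert(!s->locked); s->locked = true; }
void state_unlock(struct conn_state *s) { assert(s->locked);  s->locked = false; }

/* Stand-in for create_active(): may sleep, so it must run unlocked. */
int setup_may_sleep(struct conn_state *s)
{
    assert(!s->locked);
    s->active = true;
    return 0;
}

/* Undoes setup_may_sleep(); must not run under the lock either. */
void teardown(struct conn_state *s)
{
    assert(!s->locked);
    s->active = false;
}

/* Stand-in for get_request(): cheap, requires the lock. */
int reserve_slot(struct conn_state *s)
{
    assert(s->locked);
    if (s->free_slots == 0)
        return -1;
    s->free_slots--;
    return 0;
}

int connect_like(struct conn_state *s)
{
    int ret = setup_may_sleep(s);  /* 1. sleeping work first, unlocked */
    if (ret < 0)
        return ret;

    state_lock(s);                 /* 2. short critical section */
    ret = reserve_slot(s);
    if (ret < 0) {
        state_unlock(s);
        teardown(s);               /* 3. undo step 1 after unlocking */
        return ret;
    }
    /* ... the request would be queued here, still under the lock ... */
    state_unlock(s);
    return 0;
}
```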

Re: [PATCH] xen/scsiback: don't call scsiback_free_translation_entry() under lock

2023-03-28 Thread Oleksandr Tyshchenko



On 29.03.23 09:20, Juergen Gross wrote:

Hello Juergen


> On 28.03.23 17:47, Oleksandr Tyshchenko wrote:
>>
>>
>> On 28.03.23 11:46, Juergen Gross wrote:
>>
>> Hello Juergen
>>
>>> scsiback_free_translation_entry() shouldn't be called under spinlock,
>>> as it can sleep.
>>>
>>> This requires to split removing a translation entry from the v2p list
>>> from actually calling kref_put() for the entry.
>>>
>>> Reported-by: Dan Carpenter 
>>> Link: https://lore.kernel.org/lkml/Y+JUIl64UDmdkboh@kadam/
>>> Signed-off-by: Juergen Gross 
>>> ---
>>>    drivers/xen/xen-scsiback.c | 27 ++-
>>>    1 file changed, 14 insertions(+), 13 deletions(-)
>>>
>>> diff --git a/drivers/xen/xen-scsiback.c b/drivers/xen/xen-scsiback.c
>>> index 954188b0b858..294f29cdc7aa 100644
>>> --- a/drivers/xen/xen-scsiback.c
>>> +++ b/drivers/xen/xen-scsiback.c
>>> @@ -1010,12 +1010,6 @@ static int 
>>> scsiback_add_translation_entry(struct vscsibk_info *info,
>>>    return err;
>>>    }
>>> -static void __scsiback_del_translation_entry(struct v2p_entry *entry)
>>> -{
>>> -    list_del(&entry->l);
>>> -    kref_put(&entry->kref, scsiback_free_translation_entry);
>>> -}
>>> -
>>>    /*
>>>  Delete the translation entry specified
>>>    */
>>> @@ -1024,18 +1018,20 @@ static int 
>>> scsiback_del_translation_entry(struct vscsibk_info *info,
>>>    {
>>>    struct v2p_entry *entry;
>>>    unsigned long flags;
>>> -    int ret = 0;
>>>    spin_lock_irqsave(&info->v2p_lock, flags);
>>>    /* Find out the translation entry specified */
>>>    entry = scsiback_chk_translation_entry(info, v);
>>>    if (entry)
>>> -    __scsiback_del_translation_entry(entry);
>>> -    else
>>> -    ret = -ENOENT;
>>> +    list_del(&entry->l);
>>>    spin_unlock_irqrestore(&info->v2p_lock, flags);
>>> -    return ret;
>>> +
>>> +    if (!entry)
>>> +    return -ENOENT;
>>> +
>>> +    kref_put(&entry->kref, scsiback_free_translation_entry);
>>> +    return 0;
>>>    }
>>>    static void scsiback_do_add_lun(struct vscsibk_info *info, const 
>>> char *state,
>>> @@ -1239,14 +1235,19 @@ static void 
>>> scsiback_release_translation_entry(struct vscsibk_info *info)
>>>    {
>>>    struct v2p_entry *entry, *tmp;
>>>    struct list_head *head = &(info->v2p_entry_lists);
>>> +    struct list_head tmp_list;
>>
>>
>> I would use LIST_HEAD(tmp_list);
> 
> There is no need to initialize it, so I think I will keep it as is.
> 
>>
>>>    unsigned long flags;
>>>    spin_lock_irqsave(&info->v2p_lock, flags);
>>> -    list_for_each_entry_safe(entry, tmp, head, l)
>>> -    __scsiback_del_translation_entry(entry);
>>> +    list_cut_before(&tmp_list, head, head);
>>
>> so we just move all entries from head to tmp_list here to be processed...
> 
> Correct.
> 
>>
>>>    spin_unlock_irqrestore(&info->v2p_lock, flags);
>>
>> ... when the lock is not held, ok
>>
>> Patch LGTM, but one (maybe stupid) question to clarify.
>>
>> Why do we need to use a lock here in the first place? The
>> scsiback_release_translation_entry() gets called when the driver
>> instance is about to be removed and *after* the disconnection from
>> otherend (so no requests are expected), so what else might cause this
>> list to be accessed concurrently?
> 
> Maybe nothing, but I think it is good practice to keep the lock in order
> to avoid future code changes to cause problems.


Thanks for the explanation, it sounds reasonable to me.

Reviewed-by: Oleksandr Tyshchenko 

> 
> 
> Juergen
>

Re: [PATCH] xen/scsiback: don't call scsiback_free_translation_entry() under lock

2023-03-28 Thread Oleksandr Tyshchenko



On 28.03.23 11:46, Juergen Gross wrote:

Hello Juergen

> scsiback_free_translation_entry() shouldn't be called under spinlock,
> as it can sleep.
> 
> This requires to split removing a translation entry from the v2p list
> from actually calling kref_put() for the entry.
> 
> Reported-by: Dan Carpenter 
> Link: https://lore.kernel.org/lkml/Y+JUIl64UDmdkboh@kadam/
> Signed-off-by: Juergen Gross 
> ---
>   drivers/xen/xen-scsiback.c | 27 ++-
>   1 file changed, 14 insertions(+), 13 deletions(-)
> 
> diff --git a/drivers/xen/xen-scsiback.c b/drivers/xen/xen-scsiback.c
> index 954188b0b858..294f29cdc7aa 100644
> --- a/drivers/xen/xen-scsiback.c
> +++ b/drivers/xen/xen-scsiback.c
> @@ -1010,12 +1010,6 @@ static int scsiback_add_translation_entry(struct 
> vscsibk_info *info,
>   return err;
>   }
>   
> -static void __scsiback_del_translation_entry(struct v2p_entry *entry)
> -{
> - list_del(&entry->l);
> - kref_put(&entry->kref, scsiback_free_translation_entry);
> -}
> -
>   /*
> Delete the translation entry specified
>   */
> @@ -1024,18 +1018,20 @@ static int scsiback_del_translation_entry(struct 
> vscsibk_info *info,
>   {
>   struct v2p_entry *entry;
>   unsigned long flags;
> - int ret = 0;
>   
>   spin_lock_irqsave(&info->v2p_lock, flags);
>   /* Find out the translation entry specified */
>   entry = scsiback_chk_translation_entry(info, v);
>   if (entry)
> - __scsiback_del_translation_entry(entry);
> - else
> - ret = -ENOENT;
> + list_del(&entry->l);
>   
>   spin_unlock_irqrestore(&info->v2p_lock, flags);
> - return ret;
> +
> + if (!entry)
> + return -ENOENT;
> +
> + kref_put(&entry->kref, scsiback_free_translation_entry);
> + return 0;
>   }
>   
>   static void scsiback_do_add_lun(struct vscsibk_info *info, const char 
> *state,
> @@ -1239,14 +1235,19 @@ static void scsiback_release_translation_entry(struct 
> vscsibk_info *info)
>   {
>   struct v2p_entry *entry, *tmp;
>   struct list_head *head = &(info->v2p_entry_lists);
> + struct list_head tmp_list;


I would use LIST_HEAD(tmp_list);

>   unsigned long flags;
>   
>   spin_lock_irqsave(&info->v2p_lock, flags);
>   
> - list_for_each_entry_safe(entry, tmp, head, l)
> - __scsiback_del_translation_entry(entry);
> + list_cut_before(&tmp_list, head, head);

so we just move all entries from head to tmp_list here to be processed...

>   
>   spin_unlock_irqrestore(&info->v2p_lock, flags);

... when the lock is not held, ok

Patch LGTM, but one (maybe stupid) question to clarify.

Why do we need to use a lock here in the first place? The 
scsiback_release_translation_entry() gets called when the driver 
instance is about to be removed and *after* the disconnection from 
otherend (so no requests are expected), so what else might cause this 
list to be accessed concurrently?


> +
> + list_for_each_entry_safe(entry, tmp, &tmp_list, l) {
> + list_del(&entry->l);
> + kref_put(&entry->kref, scsiback_free_translation_entry);
> + }
>   }
>   
>   static void scsiback_remove(struct xenbus_device *dev)
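
The "move everything to a private list under the lock, then release the
entries with the lock dropped" idea discussed above can be sketched as
follows. This is a toy model with hypothetical names: a flag stands in for
v2p_lock and a singly linked list stands in for the kernel list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct entry {
    struct entry *next;
    bool released;
};

struct table {
    bool locked;          /* stand-in for info->v2p_lock */
    struct entry *head;   /* stand-in for info->v2p_entry_lists */
};

/* Stand-in for the kref_put() release path: may sleep, so must run unlocked. */
void release_entry(const struct table *t, struct entry *e)
{
    assert(!t->locked);
    e->released = true;
}

/* Detach every entry under the lock, then release them outside of it. */
size_t release_all(struct table *t)
{
    struct entry *detached, *next;
    size_t count = 0;

    t->locked = true;      /* spin_lock_irqsave() in the real driver */
    detached = t->head;    /* move the whole list to a private head */
    t->head = NULL;
    t->locked = false;     /* spin_unlock_irqrestore() */

    for (; detached != NULL; detached = next) {
        next = detached->next;  /* read before the entry may go away */
        detached->next = NULL;
        release_entry(t, detached);
        count++;
    }
    return count;
}
```

Once the entries are detached, nobody else can reach them through the shared
list, which is why the release step no longer needs the lock.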

Re: [PATCH] xen/pciback: don't call pcistub_device_put() under lock

2023-03-28 Thread Oleksandr Tyshchenko



On 28.03.23 11:45, Juergen Gross wrote:

Hello Juergen

> pcistub_device_put() shouldn't be called under spinlock, as it can
> sleep.
> 
> For this reason pcistub_device_get_pci_dev() needs to be modified:
> instead of always calling pcistub_device_get() just do the call of
> pcistub_device_get() only if it is really needed. This removes the
> need to call pcistub_device_put().
> 
> Reported-by: Dan Carpenter 
> Link: https://lore.kernel.org/lkml/Y+JUIl64UDmdkboh@kadam/
> Signed-off-by: Juergen Gross 


Reviewed-by: Oleksandr Tyshchenko 

> ---
>   drivers/xen/xen-pciback/pci_stub.c | 6 ++
>   1 file changed, 2 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/xen/xen-pciback/pci_stub.c 
> b/drivers/xen/xen-pciback/pci_stub.c
> index bba527620507..e34b623e4b41 100644
> --- a/drivers/xen/xen-pciback/pci_stub.c
> +++ b/drivers/xen/xen-pciback/pci_stub.c
> @@ -194,8 +194,6 @@ static struct pci_dev *pcistub_device_get_pci_dev(struct 
> xen_pcibk_device *pdev,
>   struct pci_dev *pci_dev = NULL;
>   unsigned long flags;
>   
> - pcistub_device_get(psdev);
> -
>   spin_lock_irqsave(&psdev->lock, flags);
>   if (!psdev->pdev) {
>   psdev->pdev = pdev;
> @@ -203,8 +201,8 @@ static struct pci_dev *pcistub_device_get_pci_dev(struct 
> xen_pcibk_device *pdev,
>   }
>   spin_unlock_irqrestore(&psdev->lock, flags);
>   
> - if (!pci_dev)
> - pcistub_device_put(psdev);
> + if (pci_dev)
> + pcistub_device_get(psdev);
>   
>   return pci_dev;
>   }
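
The change above boils down to "take the reference only when the claim
succeeded" instead of "take it up front and drop it on failure". A minimal
sketch of that shape, using hypothetical names and a plain counter in place of
the kref:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct stub_dev {
    bool locked;        /* stand-in for psdev->lock */
    int refcount;       /* stand-in for the kref */
    const void *owner;  /* stand-in for psdev->pdev */
    int device;         /* stand-in for the struct pci_dev */
};

/* Returns the device if it was free and is now owned by 'owner', else NULL. */
int *claim_device(struct stub_dev *s, const void *owner)
{
    int *dev = NULL;

    assert(owner != NULL);

    s->locked = true;
    if (s->owner == NULL) {
        s->owner = owner;
        dev = &s->device;
    }
    s->locked = false;

    /* Take the reference only on success, so that no failure path ever
     * has to drop one. */
    if (dev != NULL)
        s->refcount++;

    return dev;
}
```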

Re: [PATCH] xen/pvcalls: don't call bind_evtchn_to_irqhandler() under lock

2023-03-28 Thread Oleksandr Tyshchenko





On 28.03.23 12:39, Juergen Gross wrote:

Hello Juergen



bind_evtchn_to_irqhandler() shouldn't be called under spinlock, as it
can sleep.

This requires to move the calls of create_active() out of the locked
regions. This is no problem, as the worst which could happen would be
a spurious call of the interrupt handler, causing a spurious wake_up().

Reported-by: Dan Carpenter 
Link: https://lore.kernel.org/lkml/Y+JUIl64UDmdkboh@kadam/
Signed-off-by: Juergen Gross 
---
  drivers/xen/pvcalls-front.c | 46 ++---
  1 file changed, 27 insertions(+), 19 deletions(-)

diff --git a/drivers/xen/pvcalls-front.c b/drivers/xen/pvcalls-front.c
index d5d589bda243..6e5d712e3115 100644
--- a/drivers/xen/pvcalls-front.c
+++ b/drivers/xen/pvcalls-front.c
@@ -227,22 +227,31 @@ static irqreturn_t pvcalls_front_event_handler(int irq, 
void *dev_id)
  
  static void free_active_ring(struct sock_mapping *map);
  
-static void pvcalls_front_free_map(struct pvcalls_bedata *bedata,

-  struct sock_mapping *map)
+static void pvcalls_front_destroy_active(struct pvcalls_bedata *bedata,
+struct sock_mapping *map)
  {
int i;
  
  	unbind_from_irqhandler(map->active.irq, map);
  
-	spin_lock(&bedata->socket_lock);

-   if (!list_empty(&map->list))
-   list_del_init(&map->list);
-   spin_unlock(&bedata->socket_lock);
+   if (bedata) {
+   spin_lock(&bedata->socket_lock);
+   if (!list_empty(&map->list))
+   list_del_init(&map->list);
+   spin_unlock(&bedata->socket_lock);
+   }
  
  	for (i = 0; i < (1 << PVCALLS_RING_ORDER); i++)

gnttab_end_foreign_access(map->active.ring->ref[i], NULL);
gnttab_end_foreign_access(map->active.ref, NULL);
+
free_active_ring(map);
+}
+
+static void pvcalls_front_free_map(struct pvcalls_bedata *bedata,
+  struct sock_mapping *map)
+{
+   pvcalls_front_destroy_active(bedata, map);
  
  	kfree(map);

  }
@@ -433,19 +442,18 @@ int pvcalls_front_connect(struct socket *sock, struct 
sockaddr *addr,
pvcalls_exit_sock(sock);
return ret;
}
-
-   spin_lock(&bedata->socket_lock);
-   ret = get_request(bedata, &req_id);
+   ret = create_active(map, &evtchn);
if (ret < 0) {
-   spin_unlock(&bedata->socket_lock);
free_active_ring(map);
pvcalls_exit_sock(sock);
return ret;
}
-   ret = create_active(map, &evtchn);
+
+   spin_lock(&bedata->socket_lock);
+   ret = get_request(bedata, &req_id);
if (ret < 0) {
spin_unlock(&bedata->socket_lock);
-   free_active_ring(map);
+   pvcalls_front_destroy_active(NULL, map);
pvcalls_exit_sock(sock);
return ret;
}
@@ -821,28 +829,28 @@ int pvcalls_front_accept(struct socket *sock, struct 
socket *newsock, int flags)
pvcalls_exit_sock(sock);
return ret;
}
-   spin_lock(&bedata->socket_lock);
-   ret = get_request(bedata, &req_id);
+   ret = create_active(map2, &evtchn);
if (ret < 0) {
+   free_active_ring(map2);
+   kfree(map2);
clear_bit(PVCALLS_FLAG_ACCEPT_INFLIGHT,
  (void *)&map->passive.flags);
spin_unlock(&bedata->socket_lock);



Looks like we also need to remove spin_unlock() above, correct?



-   free_active_ring(map2);
-   kfree(map2);
pvcalls_exit_sock(sock);
return ret;
}
  
-	ret = create_active(map2, &evtchn);

+   spin_lock(&bedata->socket_lock);
+   ret = get_request(bedata, &req_id);
if (ret < 0) {
-   free_active_ring(map2);
-   kfree(map2);
clear_bit(PVCALLS_FLAG_ACCEPT_INFLIGHT,
  (void *)&map->passive.flags);
spin_unlock(&bedata->socket_lock);
+   pvcalls_front_free_map(bedata, map2);
pvcalls_exit_sock(sock);
return ret;
}
+
list_add_tail(&map2->list, &bedata->socket_mappings);
  
  	req = RING_GET_REQUEST(&bedata->ring, req_id);

[PATCH] drm/virtio: Pass correct device to dma_sync_sgtable_for_device()

2023-02-24 Thread Oleksandr Tyshchenko

From: Oleksandr Tyshchenko 

The "vdev->dev.parent" should be used instead of "vdev->dev" as a device
for which to perform the DMA operation in both
virtio_gpu_cmd_transfer_to_host_2d(3d).

This is because the virtio-gpu device "vdev->dev" doesn't really have DMA OPS
assigned to it, but the parent (virtio-pci or virtio-mmio) device
"vdev->dev.parent" has. Moreover, the sgtable that the code is trying to
sync here was previously mapped for the parent device (using its DMA OPS)
at:
virtio_gpu_object_shmem_init()->drm_gem_shmem_get_pages_sgt()->
dma_map_sgtable(), so should be synced here for the same parent device.

Fixes: b5c9ed70d1a9 ("drm/virtio: Improve DMA API usage for shmem BOs")
Signed-off-by: Oleksandr Tyshchenko 
---
This patch fixes the following issue when running on top of Xen with 
CONFIG_XEN_VIRTIO=y (patch was only tested in Xen environment (ARM64 guest)
w/ and w/o using Xen grants for virtio):

[0.830235] [drm] pci: virtio-gpu-pci detected at :00:03.0
[0.832078] [drm] features: +virgl +edid -resource_blob -host_visible
[0.832084] [drm] features: -context_init
[0.837320] [drm] number of scanouts: 1
[0.837460] [drm] number of cap sets: 2
[0.904372] [drm] cap set 0: id 1, max-version 1, max-size 308
[0.905399] [drm] cap set 1: id 2, max-version 2, max-size 696
[0.907202] [drm] Initialized virtio_gpu 0.1.0 0 for :00:03.0 on minor 0
[0.927241] virtio-pci :00:03.0: [drm] 
drm_plane_enable_fb_damage_clips() not called
[0.927279] Unable to handle kernel paging request at virtual address 
c0053000
[0.927284] Mem abort info:
[0.927286]   ESR = 0x96000144
[0.927289]   EC = 0x25: DABT (current EL), IL = 32 bits
[0.927293]   SET = 0, FnV = 0
[0.927295]   EA = 0, S1PTW = 0
[0.927298]   FSC = 0x04: level 0 translation fault
[0.927301] Data abort info:
[0.927303]   ISV = 0, ISS = 0x0144
[0.927305]   CM = 1, WnR = 1
[0.927308] swapper pgtable: 4k pages, 48-bit VAs, pgdp=4127f000
[0.927312] [c0053000] pgd=, p4d=
[0.927323] Internal error: Oops: 96000144 [#1] PREEMPT SMP
[0.927329] Modules linked in:
[0.927336] CPU: 0 PID: 1 Comm: swapper/0 Tainted: GW  
6.2.0-rc4-yocto-standard #1
[0.927343] Hardware name: XENVM-4.18 (DT)
[0.927346] pstate: 6005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[0.927352] pc : dcache_clean_poc+0x20/0x38
[0.927370] lr : arch_sync_dma_for_device+0x24/0x30
[0.927379] sp : 8972b3e0
[0.927381] x29: 8972b3e0 x28: 01aa8a00 x27: 
[0.927389] x26:  x25: 02815010 x24: 
[0.927396] x23: 890f9078 x22: 0001 x21: 0002
[0.927403] x20: 02b6b580 x19: 80053000 x18: 
[0.927410] x17:  x16:  x15: 8963b94e
[0.927416] x14: 0001 x13: 8963b93b x12: 64615f616d645f67
[0.927423] x11: 89513110 x10: 000a x9 : 8972b360
[0.927430] x8 : 895130c8 x7 : 8972b150 x6 : 000c
[0.927436] x5 :  x4 :  x3 : 003f
[0.927443] x2 : 0040 x1 : c0067000 x0 : c0053000
[0.927450] Call trace:
[0.927452]  dcache_clean_poc+0x20/0x38
[0.927459]  dma_direct_sync_sg_for_device+0x124/0x130
[0.927466]  dma_sync_sg_for_device+0x64/0xd0
[0.927475]  virtio_gpu_cmd_transfer_to_host_2d+0x10c/0x110
[0.927483]  virtio_gpu_primary_plane_update+0x340/0x3d0
[0.927490]  drm_atomic_helper_commit_planes+0xe8/0x20c
[0.927497]  drm_atomic_helper_commit_tail+0x54/0xa0
[0.927503]  commit_tail+0x160/0x190
[0.927507]  drm_atomic_helper_commit+0x16c/0x180
[0.927513]  drm_atomic_commit+0xa8/0xe0
[0.927521]  drm_client_modeset_commit_atomic+0x200/0x260
[0.927529]  drm_client_modeset_commit_locked+0x5c/0x1a0
[0.927536]  drm_client_modeset_commit+0x30/0x60
[0.927540]  drm_fb_helper_set_par+0xc8/0x120
[0.927548]  fbcon_init+0x3b8/0x510
[0.927557]  visual_init+0xb4/0x104
[0.927565]  do_bind_con_driver.isra.0+0x1c4/0x394
[0.927572]  do_take_over_console+0x144/0x1fc
[0.927577]  do_fbcon_takeover+0x6c/0xe4
[0.927583]  fbcon_fb_registered+0x1e4/0x1f0
[0.927588]  register_framebuffer+0x214/0x310
[0.927592]  __drm_fb_helper_initial_config_and_unlock+0x33c/0x540
[0.927599]  drm_fb_helper_initial_config+0x4c/0x60
[0.927604]  drm_fbdev_client_hotplug+0xc4/0x150
[0.927609]  drm_fbdev_generic_setup+0x90/0x154
[0.927614]  virtio_gpu_probe+0xc8/0x16c
[0.927621]  virtio_dev_probe+0x19c/0x240
[0.927629]  really_probe+0xbc/0x2dc
[0.927637]  __driver_probe_device+0x78/0xe0
[0.927641]  driver_probe_d

Re: [Discussion] Xen grants and access permissions

2023-02-19 Thread Oleksandr Tyshchenko

Hello Viresh.

[CCed Jürgen who might have some thoughts]
[Sorry for the possible format issues]

On Thu, Feb 16, 2023 at 1:36 PM Andrew Cooper 
wrote:

> On 16/02/2023 11:13 am, Viresh Kumar wrote:
> > Hi Oleksandr,
> >
> > As you already know, I am looking at how we can integrate the Xen
> > grants work in our implementation of Rust based Xen vhost frontend [1].
> >
> > The hypervisor independent vhost-user backends [2] talk to
> > xen-vhost-frontend using the standard vhost-user protocol [3]. Every
> > memory region that the backends get access to are sent to it by the
> > frontend as memory region descriptors, which contain only address and
> > size information and lack any permission flags.
> >
> > I noticed that with Xen grants, there are strict memory access
> > restrictions, where a memory region may be marked READ only and we
> > can't map it as RW anymore, trying that just fails. Because the
> > standard vhost-user protocol doesn't have any permission flags, the
> > vhost libraries (in Rust) can't do anything else but try to map
> > everything as RW.
> >
> > I am wondering how do I proceed on this as I am very much stuck here.
> >
>
> (unhelpful comment) This is what happens when people try to reinvent the
> wheel a little more square than it was before.
>
> If the guest grants the page read-only, then you can only map it read
> only.  Anything else is a violation of the security model.
>
> So either you need to adjust the guest to always grant read/write, or
> you need to teach virtio that read only is actually a real concept.
>
> ~Andrew
>

Below are my thoughts which might be wrong.

I see the problem, but cannot add anything else to what Andrew has already
said. If the frontend maps a page as RO then a backend (device) should
map it with the same attribute and perform only read access to it.
Restricted memory access using Xen grants is a kind of SW IOMMU,
no more no less, so I assume the very same problem would take place if we
would implement a virtio-iommu for Xen...

Let's assume that we cannot modify a guest to map *everything* as RW. Although
the permission flags are not communicated explicitly in the classic case, the
backend usually knows how a particular frontend page is supposed to be mapped
(at least I didn't face any permission-related issues when using Xen grants
either with the standalone virtio-disk backend or with Qemu-based backends
using Jürgen's PoC):

1. The virtqueues are mapped as RW (because they are supposed to be written by
both ends)
2. The payload I/O buffers (virtio ring descriptors) fortunately have a
flag field, so it is always known whether they are WO or RO
3. The indirect descriptor is mapped as RO (because it contains a list of
other descriptors, so nothing to be written there)
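
A minimal sketch of point 2 and point 3 above, assuming a split virtqueue. The
flag values are the ones defined by the virtio specification; the enum and the
helper name are hypothetical and only illustrate how a backend could pick the
mapping mode from the descriptor flags:

```c
#include <stdint.h>

/* Descriptor flag values from the virtio specification (split virtqueue). */
#define VIRTQ_DESC_F_NEXT      1
#define VIRTQ_DESC_F_WRITE     2
#define VIRTQ_DESC_F_INDIRECT  4

/* Hypothetical mapping modes a backend could request for a granted page. */
enum grant_map_mode { GRANT_MAP_RO, GRANT_MAP_RW };

/* Choose the weakest mapping that still lets the device do its job. */
static enum grant_map_mode desc_map_mode(uint16_t flags)
{
    /* An indirect table is only read by the device. */
    if (flags & VIRTQ_DESC_F_INDIRECT)
        return GRANT_MAP_RO;

    /* F_WRITE marks a buffer the device writes into. */
    if (flags & VIRTQ_DESC_F_WRITE)
        return GRANT_MAP_RW;

    /* Everything else is only read by the device. */
    return GRANT_MAP_RO;
}
```

A grant is either read-only or read/write, so a device-writable (WO) buffer
ends up with a read/write mapping in this sketch.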

So I am wondering whether the standard vhost-user protocol can be extended to
pass some additional information for a memory region.

If and only if the standard vhost-user protocol cannot be extended to
communicate the required information for a memory region *and*
there is a need to use Xen grants for virtio (so it is completely unclear
what that memory region actually represents and how it should be mapped),
one (crazy?) idea could be to try to map everything as RW and fall back to
RO if the mapping attempt fails. Or, perhaps, as an alternative,
to map as RW only those pages which are going to be modified and map anything
else as RO. Although I am not quite sure whether it would be a good idea.
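
To make the "try RW, then fall back to RO" idea concrete, here is a rough
sketch. It does not use any real Xen or vhost-user API: map_fn, struct mapping
and demo_map() are hypothetical stand-ins (demo_map() pretends that odd grant
references were granted read-only), so treat it as an illustration only:

```c
#include <stddef.h>

/* Hypothetical mapper: returns a pointer, or NULL if the mapping is refused. */
typedef void *(*map_fn)(unsigned int ref, int writable, void *opaque);

struct mapping {
    void *addr;
    int writable;   /* 1 if the RW attempt succeeded, 0 after the RO fallback */
};

/* Try a read/write mapping first and fall back to read-only if it fails. */
static int map_rw_or_ro(map_fn map, void *opaque, unsigned int ref,
                        struct mapping *out)
{
    out->addr = map(ref, 1, opaque);
    out->writable = 1;

    if (!out->addr) {
        out->addr = map(ref, 0, opaque);
        out->writable = 0;
    }

    return out->addr ? 0 : -1;
}

/* Stand-in for a grant mapper: odd refs are treated as read-only grants. */
static void *demo_map(unsigned int ref, int writable, void *opaque)
{
    static char page[16];

    (void)opaque;
    if (writable && (ref & 1))
        return NULL;

    return page;
}
```

The caller still has to remember the resulting mode (the writable field) and
never write through a mapping that ended up read-only.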

-- 
Regards,

Oleksandr Tyshchenko

[PATCH] xen/grant-dma-iommu: Implement a dummy probe_device() callback

2023-02-08 Thread Oleksandr Tyshchenko

From: Oleksandr Tyshchenko 

Update the stub IOMMU driver (whose main purpose is to reuse generic
IOMMU device-tree bindings by the Xen grant DMA-mapping layer on Arm)
according to the recent changes done in the following
commit 57365a04c921 ("iommu: Move bus setup to IOMMU device registration").

With probe_device() callback being called during IOMMU device registration,
the uninitialized callback just leads to the "kernel NULL pointer
dereference" issue during boot. Fix that by adding a dummy callback.

Looks like the release_device() callback is not mandatory to be
implemented as IOMMU framework makes sure that callback is initialized
before dereferencing.

Reported-by: Viresh Kumar 
Signed-off-by: Oleksandr Tyshchenko 
---
 drivers/xen/grant-dma-iommu.c | 11 +--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/drivers/xen/grant-dma-iommu.c b/drivers/xen/grant-dma-iommu.c
index 16b8bc0c0b33..6a9fe02c6bfc 100644
--- a/drivers/xen/grant-dma-iommu.c
+++ b/drivers/xen/grant-dma-iommu.c
@@ -16,8 +16,15 @@ struct grant_dma_iommu_device {
struct iommu_device iommu;
 };
 
-/* Nothing is really needed here */
-static const struct iommu_ops grant_dma_iommu_ops;
+static struct iommu_device *grant_dma_iommu_probe_device(struct device *dev)
+{
+   return ERR_PTR(-ENODEV);
+}
+
+/* Nothing is really needed here except a dummy probe_device callback */
+static const struct iommu_ops grant_dma_iommu_ops = {
+   .probe_device = grant_dma_iommu_probe_device,
+};
 
 static const struct of_device_id grant_dma_iommu_of_match[] = {
{ .compatible = "xen,grant-dma" },
-- 
2.34.1
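
The difference between the two callbacks described in the commit message can
be illustrated with a small userspace sketch. Nothing below is kernel code:
the *_sketch types and helpers are hypothetical, and err_ptr()/is_err() are
simplified imitations of the kernel's ERR_PTR()/IS_ERR() helpers:

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified userspace imitations of the kernel's error-pointer helpers. */
static inline void *err_ptr(intptr_t err) { return (void *)err; }
static inline intptr_t ptr_err(const void *p) { return (intptr_t)p; }
static inline int is_err(const void *p)
{
    return (uintptr_t)p >= (uintptr_t)-4095;
}

struct device_sketch { const char *name; };
struct iommu_device_sketch { int unused; };

struct ops_sketch {
    /* Called unconditionally by the caller below, so it must not be NULL. */
    struct iommu_device_sketch *(*probe_device)(struct device_sketch *dev);
    /* Optional: the caller checks it before dereferencing. */
    void (*release_device)(struct device_sketch *dev);
};

/* Dummy callback: report that this device is not handled. */
static struct iommu_device_sketch *dummy_probe_device(struct device_sketch *dev)
{
    (void)dev;
    return err_ptr(-ENODEV);
}

static int probe_sketch(const struct ops_sketch *ops, struct device_sketch *dev)
{
    /* A NULL probe_device would be dereferenced right here. */
    struct iommu_device_sketch *idev = ops->probe_device(dev);

    return is_err(idev) ? (int)ptr_err(idev) : 0;
}

static void release_sketch(const struct ops_sketch *ops,
                           struct device_sketch *dev)
{
    if (ops->release_device)
        ops->release_device(dev);
}
```

With only probe_device filled in, probe_sketch() returns -ENODEV and
release_sketch() is a no-op, which mirrors the behaviour the patch relies on.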

Re: [XEN v4] xen/arm: Probe the load/entry point address of an uImage correctly

2023-01-09 Thread Oleksandr Tyshchenko





On 08.01.23 18:06, Julien Grall wrote:

Hello Julien, Ayan, all


Hi Ayan,

On 21/12/2022 18:53, Ayan Kumar Halder wrote:
Currently, kernel_uimage_probe() does not read the load/entry point address
set in the uImage header. Thus, info->zimage.start is 0 (default value). This
causes kernel_zimage_place() to treat the binary (contained within uImage)
as position independent executable. Thus, it loads it at an incorrect
address.

The correct approach would be to read "uimage.load" and set
info->zimage.start. This will ensure that the binary is loaded at the
correct address. Also, read "uimage.ep" and set info->entry (ie kernel entry
address).

If user provides load address (ie "uimage.load") as 0x0, then the image is
treated as position independent executable. Xen can load such an image at
any address it considers appropriate. A position independent executable
cannot have a fixed entry point address.

This behavior is applicable for both arm32 and arm64 platforms.

Earlier for arm32 and arm64 platforms, Xen was ignoring the load and entry
point address set in the uImage header. With this commit, Xen will use them.

This makes the behavior of Xen consistent with uboot for uimage headers.
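
For illustration, reading the two fields from a legacy uImage header could
look roughly like the sketch below. The 64-byte layout, the big-endian
encoding and the magic value come from the U-Boot legacy image format; the
struct and function names are hypothetical and this is not the Xen code:

```c
#include <stddef.h>
#include <stdint.h>

#define UIMAGE_MAGIC    0x27051956u
#define UIMAGE_HDR_LEN  64
#define UIMAGE_OFF_LOAD 16   /* ih_load: load address */
#define UIMAGE_OFF_EP   20   /* ih_ep: entry point address */

struct uimage_info_sketch {
    uint32_t load;
    uint32_t entry;
    int position_independent;   /* load address 0x0 means "place anywhere" */
};

/* All legacy uImage header fields are stored big-endian. */
static uint32_t be32_at(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

static int uimage_parse_sketch(const uint8_t *hdr, size_t len,
                               struct uimage_info_sketch *info)
{
    if (len < UIMAGE_HDR_LEN || be32_at(hdr) != UIMAGE_MAGIC)
        return -1;

    info->load = be32_at(hdr + UIMAGE_OFF_LOAD);
    info->entry = be32_at(hdr + UIMAGE_OFF_EP);
    info->position_independent = (info->load == 0);

    return 0;
}
```

When position_independent is set, the entry value from the header should not
be used as a fixed address, in line with the description above.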


The changes look good to me (with a few comments below). That said, 
before acking the code, I would like an existing user of uImage (maybe 
EPAM or Arm?) to confirm they are happy with the change.



I have just re-checked current patch in our typical Xen based 
environment (no dom0less, Linux in Dom0) and didn't notice issues with 
it. But we use zImage for Dom0's kernel, so kernel_uimage_probe() is not 
called.




I CCed Dmytro Firsov who is playing with Zephyr in Dom0 and *might* use 
uImage.



[snip]

Re: Virtio-disk updates for latest Xen ?

2022-12-19 Thread Oleksandr Tyshchenko





On 15.12.22 06:22, Viresh Kumar wrote:

Hello Viresh



On 14-12-22, 17:01, Oleksandr Tyshchenko wrote:

Today I had a chance to check virtio-disk on my H/W using new Xen branch
which does include Juergen's series with commit 3a96013a3e17
("tools/xenstore: reduce number of watch events").

Very interesting, but I didn't manage to reproduce an issue similar to
what you had already faced with the rust counterparts before (caused by the
lack of Xenstore watches?). Note that I didn't debug exactly which events I
had got during guest creation/destruction, I just made sure that backend
worked as before. I checked that by running the backend in Dom0 and DomD and
performed a couple of guest power cycles (reboot, destroy/create).

If you could provide the debug patch which you seem to use to print incoming
events which you described in previous email, I think I would be able to
re-check the situation at my side more deeply.


This should be enough to see the new changes I believe.

diff --git a/xs_dev.c b/xs_dev.c
index a6c8403cfe84..8525c6512299 100755
--- a/xs_dev.c
+++ b/xs_dev.c
@@ -421,6 +421,8 @@ static int xenstore_poll_be_watch(struct xs_dev *dev)
  if (!vec)
  return -1;
  
+printf("%s: %s\n", vec[XS_WATCH_PATH], dev->path);

+
  if (!strcmp(vec[XS_WATCH_PATH], dev->path))
  rc = xenstore_get_fe_domid(dev);



Thanks. This print does not provide much information in my case 
(virtio-disk). All I see here in both cases (with old and new Xen) 
during guest creation is the following:

"backend/virtio_disk: backend/virtio_disk"

But if I modify xenstore_get_fe_domid() to always return 0 and, as 
a result, get stuck here, I can see subsequent events for other 
paths here.


I agree that with new Xen (after commit 3a96013a3e17 "tools/xenstore: 
reduce number of watch events"), some events are missing now, but I
still don't see an issue with virtio-disk and can't see why it is going 
to be an issue. The code doesn't wait for the last "finalizing" event 
for the root directory "backend/virtio_disk", it goes ahead to find the 
FE domid right after receiving the first event.


new Xen:
root@generic-armv8-xt-dom0:~# xl console DomD
backend/virtio_disk/6/51713: backend/virtio_disk
backend/virtio_disk/6/51713/frontend: backend/virtio_disk
backend/virtio_disk/6/51713/params: backend/virtio_disk
backend/virtio_disk/6/51713/frontend-id: backend/virtio_disk
backend/virtio_disk/6/51713/online: backend/virtio_disk
backend/virtio_disk/6/51713/removable: backend/virtio_disk
backend/virtio_disk/6/51713/bootable: backend/virtio_disk
backend/virtio_disk/6/51713/state: backend/virtio_disk
backend/virtio_disk/6/51713/dev: backend/virtio_disk
backend/virtio_disk/6/51713/type: backend/virtio_disk
backend/virtio_disk/6/51713/mode: backend/virtio_disk
backend/virtio_disk/6/51713/device-type: backend/virtio_disk
backend/virtio_disk/6/51713/discard-enable: backend/virtio_disk
backend/virtio_disk/6/51713/specification: backend/virtio_disk
backend/virtio_disk/6/51713/transport: backend/virtio_disk
backend/virtio_disk/6/51713/base: backend/virtio_disk
backend/virtio_disk/6/51713/irq: backend/virtio_disk

old Xen:
root@generic-armv8-xt-dom0:~# xl console DomD
backend/virtio_disk/4/51713: backend/virtio_disk
backend/virtio_disk/4: backend/virtio_disk
backend/virtio_disk: backend/virtio_disk
backend/virtio_disk/4/51713/frontend: backend/virtio_disk
backend/virtio_disk/4/51713/params: backend/virtio_disk
backend/virtio_disk/4/51713/frontend-id: backend/virtio_disk
backend/virtio_disk/4/51713/online: backend/virtio_disk
backend/virtio_disk/4/51713/removable: backend/virtio_disk
backend/virtio_disk/4/51713/bootable: backend/virtio_disk
backend/virtio_disk/4/51713/state: backend/virtio_disk
backend/virtio_disk/4/51713/dev: backend/virtio_disk
backend/virtio_disk/4/51713/type: backend/virtio_disk
backend/virtio_disk/4/51713/mode: backend/virtio_disk
backend/virtio_disk/4/51713/device-type: backend/virtio_disk
backend/virtio_disk/4/51713/discard-enable: backend/virtio_disk
backend/virtio_disk/4/51713/specification: backend/virtio_disk
backend/virtio_disk/4/51713/transport: backend/virtio_disk
backend/virtio_disk/4/51713/base: backend/virtio_disk
backend/virtio_disk/4/51713/irq: backend/virtio_disk

At the same time the code to automatically determine the FE domid is not 
ideal, for instance, it is based on the assumption
that new FE domid should be always greater than old FE domid (which as I 
understand *might* be wrong) and contains a delay, so it needs to be improved.
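
One possible direction, shown only as a sketch: the watch events in the logs
above already carry the FE domid and the device id in the path
("backend/virtio_disk/6/51713/..."), so they could be parsed from the event
itself. The helper below is hypothetical and is not part of virtio-disk:

```c
#include <stdio.h>
#include <string.h>

/*
 * Extract "<domid>/<devid>" from a watch path located below the backend
 * root, e.g. "backend/virtio_disk/6/51713/frontend".
 * Returns 0 on success, -1 if the path does not have that shape.
 */
static int parse_be_watch_path(const char *root, const char *path,
                               unsigned int *domid, unsigned int *devid)
{
    size_t n = strlen(root);

    /* The path must start with the root followed by a '/'. */
    if (strncmp(path, root, n) != 0 || path[n] != '/')
        return -1;

    /* Both components are needed, so "root/<domid>" alone is rejected. */
    if (sscanf(path + n + 1, "%u/%u", domid, devid) != 2)
        return -1;

    return 0;
}
```

Events for the root directory itself or for "backend/virtio_disk/<domid>"
are rejected by this helper, so it does not depend on whether those events
are delivered.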

Re: Virtio-disk updates for latest Xen ?

2022-12-14 Thread Oleksandr Tyshchenko





On 07.12.22 05:59, Viresh Kumar wrote:

Hello Viresh

First of all, sorry for the late response.
Second, thank you for the investigation.



On 07-12-22, 05:51, Viresh Kumar wrote:

I am not sure how to get this working, as there is no finalizing event
for the directory. Maybe our design is broken from the start and we
need to do it properly in some recommended way ?


For now this is what I have done to make it work:

diff --git a/xs_dev.c b/xs_dev.c
index a6c8403cfe84..4643394a52a2 100755
--- a/xs_dev.c
+++ b/xs_dev.c
@@ -413,20 +413,7 @@ static int xenstore_get_fe_domid(struct xs_dev *dev)
  
  static int xenstore_poll_be_watch(struct xs_dev *dev)

  {
-unsigned int num;
-char **vec;
-int rc = 0;
-
-vec = xs_read_watch(dev->xsh, &num);
-if (!vec)
-return -1;
-
-if (!strcmp(vec[XS_WATCH_PATH], dev->path))
-rc = xenstore_get_fe_domid(dev);
-
-free(vec);
-
-return rc;
+return xenstore_get_fe_domid(dev);
  }

This runs xenstore_get_fe_domid() for each event in the path
"backend/virtio", and in my case it passes with the second event
itself, which came for "backend/virtio/1/0" and this code doesn't run
after that.

Note that I have tested this with my rust counterpart which received a
similar change, I didn't test virtio-disk directly.



Today I had a chance to check virtio-disk on my H/W using new Xen branch 
which does include Juergen's series with commit 3a96013a3e17 
("tools/xenstore: reduce number of watch events").


Very interesting, but I didn't manage to reproduce an issue similar 
to what you had already faced with the rust counterparts before (caused 
by the lack of Xenstore watches?). Note that I didn't debug exactly which 
events I had got during guest creation/destruction, I just made sure 
that backend worked as before. I checked that by running the backend in 
Dom0 and DomD and performed a couple of guest power cycles (reboot, 
destroy/create).


If you could provide the debug patch which you seem to use to print 
incoming events which you described in previous email, I think I would 
be able to re-check the situation at my side more deeply.

Re: [PATCH V9 1/3] libxl: Add support for generic virtio device

2022-12-13 Thread Oleksandr Tyshchenko





On 13.12.22 12:08, Viresh Kumar wrote:

Hello Viresh


This patch adds basic support for configuring and assisting generic
Virtio backends, which could run in any domain.

An example of domain configuration for mmio based Virtio I2C device is:
virtio = ["type=virtio,device22,transport=mmio"]

To make this work on Arm, allocate Virtio MMIO params (IRQ and memory
region) and pass them to the backend and update guest device-tree to
create a DT node for the Virtio devices.

Add special support for I2C and GPIO devices, which require the
"compatible" DT property to be set, among other device specific
properties. Support for generic virtio devices is also added, which just
need a MMIO node but not any special DT properties, for such devices the
user needs to pass "virtio,device" in the "type" string.

The parsing of generic virtio device configurations will be done in a
separate commit.

Reviewed-by: Anthony PERARD 
Reviewed-by: Oleksandr Tyshchenko 
Signed-off-by: Viresh Kumar 
---
  tools/libs/light/Makefile |   1 +
  tools/libs/light/libxl_arm.c  | 100 +++
  tools/libs/light/libxl_create.c   |   4 +
  tools/libs/light/libxl_internal.h |   6 +
  tools/libs/light/libxl_types.idl  |  18 +++
  tools/libs/light/libxl_types_internal.idl |   1 +
  tools/libs/light/libxl_virtio.c   | 144 ++
  7 files changed, 274 insertions(+)
  create mode 100644 tools/libs/light/libxl_virtio.c

diff --git a/tools/libs/light/Makefile b/tools/libs/light/Makefile
index 374be1cfab25..4fddcc6f51d7 100644
--- a/tools/libs/light/Makefile
+++ b/tools/libs/light/Makefile
@@ -106,6 +106,7 @@ OBJS-y += libxl_vdispl.o
  OBJS-y += libxl_pvcalls.o
  OBJS-y += libxl_vsnd.o
  OBJS-y += libxl_vkb.o
+OBJS-y += libxl_virtio.o
  OBJS-y += libxl_genid.o
  OBJS-y += _libxl_types.o
  OBJS-y += libxl_flask.o
diff --git a/tools/libs/light/libxl_arm.c b/tools/libs/light/libxl_arm.c
index fa3d61f1e882..ddc7b2a15975 100644
--- a/tools/libs/light/libxl_arm.c
+++ b/tools/libs/light/libxl_arm.c
@@ -113,6 +113,19 @@ int libxl__arch_domain_prepare_config(libxl__gc *gc,
  }
  }
  
+for (i = 0; i < d_config->num_virtios; i++) {

+libxl_device_virtio *virtio = &d_config->virtios[i];
+
+if (virtio->transport != LIBXL_VIRTIO_TRANSPORT_MMIO)
+continue;
+
+rc = alloc_virtio_mmio_params(gc, &virtio->base, &virtio->irq,
+  &virtio_mmio_base, &virtio_mmio_irq);
+
+if (rc)
+return rc;
+}
+
  /*
   * Every virtio-mmio device uses one emulated SPI. If Virtio devices are
   * present, make sure that we allocate enough SPIs for them.
@@ -956,6 +969,79 @@ static int make_virtio_mmio_node(libxl__gc *gc, void *fdt, 
uint64_t base,
  return fdt_end_node(fdt);
  }
  
+/*

+ * The DT bindings for I2C device are present here:
+ *
+ * 
https://www.kernel.org/doc/Documentation/devicetree/bindings/i2c/i2c-virtio.yaml
+ */
+static int make_virtio_mmio_node_i2c(libxl__gc *gc, void *fdt)
+{
+int res;
+
+res = fdt_begin_node(fdt, "i2c");
+if (res) return res;
+
+res = fdt_property_compat(gc, fdt, 1, VIRTIO_DEVICE_TYPE_I2C);
+if (res) return res;
+
+return fdt_end_node(fdt);
+}
+
+/*
+ * The DT bindings for GPIO device are present here:
+ *
+ * 
https://www.kernel.org/doc/Documentation/devicetree/bindings/gpio/gpio-virtio.yaml
+ */
+static int make_virtio_mmio_node_gpio(libxl__gc *gc, void *fdt)
+{
+int res;
+
+res = fdt_begin_node(fdt, "gpio");
+if (res) return res;
+
+res = fdt_property_compat(gc, fdt, 1, VIRTIO_DEVICE_TYPE_GPIO);
+if (res) return res;
+
+res = fdt_property(fdt, "gpio-controller", NULL, 0);
+if (res) return res;
+
+res = fdt_property_cell(fdt, "#gpio-cells", 2);
+if (res) return res;
+
+res = fdt_property(fdt, "interrupt-controller", NULL, 0);
+if (res) return res;
+
+res = fdt_property_cell(fdt, "#interrupt-cells", 2);
+if (res) return res;
+
+return fdt_end_node(fdt);
+}
+
+static int make_virtio_mmio_node_device(libxl__gc *gc, void *fdt, uint64_t 
base,
+uint32_t irq, const char *type,
+uint32_t backend_domid)
+{
+int res;
+
+res = make_virtio_mmio_node_common(gc, fdt, base, irq, backend_domid);
+if (res) return res;
+
+/* Add device specific nodes */
+if (!strcmp(type, VIRTIO_DEVICE_TYPE_I2C)) {
+res = make_virtio_mmio_node_i2c(gc, fdt);
+if (res) return res;
+} else if (!strcmp(type, VIRTIO_DEVICE_TYPE_GPIO)) {
+res = make_virtio_mmio_node_gpio(gc, fdt);
+if (res) return res;
+} else if (strcmp(type, VIRTIO_DEVICE_TYPE_GENERIC)) {
+/* Doesn't match generic virtio device */
+

Re: [PATCH V9 3/3] docs: Add documentation for generic virtio devices

2022-12-13 Thread Oleksandr Tyshchenko





On 13.12.22 12:08, Viresh Kumar wrote:


Hello Viresh


This patch updates xl.cfg man page with details of generic Virtio device
related information.

Reviewed-by: Anthony PERARD 
Signed-off-by: Viresh Kumar 



Now it looks perfect, thanks

Reviewed-by: Oleksandr Tyshchenko 


---
  docs/man/xl.cfg.5.pod.in | 33 +
  1 file changed, 33 insertions(+)

diff --git a/docs/man/xl.cfg.5.pod.in b/docs/man/xl.cfg.5.pod.in
index ec444fb2ba79..024bceeb61b2 100644
--- a/docs/man/xl.cfg.5.pod.in
+++ b/docs/man/xl.cfg.5.pod.in
@@ -1585,6 +1585,39 @@ Set maximum height for pointer device.
  
  =back
  
+=item B

+
+Specifies the Virtio devices to be provided to the guest.
+
+Each B is a comma-separated list of C settings
+from the following list. As a special case, a single comma is allowed in the
+VALUE of the "type" KEY, where the VALUE is set with "virtio,device".
+
+=over 4
+
+=item B
+
+Specifies the backend domain name or id, defaults to dom0.
+
+=item B
+
+Specifies the compatible string for the specific Virtio device. The same will 
be
+written in the Device Tree compatible property of the Virtio device. For
+example, "type=virtio,device22" for the I2C device, whose device-tree binding 
is
+present here:
+
+L<https://www.kernel.org/doc/Documentation/devicetree/bindings/i2c/i2c-virtio.yaml>
+
+For generic virtio devices, where we don't need to set special or compatible
+properties in the Device Tree, the type field must be set to "virtio,device".
+
+=item B
+
+Specifies the transport mechanism for the Virtio device, only "mmio" is
+supported for now.
+
+=back
+
  =item B
  
  B Set TEE type for the guest. TEE is a Trusted Execution
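
The special case described in the man page text above (a single comma inside
the value of "type") can be sketched as follows. This is not the xl parser:
struct virtio_cfg_sketch and parse_virtio_string() are hypothetical, and only
the keys "type", "transport" and "backend" are handled:

```c
#include <stdio.h>
#include <string.h>

struct virtio_cfg_sketch {
    char type[32];
    char transport[16];
    char backend[32];
};

/* Parse e.g. "type=virtio,device22,transport=mmio". Returns 0 or -1. */
static int parse_virtio_string(const char *spec, struct virtio_cfg_sketch *cfg)
{
    char buf[128];
    char *tok, *next;

    if (strlen(spec) >= sizeof(buf))
        return -1;
    strcpy(buf, spec);
    memset(cfg, 0, sizeof(*cfg));

    for (tok = buf; tok; tok = next) {
        char *val;

        /* Cut the current KEY=VALUE token at the next comma. */
        next = strchr(tok, ',');
        if (next)
            *next++ = '\0';

        val = strchr(tok, '=');
        if (!val)
            return -1;
        *val++ = '\0';

        if (!strcmp(tok, "type")) {
            /* Special case: re-join "virtio" with a following "device..." */
            if (!strcmp(val, "virtio") && next &&
                !strncmp(next, "device", 6)) {
                char *rest = next;

                next = strchr(rest, ',');
                if (next)
                    *next++ = '\0';
                snprintf(cfg->type, sizeof(cfg->type), "virtio,%s", rest);
            } else {
                snprintf(cfg->type, sizeof(cfg->type), "%s", val);
            }
        } else if (!strcmp(tok, "transport")) {
            snprintf(cfg->transport, sizeof(cfg->transport), "%s", val);
        } else if (!strcmp(tok, "backend")) {
            snprintf(cfg->backend, sizeof(cfg->backend), "%s", val);
        } else {
            return -1;
        }
    }

    return 0;
}
```

For "type=virtio,device22,transport=mmio" the sketch yields the type
"virtio,device22" and the transport "mmio"; a token without "=" is rejected.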

Re: [PATCH V8 1/3] libxl: Add support for generic virtio device

2022-12-12 Thread Oleksandr Tyshchenko

On Mon, Dec 12, 2022 at 12:10 PM Viresh Kumar 
wrote:

Hello Viresh

[sorry for the possible format issues]


> This patch adds basic support for configuring and assisting generic
> Virtio backends, which could run in any domain.
>
> An example of domain configuration for mmio based Virtio I2C device is:
> virtio = ["type=virtio,device22,transport=mmio"]
>
> To make this work on Arm, allocate Virtio MMIO params (IRQ and memory
> region) and pass them to the backend and update guest device-tree to
> create a DT node for the Virtio devices.
>
> Add special support for I2C and GPIO devices, which require the
> "compatible" DT property to be set, among other device specific
> properties. Support for generic virtio devices is also added, which just
> need a MMIO node but not any special DT properties, for such devices the
> user needs to pass "virtio,device" in the "type" string.
>
> The parsing of generic virtio device configurations will be done in a
> separate commit.
>
> Signed-off-by: Viresh Kumar 
>


Reviewed-by: Oleksandr Tyshchenko 

with one NIT addressed ...


> ---
>  tools/libs/light/Makefile |   1 +
>  tools/libs/light/libxl_arm.c  | 100 +++
>  tools/libs/light/libxl_create.c   |   4 +
>  tools/libs/light/libxl_internal.h |   6 +
>  tools/libs/light/libxl_types.idl  |  18 +++
>  tools/libs/light/libxl_types_internal.idl |   1 +
>  tools/libs/light/libxl_virtio.c   | 144 ++
>  7 files changed, 274 insertions(+)
>  create mode 100644 tools/libs/light/libxl_virtio.c
>
> diff --git a/tools/libs/light/Makefile b/tools/libs/light/Makefile
> index 374be1cfab25..4fddcc6f51d7 100644
> --- a/tools/libs/light/Makefile
> +++ b/tools/libs/light/Makefile
> @@ -106,6 +106,7 @@ OBJS-y += libxl_vdispl.o
>  OBJS-y += libxl_pvcalls.o
>  OBJS-y += libxl_vsnd.o
>  OBJS-y += libxl_vkb.o
> +OBJS-y += libxl_virtio.o
>  OBJS-y += libxl_genid.o
>  OBJS-y += _libxl_types.o
>  OBJS-y += libxl_flask.o
> diff --git a/tools/libs/light/libxl_arm.c b/tools/libs/light/libxl_arm.c
> index fa3d61f1e882..292b31881210 100644
> --- a/tools/libs/light/libxl_arm.c
> +++ b/tools/libs/light/libxl_arm.c
> @@ -113,6 +113,19 @@ int libxl__arch_domain_prepare_config(libxl__gc *gc,
>  }
>  }
>
> +for (i = 0; i < d_config->num_virtios; i++) {
> +libxl_device_virtio *virtio = &d_config->virtios[i];
> +
> +if (virtio->transport != LIBXL_VIRTIO_TRANSPORT_MMIO)
> +continue;
> +
> +rc = alloc_virtio_mmio_params(gc, &virtio->base, &virtio->irq,
> +  &virtio_mmio_base,
> &virtio_mmio_irq);
> +
> +if (rc)
> +return rc;
> +}
> +
>  /*
>   * Every virtio-mmio device uses one emulated SPI. If Virtio devices
> are
>   * present, make sure that we allocate enough SPIs for them.
> @@ -956,6 +969,79 @@ static int make_virtio_mmio_node(libxl__gc *gc, void
> *fdt, uint64_t base,
>  return fdt_end_node(fdt);
>  }
>
> +/*
> + * The DT bindings for GPIO device are present here:
>


... here, s/GPIO/I2C

I hope this could be done when committing ...



> + *
> + *
> https://www.kernel.org/doc/Documentation/devicetree/bindings/i2c/i2c-virtio.yaml
> + */
> +static int make_virtio_mmio_node_i2c(libxl__gc *gc, void *fdt)
> +{
> +int res;
> +
> +res = fdt_begin_node(fdt, "i2c");
> +if (res) return res;
> +
> +res = fdt_property_compat(gc, fdt, 1, VIRTIO_DEVICE_TYPE_I2C);
> +if (res) return res;
> +
> +return fdt_end_node(fdt);
> +}
> +
> +/*
> + * The DT bindings for GPIO device are present here:
> + *
> + *
> https://www.kernel.org/doc/Documentation/devicetree/bindings/gpio/gpio-virtio.yaml
> + */
> +static int make_virtio_mmio_node_gpio(libxl__gc *gc, void *fdt)
> +{
> +int res;
> +
> +res = fdt_begin_node(fdt, "gpio");
> +if (res) return res;
> +
> +res = fdt_property_compat(gc, fdt, 1, VIRTIO_DEVICE_TYPE_GPIO);
> +if (res) return res;
> +
> +res = fdt_property(fdt, "gpio-controller", NULL, 0);
> +if (res) return res;
> +
> +res = fdt_property_cell(fdt, "#gpio-cells", 2);
> +if (res) return res;
> +
> +res = fdt_property(fdt, "interrupt-controller", NULL, 0);
> +if (res) return res;
> +
> +res = fdt_property_cell(fdt, "#interrupt-cells", 2);
> +if (res) return res;
> +
> +return fdt_end_node(fdt);
> +}
> +
> +static int make_virtio_mmio_node_device(libxl__gc

Re: [PATCH V6 1/3] libxl: Add support for generic virtio device

2022-12-06 Thread Oleksandr Tyshchenko





On 05.12.22 13:29, Viresh Kumar wrote:

Hello Viresh


On 05-12-22, 11:45, Viresh Kumar wrote:

+rc = libxl__backendpath_parse_domid(gc, be_path, &virtio->backend_domid);
+if (rc) goto out;
+
+rc = libxl__parse_backend_path(gc, be_path, &dev);
+if (rc) goto out;


The same question for dev variable.


Hmm, this we aren't using at all, which KBD does use it. Maybe we
should even call libxl__parse_backend_path() ?


Removing it works just fine for me.



Perfect. We will be able to add it when it is *really* needed.

Re: [PATCH V6 3/3] docs: Add documentation for generic virtio devices

2022-12-06 Thread Oleksandr Tyshchenko





On 05.12.22 11:11, Viresh Kumar wrote:


Hello Viresh


On 04-12-22, 20:52, Oleksandr Tyshchenko wrote:

So as I understand current series adds support for two virtio devices
(i2c/gpio) that require specific device-tree sub node with specific
compatible in it [1]. Those backends are standalone userspace applications
(daemons) that do not require any additional configuration parameters from
the toolstack other than just virtio-mmio irq and base (please correct me if
I am wrong).


For now, yes. But we may want to link these devices with other devices
in DT, like GPIO line consumers. I am not pushing a half informed
solution for that right now and that can be taken up later.


I got it, ok.




Well, below just some thoughts (which might be wrong) regarding the possible
extensions for future use. Please note, I do not suggest the following to be
implemented right now (I mean within the context of current series):

1. For supporting usual virtio devices that don't require specific
device-tree sub node with specific compatible in it [2] we would probably
need to either make "compatible" (or type?) string optional or to reserve
some value for it ("common" for instance).


I agree. Maybe we can use "virtio,device" without a number for the
device in this case.



Fine with me.





2. For supporting Qemu based virtio devices we would probably need to add
"backendtype" string (with "standalone" value for daemons like yours and
"qemu" value for Qemu backends).


Hmm, I realize now that my patch did define a new type for this,
libxl_virtio_backend, which defines STANDALONE already, but it isn't
used currently. Maybe I should remove it too.

And I am not sure how to use these values, STANDALONE or QEMU.
Should the DT nodes be created only for STANDALONE and never for QEMU
?


If we expose virtio-mmio device to the guest via device-tree on Arm, 
then I think the DT nodes should be always created here, no matter where 
the corresponding virtio backend is located itself (either STANDALONE or 
QEMU).




Maybe we can add these fields and a config param, once someone wants
to reuse this stuff for QEMU ?



I don't know what to suggest here, sorry.

On the one hand, it is extra work for you trying to add functionality 
you don't need at the moment. On the other hand, if we add the "backendtype" 
config param right now with default to STANDALONE it might simplify work 
for someone who ends up adding another type (in particular, the QEMU). 
Let's see what the maintainers will say.







3. For supporting additional configuration parameters for Qemu based virtio
devices we could probably reuse "device_model_args" (although it is not
clear to me what alternative to use for daemons).


I would leave it for the person who will make use of this eventually,
as then we will have more information on the same.


Sure, these are just thoughts for now.




+=item B


Shouldn't it be "type" instead (the parsing code is looking for type and the
example below suggests the type)?


Yes.


+Specifies the compatible string for the specific Virtio device. The same will 
be
+written in the Device Tree compatible property of the Virtio device. For
+example, "type=virtio,device22" for the I2C device
+
+=item B
+
+Specifies the transport mechanism for the Virtio device, like "mmio" or "pci".
+
+=back
+
   =item B
   B Set TEE type for the guest. TEE is a Trusted Execution


Also the commit description for #1/3 mentions that Virtio backend could run
in any domain. So looks like the "backend" string is missing here. I would
add the following:

=item B

Specify the backend domain name or id, defaults to dom0.


I haven't used the backend in any other domain for now, just Dom0, but
the idea is definitely there to run backends in separate user domains.



ok, good. My point is the following: if backend domain is configurable 
then it should be documented here.





P.S. I am wondering do i2c/gpio virtio backends support Xen grant mappings
for the virtio?


Not yet, we haven't made much progress in that area until now, but it
is very much part of what we intend to do.



Thanks for the information.




Have you tried to run the backends in non-hardware domain
with CONFIG_XEN_VIRTIO=y in Linux?


Not yet.


ok

Re: [PATCH V6 2/3] xl: Add support to parse generic virtio device

2022-12-06 Thread Oleksandr Tyshchenko





On 05.12.22 08:20, Viresh Kumar wrote:

Hello Viresh


On 02-12-22, 19:16, Oleksandr Tyshchenko wrote:

Interesting, I see you allow user to configure virtio-mmio params (irq and
base), as far as I remember for virtio-disk these are internal only
(allocated by tools/libs/light/libxl_arm.c).


It is a mistake. Will drop it.



ok, good. Please don't forget to add a note to idl file that virtio-mmio 
params are internal only.



libxl_device_virtio = Struct("device_virtio", [
...

# Note that virtio-mmio parameters (irq and base) are for internal
# use by libxl and can't be modified.
("irq", uint32),
("base", uint64)
])

Re: [PATCH V6 1/3] libxl: Add support for generic virtio device

2022-12-06 Thread Oleksandr Tyshchenko





On 05.12.22 08:15, Viresh Kumar wrote:

Hi Oleksandr,



Hello Viresh



On 02-12-22, 16:52, Oleksandr Tyshchenko wrote:

This patch adds basic support for configuring and assisting generic
Virtio backend which could run in any domain.

An example of domain configuration for mmio based Virtio I2C device is:
virtio = ["type=virtio,device22,transport=mmio"]

Also to make this work on Arm, allocate Virtio MMIO params (IRQ and
memory region) and pass them to the backend. Update guest device-tree as
well to create a DT node for the Virtio devices.



Some NITs regarding the commit description:
1. Besides the generic parts, the current patch also adds i2c/gpio device
nodes, I would mention that in the description.
2. I assume the current patch is not enough to make this work on Arm, at least
the subsequent patch is needed, I would mention that as well.
3. I understand where "virtio,device22"/"virtio,device29" came from, but I
think that links to the corresponding device-tree bindings should be
mentioned here (and/or maybe in the code).


Agree to all.
  

+static int make_virtio_mmio_node_device(libxl__gc *gc, void *fdt, uint64_t 
base,
+uint32_t irq, const char *type,
+uint32_t backend_domid)
+{
+int res, len = strlen(type);
+
+res = make_virtio_mmio_node_common(gc, fdt, base, irq, backend_domid);
+if (res) return res;
+
+/* Add device specific nodes */
+if (!strncmp(type, "virtio,device22", len)) {
+res = make_virtio_mmio_node_i2c(gc, fdt);
+if (res) return res;
+} else if (!strncmp(type, "virtio,device29", len)) {
+res = make_virtio_mmio_node_gpio(gc, fdt);
+if (res) return res;
+} else {
+LOG(ERROR, "Invalid type for virtio device: %s", type);
+return -EINVAL;
+}


I am not sure whether it is the best place to ask, but I will try anyway. So
I assume that with the whole series applied it would be possible to
configure only two specific device types ("22" and "29").


Right.


But what should a user do if, for example, they are interested in a usual
virtio device (which doesn't require specific device-tree sub node with
specific compatible in it)? For these usual virtio devices just calling
make_virtio_mmio_node_common() would be enough.


Maybe we should introduce something like type "common" which would mean we
don't need any additional device-tree sub nodes?

virtio = ["type=common,transport=mmio"]


I am fine with this. Maybe, to keep it aligned with compatibles, we
can write it as
  
virtio = ["type=virtio,device,transport=mmio"]


and document that "virtio,device" type is special and we won't add
compatible property to the DT node.



Personally I am fine with this.
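
A small sketch of how the three cases discussed above could be told apart.
The macro values are assumed from the strings used in this thread, and the
enum and function names are hypothetical:

```c
#include <string.h>

/* Assumed values, taken from the type strings discussed in this thread. */
#define VIRTIO_DEVICE_TYPE_GENERIC "virtio,device"
#define VIRTIO_DEVICE_TYPE_I2C     "virtio,device22"
#define VIRTIO_DEVICE_TYPE_GPIO    "virtio,device29"

enum virtio_dt_kind {
    VIRTIO_DT_GENERIC,   /* common MMIO node only, no extra sub node */
    VIRTIO_DT_I2C,       /* needs the "i2c" sub node */
    VIRTIO_DT_GPIO,      /* needs the "gpio" sub node */
    VIRTIO_DT_INVALID,
};

/* Classify the "type" string using exact comparisons. */
static enum virtio_dt_kind classify_virtio_type(const char *type)
{
    if (!strcmp(type, VIRTIO_DEVICE_TYPE_I2C))
        return VIRTIO_DT_I2C;
    if (!strcmp(type, VIRTIO_DEVICE_TYPE_GPIO))
        return VIRTIO_DT_GPIO;
    if (!strcmp(type, VIRTIO_DEVICE_TYPE_GENERIC))
        return VIRTIO_DT_GENERIC;

    return VIRTIO_DT_INVALID;
}
```

Exact comparison is used on purpose: a strncmp() bounded by strlen(type)
would also accept any prefix of a known type, such as "virtio".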





diff --git a/tools/libs/light/libxl_create.c b/tools/libs/light/libxl_create.c
index 612eacfc7fac..15a32c75c045 100644
--- a/tools/libs/light/libxl_create.c
+++ b/tools/libs/light/libxl_create.c
@@ -1802,6 +1802,11 @@ static void domcreate_launch_dm(libxl__egc *egc, 
libxl__multidev *multidev,
 &d_config->vkbs[i]);
   }
+for (i = 0; i < d_config->num_virtios; i++) {
+libxl__device_add(gc, domid, &libxl__virtio_devtype,
+  &d_config->virtios[i]);
+}



I am wondering whether this is the best place to put this call. This gets
called for LIBXL_DOMAIN_TYPE_PV and LIBXL_DOMAIN_TYPE_PVH (our Arm case),
and not for LIBXL_DOMAIN_TYPE_HVM. Is it what we want?


Can you suggest where I should move this?



I am not 100% sure, but I think that if this whole enabling work is indeed
supposed to be generic, I would at least move this out of
"switch (d_config->c_info.type)".

  

+libxl_virtioinfo = Struct("virtioinfo", [
+("backend", string),
+("backend_id", uint32),
+("frontend", string),
+("frontend_id", uint32),
+("devid", libxl_devid),
+("state", integer),
+], dir=DIR_OUT)


I failed to find where libxl_virtioinfo is used within the series. Why do we
need it?


Looks like a leftover that I missed. Will remove it.
  

+static int libxl__virtio_from_xenstore(libxl__gc *gc, const char *libxl_path,
+   libxl_devid devid,
+   libxl_device_virtio *virtio)
+{
+const char *be_path, *fe_path, *tmp;
+libxl__device dev;
+int rc;
+
+virtio->devid = devid;
+
+rc = libxl__xs_read_mandatory(gc, XBT_NULL,
+  GCSPRINTF("%s/backend", libxl_path),
+  &be_path);
+if (rc) goto out;
+
+rc = libxl__xs_read_mandatory(gc, XBT_NULL,
+  GCSPRINTF("%s/frontend&q

Re: [PATCH V6 3/3] docs: Add documentation for generic virtio devices

2022-12-04 Thread Oleksandr Tyshchenko





On 08.11.22 13:24, Viresh Kumar wrote:

Hello Viresh


[sorry for the possible format issues if any]


This patch updates xl.cfg man page with details of generic Virtio device
related information.



So, as I understand it, the current series adds support for two virtio
devices (i2c/gpio) that require a specific device-tree sub node with a
specific compatible in it [1]. Those backends are standalone userspace
applications (daemons) that do not require any additional configuration
parameters from the toolstack other than the virtio-mmio irq and base
(please correct me if I am wrong).


Well, below are just some thoughts (which might be wrong) regarding
possible extensions for future use. Please note, I do not suggest that the
following be implemented right now (I mean within the context of the
current series):


1. For supporting usual virtio devices that don't require a specific
device-tree sub node with a specific compatible in it [2], we would
probably need either to make the "compatible" (or type?) string optional
or to reserve some value for it ("common", for instance).
2. For supporting Qemu based virtio devices, we would probably need to
add a "backendtype" string (with a "standalone" value for daemons like
yours and a "qemu" value for Qemu backends).
3. For supporting additional configuration parameters for Qemu based
virtio devices, we could probably reuse "device_model_args" (although it
is not clear to me what alternative to use for daemons).


Any other thoughts?



Signed-off-by: Viresh Kumar 
---
  docs/man/xl.cfg.5.pod.in | 21 +
  1 file changed, 21 insertions(+)

diff --git a/docs/man/xl.cfg.5.pod.in b/docs/man/xl.cfg.5.pod.in
index 31e58b73b0c9..1056b03df846 100644
--- a/docs/man/xl.cfg.5.pod.in
+++ b/docs/man/xl.cfg.5.pod.in
@@ -1585,6 +1585,27 @@ Set maximum height for pointer device.
  
  =back
  
+=item B

+
+Specifies the Virtio devices to be provided to the guest.
+
+Each B is a comma-separated list of C
+settings from the following list:
+
+=over 4
+
+=item B


Shouldn't it be "type" instead (the parsing code is looking for type and 
the example below suggests the type)?



+
+Specifies the compatible string for the specific Virtio device. The same will 
be
+written in the Device Tree compatible property of the Virtio device. For
+example, "type=virtio,device22" for the I2C device.
+
+=item B
+
+Specifies the transport mechanism for the Virtio device, like "mmio" or "pci".
+
+=back
+
  =item B
  
  B Set TEE type for the guest. TEE is a Trusted Execution




Also, the commit description for #1/3 mentions that the Virtio backend
could run in any domain. So it looks like the "backend" string is missing
here. I would add the following:


=item B

Specify the backend domain name or id, defaults to dom0.


P.S. I am wondering whether the i2c/gpio virtio backends support Xen grant
mappings for virtio. Have you tried to run the backends in a
non-hardware domain with CONFIG_XEN_VIRTIO=y in Linux?



[1]
https://www.kernel.org/doc/Documentation/devicetree/bindings/i2c/i2c-virtio.yaml
https://www.kernel.org/doc/Documentation/devicetree/bindings/gpio/gpio-virtio.yaml
[2]
https://www.kernel.org/doc/Documentation/devicetree/bindings/virtio/mmio.yaml

Re: [PATCH V6 2/3] xl: Add support to parse generic virtio device

2022-12-02 Thread Oleksandr Tyshchenko





On 08.11.22 13:23, Viresh Kumar wrote:


Hello Viresh

[sorry for the possible format issues if any]


This patch adds basic support for parsing generic Virtio backend.

An example of domain configuration for mmio based Virtio I2C device is:
virtio = ["type=virtio,device22,transport=mmio"]
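
Note that the type value in this example itself contains a comma, so the
spec cannot simply be split on every comma. One possible way to handle
this is sketched below, with the caveat that it is not the approach taken
in xl_parse.c (which special-cases "type") and that the function name
demo_split_spec() is hypothetical. The sketch treats a comma as a
separator only when the segment that follows it contains '='. It does not
skip leading spaces.

```c
#include <string.h>

/*
 * Split "key=value,key=value" in place. A value may contain commas: a
 * comma ends the value only if the text between it and the next comma
 * (or the end of the string) contains '='.
 * Returns the number of pairs found, or -1 on a malformed spec.
 */
static int demo_split_spec(char *spec, char *keys[], char *vals[], int max)
{
    int n = 0;
    char *p = spec;

    while (*p) {
        char *eq, *end;

        if (n == max)
            return -1;
        eq = strchr(p, '=');
        if (!eq)
            return -1;

        /* Look for the comma that really terminates this value. */
        end = eq + 1;
        for (;;) {
            char *comma = strchr(end, ',');
            char *next_comma, *next_eq;

            if (!comma) {
                end = NULL;             /* value runs to end of string */
                break;
            }
            next_comma = strchr(comma + 1, ',');
            next_eq = strchr(comma + 1, '=');
            if (next_eq && (!next_comma || next_eq < next_comma)) {
                end = comma;            /* next segment is "key=..." */
                break;
            }
            end = comma + 1;            /* comma belongs to the value */
        }

        *eq = '\0';
        keys[n] = p;
        vals[n] = eq + 1;
        n++;

        if (!end)
            break;
        *end = '\0';
        p = end + 1;
    }
    return n;
}
```

For the spec above this yields the pairs type / virtio,device22 and
transport / mmio.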

Signed-off-by: Viresh Kumar 
---
  tools/ocaml/libs/xl/genwrap.py   |  1 +
  tools/ocaml/libs/xl/xenlight_stubs.c |  1 +
  tools/xl/xl_parse.c  | 84 
  3 files changed, 86 insertions(+)

diff --git a/tools/ocaml/libs/xl/genwrap.py b/tools/ocaml/libs/xl/genwrap.py
index 7bf26bdcd831..b188104299b1 100644
--- a/tools/ocaml/libs/xl/genwrap.py
+++ b/tools/ocaml/libs/xl/genwrap.py
@@ -36,6 +36,7 @@ DEVICE_LIST =  [ ("list",   ["ctx", "domid", "t 
list"]),
  functions = { # ( name , [type1,type2,] )
  "device_vfb": DEVICE_FUNCTIONS,
  "device_vkb": DEVICE_FUNCTIONS,
+"device_virtio": DEVICE_FUNCTIONS,
  "device_disk":DEVICE_FUNCTIONS + DEVICE_LIST +
[ ("insert", ["ctx", "t", "domid", "?async:'a", "unit", 
"unit"]),
  ("of_vdev",["ctx", "domid", "string", "t"]),
diff --git a/tools/ocaml/libs/xl/xenlight_stubs.c 
b/tools/ocaml/libs/xl/xenlight_stubs.c
index 45b8af61c74a..8e54f95da7c7 100644
--- a/tools/ocaml/libs/xl/xenlight_stubs.c
+++ b/tools/ocaml/libs/xl/xenlight_stubs.c
@@ -707,6 +707,7 @@ DEVICE_ADDREMOVE(disk)
  DEVICE_ADDREMOVE(nic)
  DEVICE_ADDREMOVE(vfb)
  DEVICE_ADDREMOVE(vkb)
+DEVICE_ADDREMOVE(virtio)
  DEVICE_ADDREMOVE(pci)
  _DEVICE_ADDREMOVE(disk, cdrom, insert)
  
diff --git a/tools/xl/xl_parse.c b/tools/xl/xl_parse.c

index 1b5381cef033..c6f35c069d2a 100644
--- a/tools/xl/xl_parse.c
+++ b/tools/xl/xl_parse.c
@@ -1208,6 +1208,87 @@ static void parse_vkb_list(const XLU_Config *config,
  if (rc) exit(EXIT_FAILURE);
  }
  
+static int parse_virtio_config(libxl_device_virtio *virtio, char *token)

+{
+char *oparg;
+int rc;
+
+if (MATCH_OPTION("backend", token, oparg)) {
+virtio->backend_domname = strdup(oparg);
+} else if (MATCH_OPTION("type", token, oparg)) {
+virtio->type = strdup(oparg);
+} else if (MATCH_OPTION("transport", token, oparg)) {
+rc = libxl_virtio_transport_from_string(oparg, &virtio->transport);
+if (rc) return rc;
+} else if (MATCH_OPTION("irq", token, oparg)) {
+virtio->irq = strtoul(oparg, NULL, 0);
+} else if (MATCH_OPTION("base", token, oparg)) {
+virtio->base = strtoul(oparg, NULL, 0);



Interesting, I see that you allow the user to configure the virtio-mmio
params (irq and base). As far as I remember, for virtio-disk these are
internal only (allocated by tools/libs/light/libxl_arm.c).


I am not really sure why we need to configure the virtio "base", could you
please clarify? But if we really want/need to be able to configure the
virtio "irq" (for example, to avoid a possible clash with a physical one),
I am afraid this will require more changes than the current patch makes.
Within the current series, saving virtio->irq here doesn't have any effect,
as it will be overwritten in
libxl__arch_domain_prepare_config()->alloc_virtio_mmio_params() anyway.
I presume the code in libxl__arch_domain_prepare_config() shouldn't try
to allocate virtio->irq if it is already configured by the user. Also, the
allocator should probably take into account what is already configured by
the user, to avoid allocating the same irq for another device assigned to
the same guest.
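
A minimal sketch of the allocation behaviour described above, assuming
that an irq value of 0 means "not configured by the user". The structure
and function names are hypothetical and the irq range is passed in by the
caller; this is not the alloc_virtio_mmio_params() implementation.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-device data; irq == 0 means "not set". */
struct demo_virtio {
    uint32_t irq;
};

/* Return true if any device already owns this irq. */
static bool demo_irq_in_use(const struct demo_virtio *devs, size_t n,
                            uint32_t irq)
{
    for (size_t i = 0; i < n; i++)
        if (devs[i].irq == irq)
            return true;
    return false;
}

/*
 * Give an irq from [first, last] to every device that has none, leaving
 * user-configured values untouched and skipping values already taken.
 * Assumes last < UINT32_MAX. Returns 0 on success, -1 if the range is
 * exhausted.
 */
static int demo_assign_irqs(struct demo_virtio *devs, size_t n,
                            uint32_t first, uint32_t last)
{
    uint32_t next = first;

    for (size_t i = 0; i < n; i++) {
        if (devs[i].irq != 0)
            continue;               /* configured by the user, keep it */

        while (next <= last && demo_irq_in_use(devs, n, next))
            next++;
        if (next > last)
            return -1;

        devs[i].irq = next++;
    }
    return 0;
}
```

The sketch does not check that user-supplied values are unique or that
they fall inside the range; such validation is left out for brevity.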


Also, the doc change in the subsequent patch doesn't mention irq/base
configuration.



So maybe we should just drop the following for now?
+} else if (MATCH_OPTION("irq", token, oparg)) {
+virtio->irq = strtoul(oparg, NULL, 0);
+} else if (MATCH_OPTION("base", token, oparg)) {
+virtio->base = strtoul(oparg, NULL, 0);




+} else {
+fprintf(stderr, "Unknown string \"%s\" in virtio spec\n", token);
+return -1;
+}
+
+return 0;
+}
+
+static void parse_virtio_list(const XLU_Config *config,
+  libxl_domain_config *d_config)
+{
+XLU_ConfigList *virtios;
+const char *item;
+char *buf = NULL, *oparg, *str = NULL;
+int rc;
+
+if (!xlu_cfg_get_list (config, "virtio", &virtios, 0, 0)) {
+int entry = 0;
+while ((item = xlu_cfg_get_listitem(virtios, entry)) != NULL) {
+libxl_device_virtio *virtio;
+char *p;
+
+virtio = ARRAY_EXTEND_INIT(d_config->virtios, 
d_config->num_virtios,
+   libxl_device_virtio_init);
+
+buf = strdup(item);
+
+p = strtok(buf, ",");
+while (p != NULL)
+{
+while (*p == ' ') p++;
+
+// Type may contain a comma, do special handling.
+if (MATCH_OPTION("type", p, oparg)) {
+if (!strncmp(oparg, "virtio", strlen("virtio"))) {
+

Re: [PATCH V6 1/3] libxl: Add support for generic virtio device

2022-12-02 Thread Oleksandr Tyshchenko





On 08.11.22 13:23, Viresh Kumar wrote:


Hello Viresh

[sorry for the possible format issues if any]



This patch adds basic support for configuring and assisting generic
Virtio backend which could run in any domain.

An example of domain configuration for mmio based Virtio I2C device is:
virtio = ["type=virtio,device22,transport=mmio"]

Also to make this work on Arm, allocate Virtio MMIO params (IRQ and
memory region) and pass them to the backend. Update guest device-tree as
well to create a DT node for the Virtio devices.



Some NITs regarding the commit description:
1. Besides the generic parts, the current patch also adds i2c/gpio device
nodes. I would mention that in the description.
2. I assume the current patch alone is not enough to make this work on Arm;
at least the subsequent patch is needed. I would mention that as well.
3. I understand where "virtio,device22"/"virtio,device29" came from, but
I think that links to the corresponding device-tree bindings should be
mentioned here (and/or maybe in the code).






Signed-off-by: Viresh Kumar 
---
  tools/libs/light/Makefile |   1 +
  tools/libs/light/libxl_arm.c  |  89 +++
  tools/libs/light/libxl_create.c   |   5 +
  tools/libs/light/libxl_internal.h |   1 +
  tools/libs/light/libxl_types.idl  |  29 +
  tools/libs/light/libxl_types_internal.idl |   1 +
  tools/libs/light/libxl_virtio.c   | 127 ++
  7 files changed, 253 insertions(+)
  create mode 100644 tools/libs/light/libxl_virtio.c

diff --git a/tools/libs/light/Makefile b/tools/libs/light/Makefile
index 374be1cfab25..4fddcc6f51d7 100644
--- a/tools/libs/light/Makefile
+++ b/tools/libs/light/Makefile
@@ -106,6 +106,7 @@ OBJS-y += libxl_vdispl.o
  OBJS-y += libxl_pvcalls.o
  OBJS-y += libxl_vsnd.o
  OBJS-y += libxl_vkb.o
+OBJS-y += libxl_virtio.o
  OBJS-y += libxl_genid.o
  OBJS-y += _libxl_types.o
  OBJS-y += libxl_flask.o
diff --git a/tools/libs/light/libxl_arm.c b/tools/libs/light/libxl_arm.c
index b4928dbf673c..f33c9b273a4f 100644
--- a/tools/libs/light/libxl_arm.c
+++ b/tools/libs/light/libxl_arm.c
@@ -113,6 +113,19 @@ int libxl__arch_domain_prepare_config(libxl__gc *gc,
  }
  }
  
+for (i = 0; i < d_config->num_virtios; i++) {

+libxl_device_virtio *virtio = &d_config->virtios[i];
+
+if (virtio->transport != LIBXL_VIRTIO_TRANSPORT_MMIO)
+continue;
+
+rc = alloc_virtio_mmio_params(gc, &virtio->base, &virtio->irq,
+  &virtio_mmio_base, &virtio_mmio_irq);
+
+if (rc)
+return rc;
+}
+
  /*
   * Every virtio-mmio device uses one emulated SPI. If Virtio devices are
   * present, make sure that we allocate enough SPIs for them.
@@ -968,6 +981,68 @@ static int make_virtio_mmio_node(libxl__gc *gc, void *fdt, 
uint64_t base,
  return fdt_end_node(fdt);
  }
  
+static int make_virtio_mmio_node_i2c(libxl__gc *gc, void *fdt)

+{
+int res;
+
+res = fdt_begin_node(fdt, "i2c");
+if (res) return res;
+
+res = fdt_property_compat(gc, fdt, 1, "virtio,device22");
+if (res) return res;
+
+return fdt_end_node(fdt);
+}
+
+static int make_virtio_mmio_node_gpio(libxl__gc *gc, void *fdt)
+{
+int res;
+
+res = fdt_begin_node(fdt, "gpio");
+if (res) return res;
+
+res = fdt_property_compat(gc, fdt, 1, "virtio,device29");
+if (res) return res;
+
+res = fdt_property(fdt, "gpio-controller", NULL, 0);
+if (res) return res;
+
+res = fdt_property_cell(fdt, "#gpio-cells", 2);
+if (res) return res;
+
+res = fdt_property(fdt, "interrupt-controller", NULL, 0);
+if (res) return res;
+
+res = fdt_property_cell(fdt, "#interrupt-cells", 2);
+if (res) return res;
+
+return fdt_end_node(fdt);
+}
+
+static int make_virtio_mmio_node_device(libxl__gc *gc, void *fdt, uint64_t 
base,
+uint32_t irq, const char *type,
+uint32_t backend_domid)
+{
+int res, len = strlen(type);
+
+res = make_virtio_mmio_node_common(gc, fdt, base, irq, backend_domid);
+if (res) return res;
+
+/* Add device specific nodes */
+if (!strncmp(type, "virtio,device22", len)) {
+res = make_virtio_mmio_node_i2c(gc, fdt);
+if (res) return res;
+} else if (!strncmp(type, "virtio,device29", len)) {
+res = make_virtio_mmio_node_gpio(gc, fdt);
+if (res) return res;
+} else {
+LOG(ERROR, "Invalid type for virtio device: %s", type);
+return -EINVAL;
+}


I am not sure whether this is the best place to ask, but I will try
anyway. So I assume that with the whole series applied it would be
possible to configure only two specific device types ("22" and "29").
But what should be done if the user is, for example, interested in a usual
virtio device (one which doesn't require a specific device-tree sub node
with a specific compatible in it)? For thes

Re: [PATCH] xen: add missing free_irq() in error path

2022-11-14 Thread Oleksandr Tyshchenko


On 14.11.22 09:07, ruanjinjie wrote:

Hello


> free_irq() is missing in case of error, fix that.
>
> Signed-off-by: ruanjinjie 


Nit: neither the subject nor the description mentions which subsystem the
current patch targets.

I would add "xen-platform:" or "xen/platform-pci:" at least.


Reviewed-by: Oleksandr Tyshchenko 

Thanks.

> ---
>   drivers/xen/platform-pci.c | 7 +--
>   1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/xen/platform-pci.c b/drivers/xen/platform-pci.c
> index 18f0ed8b1f93..6ebd819338ec 100644
> --- a/drivers/xen/platform-pci.c
> +++ b/drivers/xen/platform-pci.c
> @@ -144,7 +144,7 @@ static int platform_pci_probe(struct pci_dev *pdev,
>   if (ret) {
>   dev_warn(&pdev->dev, "Unable to set the evtchn callback 
> "
>"err=%d\n", ret);
> - goto out;
> + goto irq_out;
>   }
>   }
>   
> @@ -152,13 +152,16 @@ static int platform_pci_probe(struct pci_dev *pdev,
>   grant_frames = alloc_xen_mmio(PAGE_SIZE * max_nr_gframes);
>   ret = gnttab_setup_auto_xlat_frames(grant_frames);
>   if (ret)
> - goto out;
> + goto irq_out;
>   ret = gnttab_init();
>   if (ret)
>   goto grant_out;
>   return 0;
>   grant_out:
>   gnttab_free_auto_xlat_frames();
> +irq_out:
> + if (!xen_have_vector_callback)
> + free_irq(pdev->irq, pdev);
>   out:
>   pci_release_region(pdev, 0);
>   mem_out:
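
The unwinding order that the fix restores can be illustrated with a small
stand-alone sketch. Everything in it is hypothetical (the structure,
demo_probe() and the fail_step parameter are not part of the driver); it
only shows that each label releases what was acquired before the failing
step, in reverse order.

```c
#include <stdbool.h>

/* Hypothetical resources, tracked as flags so the unwinding is visible. */
struct demo_probe_state {
    bool region;
    bool irq;
    bool grants;
};

/*
 * fail_step == k simulates a failure in step k after all earlier steps
 * succeeded; fail_step == 0 means every step succeeds.
 */
static int demo_probe(struct demo_probe_state *s, int fail_step)
{
    int ret = -1;

    s->region = true;               /* step 1: claim the region */

    if (fail_step == 2)             /* step 2: requesting the irq fails */
        goto out;
    s->irq = true;

    if (fail_step == 3)             /* step 3: grant frame setup fails */
        goto irq_out;               /* the irq must be freed here too */
    s->grants = true;

    if (fail_step == 4)             /* step 4: final init fails */
        goto grant_out;

    return 0;

grant_out:
    s->grants = false;
irq_out:
    s->irq = false;
out:
    s->region = false;
    return ret;
}
```

Jumping to the out label from step 3, as the code did before the fix,
would skip the irq release in this sketch as well.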

-- 
Regards,

Oleksandr Tyshchenko

Re: [PATCH 2/3] CHANGELOG: Add missing entries for work during the 4.17 release

2022-11-11 Thread Oleksandr Tyshchenko

On Fri, Nov 11, 2022 at 11:23 AM Henry Wang  wrote:

Hello Henry

[leave only xen-devel@lists.xenproject.org in CC]
[sorry for the possible format issues]

Signed-off-by: Henry Wang 
> ---
>  CHANGELOG.md | 29 +++--
>  1 file changed, 27 insertions(+), 2 deletions(-)
>
> diff --git a/CHANGELOG.md b/CHANGELOG.md
> index adbbb216fa..fa8cc476b3 100644
> --- a/CHANGELOG.md
> +++ b/CHANGELOG.md
> @@ -4,16 +4,41 @@ Notable changes to Xen will be documented in this file.
>
>  The format is based on [Keep a Changelog](
> https://keepachangelog.com/en/1.0.0/)
>
> -## [unstable UNRELEASED](
> https://xenbits.xen.org/gitweb/?p=xen.git;a=shortlog;h=staging) - TBD
> +## [4.17.0](
> https://xenbits.xen.org/gitweb/?p=xen.git;a=shortlog;h=staging) -
> 2022-11-??
>
>  ### Changed
>   - On x86 "vga=current" can now be used together with GrUB2's gfxpayload
> setting. Note that
> this requires use of "multiboot2" (and "module2") as the GrUB commands
> loading Xen.
> + - The "gnttab" option now has a new command line sub-option for
> disabling the
> +   GNTTABOP_transfer functionality.
> + - The x86 MCE command line option info is now updated.
>
>  ### Added / support upgraded
> + - Out-of-tree builds for the hypervisor now supported.
> + - The project has officially adopted 4 directives and 24 rules of
> MISRA-C,
> +   added MISRA-C checker build integration, and defined how to document
> +   deviations.
>   - IOMMU superpage support on x86, affecting PV guests as well as HVM/PVH
> ones
> when they don't share page tables with the CPU (HAP / EPT / NPT).
> - - Support VIRT_SSBD feature for HVM guests on AMD.
> + - Support VIRT_SSBD feature for HVM guests on AMD and MSR_SPEC_CTRL
> feature for
> +   SVM guests.
> + - Improved TSC, CPU frequency calibration and APIC on x86.
> + - Improved support for CET Indirect Branch Tracking on x86.
> + - Improved mwait-idle support for SPR and ADL on x86.
> + - Extend security support for hosts to 12 TiB of memory on x86.
> + - Add command line option to set cpuid parameters for dom0 at boot time
> on x86.
> + - Improved static configuration options on Arm.
> + - cpupools can be specified at boot using device tree on Arm.
> + - It is possible to use PV drivers with dom0less guests, allowing
> statically
> +   booted dom0less guests with PV devices.
> + - On Arm, p2m structures are now allocated out of a pool of memory set
> aside at
> +   domain creation.
> + - Improved mitigations against Spectre-BHB on Arm.
> + - Add support for VirtIO toolstack on Arm.
>

I would clarify that only virtio-mmio is supported on Arm.


> + - Allow setting the number of CPUs to activate at runtime from command
> line
> +   option on Arm.
> + - Improved toolstack build system.
> + - Add Xue - console over USB 3 Debug Capability.
>

I would probably also add the following:

- Add Renesas R-Car Gen4 IPMMU-VMSA support (Arm)
- grant-table support on Arm was improved and hardened by implementing a
“simplified M2P-like approach for the xenheap pages”



>
>  ### Removed / support downgraded
>   - dropped support for the (x86-only) "vesa-mtrr" and "vesa-remap"
> command line options
> --
> 2.25.1
>
>
>

-- 
Regards,

Oleksandr Tyshchenko

Re: [PATCH v1 09/12] accel/xen/xen-all: export xenstore_record_dm_state

2022-10-28 Thread Oleksandr Tyshchenko

On Thu, Oct 27, 2022 at 12:24 PM Alex Bennée  wrote:

Hello all

> Vikram Garhwal  writes:
>
> > xenstore_record_dm_state() will also be used in aarch64 xenpv machine.
> >
> > Signed-off-by: Vikram Garhwal 
> > Signed-off-by: Stefano Stabellini 
> > ---
> >  accel/xen/xen-all.c  | 2 +-
> >  include/hw/xen/xen.h | 2 ++
> >  2 files changed, 3 insertions(+), 1 deletion(-)
> >
> > diff --git a/accel/xen/xen-all.c b/accel/xen/xen-all.c
> > index 69aa7d018b..276625b78b 100644
> > --- a/accel/xen/xen-all.c
> > +++ b/accel/xen/xen-all.c
> > @@ -100,7 +100,7 @@ void xenstore_store_pv_console_info(int i, Chardev
> *chr)
> >  }
> >
> >
> > -static void xenstore_record_dm_state(struct xs_handle *xs, const char
> *state)
> > +void xenstore_record_dm_state(struct xs_handle *xs, const char *state)
> >  {
> >  char path[50];
> >
> > diff --git a/include/hw/xen/xen.h b/include/hw/xen/xen.h
> > index afdf9c436a..31e9538a5c 100644
> > --- a/include/hw/xen/xen.h
> > +++ b/include/hw/xen/xen.h
> > @@ -9,6 +9,7 @@
> >   */
> >
> >  #include "exec/cpu-common.h"
> > +#include 
>
> This is breaking a bunch of the builds and generally we try and avoid
> adding system includes in headers (apart from osdep.h) for this reason.
> In fact there is a comment just above to that fact.
>
> I think you can just add struct xs_handle to typedefs.h (or maybe just
> xen.h) and directly include xenstore.h in xen-all.c following the usual
> rules:
>
>
> https://qemu.readthedocs.io/en/latest/devel/style.html#include-directives
>
> It might be worth doing an audit to see what else is including xen.h
> needlessly or should be using sysemu/xen.h.
>
> >
> >  /* xen-machine.c */
> >  enum xen_mode {
> > @@ -31,5 +32,6 @@ qemu_irq *xen_interrupt_controller_init(void);
> >  void xenstore_store_pv_console_info(int i, Chardev *chr);
> >
> >  void xen_register_framebuffer(struct MemoryRegion *mr);
> > +void xenstore_record_dm_state(struct xs_handle *xs, const char *state);
> >
> >  #endif /* QEMU_HW_XEN_H */
>
>
> --
> Alex Bennée
>
>
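
The forward-declaration approach suggested above could look roughly like
this. It is a self-contained sketch under stated assumptions: the function
name demo_record_dm_state() and the stand-in definition of struct
xs_handle exist only so that the example compiles on its own. In the real
tree the .c file would include the xenstore header instead of defining
anything itself.

```c
#include <stddef.h>

/*
 * "Header" part: an incomplete type is enough to declare a function that
 * takes a pointer to it, so no system include is needed here.
 */
struct xs_handle;
void demo_record_dm_state(struct xs_handle *xs, const char *state);

/*
 * "Source file" part: a stand-in definition, assumed for this sketch
 * only, so that the function body has something to work with.
 */
struct xs_handle {
    const char *last_state;
};

void demo_record_dm_state(struct xs_handle *xs, const char *state)
{
    if (xs != NULL)
        xs->last_state = state;     /* remember what was recorded */
}
```

Users of the header never see the layout of the structure, which is what
allows the system include to stay out of it.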

For consideration:
I think this patch and some other changes done in "[PATCH v1 10/12] hw/arm:
introduce xenpv machine" (the opening of Xen interfaces and
calling xenstore_record_dm_state() in hw/arm/xen_arm.c:xen_init_ioreq())
could be avoided if we enable the Xen accelerator (either by passing "-M
xenpv,accel=xen" or by adding mc->default_machine_opts = "accel=xen";
to hw/arm/xen_arm.c:xen_arm_machine_class_init() or by some other method).
These actions are already done in accel/xen/xen-all.c:xen_init(). Please
note that I am not too familiar with that code, so there might be nuances.

Besides that, the Xen accelerator will be needed for the xen-mapcache to be
in use (this is needed for mapping guest memory). There are a few
xen_enabled() checks spread around that code to perform Xen specific
actions.

-- 
Regards,

Oleksandr Tyshchenko

Re: [PATCH v1 10/12] hw/arm: introduce xenpv machine

2022-10-28 Thread Oleksandr Tyshchenko

On Fri, Oct 28, 2022 at 8:58 PM Julien Grall  wrote:

> Hi,
>

Hello all.

[sorry for the possible format issues]



>
> On 27/10/2022 09:02, Alex Bennée wrote:
> >
> > Vikram Garhwal  writes:
> >
> > 
> >> Optional: When CONFIG_TPM is enabled, it also creates a tpm-tis-device,
> adds a
> >> TPM emulator and connects to swtpm running on host machine via chardev
> socket
> >> and support TPM functionalities for a guest domain.
> >>
> >> Extra command line for aarch64 xenpv QEMU to connect to swtpm:
> >>  -chardev socket,id=chrtpm,path=/tmp/myvtpm2/swtpm-sock \
> >>  -tpmdev emulator,id=tpm0,chardev=chrtpm \
> >>
> >> swtpm implements a TPM software emulator(TPM 1.2 & TPM 2) built on
> libtpms and
> >> provides access to TPM functionality over socket, chardev and CUSE
> interface.
> >> Github repo: https://github.com/stefanberger/swtpm
> >> Example for starting swtpm on host machine:
> >>  mkdir /tmp/vtpm2
> >>  swtpm socket --tpmstate dir=/tmp/vtpm2 \
> >>  --ctrl type=unixio,path=/tmp/vtpm2/swtpm-sock &
> >
> > 
> >> +static void xen_enable_tpm(void)
> >> +{
> >> +/* qemu_find_tpm_be is only available when CONFIG_TPM is enabled. */
> >> +#ifdef CONFIG_TPM
> >> +Error *errp = NULL;
> >> +DeviceState *dev;
> >> +SysBusDevice *busdev;
> >> +
> >> +TPMBackend *be = qemu_find_tpm_be("tpm0");
> >> +if (be == NULL) {
> >> +DPRINTF("Couldn't fine the backend for tpm0\n");
> >> +return;
> >> +}
> >> +dev = qdev_new(TYPE_TPM_TIS_SYSBUS);
> >> +object_property_set_link(OBJECT(dev), "tpmdev", OBJECT(be), &errp);
> >> +object_property_set_str(OBJECT(dev), "tpmdev", be->id, &errp);
> >> +busdev = SYS_BUS_DEVICE(dev);
> >> +sysbus_realize_and_unref(busdev, &error_fatal);
> >> +sysbus_mmio_map(busdev, 0, GUEST_TPM_BASE);
> >
> > I'm not sure what has gone wrong here but I'm getting:
> >
> >../../hw/arm/xen_arm.c: In function ‘xen_enable_tpm’:
> >../../hw/arm/xen_arm.c:120:32: error: ‘GUEST_TPM_BASE’ undeclared
> (first use in this function); did you mean ‘GUEST_RAM_BASE’?
> >  120 | sysbus_mmio_map(busdev, 0, GUEST_TPM_BASE);
> >  |^~
> >  |GUEST_RAM_BASE
> >../../hw/arm/xen_arm.c:120:32: note: each undeclared identifier is
> reported only once for each function it appears in
> >
> > In my cross build:
> >
> ># Configured with: '../../configure' '--disable-docs'
> '--target-list=aarch64-softmmu' '--disable-kvm' '--enable-xen'
> '--disable-opengl' '--disable-libudev' '--enable-tpm'
> '--disable-xen-pci-passthrough' '--cross-prefix=aarch64-linux-gnu-'
> '--skip-meson'
> >
> > which makes me wonder if this is a configure failure or a confusion
> > about being able to have host swtpm implementations during emulation but
> > needing target tpm for Xen?
>
> I was also wondering where is that value come from. Note that the
> memory/IRQ layout exposed to the guest is not stable.
>
> Are we expecting the user to rebuild QEMU for every Xen versions (or
> possibly every guest if we ever allow dynamic layout in Xen)?
>


This doesn't sound ideal.

I am wondering what would be the correct way here, assuming that we would
likely need to have more such information in place for supporting more
use-cases...
For instance, consider the PCI host bridge emulation in Qemu. The Xen
toolstack (another software layer) generates the device-tree for the guest,
so it creates the PCI Host bridge node by using reserved regions from the
Guest OS interface (arch-arm.h):
- GUEST_VPCI_MEM_ADDR (GUEST_VPCI_MEM_SIZE)
- GUEST_VPCI_ECAM_BASE (GUEST_VPCI_ECAM_SIZE)
- GUEST_VPCI_PREFETCH_MEM_ADDR (GUEST_VPCI_PREFETCH_MEM_SIZE)
https://xenbits.xen.org/gitweb/?p=xen.git;a=blob;f=tools/libs/light/libxl_arm.c;h=2a5e93c28403738779863aded31d2df3ba72f8c0;hb=HEAD#l833

Here in Qemu, when creating a PCI Host bridge, we would need to use exactly
the same reserved regions which the toolstack writes in the corresponding
device-tree node. So how do we tell Qemu about them?
1. Introduce new cmd line arguments?
2. Using Xenstore?
3. Anything else?

I am afraid this would apply to every device that we want to emulate
in Qemu and for which the toolstack needs to generate a device-tree node by
using something defined with GUEST_*, unless I have really missed something.



>
> Cheers,
>
> --
> Julien Grall
>
>

-- 
Regards,

Oleksandr Tyshchenko

Re: Proposal for virtual IOMMU binding b/w vIOMMU and passthrough devices

2022-10-28 Thread Oleksandr Tyshchenko

On Thu, Oct 27, 2022 at 7:49 PM Rahul Singh  wrote:

> Hi Oleksandr,
>

Hello Rahul

[sorry for the possible format issues]


>
> > On 26 Oct 2022, at 7:23 pm, Oleksandr Tyshchenko 
> wrote:
> >
> >
> >
> > On Wed, Oct 26, 2022 at 8:18 PM Michal Orzel 
> wrote:
> > Hi Rahul,
> >
> >
> > Hello all
> >
> > [sorry for the possible format issues]
> >
> >
> > On 26/10/2022 16:33, Rahul Singh wrote:
> > >
> > >
> > > Hi Julien,
> > >
> > >> On 26 Oct 2022, at 2:36 pm, Julien Grall  wrote:
> > >>
> > >>
> > >>
> > >> On 26/10/2022 14:17, Rahul Singh wrote:
> > >>> Hi All,
> > >>
> > >> Hi Rahul,
> > >>
> > >>> At Arm, we started to implement the POC to support 2 levels of page
> tables/nested translation in SMMUv3.
> > >>> To support nested translation for guest OS Xen needs to expose the
> virtual IOMMU. If we passthrough the
> > >>> device to the guest that is behind an IOMMU and virtual IOMMU is
> enabled for the guest there is a need to
> > >>> add IOMMU binding for the device in the passthrough node as per [1].
> This email is to get an agreement on
> > >>> how to add the IOMMU binding for guest OS.
> > >>> Before I will explain how to add the IOMMU binding let me give a
> brief overview of how we will add support for virtual
> > >>> IOMMU on Arm. In order to implement virtual IOMMU Xen need SMMUv3
> Nested translation support. SMMUv3 hardware
> > >>> supports two stages of translation. Each stage of translation can be
> independently enabled. An incoming address is logically
> > >>> translated from VA to IPA in stage 1, then the IPA is input to stage
> 2 which translates the IPA to the output PA. Stage 1 is
> > >>> intended to be used by a software entity( Guest OS) to provide
> isolation or translation to buffers within the entity, for example,
> > >>> DMA isolation within an OS. Stage 2 is intended to be available in
> systems supporting the Virtualization Extensions and is
> > >>> intended to virtualize device DMA to guest VM address spaces. When
> both stage 1 and stage 2 are enabled, the translation
> > >>> configuration is called nesting.
> > >>> Stage 1 translation support is required to provide isolation between
> different devices within the guest OS. XEN already supports
> > >>> Stage 2 translation but there is no support for Stage 1 translation
> for guests. We will add support for guests to configure
> > >>> the Stage 1 transition via virtual IOMMU. XEN will emulate the SMMU
> hardware and exposes the virtual SMMU to the guest.
> > >>> Guest can use the native SMMU driver to configure the stage 1
> translation. When the guest configures the SMMU for Stage 1,
> > >>> XEN will trap the access and configure the hardware accordingly.
> > >>> Now back to the question of how we can add the IOMMU binding between
> the virtual IOMMU and the master devices so that
> > >>> guests can configure the IOMMU correctly. The solution that I am
> suggesting is as below:
> > >>> For dom0, while handling the DT node(handle_node()) Xen will replace
> the phandle in the "iommus" property with the virtual
> > >>> IOMMU node phandle.
> > >> Below, you said that each IOMMUs may have a different ID space. So
> shouldn't we expose one vIOMMU per pIOMMU? If not, how do you expect the
> user to specify the mapping?
> > >
> > > Yes you are right we need to create one vIOMMU per pIOMMU for dom0.
> This also helps in the ACPI case
> > > where we don’t need to modify the tables to delete the pIOMMU entries
> and create one vIOMMU.
> > > In this case, no need to replace the phandle as Xen create the vIOMMU
> with the same pIOMMU
> > > phandle and same base address.
> > >
> > > For domU guests one vIOMMU per guest will be created.
> > >
> > >>
> > >>> For domU guests, when passthrough the device to the guest as per
> [2],  add the below property in the partial device tree
> > >>> node that is required to describe the generic device tree binding
> for IOMMUs and their master(s)
> > >>> "iommus = < &magic_phandle 0xvMasterID>
> > >>>  • magic_phandle will be the phandle ( vIOMMU phandle in xl)
> that will be documented so that the user can set that in partial DT node
> (0xfdea).
> > >&g

Re: Proposal for virtual IOMMU binding b/w vIOMMU and passthrough devices

2022-10-26 Thread Oleksandr Tyshchenko

ice
> in DT to the guest.
> >>>  iommu@4f00 {
> >>> compatible = "arm,smmu-v3";
> >>>  interrupts = <0x00 0xe4 0xf04>;
> >>> interrupt-parent = <0x01>;
> >>> #iommu-cells = <0x01>;
> >>> interrupt-names = "combined";
> >>> reg = <0x00 0x4f00 0x00 0x4>;
> >>> phandle = <0xfdeb>;
> >>> name = "iommu";
> >>> };
> >>
> >> So I guess this node will be written by Xen. How will you handle the case
> where there are extra properties to be added (e.g. dma-coherent)?
> >
> > In this example this is the physical IOMMU node. The vIOMMU node will be
> created by xl during guest creation.
> >>
> >>>  test@1000 {
> >>>  compatible = "viommu-test";
> >>>  iommus = <0xfdeb 0x10>;
> >>
> >> I am a bit confused. Here you use 0xfdeb for the phandle but below...
> >
> > Here 0xfdeb is the physical IOMMU node phandle...
> >>
> >>>  interrupts = <0x00 0xff 0x04>;
> >>>  reg = <0x00 0x1000 0x00 0x1000>;
> >>>  name = "viommu-test";
> >>> };
> >>>  The partial Device tree node will be like this:
> >>>  / {
> >>> /* #*cells are here to keep DTC happy */
> >>> #address-cells = <2>;
> >>> #size-cells = <2>;
> >>>   passthrough {
> >>> compatible = "simple-bus";
> >>> ranges;
> >>> #address-cells = <2>;
> >>> #size-cells = <2>;
> >>>  test@1000 {
> >>>  compatible = "viommu-test";
> >>>  reg = <0 0x1000 0 0x1000>;
> >>>  interrupts = <0 80 4  0 81 4  0 82 4>;
> >>>  iommus = <0xfdea 0x01>;
> >>
> >> ... you use 0xfdea. Does this mean 'xl' will rewrite the phandle?
> >
> > but here the user has to set the “iommus” property with the magic phandle as
> explained earlier. 0xfdea is the magic phandle.
> >
> > Regards,
> > Rahul
>
> ~Michal
>
>
>

-- 
Regards,

Oleksandr Tyshchenko

Re: [PATCH V4 1/2] xen/virtio: Optimize the setup of "xen-grant-dma" devices

2022-10-25 Thread Oleksandr Tyshchenko


On 25.10.22 20:27, Xenia Ragiadakou wrote:

Hello Xenia

> On 10/25/22 19:20, Oleksandr Tyshchenko wrote:
>> From: Oleksandr Tyshchenko 
>>
>> This is needed to avoid having to parse the same device-tree
>> several times for a given device.
>>
>> For this to work we need to install the xen_virtio_restricted_mem_acc
>> callback in Arm's xen_guest_init(), which is the same callback that x86's
>> PV and HVM modes already use, and remove the manual assignment in
>> xen_setup_dma_ops(). We also need to split the code that initializes
>> backend_domid into a separate function.
>>
>> Prior to current patch we parsed the device-tree three times:
>> 1. xen_setup_dma_ops()->...->xen_is_dt_grant_dma_device()
>> 2. xen_setup_dma_ops()->...->xen_dt_grant_init_backend_domid()
>> 3. xen_virtio_mem_acc()->...->xen_is_dt_grant_dma_device()
>>
>> With current patch we parse the device-tree only once in
>> xen_virtio_restricted_mem_acc()->...->xen_dt_grant_init_backend_domid()
>>
>> Other benefits are:
>> - Not diverge from x86 when setting up Xen grant DMA ops
>> - Drop several global functions
>>
>> Signed-off-by: Oleksandr Tyshchenko 
>
> Reviewed-by: Xenia Ragiadakou 

Thanks!


>
> I have a question unrelated to the patch.
> CONFIG_XEN_VIRTIO_FORCE_GRANT cannot be used to force backend dom0 in 
> case xen_dt_grant_init_backend_domid() fails?

Good question, as always)


The current patch doesn't change behavior in the context of 
CONFIG_XEN_VIRTIO_FORCE_GRANT usage on Arm with device-tree.
This option is not applied to device-tree based devices, since for them we 
have a way to communicate backend_domid, so there is no need to guess.

Below is my understanding, which might be wrong.

A xen_dt_grant_init_backend_domid() failure means that we didn't 
retrieve the backend_domid from the device node
(either the binding is wrong or it is absent altogether; the latter means 
that the device is *not* required to use grants for virtio).
I don't really know whether forcing grant usage with domid = 0 would 
be a good idea in that case; it just might not work,
for instance if the backend is in a domain other than Dom0, or is in 
Dom0 but doesn't support grant mappings.

On the other hand, CONFIG_XEN_VIRTIO_FORCE_GRANT is disabled by 
default; if it gets enabled, then the user is likely aware of the 
consequences.
If we want to always honor CONFIG_XEN_VIRTIO_FORCE_GRANT, we would 
likely need to do the "if (IS_ENABLED(CONFIG_XEN_VIRTIO_FORCE_GRANT))"
check first (before the check for a DT device).
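
To illustrate the two orderings discussed here, below is a rough userspace
sketch, not kernel code. Everything named model_* is hypothetical, the
domid_t typedef is an assumption made so the snippet builds standalone, and
the force_grant flag stands in for
"IS_ENABLED(CONFIG_XEN_VIRTIO_FORCE_GRANT) || xen_pv_domain()".

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint16_t domid_t; /* assumption: stand-in for the kernel's domid_t */

/* Hypothetical stand-in for the bits of struct device that matter here. */
struct model_dev {
	bool has_of_node;         /* models dev->of_node != NULL */
	bool has_grant_iommus;    /* "iommus" points at a "xen,grant-dma" node */
	domid_t dt_backend_domid; /* endpoint ID taken from the "iommus" cell */
};

/* Models xen_dt_grant_init_backend_domid(): -ESRCH if no usable binding. */
int model_dt_init_backend_domid(const struct model_dev *dev, domid_t *domid)
{
	if (!dev->has_grant_iommus)
		return -ESRCH;
	*domid = dev->dt_backend_domid;
	return 0;
}

/* Ordering as in the patch: a DT device never falls back to dom0. */
int model_init_dt_first(const struct model_dev *dev, bool force_grant,
			domid_t *domid)
{
	assert(dev && domid);
	if (dev->has_of_node)
		return model_dt_init_backend_domid(dev, domid);
	if (force_grant) {
		*domid = 0;
		return 0;
	}
	return -ENODEV;
}

/* Alternative ordering: the force option is checked before the DT path. */
int model_init_force_first(const struct model_dev *dev, bool force_grant,
			   domid_t *domid)
{
	assert(dev && domid);
	if (force_grant) {
		*domid = 0;
		return 0;
	}
	if (dev->has_of_node)
		return model_dt_init_backend_domid(dev, domid);
	return -ENODEV;
}
```

Note that in this model the force-first variant also overrides a valid DT
binding with domid 0, which is the case where things might not work if the
backend is not in Dom0.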


>
>
>> ---
>> New patch
>> ---
>>   arch/arm/xen/enlighten.c    |  2 +-
>>   drivers/xen/grant-dma-ops.c | 77 ++---
>>   include/xen/arm/xen-ops.h   |  4 +-
>>   include/xen/xen-ops.h   | 16 
>>   4 files changed, 30 insertions(+), 69 deletions(-)
>>
>> diff --git a/arch/arm/xen/enlighten.c b/arch/arm/xen/enlighten.c
>> index 93c8ccbf2982..7d59765aef22 100644
>> --- a/arch/arm/xen/enlighten.c
>> +++ b/arch/arm/xen/enlighten.c
>> @@ -445,7 +445,7 @@ static int __init xen_guest_init(void)
>>   return 0;
>>     if (IS_ENABLED(CONFIG_XEN_VIRTIO))
>> -    virtio_set_mem_acc_cb(xen_virtio_mem_acc);
>> +    virtio_set_mem_acc_cb(xen_virtio_restricted_mem_acc);
>>     if (!acpi_disabled)
>>   xen_acpi_guest_init();
>> diff --git a/drivers/xen/grant-dma-ops.c b/drivers/xen/grant-dma-ops.c
>> index daa525df7bdc..1e797a043980 100644
>> --- a/drivers/xen/grant-dma-ops.c
>> +++ b/drivers/xen/grant-dma-ops.c
>> @@ -292,50 +292,20 @@ static const struct dma_map_ops xen_grant_dma_ops = {
>>   .dma_supported = xen_grant_dma_supported,
>>   };
>>   -static bool xen_is_dt_grant_dma_device(struct device *dev)
>> -{
>> -    struct device_node *iommu_np;
>> -    bool has_iommu;
>> -
>> -    iommu_np = of_parse_phandle(dev->of_node, "iommus", 0);
>> -    has_iommu = iommu_np &&
>> -    of_device_is_compatible(iommu_np, "xen,grant-dma");
>> -    of_node_put(iommu_np);
>> -
>> -    return has_iommu;
>> -}
>> -
>> -bool xen_is_grant_dma_device(struct device *dev)
>> -{
>> -    /* XXX Handle only DT devices for now */
>> -    if (dev->of_node)
>> -    return xen_is_dt_grant_dma_device(dev);
>> -
>> -    return false;
>> -}
>> -
>> -bool xen_virtio_mem_acc(struct virtio_device *dev)
>> -{
>> -    if (IS_ENABLED(CONFIG_XEN_VIRTIO_FORCE_GRANT) || xen_pv_domain())
>> -    return true;
>

Re: [PATCH V3] xen/virtio: Handle PCI devices which Host controller is described in DT

2022-10-25 Thread Oleksandr Tyshchenko


On 22.10.22 09:44, Oleksandr wrote:

Hello Stefano.

>
> On 21.10.22 23:08, Stefano Stabellini wrote:
>
> Hello Stefano
>
>> On Fri, 21 Oct 2022, Oleksandr Tyshchenko wrote:
>>> From: Oleksandr Tyshchenko 
>>>
>>> Use the same "xen-grant-dma" device concept for the PCI devices
>>> behind device-tree based PCI Host controller, but with one 
>>> modification.
>>> Unlike for platform devices, we cannot use generic IOMMU bindings
>>> (iommus property), as we need to support more flexible configuration.
>>> The problem is that PCI devices under a single PCI Host controller
>>> may have their backends running in different Xen domains and thus have
>>> different endpoint IDs (backend domain IDs).
>>>
>>> Add the ability to deal with generic PCI-IOMMU bindings (iommu-map/
>>> iommu-map-mask properties), which allow us to describe the relationship
>>> between PCI devices and backend domain IDs properly.
>>>
>>> To avoid having to look up the PCI Host bridge twice and to reduce
>>> the number of checks, pass an extra struct device_node *np to both
>>> xen_dt_grant_init_backend_domid() and xen_is_dt_grant_dma_device().
>>> While at it, also pass domid_t *backend_domid instead of
>>> struct xen_grant_dma_data *data to the former.
>>>
>>> So with this patch the code expects the iommus property for platform
>>> devices and the iommu-map/iommu-map-mask properties for PCI devices.
>>>
>>> An example of the iommu-map property generated by the toolstack
>>> for two PCI devices :00:01.0 and :00:02.0 whose
>>> backends are running in different Xen domains with IDs 1 and 2
>>> respectively:
>>> iommu-map = <0x08 0xfde9 0x01 0x08 0x10 0xfde9 0x02 0x08>;
>>>
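
To make the iommu-map example above concrete, here is a minimal userspace
sketch of how such a property could be decoded. It assumes the generic
PCI-IOMMU layout of <rid-base iommu-phandle iommu-base length> per entry,
assumes iommu-map-mask leaves the requester ID unchanged, and treats
iommu-base as the backend domid as the commit message describes. The names
iommu_map_entry, pci_rid, map_rid and example_map are made up for
illustration and are not the kernel's API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One iommu-map entry: <rid-base iommu-phandle iommu-base length>. */
struct iommu_map_entry {
	uint32_t rid_base; /* first requester ID covered by the entry */
	uint32_t phandle;  /* phandle of the "IOMMU" node (0xfde9 above) */
	uint32_t out_base; /* endpoint ID for rid_base (backend domid here) */
	uint32_t length;   /* number of consecutive requester IDs covered */
};

/* Requester ID of a PCI function: bus in bits 15:8, dev in 7:3, fn in 2:0. */
uint32_t pci_rid(uint32_t bus, uint32_t dev, uint32_t fn)
{
	return (bus << 8) | (dev << 3) | fn;
}

/* Returns 0 and stores the endpoint ID on a match, -1 if no entry covers rid. */
int map_rid(const struct iommu_map_entry *map, size_t n, uint32_t rid,
	    uint32_t *out)
{
	assert(map != NULL && out != NULL);
	for (size_t i = 0; i < n; i++) {
		if (rid >= map[i].rid_base &&
		    rid - map[i].rid_base < map[i].length) {
			*out = map[i].out_base + (rid - map[i].rid_base);
			return 0;
		}
	}
	return -1;
}

/* iommu-map = <0x08 0xfde9 0x01 0x08 0x10 0xfde9 0x02 0x08>; */
const struct iommu_map_entry example_map[] = {
	{ 0x08, 0xfde9, 0x01, 0x08 },
	{ 0x10, 0xfde9, 0x02, 0x08 },
};
```

With this model, function 0 of device 00:01 (requester ID 0x08) resolves to
endpoint ID 1 and function 0 of device 00:02 (requester ID 0x10) resolves
to endpoint ID 2, matching the domain IDs in the example.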
>>> Signed-off-by: Oleksandr Tyshchenko 
>>> ---
>>> Slightly RFC. This is needed to support Xen grant mappings for 
>>> virtio-pci devices
>>> on Arm at some point in the future. The Xen toolstack side is not 
>>> completely ready yet.
>>> Here, for PCI devices we use more flexible way to pass backend domid 
>>> to the guest
>>> than for platform devices.
>>>
>>> Changes V1 -> V2:
>>>     - update commit description
>>>     - rebase
>>>     - rework to use generic PCI-IOMMU bindings instead of generic 
>>> IOMMU bindings
>>>
>>> Changes V2 -> V3:
>>>     - update commit description, add an example
>>>     - drop xen_dt_map_id() and squash xen_dt_get_pci_host_node() with
>>>   xen_dt_get_node()
>>>     - pass struct device_node *np to xen_is_dt_grant_dma_device() and
>>>   xen_dt_grant_init_backend_domid()
>>>     - pass domid_t *backend_domid instead of struct 
>>> xen_grant_dma_data *data
>>>   to xen_dt_grant_init_backend_domid()
>>>
>>> Previous discussion is at:
>>> https://lore.kernel.org/xen-devel/20221006174804.2003029-1-olekst...@gmail.com/
>>> https://lore.kernel.org/xen-devel/20221015153409.918775-1-olekst...@gmail.com/
>>>
>>> Based on:
>>> https://git.kernel.org/pub/scm/linux/kernel/git/xen/tip.git/log/?h=for-linus-6.1
>>> ---
>>>   drivers/xen/grant-dma-ops.c | 80 ++---
>>>   1 file changed, 66 insertions(+), 14 deletions(-)
>>>
>>> diff --git a/drivers/xen/grant-dma-ops.c b/drivers/xen/grant-dma-ops.c
>>> index daa525df7bdc..76b29d20aeee 100644
>>> --- a/drivers/xen/grant-dma-ops.c
>>> +++ b/drivers/xen/grant-dma-ops.c
>>> @@ -10,6 +10,7 @@
>>>   #include 
>>>   #include 
>>>   #include 
>>> +#include 
>>>   #include 
>>>   #include 
>>>   #include 
>>> @@ -292,12 +293,37 @@ static const struct dma_map_ops xen_grant_dma_ops = {
>>>   .dma_supported = xen_grant_dma_supported,
>>>   };
>>>   -static bool xen_is_dt_grant_dma_device(struct device *dev)

[PATCH V4 1/2] xen/virtio: Optimize the setup of "xen-grant-dma" devices

2022-10-25 Thread Oleksandr Tyshchenko

From: Oleksandr Tyshchenko 

This is needed to avoid having to parse the same device-tree
several times for a given device.

For this to work we need to install the xen_virtio_restricted_mem_acc
callback in Arm's xen_guest_init(), which is the same callback that x86's
PV and HVM modes already use, and remove the manual assignment in
xen_setup_dma_ops(). We also need to split the code that initializes
backend_domid into a separate function.

Prior to current patch we parsed the device-tree three times:
1. xen_setup_dma_ops()->...->xen_is_dt_grant_dma_device()
2. xen_setup_dma_ops()->...->xen_dt_grant_init_backend_domid()
3. xen_virtio_mem_acc()->...->xen_is_dt_grant_dma_device()

With current patch we parse the device-tree only once in
xen_virtio_restricted_mem_acc()->...->xen_dt_grant_init_backend_domid()

Other benefits are:
- Not diverge from x86 when setting up Xen grant DMA ops
- Drop several global functions

Signed-off-by: Oleksandr Tyshchenko 
---
New patch
---
 arch/arm/xen/enlighten.c|  2 +-
 drivers/xen/grant-dma-ops.c | 77 ++---
 include/xen/arm/xen-ops.h   |  4 +-
 include/xen/xen-ops.h   | 16 
 4 files changed, 30 insertions(+), 69 deletions(-)

diff --git a/arch/arm/xen/enlighten.c b/arch/arm/xen/enlighten.c
index 93c8ccbf2982..7d59765aef22 100644
--- a/arch/arm/xen/enlighten.c
+++ b/arch/arm/xen/enlighten.c
@@ -445,7 +445,7 @@ static int __init xen_guest_init(void)
return 0;
 
if (IS_ENABLED(CONFIG_XEN_VIRTIO))
-   virtio_set_mem_acc_cb(xen_virtio_mem_acc);
+   virtio_set_mem_acc_cb(xen_virtio_restricted_mem_acc);
 
if (!acpi_disabled)
xen_acpi_guest_init();
diff --git a/drivers/xen/grant-dma-ops.c b/drivers/xen/grant-dma-ops.c
index daa525df7bdc..1e797a043980 100644
--- a/drivers/xen/grant-dma-ops.c
+++ b/drivers/xen/grant-dma-ops.c
@@ -292,50 +292,20 @@ static const struct dma_map_ops xen_grant_dma_ops = {
.dma_supported = xen_grant_dma_supported,
 };
 
-static bool xen_is_dt_grant_dma_device(struct device *dev)
-{
-   struct device_node *iommu_np;
-   bool has_iommu;
-
-   iommu_np = of_parse_phandle(dev->of_node, "iommus", 0);
-   has_iommu = iommu_np &&
-   of_device_is_compatible(iommu_np, "xen,grant-dma");
-   of_node_put(iommu_np);
-
-   return has_iommu;
-}
-
-bool xen_is_grant_dma_device(struct device *dev)
-{
-   /* XXX Handle only DT devices for now */
-   if (dev->of_node)
-   return xen_is_dt_grant_dma_device(dev);
-
-   return false;
-}
-
-bool xen_virtio_mem_acc(struct virtio_device *dev)
-{
-   if (IS_ENABLED(CONFIG_XEN_VIRTIO_FORCE_GRANT) || xen_pv_domain())
-   return true;
-
-   return xen_is_grant_dma_device(dev->dev.parent);
-}
-
 static int xen_dt_grant_init_backend_domid(struct device *dev,
-  struct xen_grant_dma_data *data)
+  domid_t *backend_domid)
 {
struct of_phandle_args iommu_spec;
 
if (of_parse_phandle_with_args(dev->of_node, "iommus", "#iommu-cells",
0, &iommu_spec)) {
-   dev_err(dev, "Cannot parse iommus property\n");
+   dev_dbg(dev, "Cannot parse iommus property\n");
return -ESRCH;
}
 
if (!of_device_is_compatible(iommu_spec.np, "xen,grant-dma") ||
iommu_spec.args_count != 1) {
-   dev_err(dev, "Incompatible IOMMU node\n");
+   dev_dbg(dev, "Incompatible IOMMU node\n");
of_node_put(iommu_spec.np);
return -ESRCH;
}
@@ -346,12 +316,28 @@ static int xen_dt_grant_init_backend_domid(struct device *dev,
 * The endpoint ID here means the ID of the domain where the
 * corresponding backend is running
 */
-   data->backend_domid = iommu_spec.args[0];
+   *backend_domid = iommu_spec.args[0];
 
return 0;
 }
 
-void xen_grant_setup_dma_ops(struct device *dev)
+static int xen_grant_init_backend_domid(struct device *dev,
+   domid_t *backend_domid)
+{
+   int ret = -ENODEV;
+
+   if (dev->of_node) {
+   ret = xen_dt_grant_init_backend_domid(dev, backend_domid);
+   } else if (IS_ENABLED(CONFIG_XEN_VIRTIO_FORCE_GRANT) || xen_pv_domain()) {
+   dev_info(dev, "Using dom0 as backend\n");
+   *backend_domid = 0;
+   ret = 0;
+   }
+
+   return ret;
+}
+
+static void xen_grant_setup_dma_ops(struct device *dev, domid_t backend_domid)
 {
struct xen_grant_dma_data *data;
 
@@ -365,16 +351,7 @@ void xen_grant_setup_dma_ops(struct device *dev)
if (!data)
got
