[PATCH] optee: immediately free RPC buffers that are released by OP-TEE

2022-05-03 Thread Jens Wiklander
This commit fixes a case overlooked in [1].

There are two kinds of shared memory buffers used by OP-TEE:
1. Normal payload buffer
2. Internal command structure buffers

The internal command structure buffers are represented with a shadow
copy internally in Xen since this buffer can contain physical addresses
that may need to be translated between real physical address and guest
physical address without leaking information to the guest.

[1] fixes the problem when releasing the normal payload buffers. The
internal command structure buffers must be released in the same way:
Xen must drop its own tracking before the guest is told to free them.
Failure to follow this order opens a window where the guest has freed
the shared memory but Xen is still tracking the buffer.

During this window the guest may happen to recycle this particular
shared memory in some other thread and try to use it. Xen will block
this which will lead to spurious failures to register a new shared
memory block.

Fix this by freeing the internal command structure buffers before
informing the guest that the buffer can be freed.

[1] 5b13eb1d978e ("optee: immediately free buffers that are released by OP-TEE")

Signed-off-by: Jens Wiklander 
---
 xen/arch/arm/tee/optee.c | 12 +---
 1 file changed, 5 insertions(+), 7 deletions(-)

diff --git a/xen/arch/arm/tee/optee.c b/xen/arch/arm/tee/optee.c
index 8a39fe33b0ef..539a862fd185 100644
--- a/xen/arch/arm/tee/optee.c
+++ b/xen/arch/arm/tee/optee.c
@@ -1149,6 +1149,11 @@ static int handle_rpc_return(struct optee_domain *ctx,
 call->rpc_data_cookie = 0;
 }
 unmap_domain_page(shm_rpc->xen_arg);
+} else if ( call->rpc_op == OPTEE_SMC_RPC_FUNC_FREE ) {
+uint64_t cookie = regpair_to_uint64(get_user_reg(regs, 1),
+get_user_reg(regs, 2));
+
+free_shm_rpc(ctx, cookie);
 }
 
 return ret;
@@ -1598,13 +1603,6 @@ static void handle_rpc(struct optee_domain *ctx, struct cpu_user_regs *regs)
 case OPTEE_SMC_RPC_FUNC_ALLOC:
 handle_rpc_func_alloc(ctx, regs, call);
 return;
-case OPTEE_SMC_RPC_FUNC_FREE:
-{
-uint64_t cookie = regpair_to_uint64(call->rpc_params[0],
-call->rpc_params[1]);
-free_shm_rpc(ctx, cookie);
-break;
-}
 case OPTEE_SMC_RPC_FUNC_FOREIGN_INTR:
 break;
 case OPTEE_SMC_RPC_FUNC_CMD:
-- 
2.31.1
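
The ordering problem described in the commit message above can be illustrated with a small userspace sketch. This is not Xen code: struct tracker, tracker_register() and tracker_free() are hypothetical stand-ins for the shadow-buffer bookkeeping, and the only point is that a cookie which is still tracked cannot be registered again until it has been dropped.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_TRACKED 4

/* Hypothetical stand-in for the list of shadow buffers the hypervisor tracks. */
struct tracker {
    uint64_t cookies[MAX_TRACKED];
    bool used[MAX_TRACKED];
};

/* Refuse a cookie that is still tracked; otherwise take the first free slot. */
static bool tracker_register(struct tracker *t, uint64_t cookie)
{
    size_t free_slot = MAX_TRACKED;

    for (size_t i = 0; i < MAX_TRACKED; i++) {
        if (t->used[i] && t->cookies[i] == cookie)
            return false;               /* still tracked: spurious failure */
        if (!t->used[i] && free_slot == MAX_TRACKED)
            free_slot = i;
    }
    if (free_slot == MAX_TRACKED)
        return false;                   /* no room left */

    t->cookies[free_slot] = cookie;
    t->used[free_slot] = true;
    return true;
}

/* Drop the tracking entry for a cookie, if there is one. */
static void tracker_free(struct tracker *t, uint64_t cookie)
{
    for (size_t i = 0; i < MAX_TRACKED; i++)
        if (t->used[i] && t->cookies[i] == cookie)
            t->used[i] = false;
}
```

If the guest recycles a cookie between being told "free" and the call to tracker_free(), tracker_register() fails. Calling tracker_free() first closes that window, which is the reordering the patch performs.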




[PATCH 23/32] Bluetooth: Use mem_to_flex_dup() with struct hci_op_configure_data_path

2022-05-03 Thread Kees Cook
As part of the work to perform bounds checking on all memcpy() uses,
replace the open-coded deserialization of bytes out of memory into a
trailing flexible array by using a flex_array.h helper to perform the
allocation, bounds checking, and copying.

Cc: Marcel Holtmann 
Cc: Johan Hedberg 
Cc: Luiz Augusto von Dentz 
Cc: "David S. Miller" 
Cc: Eric Dumazet 
Cc: Jakub Kicinski 
Cc: Paolo Abeni 
Cc: linux-blueto...@vger.kernel.org
Cc: net...@vger.kernel.org
Signed-off-by: Kees Cook 
---
 include/net/bluetooth/hci.h | 4 ++--
 net/bluetooth/hci_request.c | 9 ++---
 2 files changed, 4 insertions(+), 9 deletions(-)

diff --git a/include/net/bluetooth/hci.h b/include/net/bluetooth/hci.h
index 62a9bb022aed..7b398ef0b46d 100644
--- a/include/net/bluetooth/hci.h
+++ b/include/net/bluetooth/hci.h
@@ -1321,8 +1321,8 @@ struct hci_rp_read_local_oob_ext_data {
 struct hci_op_configure_data_path {
__u8direction;
__u8data_path_id;
-   __u8vnd_len;
-   __u8vnd_data[];
+   DECLARE_FLEX_ARRAY_ELEMENTS_COUNT(__u8, vnd_len);
+   DECLARE_FLEX_ARRAY_ELEMENTS(__u8, vnd_data);
 } __packed;
 
 #define HCI_OP_READ_LOCAL_VERSION  0x1001
diff --git a/net/bluetooth/hci_request.c b/net/bluetooth/hci_request.c
index f4afe482e300..e29be3810b93 100644
--- a/net/bluetooth/hci_request.c
+++ b/net/bluetooth/hci_request.c
@@ -2435,19 +2435,14 @@ int hci_req_configure_datapath(struct hci_dev *hdev, struct bt_codec *codec)
if (err < 0)
goto error;
 
-   cmd = kzalloc(sizeof(*cmd) + vnd_len, GFP_KERNEL);
-   if (!cmd) {
-   err = -ENOMEM;
+   err = mem_to_flex_dup(&cmd, vnd_data, vnd_len, GFP_KERNEL);
+   if (err < 0)
goto error;
-   }
 
err = hdev->get_data_path_id(hdev, &cmd->data_path_id);
if (err < 0)
goto error;
 
-   cmd->vnd_len = vnd_len;
-   memcpy(cmd->vnd_data, vnd_data, vnd_len);
-
cmd->direction = 0x00;
hci_req_add(, HCI_CONFIGURE_DATA_PATH, sizeof(*cmd) + vnd_len, cmd);
 
-- 
2.32.0
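
For readers without the flex_array.h patch at hand, the pattern that replaces the open-coded kzalloc() + memcpy() can be sketched in userspace C. This is only an illustration under stated assumptions: struct vnd_cmd and vnd_cmd_dup() are hypothetical names, the error codes are chosen for the sketch, and refusing a non-NULL destination is an assumption inferred from the "= NULL" initializers added elsewhere in the series, not a statement about the real helper.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical struct shaped like hci_op_configure_data_path. */
struct vnd_cmd {
    uint8_t direction;
    uint8_t data_path_id;
    uint8_t vnd_len;     /* element count; a __u8 can hold at most 255 */
    uint8_t vnd_data[];  /* trailing flexible array */
};

/*
 * Allocate *out, check that len fits the count field, set the count and
 * copy len bytes from src. Returns 0 or a negative errno value.
 */
static int vnd_cmd_dup(struct vnd_cmd **out, const void *src, size_t len)
{
    struct vnd_cmd *cmd;

    if (!out || *out)            /* assumption: never overwrite a live pointer */
        return -EINVAL;
    if (len > UINT8_MAX)         /* count would be truncated by the field type */
        return -E2BIG;

    cmd = calloc(1, sizeof(*cmd) + len);
    if (!cmd)
        return -ENOMEM;

    cmd->vnd_len = (uint8_t)len;
    if (len)
        memcpy(cmd->vnd_data, src, len);
    *out = cmd;
    return 0;
}
```

The design point is that the count field and the copy length can no longer disagree, and a length that does not fit the count's type is rejected instead of being silently truncated.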




Re: [PATCH 01/32] netlink: Avoid memcpy() across flexible array boundary

2022-05-03 Thread Gustavo A. R. Silva
On Tue, May 03, 2022 at 06:44:10PM -0700, Kees Cook wrote:
> In preparation for run-time memcpy() bounds checking, split the nlmsg
> copying for error messages (which crosses a previous unspecified flexible
> array boundary) in half. Avoids the future run-time warning:
> 
> memcpy: detected field-spanning write (size 32) of single field "&errmsg->msg" (size 16)
> 
> Creates an explicit flexible array at the end of nlmsghdr for the payload,
> named "nlmsg_payload". There is no impact on UAPI; the sizeof(struct
> nlmsghdr) does not change, but now the compiler can better reason about
> where things are being copied.
> 
> Fixed-by: Rasmus Villemoes 
> Link: https://lore.kernel.org/lkml/d7251d92-150b-5346-6237-52afc154b...@rasmusvillemoes.dk
> Cc: "David S. Miller" 
> Cc: Jakub Kicinski 
> Cc: Rich Felker 
> Cc: Eric Dumazet 
> Cc: net...@vger.kernel.org
> Signed-off-by: Kees Cook 
> ---
>  include/uapi/linux/netlink.h | 1 +
>  net/netlink/af_netlink.c | 5 -
>  2 files changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/include/uapi/linux/netlink.h b/include/uapi/linux/netlink.h
> index 855dffb4c1c3..47f9342d51bc 100644
> --- a/include/uapi/linux/netlink.h
> +++ b/include/uapi/linux/netlink.h
> @@ -47,6 +47,7 @@ struct nlmsghdr {
>   __u16   nlmsg_flags;/* Additional flags */
>   __u32   nlmsg_seq;  /* Sequence number */
>   __u32   nlmsg_pid;  /* Sending process port ID */
> + __u8nlmsg_payload[];/* Contents of message */
>  };
>  
>  /* Flags values */
> diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
> index 1b5a9c2e1c29..09346aee1022 100644
> --- a/net/netlink/af_netlink.c
> +++ b/net/netlink/af_netlink.c
> @@ -2445,7 +2445,10 @@ void netlink_ack(struct sk_buff *in_skb, struct nlmsghdr *nlh, int err,
> NLMSG_ERROR, payload, flags);
>   errmsg = nlmsg_data(rep);
>   errmsg->error = err;
> - memcpy(&errmsg->msg, nlh, payload > sizeof(*errmsg) ? nlh->nlmsg_len : sizeof(*nlh));
> + errmsg->msg = *nlh;
> + if (payload > sizeof(*errmsg))
> + memcpy(errmsg->msg.nlmsg_payload, nlh->nlmsg_payload,
> +nlh->nlmsg_len - sizeof(*nlh));

They have nlmsg_len()[1] for the length of the payload without the header:

/**
 * nlmsg_len - length of message payload
 * @nlh: netlink message header
 */
static inline int nlmsg_len(const struct nlmsghdr *nlh)
{
return nlh->nlmsg_len - NLMSG_HDRLEN;
}

(would that function use some sanitization, though? what if nlmsg_len is
somehow manipulated to be less than NLMSG_HDRLEN?...)
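
To make the concern concrete, here is a minimal userspace sketch of a checked variant. It is not a proposal for the kernel helper: struct hdr_sketch, HDRLEN_SKETCH and payload_len_checked() are hypothetical names, and the 16-byte header mirrors the "(size 16)" in the warning quoted above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-in for struct nlmsghdr: five fields, 16 bytes in total. */
struct hdr_sketch {
    uint32_t nlmsg_len;    /* length of message including header */
    uint16_t nlmsg_type;
    uint16_t nlmsg_flags;
    uint32_t nlmsg_seq;
    uint32_t nlmsg_pid;
};

#define HDRLEN_SKETCH ((uint32_t)sizeof(struct hdr_sketch))

/*
 * Store the payload length in *out and return true, or return false and
 * leave *out untouched when nlmsg_len is shorter than the header itself.
 */
static bool payload_len_checked(const struct hdr_sketch *h, size_t *out)
{
    if (h->nlmsg_len < HDRLEN_SKETCH)
        return false;      /* the subtraction below would go negative */
    *out = h->nlmsg_len - HDRLEN_SKETCH;
    return true;
}
```

Without such a check, a plain "nlmsg_len - header length" computed as int goes negative for a short nlmsg_len and becomes a huge size once it is passed to memcpy().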

Also, it seems there is at least one more instance of this same issue:

diff --git a/net/netfilter/ipset/ip_set_core.c b/net/netfilter/ipset/ip_set_core.c
index 16ae92054baa..d06184b94af5 100644
--- a/net/netfilter/ipset/ip_set_core.c
+++ b/net/netfilter/ipset/ip_set_core.c
@@ -1723,7 +1723,8 @@ call_ad(struct net *net, struct sock *ctnl, struct sk_buff *skb,
  nlh->nlmsg_seq, NLMSG_ERROR, payload, 0);
errmsg = nlmsg_data(rep);
errmsg->error = ret;
-   memcpy(&errmsg->msg, nlh, nlh->nlmsg_len);
+   errmsg->msg = *nlh;
+   memcpy(errmsg->msg.nlmsg_payload, nlh->nlmsg_payload, nlmsg_len(nlh));
cmdattr = (void *)&errmsg->msg + min_len;

ret = nla_parse(cda, IPSET_ATTR_CMD_MAX, cmdattr,

--
Gustavo

[1] https://elixir.bootlin.com/linux/v5.18-rc5/source/include/net/netlink.h#L577
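
The split copy discussed in this thread (struct assignment for the header, then a separate memcpy() for the payload) can be sketched in plain C. The names msg_sketch and copy_msg() are hypothetical; the sketch relies only on the standard C rule that assigning a struct with a flexible array member copies the fixed part and leaves the trailing array alone.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical message: fixed header followed by a flexible payload. */
struct msg_sketch {
    uint32_t len;        /* total length in bytes, header included */
    uint32_t seq;
    uint8_t payload[];   /* explicit flexible array for the payload */
};

/*
 * Copy src into dst in two steps, so that each write stays inside one
 * field. The caller must have allocated dst with room for the payload.
 */
static void copy_msg(struct msg_sketch *dst, const struct msg_sketch *src)
{
    *dst = *src;         /* header only: the flexible array is not copied */
    if (src->len > sizeof(*src))
        memcpy(dst->payload, src->payload, src->len - sizeof(*src));
}
```

Because the second memcpy() names the flexible array as its destination, a bounds-checking memcpy() sees a write into a variable-length field rather than a write that spans past a 16-byte header.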



[PATCH 32/32] esas2r: Use __mem_to_flex() with struct atto_ioctl

2022-05-03 Thread Kees Cook
As part of the work to perform bounds checking on all memcpy() uses,
replace the open-coded deserialization of bytes out of memory into a
trailing flexible array by using a flex_array.h helper to perform the
allocation, bounds checking, and copying. This requires adding the
flexible array explicitly.

Cc: Bradley Grove 
Cc: "James E.J. Bottomley" 
Cc: "Martin K. Petersen" 
Cc: linux-s...@vger.kernel.org
Signed-off-by: Kees Cook 
---
 drivers/scsi/esas2r/atioctl.h  |  1 +
 drivers/scsi/esas2r/esas2r_ioctl.c | 11 +++
 2 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/drivers/scsi/esas2r/atioctl.h b/drivers/scsi/esas2r/atioctl.h
index ff2ad9b38575..dd3437412ffc 100644
--- a/drivers/scsi/esas2r/atioctl.h
+++ b/drivers/scsi/esas2r/atioctl.h
@@ -831,6 +831,7 @@ struct __packed atto_hba_trace {
u32 total_length;
u32 trace_mask;
u8 reserved2[48];
+   u8 contents[];
 };
 
 #define ATTO_FUNC_SCSI_PASS_THRU 0x04
diff --git a/drivers/scsi/esas2r/esas2r_ioctl.c b/drivers/scsi/esas2r/esas2r_ioctl.c
index 08f4e43c7d9e..9310b54b1575 100644
--- a/drivers/scsi/esas2r/esas2r_ioctl.c
+++ b/drivers/scsi/esas2r/esas2r_ioctl.c
@@ -947,11 +947,14 @@ static int hba_ioctl_callback(struct esas2r_adapter *a,
break;
}
 
-   memcpy(trc + 1,
-  a->fw_coredump_buff + offset,
-  len);
+   if (__mem_to_flex(hi, data.trace.contents,
+ data_length,
+ a->fw_coredump_buff + offset,
+ len)) {
+   hi->status = ATTO_STS_INV_FUNC;
+   break;
+   }
 
-   hi->data_length = len;
} else if (trc->trace_func == ATTO_TRC_TF_RESET) {
memset(a->fw_coredump_buff, 0,
   ESAS2R_FWCOREDUMP_SZ);
-- 
2.32.0




[PATCH 25/32] Drivers: hv: utils: Use mem_to_flex_dup() with struct cn_msg

2022-05-03 Thread Kees Cook
As part of the work to perform bounds checking on all memcpy() uses,
replace the open-coded deserialization of bytes out of memory into a
trailing flexible array by using a flex_array.h helper to perform the
allocation, bounds checking, and copying.

Cc: "K. Y. Srinivasan" 
Cc: Haiyang Zhang 
Cc: Stephen Hemminger 
Cc: Wei Liu 
Cc: Dexuan Cui 
Cc: linux-hyp...@vger.kernel.org
Signed-off-by: Kees Cook 
---
 drivers/hv/hv_utils_transport.c | 7 ++-
 include/uapi/linux/connector.h  | 4 ++--
 2 files changed, 4 insertions(+), 7 deletions(-)

diff --git a/drivers/hv/hv_utils_transport.c b/drivers/hv/hv_utils_transport.c
index 832885198643..43b4f8893cc0 100644
--- a/drivers/hv/hv_utils_transport.c
+++ b/drivers/hv/hv_utils_transport.c
@@ -217,20 +217,17 @@ static void hvt_cn_callback(struct cn_msg *msg, struct netlink_skb_parms *nsp)
 int hvutil_transport_send(struct hvutil_transport *hvt, void *msg, int len,
  void (*on_read_cb)(void))
 {
-   struct cn_msg *cn_msg;
+   struct cn_msg *cn_msg = NULL;
int ret = 0;
 
if (hvt->mode == HVUTIL_TRANSPORT_INIT ||
hvt->mode == HVUTIL_TRANSPORT_DESTROY) {
return -EINVAL;
} else if (hvt->mode == HVUTIL_TRANSPORT_NETLINK) {
-   cn_msg = kzalloc(sizeof(*cn_msg) + len, GFP_ATOMIC);
-   if (!cn_msg)
+   if (mem_to_flex_dup(&cn_msg, msg, len, GFP_ATOMIC))
return -ENOMEM;
cn_msg->id.idx = hvt->cn_id.idx;
cn_msg->id.val = hvt->cn_id.val;
-   cn_msg->len = len;
-   memcpy(cn_msg->data, msg, len);
ret = cn_netlink_send(cn_msg, 0, 0, GFP_ATOMIC);
kfree(cn_msg);
/*
diff --git a/include/uapi/linux/connector.h b/include/uapi/linux/connector.h
index 3738936149a2..b85bbe753dae 100644
--- a/include/uapi/linux/connector.h
+++ b/include/uapi/linux/connector.h
@@ -73,9 +73,9 @@ struct cn_msg {
__u32 seq;
__u32 ack;
 
-   __u16 len;  /* Length of the following data */
+   __DECLARE_FLEX_ARRAY_ELEMENTS_COUNT(__u16, len);
__u16 flags;
-   __u8 data[0];
+   __DECLARE_FLEX_ARRAY_ELEMENTS(__u8, data);
 };
 
 #endif /* _UAPI__CONNECTOR_H */
-- 
2.32.0




[PATCH 30/32] usb: gadget: f_fs: Use mem_to_flex_dup() with struct ffs_buffer

2022-05-03 Thread Kees Cook
As part of the work to perform bounds checking on all memcpy() uses,
replace the open-coded deserialization of bytes out of memory into a
trailing flexible array by using a flex_array.h helper to perform the
allocation, bounds checking, and copying.

Cc: Felipe Balbi 
Cc: Greg Kroah-Hartman 
Cc: Eugeniu Rosca 
Cc: John Keeping 
Cc: Jens Axboe 
Cc: Udipto Goswami 
Cc: Andrew Gabbasov 
Cc: linux-...@vger.kernel.org
Signed-off-by: Kees Cook 
---
 drivers/usb/gadget/function/f_fs.c | 11 ---
 1 file changed, 4 insertions(+), 7 deletions(-)

diff --git a/drivers/usb/gadget/function/f_fs.c b/drivers/usb/gadget/function/f_fs.c
index 4585ee3a444a..bb0ff41dabd2 100644
--- a/drivers/usb/gadget/function/f_fs.c
+++ b/drivers/usb/gadget/function/f_fs.c
@@ -202,9 +202,9 @@ struct ffs_epfile {
 };
 
 struct ffs_buffer {
-   size_t length;
+   DECLARE_FLEX_ARRAY_ELEMENTS_COUNT(size_t, length);
char *data;
-   char storage[];
+   DECLARE_FLEX_ARRAY_ELEMENTS(char, storage);
 };
 
 /*  ffs_io_data structure ***/
@@ -905,7 +905,7 @@ static ssize_t __ffs_epfile_read_data(struct ffs_epfile *epfile,
  void *data, int data_len,
  struct iov_iter *iter)
 {
-   struct ffs_buffer *buf;
+   struct ffs_buffer *buf = NULL;
 
ssize_t ret = copy_to_iter(data, data_len, iter);
if (data_len == ret)
@@ -919,12 +919,9 @@ static ssize_t __ffs_epfile_read_data(struct ffs_epfile *epfile,
data_len, ret);
 
data_len -= ret;
-   buf = kmalloc(struct_size(buf, storage, data_len), GFP_KERNEL);
-   if (!buf)
+   if (mem_to_flex_dup(&buf, data + ret, data_len, GFP_KERNEL))
return -ENOMEM;
-   buf->length = data_len;
buf->data = buf->storage;
-   memcpy(buf->storage, data + ret, flex_array_size(buf, storage, data_len));
 
/*
 * At this point read_buffer is NULL or READ_BUFFER_DROP (if
-- 
2.32.0




[PATCH 31/32] xenbus: Use mem_to_flex_dup() with struct read_buffer

2022-05-03 Thread Kees Cook
As part of the work to perform bounds checking on all memcpy() uses,
replace the open-coded deserialization of bytes out of memory into a
trailing flexible array by using a flex_array.h helper to perform the
allocation, bounds checking, and copying.

Cc: Boris Ostrovsky 
Cc: Juergen Gross 
Cc: Stefano Stabellini 
Cc: xen-devel@lists.xenproject.org
Signed-off-by: Kees Cook 
---
 drivers/xen/xenbus/xenbus_dev_frontend.c | 12 
 1 file changed, 4 insertions(+), 8 deletions(-)

diff --git a/drivers/xen/xenbus/xenbus_dev_frontend.c b/drivers/xen/xenbus/xenbus_dev_frontend.c
index 597af455a522..4267aaef33fb 100644
--- a/drivers/xen/xenbus/xenbus_dev_frontend.c
+++ b/drivers/xen/xenbus/xenbus_dev_frontend.c
@@ -81,8 +81,8 @@ struct xenbus_transaction_holder {
 struct read_buffer {
struct list_head list;
unsigned int cons;
-   unsigned int len;
-   char msg[];
+   DECLARE_FLEX_ARRAY_ELEMENTS_COUNT(unsigned int, len);
+   DECLARE_FLEX_ARRAY_ELEMENTS(char, msg);
 };
 
 struct xenbus_file_priv {
@@ -188,21 +188,17 @@ static ssize_t xenbus_file_read(struct file *filp,
  */
 static int queue_reply(struct list_head *queue, const void *data, size_t len)
 {
-   struct read_buffer *rb;
+   struct read_buffer *rb = NULL;
 
if (len == 0)
return 0;
if (len > XENSTORE_PAYLOAD_MAX)
return -EINVAL;
 
-   rb = kmalloc(sizeof(*rb) + len, GFP_KERNEL);
-   if (rb == NULL)
+   if (mem_to_flex_dup(&rb, data, len, GFP_KERNEL))
return -ENOMEM;
 
rb->cons = 0;
-   rb->len = len;
-
-   memcpy(rb->msg, data, len);
 
list_add_tail(&rb->list, queue);
return 0;
-- 
2.32.0




[PATCH 26/32] ima: Use mem_to_flex_dup() with struct modsig

2022-05-03 Thread Kees Cook
As part of the work to perform bounds checking on all memcpy() uses,
replace the open-coded deserialization of bytes out of memory into a
trailing flexible array by using a flex_array.h helper to perform the
allocation, bounds checking, and copying.

Cc: Mimi Zohar 
Cc: Dmitry Kasatkin 
Cc: James Morris 
Cc: "Serge E. Hallyn" 
Cc: linux-integr...@vger.kernel.org
Cc: linux-security-mod...@vger.kernel.org
Signed-off-by: Kees Cook 
---
 security/integrity/ima/ima_modsig.c | 12 
 1 file changed, 4 insertions(+), 8 deletions(-)

diff --git a/security/integrity/ima/ima_modsig.c b/security/integrity/ima/ima_modsig.c
index fb25723c65bc..200c080d36de 100644
--- a/security/integrity/ima/ima_modsig.c
+++ b/security/integrity/ima/ima_modsig.c
@@ -28,8 +28,8 @@ struct modsig {
 * This is what will go to the measurement list if the template requires
 * storing the signature.
 */
-   int raw_pkcs7_len;
-   u8 raw_pkcs7[];
+   DECLARE_FLEX_ARRAY_ELEMENTS_COUNT(int, raw_pkcs7_len);
+   DECLARE_FLEX_ARRAY_ELEMENTS(u8, raw_pkcs7);
 };
 
 /*
@@ -42,7 +42,7 @@ int ima_read_modsig(enum ima_hooks func, const void *buf, loff_t buf_len,
 {
const size_t marker_len = strlen(MODULE_SIG_STRING);
const struct module_signature *sig;
-   struct modsig *hdr;
+   struct modsig *hdr = NULL;
size_t sig_len;
const void *p;
int rc;
@@ -65,8 +65,7 @@ int ima_read_modsig(enum ima_hooks func, const void *buf, loff_t buf_len,
buf_len -= sig_len + sizeof(*sig);
 
/* Allocate sig_len additional bytes to hold the raw PKCS#7 data. */
-   hdr = kzalloc(sizeof(*hdr) + sig_len, GFP_KERNEL);
-   if (!hdr)
+   if (mem_to_flex_dup(&hdr, buf + buf_len, sig_len, GFP_KERNEL))
return -ENOMEM;
 
hdr->pkcs7_msg = pkcs7_parse_message(buf + buf_len, sig_len);
@@ -76,9 +75,6 @@ int ima_read_modsig(enum ima_hooks func, const void *buf, loff_t buf_len,
return rc;
}
 
-   memcpy(hdr->raw_pkcs7, buf + buf_len, sig_len);
-   hdr->raw_pkcs7_len = sig_len;
-
/* We don't know the hash algorithm yet. */
hdr->hash_algo = HASH_ALGO__LAST;
 
-- 
2.32.0




[PATCH 24/32] IB/hfi1: Use mem_to_flex_dup() for struct tid_rb_node

2022-05-03 Thread Kees Cook
As part of the work to perform bounds checking on all memcpy() uses,
replace the open-coded deserialization of bytes out of memory into a
trailing flexible array by using a flex_array.h helper to perform the
allocation, bounds checking, and copying.

Cc: Dennis Dalessandro 
Cc: Jason Gunthorpe 
Cc: Leon Romanovsky 
Cc: linux-r...@vger.kernel.org
Signed-off-by: Kees Cook 
---
 drivers/infiniband/hw/hfi1/user_exp_rcv.c | 7 ++-
 drivers/infiniband/hw/hfi1/user_exp_rcv.h | 4 ++--
 2 files changed, 4 insertions(+), 7 deletions(-)

diff --git a/drivers/infiniband/hw/hfi1/user_exp_rcv.c b/drivers/infiniband/hw/hfi1/user_exp_rcv.c
index 186d30291260..f14846662ac9 100644
--- a/drivers/infiniband/hw/hfi1/user_exp_rcv.c
+++ b/drivers/infiniband/hw/hfi1/user_exp_rcv.c
@@ -683,7 +683,7 @@ static int set_rcvarray_entry(struct hfi1_filedata *fd,
 {
int ret;
struct hfi1_ctxtdata *uctxt = fd->uctxt;
-   struct tid_rb_node *node;
+   struct tid_rb_node *node = NULL;
struct hfi1_devdata *dd = uctxt->dd;
dma_addr_t phys;
struct page **pages = tbuf->pages + pageidx;
@@ -692,8 +692,7 @@ static int set_rcvarray_entry(struct hfi1_filedata *fd,
 * Allocate the node first so we can handle a potential
 * failure before we've programmed anything.
 */
-   node = kzalloc(struct_size(node, pages, npages), GFP_KERNEL);
-   if (!node)
+   if (mem_to_flex_dup(&node, pages, npages, GFP_KERNEL))
return -ENOMEM;
 
phys = dma_map_single(>pcidev->dev, __va(page_to_phys(pages[0])),
@@ -707,12 +706,10 @@ static int set_rcvarray_entry(struct hfi1_filedata *fd,
 
node->fdata = fd;
node->phys = page_to_phys(pages[0]);
-   node->npages = npages;
node->rcventry = rcventry;
node->dma_addr = phys;
node->grp = grp;
node->freed = false;
-   memcpy(node->pages, pages, flex_array_size(node, pages, npages));
 
if (fd->use_mn) {
ret = mmu_interval_notifier_insert(
diff --git a/drivers/infiniband/hw/hfi1/user_exp_rcv.h b/drivers/infiniband/hw/hfi1/user_exp_rcv.h
index 8c53e416bf84..4be3446c4d25 100644
--- a/drivers/infiniband/hw/hfi1/user_exp_rcv.h
+++ b/drivers/infiniband/hw/hfi1/user_exp_rcv.h
@@ -32,8 +32,8 @@ struct tid_rb_node {
u32 rcventry;
dma_addr_t dma_addr;
bool freed;
-   unsigned int npages;
-   struct page *pages[];
+   DECLARE_FLEX_ARRAY_ELEMENTS_COUNT(unsigned int, npages);
+   DECLARE_FLEX_ARRAY_ELEMENTS(struct page *, pages);
 };
 
 static inline int num_user_pages(unsigned long addr,
-- 
2.32.0




[PATCH 13/32] mac80211: Use mem_to_flex_dup() with several structs

2022-05-03 Thread Kees Cook
As part of the work to perform bounds checking on all memcpy() uses,
replace the open-coded deserialization of bytes out of memory into a
trailing flexible array by using a flex_array.h helper to perform the
allocation, bounds checking, and copying:

struct probe_resp
struct fils_discovery_data
struct unsol_bcast_probe_resp_data

Cc: Johannes Berg 
Cc: "David S. Miller" 
Cc: Eric Dumazet 
Cc: Jakub Kicinski 
Cc: Paolo Abeni 
Cc: linux-wirel...@vger.kernel.org
Cc: net...@vger.kernel.org
Signed-off-by: Kees Cook 
---
 net/mac80211/cfg.c | 22 ++
 net/mac80211/ieee80211_i.h | 12 ++--
 2 files changed, 12 insertions(+), 22 deletions(-)

diff --git a/net/mac80211/cfg.c b/net/mac80211/cfg.c
index f1d211e61e49..355edbf41707 100644
--- a/net/mac80211/cfg.c
+++ b/net/mac80211/cfg.c
@@ -867,20 +867,16 @@ ieee80211_set_probe_resp(struct ieee80211_sub_if_data *sdata,
 const struct ieee80211_csa_settings *csa,
 const struct ieee80211_color_change_settings *cca)
 {
-   struct probe_resp *new, *old;
+   struct probe_resp *new = NULL, *old;
 
if (!resp || !resp_len)
return 1;
 
old = sdata_dereference(sdata->u.ap.probe_resp, sdata);
 
-   new = kzalloc(sizeof(struct probe_resp) + resp_len, GFP_KERNEL);
-   if (!new)
+   if (mem_to_flex_dup(&new, resp, resp_len, GFP_KERNEL))
return -ENOMEM;
 
-   new->len = resp_len;
-   memcpy(new->data, resp, resp_len);
-
if (csa)
memcpy(new->cntdwn_counter_offsets, csa->counter_offsets_presp,
   csa->n_counter_offsets_presp *
@@ -898,7 +894,7 @@ ieee80211_set_probe_resp(struct ieee80211_sub_if_data *sdata,
 static int ieee80211_set_fils_discovery(struct ieee80211_sub_if_data *sdata,
struct cfg80211_fils_discovery *params)
 {
-   struct fils_discovery_data *new, *old = NULL;
+   struct fils_discovery_data *new = NULL, *old = NULL;
struct ieee80211_fils_discovery *fd;
 
if (!params->tmpl || !params->tmpl_len)
@@ -909,11 +905,8 @@ static int ieee80211_set_fils_discovery(struct ieee80211_sub_if_data *sdata,
fd->max_interval = params->max_interval;
 
old = sdata_dereference(sdata->u.ap.fils_discovery, sdata);
-   new = kzalloc(sizeof(*new) + params->tmpl_len, GFP_KERNEL);
-   if (!new)
+   if (mem_to_flex_dup(&new, params->tmpl, params->tmpl_len, GFP_KERNEL))
return -ENOMEM;
-   new->len = params->tmpl_len;
-   memcpy(new->data, params->tmpl, params->tmpl_len);
rcu_assign_pointer(sdata->u.ap.fils_discovery, new);
 
if (old)
@@ -926,17 +919,14 @@ static int
 ieee80211_set_unsol_bcast_probe_resp(struct ieee80211_sub_if_data *sdata,
 struct cfg80211_unsol_bcast_probe_resp *params)
 {
-   struct unsol_bcast_probe_resp_data *new, *old = NULL;
+   struct unsol_bcast_probe_resp_data *new = NULL, *old = NULL;
 
if (!params->tmpl || !params->tmpl_len)
return -EINVAL;
 
old = sdata_dereference(sdata->u.ap.unsol_bcast_probe_resp, sdata);
-   new = kzalloc(sizeof(*new) + params->tmpl_len, GFP_KERNEL);
-   if (!new)
+   if (mem_to_flex_dup(&new, params->tmpl, params->tmpl_len, GFP_KERNEL))
return -ENOMEM;
-   new->len = params->tmpl_len;
-   memcpy(new->data, params->tmpl, params->tmpl_len);
rcu_assign_pointer(sdata->u.ap.unsol_bcast_probe_resp, new);
 
if (old)
diff --git a/net/mac80211/ieee80211_i.h b/net/mac80211/ieee80211_i.h
index d4a7ba4a8202..2e9bbfb12c0d 100644
--- a/net/mac80211/ieee80211_i.h
+++ b/net/mac80211/ieee80211_i.h
@@ -263,21 +263,21 @@ struct beacon_data {
 
 struct probe_resp {
struct rcu_head rcu_head;
-   int len;
+   DECLARE_FLEX_ARRAY_ELEMENTS_COUNT(int, len);
u16 cntdwn_counter_offsets[IEEE80211_MAX_CNTDWN_COUNTERS_NUM];
-   u8 data[];
+   DECLARE_FLEX_ARRAY_ELEMENTS(u8, data);
 };
 
 struct fils_discovery_data {
struct rcu_head rcu_head;
-   int len;
-   u8 data[];
+   DECLARE_FLEX_ARRAY_ELEMENTS_COUNT(int, len);
+   DECLARE_FLEX_ARRAY_ELEMENTS(u8, data);
 };
 
 struct unsol_bcast_probe_resp_data {
struct rcu_head rcu_head;
-   int len;
-   u8 data[];
+   DECLARE_FLEX_ARRAY_ELEMENTS_COUNT(int, len);
+   DECLARE_FLEX_ARRAY_ELEMENTS(u8, data);
 };
 
 struct ps_data {
-- 
2.32.0




Re: [PATCH 03/32] flex_array: Add Kunit tests

2022-05-03 Thread David Gow
On Wed, May 4, 2022 at 9:47 AM Kees Cook  wrote:
>
> Add tests for the new flexible array structure helpers. These can be run
> with:
>
>   make ARCH=um mrproper
>   ./tools/testing/kunit/kunit.py config

Nit: it shouldn't be necessary to run kunit.py config separately:
kunit.py run will configure the kernel if necessary.

>   ./tools/testing/kunit/kunit.py run flex_array
>
> Cc: David Gow 
> Cc: kunit-...@googlegroups.com
> Signed-off-by: Kees Cook 
> ---

This looks pretty good to me: it certainly worked on the different
setups I tried (um, x86_64, x86_64+KASAN).

A few minor nitpicks inline, mostly around minor config-y things, or
things which weren't totally clear on my first read-through.

Hopefully one day, with the various stubbing features or something
similar, we'll be able to check against allocation failures in
flex_dup(), too, but otherwise nothing seems too obviously missing.

Reviewed-by: David Gow 

-- David

>  lib/Kconfig.debug  |  12 +-
>  lib/Makefile   |   1 +
>  lib/flex_array_kunit.c | 523 +
>  3 files changed, 531 insertions(+), 5 deletions(-)
>  create mode 100644 lib/flex_array_kunit.c
>
> diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
> index 9077bb38bc93..8bae6b169c50 100644
> --- a/lib/Kconfig.debug
> +++ b/lib/Kconfig.debug
> @@ -2551,11 +2551,6 @@ config OVERFLOW_KUNIT_TEST
>   Builds unit tests for the check_*_overflow(), size_*(), allocation, and
>   related functions.
>
> - For more information on KUnit and unit tests in general please refer
> - to the KUnit documentation in Documentation/dev-tools/kunit/.
> -
> - If unsure, say N.
> -

Nit: while I'm not against removing some of this boilerplate, is it
better suited for a separate commit?

>  config STACKINIT_KUNIT_TEST
> tristate "Test level of stack variable initialization" if !KUNIT_ALL_TESTS
> depends on KUNIT
> @@ -2567,6 +2562,13 @@ config STACKINIT_KUNIT_TEST
>   CONFIG_GCC_PLUGIN_STRUCTLEAK, CONFIG_GCC_PLUGIN_STRUCTLEAK_BYREF,
>   or CONFIG_GCC_PLUGIN_STRUCTLEAK_BYREF_ALL.
>
> +config FLEX_ARRAY_KUNIT_TEST
> +   tristate "Test flex_*() family of helper functions at runtime" if !KUNIT_ALL_TESTS
> +   depends on KUNIT
> +   default KUNIT_ALL_TESTS
> +   help
> + Builds unit tests for flexible array copy helper functions.
> +

Nit: checkpatch warns that the description here may be insufficient:
WARNING: please write a help paragraph that fully describes the config symbol

>  config TEST_UDELAY
> tristate "udelay test driver"
> help
> diff --git a/lib/Makefile b/lib/Makefile
> index 6b9ffc1bd1ee..9884318db330 100644
> --- a/lib/Makefile
> +++ b/lib/Makefile
> @@ -366,6 +366,7 @@ obj-$(CONFIG_MEMCPY_KUNIT_TEST) += memcpy_kunit.o
>  obj-$(CONFIG_OVERFLOW_KUNIT_TEST) += overflow_kunit.o
>  CFLAGS_stackinit_kunit.o += $(call cc-disable-warning, switch-unreachable)
>  obj-$(CONFIG_STACKINIT_KUNIT_TEST) += stackinit_kunit.o
> +obj-$(CONFIG_FLEX_ARRAY_KUNIT_TEST) += flex_array_kunit.o
>
>  obj-$(CONFIG_GENERIC_LIB_DEVMEM_IS_ALLOWED) += devmem_is_allowed.o
>
> diff --git a/lib/flex_array_kunit.c b/lib/flex_array_kunit.c
> new file mode 100644
> index ..48bee88945b4
> --- /dev/null
> +++ b/lib/flex_array_kunit.c
> @@ -0,0 +1,523 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Test cases for flex_*() array manipulation helpers.
> + */
> +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#define COMPARE_STRUCTS(STRUCT_A, STRUCT_B)do {\
> +   STRUCT_A *ptr_A;\
> +   STRUCT_B *ptr_B;\
> +   int rc; \
> +   size_t size_A, size_B;  \
> +   \
> +   /* matching types for flex array elements and count */  \
> +   KUNIT_EXPECT_EQ(test, sizeof(*ptr_A), sizeof(*ptr_B));  \
> +   KUNIT_EXPECT_TRUE(test, __same_type(*ptr_A->data,   \
> +   *ptr_B->__flex_array_elements));\
> +   KUNIT_EXPECT_TRUE(test, __same_type(ptr_A->datalen, \
> +   ptr_B->__flex_array_elements_count));   \
> +   KUNIT_EXPECT_EQ(test, sizeof(*ptr_A->data), \
> + sizeof(*ptr_B->__flex_array_elements));   \
> +   KUNIT_EXPECT_EQ(test, offsetof(typeof(*ptr_A), data),   \
> + offsetof(typeof(*ptr_B),  \
> +  __flex_array_elements)); \
> +   KUNIT_EXPECT_EQ(test, offsetof(typeof(*ptr_A), datalen),\
> + 

[PATCH 17/32] net/flow_offload: Use mem_to_flex_dup() with struct flow_action_cookie

2022-05-03 Thread Kees Cook
As part of the work to perform bounds checking on all memcpy() uses,
replace the open-coded deserialization of bytes out of memory into a
trailing flexible array by using a flex_array.h helper to perform the
allocation, bounds checking, and copying.

Cc: "David S. Miller" 
Cc: Eric Dumazet 
Cc: Jakub Kicinski 
Cc: Paolo Abeni 
Cc: Baowen Zheng 
Cc: Eli Cohen 
Cc: Louis Peens 
Cc: Simon Horman 
Cc: net...@vger.kernel.org
Signed-off-by: Kees Cook 
---
 include/net/flow_offload.h | 4 ++--
 net/core/flow_offload.c| 7 ++-
 2 files changed, 4 insertions(+), 7 deletions(-)

diff --git a/include/net/flow_offload.h b/include/net/flow_offload.h
index 021778a7e1af..ca5db457a0bc 100644
--- a/include/net/flow_offload.h
+++ b/include/net/flow_offload.h
@@ -190,8 +190,8 @@ enum flow_action_hw_stats {
 typedef void (*action_destr)(void *priv);
 
 struct flow_action_cookie {
-   u32 cookie_len;
-   u8 cookie[];
+   DECLARE_FLEX_ARRAY_ELEMENTS_COUNT(u32, cookie_len);
+   DECLARE_FLEX_ARRAY_ELEMENTS(u8, cookie);
 };
 
 struct flow_action_cookie *flow_action_cookie_create(void *data,
diff --git a/net/core/flow_offload.c b/net/core/flow_offload.c
index 73f68d4625f3..e23c8d05b828 100644
--- a/net/core/flow_offload.c
+++ b/net/core/flow_offload.c
@@ -199,13 +199,10 @@ struct flow_action_cookie *flow_action_cookie_create(void *data,
 unsigned int len,
 gfp_t gfp)
 {
-   struct flow_action_cookie *cookie;
+   struct flow_action_cookie *cookie = NULL;
 
-   cookie = kmalloc(sizeof(*cookie) + len, gfp);
-   if (!cookie)
+   if (mem_to_flex_dup(&cookie, data, len, gfp))
return NULL;
-   cookie->cookie_len = len;
-   memcpy(cookie->cookie, data, len);
return cookie;
 }
 EXPORT_SYMBOL(flow_action_cookie_create);
-- 
2.32.0




[PATCH 19/32] afs: Use mem_to_flex_dup() with struct afs_acl

2022-05-03 Thread Kees Cook
As part of the work to perform bounds checking on all memcpy() uses,
replace the open-coded deserialization of bytes out of memory into a
trailing flexible array by using a flex_array.h helper to perform the
allocation, bounds checking, and copying.

Cc: David Howells 
Cc: Marc Dionne 
Cc: linux-...@lists.infradead.org
Signed-off-by: Kees Cook 
---
 fs/afs/internal.h | 4 ++--
 fs/afs/xattr.c| 7 ++-
 2 files changed, 4 insertions(+), 7 deletions(-)

diff --git a/fs/afs/internal.h b/fs/afs/internal.h
index 7a72e9c60423..83014d20b6b3 100644
--- a/fs/afs/internal.h
+++ b/fs/afs/internal.h
@@ -1125,8 +1125,8 @@ extern bool afs_fs_get_capabilities(struct afs_net *, struct afs_server *,
 extern void afs_fs_inline_bulk_status(struct afs_operation *);
 
 struct afs_acl {
-   u32 size;
-   u8  data[];
+   DECLARE_FLEX_ARRAY_ELEMENTS_COUNT(u32, size);
+   DECLARE_FLEX_ARRAY_ELEMENTS(u8, data);
 };
 
 extern void afs_fs_fetch_acl(struct afs_operation *);
diff --git a/fs/afs/xattr.c b/fs/afs/xattr.c
index 7751b0b3f81d..77b3af283d49 100644
--- a/fs/afs/xattr.c
+++ b/fs/afs/xattr.c
@@ -73,16 +73,13 @@ static int afs_xattr_get_acl(const struct xattr_handler *handler,
 static bool afs_make_acl(struct afs_operation *op,
 const void *buffer, size_t size)
 {
-   struct afs_acl *acl;
+   struct afs_acl *acl = NULL;
 
-   acl = kmalloc(sizeof(*acl) + size, GFP_KERNEL);
-   if (!acl) {
+   if (mem_to_flex_dup(&acl, buffer, size, GFP_KERNEL)) {
afs_op_nomem(op);
return false;
}
 
-   acl->size = size;
-   memcpy(acl->data, buffer, size);
op->acl = acl;
return true;
 }
-- 
2.32.0




[PATCH 05/32] brcmfmac: Use mem_to_flex_dup() with struct brcmf_fweh_queue_item

2022-05-03 Thread Kees Cook
As part of the work to perform bounds checking on all memcpy() uses,
replace the open-coded deserialization of bytes out of memory into a
trailing flexible array by using a flex_array.h helper to perform the
allocation, bounds checking, and copying.

Cc: Arend van Spriel 
Cc: Franky Lin 
Cc: Hante Meuleman 
Cc: Kalle Valo 
Cc: "David S. Miller" 
Cc: Eric Dumazet 
Cc: Jakub Kicinski 
Cc: Paolo Abeni 
Cc: linux-wirel...@vger.kernel.org
Cc: brcm80211-dev-list@broadcom.com
Cc: sha-cyfmac-dev-l...@infineon.com
Cc: net...@vger.kernel.org
Signed-off-by: Kees Cook 
---
 .../net/wireless/broadcom/brcm80211/brcmfmac/fweh.c   | 11 ---
 1 file changed, 4 insertions(+), 7 deletions(-)

diff --git a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/fweh.c 
b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/fweh.c
index bc3f4e4edcdf..bea798ca6466 100644
--- a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/fweh.c
+++ b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/fweh.c
@@ -32,8 +32,8 @@ struct brcmf_fweh_queue_item {
u8 ifidx;
u8 ifaddr[ETH_ALEN];
struct brcmf_event_msg_be emsg;
-   u32 datalen;
-   u8 data[];
+   DECLARE_FLEX_ARRAY_ELEMENTS_COUNT(u32, datalen);
+   DECLARE_FLEX_ARRAY_ELEMENTS(u8, data);
 };
 
 /*
@@ -395,7 +395,7 @@ void brcmf_fweh_process_event(struct brcmf_pub *drvr,
 {
enum brcmf_fweh_event_code code;
struct brcmf_fweh_info *fweh = &drvr->fweh;
-   struct brcmf_fweh_queue_item *event;
+   struct brcmf_fweh_queue_item *event = NULL;
void *data;
u32 datalen;
 
@@ -414,8 +414,7 @@ void brcmf_fweh_process_event(struct brcmf_pub *drvr,
datalen + sizeof(*event_packet) > packet_len)
return;
 
-   event = kzalloc(sizeof(*event) + datalen, gfp);
-   if (!event)
+   if (mem_to_flex_dup(&event, data, datalen, gfp))
return;
 
event->code = code;
@@ -423,8 +422,6 @@ void brcmf_fweh_process_event(struct brcmf_pub *drvr,
 
/* use memcpy to get aligned event message */
memcpy(&event->emsg, &event_packet->msg, sizeof(event->emsg));
-   memcpy(event->data, data, datalen);
-   event->datalen = datalen;
memcpy(event->ifaddr, event_packet->eth.h_dest, ETH_ALEN);
 
brcmf_fweh_queue_event(fweh, event);
-- 
2.32.0




[PATCH 15/32] 802/garp: Use mem_to_flex_dup() with struct garp_attr

2022-05-03 Thread Kees Cook
As part of the work to perform bounds checking on all memcpy() uses,
replace the open-coded deserialization of bytes out of memory into a
trailing flexible array by using a flex_array.h helper to perform the
allocation, bounds checking, and copying.

Cc: "David S. Miller" 
Cc: Eric Dumazet 
Cc: Jakub Kicinski 
Cc: Paolo Abeni 
Cc: Hulk Robot 
Cc: Yang Yingliang 
Cc: net...@vger.kernel.org
Signed-off-by: Kees Cook 
---
 include/net/garp.h | 4 ++--
 net/802/garp.c | 9 +++--
 2 files changed, 5 insertions(+), 8 deletions(-)

diff --git a/include/net/garp.h b/include/net/garp.h
index 4d9a0c6a2e5f..ec087ae534e7 100644
--- a/include/net/garp.h
+++ b/include/net/garp.h
@@ -80,8 +80,8 @@ struct garp_attr {
struct rb_node  node;
enum garp_applicant_state   state;
u8  type;
-   u8  dlen;
-   unsigned char   data[];
+   DECLARE_FLEX_ARRAY_ELEMENTS_COUNT(u8, dlen);
+   DECLARE_FLEX_ARRAY_ELEMENTS(unsigned char, data);
 };
 
 enum garp_applications {
diff --git a/net/802/garp.c b/net/802/garp.c
index f6012f8e59f0..72743ed00a54 100644
--- a/net/802/garp.c
+++ b/net/802/garp.c
@@ -168,7 +168,7 @@ static struct garp_attr *garp_attr_create(struct 
garp_applicant *app,
  const void *data, u8 len, u8 type)
 {
struct rb_node *parent = NULL, **p = &app->gid.rb_node;
-   struct garp_attr *attr;
+   struct garp_attr *attr = NULL;
int d;
 
while (*p) {
@@ -184,13 +184,10 @@ static struct garp_attr *garp_attr_create(struct 
garp_applicant *app,
return attr;
}
}
-   attr = kmalloc(sizeof(*attr) + len, GFP_ATOMIC);
-   if (!attr)
-   return attr;
+   if (mem_to_flex_dup(&attr, data, len, GFP_ATOMIC))
+   return NULL;
attr->state = GARP_APPLICANT_VO;
attr->type  = type;
-   attr->dlen  = len;
-   memcpy(attr->data, data, len);
 
rb_link_node(&attr->node, parent, p);
rb_insert_color(&attr->node, &app->gid);
-- 
2.32.0




[PATCH 08/32] iwlwifi: mvm: Use mem_to_flex_dup() with struct ieee80211_key_conf

2022-05-03 Thread Kees Cook
As part of the work to perform bounds checking on all memcpy() uses,
replace the open-coded deserialization of bytes out of memory into a
trailing flexible array by using a flex_array.h helper to perform the
allocation, bounds checking, and copying.

Cc: Luca Coelho 
Cc: Kalle Valo 
Cc: "David S. Miller" 
Cc: Jakub Kicinski 
Cc: Paolo Abeni 
Cc: Johannes Berg 
Cc: Gregory Greenman 
Cc: Eric Dumazet 
Cc: linux-wirel...@vger.kernel.org
Cc: net...@vger.kernel.org
Signed-off-by: Kees Cook 
---
 drivers/net/wireless/intel/iwlwifi/mvm/sta.c | 8 ++--
 include/net/mac80211.h   | 4 ++--
 2 files changed, 4 insertions(+), 8 deletions(-)

diff --git a/drivers/net/wireless/intel/iwlwifi/mvm/sta.c 
b/drivers/net/wireless/intel/iwlwifi/mvm/sta.c
index 406f0a50a5bf..23cade528dcf 100644
--- a/drivers/net/wireless/intel/iwlwifi/mvm/sta.c
+++ b/drivers/net/wireless/intel/iwlwifi/mvm/sta.c
@@ -4108,7 +4108,7 @@ int iwl_mvm_add_pasn_sta(struct iwl_mvm *mvm, struct 
ieee80211_vif *vif,
int ret;
u16 queue;
struct iwl_mvm_vif *mvmvif = iwl_mvm_vif_from_mac80211(vif);
-   struct ieee80211_key_conf *keyconf;
+   struct ieee80211_key_conf *keyconf = NULL;
 
ret = iwl_mvm_allocate_int_sta(mvm, sta, 0,
   NL80211_IFTYPE_UNSPECIFIED,
@@ -4122,15 +4122,11 @@ int iwl_mvm_add_pasn_sta(struct iwl_mvm *mvm, struct 
ieee80211_vif *vif,
if (ret)
goto out;
 
-   keyconf = kzalloc(sizeof(*keyconf) + key_len, GFP_KERNEL);
-   if (!keyconf) {
+   if (mem_to_flex_dup(&keyconf, key, key_len, GFP_KERNEL)) {
ret = -ENOBUFS;
goto out;
}
-
keyconf->cipher = cipher;
-   memcpy(keyconf->key, key, key_len);
-   keyconf->keylen = key_len;
 
ret = iwl_mvm_send_sta_key(mvm, sta->sta_id, keyconf, false,
   0, NULL, 0, 0, true);
diff --git a/include/net/mac80211.h b/include/net/mac80211.h
index 75880fc70700..4abe52963a96 100644
--- a/include/net/mac80211.h
+++ b/include/net/mac80211.h
@@ -1890,8 +1890,8 @@ struct ieee80211_key_conf {
u8 hw_key_idx;
s8 keyidx;
u16 flags;
-   u8 keylen;
-   u8 key[];
+   DECLARE_FLEX_ARRAY_ELEMENTS_COUNT(u8, keylen);
+   DECLARE_FLEX_ARRAY_ELEMENTS(u8, key);
 };
 
 #define IEEE80211_MAX_PN_LEN   16
-- 
2.32.0




Re: [PATCH 01/32] netlink: Avoid memcpy() across flexible array boundary

2022-05-03 Thread Kees Cook
On Tue, May 03, 2022 at 10:31:05PM -0500, Gustavo A. R. Silva wrote:
> On Tue, May 03, 2022 at 06:44:10PM -0700, Kees Cook wrote:
> [...]
> > diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
> > index 1b5a9c2e1c29..09346aee1022 100644
> > --- a/net/netlink/af_netlink.c
> > +++ b/net/netlink/af_netlink.c
> > @@ -2445,7 +2445,10 @@ void netlink_ack(struct sk_buff *in_skb, struct 
> > nlmsghdr *nlh, int err,
> >   NLMSG_ERROR, payload, flags);
> > errmsg = nlmsg_data(rep);
> > errmsg->error = err;
> > -   memcpy(&errmsg->msg, nlh, payload > sizeof(*errmsg) ? nlh->nlmsg_len : 
> > sizeof(*nlh));
> > +   errmsg->msg = *nlh;
> > +   if (payload > sizeof(*errmsg))
> > +   memcpy(errmsg->msg.nlmsg_payload, nlh->nlmsg_payload,
> > +  nlh->nlmsg_len - sizeof(*nlh));
> 
> They have nlmsg_len()[1] for the length of the payload without the header:
> 
> /**
>  * nlmsg_len - length of message payload
>  * @nlh: netlink message header
>  */
> static inline int nlmsg_len(const struct nlmsghdr *nlh)
> {
>   return nlh->nlmsg_len - NLMSG_HDRLEN;
> }

Oh, hm, yeah, that would be much cleaner. The relationship between
"payload" and nlmsg_len is confusing in here. :)

So, this should be simpler:

-   memcpy(&errmsg->msg, nlh, payload > sizeof(*errmsg) ? nlh->nlmsg_len : 
sizeof(*nlh));
+   errmsg->msg = *nlh;
+   memcpy(errmsg->msg.nlmsg_payload, nlh->nlmsg_payload, nlmsg_len(nlh));

It's actually this case that triggered my investigation in __bos(1)'s
misbehavior around sub-structs, since this case wasn't getting silenced:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101832

It still feels like it should be possible to get this right without
splitting the memcpy, though. Hmpf.
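
To make the split concrete outside the netlink code, here is a minimal
userspace sketch of the two-step copy: a struct assignment for the fixed-size
header, then a memcpy() aimed at the flexible array member itself so the
destination of each copy has a well-defined size. The type msg_sketch and both
helpers are hypothetical stand-ins, not netlink API.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a message header followed by its payload. */
struct msg_sketch {
	uint32_t len;            /* total length: header plus payload */
	uint16_t type;
	uint16_t flags;
	unsigned char payload[]; /* trailing flexible array */
};

/* Allocate a zeroed message able to hold payload_len payload bytes. */
static struct msg_sketch *msg_sketch_alloc(size_t payload_len)
{
	struct msg_sketch *m = calloc(1, sizeof(*m) + payload_len);

	if (m)
		m->len = (uint32_t)(sizeof(*m) + payload_len);
	return m;
}

/*
 * Copy src into dst in two steps. The caller guarantees that dst has room
 * for src's payload and that src->len is at least sizeof(*src).
 */
static void msg_sketch_copy(struct msg_sketch *dst, const struct msg_sketch *src)
{
	*dst = *src;                        /* fixed-size header only */
	memcpy(dst->payload, src->payload,  /* payload, sized separately */
	       src->len - sizeof(*src));
}
```

The struct assignment never touches the flexible array, so neither copy
crosses the boundary between the header and the payload.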

> (would that function use some sanitization, though? what if nlmsg_len is
> somehow manipulated to be less than NLMSG_HDRLEN?...)

Maybe something like:

static inline int nlmsg_len(const struct nlmsghdr *nlh)
{
if (WARN_ON(nlh->nlmsg_len < NLMSG_HDRLEN))
return 0;
return nlh->nlmsg_len - NLMSG_HDRLEN;
}
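
The same guard can be tried in userspace. The sketch below mirrors the idea
with stand-in names (nlh_sketch, HDRLEN_SKETCH, payload_len_sketch()) and
leaves out WARN_ON(), which has no userspace equivalent here. The 16-byte
header size is an assumption chosen to match the five fields of the stand-in
struct.

```c
#include <stdint.h>

#define HDRLEN_SKETCH 16u /* assumed header size for this stand-in */

/* Hypothetical stand-in for a netlink-style message header. */
struct nlh_sketch {
	uint32_t nlmsg_len;   /* total length, header included */
	uint16_t nlmsg_type;
	uint16_t nlmsg_flags;
	uint32_t nlmsg_seq;
	uint32_t nlmsg_pid;
};

/*
 * Payload length with an underflow guard: a total length shorter than the
 * header yields 0 instead of wrapping around in unsigned arithmetic.
 */
static int payload_len_sketch(const struct nlh_sketch *nlh)
{
	if (nlh->nlmsg_len < HDRLEN_SKETCH)
		return 0;
	return (int)(nlh->nlmsg_len - HDRLEN_SKETCH);
}
```

Without the guard, the unsigned subtraction would wrap for a short header and
the result could then be handed to memcpy() as a length.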

> Also, it seems there is at least one more instance of this same issue:
> 
> diff --git a/net/netfilter/ipset/ip_set_core.c 
> b/net/netfilter/ipset/ip_set_core.c
> index 16ae92054baa..d06184b94af5 100644
> --- a/net/netfilter/ipset/ip_set_core.c
> +++ b/net/netfilter/ipset/ip_set_core.c
> @@ -1723,7 +1723,8 @@ call_ad(struct net *net, struct sock *ctnl, struct 
> sk_buff *skb,
>   nlh->nlmsg_seq, NLMSG_ERROR, payload, 0);
> errmsg = nlmsg_data(rep);
> errmsg->error = ret;
> -   memcpy(&errmsg->msg, nlh, nlh->nlmsg_len);
> +   errmsg->msg = *nlh;
> +   memcpy(errmsg->msg.nlmsg_payload, nlh->nlmsg_payload, 
> nlmsg_len(nlh));

Ah, yes, nice catch!

> cmdattr = (void *)&errmsg->msg + min_len;
> 
> ret = nla_parse(cda, IPSET_ATTR_CMD_MAX, cmdattr,
> 
> --
> Gustavo
> 
> [1] 
> https://elixir.bootlin.com/linux/v5.18-rc5/source/include/net/netlink.h#L577

-- 
Kees Cook



[PATCH 29/32] xtensa: Use mem_to_flex_dup() with struct property

2022-05-03 Thread Kees Cook
As part of the work to perform bounds checking on all memcpy() uses,
replace the open-coded deserialization of bytes out of memory into a
trailing flexible array by using a flex_array.h helper to perform the
allocation, bounds checking, and copying.

Cc: Chris Zankel 
Cc: Max Filippov 
Cc: Rob Herring 
Cc: Frank Rowand 
Cc: Guenter Roeck 
Cc: linux-xte...@linux-xtensa.org
Cc: devicet...@vger.kernel.org
Signed-off-by: Kees Cook 
---
 arch/xtensa/platforms/xtfpga/setup.c | 9 +++--
 include/linux/of.h   | 3 ++-
 2 files changed, 5 insertions(+), 7 deletions(-)

diff --git a/arch/xtensa/platforms/xtfpga/setup.c 
b/arch/xtensa/platforms/xtfpga/setup.c
index 538e6748e85a..31c1fa4ba4ec 100644
--- a/arch/xtensa/platforms/xtfpga/setup.c
+++ b/arch/xtensa/platforms/xtfpga/setup.c
@@ -102,7 +102,7 @@ CLK_OF_DECLARE(xtfpga_clk, "cdns,xtfpga-clock", 
xtfpga_clk_setup);
 #define MAC_LEN 6
 static void __init update_local_mac(struct device_node *node)
 {
-   struct property *newmac;
+   struct property *newmac = NULL;
const u8* macaddr;
int prop_len;
 
@@ -110,19 +110,16 @@ static void __init update_local_mac(struct device_node 
*node)
if (macaddr == NULL || prop_len != MAC_LEN)
return;
 
-   newmac = kzalloc(sizeof(*newmac) + MAC_LEN, GFP_KERNEL);
-   if (newmac == NULL)
+   if (mem_to_flex_dup(&newmac, macaddr, MAC_LEN, GFP_KERNEL))
return;
 
-   newmac->value = newmac + 1;
-   newmac->length = MAC_LEN;
+   newmac->value = newmac->contents;
newmac->name = kstrdup("local-mac-address", GFP_KERNEL);
if (newmac->name == NULL) {
kfree(newmac);
return;
}
 
-   memcpy(newmac->value, macaddr, MAC_LEN);
((u8*)newmac->value)[5] = (*(u32*)DIP_SWITCHES_VADDR) & 0x3f;
of_update_property(node, newmac);
 }
diff --git a/include/linux/of.h b/include/linux/of.h
index 17741eee0ca4..efb0f419fd1f 100644
--- a/include/linux/of.h
+++ b/include/linux/of.h
@@ -30,7 +30,7 @@ typedef u32 ihandle;
 
 struct property {
char*name;
-   int length;
+   DECLARE_FLEX_ARRAY_ELEMENTS_COUNT(int, length);
void*value;
struct property *next;
 #if defined(CONFIG_OF_DYNAMIC) || defined(CONFIG_SPARC)
@@ -42,6 +42,7 @@ struct property {
 #if defined(CONFIG_OF_KOBJ)
struct bin_attribute attr;
 #endif
+   DECLARE_FLEX_ARRAY_ELEMENTS(u8, contents);
 };
 
 #if defined(CONFIG_SPARC)
-- 
2.32.0




[PATCH 21/32] soc: qcom: apr: Use mem_to_flex_dup() with struct apr_rx_buf

2022-05-03 Thread Kees Cook
As part of the work to perform bounds checking on all memcpy() uses,
replace the open-coded deserialization of bytes out of memory into a
trailing flexible array by using a flex_array.h helper to perform the
allocation, bounds checking, and copying.

Cc: Andy Gross 
Cc: Bjorn Andersson 
Cc: linux-arm-...@vger.kernel.org
Signed-off-by: Kees Cook 
---
 drivers/soc/qcom/apr.c | 12 
 1 file changed, 4 insertions(+), 8 deletions(-)

diff --git a/drivers/soc/qcom/apr.c b/drivers/soc/qcom/apr.c
index 3caabd873322..6cf6f6df276e 100644
--- a/drivers/soc/qcom/apr.c
+++ b/drivers/soc/qcom/apr.c
@@ -40,8 +40,8 @@ struct packet_router {
 
 struct apr_rx_buf {
struct list_head node;
-   int len;
-   uint8_t buf[];
+   DECLARE_FLEX_ARRAY_ELEMENTS_COUNT(int, len);
+   DECLARE_FLEX_ARRAY_ELEMENTS(uint8_t, buf);
 };
 
 /**
@@ -162,7 +162,7 @@ static int apr_callback(struct rpmsg_device *rpdev, void 
*buf,
  int len, void *priv, u32 addr)
 {
struct packet_router *apr = dev_get_drvdata(&rpdev->dev);
-   struct apr_rx_buf *abuf;
+   struct apr_rx_buf *abuf = NULL;
unsigned long flags;
 
if (len <= APR_HDR_SIZE) {
@@ -171,13 +171,9 @@ static int apr_callback(struct rpmsg_device *rpdev, void 
*buf,
return -EINVAL;
}
 
-   abuf = kzalloc(sizeof(*abuf) + len, GFP_ATOMIC);
-   if (!abuf)
+   if (mem_to_flex_dup(&abuf, buf, len, GFP_ATOMIC))
return -ENOMEM;
 
-   abuf->len = len;
-   memcpy(abuf->buf, buf, len);
-
spin_lock_irqsave(&apr->rx_lock, flags);
list_add_tail(&abuf->node, &apr->rx_list);
spin_unlock_irqrestore(&apr->rx_lock, flags);
-- 
2.32.0




[PATCH 22/32] atags_proc: Use mem_to_flex_dup() with struct buffer

2022-05-03 Thread Kees Cook
As part of the work to perform bounds checking on all memcpy() uses,
replace the open-coded deserialization of bytes out of memory into a
trailing flexible array by using a flex_array.h helper to perform the
allocation, bounds checking, and copying.

Cc: Russell King 
Cc: Christian Brauner 
Cc: Andrew Morton 
Cc: Muchun Song 
Cc: linux-arm-ker...@lists.infradead.org
Signed-off-by: Kees Cook 
---
 arch/arm/kernel/atags_proc.c | 12 
 1 file changed, 4 insertions(+), 8 deletions(-)

diff --git a/arch/arm/kernel/atags_proc.c b/arch/arm/kernel/atags_proc.c
index 3ec2afe78423..638bbb616daa 100644
--- a/arch/arm/kernel/atags_proc.c
+++ b/arch/arm/kernel/atags_proc.c
@@ -6,8 +6,8 @@
 #include 
 
 struct buffer {
-   size_t size;
-   char data[];
+   DECLARE_FLEX_ARRAY_ELEMENTS_COUNT(size_t, size);
+   DECLARE_FLEX_ARRAY_ELEMENTS(char, data);
 };
 
 static ssize_t atags_read(struct file *file, char __user *buf,
@@ -38,7 +38,7 @@ static int __init init_atags_procfs(void)
 */
struct proc_dir_entry *tags_entry;
struct tag *tag = (struct tag *)atags_copy;
-   struct buffer *b;
+   struct buffer *b = NULL;
size_t size;
 
if (tag->hdr.tag != ATAG_CORE) {
@@ -54,13 +54,9 @@ static int __init init_atags_procfs(void)
 
WARN_ON(tag->hdr.tag != ATAG_NONE);
 
-   b = kmalloc(sizeof(*b) + size, GFP_KERNEL);
-   if (!b)
+   if (mem_to_flex_dup(&b, atags_copy, size, GFP_KERNEL))
goto nomem;
 
-   b->size = size;
-   memcpy(b->data, atags_copy, size);
-
tags_entry = proc_create_data("atags", 0400, NULL, &atags_proc_ops, b);
if (!tags_entry)
goto nomem;
-- 
2.32.0




[PATCH 16/32] 802/mrp: Use mem_to_flex_dup() with struct mrp_attr

2022-05-03 Thread Kees Cook
As part of the work to perform bounds checking on all memcpy() uses,
replace the open-coded deserialization of bytes out of memory into a
trailing flexible array by using a flex_array.h helper to perform the
allocation, bounds checking, and copying.

Cc: "David S. Miller" 
Cc: Eric Dumazet 
Cc: Jakub Kicinski 
Cc: Paolo Abeni 
Cc: Yang Yingliang 
Cc: net...@vger.kernel.org
Signed-off-by: Kees Cook 
---
 include/net/mrp.h | 4 ++--
 net/802/mrp.c | 9 +++--
 2 files changed, 5 insertions(+), 8 deletions(-)

diff --git a/include/net/mrp.h b/include/net/mrp.h
index 1c308c034e1a..211670bb46f2 100644
--- a/include/net/mrp.h
+++ b/include/net/mrp.h
@@ -91,8 +91,8 @@ struct mrp_attr {
struct rb_node  node;
enum mrp_applicant_statestate;
u8  type;
-   u8  len;
-   unsigned char   value[];
+   DECLARE_FLEX_ARRAY_ELEMENTS_COUNT(u8, len);
+   DECLARE_FLEX_ARRAY_ELEMENTS(unsigned char, value);
 };
 
 enum mrp_applications {
diff --git a/net/802/mrp.c b/net/802/mrp.c
index 35e04cc5390c..8b9b2e685a42 100644
--- a/net/802/mrp.c
+++ b/net/802/mrp.c
@@ -257,7 +257,7 @@ static struct mrp_attr *mrp_attr_create(struct 
mrp_applicant *app,
const void *value, u8 len, u8 type)
 {
struct rb_node *parent = NULL, **p = &app->mad.rb_node;
-   struct mrp_attr *attr;
+   struct mrp_attr *attr = NULL;
int d;
 
while (*p) {
@@ -273,13 +273,10 @@ static struct mrp_attr *mrp_attr_create(struct 
mrp_applicant *app,
return attr;
}
}
-   attr = kmalloc(sizeof(*attr) + len, GFP_ATOMIC);
-   if (!attr)
-   return attr;
+   if (mem_to_flex_dup(&attr, value, len, GFP_ATOMIC))
+   return NULL;
attr->state = MRP_APPLICANT_VO;
attr->type  = type;
-   attr->len   = len;
-   memcpy(attr->value, value, len);
 
rb_link_node(&attr->node, parent, p);
rb_insert_color(&attr->node, &app->mad);
-- 
2.32.0




[PATCH 18/32] firewire: Use __mem_to_flex_dup() with struct iso_interrupt_event

2022-05-03 Thread Kees Cook
As part of the work to perform bounds checking on all memcpy() uses,
replace the open-coded deserialization of bytes out of memory into a
trailing flexible array by using a flex_array.h helper to perform the
allocation, bounds checking, and copying.

Cc: Stefan Richter 
Cc: linux1394-de...@lists.sourceforge.net
Signed-off-by: Kees Cook 
---
 drivers/firewire/core-cdev.c   | 7 ++-
 include/uapi/linux/firewire-cdev.h | 4 ++--
 2 files changed, 4 insertions(+), 7 deletions(-)

diff --git a/drivers/firewire/core-cdev.c b/drivers/firewire/core-cdev.c
index c9fe5903725a..7e884c61e12e 100644
--- a/drivers/firewire/core-cdev.c
+++ b/drivers/firewire/core-cdev.c
@@ -913,17 +913,14 @@ static void iso_callback(struct fw_iso_context *context, 
u32 cycle,
 size_t header_length, void *header, void *data)
 {
struct client *client = data;
-   struct iso_interrupt_event *e;
+   struct iso_interrupt_event *e = NULL;
 
-   e = kmalloc(sizeof(*e) + header_length, GFP_ATOMIC);
-   if (e == NULL)
+   if (__mem_to_flex_dup(&e, .interrupt, header, header_length, GFP_ATOMIC))
return;
 
e->interrupt.type  = FW_CDEV_EVENT_ISO_INTERRUPT;
e->interrupt.closure   = client->iso_closure;
e->interrupt.cycle = cycle;
-   e->interrupt.header_length = header_length;
-   memcpy(e->interrupt.header, header, header_length);
queue_event(client, &e->event, &e->interrupt,
sizeof(e->interrupt) + header_length, NULL, 0);
 }
diff --git a/include/uapi/linux/firewire-cdev.h 
b/include/uapi/linux/firewire-cdev.h
index 5effa9832802..22c5f59e9dfa 100644
--- a/include/uapi/linux/firewire-cdev.h
+++ b/include/uapi/linux/firewire-cdev.h
@@ -264,8 +264,8 @@ struct fw_cdev_event_iso_interrupt {
__u64 closure;
__u32 type;
__u32 cycle;
-   __u32 header_length;
-   __u32 header[0];
+   __DECLARE_FLEX_ARRAY_ELEMENTS_COUNT(__u32, header_length);
+   __DECLARE_FLEX_ARRAY_ELEMENTS(__u32, header);
 };
 
 /**
-- 
2.32.0




[PATCH 10/32] wcn36xx: Use mem_to_flex_dup() with struct wcn36xx_hal_ind_msg

2022-05-03 Thread Kees Cook
As part of the work to perform bounds checking on all memcpy() uses,
replace the open-coded deserialization of bytes out of memory into a
trailing flexible array by using a flex_array.h helper to perform the
allocation, bounds checking, and copying.

Cc: Loic Poulain 
Cc: Kalle Valo 
Cc: "David S. Miller" 
Cc: Eric Dumazet 
Cc: Jakub Kicinski 
Cc: Paolo Abeni 
Cc: wcn3...@lists.infradead.org
Cc: linux-wirel...@vger.kernel.org
Cc: net...@vger.kernel.org
Signed-off-by: Kees Cook 
---
 drivers/net/wireless/ath/wcn36xx/smd.c | 8 ++--
 drivers/net/wireless/ath/wcn36xx/smd.h | 4 ++--
 2 files changed, 4 insertions(+), 8 deletions(-)

diff --git a/drivers/net/wireless/ath/wcn36xx/smd.c 
b/drivers/net/wireless/ath/wcn36xx/smd.c
index dc3805609284..106af0a2ffc4 100644
--- a/drivers/net/wireless/ath/wcn36xx/smd.c
+++ b/drivers/net/wireless/ath/wcn36xx/smd.c
@@ -3343,7 +3343,7 @@ int wcn36xx_smd_rsp_process(struct rpmsg_device *rpdev,
const struct wcn36xx_hal_msg_header *msg_header = buf;
struct ieee80211_hw *hw = priv;
struct wcn36xx *wcn = hw->priv;
-   struct wcn36xx_hal_ind_msg *msg_ind;
+   struct wcn36xx_hal_ind_msg *msg_ind = NULL;
wcn36xx_dbg_dump(WCN36XX_DBG_SMD_DUMP, "SMD <<< ", buf, len);
 
switch (msg_header->msg_type) {
@@ -3407,16 +3407,12 @@ int wcn36xx_smd_rsp_process(struct rpmsg_device *rpdev,
case WCN36XX_HAL_DELETE_STA_CONTEXT_IND:
case WCN36XX_HAL_PRINT_REG_INFO_IND:
case WCN36XX_HAL_SCAN_OFFLOAD_IND:
-   msg_ind = kmalloc(struct_size(msg_ind, msg, len), GFP_ATOMIC);
-   if (!msg_ind) {
+   if (mem_to_flex_dup(&msg_ind, buf, len, GFP_ATOMIC)) {
wcn36xx_err("Run out of memory while handling SMD_EVENT (%d)\n",
msg_header->msg_type);
return -ENOMEM;
}
 
-   msg_ind->msg_len = len;
-   memcpy(msg_ind->msg, buf, len);
-
spin_lock(&wcn->hal_ind_lock);
list_add_tail(&msg_ind->list, &wcn->hal_ind_queue);
queue_work(wcn->hal_ind_wq, &wcn->hal_ind_work);
diff --git a/drivers/net/wireless/ath/wcn36xx/smd.h 
b/drivers/net/wireless/ath/wcn36xx/smd.h
index 3fd598ac2a27..76ecac46f36b 100644
--- a/drivers/net/wireless/ath/wcn36xx/smd.h
+++ b/drivers/net/wireless/ath/wcn36xx/smd.h
@@ -46,8 +46,8 @@ struct wcn36xx_fw_msg_status_rsp {
 
 struct wcn36xx_hal_ind_msg {
struct list_head list;
-   size_t msg_len;
-   u8 msg[];
+   DECLARE_FLEX_ARRAY_ELEMENTS_COUNT(size_t, msg_len);
+   DECLARE_FLEX_ARRAY_ELEMENTS(u8, msg);
 };
 
 struct wcn36xx;
-- 
2.32.0




[PATCH 09/32] p54: Use mem_to_flex_dup() with struct p54_cal_database

2022-05-03 Thread Kees Cook
As part of the work to perform bounds checking on all memcpy() uses,
replace the open-coded deserialization of bytes out of memory into a
trailing flexible array by using a flex_array.h helper to perform the
allocation, bounds checking, and copying.

Cc: Christian Lamparter 
Cc: Kalle Valo 
Cc: "David S. Miller" 
Cc: Eric Dumazet 
Cc: Jakub Kicinski 
Cc: Paolo Abeni 
Cc: linux-wirel...@vger.kernel.org
Cc: net...@vger.kernel.org
Signed-off-by: Kees Cook 
---
 drivers/net/wireless/intersil/p54/eeprom.c | 8 ++--
 drivers/net/wireless/intersil/p54/p54.h| 4 ++--
 2 files changed, 4 insertions(+), 8 deletions(-)

diff --git a/drivers/net/wireless/intersil/p54/eeprom.c 
b/drivers/net/wireless/intersil/p54/eeprom.c
index 5bd35c147e19..bd9b3ea327b9 100644
--- a/drivers/net/wireless/intersil/p54/eeprom.c
+++ b/drivers/net/wireless/intersil/p54/eeprom.c
@@ -702,7 +702,7 @@ static int p54_convert_output_limits(struct ieee80211_hw 
*dev,
 static struct p54_cal_database *p54_convert_db(struct pda_custom_wrapper *src,
   size_t total_len)
 {
-   struct p54_cal_database *dst;
+   struct p54_cal_database *dst = NULL;
size_t payload_len, entries, entry_size, offset;
 
payload_len = le16_to_cpu(src->len);
@@ -713,16 +713,12 @@ static struct p54_cal_database *p54_convert_db(struct 
pda_custom_wrapper *src,
 (payload_len + sizeof(*src) != total_len))
return NULL;
 
-   dst = kmalloc(sizeof(*dst) + payload_len, GFP_KERNEL);
-   if (!dst)
+   if (mem_to_flex_dup(&dst, src->data, payload_len, GFP_KERNEL))
return NULL;
 
dst->entries = entries;
dst->entry_size = entry_size;
dst->offset = offset;
-   dst->len = payload_len;
-
-   memcpy(dst->data, src->data, payload_len);
return dst;
 }
 
diff --git a/drivers/net/wireless/intersil/p54/p54.h 
b/drivers/net/wireless/intersil/p54/p54.h
index 3356ea708d81..22bbb6d28245 100644
--- a/drivers/net/wireless/intersil/p54/p54.h
+++ b/drivers/net/wireless/intersil/p54/p54.h
@@ -125,8 +125,8 @@ struct p54_cal_database {
size_t entries;
size_t entry_size;
size_t offset;
-   size_t len;
-   u8 data[];
+   DECLARE_FLEX_ARRAY_ELEMENTS_COUNT(size_t, len);
+   DECLARE_FLEX_ARRAY_ELEMENTS(u8, data);
 };
 
 #define EEPROM_READBACK_LEN 0x3fc
-- 
2.32.0




[PATCH 27/32] KEYS: Use mem_to_flex_dup() with struct user_key_payload

2022-05-03 Thread Kees Cook
As part of the work to perform bounds checking on all memcpy() uses,
replace the open-coded deserialization of bytes out of memory into a
trailing flexible array by using a flex_array.h helper to perform the
allocation, bounds checking, and copying.

Cc: David Howells 
Cc: Jarkko Sakkinen 
Cc: James Morris 
Cc: "Serge E. Hallyn" 
Cc: keyri...@vger.kernel.org
Cc: linux-security-mod...@vger.kernel.org
Signed-off-by: Kees Cook 
---
 include/keys/user-type.h | 4 ++--
 security/keys/user_defined.c | 7 ++-
 2 files changed, 4 insertions(+), 7 deletions(-)

diff --git a/include/keys/user-type.h b/include/keys/user-type.h
index 386c31432789..4e67ff902a32 100644
--- a/include/keys/user-type.h
+++ b/include/keys/user-type.h
@@ -26,8 +26,8 @@
  */
 struct user_key_payload {
struct rcu_head rcu;/* RCU destructor */
-   unsigned short  datalen;/* length of this data */
-   chardata[] __aligned(__alignof__(u64)); /* actual data */
+   DECLARE_FLEX_ARRAY_ELEMENTS_COUNT(unsigned short, datalen);
+   DECLARE_FLEX_ARRAY_ELEMENTS(char, data) __aligned(__alignof__(u64));
 };
 
 extern struct key_type key_type_user;
diff --git a/security/keys/user_defined.c b/security/keys/user_defined.c
index 749e2a4dcb13..2fb84894cdaa 100644
--- a/security/keys/user_defined.c
+++ b/security/keys/user_defined.c
@@ -58,21 +58,18 @@ EXPORT_SYMBOL_GPL(key_type_logon);
  */
 int user_preparse(struct key_preparsed_payload *prep)
 {
-   struct user_key_payload *upayload;
+   struct user_key_payload *upayload = NULL;
size_t datalen = prep->datalen;
 
if (datalen <= 0 || datalen > 32767 || !prep->data)
return -EINVAL;
 
-   upayload = kmalloc(sizeof(*upayload) + datalen, GFP_KERNEL);
-   if (!upayload)
+   if (mem_to_flex_dup(&upayload, prep->data, datalen, GFP_KERNEL))
return -ENOMEM;
 
/* attach the data */
prep->quotalen = datalen;
prep->payload.data[0] = upayload;
-   upayload->datalen = datalen;
-   memcpy(upayload->data, prep->data, datalen);
return 0;
 }
 EXPORT_SYMBOL_GPL(user_preparse);
-- 
2.32.0




[PATCH 28/32] selinux: Use mem_to_flex_dup() with xfrm and sidtab

2022-05-03 Thread Kees Cook
As part of the work to perform bounds checking on all memcpy() uses,
replace the open-coded deserialization of bytes out of memory into a
trailing flexible array by using a flex_array.h helper to perform the
allocation, bounds checking, and copying:

struct xfrm_sec_ctx
struct sidtab_str_cache

Cc: Steffen Klassert 
Cc: Herbert Xu 
Cc: "David S. Miller" 
Cc: Paul Moore 
Cc: Stephen Smalley 
Cc: Eric Paris 
Cc: Nick Desaulniers 
Cc: Xiu Jianfeng 
Cc: "Christian Göttsche" 
Cc: net...@vger.kernel.org
Cc: seli...@vger.kernel.org
Signed-off-by: Kees Cook 
---
 include/uapi/linux/xfrm.h| 4 ++--
 security/selinux/ss/sidtab.c | 9 +++--
 security/selinux/xfrm.c  | 7 ++-
 3 files changed, 7 insertions(+), 13 deletions(-)

diff --git a/include/uapi/linux/xfrm.h b/include/uapi/linux/xfrm.h
index 65e13a099b1a..4a6fa2beff6a 100644
--- a/include/uapi/linux/xfrm.h
+++ b/include/uapi/linux/xfrm.h
@@ -31,9 +31,9 @@ struct xfrm_id {
 struct xfrm_sec_ctx {
__u8ctx_doi;
__u8ctx_alg;
-   __u16   ctx_len;
+   __DECLARE_FLEX_ARRAY_ELEMENTS_COUNT(__u16, ctx_len);
__u32   ctx_sid;
-   charctx_str[0];
+   __DECLARE_FLEX_ARRAY_ELEMENTS(char, ctx_str);
 };
 
 /* Security Context Domains of Interpretation */
diff --git a/security/selinux/ss/sidtab.c b/security/selinux/ss/sidtab.c
index a54b8652bfb5..a9d434e8cff7 100644
--- a/security/selinux/ss/sidtab.c
+++ b/security/selinux/ss/sidtab.c
@@ -23,8 +23,8 @@ struct sidtab_str_cache {
struct rcu_head rcu_member;
struct list_head lru_member;
struct sidtab_entry *parent;
-   u32 len;
-   char str[];
+   DECLARE_FLEX_ARRAY_ELEMENTS_COUNT(u32, len);
+   DECLARE_FLEX_ARRAY_ELEMENTS(char, str);
 };
 
 #define index_to_sid(index) ((index) + SECINITSID_NUM + 1)
@@ -570,8 +570,7 @@ void sidtab_sid2str_put(struct sidtab *s, struct 
sidtab_entry *entry,
goto out_unlock;
}
 
-   cache = kmalloc(struct_size(cache, str, str_len), GFP_ATOMIC);
-   if (!cache)
+   if (mem_to_flex_dup(&cache, str, str_len, GFP_ATOMIC))
goto out_unlock;
 
if (s->cache_free_slots == 0) {
@@ -584,8 +583,6 @@ void sidtab_sid2str_put(struct sidtab *s, struct 
sidtab_entry *entry,
s->cache_free_slots--;
}
cache->parent = entry;
-   cache->len = str_len;
-   memcpy(cache->str, str, str_len);
list_add(&cache->lru_member, &s->cache_lru_list);
 
rcu_assign_pointer(entry->cache, cache);
diff --git a/security/selinux/xfrm.c b/security/selinux/xfrm.c
index c576832febc6..bc7a54bf8f0d 100644
--- a/security/selinux/xfrm.c
+++ b/security/selinux/xfrm.c
@@ -345,7 +345,7 @@ int selinux_xfrm_state_alloc_acquire(struct xfrm_state *x,
 struct xfrm_sec_ctx *polsec, u32 secid)
 {
int rc;
-   struct xfrm_sec_ctx *ctx;
+   struct xfrm_sec_ctx *ctx = NULL;
char *ctx_str = NULL;
u32 str_len;
 
@@ -360,8 +360,7 @@ int selinux_xfrm_state_alloc_acquire(struct xfrm_state *x,
if (rc)
return rc;
 
-   ctx = kmalloc(struct_size(ctx, ctx_str, str_len), GFP_ATOMIC);
-   if (!ctx) {
+   if (mem_to_flex_dup(&ctx, ctx_str, str_len, GFP_ATOMIC)) {
rc = -ENOMEM;
goto out;
}
@@ -369,8 +368,6 @@ int selinux_xfrm_state_alloc_acquire(struct xfrm_state *x,
ctx->ctx_doi = XFRM_SC_DOI_LSM;
ctx->ctx_alg = XFRM_SC_ALG_SELINUX;
ctx->ctx_sid = secid;
-   ctx->ctx_len = str_len;
-   memcpy(ctx->ctx_str, ctx_str, str_len);
 
x->security = ctx;
atomic_inc(&selinux_xfrm_refcount);
-- 
2.32.0




[PATCH 06/32] iwlwifi: calib: Prepare to use mem_to_flex_dup()

2022-05-03 Thread Kees Cook
In preparation for replacing an open-coded memcpy() of a dynamically
sized buffer, rearrange the structures to pass enough information into
the calling function to examine the bounds of the struct.

Rearrange the argument passing to use "cmd", rather than "hdr", since
"res" expects to operate on the "data" flex array in "cmd" (that follows
"hdr").

Cc: Luca Coelho 
Cc: "David S. Miller" 
Cc: Jakub Kicinski 
Cc: Lee Jones 
Cc: Johannes Berg 
Cc: Gregory Greenman 
Cc: Kalle Valo 
Cc: Eric Dumazet 
Cc: Paolo Abeni 
Cc: Andy Lavr 
Cc: linux-wirel...@vger.kernel.org
Cc: net...@vger.kernel.org
Signed-off-by: Kees Cook 
---
 drivers/net/wireless/intel/iwlwifi/dvm/agn.h   |  2 +-
 drivers/net/wireless/intel/iwlwifi/dvm/calib.c | 10 +-
 drivers/net/wireless/intel/iwlwifi/dvm/ucode.c |  8 
 3 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/drivers/net/wireless/intel/iwlwifi/dvm/agn.h 
b/drivers/net/wireless/intel/iwlwifi/dvm/agn.h
index abb8696ba294..744e111d2ea3 100644
--- a/drivers/net/wireless/intel/iwlwifi/dvm/agn.h
+++ b/drivers/net/wireless/intel/iwlwifi/dvm/agn.h
@@ -112,7 +112,7 @@ int iwl_load_ucode_wait_alive(struct iwl_priv *priv,
  enum iwl_ucode_type ucode_type);
 int iwl_send_calib_results(struct iwl_priv *priv);
 int iwl_calib_set(struct iwl_priv *priv,
- const struct iwl_calib_hdr *cmd, int len);
+ const struct iwl_calib_cmd *cmd, int len);
 void iwl_calib_free_results(struct iwl_priv *priv);
 int iwl_dump_nic_event_log(struct iwl_priv *priv, bool full_log,
char **buf);
diff --git a/drivers/net/wireless/intel/iwlwifi/dvm/calib.c 
b/drivers/net/wireless/intel/iwlwifi/dvm/calib.c
index a11884fa254b..ae1f0cf560e2 100644
--- a/drivers/net/wireless/intel/iwlwifi/dvm/calib.c
+++ b/drivers/net/wireless/intel/iwlwifi/dvm/calib.c
@@ -19,7 +19,7 @@
 struct iwl_calib_result {
struct list_head list;
size_t cmd_len;
-   struct iwl_calib_hdr hdr;
+   struct iwl_calib_cmd cmd;
/* data follows */
 };
 
@@ -43,12 +43,12 @@ int iwl_send_calib_results(struct iwl_priv *priv)
int ret;
 
hcmd.len[0] = res->cmd_len;
-   hcmd.data[0] = &res->hdr;
+   hcmd.data[0] = &res->cmd;
hcmd.dataflags[0] = IWL_HCMD_DFL_NOCOPY;
ret = iwl_dvm_send_cmd(priv, &hcmd);
if (ret) {
IWL_ERR(priv, "Error %d on calib cmd %d\n",
-   ret, res->hdr.op_code);
+   ret, res->cmd.hdr.op_code);
return ret;
}
}
@@ -57,7 +57,7 @@ int iwl_send_calib_results(struct iwl_priv *priv)
 }
 
 int iwl_calib_set(struct iwl_priv *priv,
- const struct iwl_calib_hdr *cmd, int len)
+ const struct iwl_calib_cmd *cmd, int len)
 {
struct iwl_calib_result *res, *tmp;
 
@@ -69,7 +69,7 @@ int iwl_calib_set(struct iwl_priv *priv,
res->cmd_len = len;
 
list_for_each_entry(tmp, &priv->calib_results, list) {
-   if (tmp->hdr.op_code == res->hdr.op_code) {
+   if (tmp->cmd.hdr.op_code == res->cmd.hdr.op_code) {
list_replace(&tmp->list, &res->list);
kfree(tmp);
return 0;
diff --git a/drivers/net/wireless/intel/iwlwifi/dvm/ucode.c 
b/drivers/net/wireless/intel/iwlwifi/dvm/ucode.c
index 4b27a53d0bb4..bb13ca5d666c 100644
--- a/drivers/net/wireless/intel/iwlwifi/dvm/ucode.c
+++ b/drivers/net/wireless/intel/iwlwifi/dvm/ucode.c
@@ -356,18 +356,18 @@ static bool iwlagn_wait_calib(struct iwl_notif_wait_data 
*notif_wait,
  struct iwl_rx_packet *pkt, void *data)
 {
struct iwl_priv *priv = data;
-   struct iwl_calib_hdr *hdr;
+   struct iwl_calib_cmd *cmd;
 
if (pkt->hdr.cmd != CALIBRATION_RES_NOTIFICATION) {
WARN_ON(pkt->hdr.cmd != CALIBRATION_COMPLETE_NOTIFICATION);
return true;
}
 
-   hdr = (struct iwl_calib_hdr *)pkt->data;
+   cmd = (struct iwl_calib_cmd *)pkt->data;
 
-   if (iwl_calib_set(priv, hdr, iwl_rx_packet_payload_len(pkt)))
+   if (iwl_calib_set(priv, cmd, iwl_rx_packet_payload_len(pkt)))
IWL_ERR(priv, "Failed to record calibration data %d\n",
-   hdr->op_code);
+   cmd->hdr.op_code);
 
return false;
 }
-- 
2.32.0




[PATCH 12/32] cfg80211: Use mem_to_flex_dup() with struct cfg80211_bss_ies

2022-05-03 Thread Kees Cook
As part of the work to perform bounds checking on all memcpy() uses,
replace the open-coded deserialization of bytes out of memory into a
trailing flexible array by using a flex_array.h helper to perform the
allocation, bounds checking, and copying.

Cc: Johannes Berg 
Cc: "David S. Miller" 
Cc: Jakub Kicinski 
Cc: Paolo Abeni 
Cc: Eric Dumazet 
Cc: linux-wirel...@vger.kernel.org
Cc: net...@vger.kernel.org
Signed-off-by: Kees Cook 
---
 include/net/cfg80211.h |  4 ++--
 net/wireless/scan.c| 21 ++---
 2 files changed, 8 insertions(+), 17 deletions(-)

diff --git a/include/net/cfg80211.h b/include/net/cfg80211.h
index 68713388b617..fa236015f6ef 100644
--- a/include/net/cfg80211.h
+++ b/include/net/cfg80211.h
@@ -2600,9 +2600,9 @@ struct cfg80211_inform_bss {
 struct cfg80211_bss_ies {
u64 tsf;
struct rcu_head rcu_head;
-   int len;
+   DECLARE_FLEX_ARRAY_ELEMENTS_COUNT(int, len);
bool from_beacon;
-   u8 data[];
+   DECLARE_FLEX_ARRAY_ELEMENTS(u8, data);
 };
 
 /**
diff --git a/net/wireless/scan.c b/net/wireless/scan.c
index 4a6d86432910..9f53d05c6aaa 100644
--- a/net/wireless/scan.c
+++ b/net/wireless/scan.c
@@ -1932,7 +1932,7 @@ cfg80211_inform_single_bss_data(struct wiphy *wiphy,
gfp_t gfp)
 {
struct cfg80211_registered_device *rdev = wiphy_to_rdev(wiphy);
-   struct cfg80211_bss_ies *ies;
+   struct cfg80211_bss_ies *ies = NULL;
struct ieee80211_channel *channel;
struct cfg80211_internal_bss tmp = {}, *res;
int bss_type;
@@ -1978,13 +1978,10 @@ cfg80211_inform_single_bss_data(struct wiphy *wiphy,
 * override the IEs pointer should we have received an earlier
 * indication of Probe Response data.
 */
-   ies = kzalloc(sizeof(*ies) + ielen, gfp);
-   if (!ies)
+   if (mem_to_flex_dup(&ies, ie, ielen, gfp))
return NULL;
-   ies->len = ielen;
ies->tsf = tsf;
ies->from_beacon = false;
-   memcpy(ies->data, ie, ielen);
 
switch (ftype) {
case CFG80211_BSS_FTYPE_BEACON:
@@ -2277,7 +2274,7 @@ cfg80211_update_notlisted_nontrans(struct wiphy *wiphy,
size_t ielen = len - offsetof(struct ieee80211_mgmt,
  u.probe_resp.variable);
size_t new_ie_len;
-   struct cfg80211_bss_ies *new_ies;
+   struct cfg80211_bss_ies *new_ies = NULL;
const struct cfg80211_bss_ies *old;
u8 cpy_len;
 
@@ -2314,8 +2311,7 @@ cfg80211_update_notlisted_nontrans(struct wiphy *wiphy,
if (!new_ie)
return;
 
-   new_ies = kzalloc(sizeof(*new_ies) + new_ie_len, GFP_ATOMIC);
-   if (!new_ies)
+   if (mem_to_flex_dup(&new_ies, new_ie, new_ie_len, GFP_ATOMIC))
goto out_free;
 
pos = new_ie;
@@ -2333,10 +2329,8 @@ cfg80211_update_notlisted_nontrans(struct wiphy *wiphy,
memcpy(pos, mbssid + cpy_len, ((ie + ielen) - (mbssid + cpy_len)));
 
/* update ie */
-   new_ies->len = new_ie_len;
new_ies->tsf = le64_to_cpu(mgmt->u.probe_resp.timestamp);
new_ies->from_beacon = ieee80211_is_beacon(mgmt->frame_control);
-   memcpy(new_ies->data, new_ie, new_ie_len);
if (ieee80211_is_probe_resp(mgmt->frame_control)) {
old = rcu_access_pointer(nontrans_bss->proberesp_ies);
rcu_assign_pointer(nontrans_bss->proberesp_ies, new_ies);
@@ -2363,7 +2357,7 @@ cfg80211_inform_single_bss_frame_data(struct wiphy *wiphy,
  gfp_t gfp)
 {
struct cfg80211_internal_bss tmp = {}, *res;
-   struct cfg80211_bss_ies *ies;
+   struct cfg80211_bss_ies *ies = NULL;
struct ieee80211_channel *channel;
bool signal_valid;
struct ieee80211_ext *ext = NULL;
@@ -2442,14 +2436,11 @@ cfg80211_inform_single_bss_frame_data(struct wiphy 
*wiphy,
capability = le16_to_cpu(mgmt->u.probe_resp.capab_info);
}
 
-   ies = kzalloc(sizeof(*ies) + ielen, gfp);
-   if (!ies)
+   if (mem_to_flex_dup(&ies, variable, ielen, gfp))
return NULL;
-   ies->len = ielen;
ies->tsf = le64_to_cpu(mgmt->u.probe_resp.timestamp);
ies->from_beacon = ieee80211_is_beacon(mgmt->frame_control) ||
   ieee80211_is_s1g_beacon(mgmt->frame_control);
-   memcpy(ies->data, variable, ielen);
 
if (ieee80211_is_probe_resp(mgmt->frame_control))
rcu_assign_pointer(tmp.pub.proberesp_ies, ies);
-- 
2.32.0




[PATCH 04/32] fortify: Add run-time WARN for cross-field memcpy()

2022-05-03 Thread Kees Cook
Enable run-time checking of dynamic memcpy() and memmove() lengths,
issuing a WARN when a write would exceed the size of the target struct
member, when built with CONFIG_FORTIFY_SOURCE=y. This would have caught
all of the memcpy()-based buffer overflows from 2018 through 2020,
specifically covering all the cases where the destination buffer size
is known at compile time.

This change ONLY adds a run-time warning. As false positives are currently
still expected, this will not block the overflow. The new warnings will
look like this:

  memcpy: detected field-spanning write (size N) of single field "var->dest" 
(size M)
  WARNING: CPU: n PID:  at source/file/path.c:nr function+0xXX/0xXX [module]
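
A minimal userspace sketch of the behavior described above, not the kernel's
fortify_memcpy_chk(): the helper names warn_memcpy_field() and demo_copy()
and the struct layout are assumptions made for illustration. Like the patch,
it only reports a field-spanning write and still performs the copy:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

struct demo {
	char dest[4];	/* the single field being targeted */
	char next[12];	/* neighbouring field that a long write spills into */
};

/*
 * Hypothetical stand-in for the run-time check: returns true (and warns)
 * when "size" bytes do not fit in a field of "field_size" bytes.
 */
static bool warn_memcpy_field(size_t size, size_t field_size, const char *name)
{
	if (size <= field_size)
		return false;
	fprintf(stderr,
		"memcpy: detected field-spanning write (size %zu) of single field \"%s\" (size %zu)\n",
		size, name, field_size);
	return true;
}

/*
 * Copy into d->dest. A write that spans into d->next is reported but
 * still carried out; only a write that would leave the struct entirely
 * is refused, so that the sketch itself stays in bounds.
 */
static bool demo_copy(struct demo *d, const void *src, size_t len)
{
	bool spanned = warn_memcpy_field(len, sizeof(d->dest), "d->dest");

	if (len <= sizeof(*d))
		memcpy((unsigned char *)d + offsetof(struct demo, dest), src, len);
	return spanned;
}
```

Returning a bool instead of aborting mirrors the "warn only" stance of this
change: the caller learns about the problem without the copy being blocked.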

The false positives are most likely where intentional field-spanning
writes are happening. These need to be addressed similarly to how the
compile-time cases were addressed: add a struct_group(), split the
memcpy(), use a flex_array.h helper, or some other refactoring.

In order to make identifying/investigating instances of added runtime
checks easier, each instance includes the destination variable name as a
WARN argument, prefixed with 'field "'. Therefore, on any given build,
it is trivial to inspect the artifacts to find instances. For example
on an x86_64 defconfig build, there are 78 new run-time memcpy() bounds
checks added:

  $ for i in vmlinux $(find . -name '*.ko'); do \
  strings "$i" | grep '^field "'; done | wc -l
  78

Currently, the common case where a destination buffer is known to be a
dynamic size (i.e. has a trailing flexible array) does not generate a
WARN. For example:

struct normal_flex_array {
void *a;
int b;
size_t array_size;
u32 c;
u8 flex_array[];
};

struct normal_flex_array *instance;
...
/* These cases will be ignored for run-time bounds checking. */
memcpy(instance, src, len);
memcpy(instance->flex_array, src, len);

This code pattern will need to be addressed separately, likely by
migrating to one of the flex_array.h family of helpers.

Note that one of the dynamic-sized destination cases is irritatingly
unable to be detected by the compiler: when using memcpy() to target
a composite struct member which contains a trailing flexible array
struct. For example:

struct wrapper {
int foo;
char bar;
struct normal_flex_array embedded;
};

struct wrapper *instance;
...
/* This will incorrectly WARN when len > sizeof(instance->embedded) */
memcpy(&instance->embedded, src, len);

These cases end up appearing to the compiler to be sized as if the
flexible array had 0 elements. :( For more details see:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101832
https://godbolt.org/z/vW6x8vh4P
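
The size mismatch can be illustrated with plain sizeof arithmetic. This is
only a sketch: it does not reproduce what __builtin_object_size() reports
(that depends on the compiler version), and the helper names
embedded_static_size() and embedded_real_size() are hypothetical. Nesting a
struct with a flexible array as the last member of another struct is a GNU C
extension:

```c
#include <stddef.h>
#include <stdint.h>

struct normal_flex_array {
	void *a;
	int b;
	size_t array_size;
	uint32_t c;
	uint8_t flex_array[];
};

struct wrapper {
	int foo;
	char bar;
	struct normal_flex_array embedded;	/* GNU C extension */
};

/* Bytes the type alone attributes to "embedded": zero array elements. */
static size_t embedded_static_size(void)
{
	return sizeof(((struct wrapper *)0)->embedded);
}

/* Bytes actually usable when the allocation carries n trailing elements. */
static size_t embedded_real_size(size_t n)
{
	return sizeof(struct normal_flex_array) + n * sizeof(uint8_t);
}
```

Any copy longer than embedded_static_size() looks like an overflow to a
check that only knows the static type, even when the allocation is large
enough.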

Regardless, all cases of copying to/from flexible array structures
should be migrated to using the new flex*()-family of helpers to gain
their added safety checking, but priority will need to be given to the
"composite flexible array structure destination" cases noted above.

As mentioned, none of these bounds checks block any overflows
currently. For users that have tested their workloads, do not encounter
any warnings, and wish to make these checks stop any overflows, they
can use a big hammer and set the sysctl panic_on_warn=1.

Cc: Nathan Chancellor 
Cc: Nick Desaulniers 
Cc: Tom Rix 
Cc: linux-harden...@vger.kernel.org
Cc: l...@lists.linux.dev
Signed-off-by: Kees Cook 
---
 include/linux/fortify-string.h | 70 --
 1 file changed, 67 insertions(+), 3 deletions(-)

diff --git a/include/linux/fortify-string.h b/include/linux/fortify-string.h
index 295637a66c46..9f65527fff40 100644
--- a/include/linux/fortify-string.h
+++ b/include/linux/fortify-string.h
@@ -3,6 +3,7 @@
 #define _LINUX_FORTIFY_STRING_H_
 
 #include 
+#include 
 
 #define __FORTIFY_INLINE extern __always_inline __gnu_inline __overloadable
 #define __RENAME(x) __asm__(#x)
@@ -303,7 +304,7 @@ __FORTIFY_INLINE void fortify_memset_chk(__kernel_size_t 
size,
  * V = vulnerable to run-time overflow (will need refactoring to solve)
  *
  */
-__FORTIFY_INLINE void fortify_memcpy_chk(__kernel_size_t size,
+__FORTIFY_INLINE bool fortify_memcpy_chk(__kernel_size_t size,
 const size_t p_size,
 const size_t q_size,
 const size_t p_size_field,
@@ -352,16 +353,79 @@ __FORTIFY_INLINE void fortify_memcpy_chk(__kernel_size_t 
size,
if ((p_size != (size_t)(-1) && p_size < size) ||
(q_size != (size_t)(-1) && q_size < size))
fortify_panic(func);
+
+   /*
+* Warn when writing beyond destination field size.
+*
+* We must ignore p_size_field == 0 and -1 for existing
+* 0-element and flexible arrays, until they are all converted
+* to flexible arrays and use the flex()-family of helpers.
+*
+* The implementation 

[PATCH 03/32] flex_array: Add Kunit tests

2022-05-03 Thread Kees Cook
Add tests for the new flexible array structure helpers. These can be run
with:

  make ARCH=um mrproper
  ./tools/testing/kunit/kunit.py config
  ./tools/testing/kunit/kunit.py run flex_array

Cc: David Gow 
Cc: kunit-...@googlegroups.com
Signed-off-by: Kees Cook 
---
 lib/Kconfig.debug  |  12 +-
 lib/Makefile   |   1 +
 lib/flex_array_kunit.c | 523 +
 3 files changed, 531 insertions(+), 5 deletions(-)
 create mode 100644 lib/flex_array_kunit.c

diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 9077bb38bc93..8bae6b169c50 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -2551,11 +2551,6 @@ config OVERFLOW_KUNIT_TEST
  Builds unit tests for the check_*_overflow(), size_*(), allocation, 
and
  related functions.
 
- For more information on KUnit and unit tests in general please refer
- to the KUnit documentation in Documentation/dev-tools/kunit/.
-
- If unsure, say N.
-
 config STACKINIT_KUNIT_TEST
tristate "Test level of stack variable initialization" if 
!KUNIT_ALL_TESTS
depends on KUNIT
@@ -2567,6 +2562,13 @@ config STACKINIT_KUNIT_TEST
  CONFIG_GCC_PLUGIN_STRUCTLEAK, CONFIG_GCC_PLUGIN_STRUCTLEAK_BYREF,
  or CONFIG_GCC_PLUGIN_STRUCTLEAK_BYREF_ALL.
 
+config FLEX_ARRAY_KUNIT_TEST
+   tristate "Test flex_*() family of helper functions at runtime" if 
!KUNIT_ALL_TESTS
+   depends on KUNIT
+   default KUNIT_ALL_TESTS
+   help
+ Builds unit tests for flexible array copy helper functions.
+
 config TEST_UDELAY
tristate "udelay test driver"
help
diff --git a/lib/Makefile b/lib/Makefile
index 6b9ffc1bd1ee..9884318db330 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -366,6 +366,7 @@ obj-$(CONFIG_MEMCPY_KUNIT_TEST) += memcpy_kunit.o
 obj-$(CONFIG_OVERFLOW_KUNIT_TEST) += overflow_kunit.o
 CFLAGS_stackinit_kunit.o += $(call cc-disable-warning, switch-unreachable)
 obj-$(CONFIG_STACKINIT_KUNIT_TEST) += stackinit_kunit.o
+obj-$(CONFIG_FLEX_ARRAY_KUNIT_TEST) += flex_array_kunit.o
 
 obj-$(CONFIG_GENERIC_LIB_DEVMEM_IS_ALLOWED) += devmem_is_allowed.o
 
diff --git a/lib/flex_array_kunit.c b/lib/flex_array_kunit.c
new file mode 100644
index ..48bee88945b4
--- /dev/null
+++ b/lib/flex_array_kunit.c
@@ -0,0 +1,523 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Test cases for flex_*() array manipulation helpers.
+ */
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define COMPARE_STRUCTS(STRUCT_A, STRUCT_B)do {\
+   STRUCT_A *ptr_A;\
+   STRUCT_B *ptr_B;\
+   int rc; \
+   size_t size_A, size_B;  \
+   \
+   /* matching types for flex array elements and count */  \
+   KUNIT_EXPECT_EQ(test, sizeof(*ptr_A), sizeof(*ptr_B));  \
+   KUNIT_EXPECT_TRUE(test, __same_type(*ptr_A->data,   \
+   *ptr_B->__flex_array_elements));\
+   KUNIT_EXPECT_TRUE(test, __same_type(ptr_A->datalen, \
+   ptr_B->__flex_array_elements_count));   \
+   KUNIT_EXPECT_EQ(test, sizeof(*ptr_A->data), \
+ sizeof(*ptr_B->__flex_array_elements));   \
+   KUNIT_EXPECT_EQ(test, offsetof(typeof(*ptr_A), data),   \
+ offsetof(typeof(*ptr_B),  \
+  __flex_array_elements)); \
+   KUNIT_EXPECT_EQ(test, offsetof(typeof(*ptr_A), datalen),\
+ offsetof(typeof(*ptr_B),  \
+  __flex_array_elements_count));   \
+   \
+   /* struct_size() vs __fas_bytes() */\
+   size_A = struct_size(ptr_A, data, 13);  \
+   rc = __fas_bytes(ptr_B, __flex_array_elements,  \
+__flex_array_elements_count, 13, &size_B); \
+   KUNIT_EXPECT_EQ(test, rc, 0);   \
+   KUNIT_EXPECT_EQ(test, size_A, size_B);  \
+   \
+   /* flex_array_size() vs __fas_elements_bytes() */   \
+   size_A = flex_array_size(ptr_A, data, 13);  \
+   rc = __fas_elements_bytes(ptr_B, __flex_array_elements, \
+__flex_array_elements_count, 13, &size_B); \
+   KUNIT_EXPECT_EQ(test, rc, 0);   \
+   KUNIT_EXPECT_EQ(test, 

[PATCH 07/32] iwlwifi: calib: Use mem_to_flex_dup() with struct iwl_calib_result

2022-05-03 Thread Kees Cook
As part of the work to perform bounds checking on all memcpy() uses,
replace the open-coded deserialization of bytes out of memory into a
trailing flexible array by using a flex_array.h helper to perform the
allocation, bounds checking, and copying.

Avoids future false-positive warning when strict run-time memcpy()
bounds checking is enabled:

memcpy: detected field-spanning write (size 8) of single field "&res->hdr" 
(size 4)

Adds an additional size check since the minimum isn't 0.

Reported-by: Andy Lavr 
Cc: Luca Coelho 
Cc: Kalle Valo 
Cc: "David S. Miller" 
Cc: Jakub Kicinski 
Cc: Paolo Abeni 
Cc: Gregory Greenman 
Cc: Eric Dumazet 
Cc: linux-wirel...@vger.kernel.org
Cc: net...@vger.kernel.org
Signed-off-by: Kees Cook 
---
 drivers/net/wireless/intel/iwlwifi/dvm/calib.c | 15 +++
 1 file changed, 7 insertions(+), 8 deletions(-)

diff --git a/drivers/net/wireless/intel/iwlwifi/dvm/calib.c 
b/drivers/net/wireless/intel/iwlwifi/dvm/calib.c
index ae1f0cf560e2..7480c19d7af0 100644
--- a/drivers/net/wireless/intel/iwlwifi/dvm/calib.c
+++ b/drivers/net/wireless/intel/iwlwifi/dvm/calib.c
@@ -18,8 +18,11 @@
 /* Opaque calibration results */
 struct iwl_calib_result {
struct list_head list;
-   size_t cmd_len;
-   struct iwl_calib_cmd cmd;
+   DECLARE_FLEX_ARRAY_ELEMENTS_COUNT(size_t, cmd_len);
+   union {
+   struct iwl_calib_cmd cmd;
+   DECLARE_FLEX_ARRAY_ELEMENTS(u8, data);
+   };
/* data follows */
 };
 
@@ -59,14 +62,10 @@ int iwl_send_calib_results(struct iwl_priv *priv)
 int iwl_calib_set(struct iwl_priv *priv,
  const struct iwl_calib_cmd *cmd, int len)
 {
-   struct iwl_calib_result *res, *tmp;
+   struct iwl_calib_result *res = NULL, *tmp;
 
-   res = kmalloc(sizeof(*res) + len - sizeof(struct iwl_calib_hdr),
- GFP_ATOMIC);
-   if (!res)
+   if (len < sizeof(*cmd) || mem_to_flex_dup(&res, cmd, len, GFP_ATOMIC))
return -ENOMEM;
-   memcpy(&res->hdr, cmd, len);
-   res->cmd_len = len;
 
list_for_each_entry(tmp, &priv->calib_results, list) {
if (tmp->cmd.hdr.op_code == res->cmd.hdr.op_code) {
-- 
2.32.0




[PATCH 20/32] ASoC: sigmadsp: Use mem_to_flex_dup() with struct sigmadsp_data

2022-05-03 Thread Kees Cook
As part of the work to perform bounds checking on all memcpy() uses,
replace the open-coded deserialization of bytes out of memory into a
trailing flexible array by using a flex_array.h helper to perform the
allocation, bounds checking, and copying.

Cc: Lars-Peter Clausen 
Cc: "Nuno Sá" 
Cc: Liam Girdwood 
Cc: Mark Brown 
Cc: Jaroslav Kysela 
Cc: Takashi Iwai 
Cc: alsa-de...@alsa-project.org
Signed-off-by: Kees Cook 
---
 sound/soc/codecs/sigmadsp.c | 11 ---
 1 file changed, 4 insertions(+), 7 deletions(-)

diff --git a/sound/soc/codecs/sigmadsp.c b/sound/soc/codecs/sigmadsp.c
index b992216aee55..648bdc73c5d9 100644
--- a/sound/soc/codecs/sigmadsp.c
+++ b/sound/soc/codecs/sigmadsp.c
@@ -42,8 +42,8 @@ struct sigmadsp_data {
struct list_head head;
uint32_t samplerates;
unsigned int addr;
-   unsigned int length;
-   uint8_t data[];
+   DECLARE_FLEX_ARRAY_ELEMENTS_COUNT(unsigned int, length);
+   DECLARE_FLEX_ARRAY_ELEMENTS(uint8_t, data);
 };
 
 struct sigma_fw_chunk {
@@ -263,7 +263,7 @@ static int sigma_fw_load_data(struct sigmadsp *sigmadsp,
const struct sigma_fw_chunk *chunk, unsigned int length)
 {
const struct sigma_fw_chunk_data *data_chunk;
-   struct sigmadsp_data *data;
+   struct sigmadsp_data *data = NULL;
 
if (length <= sizeof(*data_chunk))
return -EINVAL;
@@ -272,14 +272,11 @@ static int sigma_fw_load_data(struct sigmadsp *sigmadsp,
 
length -= sizeof(*data_chunk);
 
-   data = kzalloc(sizeof(*data) + length, GFP_KERNEL);
-   if (!data)
+   if (mem_to_flex_dup(&data, data_chunk->data, length, GFP_KERNEL))
return -ENOMEM;
 
data->addr = le16_to_cpu(data_chunk->addr);
-   data->length = length;
data->samplerates = le32_to_cpu(chunk->samplerates);
-   memcpy(data->data, data_chunk->data, length);
list_add_tail(&data->head, &sigmadsp->data_list);
 
return 0;
-- 
2.32.0




[PATCH 01/32] netlink: Avoid memcpy() across flexible array boundary

2022-05-03 Thread Kees Cook
In preparation for run-time memcpy() bounds checking, split the nlmsg
copying for error messages (which crosses a previously unspecified flexible
array boundary) in half. Avoids the future run-time warning:

memcpy: detected field-spanning write (size 32) of single field "&errmsg->msg" 
(size 16)

Creates an explicit flexible array at the end of nlmsghdr for the payload,
named "nlmsg_payload". There is no impact on UAPI; the sizeof(struct
nlmsghdr) does not change, but now the compiler can better reason about
where things are being copied.
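
A userspace sketch of the same idea, using hypothetical types (hdr_plain,
hdr_flex) and helpers (alloc_msg(), copy_msg()) rather than the real
nlmsghdr: the trailing flexible array does not change the header size, and
the copy is split into a struct assignment for the header followed by a
memcpy() of the payload into the flexible array:

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

struct hdr_plain {
	uint32_t len;	/* total message length, header included */
	uint16_t type;
	uint16_t flags;
};

struct hdr_flex {
	uint32_t len;
	uint16_t type;
	uint16_t flags;
	uint8_t payload[];	/* explicit flexible array for the body */
};

/* Adding the flexible array leaves the header size unchanged. */
_Static_assert(sizeof(struct hdr_plain) == sizeof(struct hdr_flex),
	       "flexible array must not change the header size");

/* Allocate a zeroed message with room for "body" payload bytes. */
static struct hdr_flex *alloc_msg(size_t body)
{
	struct hdr_flex *m = calloc(1, sizeof(*m) + body);

	if (m)
		m->len = (uint32_t)(sizeof(*m) + body);
	return m;
}

/*
 * Two-step copy: header by assignment, payload by memcpy(). "dst" must
 * have room for src->len bytes. Returns the payload bytes copied.
 */
static size_t copy_msg(struct hdr_flex *dst, const struct hdr_flex *src)
{
	size_t body = src->len > sizeof(*src) ? src->len - sizeof(*src) : 0;

	*dst = *src;
	if (body)
		memcpy(dst->payload, src->payload, body);
	return body;
}
```

Neither step writes past the field it names, which is what lets a
per-field bounds check reason about each copy separately.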

Fixed-by: Rasmus Villemoes 
Link: 
https://lore.kernel.org/lkml/d7251d92-150b-5346-6237-52afc154b...@rasmusvillemoes.dk
Cc: "David S. Miller" 
Cc: Jakub Kicinski 
Cc: Rich Felker 
Cc: Eric Dumazet 
Cc: net...@vger.kernel.org
Signed-off-by: Kees Cook 
---
 include/uapi/linux/netlink.h | 1 +
 net/netlink/af_netlink.c | 5 -
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/include/uapi/linux/netlink.h b/include/uapi/linux/netlink.h
index 855dffb4c1c3..47f9342d51bc 100644
--- a/include/uapi/linux/netlink.h
+++ b/include/uapi/linux/netlink.h
@@ -47,6 +47,7 @@ struct nlmsghdr {
__u16   nlmsg_flags;/* Additional flags */
__u32   nlmsg_seq;  /* Sequence number */
__u32   nlmsg_pid;  /* Sending process port ID */
+   __u8nlmsg_payload[];/* Contents of message */
 };
 
 /* Flags values */
diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
index 1b5a9c2e1c29..09346aee1022 100644
--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -2445,7 +2445,10 @@ void netlink_ack(struct sk_buff *in_skb, struct nlmsghdr 
*nlh, int err,
  NLMSG_ERROR, payload, flags);
errmsg = nlmsg_data(rep);
errmsg->error = err;
-   memcpy(&errmsg->msg, nlh, payload > sizeof(*errmsg) ? nlh->nlmsg_len : 
sizeof(*nlh));
+   errmsg->msg = *nlh;
+   if (payload > sizeof(*errmsg))
+   memcpy(errmsg->msg.nlmsg_payload, nlh->nlmsg_payload,
+  nlh->nlmsg_len - sizeof(*nlh));
 
if (nlk_has_extack && extack) {
if (extack->_msg) {
-- 
2.32.0




[PATCH 14/32] af_unix: Use mem_to_flex_dup() with struct unix_address

2022-05-03 Thread Kees Cook
As part of the work to perform bounds checking on all memcpy() uses,
replace the open-coded deserialization of bytes out of memory into a
trailing flexible array by using a flex_array.h helper to perform the
allocation, bounds checking, and copying.

Cc: "David S. Miller" 
Cc: Eric Dumazet 
Cc: Jakub Kicinski 
Cc: Paolo Abeni 
Cc: Kuniyuki Iwashima 
Cc: Alexei Starovoitov 
Cc: Cong Wang 
Cc: Al Viro 
Cc: net...@vger.kernel.org
Signed-off-by: Kees Cook 
---
 include/net/af_unix.h | 14 --
 net/unix/af_unix.c|  7 ++-
 2 files changed, 14 insertions(+), 7 deletions(-)

diff --git a/include/net/af_unix.h b/include/net/af_unix.h
index a7ef624ed726..422535b71295 100644
--- a/include/net/af_unix.h
+++ b/include/net/af_unix.h
@@ -25,8 +25,18 @@ extern struct hlist_head unix_socket_table[2 * 
UNIX_HASH_SIZE];
 
 struct unix_address {
refcount_t  refcnt;
-   int len;
-   struct sockaddr_un name[];
+   DECLARE_FLEX_ARRAY_ELEMENTS_COUNT(int, len);
+   union {
+   DECLARE_FLEX_ARRAY(struct sockaddr_un, name);
+   /*
+* While a struct is used to access the flexible
+* array, it may only be partially populated, and
+* "len" above is actually tracking bytes, not a
+* count of struct sockaddr_un elements, so also
+* include a byte-size flexible array.
+*/
+   DECLARE_FLEX_ARRAY_ELEMENTS(u8, bytes);
+   };
 };
 
 struct unix_skb_parms {
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index e1dd9e9c8452..8410cbc82ded 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -244,15 +244,12 @@ EXPORT_SYMBOL_GPL(unix_peer_get);
 static struct unix_address *unix_create_addr(struct sockaddr_un *sunaddr,
 int addr_len)
 {
-   struct unix_address *addr;
+   struct unix_address *addr = NULL;
 
-   addr = kmalloc(sizeof(*addr) + addr_len, GFP_KERNEL);
-   if (!addr)
+   if (mem_to_flex_dup(&addr, sunaddr, addr_len, GFP_KERNEL))
return NULL;
 
refcount_set(&addr->refcnt, 1);
-   addr->len = addr_len;
-   memcpy(addr->name, sunaddr, addr_len);
 
return addr;
 }
-- 
2.32.0




[PATCH 11/32] nl80211: Use mem_to_flex_dup() with struct cfg80211_cqm_config

2022-05-03 Thread Kees Cook
As part of the work to perform bounds checking on all memcpy() uses,
replace the open-coded deserialization of bytes out of memory into a
trailing flexible array by using a flex_array.h helper to perform the
allocation, bounds checking, and copying.

Cc: Johannes Berg 
Cc: "David S. Miller" 
Cc: Jakub Kicinski 
Cc: Paolo Abeni 
Cc: linux-wirel...@vger.kernel.org
Cc: net...@vger.kernel.org
Cc: Eric Dumazet 
Signed-off-by: Kees Cook 
---
 net/wireless/core.h|  4 ++--
 net/wireless/nl80211.c | 15 ---
 2 files changed, 6 insertions(+), 13 deletions(-)

diff --git a/net/wireless/core.h b/net/wireless/core.h
index 3a7dbd63d8c6..899d111993c6 100644
--- a/net/wireless/core.h
+++ b/net/wireless/core.h
@@ -295,8 +295,8 @@ struct cfg80211_beacon_registration {
 struct cfg80211_cqm_config {
u32 rssi_hyst;
s32 last_rssi_event_value;
-   int n_rssi_thresholds;
-   s32 rssi_thresholds[];
+   DECLARE_FLEX_ARRAY_ELEMENTS_COUNT(int, n_rssi_thresholds);
+   DECLARE_FLEX_ARRAY_ELEMENTS(s32, rssi_thresholds);
 };
 
 void cfg80211_destroy_ifaces(struct cfg80211_registered_device *rdev);
diff --git a/net/wireless/nl80211.c b/net/wireless/nl80211.c
index 945ed87d12e0..70df7132cce8 100644
--- a/net/wireless/nl80211.c
+++ b/net/wireless/nl80211.c
@@ -12096,21 +12096,14 @@ static int nl80211_set_cqm_rssi(struct genl_info 
*info,
 
wdev_lock(wdev);
if (n_thresholds) {
-   struct cfg80211_cqm_config *cqm_config;
+   struct cfg80211_cqm_config *cqm_config = NULL;
 
-   cqm_config = kzalloc(struct_size(cqm_config, rssi_thresholds,
-n_thresholds),
-GFP_KERNEL);
-   if (!cqm_config) {
-   err = -ENOMEM;
+   err = mem_to_flex_dup(&cqm_config, thresholds, n_thresholds,
+ GFP_KERNEL);
+   if (err)
goto unlock;
-   }
 
cqm_config->rssi_hyst = hysteresis;
-   cqm_config->n_rssi_thresholds = n_thresholds;
-   memcpy(cqm_config->rssi_thresholds, thresholds,
-  flex_array_size(cqm_config, rssi_thresholds,
-  n_thresholds));
 
wdev->cqm_config = cqm_config;
}
-- 
2.32.0




[PATCH 00/32] Introduce flexible array struct memcpy() helpers

2022-05-03 Thread Kees Cook
Hi,

This is the next phase of memcpy() buffer bounds checking[1], which
starts by adding a new set of helpers to address common code patterns
that result in memcpy() usage that can't be easily verified by the
compiler (i.e. dynamic bounds due to flexible arrays). The runtime WARN
from memcpy has been posted before, but now there's more context around
alternatives for refactoring false positives, etc.

The core of this series is patches 2 (flex_array.h), 3 (flex_array
KUnit), and 4 (runtime memcpy WARN). Patch 1 is a fix to land before 4
(and I can send separately), and the remaining patches are examples of what
the conversions look like for one of the helpers, mem_to_flex_dup(). These
will need to land via their respective trees, but they all depend on
patch 2, which I'm hoping to land in the coming merge window.

I'm happy to also point out that the conversions (patches 5+) are actually
a net reduction in lines of code:
 49 files changed, 154 insertions(+), 244 deletions(-)

Anyway, please let me know what you think. And apologies in advance
if this is spammy; the CC list got rather large due to the "treewide"
nature of the example conversions.

Also available here:
https://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git/log/?h=flexcpy/next-20220502

-Kees

[1] https://lwn.net/Articles/864521/

Kees Cook (32):
  netlink: Avoid memcpy() across flexible array boundary
  Introduce flexible array struct memcpy() helpers
  flex_array: Add Kunit tests
  fortify: Add run-time WARN for cross-field memcpy()
  brcmfmac: Use mem_to_flex_dup() with struct brcmf_fweh_queue_item
  iwlwifi: calib: Prepare to use mem_to_flex_dup()
  iwlwifi: calib: Use mem_to_flex_dup() with struct iwl_calib_result
  iwlwifi: mvm: Use mem_to_flex_dup() with struct ieee80211_key_conf
  p54: Use mem_to_flex_dup() with struct p54_cal_database
  wcn36xx: Use mem_to_flex_dup() with struct wcn36xx_hal_ind_msg
  nl80211: Use mem_to_flex_dup() with struct cfg80211_cqm_config
  cfg80211: Use mem_to_flex_dup() with struct cfg80211_bss_ies
  mac80211: Use mem_to_flex_dup() with several structs
  af_unix: Use mem_to_flex_dup() with struct unix_address
  802/garp: Use mem_to_flex_dup() with struct garp_attr
  802/mrp: Use mem_to_flex_dup() with struct mrp_attr
  net/flow_offload: Use mem_to_flex_dup() with struct flow_action_cookie
  firewire: Use __mem_to_flex_dup() with struct iso_interrupt_event
  afs: Use mem_to_flex_dup() with struct afs_acl
  ASoC: sigmadsp: Use mem_to_flex_dup() with struct sigmadsp_data
  soc: qcom: apr: Use mem_to_flex_dup() with struct apr_rx_buf
  atags_proc: Use mem_to_flex_dup() with struct buffer
  Bluetooth: Use mem_to_flex_dup() with struct
hci_op_configure_data_path
  IB/hfi1: Use mem_to_flex_dup() for struct tid_rb_node
  Drivers: hv: utils: Use mem_to_flex_dup() with struct cn_msg
  ima: Use mem_to_flex_dup() with struct modsig
  KEYS: Use mem_to_flex_dup() with struct user_key_payload
  selinux: Use mem_to_flex_dup() with xfrm and sidtab
  xtensa: Use mem_to_flex_dup() with struct property
  usb: gadget: f_fs: Use mem_to_flex_dup() with struct ffs_buffer
  xenbus: Use mem_to_flex_dup() with struct read_buffer
  esas2r: Use __mem_to_flex() with struct atto_ioctl

 arch/arm/kernel/atags_proc.c  |  12 +-
 arch/xtensa/platforms/xtfpga/setup.c  |   9 +-
 drivers/firewire/core-cdev.c  |   7 +-
 drivers/hv/hv_utils_transport.c   |   7 +-
 drivers/infiniband/hw/hfi1/user_exp_rcv.c |   7 +-
 drivers/infiniband/hw/hfi1/user_exp_rcv.h |   4 +-
 drivers/net/wireless/ath/wcn36xx/smd.c|   8 +-
 drivers/net/wireless/ath/wcn36xx/smd.h|   4 +-
 .../broadcom/brcm80211/brcmfmac/fweh.c|  11 +-
 drivers/net/wireless/intel/iwlwifi/dvm/agn.h  |   2 +-
 .../net/wireless/intel/iwlwifi/dvm/calib.c|  23 +-
 .../net/wireless/intel/iwlwifi/dvm/ucode.c|   8 +-
 drivers/net/wireless/intel/iwlwifi/mvm/sta.c  |   8 +-
 drivers/net/wireless/intersil/p54/eeprom.c|   8 +-
 drivers/net/wireless/intersil/p54/p54.h   |   4 +-
 drivers/scsi/esas2r/atioctl.h |   1 +
 drivers/scsi/esas2r/esas2r_ioctl.c|  11 +-
 drivers/soc/qcom/apr.c|  12 +-
 drivers/usb/gadget/function/f_fs.c|  11 +-
 drivers/xen/xenbus/xenbus_dev_frontend.c  |  12 +-
 fs/afs/internal.h |   4 +-
 fs/afs/xattr.c|   7 +-
 include/keys/user-type.h  |   4 +-
 include/linux/flex_array.h| 637 ++
 include/linux/fortify-string.h|  70 +-
 include/linux/of.h|   3 +-
 include/linux/string.h|   1 +
 include/net/af_unix.h |  14 +-
 include/net/bluetooth/hci.h   |   4 +-
 include/net/cfg80211.h|   4 +-
 include/net/flow_offload.h|   4 +-
 include/net/garp.h   

[PATCH 02/32] Introduce flexible array struct memcpy() helpers

2022-05-03 Thread Kees Cook
The compiler is not able to automatically perform bounds checking
on structures that end in flexible arrays: __builtin_object_size()
is compile-time only. Any possible run-time checks are currently
short-circuited because there isn't an obvious common way to figure out
the bounds of such a structure. C has no way (yet[1]) to signify which
struct member holds the number of allocated flexible array elements
(like exists in other languages).

As a result, the kernel (and C projects generally) need to manually
check the bounds, check the element size calculations, and perform sanity
checking on all the associated variable types in between (e.g. 260
cannot be stored in a u8). This is extremely fragile.
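
As a rough illustration of the manual checks in question (a sketch, not code
from this series): the helper names count_fits_u8() and flex_bytes() are
hypothetical, and __builtin_mul_overflow()/__builtin_add_overflow() are
GCC/Clang builtins:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True when "count" can be stored in a u8-sized counter without truncation. */
static bool count_fits_u8(size_t count)
{
	return count <= UINT8_MAX;
}

/*
 * Compute header_size + count * elem_size into *bytes, reporting
 * arithmetic overflow (false) instead of silently wrapping.
 */
static bool flex_bytes(size_t header_size, size_t elem_size, size_t count,
		       size_t *bytes)
{
	size_t array_bytes;

	if (__builtin_mul_overflow(count, elem_size, &array_bytes))
		return false;
	if (__builtin_add_overflow(header_size, array_bytes, bytes))
		return false;
	return true;
}
```

Every open-coded allocate-and-copy site has to get each of these checks
right on its own, which is the fragility described above.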

However, even if we could do all this through a magic memcpy(), the API
itself doesn't provide meaningful feedback, which forces the kernel into
an "all or nothing" approach: either do the copy or panic the system. Any
failure conditions should be _detectable_, with API users able to
gracefully recover.

To deal with these needs, create a set of helper functions that do the
work of memcpy() but perform the needed bounds checking based on the
arguments given: flex_cpy(). The common pattern of "allocate and copy"
is also included: flex_dup(). However, one of the most common patterns
is deserialization: allocating and populating flexible array members
from a byte array: mem_to_flex_dup(). And if the elements are already
allocated: mem_to_flex().

The concept of a "flexible array structure" is introduced, which is a
struct that has both a trailing flexible array member _and_ an element
count member. If a struct lacks the element count member, it's just a
blob: there are no bounds associated with it.

The most common style of flexible array struct in the kernel is a
"normal" one, where both the flex-array and element-count are present:

struct flex_array_struct_example {
... /* arbitrary members */
u16 part_count; /* count of elements stored in "parts" below. */
... /* arbitrary members */
u32 parts[];/* flexible array with elements of type u32. */
};

Next are "encapsulating flexible array structs", which are just structs
that contain a flexible array struct as their final member:

struct encapsulating_example {
... /* arbitrary members */
struct flex_array_struct_example fas;
};

There are also "split" flex array structs, which have the element-count
member in a separate struct level than the flex-array member:

struct split_example {
... /* arbitrary members */
u16 part_count; /* count of elements stored in "parts" below. */
... /* arbitrary members */
struct blob_example {
... /* other blob members */
u32 parts[];/* flexible array with elements of type u32. */
} blob;
};

To have the helpers deal with these arbitrary layouts, the names of the
flex-array and element-count members need to be specified with each use
(since C lacks the array-with-length syntax[1] so the compiler cannot
automatically determine them). However, for the "normal" (most common)
case, we can get close to "automatic" by explicitly declaring common
member aliases "__flex_array_elements", and "__flex_array_elements_count"
respectively. The regular helpers use these members, but extended helpers
exist to cover the other two code patterns.

For example, using the most complicated helper, mem_to_flex_dup():

/* Flexible array struct with members identified. */
struct something {
int mode;
DECLARE_FLEX_ARRAY_ELEMENTS_COUNT(int, how_many);
unsigned long flags;
DECLARE_FLEX_ARRAY_ELEMENTS(u32, value);
};
...
struct something *instance = NULL;
int rc;

rc = mem_to_flex_dup(&instance, byte_array, count, GFP_KERNEL);
if (rc)
return rc;

This will:

- validate "instance" is non-NULL (no NULL dereference).
- validate "*instance" is NULL (no memory allocation resource leak).
- validate that "count" is:
  - non-negative (no arithmetic underflow).
  - has a value that can be stored in the "how_many" type (no value
truncation).
- calculate the bytes needed to store "count"-many trailing u32 elements
  (no arithmetic overflow/underflow).
- calculate the bytes needed for a "struct something" with the above
  trailing elements (no arithmetic overflow/underflow).
- allocate the memory and check the result (no NULL dereference).
- initialize the non-flex-array portion of the struct to zero (no
  uninitialized memory usage).
- copy from "byte_array" into the flexible array elements.

If anything goes wrong, it returns a negative errno.
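
As a rough illustration of the checks listed above, here is a minimal
user-space sketch that performs them by hand. It is not the kernel helper:
"struct something_user", something_user_dup() and the specific errno values
are hypothetical names and choices made for this example, and calloc()
stands in for the kernel allocator.

```c
#include <errno.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical user-space stand-in for the "struct something" example. */
struct something_user {
	int mode;
	int how_many;		/* element count for value[] */
	unsigned long flags;
	uint32_t value[];	/* flexible array member */
};

/*
 * Illustrative only: copy "count" u32 elements from "src" into a newly
 * allocated struct, doing each validation step explicitly.
 * Returns 0 on success or a negative errno (values chosen for this sketch).
 */
static int something_user_dup(struct something_user **out,
			      const uint32_t *src, long count)
{
	struct something_user *p;
	size_t elems, bytes;

	if (!out)
		return -EINVAL;	/* destination pointer must exist */
	if (*out)
		return -EINVAL;	/* refuse to overwrite (leak) an allocation */
	if (count < 0 || count > INT_MAX)
		return -ERANGE;	/* must be non-negative and fit "how_many" */

	elems = (size_t)count;
	if (elems > (SIZE_MAX - sizeof(*p)) / sizeof(p->value[0]))
		return -ERANGE;	/* total size computation would overflow */
	bytes = sizeof(*p) + elems * sizeof(p->value[0]);

	p = calloc(1, bytes);	/* zeroes the non-array members as well */
	if (!p)
		return -ENOMEM;

	p->how_many = (int)count;
	if (elems)
		memcpy(p->value, src, elems * sizeof(p->value[0]));
	*out = p;
	return 0;
}
```

Every one of these steps has to be repeated, correctly, at each open-coded
call site, which is the duplication the helpers are meant to remove.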

With these helpers the kernel can move away from many of the open-coded
patterns of using memcpy() with a dynamically-sized destination buffer.

[1] https://www.open-std.org/jtc1/sc22/wg14/www/docs/n1990.htm

Cc: "Gustavo A. R. Silva" 
Cc: Keith Packard 
Cc: Francis 

Re: [PATCH 04/30] firmware: google: Convert regular spinlock into trylock on panic path

2022-05-03 Thread Evan Green
On Wed, Apr 27, 2022 at 3:51 PM Guilherme G. Piccoli
 wrote:
>
> Currently the gsmi driver registers a panic notifier as well as
> reboot and die notifiers. The callbacks registered are called in
> atomic and very limited context - for instance, panic disables
> preemption, local IRQs and all other CPUs that aren't running the
> current panic function.
>
> With that said, taking a spinlock in this scenario is a
> dangerous invitation for a deadlock scenario. So, we fix
> that in this commit by changing the regular spinlock with
> a trylock, which is a safer approach.
>
> Fixes: 74c5b31c6618 ("driver: Google EFI SMI")
> Cc: Ard Biesheuvel 
> Cc: David Gow 
> Cc: Evan Green 
> Cc: Julius Werner 
> Signed-off-by: Guilherme G. Piccoli 
> ---
>  drivers/firmware/google/gsmi.c | 6 +-
>  1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/firmware/google/gsmi.c b/drivers/firmware/google/gsmi.c
> index adaa492c3d2d..b01ed02e4a87 100644
> --- a/drivers/firmware/google/gsmi.c
> +++ b/drivers/firmware/google/gsmi.c
> @@ -629,7 +629,10 @@ static int gsmi_shutdown_reason(int reason)
> if (saved_reason & (1 << reason))
> return 0;
>
> -   spin_lock_irqsave(&gsmi_dev.lock, flags);
> +   if (!spin_trylock_irqsave(&gsmi_dev.lock, flags)) {
> +   rc = -EBUSY;
> +   goto out;
> +   }

gsmi_shutdown_reason() is a common function called in other scenarios
as well, like reboot and thermal trip, where it may still make sense
to wait to acquire a spinlock. Maybe we should add a parameter to
gsmi_shutdown_reason() so that you can get your change on panic, but
we don't convert other callbacks into try-fail scenarios causing us to
miss logs.
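
To make the parameter idea concrete, here is a small user-space sketch of
the shape I have in mind. It is not the gsmi code and does not use the
kernel spinlock API: the demo_* names are made up for illustration, and a
C11 atomic_flag stands in for the driver lock.

```c
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the driver lock; not the kernel spinlock API. */
static atomic_flag demo_lock = ATOMIC_FLAG_INIT;

/* Number of events "logged" so far, to make the effect observable. */
static int demo_log_count;

static bool demo_trylock(void)
{
	/* test_and_set returns the previous value: false means we took it. */
	return !atomic_flag_test_and_set_explicit(&demo_lock,
						  memory_order_acquire);
}

static void demo_lock_spin(void)
{
	while (!demo_trylock())
		;	/* busy-wait until the holder releases the lock */
}

static void demo_unlock(void)
{
	atomic_flag_clear_explicit(&demo_lock, memory_order_release);
}

/*
 * Only the panic caller passes in_panic = true and may bail out with
 * -EBUSY; reboot and thermal-trip callers keep waiting for the lock.
 */
static int demo_shutdown_reason(int reason, bool in_panic)
{
	if (in_panic) {
		if (!demo_trylock())
			return -EBUSY;	/* the holder may never run again */
	} else {
		demo_lock_spin();
	}

	(void)reason;		/* a real driver would record the reason */
	demo_log_count++;
	demo_unlock();
	return 0;
}
```

With this split, the reboot and thermal callbacks would pass false and the
panic notifier would pass true, so only the panic path can skip the log.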

Though thinking more about it, is this really a Good Change (TM)? The
spinlock itself already disables interrupts, meaning the only case
where this change makes a difference is if the panic happens from
within the function that grabbed the spinlock (in which case the
callback is also likely to panic), or in an NMI that panics within
that window. The downside of this change is that if one core was
politely working through an event with the lock held, and another core
panics, we now might lose the panic log, even though it probably would
have gone through fine assuming the other core has a chance to
continue.

-Evan



Re: [PATCH 04/30] firmware: google: Convert regular spinlock into trylock on panic path

2022-05-03 Thread Evan Green
Hi Guilherme,

On Tue, May 3, 2022 at 12:12 PM Guilherme G. Piccoli
 wrote:
>
> On 03/05/2022 15:03, Evan Green wrote:
> > [...]
> > gsmi_shutdown_reason() is a common function called in other scenarios
> > as well, like reboot and thermal trip, where it may still make sense
> > to wait to acquire a spinlock. Maybe we should add a parameter to
> > gsmi_shutdown_reason() so that you can get your change on panic, but
> > we don't convert other callbacks into try-fail scenarios causing us to
> > miss logs.
> >
>
> Hi Evan, thanks for your feedback, much appreciated!
> What I've done in other cases like this was to have a helper checking
> the spinlock in the panic notifier - if we can acquire that, go ahead
> but if not, bail out. For a proper example of an implementation, check
> patch 13 of the series:
> https://lore.kernel.org/lkml/20220427224924.592546-14-gpicc...@igalia.com/ .
>
> Do you agree with that, or prefer really a parameter in
> gsmi_shutdown_reason() ? I'll follow your choice =)

I'm fine with either, thanks for the link. Mostly I want to make sure
other paths to gsmi_shutdown_reason() aren't also converted to a try.

>
>
> > Though thinking more about it, is this really a Good Change (TM)? The
> > spinlock itself already disables interrupts, meaning the only case
> > where this change makes a difference is if the panic happens from
> > within the function that grabbed the spinlock (in which case the
> > callback is also likely to panic), or in an NMI that panics within
> > that window. The downside of this change is that if one core was
> > politely working through an event with the lock held, and another core
> > panics, we now might lose the panic log, even though it probably would
> > have gone through fine assuming the other core has a chance to
> > continue.
>
> My feeling is that this is a good change, indeed - a lot of places are
> getting changed like this, in this series.
>
> Reasoning: the problem with your example is that, by default, secondary
> CPUs are disabled in the panic path, through an IPI mechanism. IPIs take
> precedence and interrupt the work in these CPUs, effectively
> interrupting the "polite work" with the lock held heh

The IPI can only interrupt a CPU with irqs disabled if the IPI is an
NMI. I haven't looked before to see if we use NMI IPIs to corral the
other CPUs on panic. On x86, I grepped my way down to
native_stop_other_cpus(), which looks like it does a normal IPI, waits
1 second, then does an NMI IPI. So, if a secondary CPU has the lock
held, on x86 it has roughly 1s to finish what it's doing and re-enable
interrupts before smp_send_stop() brings the NMI hammer down. I think
this should be more than enough time for the secondary CPU to get out
and release the lock.

So then it makes sense to me that you're fixing cases where we
panicked with the lock held, or hung with the lock held. Given the 1
second grace period x86 gives us, I'm on board, as that helps mitigate
the risk that we bailed out early with the try and should have spun a
bit longer instead. Thanks.

-Evan

>
> Then, such CPU is put to sleep and we finally reach the panic notifier
> hereby discussed, in the main CPU. If the other CPU was shut-off *with
> the lock held*, it's never finishing such work, so the lock is never to
> be released. Conclusion: the spinlock can't be acquired, hence we broke
> the machine (which is already broken, given it's panic) in the path of
> this notifier.
> This should be really rare, but..possible. So I think we should protect
> against this scenario.
>
> We can grab others' feedback if you prefer, and of course you have the
> rights to refuse this change in the gsmi code, but from my
> point-of-view, I don't see any advantage in just assume the risk,
> specially since the change is very very simple.
>
> Cheers,
>
>
> Guilherme



OPNSense running in domU has no network connectivity on 5.15.29+

2022-05-03 Thread Colton Reeder
Hello,

I am running the FreeBSD-based router OS OPNSense in a domU. I
recently upgraded my dom0 kernel from 5.15.26 to 5.15.32 and with the
new kernel, OPNSense had no connectivity. I downloaded from kernel.org
5.15.26-32, built and installed each version and booted them
consecutively until I found the version that no longer worked. It
turned out to be 5.15.29.

I looked through the change log of 5.15.29 and found two commits for xen-netback

commit 2708ceb4e5cc84ef179bad25a2d7890573ef78be
commit fe39ab30dcc204e321c2670cc1cf55904af35d01

I reverted these changes (a revert of a revert, yes) in 5.15.32,
built and installed. Now the network works. I don't know enough to
know that's for sure the right fix. Maybe I have a config issue, I don't
know, but reverting that change fixes the problem. What should I do?
I was asked to provide xenstore -ls: https://pastebin.com/hHPWgrEy



[ovmf test] 170077: regressions - FAIL

2022-05-03 Thread osstest service owner
flight 170077 ovmf real [real]
http://logs.test-lab.xenproject.org/osstest/logs/170077/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-amd64-xsm               6 xen-build                fail REGR. vs. 168254
 build-amd64                   6 xen-build                fail REGR. vs. 168254
 build-i386                    6 xen-build                fail REGR. vs. 168254
 build-i386-xsm                6 xen-build                fail REGR. vs. 168254

Tests which did not succeed, but are not blocking:
 build-amd64-libvirt   1 build-check(1)   blocked  n/a
 build-i386-libvirt1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemuu-ovmf-amd64  1 build-check(1) blocked n/a
 test-amd64-i386-xl-qemuu-ovmf-amd64  1 build-check(1)  blocked n/a

version targeted for testing:
 ovmf 101f4c789221716585b972f2c2a22a85c078ef1d
baseline version:
 ovmf b1b89f9009f2390652e0061bd7b24fc40732bc70

Last test of basis   168254  2022-02-28 10:41:46 Z   64 days
Failing since        168258  2022-03-01 01:55:31 Z   64 days  792 attempts
Testing same since   170038  2022-05-03 10:12:47 Z    0 days   16 attempts


People who touched revisions under test:
  Abdul Lateef Attar 
  Abdul Lateef Attar via groups.io 
  Abner Chang 
  Akihiko Odaki 
  Anthony PERARD 
  Bo Chang Ke 
  Bob Feng 
  Chen Lin Z 
  Chen, Lin Z 
  Corvin Köhne 
  Dandan Bi 
  Dun Tan 
  Feng, Bob C 
  Gerd Hoffmann 
  Guo Dong 
  Guomin Jiang 
  Hao A Wu 
  Heng Luo 
  Hua Ma 
  Huang, Li-Xia 
  Jagadeesh Ujja 
  Jake Garver 
  Jake Garver via groups.io 
  Jason 
  Jason Lou 
  Jiewen Yao 
  Ke, Bo-ChangX 
  Ken Lautner 
  Kenneth Lautner 
  Kuo, Ted 
  Laszlo Ersek 
  Lean Sheng Tan 
  Leif Lindholm 
  Li, Yi1 
  Li, Zhihao 
  Liming Gao 
  Liu 
  Liu Yun 
  Liu Yun Y 
  Lixia Huang 
  Lou, Yun 
  Ma, Hua 
  Mara Sophie Grosch 
  Mara Sophie Grosch via groups.io 
  Matt DeVillier 
  Michael D Kinney 
  Michael Kubacki 
  Michael Kubacki 
  Min Xu 
  Oliver Steffen 
  Patrick Rudolph 
  Peter Grehan 
  Purna Chandra Rao Bandaru 
  Ray Ni 
  Rebecca Cran 
  Rebecca Cran 
  Sami Mujawar 
  Sean Rhodes 
  Sean Rhodes sean@starlabs.systems
  Sebastien Boeuf 
  Sunny Wang 
  Tan, Dun 
  Ted Kuo 
  Wenyi Xie 
  wenyi,xie via groups.io 
  Xiaolu.Jiang 
  Xie, Yuanhao 
  Yi Li 
  yi1 li 
  Yuanhao Xie 
  Zhihao Li 

jobs:
 build-amd64-xsm  fail
 build-i386-xsm   fail
 build-amd64  fail
 build-i386   fail
 build-amd64-libvirt  blocked 
 build-i386-libvirt   blocked 
 build-amd64-pvops    pass
 build-i386-pvops pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64 blocked 
 test-amd64-i386-xl-qemuu-ovmf-amd64  blocked 



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Not pushing.

(No revision log; it would be 5915 lines long.)



[ovmf test] 170073: regressions - FAIL

2022-05-03 Thread osstest service owner
flight 170073 ovmf real [real]
http://logs.test-lab.xenproject.org/osstest/logs/170073/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-amd64-xsm               6 xen-build                fail REGR. vs. 168254
 build-amd64                   6 xen-build                fail REGR. vs. 168254
 build-i386                    6 xen-build                fail REGR. vs. 168254
 build-i386-xsm                6 xen-build                fail REGR. vs. 168254

Tests which did not succeed, but are not blocking:
 build-amd64-libvirt   1 build-check(1)   blocked  n/a
 build-i386-libvirt1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemuu-ovmf-amd64  1 build-check(1) blocked n/a
 test-amd64-i386-xl-qemuu-ovmf-amd64  1 build-check(1)  blocked n/a

version targeted for testing:
 ovmf 101f4c789221716585b972f2c2a22a85c078ef1d
baseline version:
 ovmf b1b89f9009f2390652e0061bd7b24fc40732bc70

Last test of basis   168254  2022-02-28 10:41:46 Z   64 days
Failing since        168258  2022-03-01 01:55:31 Z   64 days  791 attempts
Testing same since   170038  2022-05-03 10:12:47 Z    0 days   15 attempts


People who touched revisions under test:
  Abdul Lateef Attar 
  Abdul Lateef Attar via groups.io 
  Abner Chang 
  Akihiko Odaki 
  Anthony PERARD 
  Bo Chang Ke 
  Bob Feng 
  Chen Lin Z 
  Chen, Lin Z 
  Corvin Köhne 
  Dandan Bi 
  Dun Tan 
  Feng, Bob C 
  Gerd Hoffmann 
  Guo Dong 
  Guomin Jiang 
  Hao A Wu 
  Heng Luo 
  Hua Ma 
  Huang, Li-Xia 
  Jagadeesh Ujja 
  Jake Garver 
  Jake Garver via groups.io 
  Jason 
  Jason Lou 
  Jiewen Yao 
  Ke, Bo-ChangX 
  Ken Lautner 
  Kenneth Lautner 
  Kuo, Ted 
  Laszlo Ersek 
  Lean Sheng Tan 
  Leif Lindholm 
  Li, Yi1 
  Li, Zhihao 
  Liming Gao 
  Liu 
  Liu Yun 
  Liu Yun Y 
  Lixia Huang 
  Lou, Yun 
  Ma, Hua 
  Mara Sophie Grosch 
  Mara Sophie Grosch via groups.io 
  Matt DeVillier 
  Michael D Kinney 
  Michael Kubacki 
  Michael Kubacki 
  Min Xu 
  Oliver Steffen 
  Patrick Rudolph 
  Peter Grehan 
  Purna Chandra Rao Bandaru 
  Ray Ni 
  Rebecca Cran 
  Rebecca Cran 
  Sami Mujawar 
  Sean Rhodes 
  Sean Rhodes sean@starlabs.systems
  Sebastien Boeuf 
  Sunny Wang 
  Tan, Dun 
  Ted Kuo 
  Wenyi Xie 
  wenyi,xie via groups.io 
  Xiaolu.Jiang 
  Xie, Yuanhao 
  Yi Li 
  yi1 li 
  Yuanhao Xie 
  Zhihao Li 

jobs:
 build-amd64-xsm  fail
 build-i386-xsm   fail
 build-amd64  fail
 build-i386   fail
 build-amd64-libvirt  blocked 
 build-i386-libvirt   blocked 
 build-amd64-pvops    pass
 build-i386-pvops pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64 blocked 
 test-amd64-i386-xl-qemuu-ovmf-amd64  blocked 



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Not pushing.

(No revision log; it would be 5915 lines long.)



[ovmf test] 170069: regressions - FAIL

2022-05-03 Thread osstest service owner
flight 170069 ovmf real [real]
http://logs.test-lab.xenproject.org/osstest/logs/170069/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-amd64-xsm               6 xen-build                fail REGR. vs. 168254
 build-amd64                   6 xen-build                fail REGR. vs. 168254
 build-i386                    6 xen-build                fail REGR. vs. 168254
 build-i386-xsm                6 xen-build                fail REGR. vs. 168254

Tests which did not succeed, but are not blocking:
 build-amd64-libvirt   1 build-check(1)   blocked  n/a
 build-i386-libvirt1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemuu-ovmf-amd64  1 build-check(1) blocked n/a
 test-amd64-i386-xl-qemuu-ovmf-amd64  1 build-check(1)  blocked n/a

version targeted for testing:
 ovmf 101f4c789221716585b972f2c2a22a85c078ef1d
baseline version:
 ovmf b1b89f9009f2390652e0061bd7b24fc40732bc70

Last test of basis   168254  2022-02-28 10:41:46 Z   64 days
Failing since        168258  2022-03-01 01:55:31 Z   64 days  790 attempts
Testing same since   170038  2022-05-03 10:12:47 Z    0 days   14 attempts


People who touched revisions under test:
  Abdul Lateef Attar 
  Abdul Lateef Attar via groups.io 
  Abner Chang 
  Akihiko Odaki 
  Anthony PERARD 
  Bo Chang Ke 
  Bob Feng 
  Chen Lin Z 
  Chen, Lin Z 
  Corvin Köhne 
  Dandan Bi 
  Dun Tan 
  Feng, Bob C 
  Gerd Hoffmann 
  Guo Dong 
  Guomin Jiang 
  Hao A Wu 
  Heng Luo 
  Hua Ma 
  Huang, Li-Xia 
  Jagadeesh Ujja 
  Jake Garver 
  Jake Garver via groups.io 
  Jason 
  Jason Lou 
  Jiewen Yao 
  Ke, Bo-ChangX 
  Ken Lautner 
  Kenneth Lautner 
  Kuo, Ted 
  Laszlo Ersek 
  Lean Sheng Tan 
  Leif Lindholm 
  Li, Yi1 
  Li, Zhihao 
  Liming Gao 
  Liu 
  Liu Yun 
  Liu Yun Y 
  Lixia Huang 
  Lou, Yun 
  Ma, Hua 
  Mara Sophie Grosch 
  Mara Sophie Grosch via groups.io 
  Matt DeVillier 
  Michael D Kinney 
  Michael Kubacki 
  Michael Kubacki 
  Min Xu 
  Oliver Steffen 
  Patrick Rudolph 
  Peter Grehan 
  Purna Chandra Rao Bandaru 
  Ray Ni 
  Rebecca Cran 
  Rebecca Cran 
  Sami Mujawar 
  Sean Rhodes 
  Sean Rhodes sean@starlabs.systems
  Sebastien Boeuf 
  Sunny Wang 
  Tan, Dun 
  Ted Kuo 
  Wenyi Xie 
  wenyi,xie via groups.io 
  Xiaolu.Jiang 
  Xie, Yuanhao 
  Yi Li 
  yi1 li 
  Yuanhao Xie 
  Zhihao Li 

jobs:
 build-amd64-xsm  fail
 build-i386-xsm   fail
 build-amd64  fail
 build-i386   fail
 build-amd64-libvirt  blocked 
 build-i386-libvirt   blocked 
 build-amd64-pvops    pass
 build-i386-pvops pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64 blocked 
 test-amd64-i386-xl-qemuu-ovmf-amd64  blocked 



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Not pushing.

(No revision log; it would be 5915 lines long.)



[linux-linus test] 170053: regressions - FAIL

2022-05-03 Thread osstest service owner
flight 170053 linux-linus real [real]
http://logs.test-lab.xenproject.org/osstest/logs/170053/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-examine-uefi  5 host-install       broken REGR. vs. 170001
 test-armhf-armhf-libvirt-raw  12 debian-di-install  fail   REGR. vs. 170001

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qemut-win7-amd64  19  guest-stop  fail  like 170001
 test-armhf-armhf-libvirt  16  saverestore-support-check  fail  like 170001
 test-amd64-amd64-qemuu-nested-amd  20  debian-hvm-install/l1/l2  fail  like 170001
 test-amd64-amd64-xl-qemuu-ws16-amd64  19  guest-stop  fail  like 170001
 test-amd64-amd64-xl-qemut-ws16-amd64  19  guest-stop  fail  like 170001
 test-amd64-amd64-xl-qemuu-win7-amd64  19  guest-stop  fail  like 170001
 test-armhf-armhf-libvirt-qcow2  15  saverestore-support-check  fail  like 170001
 test-amd64-amd64-libvirt  15  migrate-support-check  fail  never pass
 test-arm64-arm64-xl-seattle  15  migrate-support-check  fail  never pass
 test-arm64-arm64-xl-seattle  16  saverestore-support-check  fail  never pass
 test-amd64-amd64-libvirt-xsm  15  migrate-support-check  fail  never pass
 test-arm64-arm64-xl  15  migrate-support-check  fail  never pass
 test-arm64-arm64-xl  16  saverestore-support-check  fail  never pass
 test-arm64-arm64-xl-credit1  15  migrate-support-check  fail  never pass
 test-arm64-arm64-xl-credit1  16  saverestore-support-check  fail  never pass
 test-arm64-arm64-xl-credit2  15  migrate-support-check  fail  never pass
 test-arm64-arm64-xl-xsm  15  migrate-support-check  fail  never pass
 test-arm64-arm64-xl-credit2  16  saverestore-support-check  fail  never pass
 test-arm64-arm64-xl-xsm  16  saverestore-support-check  fail  never pass
 test-arm64-arm64-libvirt-xsm  15  migrate-support-check  fail  never pass
 test-arm64-arm64-libvirt-xsm  16  saverestore-support-check  fail  never pass
 test-arm64-arm64-xl-thunderx  15  migrate-support-check  fail  never pass
 test-arm64-arm64-xl-thunderx  16  saverestore-support-check  fail  never pass
 test-armhf-armhf-xl-arndale  15  migrate-support-check  fail  never pass
 test-armhf-armhf-xl-arndale  16  saverestore-support-check  fail  never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm  13  migrate-support-check  fail  never pass
 test-amd64-amd64-libvirt-qcow2  14  migrate-support-check  fail  never pass
 test-amd64-amd64-libvirt-raw  14  migrate-support-check  fail  never pass
 test-arm64-arm64-libvirt-raw  14  migrate-support-check  fail  never pass
 test-arm64-arm64-libvirt-raw  15  saverestore-support-check  fail  never pass
 test-armhf-armhf-xl-multivcpu  15  migrate-support-check  fail  never pass
 test-armhf-armhf-xl-multivcpu  16  saverestore-support-check  fail  never pass
 test-armhf-armhf-xl  15  migrate-support-check  fail  never pass
 test-armhf-armhf-xl  16  saverestore-support-check  fail  never pass
 test-arm64-arm64-xl-vhd  14  migrate-support-check  fail  never pass
 test-arm64-arm64-xl-vhd  15  saverestore-support-check  fail  never pass
 test-armhf-armhf-xl-rtds  15  migrate-support-check  fail  never pass
 test-armhf-armhf-xl-credit1  15  migrate-support-check  fail  never pass
 test-armhf-armhf-xl-credit1  16  saverestore-support-check  fail  never pass
 test-armhf-armhf-xl-rtds  16  saverestore-support-check  fail  never pass
 test-armhf-armhf-libvirt  15  migrate-support-check  fail  never pass
 test-armhf-armhf-xl-credit2  15  migrate-support-check  fail  never pass
 test-armhf-armhf-xl-credit2  16  saverestore-support-check  fail  never pass
 test-armhf-armhf-xl-cubietruck  15  migrate-support-check  fail  never pass
 test-armhf-armhf-xl-cubietruck  16  saverestore-support-check  fail  never pass
 test-armhf-armhf-libvirt-qcow2  14  migrate-support-check  fail  never pass
 test-armhf-armhf-xl-vhd  14  migrate-support-check  fail  never pass
 test-armhf-armhf-xl-vhd  15  saverestore-support-check  fail  never pass

version targeted for testing:
 linux                ef8e4d3c2ab1f47f63b6c7e578266b7e5cc9cd1b
baseline version:
 linux                9050ba3a61a4b5bd84c2cde092a100404f814f31

Last test of basis   170001  2022-05-02 19:09:59 Z    1 days
Testing same since   170053  2022-05-03 17:42:45 Z    0 days    1 attempts


People who touched revisions under test:
  Adam Wujek 
  Armin Wolf 
  Denis Pauk 
  Guenter Roeck 
  Ji-Ze Hong (Peter Hong) 
  Ji-Ze Hong (Peter Hong) 
  Linus Torvalds 
  Rob Herring 
  Zev Weiss 
  Zheyu Ma 

jobs:
 build-amd64-xsm  pass
 build-arm64-xsm  pass

[ovmf test] 170066: regressions - FAIL

2022-05-03 Thread osstest service owner
flight 170066 ovmf real [real]
http://logs.test-lab.xenproject.org/osstest/logs/170066/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-amd64-xsm               6 xen-build                fail REGR. vs. 168254
 build-amd64                   6 xen-build                fail REGR. vs. 168254
 build-i386                    6 xen-build                fail REGR. vs. 168254
 build-i386-xsm                6 xen-build                fail REGR. vs. 168254

Tests which did not succeed, but are not blocking:
 build-amd64-libvirt   1 build-check(1)   blocked  n/a
 build-i386-libvirt1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemuu-ovmf-amd64  1 build-check(1) blocked n/a
 test-amd64-i386-xl-qemuu-ovmf-amd64  1 build-check(1)  blocked n/a

version targeted for testing:
 ovmf 101f4c789221716585b972f2c2a22a85c078ef1d
baseline version:
 ovmf b1b89f9009f2390652e0061bd7b24fc40732bc70

Last test of basis   168254  2022-02-28 10:41:46 Z   64 days
Failing since        168258  2022-03-01 01:55:31 Z   63 days  789 attempts
Testing same since   170038  2022-05-03 10:12:47 Z    0 days   13 attempts


People who touched revisions under test:
  Abdul Lateef Attar 
  Abdul Lateef Attar via groups.io 
  Abner Chang 
  Akihiko Odaki 
  Anthony PERARD 
  Bo Chang Ke 
  Bob Feng 
  Chen Lin Z 
  Chen, Lin Z 
  Corvin Köhne 
  Dandan Bi 
  Dun Tan 
  Feng, Bob C 
  Gerd Hoffmann 
  Guo Dong 
  Guomin Jiang 
  Hao A Wu 
  Heng Luo 
  Hua Ma 
  Huang, Li-Xia 
  Jagadeesh Ujja 
  Jake Garver 
  Jake Garver via groups.io 
  Jason 
  Jason Lou 
  Jiewen Yao 
  Ke, Bo-ChangX 
  Ken Lautner 
  Kenneth Lautner 
  Kuo, Ted 
  Laszlo Ersek 
  Lean Sheng Tan 
  Leif Lindholm 
  Li, Yi1 
  Li, Zhihao 
  Liming Gao 
  Liu 
  Liu Yun 
  Liu Yun Y 
  Lixia Huang 
  Lou, Yun 
  Ma, Hua 
  Mara Sophie Grosch 
  Mara Sophie Grosch via groups.io 
  Matt DeVillier 
  Michael D Kinney 
  Michael Kubacki 
  Michael Kubacki 
  Min Xu 
  Oliver Steffen 
  Patrick Rudolph 
  Peter Grehan 
  Purna Chandra Rao Bandaru 
  Ray Ni 
  Rebecca Cran 
  Rebecca Cran 
  Sami Mujawar 
  Sean Rhodes 
  Sean Rhodes sean@starlabs.systems
  Sebastien Boeuf 
  Sunny Wang 
  Tan, Dun 
  Ted Kuo 
  Wenyi Xie 
  wenyi,xie via groups.io 
  Xiaolu.Jiang 
  Xie, Yuanhao 
  Yi Li 
  yi1 li 
  Yuanhao Xie 
  Zhihao Li 

jobs:
 build-amd64-xsm  fail
 build-i386-xsm   fail
 build-amd64  fail
 build-i386   fail
 build-amd64-libvirt  blocked 
 build-i386-libvirt   blocked 
 build-amd64-pvops    pass
 build-i386-pvops pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64 blocked 
 test-amd64-i386-xl-qemuu-ovmf-amd64  blocked 



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Not pushing.

(No revision log; it would be 5915 lines long.)



Re: [PATCH V1 4/6] dt-bindings: Add xen,dev-domid property description for xen-grant DMA ops

2022-05-03 Thread Rob Herring
On Tue, May 03, 2022 at 08:09:32PM +0300, Oleksandr wrote:
> 
> On 03.05.22 00:59, Rob Herring wrote:
> 
> Hello Rob
> 
> 
> > On Fri, Apr 22, 2022 at 07:51:01PM +0300, Oleksandr Tyshchenko wrote:
> > > From: Oleksandr Tyshchenko 
> > > 
> > > Introduce Xen specific binding for the virtualized device (e.g. virtio)
> > > to be used by Xen grant DMA-mapping layer in the subsequent commit.
> > > 
> > > This binding indicates that Xen grant mappings scheme needs to be
> > > enabled for the device which DT node contains that property and specifies
> > > the ID of Xen domain where the corresponding backend resides. The ID
> > > (domid) is used as an argument to the grant mapping APIs.
> > > 
> > > This is needed for the option to restrict memory access using Xen grant
> > > mappings to work which primary goal is to enable using virtio devices
> > > in Xen guests.
> > > 
> > > Signed-off-by: Oleksandr Tyshchenko 
> > > ---
> > > Changes RFC -> V1:
> > > - update commit subject/description and text in description
> > > - move to devicetree/bindings/arm/
> > > ---
> > >   .../devicetree/bindings/arm/xen,dev-domid.yaml | 37 ++
> > >   1 file changed, 37 insertions(+)
> > >   create mode 100644 Documentation/devicetree/bindings/arm/xen,dev-domid.yaml
> > > 
> > > diff --git a/Documentation/devicetree/bindings/arm/xen,dev-domid.yaml b/Documentation/devicetree/bindings/arm/xen,dev-domid.yaml
> > > new file mode 100644
> > > index ..ef0f747
> > > --- /dev/null
> > > +++ b/Documentation/devicetree/bindings/arm/xen,dev-domid.yaml
> > > @@ -0,0 +1,37 @@
> > > +# SPDX-License-Identifier: (GPL-2.0-only or BSD-2-Clause)
> > > +%YAML 1.2
> > > +---
> > > +$id: http://devicetree.org/schemas/arm/xen,dev-domid.yaml#
> > > +$schema: http://devicetree.org/meta-schemas/core.yaml#
> > > +
> > > +title: Xen specific binding for the virtualized device (e.g. virtio)
> > > +
> > > +maintainers:
> > > +  - Oleksandr Tyshchenko 
> > > +
> > > +select: true
> > Do we really need to support this property everywhere?
> 
> From my understanding - yes.
> 
> As, I think, any device node describing virtualized device in the guest
> device tree can have this property.  Initially (in the RFC series) the
> "solution to restrict memory access using Xen grant mappings" was
> virtio-specific.
> 
> Although the support of virtio is a primary target of this series, we
> decided to generalize this work and expand it to any device [1]. So the Xen
> grant mappings scheme (this property to be used for) can be theoretically
> used for any device emulated by the Xen backend.
> 
> 
> > > +
> > > +description:
> > > +  This binding indicates that Xen grant mappings scheme needs to be enabled
> > > +  for that device and specifies the ID of Xen domain where the corresponding
> > > +  device (backend) resides. This is needed for the option to restrict memory
> > > +  access using Xen grant mappings to work.
> > > +
> > > +properties:
> > > +  xen,dev-domid:
> > > +$ref: /schemas/types.yaml#/definitions/uint32
> > > +description:
> > > +  The domid (domain ID) of the domain where the device (backend) is running.
> > > +
> > > +additionalProperties: true
> > > +
> > > +examples:
> > > +  - |
> > > +virtio_block@3000 {
> > virtio@3000
> 
> ok, will change
> 
> 
> > 
> > > +compatible = "virtio,mmio";
> > > +reg = <0x3000 0x100>;
> > > +interrupts = <41>;
> > > +
> > > +/* The device is located in Xen domain with ID 1 */
> > > +xen,dev-domid = <1>;
> > This fails validation:
> > 
> > Documentation/devicetree/bindings/arm/xen,dev-domid.example.dtb: virtio_block@3000: xen,dev-domid: [[1]] is not of type 'object'
> >  From schema: /home/rob/proj/git/linux-dt/Documentation/devicetree/bindings/virtio/mmio.yaml
> 
> Thank you for pointing this out, my fault, I haven't "properly" checked this
> before. I think, we need to remove "compatible = "virtio,mmio"; here

Uhh, no. That just means the example is incomplete. You need to add this 
property or reference this schema from virtio/mmio.yaml.


> diff --git a/Documentation/devicetree/bindings/arm/xen,dev-domid.yaml
> b/Documentation/devicetree/bindings/arm/xen,dev-domid.yaml
> index 2daa8aa..d2f2140 100644
> --- a/Documentation/devicetree/bindings/arm/xen,dev-domid.yaml
> +++ b/Documentation/devicetree/bindings/arm/xen,dev-domid.yaml
> @@ -28,7 +28,7 @@ additionalProperties: true
>  examples:
>    - |
>  virtio_block@3000 {
> -    compatible = "virtio,mmio";
> +    /* ... */
>  reg = <0x3000 0x100>;
>  interrupts = <41>;
> 
> 
> 
> > 
> > The property has to be added to the virtio/mmio.yaml schema. If it is
> > not needed elsewhere, then *just* add the property there.
> 
> As I described above, the property is not virtio specific and can be used
> for any virtualized device for which Xen grant mappings 

Re: [PATCH V8 2/2] libxl: Introduce basic virtio-mmio support on Arm

2022-05-03 Thread Stefano Stabellini
On Tue, 3 May 2022, Oleksandr Tyshchenko wrote:
> From: Julien Grall 
> 
> This patch introduces helpers to allocate Virtio MMIO params
> (IRQ and memory region) and create specific device node in
> the Guest device-tree with allocated params. In order to deal
> with multiple Virtio devices, reserve corresponding ranges.
> For now, we reserve 1MB for memory regions and 10 SPIs.
> 
> As these helpers should be used for every Virtio device attached
> to the Guest, call them for Virtio disk(s).
> 
> Please note, with statically allocated Virtio IRQs there is
> a risk of a clash with a physical IRQs of passthrough devices.
> For the first version, it's fine, but we should consider allocating
> the Virtio IRQs automatically. Thankfully, we know in advance which
> IRQs will be used for passthrough to be able to choose non-clashed
> ones.
> 
> Signed-off-by: Julien Grall 
> Signed-off-by: Oleksandr Tyshchenko 

Reviewed-by: Stefano Stabellini 
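
For readers following along, the reservation scheme described in the commit
message can be sketched roughly as below. This is a stand-alone
illustration, not the libxl code: the demo_* names, the base address and
the first IRQ number are assumptions made for this example (only the 0x200
device size, the 1MB region and the 10 SPIs come from the patch text), and
returning 0 on exhaustion is likewise just a convention chosen here.

```c
#include <stdint.h>

/* Assumed values for illustration; see the note above. */
#define DEMO_VIRTIO_MMIO_BASE      UINT64_C(0x02000000)	/* assumed base */
#define DEMO_VIRTIO_MMIO_SIZE      UINT64_C(0x00100000)	/* 1MB reserved */
#define DEMO_VIRTIO_MMIO_DEV_SIZE  UINT64_C(0x200)	/* per-device window */
#define DEMO_VIRTIO_IRQ_FIRST      33u			/* assumed first SPI */
#define DEMO_VIRTIO_IRQ_COUNT      10u			/* 10 SPIs reserved */

/*
 * Hand out the next per-device MMIO window from the reserved range.
 * The caller owns the cursor, so no global state is needed.
 * Returns the window base, or 0 if the reserved range is exhausted.
 */
static uint64_t demo_alloc_mmio_base(uint64_t *next_base)
{
	uint64_t base = *next_base;

	if (base + DEMO_VIRTIO_MMIO_DEV_SIZE >
	    DEMO_VIRTIO_MMIO_BASE + DEMO_VIRTIO_MMIO_SIZE)
		return 0;

	*next_base = base + DEMO_VIRTIO_MMIO_DEV_SIZE;
	return base;
}

/* Same idea for interrupts: returns the IRQ, or 0 if none are left. */
static uint32_t demo_alloc_irq(uint32_t *next_irq)
{
	uint32_t irq = *next_irq;

	if (irq >= DEMO_VIRTIO_IRQ_FIRST + DEMO_VIRTIO_IRQ_COUNT)
		return 0;

	*next_irq = irq + 1;
	return irq;
}
```

Passing the cursors by pointer mirrors the V8 change of making
virtio_mmio_base/irq local to the caller instead of global.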


> ---
> Please note, this is a split/cleanup/hardening of Julien's PoC:
> "Add support for Guest IO forwarding to a device emulator"
> 
> Changes RFC -> V1:
>- was squashed with:
>  "[RFC PATCH V1 09/12] libxl: Handle virtio-mmio irq in more correct way"
>  "[RFC PATCH V1 11/12] libxl: Insert "dma-coherent" property into virtio-mmio device node"
>  "[RFC PATCH V1 12/12] libxl: Fix duplicate memory node in DT"
>- move VirtIO MMIO #define-s to xen/include/public/arch-arm.h
> 
> Changes V1 -> V2:
>- update the author of a patch
> 
> Changes V2 -> V3:
>- no changes
> 
> Changes V3 -> V4:
>- no changes
> 
> Changes V4 -> V5:
>- split the changes, change the order of the patches
>- drop an extra "virtio" configuration option
>- update patch description
>- use CONTAINER_OF instead of own implementation
>- reserve ranges for Virtio MMIO params and put them
>  in correct location
>- create helpers to allocate Virtio MMIO params, add
>  corresponding sanity-checks
>- add comment why MMIO size 0x200 is chosen
>- update debug print
>- drop Wei's T-b
> 
> Changes V5 -> V6:
>- rebase on current staging
> 
> Changes V6 -> V7:
>- rebase on current staging
>- add T-b and R-b tags
>- update according to the recent changes to
>  "libxl: Add support for Virtio disk configuration"
> 
> Changes V7 -> V8:
>- drop T-b and R-b tags
>- make virtio_mmio_base/irq global variables to be local in
>  libxl__arch_domain_prepare_config() and initialize them at
>  the beginning of the function, then rework alloc_virtio_mmio_base/irq()
>  to take a pointer to virtio_mmio_base/irq variables as an argument
>- update according to the recent changes to
>  "libxl: Add support for Virtio disk configuration"
> ---
>  tools/libs/light/libxl_arm.c  | 118 
> +-
>  xen/include/public/arch-arm.h |   7 +++
>  2 files changed, 123 insertions(+), 2 deletions(-)
> 
> diff --git a/tools/libs/light/libxl_arm.c b/tools/libs/light/libxl_arm.c
> index eef1de0..37403a2 100644
> --- a/tools/libs/light/libxl_arm.c
> +++ b/tools/libs/light/libxl_arm.c
> @@ -8,6 +8,46 @@
>  #include 
>  #include 
>  
> +/*
> + * There is no clear requirements for the total size of Virtio MMIO region.
> + * The size of control registers is 0x100 and device-specific configuration
> + * registers starts at the offset 0x100, however it's size depends on the device
> + * and the driver. Pick the biggest known size at the moment to cover most
> + * of the devices (also consider allowing the user to configure the size via
> + * config file for the one not conforming with the proposed value).
> + */
> +#define VIRTIO_MMIO_DEV_SIZE   xen_mk_ullong(0x200)
> +
> +static uint64_t alloc_virtio_mmio_base(libxl__gc *gc, uint64_t *virtio_mmio_base)
> +{
> +uint64_t base = *virtio_mmio_base;
> +
> +/* Make sure we have enough reserved resources */
> +if ((base + VIRTIO_MMIO_DEV_SIZE >
> +GUEST_VIRTIO_MMIO_BASE + GUEST_VIRTIO_MMIO_SIZE)) {
> +LOG(ERROR, "Ran out of reserved range for Virtio MMIO BASE 0x%"PRIx64"\n",
> +base);
> +return 0;
> +}
> +*virtio_mmio_base += VIRTIO_MMIO_DEV_SIZE;
> +
> +return base;
> +}
> +
> +static uint32_t alloc_virtio_mmio_irq(libxl__gc *gc, uint32_t *virtio_mmio_irq)
> +{
> +uint32_t irq = *virtio_mmio_irq;
> +
> +/* Make sure we have enough reserved resources */
> +if (irq > GUEST_VIRTIO_MMIO_SPI_LAST) {
> +LOG(ERROR, "Ran out of reserved range for Virtio MMIO IRQ %u\n", irq);
> +return 0;
> +}
> +(*virtio_mmio_irq)++;
> +
> +return irq;
> +}
> +
>  static const char *gicv_to_string(libxl_gic_version gic_version)
>  {
>  switch (gic_version) {
> @@ -26,8 +66,10 @@ int libxl__arch_domain_prepare_config(libxl__gc *gc,
>  {
>  uint32_t nr_spis = 0;
>  unsigned int i;
> -uint32_t vuart_irq;
> -bool vuart_enabled = false;
> +uint32_t 
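
For readers following the thread, here is a minimal standalone sketch of the allocation scheme the two helpers above implement: hand out fixed-size windows from a reserved range and refuse once the range is exhausted. The `DEMO_*` names and the base address are assumptions made for illustration; the real `GUEST_VIRTIO_MMIO_*` values live in xen/include/public/arch-arm.h and are not visible in the quoted part of the patch. Only the 1MB range size and the 0x200 per-device size come from the patch description.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical values for illustration only (not the real Xen constants). */
#define DEMO_MMIO_BASE      UINT64_C(0x02000000)
#define DEMO_MMIO_SIZE      UINT64_C(0x00100000)   /* 1MB reserved range */
#define DEMO_MMIO_DEV_SIZE  UINT64_C(0x200)        /* per-device window */

/* Hand out the next device window, or 0 if the reserved range is exhausted. */
static uint64_t demo_alloc_mmio_base(uint64_t *next)
{
    uint64_t base = *next;

    /* Same shape of check as in the patch: the window must fit in the range. */
    if (base + DEMO_MMIO_DEV_SIZE > DEMO_MMIO_BASE + DEMO_MMIO_SIZE) {
        fprintf(stderr, "out of reserved MMIO range at 0x%" PRIx64 "\n", base);
        return 0;
    }
    *next += DEMO_MMIO_DEV_SIZE;

    return base;
}
```

With these numbers the range holds 0x100000 / 0x200 = 2048 windows. The cursor is passed by pointer, mirroring the V8 change that made the base/irq state local to the caller instead of global.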

[ovmf test] 170062: regressions - FAIL

2022-05-03 Thread osstest service owner
flight 170062 ovmf real [real]
http://logs.test-lab.xenproject.org/osstest/logs/170062/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-amd64-xsm                       6 xen-build       fail REGR. vs. 168254
 build-amd64                           6 xen-build       fail REGR. vs. 168254
 build-i386                            6 xen-build       fail REGR. vs. 168254
 build-i386-xsm                        6 xen-build       fail REGR. vs. 168254

Tests which did not succeed, but are not blocking:
 build-amd64-libvirt                   1 build-check(1)  blocked n/a
 build-i386-libvirt                    1 build-check(1)  blocked n/a
 test-amd64-amd64-xl-qemuu-ovmf-amd64  1 build-check(1)  blocked n/a
 test-amd64-i386-xl-qemuu-ovmf-amd64   1 build-check(1)  blocked n/a

version targeted for testing:
 ovmf 101f4c789221716585b972f2c2a22a85c078ef1d
baseline version:
 ovmf b1b89f9009f2390652e0061bd7b24fc40732bc70

Last test of basis   168254  2022-02-28 10:41:46 Z   64 days
Failing since        168258  2022-03-01 01:55:31 Z   63 days  788 attempts
Testing same since   170038  2022-05-03 10:12:47 Z    0 days   12 attempts


People who touched revisions under test:
  Abdul Lateef Attar 
  Abdul Lateef Attar via groups.io 
  Abner Chang 
  Akihiko Odaki 
  Anthony PERARD 
  Bo Chang Ke 
  Bob Feng 
  Chen Lin Z 
  Chen, Lin Z 
  Corvin Köhne 
  Dandan Bi 
  Dun Tan 
  Feng, Bob C 
  Gerd Hoffmann 
  Guo Dong 
  Guomin Jiang 
  Hao A Wu 
  Heng Luo 
  Hua Ma 
  Huang, Li-Xia 
  Jagadeesh Ujja 
  Jake Garver 
  Jake Garver via groups.io 
  Jason 
  Jason Lou 
  Jiewen Yao 
  Ke, Bo-ChangX 
  Ken Lautner 
  Kenneth Lautner 
  Kuo, Ted 
  Laszlo Ersek 
  Lean Sheng Tan 
  Leif Lindholm 
  Li, Yi1 
  Li, Zhihao 
  Liming Gao 
  Liu 
  Liu Yun 
  Liu Yun Y 
  Lixia Huang 
  Lou, Yun 
  Ma, Hua 
  Mara Sophie Grosch 
  Mara Sophie Grosch via groups.io 
  Matt DeVillier 
  Michael D Kinney 
  Michael Kubacki 
  Michael Kubacki 
  Min Xu 
  Oliver Steffen 
  Patrick Rudolph 
  Peter Grehan 
  Purna Chandra Rao Bandaru 
  Ray Ni 
  Rebecca Cran 
  Rebecca Cran 
  Sami Mujawar 
  Sean Rhodes 
  Sean Rhodes sean@starlabs.systems
  Sebastien Boeuf 
  Sunny Wang 
  Tan, Dun 
  Ted Kuo 
  Wenyi Xie 
  wenyi,xie via groups.io 
  Xiaolu.Jiang 
  Xie, Yuanhao 
  Yi Li 
  yi1 li 
  Yuanhao Xie 
  Zhihao Li 

jobs:
 build-amd64-xsm                       fail
 build-i386-xsm                        fail
 build-amd64                           fail
 build-i386                            fail
 build-amd64-libvirt                   blocked
 build-i386-libvirt                    blocked
 build-amd64-pvops                     pass
 build-i386-pvops                      pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64  blocked
 test-amd64-i386-xl-qemuu-ovmf-amd64   blocked



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Not pushing.

(No revision log; it would be 5915 lines long.)



[qemu-mainline test] 170051: tolerable FAIL - PUSHED

2022-05-03 Thread osstest service owner
flight 170051 qemu-mainline real [real]
flight 170057 qemu-mainline real-retest [real]
http://logs.test-lab.xenproject.org/osstest/logs/170051/
http://logs.test-lab.xenproject.org/osstest/logs/170057/

Failures :-/ but no regressions.

Tests which are failing intermittently (not blocking):
 test-amd64-i386-libvirt 8 xen-boot fail pass in 170057-retest
 test-amd64-i386-libvirt-raw 16 guest-saverestore fail pass in 170057-retest

Tests which did not succeed, but are not blocking:
 test-amd64-i386-libvirt 15 migrate-support-check fail in 170057 never pass
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stop fail like 169967
 test-armhf-armhf-libvirt 16 saverestore-support-check fail like 169967
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 169967
 test-amd64-i386-xl-qemuu-win7-amd64 19 guest-stop fail like 169967
 test-armhf-armhf-libvirt-raw 15 saverestore-support-check fail like 169967
 test-armhf-armhf-libvirt-qcow2 15 saverestore-support-check fail like 169967
 test-amd64-i386-xl-qemuu-ws16-amd64 19 guest-stop fail like 169967
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stop fail like 169967
 test-arm64-arm64-xl-seattle 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-seattle 16 saverestore-support-check fail never pass
 test-amd64-amd64-libvirt-xsm 15 migrate-support-check fail never pass
 test-amd64-i386-libvirt-xsm 15 migrate-support-check fail never pass
 test-amd64-i386-xl-pvshim 14 guest-start fail never pass
 test-amd64-amd64-libvirt 15 migrate-support-check fail never pass
 test-arm64-arm64-xl 15 migrate-support-check fail never pass
 test-arm64-arm64-xl 16 saverestore-support-check fail never pass
 test-arm64-arm64-xl-thunderx 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-xsm 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-thunderx 16 saverestore-support-check fail never pass
 test-arm64-arm64-xl-xsm 16 saverestore-support-check fail never pass
 test-arm64-arm64-xl-credit1 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-credit1 16 saverestore-support-check fail never pass
 test-arm64-arm64-libvirt-xsm 15 migrate-support-check fail never pass
 test-arm64-arm64-libvirt-xsm 16 saverestore-support-check fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check fail never pass
 test-amd64-amd64-libvirt-vhd 14 migrate-support-check fail never pass
 test-amd64-i386-libvirt-raw 14 migrate-support-check fail never pass
 test-armhf-armhf-xl-multivcpu 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-multivcpu 16 saverestore-support-check fail never pass
 test-arm64-arm64-libvirt-raw 14 migrate-support-check fail never pass
 test-arm64-arm64-libvirt-raw 15 saverestore-support-check fail never pass
 test-arm64-arm64-xl-vhd 14 migrate-support-check fail never pass
 test-arm64-arm64-xl-vhd 15 saverestore-support-check fail never pass
 test-armhf-armhf-libvirt 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-credit2 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-credit2 16 saverestore-support-check fail never pass
 test-armhf-armhf-xl-cubietruck 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-cubietruck 16 saverestore-support-check fail never pass
 test-armhf-armhf-xl-credit1 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-credit1 16 saverestore-support-check fail never pass
 test-arm64-arm64-xl-credit2 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-credit2 16 saverestore-support-check fail never pass
 test-armhf-armhf-xl-rtds 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-rtds 16 saverestore-support-check fail never pass
 test-armhf-armhf-libvirt-raw 14 migrate-support-check fail never pass
 test-armhf-armhf-libvirt-qcow2 14 migrate-support-check fail never pass
 test-armhf-armhf-xl-arndale 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-arndale 16 saverestore-support-check fail never pass
 test-armhf-armhf-xl-vhd 14 migrate-support-check fail never pass
 test-armhf-armhf-xl-vhd 15 saverestore-support-check fail never pass
 test-armhf-armhf-xl 15 migrate-support-check fail never pass
 test-armhf-armhf-xl 16 saverestore-support-check fail never pass

version targeted for testing:
 qemuu 5f14cfe187e2fc3c71f4536b2021b8118d224239
baseline version:
 qemuu

[ovmf test] 170059: regressions - FAIL

2022-05-03 Thread osstest service owner
flight 170059 ovmf real [real]
http://logs.test-lab.xenproject.org/osstest/logs/170059/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-amd64-xsm                       6 xen-build       fail REGR. vs. 168254
 build-amd64                           6 xen-build       fail REGR. vs. 168254
 build-i386                            6 xen-build       fail REGR. vs. 168254
 build-i386-xsm                        6 xen-build       fail REGR. vs. 168254

Tests which did not succeed, but are not blocking:
 build-amd64-libvirt                   1 build-check(1)  blocked n/a
 build-i386-libvirt                    1 build-check(1)  blocked n/a
 test-amd64-amd64-xl-qemuu-ovmf-amd64  1 build-check(1)  blocked n/a
 test-amd64-i386-xl-qemuu-ovmf-amd64   1 build-check(1)  blocked n/a

version targeted for testing:
 ovmf 101f4c789221716585b972f2c2a22a85c078ef1d
baseline version:
 ovmf b1b89f9009f2390652e0061bd7b24fc40732bc70

Last test of basis   168254  2022-02-28 10:41:46 Z   64 days
Failing since        168258  2022-03-01 01:55:31 Z   63 days  787 attempts
Testing same since   170038  2022-05-03 10:12:47 Z    0 days   11 attempts


People who touched revisions under test:
  Abdul Lateef Attar 
  Abdul Lateef Attar via groups.io 
  Abner Chang 
  Akihiko Odaki 
  Anthony PERARD 
  Bo Chang Ke 
  Bob Feng 
  Chen Lin Z 
  Chen, Lin Z 
  Corvin Köhne 
  Dandan Bi 
  Dun Tan 
  Feng, Bob C 
  Gerd Hoffmann 
  Guo Dong 
  Guomin Jiang 
  Hao A Wu 
  Heng Luo 
  Hua Ma 
  Huang, Li-Xia 
  Jagadeesh Ujja 
  Jake Garver 
  Jake Garver via groups.io 
  Jason 
  Jason Lou 
  Jiewen Yao 
  Ke, Bo-ChangX 
  Ken Lautner 
  Kenneth Lautner 
  Kuo, Ted 
  Laszlo Ersek 
  Lean Sheng Tan 
  Leif Lindholm 
  Li, Yi1 
  Li, Zhihao 
  Liming Gao 
  Liu 
  Liu Yun 
  Liu Yun Y 
  Lixia Huang 
  Lou, Yun 
  Ma, Hua 
  Mara Sophie Grosch 
  Mara Sophie Grosch via groups.io 
  Matt DeVillier 
  Michael D Kinney 
  Michael Kubacki 
  Michael Kubacki 
  Min Xu 
  Oliver Steffen 
  Patrick Rudolph 
  Peter Grehan 
  Purna Chandra Rao Bandaru 
  Ray Ni 
  Rebecca Cran 
  Rebecca Cran 
  Sami Mujawar 
  Sean Rhodes 
  Sean Rhodes sean@starlabs.systems
  Sebastien Boeuf 
  Sunny Wang 
  Tan, Dun 
  Ted Kuo 
  Wenyi Xie 
  wenyi,xie via groups.io 
  Xiaolu.Jiang 
  Xie, Yuanhao 
  Yi Li 
  yi1 li 
  Yuanhao Xie 
  Zhihao Li 

jobs:
 build-amd64-xsm                       fail
 build-i386-xsm                        fail
 build-amd64                           fail
 build-i386                            fail
 build-amd64-libvirt                   blocked
 build-i386-libvirt                    blocked
 build-amd64-pvops                     pass
 build-i386-pvops                      pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64  blocked
 test-amd64-i386-xl-qemuu-ovmf-amd64   blocked



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Not pushing.

(No revision log; it would be 5915 lines long.)



[ovmf test] 170055: regressions - FAIL

2022-05-03 Thread osstest service owner
flight 170055 ovmf real [real]
http://logs.test-lab.xenproject.org/osstest/logs/170055/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-amd64-xsm                       6 xen-build       fail REGR. vs. 168254
 build-amd64                           6 xen-build       fail REGR. vs. 168254
 build-i386                            6 xen-build       fail REGR. vs. 168254
 build-i386-xsm                        6 xen-build       fail REGR. vs. 168254

Tests which did not succeed, but are not blocking:
 build-amd64-libvirt                   1 build-check(1)  blocked n/a
 build-i386-libvirt                    1 build-check(1)  blocked n/a
 test-amd64-amd64-xl-qemuu-ovmf-amd64  1 build-check(1)  blocked n/a
 test-amd64-i386-xl-qemuu-ovmf-amd64   1 build-check(1)  blocked n/a

version targeted for testing:
 ovmf 101f4c789221716585b972f2c2a22a85c078ef1d
baseline version:
 ovmf b1b89f9009f2390652e0061bd7b24fc40732bc70

Last test of basis   168254  2022-02-28 10:41:46 Z   64 days
Failing since        168258  2022-03-01 01:55:31 Z   63 days  786 attempts
Testing same since   170038  2022-05-03 10:12:47 Z    0 days   10 attempts


People who touched revisions under test:
  Abdul Lateef Attar 
  Abdul Lateef Attar via groups.io 
  Abner Chang 
  Akihiko Odaki 
  Anthony PERARD 
  Bo Chang Ke 
  Bob Feng 
  Chen Lin Z 
  Chen, Lin Z 
  Corvin Köhne 
  Dandan Bi 
  Dun Tan 
  Feng, Bob C 
  Gerd Hoffmann 
  Guo Dong 
  Guomin Jiang 
  Hao A Wu 
  Heng Luo 
  Hua Ma 
  Huang, Li-Xia 
  Jagadeesh Ujja 
  Jake Garver 
  Jake Garver via groups.io 
  Jason 
  Jason Lou 
  Jiewen Yao 
  Ke, Bo-ChangX 
  Ken Lautner 
  Kenneth Lautner 
  Kuo, Ted 
  Laszlo Ersek 
  Lean Sheng Tan 
  Leif Lindholm 
  Li, Yi1 
  Li, Zhihao 
  Liming Gao 
  Liu 
  Liu Yun 
  Liu Yun Y 
  Lixia Huang 
  Lou, Yun 
  Ma, Hua 
  Mara Sophie Grosch 
  Mara Sophie Grosch via groups.io 
  Matt DeVillier 
  Michael D Kinney 
  Michael Kubacki 
  Michael Kubacki 
  Min Xu 
  Oliver Steffen 
  Patrick Rudolph 
  Peter Grehan 
  Purna Chandra Rao Bandaru 
  Ray Ni 
  Rebecca Cran 
  Rebecca Cran 
  Sami Mujawar 
  Sean Rhodes 
  Sean Rhodes sean@starlabs.systems
  Sebastien Boeuf 
  Sunny Wang 
  Tan, Dun 
  Ted Kuo 
  Wenyi Xie 
  wenyi,xie via groups.io 
  Xiaolu.Jiang 
  Xie, Yuanhao 
  Yi Li 
  yi1 li 
  Yuanhao Xie 
  Zhihao Li 

jobs:
 build-amd64-xsm                       fail
 build-i386-xsm                        fail
 build-amd64                           fail
 build-i386                            fail
 build-amd64-libvirt                   blocked
 build-i386-libvirt                    blocked
 build-amd64-pvops                     pass
 build-i386-pvops                      pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64  blocked
 test-amd64-i386-xl-qemuu-ovmf-amd64   blocked



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Not pushing.

(No revision log; it would be 5915 lines long.)



Re: [PATCH 04/30] firmware: google: Convert regular spinlock into trylock on panic path

2022-05-03 Thread Guilherme G. Piccoli
On 03/05/2022 15:03, Evan Green wrote:
> [...]
> gsmi_shutdown_reason() is a common function called in other scenarios
> as well, like reboot and thermal trip, where it may still make sense
> to wait to acquire a spinlock. Maybe we should add a parameter to
> gsmi_shutdown_reason() so that you can get your change on panic, but
> we don't convert other callbacks into try-fail scenarios causing us to
> miss logs.
> 

Hi Evan, thanks for your feedback, much appreciated!
What I've done in other cases like this was to have a helper checking
the spinlock in the panic notifier - if we can acquire that, go ahead
but if not, bail out. For a proper example of an implementation, check
patch 13 of the series:
https://lore.kernel.org/lkml/20220427224924.592546-14-gpicc...@igalia.com/ .

Do you agree with that, or would you really prefer a parameter in
gsmi_shutdown_reason()? I'll follow your choice =)


> Though thinking more about it, is this really a Good Change (TM)? The
> spinlock itself already disables interrupts, meaning the only case
> where this change makes a difference is if the panic happens from
> within the function that grabbed the spinlock (in which case the
> callback is also likely to panic), or in an NMI that panics within
> that window. The downside of this change is that if one core was
> politely working through an event with the lock held, and another core
> panics, we now might lose the panic log, even though it probably would
> have gone through fine assuming the other core has a chance to
> continue.

My feeling is that this is a good change, indeed - a lot of places are
getting changed like this, in this series.

Reasoning: the problem with your example is that, by default, secondary
CPUs are disabled in the panic path, through an IPI mechanism. IPIs take
precedence and interrupt the work in these CPUs, effectively
interrupting the "polite work" with the lock held heh

Then, such CPU is put to sleep and we finally reach the panic notifier
hereby discussed, in the main CPU. If the other CPU was shut-off *with
the lock held*, it's never finishing such work, so the lock is never to
be released. Conclusion: the spinlock can't be acquired, hence we broke
the machine (which is already broken, given it's panic) in the path of
this notifier.
This should be really rare, but..possible. So I think we should protect
against this scenario.
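
To make the scenario above concrete, here is a small userspace sketch of the "try the lock, bail out if it is held" pattern. It is only an illustration: it uses a C11 `atomic_flag` as a stand-in for a kernel spinlock, and `demo_lock`, `demo_panic_log()` and the log counter are hypothetical names, not the gsmi code or the helper from patch 13.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Userspace stand-in for a kernel spinlock; illustration only. */
struct demo_lock {
    atomic_flag held;
};

static bool demo_trylock(struct demo_lock *l)
{
    /* test_and_set returns the previous value: false means we got the lock. */
    return !atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire);
}

static void demo_unlock(struct demo_lock *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}

/* Hypothetical panic-time callback: returns true if the event was logged. */
static bool demo_panic_log(struct demo_lock *l, int *log_count)
{
    /* The holder may be a CPU that was already stopped, so never spin here. */
    if (!demo_trylock(l))
        return false;

    (*log_count)++;            /* stands in for the firmware logging call */
    demo_unlock(l);

    return true;
}
```

The trade-off Evan raised is visible here: when the lock is held, the log entry is simply skipped, which is the price paid for never hanging in the panic path.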

We can grab others' feedback if you prefer, and of course you have the
right to refuse this change in the gsmi code, but from my
point of view, I don't see any advantage in just assuming the risk,
especially since the change is very, very simple.

Cheers,


Guilherme



[ovmf test] 170054: regressions - FAIL

2022-05-03 Thread osstest service owner
flight 170054 ovmf real [real]
http://logs.test-lab.xenproject.org/osstest/logs/170054/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-amd64-xsm                       6 xen-build       fail REGR. vs. 168254
 build-amd64                           6 xen-build       fail REGR. vs. 168254
 build-i386                            6 xen-build       fail REGR. vs. 168254
 build-i386-xsm                        6 xen-build       fail REGR. vs. 168254

Tests which did not succeed, but are not blocking:
 build-amd64-libvirt                   1 build-check(1)  blocked n/a
 build-i386-libvirt                    1 build-check(1)  blocked n/a
 test-amd64-amd64-xl-qemuu-ovmf-amd64  1 build-check(1)  blocked n/a
 test-amd64-i386-xl-qemuu-ovmf-amd64   1 build-check(1)  blocked n/a

version targeted for testing:
 ovmf 101f4c789221716585b972f2c2a22a85c078ef1d
baseline version:
 ovmf b1b89f9009f2390652e0061bd7b24fc40732bc70

Last test of basis   168254  2022-02-28 10:41:46 Z   64 days
Failing since        168258  2022-03-01 01:55:31 Z   63 days  785 attempts
Testing same since   170038  2022-05-03 10:12:47 Z    0 days    9 attempts


People who touched revisions under test:
  Abdul Lateef Attar 
  Abdul Lateef Attar via groups.io 
  Abner Chang 
  Akihiko Odaki 
  Anthony PERARD 
  Bo Chang Ke 
  Bob Feng 
  Chen Lin Z 
  Chen, Lin Z 
  Corvin Köhne 
  Dandan Bi 
  Dun Tan 
  Feng, Bob C 
  Gerd Hoffmann 
  Guo Dong 
  Guomin Jiang 
  Hao A Wu 
  Heng Luo 
  Hua Ma 
  Huang, Li-Xia 
  Jagadeesh Ujja 
  Jake Garver 
  Jake Garver via groups.io 
  Jason 
  Jason Lou 
  Jiewen Yao 
  Ke, Bo-ChangX 
  Ken Lautner 
  Kenneth Lautner 
  Kuo, Ted 
  Laszlo Ersek 
  Lean Sheng Tan 
  Leif Lindholm 
  Li, Yi1 
  Li, Zhihao 
  Liming Gao 
  Liu 
  Liu Yun 
  Liu Yun Y 
  Lixia Huang 
  Lou, Yun 
  Ma, Hua 
  Mara Sophie Grosch 
  Mara Sophie Grosch via groups.io 
  Matt DeVillier 
  Michael D Kinney 
  Michael Kubacki 
  Michael Kubacki 
  Min Xu 
  Oliver Steffen 
  Patrick Rudolph 
  Peter Grehan 
  Purna Chandra Rao Bandaru 
  Ray Ni 
  Rebecca Cran 
  Rebecca Cran 
  Sami Mujawar 
  Sean Rhodes 
  Sean Rhodes sean@starlabs.systems
  Sebastien Boeuf 
  Sunny Wang 
  Tan, Dun 
  Ted Kuo 
  Wenyi Xie 
  wenyi,xie via groups.io 
  Xiaolu.Jiang 
  Xie, Yuanhao 
  Yi Li 
  yi1 li 
  Yuanhao Xie 
  Zhihao Li 

jobs:
 build-amd64-xsm                       fail
 build-i386-xsm                        fail
 build-amd64                           fail
 build-i386                            fail
 build-amd64-libvirt                   blocked
 build-i386-libvirt                    blocked
 build-amd64-pvops                     pass
 build-i386-pvops                      pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64  blocked
 test-amd64-i386-xl-qemuu-ovmf-amd64   blocked



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Not pushing.

(No revision log; it would be 5915 lines long.)



Re: [PATCH 16/30] drivers/hv/vmbus, video/hyperv_fb: Untangle and refactor Hyper-V panic notifiers

2022-05-03 Thread Guilherme G. Piccoli
On 03/05/2022 15:13, Michael Kelley (LINUX) wrote:
> [...]
>> (a) We could forget about this change, and always do the clean-up here,
>> not relying on machine_crash_shutdown().
>> Pro: really simple, behaves the same as it is doing currently.
>> Con: less elegant/concise, doesn't allow arm64 customization.
>>
>> (b) Add a way to allow ARM64 customization of shutdown crash handler.
>> Pro: matches x86, more customizable, improves arm64 arch code.
>> Con: A tad more complex.
>>
>> Also, a question that came-up: if ARM64 has no way of calling special
>> crash shutdown handler, how can you execute hv_stimer_cleanup() and
>> hv_synic_disable_regs() there? Or are they not required in ARM64?
>>
> 
> My suggestion is to do (a) for now.  I suspect (b) could be a more
> extended discussion and I wouldn't want your patch set to get held
> up on that discussion.  I don't know what the sense of the ARM64
> maintainers would be toward (b).  They have tried to avoid picking
> up code warts like those that have accumulated on the x86/x64 side over the
> years, and I agree with that effort.  But as more and varied
> hypervisors become available for ARM64, it seems like a framework
> for supporting a custom shutdown handler may become necessary.
> But that could take a little time.
> 
> You are right about hv_stimer_cleanup() and hv_synic_disable_regs().
> We are not running these when a panic occurs on ARM64, and we
> should be, though the risk is small.   We will pursue (b) and add
> these additional cleanups as part of that.  But again, I would suggest
> doing (a) for now, and we will switch back to your solution once
> (b) is in place.
> 

Thanks again Michael, I'll stick with (a) for now. I'll check with the ARM64
community about that, and I might even try to implement something in
parallel (if you are not already working on that - lemme know please),
so we don't get stuck here. As you said, I feel that this is more and
more relevant as the number of panic/crash/kexec scenarios tends to
increase in ARM64.


>> [...]
>> Some ideas of what we can do here:
>>
>> I) we could change the framebuffer notifier to rely on trylocks, instead
>> of risking a lockup scenario, and with that, we can execute it before
>> the vmbus disconnect in the hypervisor list;
> 
> I think we have to do this approach for now.
> 
>>
>> II) we ignore the hypervisor notifier in case of kdump _by default_, and
>> if the users don't want that, they can always set the panic notifier
>> level to 4 and run all notifiers prior to kdump; would that be terrible,
>> you think? Kdump users might not care about the framebuffer...
>>
>> III) we go with approach (b) above and refactor arm64 code to allow the
>> custom crash handler on kdump time, then [with point (I) above] the
>> logic proposed in this series is still valid - seems more and more the
>> most correct/complete solution.
> 
> But even when/if we get approach (b) implemented, having the
> framebuffer notifier on the pre_reboot list is still too late with the
> default of panic_notifier_level = 2.  The kdump path will reset the
> VMbus connection and then the framebuffer notifier won't work.
> 

OK, perfect! I'll work something along these lines in V2, allowing the
FB notifier to always run in the hypervisor list before the vmbus unload
mechanism.


>> [...]
>>>> +static int hv_panic_vmbus_unload(struct notifier_block *nb, unsigned long val,
>>>> +  void *args)
>>>> +{
>>>> +  if (!kexec_crash_loaded())
>>>
>>> I'm not clear on the purpose of this condition.  I think it means
>>> we will skip the vmbus_initiate_unload() if a panic occurs in the
>>> kdump kernel.  Is there a reason a panic in the kdump kernel
>>> should be treated differently?  Or am I misunderstanding?
>>
>> This is really related with the point discussed in the top of this
>> response - I assumed both ARM64/x86_64 would behave the same and
>> disconnect the vmbus through the custom crash handler when kdump is set,
>> so worth skipping it here in the notifier. But that's not true for ARM64
>> as you pointed, so this guard against kexec is really part of the
>> decision/discussion on what to do with ARM64 heh
> 
> But note that vmbus_initiate_unload() already has a guard built-in.
> If the intent of this test is just as a guard against running twice,
> then it isn't needed.

Since we're going to avoid relying on the custom crash_shutdown(), due
to the lack of ARM64 support for now, this check will be removed in V2.

Its purpose was to skip the notifier *proactively* in case kexec is set,
given that...once kexec happens, the custom crash_shutdown() would run
the same function (wrong assumption for ARM64, my bad).
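
As a side illustration of the "guard against running twice" point Michael raised, here is a minimal sketch of an idempotent unload routine. I have not reproduced the actual guard inside vmbus_initiate_unload(); `demo_conn`, `demo_initiate_unload()` and the message counter are hypothetical, and the only thing shown is the general pattern that makes a second caller return without doing the work again.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical connection state; not the real vmbus structures. */
struct demo_conn {
    atomic_bool unloaded;
    int unload_msgs_sent;
};

/* Returns true only for the caller that actually performed the unload. */
static bool demo_initiate_unload(struct demo_conn *c)
{
    /* atomic_exchange returns the old value; true means someone already ran. */
    if (atomic_exchange(&c->unloaded, true))
        return false;

    c->unload_msgs_sent++;     /* stands in for messaging the hypervisor */

    return true;
}
```

With a guard of this kind in the callee, both the panic notifier and a later crash handler can call the function unconditionally, so an extra caller-side check is only useful if it is meant to change the timing rather than prevent a double run.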

Postponing that slightly would maybe gain us some time while the
hypervisor finishes its work, so we'd delay less in the vmbus unload path
- that was the rationale behind this check.


Cheers!



Re: [PATCH 3/3] xen/arm: Add sb instruction support

2022-05-03 Thread Julien Grall

Hi Bertrand,

On 03/05/2022 10:38, Bertrand Marquis wrote:

> This patch is adding sb instruction support when it is supported by a
> CPU on arm64.
> To achieve this, the "sb" macro is moved to sub-arch macros.h so that we
> can use sb instruction when available through alternative on arm64 and
> keep the current behaviour on arm32.


SB is also supported on Arm32. So I would prefer to introduce the 
encoding right now and avoid duplicating the .macro sb.



> A new cpuerrata capability is introduced to enable the alternative


'sb' is definitely not an erratum. Errata are for issues that are 
specific to one (or multiple) CPUs and they are not part of the 
architecture.


This is the first time we introduce a feature in Xen. So we need to add 
a new array in cpufeature.c that will cover 'SB' for now. In future we 
could add feature like pointer auth, LSE atomics...
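
A rough sketch of what such a feature table could look like, for discussion only. The `demo_*` names, the table layout and the bitmap are assumptions and not the existing Xen cpufeature/cpuerrata code; the one architectural fact relied on is that the SB field sits in bits [39:36] of ID_AA64ISAR1_EL1, with a non-zero value meaning the instruction is implemented. The register value is passed in by the caller so the sketch can run anywhere.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_ISAR1_SB_SHIFT 36          /* ID_AA64ISAR1_EL1.SB, bits [39:36] */
#define DEMO_ISAR1_SB_MASK  UINT64_C(0xf)

/* Hypothetical capability numbering, separate from any errata list. */
enum demo_cap { DEMO_HAS_SB = 0, DEMO_NCAPS };

struct demo_feature {
    enum demo_cap cap;
    bool (*matches)(uint64_t isar1);
};

static bool demo_has_sb(uint64_t isar1)
{
    return ((isar1 >> DEMO_ISAR1_SB_SHIFT) & DEMO_ISAR1_SB_MASK) != 0;
}

/* One entry for now; further features would be appended here. */
static const struct demo_feature demo_features[] = {
    { DEMO_HAS_SB, demo_has_sb },
};

/* Build a capability bitmap from a caller-supplied register value. */
static uint32_t demo_detect_caps(uint64_t isar1)
{
    uint32_t caps = 0;
    unsigned int i;

    for (i = 0; i < sizeof(demo_features) / sizeof(demo_features[0]); i++)
        if (demo_features[i].matches(isar1))
            caps |= UINT32_C(1) << demo_features[i].cap;

    return caps;
}
```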



> code for sb when the support is detected using isa64 coprocessor


s/coprocessor/system/


> register.
> The sb instruction is encoded using its hexadecimal value.


This is necessary to avoid recursive macro, right?
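
For reference, the hexadecimal value used in the patch can be rebuilt from the instruction fields. The sketch below assumes the usual system-instruction layout (op0 at bits [20:19], op1 at [18:16], CRn at [15:12], CRm at [11:8], op2 at [7:5], Rt at [4:0]) and the field values op0=0, op1=3, CRn=3, CRm=0, op2=7, Rt=31 for SB; the helper names are made up for this illustration.

```c
#include <stdint.h>

/* Encode a system instruction from its fields (see layout in the lead-in). */
static uint32_t demo_sys_insn(uint32_t op0, uint32_t op1, uint32_t crn,
                              uint32_t crm, uint32_t op2, uint32_t rt)
{
    return UINT32_C(0xd5000000) | (op0 << 19) | (op1 << 16) | (crn << 12) |
           (crm << 8) | (op2 << 5) | (rt & 0x1f);
}

/* SB: op0=0, op1=3, CRn=3, CRm=0, op2=7, Rt=31 */
static uint32_t demo_sb_encoding(void)
{
    return demo_sys_insn(0, 3, 3, 0, 7, 31);
}
```

The result is the same word that the patch emits with `.inst 0xd50330ff`.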


diff --git a/xen/arch/arm/include/asm/arm64/macros.h 
b/xen/arch/arm/include/asm/arm64/macros.h
index 140e223b4c..e639cec400 100644
--- a/xen/arch/arm/include/asm/arm64/macros.h
+++ b/xen/arch/arm/include/asm/arm64/macros.h
@@ -1,6 +1,24 @@
  #ifndef __ASM_ARM_ARM64_MACROS_H
  #define __ASM_ARM_ARM64_MACROS_H
  
+#include 

+
+/*
+ * Speculative barrier
+ */
+.macro sb
+alternative_if_not ARM64_HAS_SB
+dsb nsh
+isb
+alternative_else
+/*
+ * SB encoding as given in chapter C6.2.264 of ARM ARM (DDI 0487H.a).
+ */


NIT: Please align the comment with ".inst" below. I also don't think it 
is necessary to mention the spec here. The instruction encoding is not 
going to change.



+.inst 0xd50330ff
+nop


Why do we need the NOP?


+alternative_endif
+.endm
+
  /*
   * @dst: Result of get_cpu_info()
   */
diff --git a/xen/arch/arm/include/asm/cpufeature.h 
b/xen/arch/arm/include/asm/cpufeature.h
index 4719de47f3..9370805900 100644
--- a/xen/arch/arm/include/asm/cpufeature.h
+++ b/xen/arch/arm/include/asm/cpufeature.h
@@ -67,8 +67,9 @@
  #define ARM_WORKAROUND_BHB_LOOP_24 13
  #define ARM_WORKAROUND_BHB_LOOP_32 14
  #define ARM_WORKAROUND_BHB_SMCC_3 15
+#define ARM64_HAS_SB 16
  
-#define ARM_NCAPS   16

+#define ARM_NCAPS   17
  
  #ifndef __ASSEMBLY__
  
diff --git a/xen/arch/arm/include/asm/macros.h b/xen/arch/arm/include/asm/macros.h

index 1aa373760f..91ea3505e4 100644
--- a/xen/arch/arm/include/asm/macros.h
+++ b/xen/arch/arm/include/asm/macros.h
@@ -5,15 +5,6 @@
  # error "This file should only be included in assembly file"
  #endif
  
-/*

- * Speculative barrier
- * XXX: Add support for the 'sb' instruction
- */
-.macro sb
-dsb nsh
-isb
-.endm
-
  #if defined (CONFIG_ARM_32)
  # include 
  #elif defined(CONFIG_ARM_64)


Cheers,

--
Julien Grall



Re: [PATCH 2/3] xen/arm: Advertise workaround 1 if we apply 3

2022-05-03 Thread Julien Grall

Hi Bertrand,

On 03/05/2022 10:38, Bertrand Marquis wrote:

SMCC_WORKAROUND_3 is handling both Spectre v2 and spectre BHB.
So when a guest is asking if we support workaround 1, tell yes if we
apply workaround 3 on exception entry as it handles it.

This will allow guests not supporting Spectre BHB but impacted by
spectre v2 to still handle it correctly.
The modified behaviour is coherent with what the Linux kernel does in
KVM for guests.

While there use ARM_SMCCC_SUCCESS instead of 0 for the return code value
for workaround detection to be coherent with Workaround 2 handling.
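
The decision described above can be summarised in a few lines of C. This is a sketch under assumptions, not the Xen vsmc code: `demo_cpu` and `demo_query_workaround_1()` are hypothetical names, and the return values follow the SMCCC convention of 0 for success and -1 for "not supported".

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_SMCCC_SUCCESS        0
#define DEMO_SMCCC_NOT_SUPPORTED  (-1)

/* Hypothetical per-CPU mitigation state; names are not from Xen. */
struct demo_cpu {
    bool applies_workaround_1;   /* Spectre v2 hardening on exception entry */
    bool applies_workaround_3;   /* covers Spectre v2 and Spectre BHB */
};

/* Answer a guest asking whether workaround 1 is available. */
static int32_t demo_query_workaround_1(const struct demo_cpu *cpu)
{
    /* Workaround 3 also handles Spectre v2, so it satisfies the query too. */
    if (cpu->applies_workaround_1 || cpu->applies_workaround_3)
        return DEMO_SMCCC_SUCCESS;

    return DEMO_SMCCC_NOT_SUPPORTED;
}
```

A guest that only knows about workaround 1 then keeps issuing that call and still gets the Spectre v2 mitigation on hardware where only workaround 3 is applied.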

Signed-off-by: Bertrand Marquis 


Acked-by: Julien Grall 

I think we should also consider this for backport.

Cheers,

--
Julien Grall



RE: [PATCH 16/30] drivers/hv/vmbus, video/hyperv_fb: Untangle and refactor Hyper-V panic notifiers

2022-05-03 Thread Michael Kelley (LINUX)
From: Guilherme G. Piccoli  Sent: Friday, April 29, 2022 3:35 PM
> 
> Hi Michael, first of all thanks for the great review, much appreciated.
> Some comments inline below:
> 
> On 29/04/2022 14:16, Michael Kelley (LINUX) wrote:
> > [...]
> >> hypervisor I/O completion), so we postpone that to run late. But more
> >> relevant: this *same* vmbus unloading happens in the crash_shutdown()
> >> handler, so if kdump is set, we can safely skip this panic notifier and
> >> defer such clean-up to the kexec crash handler.
> >
> > While the last sentence is true for Hyper-V on x86/x64, it's not true for
> > Hyper-V on ARM64.  x86/x64 has the 'machine_ops' data structure
> > with the ability to provide a custom crash_shutdown() function, which
> > Hyper-V does in the form of hv_machine_crash_shutdown().  But ARM64
> > has no mechanism to provide such a custom function that will eventually
> > do the needed vmbus_initiate_unload() before running kdump.
> >
> > I'm not immediately sure what the best solution is for ARM64.  At this
> > point, I'm just pointing out the problem and will think about the tradeoffs
> > for various possible solutions.  Please do the same yourself. :-)
> >
> 
> Oh, you're totally right! I just assumed ARM64 would be the same, my
> bad. Just to propose some alternatives, so you/others can also discuss
> here and we can reach a consensus about the trade-offs:
> 
> (a) We could forget about this change, and always do the clean-up here,
> not relying on machine_crash_shutdown().
> Pro: really simple, behaves the same as it is doing currently.
> Con: less elegant/concise, doesn't allow arm64 customization.
> 
> (b) Add a way to allow ARM64 customization of shutdown crash handler.
> Pro: matches x86, more customizable, improves arm64 arch code.
> Con: A tad more complex.
> 
> Also, a question that came up: if ARM64 has no way of calling a special
> crash shutdown handler, how can you execute hv_stimer_cleanup() and
> hv_synic_disable_regs() there? Or are they not required in ARM64?
> 

My suggestion is to do (a) for now.  I suspect (b) could be a more
extended discussion and I wouldn't want your patch set to get held
up on that discussion.  I don't know what the sense of the ARM64
maintainers would be toward (b).  They have tried to avoid picking
up code warts like those that have accumulated on the x86/x64 side over the
years, and I agree with that effort.  But as more and varied
hypervisors become available for ARM64, it seems like a framework
for supporting a custom shutdown handler may become necessary.
But that could take a little time.

You are right about hv_stimer_cleanup() and hv_synic_disable_regs().
We are not running these when a panic occurs on ARM64, and we
should be, though the risk is small.   We will pursue (b) and add
these additional cleanups as part of that.  But again, I would suggest
doing (a) for now, and we will switch back to your solution once
(b) is in place.

> 
> >>
> >> (c) There is also a Hyper-V framebuffer panic notifier, which relies on
> >> doing a vmbus operation that demands a valid connection. So, we must
> >> order this notifier with the panic notifier from vmbus_drv.c, in order to
> >> guarantee that the framebuffer code executes before the vmbus connection
> >> is unloaded.
> >
> > Patch 21 of this set puts the Hyper-V FB panic notifier on the pre_reboot
> > notifier list, which means it won't execute before the VMbus connection
> > unload in the case of kdump.   This notifier is making sure that Hyper-V
> > is notified about the last updates made to the frame buffer before the
> > panic, so maybe it needs to be put on the hypervisor notifier list.  It
> > sends a message to Hyper-V over its existing VMbus channel, but it
> > does not wait for a reply.  It does, however, obtain a spin lock on the
> > ring buffer used to communicate with Hyper-V.   Unless someone has
> > a better suggestion, I'm inclined to take the risk of blocking on that
> > spin lock.
> 
> The logic behind that was: when kdump is set, we'd skip the vmbus
> disconnect on notifiers, deferring that to crash_shutdown(), a logic that
> was refuted in the above discussion on ARM64 (one more Pro argument to
> the idea of refactoring aarch64 code to allow a custom crash shutdown
> handler heh). But you're right, for the default level 2, we skip the
> pre_reboot notifiers on kdump, effectively skipping this notifier.
> 
> Some ideas of what we can do here:
> 
> I) we could change the framebuffer notifier to rely on trylocks, instead
> of risking a lockup scenario, and with that, we can execute it before
> the vmbus disconnect in the hypervisor list;

I think we have to do this approach for now.
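
A minimal sketch of the trylock idea in option (I), under stated
assumptions: ring_lock, ring_trylock(), ring_unlock() and fb_panic_notify()
are hypothetical names, and a C11 atomic_flag stands in for the ring-buffer
spin lock. It only illustrates the "skip instead of block" behaviour, not
the Hyper-V driver code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the ring-buffer spin lock. */
static atomic_flag ring_lock = ATOMIC_FLAG_INIT;

/* Counts how many frame buffer updates were "sent". */
static int updates_sent;

static bool ring_trylock(void)
{
    /* test_and_set returns the previous value: false means we got the lock. */
    return !atomic_flag_test_and_set(&ring_lock);
}

static void ring_unlock(void)
{
    atomic_flag_clear(&ring_lock);
}

/* Panic-time callback: never blocks; returns true if the update was sent. */
static bool fb_panic_notify(void)
{
    if (!ring_trylock())
        return false;   /* the lock holder may never run again, so skip */

    updates_sent++;     /* placeholder for sending the message */
    ring_unlock();
    return true;
}
```

The trade-off is that the last frame buffer update may be lost when the lock
is contended, in exchange for never hanging the panic path.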

> 
> II) we ignore the hypervisor notifier in case of kdump _by default_, and
> if the users don't want that, they can always set the panic notifier
> level to 4 and run all notifiers prior to kdump; would that be terrible
> you think? Kdump users might not care about the framebuffer...
> 
> III) we go with approach 

Re: [PATCH 1/3] xen/arm: Sync sysregs and cpuinfo with Linux 5.18-rc3

2022-05-03 Thread Julien Grall

Hi Bertrand,

On 03/05/2022 10:38, Bertrand Marquis wrote:

Sync arm64 sysreg bit shift definitions with status of Linux kernel as
of 5.18-rc3 version (linux commit b2d229d4ddb1).
Sync ID registers sanitization with the status of Linux 5.18-rc3 and add
sanitization of ISAR2 registers.
Please outline which specific commits you are actually backporting. This
would help to know what changed and why, and also to keep track of the authorship.


When possible, the changes should be separated to match each Linux 
commit we backport.



Complete AA64ISAR2 and AA64MMFR1 with more fields.
While there add a comment for MMFR bitfields as for other registers in
the cpuinfo structure definition.


AFAICT, this patch is doing 3 different things that are somewhat related:
  - Sync cpufeature.c
  - Update the headers with unused defines
  - Complete the structure cpufeature.h

All those changes seem to be independent, so I think they should be done 
separately. This would help to keep the authorship right (your code vs 
Linux code).




Signed-off-by: Bertrand Marquis 
---
  xen/arch/arm/arm64/cpufeature.c  | 18 +-
  xen/arch/arm/include/asm/arm64/sysregs.h | 76 
  xen/arch/arm/include/asm/cpufeature.h| 14 -
  3 files changed, 91 insertions(+), 17 deletions(-)

diff --git a/xen/arch/arm/arm64/cpufeature.c b/xen/arch/arm/arm64/cpufeature.c
index 6e5d30dc7b..d9039d37b2 100644
--- a/xen/arch/arm/arm64/cpufeature.c
+++ b/xen/arch/arm/arm64/cpufeature.c
@@ -143,6 +143,16 @@ static const struct arm64_ftr_bits ftr_id_aa64isar1[] = {
ARM64_FTR_END,
  };
  
+static const struct arm64_ftr_bits ftr_id_aa64isar2[] = {
+   ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_HIGHER_SAFE, ID_AA64ISAR2_CLEARBHB_SHIFT, 4, 0),
+   ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_PTR_AUTH),
+  FTR_STRICT, FTR_EXACT, ID_AA64ISAR2_APA3_SHIFT, 4, 0),
+   ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_PTR_AUTH),
So we are using CONFIG_ARM64_PTR_AUTH, but this is not defined in
Kconfig. I realize there are more in cpufeature.c (somehow I didn't spot
them during review), but I don't think it is right to define CONFIG_*
without an associated entry in Kconfig.


On one hand, I think it would be odd to add an entry in Kconfig because
Xen wouldn't properly work if it were selected. On the other hand, it
would be useful when we implement pointer authentication.


So maybe we should just add the Kconfig entry with a comment explaining
why they are not selected. Any thoughts?


Cheers,

--
Julien Grall



Re: [PATCH 24/30] panic: Refactor the panic path

2022-05-03 Thread Guilherme G. Piccoli
On 03/05/2022 14:31, Michael Kelley (LINUX) wrote:
> [...]
> 
> To me, it's a weak correlation between having a kmsg dumper, and
> wanting or not wanting the info level output to come before kdump.
> Hyper-V is one of only a few places that register a kmsg dumper, so most
> Linux instances outside of Hyper-V guest (and PowerPC systems?) will have
> the info level output after kdump.  It seems like anyone who cared strongly
> about the info level output would set the panic_notifier_level to 1 or to 3
> so that the result is more deterministic.  But that's just my opinion, and
> it's probably an opinion that is not as well informed on the topic as some
> others in the discussion. So keeping things as in your patch set is not a
> show-stopper for me.
> 
> However, I would request a clarification in the documentation.   The
> panic_notifier_level affects not only the hypervisor, informational,
> and pre_reboot lists, but it also affects panic_print_sys_info() and
> kmsg_dump().  Specifically, at level 1, panic_print_sys_info() and
> kmsg_dump() will not be run before kdump.  At level 3, they will
> always be run before kdump.  Your documentation above mentions
> "informational lists" (plural), which I take to vaguely include
> kmsg_dump() and panic_print_sys_info(), but being explicit about
> the effect would be better.
> 
> Michael

Thanks again, Michael, for expressing your points and concerns - great idea
on the documentation improvement here, I'll do that for V2, for sure.

The idea of "defaulting" to skip the info list on kdump (if no
kmsg_dump() is set) is again a mechanism that aims at accommodating all
users and concerns of antagonistic goals, kdump vs notifier lists.

Before this patch set, by default no notifier executed before kdump. So,
the "pendulum"  was strongly on kdump side, and clearly this was a
sub-optimal decision - proof of that is that both Hyper-V / PowerPC code
forcibly set the "crash_kexec_post_notifiers". The goal here is to have
a more lightweight list that by default runs before kdump, a secondary
list that only runs before kdump if there's usage for that (either user
sets that or kmsg_dumper set is considered a valid user), and the
remaining notifiers run by default only after kdump, all of that very
customizable through the levels idea.

Now, one thing we could do to improve consistency for the hyper-v case:
having a kmsg_dump_once() helper, and *for Hyper-V only*, call it on the
hypervisor list, within the info notifier (that would be moved to
hypervisor list, ofc).
Let's wait for more feedback on that, just throwing some ideas so that
we can have everyone happy with the end-result!

Cheers,


Guilherme



Re: [PATCH 19/30] panic: Add the panic hypervisor notifier list

2022-05-03 Thread Guilherme G. Piccoli
On 03/05/2022 14:44, Michael Kelley (LINUX) wrote:
> [...]
>>
>> Hi Michael, thanks for your feedback! I agree that your idea could work,
>> but...there is one downside: imagine the kmsg_dump() approach is not set
>> in some Hyper-V guest, then we would rely on the regular notification
>> mechanism [hv_die_panic_notify_crash()], right?
>> But...you want then to run this notifier in the informational list,
>> which...won't execute *by default* before kdump if no kmsg_dump() is
>> set. So, this logic is convoluted when you mix it with the default level
>> concept + kdump.
> 
> Yes, you are right.  But to me that speaks as much to the linkage
> between the informational list and kmsg_dump() being the core
> problem.  But as I described in my reply to Patch 24, I can live with
> the linkage as-is.

Thanks for the feedback Michael!

> [...] 
>> I feel the panic notification mechanism does really fit with a
>> hypervisor list, it's a good match with the nature of the list, which
>> aims at informing the panic notification to the hypervisor/FW.
>> Of course we can modify it if you prefer...but please take into account
>> the kdump case and how it complicates the logic.
> 
> I agree that the runtime effect of one list vs. the other is nil.  The
> code works and can stay as you've written it.
> 
> I was trying to align from a conceptual standpoint.  It was a bit
> unexpected that one path would be on the hypervisor list, and the
> other path effectively on the informational list.  When I see
> conceptual mismatches like that, I tend to want to understand why,
> and if there is something more fundamental that is out-of-whack.
> 

Totally agree with you here, I am like that as well - try to really
understand the details, this is very important especially in this patch
set, since it's a refactor and affects every user of the notifiers
infrastructure.

Again, just to double-say it: feel free to suggest any change for the
Hyper-V portion (might as well for any patch in the series, indeed) -
you and the other Hyper-V maintainers own this code and I'd be glad to
align with your needs, you are honorary citizens in the panic notifiers
area, being one of the heaviest users of that =)

Cheers,


Guilherme



RE: [PATCH 19/30] panic: Add the panic hypervisor notifier list

2022-05-03 Thread Michael Kelley (LINUX)
From: Guilherme G. Piccoli  Sent: Friday, April 29, 2022 11:04 AM
> 
> On 29/04/2022 14:30, Michael Kelley (LINUX) wrote:
> > From: Guilherme G. Piccoli  Sent: Wednesday, April 27, 2022 3:49 PM
> >> [...]
> >>
> >> @@ -2843,7 +2843,7 @@ static void __exit vmbus_exit(void)
> >>if (ms_hyperv.misc_features & HV_FEATURE_GUEST_CRASH_MSR_AVAILABLE) {
> >>kmsg_dump_unregister(_kmsg_dumper);
> >>unregister_die_notifier(_die_report_block);
> >> -  atomic_notifier_chain_unregister(_notifier_list,
> >> +  atomic_notifier_chain_unregister(_hypervisor_list,
> >>_panic_report_block);
> >>}
> >>
> >
> > Using the hypervisor_list here produces a bit of a mismatch.  In many cases
> > this notifier will do nothing, and will defer to the kmsg_dump() mechanism
> > to notify the hypervisor about the panic.   Running the kmsg_dump()
> > mechanism is linked to the info_list, so I'm thinking the Hyper-V panic report
> > notifier should be on the info_list as well.  That way the reporting behavior
> > is triggered at the same point in the panic path regardless of which
> > reporting mechanism is used.
> >
> 
> Hi Michael, thanks for your feedback! I agree that your idea could work,
> but...there is one downside: imagine the kmsg_dump() approach is not set
> in some Hyper-V guest, then we would rely on the regular notification
> mechanism [hv_die_panic_notify_crash()], right?
> But...you want then to run this notifier in the informational list,
> which...won't execute *by default* before kdump if no kmsg_dump() is
> set. So, this logic is convoluted when you mix it with the default level
> concept + kdump.

Yes, you are right.  But to me that speaks as much to the linkage
between the informational list and kmsg_dump() being the core
problem.  But as I described in my reply to Patch 24, I can live with
the linkage as-is.

FWIW, guests on newer versions of Hyper-V will always register a
kmsg dumper.  The flags that are tested to decide whether to
register provide compatibility with older versions of Hyper-V that 
don’t support the 4K bytes of notification info.

> 
> May I suggest something? If possible, take a run with this patch set +
> DEBUG_NOTIFIER=y, in *both* cases (with and without the kmsg_dump()
> set). I did that and they run almost at the same time...I've checked the
> notifiers called, it's like almost nothing runs in-between.
> 
> I feel the panic notification mechanism does really fit with a
> hypervisor list, it's a good match with the nature of the list, which
> aims at informing the panic notification to the hypervisor/FW.
> Of course we can modify it if you prefer...but please take into account
> the kdump case and how it complicates the logic.

I agree that the runtime effect of one list vs. the other is nil.  The
code works and can stay as you've written it.

I was trying to align from a conceptual standpoint.  It was a bit
unexpected that one path would be on the hypervisor list, and the
other path effectively on the informational list.  When I see
conceptual mismatches like that, I tend to want to understand why,
and if there is something more fundamental that is out-of-whack.


> 
> Let me know your considerations, in case you can experiment with the
> patch set as-is.
> Cheers,
> 
> 
> Guilherme


Re: [PATCH v2] xen/arm: gnttab: cast unused macro arguments to void

2022-05-03 Thread Julien Grall

Hi,

On 28/04/2022 10:46, Michal Orzel wrote:

Function unmap_common_complete (common/grant_table.c) defines and sets
a variable ld that is later on passed to a macro:
gnttab_host_mapping_get_page_type().
On Arm this macro does not make use of any arguments causing a compiler
to warn about unused-but-set variable (when -Wunused-but-set-variable
is enabled). Fix it by casting the arguments to void in macro's body.

While there, take the opportunity to modify other macros in this file
that do not make use of all the arguments to prevent similar issues in
the future.

Signed-off-by: Michal Orzel 
---
Changes since v1:
-standalone patch carved out from a series (other patches already merged)
-v1 was ([3/8] gnttab: Remove unused-but-set variable)
-modify macro on Arm instead of removing ld variable
---
  xen/arch/arm/include/asm/grant_table.h | 13 -
  1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/xen/arch/arm/include/asm/grant_table.h b/xen/arch/arm/include/asm/grant_table.h
index d31a4d6805..5bcd1ec528 100644
--- a/xen/arch/arm/include/asm/grant_table.h
+++ b/xen/arch/arm/include/asm/grant_table.h
@@ -31,10 +31,11 @@ static inline void gnttab_mark_dirty(struct domain *d, mfn_t mfn)
  
  int create_grant_host_mapping(unsigned long gpaddr, mfn_t mfn,
unsigned int flags, unsigned int cache_flags);
-#define gnttab_host_mapping_get_page_type(ro, ld, rd) (0)
+#define gnttab_host_mapping_get_page_type(ro, ld, rd) \
+((void)(ro), (void)(ld), (void)(rd), 0)


I would switch to a static inline helper:

static inline bool
gnttab_host_mapping_get_page_type(bool ro, struct domain *ld,
                                  struct domain *rd)
{
return false;
}

Note the switch from 0 to false as the function is technically returning 
a boolean (see the x86 implementation).
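
To make the alternatives concrete, here is a small standalone sketch that
compares the three forms discussed. The page_type_* names are hypothetical
and used only for illustration; struct domain is left opaque, and the
(void) casts inside the inline helper are only there so the sketch also
builds cleanly with -Wunused-parameter.

```c
#include <assert.h>
#include <stdbool.h>

struct domain;  /* opaque here; only pointers to it are used */

/* Variant 1: macro that drops its arguments, so a caller's variable can
 * end up "set but unused". */
#define page_type_plain(ro, ld, rd) (0)

/* Variant 2: macro casting each argument to void, so each one is used. */
#define page_type_void(ro, ld, rd) \
    ((void)(ro), (void)(ld), (void)(rd), 0)

/* Variant 3: static inline helper; arguments are type-checked and each
 * one is evaluated exactly once. */
static inline bool page_type_inline(bool ro, struct domain *ld,
                                    struct domain *rd)
{
    (void)ro;
    (void)ld;
    (void)rd;
    return false;
}
```

All three yield the same value; the difference is in how the compiler sees
the arguments, and only the inline helper enforces their types.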



  int replace_grant_host_mapping(unsigned long gpaddr, mfn_t mfn,
 unsigned long new_gpaddr, unsigned int flags);
-#define gnttab_release_host_mappings(domain) 1
+#define gnttab_release_host_mappings(domain) ((void)(domain), 1)


Same here.

  
  /*
   * The region used by Xen on the memory will never be mapped in DOM0
@@ -89,10 +90,12 @@ int replace_grant_host_mapping(unsigned long gpaddr, mfn_t mfn,
  })
  
  #define gnttab_shared_gfn(d, t, i)   \
-(((i) >= nr_grant_frames(t)) ? INVALID_GFN : (t)->arch.shared_gfn[i])
+((void)(d),  \
+ ((i) >= nr_grant_frames(t)) ? INVALID_GFN : (t)->arch.shared_gfn[i])
  
-#define gnttab_status_gfn(d, t, i)   \
-(((i) >= nr_status_frames(t)) ? INVALID_GFN : (t)->arch.status_gfn[i])
+#define gnttab_status_gfn(d, t, i)\
+((void)(d),   \
+ ((i) >= nr_status_frames(t)) ? INVALID_GFN : (t)->arch.status_gfn[i])


I share Jan's opinion here. If we want to evaluate d, then we should
make sure t and i are also evaluated only once. However, IIRC, they
can't be turned to static inline because the type of t (struct 
grant_table) is not fully defined yet.
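
The multiple-evaluation concern can be shown with a small standalone
sketch. Everything here is hypothetical (frames, next_index(),
LOOKUP_MACRO, lookup_inline, INVALID_IDX); it only mirrors the shape of
the gnttab_*_gfn macros, where the index appears twice in the expansion.

```c
#include <assert.h>

#define INVALID_IDX (-1)   /* hypothetical stand-in for INVALID_GFN */

static const int frames[4] = { 10, 11, 12, 13 };

/* Counts how often the index expression is evaluated. */
static int calls;

/* Index expression with a visible side effect. */
static unsigned int next_index(void)
{
    calls++;
    return 2;
}

/* Macro form: `i` appears twice, so it is evaluated twice when in range. */
#define LOOKUP_MACRO(i) (((i) >= 4u) ? INVALID_IDX : frames[i])

/* Function form: the argument is evaluated exactly once. */
static inline int lookup_inline(unsigned int i)
{
    return (i >= 4u) ? INVALID_IDX : frames[i];
}
```

Passing next_index() to the macro bumps the counter twice, while the inline
function bumps it once, which is why a static inline is preferable whenever
the argument types are fully defined at that point.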


Cheers

--
Julien Grall



RE: [PATCH 24/30] panic: Refactor the panic path

2022-05-03 Thread Michael Kelley (LINUX)
From: Guilherme G. Piccoli  Sent: Friday, April 29, 2022 1:38 PM
> 
> On 29/04/2022 14:53, Michael Kelley (LINUX) wrote:
> > From: Guilherme G. Piccoli  Sent: Wednesday, April 27, 2022 3:49 PM
> >> [...]
> >> +  panic_notifiers_level=
> >> +  [KNL] Set the panic notifiers execution order.
> >> +  Format: 
> >> +  We currently have 4 lists of panic notifiers; based
> >> +  on the functionality and risk (for panic success) the
> >> +  callbacks are added in a given list. The lists are:
> >> +  - hypervisor/FW notification list (low risk);
> >> +  - informational list (low/medium risk);
> >> +  - pre_reboot list (higher risk);
> >> +  - post_reboot list (only run late in panic and after
> >> +  kdump, not configurable for now).
> >> +  This parameter defines the ordering of the first 3
> >> +  lists with regards to kdump; the levels determine
> >> +  which set of notifiers execute before kdump. The
> >> +  accepted levels are:
> >> +  0: kdump is the first thing to run, NO list is
> >> +  executed before kdump.
> >> +  1: only the hypervisor list is executed before kdump.
> >> +  2 (default level): the hypervisor list and (*if*
> >> +  there's any kmsg_dumper defined) the informational
> >> +  list are executed before kdump.
> >> +  3: both the hypervisor and the informational lists
> >> +  (always) execute before kdump.
> >
> > I'm not clear on why level 2 exists.  What is the scenario where
> > execution of the info list before kdump should be conditional on the
> > existence of a kmsg_dumper?   Maybe the scenario is described
> > somewhere in the patch set and I just missed it.
> >
> 
> Hi Michael, thanks for your review/consideration. So, this idea started
> kind of some time ago. It all started with a need of exposing more
> information on kernel log *before* kdump and *before* pstore -
> specifically, we're talking about panic_print. But this caused some
> reactions, Baoquan was very concerned with that [0]. Soon after, I've
> proposed a panic notifiers filter (orthogonal) approach, to which Petr
> suggested instead doing a major refactor [1] - it finally is alive in
> the form of this series.
> 
> The theory behind the level 2 is to allow a scenario of kdump with the
> minimum amount of notifiers - what is the point in printing more
> information if the user doesn't care, since it's going to kdump? Now, if
> there is a kmsg dumper, it means that there is likely some interest in
> collecting information, and that might as well be required before the
> potential kdump (which is my case, hence the proposal on [0]).
> 
> Instead of forcing one of the two behaviors (level 1 or level 3), we
> have a middle-term/compromise: if there's interest in collecting such
> data (in the form of a kmsg dumper), we then execute the informational
> notifiers before kdump. If not, why increase (even slightly) the risk
> for kdump?
> 
> I'm OK in removing the level 2 if people prefer, but I don't feel it's a
> burden, quite the opposite - it seems a good way to accommodate the somewhat
> antagonistic ideas (jump to kdump ASAP vs collecting more info in the
> panicked kernel log).
> 
> [0] https://lore.kernel.org/lkml/20220126052246.GC2086@MiWiFi-R3L-srv/
> 
> [1] https://lore.kernel.org/lkml/YfPxvzSzDLjO5ldp@alley/
> 

To me, it's a weak correlation between having a kmsg dumper, and
wanting or not wanting the info level output to come before kdump.
Hyper-V is one of only a few places that register a kmsg dumper, so most
Linux instances outside of Hyper-V guest (and PowerPC systems?) will have
the info level output after kdump.  It seems like anyone who cared strongly
about the info level output would set the panic_notifier_level to 1 or to 3
so that the result is more deterministic.  But that's just my opinion, and
it's probably an opinion that is not as well informed on the topic as some
others in the discussion. So keeping things as in your patch set is not a
show-stopper for me.

However, I would request a clarification in the documentation.   The
panic_notifier_level affects not only the hypervisor, informational,
and pre_reboot lists, but it also affects panic_print_sys_info() and
kmsg_dump().  Specifically, at level 1, panic_print_sys_info() and
kmsg_dump() will not be run before kdump.  At level 3, they will
always be run before kdump.  Your documentation above mentions
"informational lists" (plural), which I take to vaguely include
kmsg_dump() and panic_print_sys_info(), but being explicit about
the effect would be better.
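
As a hedged illustration of the level semantics quoted above, the mapping
from level to "what runs before kdump" can be written as a small pure
function. The names struct pre_kdump_lists and lists_before_kdump() are
hypothetical and not from the patch set; level 4 is taken from the thread's
statement that it runs all notifiers prior to kdump. Per the paragraph
above, panic_print_sys_info() and kmsg_dump() follow the informational
list in this scheme.

```c
#include <assert.h>
#include <stdbool.h>

/* Which notifier lists run before kdump (hypothetical summary type). */
struct pre_kdump_lists {
    bool hypervisor;
    bool informational;   /* panic_print_sys_info()/kmsg_dump() follow this */
    bool pre_reboot;
};

static struct pre_kdump_lists lists_before_kdump(unsigned int level,
                                                 bool has_kmsg_dumper)
{
    struct pre_kdump_lists l = { false, false, false };

    assert(level <= 4);

    /* Level 0: kdump runs first, nothing before it. */
    l.hypervisor = level >= 1;
    /* Level 2 is conditional on a kmsg dumper; level 3 is unconditional. */
    l.informational = level >= 3 || (level == 2 && has_kmsg_dumper);
    /* Level 4: all lists run before kdump. */
    l.pre_reboot = level >= 4;

    return l;
}
```

Written this way, the only input besides the level is whether a kmsg dumper
is registered, which is exactly the linkage being questioned for level 2.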

Michael

> 
> >[...]
> >> +   * Based on the level configured (smaller than 4), we clear the
> >> +   * proper bits in "panic_notifiers_bits". 

[PATCH V8 1/2] libxl: Add support for Virtio disk configuration

2022-05-03 Thread Oleksandr Tyshchenko
From: Oleksandr Tyshchenko 

This patch adds basic support for configuring and assisting virtio-mmio
based virtio-disk backend (emulator) which is intended to run out of
Qemu and could be run in any domain.
Although the Virtio block device is quite different from traditional
Xen PV block device (vbd) from the toolstack's point of view:
 - as the frontend is virtio-blk which is not a Xenbus driver, nothing
   written to Xenstore is fetched by the frontend currently ("vdev"
   is not passed to the frontend). But this might need to be revised
   in future, so frontend data might be written to Xenstore in order to
   support hotplugging virtio devices or passing the backend domain id
   on arch where the device-tree is not available.
 - the ring-ref/event-channel are not used for the backend<->frontend
   communication, the proposed IPC for Virtio is IOREQ/DM
it is still a "block device" and ought to be integrated in existing
"disk" handling. So, re-use (and adapt) "disk" parsing/configuration
logic to deal with Virtio devices as well.

For the immediate purpose and an ability to extend that support for
other use-cases in future (Qemu, virtio-pci, etc) perform the following
actions:
- Add new disk backend type (LIBXL_DISK_BACKEND_OTHER) and reflect
  that in the configuration
- Introduce new disk "specification" and "transport" fields to struct
  libxl_device_disk. Both are written to the Xenstore. The transport
  field is only used for the specification "virtio" and it assumes
  only "mmio" value for now.
- Introduce new "specification" option with "xen" communication
  protocol being default value.
- Add new device kind (LIBXL__DEVICE_KIND_VIRTIO_DISK) as current
  one (LIBXL__DEVICE_KIND_VBD) doesn't fit into Virtio disk model

An example of domain configuration for Virtio disk:
disk = [ 'phy:/dev/mmcblk0p3, xvda1, backendtype=other, specification=virtio']

Nothing has changed for default Xen disk configuration.

Please note, this patch is not enough for virtio-disk to work
on Xen (Arm), as for every Virtio device (including disk) we need
to allocate Virtio MMIO params (IRQ and memory region) and pass
them to the backend, also update Guest device-tree. The subsequent
patch will add these missing bits. For the current patch,
the default "irq" and "base" are just written to the Xenstore.
This is not an ideal splitting, but this way we avoid breaking
the bisectability.

Signed-off-by: Oleksandr Tyshchenko 
---
Changes RFC -> V1:
   - no changes

Changes V1 -> V2:
   - rebase according to the new location of libxl_virtio_disk.c

Changes V2 -> V3:
   - no changes

Changes V3 -> V4:
   - rebase according to the new argument for DEFINE_DEVICE_TYPE_STRUCT

Changes V4 -> V5:
   - split the changes, change the order of the patches
   - update patch description
   - don't introduce new "vdisk" configuration option with own parsing logic,
 re-use Xen PV block "disk" parsing/configuration logic for the virtio-disk
   - introduce "virtio" flag and document its usage
   - add LIBXL_HAVE_DEVICE_DISK_VIRTIO
   - update libxlu_disk_l.[ch]
   - drop num_disks variable/MAX_VIRTIO_DISKS
   - drop Wei's T-b

Changes V5 -> V6:
   - rebase on current staging
   - use "%"PRIu64 instead of %lu for disk->base in device_disk_add()
   - update *.gen.go files

Changes V6 -> V7:
   - rebase on current staging
   - update *.gen.go files and libxlu_disk_l.[ch] files
   - update patch description
   - rework significantly to support more flexible configuration
 and have more generic basic implementation for being able to extend
 that for other use-cases (virtio-pci, qemu, etc).

Changes V7 -> V8:
   - update *.gen.go files and libxlu_disk_l.[ch] files
   - update patch description and comments in the code
   - use "specification" config option instead of "protocol"
   - update libxl_types.idl and code according to new fields
 in libxl_device_disk
---
 docs/man/xl-disk-configuration.5.pod.in   |  38 +-
 tools/golang/xenlight/helpers.gen.go  |   8 +
 tools/golang/xenlight/types.gen.go|  18 +
 tools/include/libxl.h |   7 +
 tools/libs/light/libxl_device.c   |  62 +-
 tools/libs/light/libxl_disk.c | 136 -
 tools/libs/light/libxl_internal.h |   2 +
 tools/libs/light/libxl_types.idl  |  16 +
 tools/libs/light/libxl_types_internal.idl |   1 +
 tools/libs/light/libxl_utils.c|   2 +
 tools/libs/util/libxlu_disk_l.c   | 959 +++---
 tools/libs/util/libxlu_disk_l.h   |   2 +-
 tools/libs/util/libxlu_disk_l.l   |   9 +
 tools/xl/xl_block.c   |  11 +
 14 files changed, 791 insertions(+), 480 deletions(-)

diff --git a/docs/man/xl-disk-configuration.5.pod.in 
b/docs/man/xl-disk-configuration.5.pod.in
index 71d0e86..487ffef 100644
--- a/docs/man/xl-disk-configuration.5.pod.in
+++ b/docs/man/xl-disk-configuration.5.pod.in
@@ -232,7 +232,7 @@ Specifies the backend implementation to use
 
 

[PATCH V8 2/2] libxl: Introduce basic virtio-mmio support on Arm

2022-05-03 Thread Oleksandr Tyshchenko
From: Julien Grall 

This patch introduces helpers to allocate Virtio MMIO params
(IRQ and memory region) and create specific device node in
the Guest device-tree with allocated params. In order to deal
with multiple Virtio devices, reserve corresponding ranges.
For now, we reserve 1MB for memory regions and 10 SPIs.
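
The allocation scheme described above amounts to two bump allocators over
reserved ranges. The following standalone sketch shows the idea under
stated assumptions: the SKETCH_* values are placeholders chosen to match
"1MB" and "10 SPIs" (the real definitions live in
xen/include/public/arch-arm.h), and alloc_mmio_base()/alloc_mmio_irq() are
simplified stand-ins for the helpers in the patch, without logging.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed placeholder values, for illustration only. */
#define SKETCH_MMIO_BASE  UINT64_C(0x02000000)
#define SKETCH_MMIO_SIZE  UINT64_C(0x00100000)          /* 1MB reserved */
#define SKETCH_DEV_SIZE   UINT64_C(0x200)               /* per device */
#define SKETCH_SPI_FIRST  33u
#define SKETCH_SPI_LAST   (SKETCH_SPI_FIRST + 9u)       /* 10 SPIs reserved */

/* Hand out the next MMIO window, or 0 when the reserved range is exhausted. */
static uint64_t alloc_mmio_base(uint64_t *next_base)
{
    uint64_t base = *next_base;

    if (base + SKETCH_DEV_SIZE > SKETCH_MMIO_BASE + SKETCH_MMIO_SIZE)
        return 0;
    *next_base += SKETCH_DEV_SIZE;

    return base;
}

/* Hand out the next SPI, or 0 when the reserved range is exhausted. */
static uint32_t alloc_mmio_irq(uint32_t *next_irq)
{
    uint32_t irq = *next_irq;

    if (irq > SKETCH_SPI_LAST)
        return 0;
    (*next_irq)++;

    return irq;
}
```

With these placeholder numbers the MMIO range holds 2048 windows of 0x200
bytes, so the 10 reserved SPIs are the limiting resource.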

As these helpers should be used for every Virtio device attached
to the Guest, call them for Virtio disk(s).

Please note, with statically allocated Virtio IRQs there is
a risk of a clash with the physical IRQs of passthrough devices.
For the first version, it's fine, but we should consider allocating
the Virtio IRQs automatically. Thankfully, we know in advance which
IRQs will be used for passthrough, so we are able to choose non-clashing
ones.

Signed-off-by: Julien Grall 
Signed-off-by: Oleksandr Tyshchenko 
---
Please note, this is a split/cleanup/hardening of Julien's PoC:
"Add support for Guest IO forwarding to a device emulator"

Changes RFC -> V1:
   - was squashed with:
 "[RFC PATCH V1 09/12] libxl: Handle virtio-mmio irq in more correct way"
 "[RFC PATCH V1 11/12] libxl: Insert "dma-coherent" property into virtio-mmio device node"
 "[RFC PATCH V1 12/12] libxl: Fix duplicate memory node in DT"
   - move VirtIO MMIO #define-s to xen/include/public/arch-arm.h

Changes V1 -> V2:
   - update the author of a patch

Changes V2 -> V3:
   - no changes

Changes V3 -> V4:
   - no changes

Changes V4 -> V5:
   - split the changes, change the order of the patches
   - drop an extra "virtio" configuration option
   - update patch description
   - use CONTAINER_OF instead of own implementation
   - reserve ranges for Virtio MMIO params and put them
 in correct location
   - create helpers to allocate Virtio MMIO params, add
 corresponding sanity-checks
   - add comment why MMIO size 0x200 is chosen
   - update debug print
   - drop Wei's T-b

Changes V5 -> V6:
   - rebase on current staging

Changes V6 -> V7:
   - rebase on current staging
   - add T-b and R-b tags
   - update according to the recent changes to
 "libxl: Add support for Virtio disk configuration"

Changes V7 -> V8:
   - drop T-b and R-b tags
   - make virtio_mmio_base/irq global variables to be local in
 libxl__arch_domain_prepare_config() and initialize them at
 the beginning of the function, then rework alloc_virtio_mmio_base/irq()
 to take a pointer to virtio_mmio_base/irq variables as an argument
   - update according to the recent changes to
 "libxl: Add support for Virtio disk configuration"
---
 tools/libs/light/libxl_arm.c  | 118 +-
 xen/include/public/arch-arm.h |   7 +++
 2 files changed, 123 insertions(+), 2 deletions(-)

diff --git a/tools/libs/light/libxl_arm.c b/tools/libs/light/libxl_arm.c
index eef1de0..37403a2 100644
--- a/tools/libs/light/libxl_arm.c
+++ b/tools/libs/light/libxl_arm.c
@@ -8,6 +8,46 @@
 #include 
 #include 
 
+/*
+ * There are no clear requirements for the total size of Virtio MMIO region.
+ * The size of control registers is 0x100 and device-specific configuration
+ * registers start at the offset 0x100, however their size depends on the device
+ * and the driver. Pick the biggest known size at the moment to cover most
+ * of the devices (also consider allowing the user to configure the size via
+ * config file for the one not conforming with the proposed value).
+ */
+#define VIRTIO_MMIO_DEV_SIZE   xen_mk_ullong(0x200)
+
+static uint64_t alloc_virtio_mmio_base(libxl__gc *gc, uint64_t *virtio_mmio_base)
+{
+uint64_t base = *virtio_mmio_base;
+
+/* Make sure we have enough reserved resources */
+if ((base + VIRTIO_MMIO_DEV_SIZE >
+GUEST_VIRTIO_MMIO_BASE + GUEST_VIRTIO_MMIO_SIZE)) {
+LOG(ERROR, "Ran out of reserved range for Virtio MMIO BASE 0x%"PRIx64"\n",
+base);
+return 0;
+}
+*virtio_mmio_base += VIRTIO_MMIO_DEV_SIZE;
+
+return base;
+}
+
+static uint32_t alloc_virtio_mmio_irq(libxl__gc *gc, uint32_t *virtio_mmio_irq)
+{
+uint32_t irq = *virtio_mmio_irq;
+
+/* Make sure we have enough reserved resources */
+if (irq > GUEST_VIRTIO_MMIO_SPI_LAST) {
+LOG(ERROR, "Ran out of reserved range for Virtio MMIO IRQ %u\n", irq);
+return 0;
+}
+(*virtio_mmio_irq)++;
+
+return irq;
+}
+
 static const char *gicv_to_string(libxl_gic_version gic_version)
 {
 switch (gic_version) {
@@ -26,8 +66,10 @@ int libxl__arch_domain_prepare_config(libxl__gc *gc,
 {
 uint32_t nr_spis = 0;
 unsigned int i;
-uint32_t vuart_irq;
-bool vuart_enabled = false;
+uint32_t vuart_irq, virtio_irq = 0;
+bool vuart_enabled = false, virtio_enabled = false;
+uint64_t virtio_mmio_base = GUEST_VIRTIO_MMIO_BASE;
+uint32_t virtio_mmio_irq = GUEST_VIRTIO_MMIO_SPI_FIRST;
 
 /*
  * If pl011 vuart is enabled then increment the nr_spis to allow allocation
@@ -39,6 +81,30 @@ int libxl__arch_domain_prepare_config(libxl__gc *gc,
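
The two allocation helpers added by this patch are bump allocators over a reserved range: each call hands out the current cursor value and advances it, and 0 is returned once the range is exhausted. A minimal standalone sketch of that pattern is shown below. The range constants (DEMO_MMIO_BASE, DEMO_MMIO_SIZE) are made-up placeholders rather than the values from xen/include/public/arch-arm.h, and demo_alloc_mmio_base() is a hypothetical name used only for illustration.

```c
#include <stdint.h>

/* Placeholder range, chosen only for illustration (not the Xen values). */
#define DEMO_MMIO_BASE     UINT64_C(0x1000)
#define DEMO_MMIO_SIZE     UINT64_C(0x600)  /* room for three devices */
#define DEMO_MMIO_DEV_SIZE UINT64_C(0x200)  /* per-device window, as in the patch */

/*
 * Return the next free base address and advance the caller-owned cursor.
 * A return value of 0 means the reserved range is exhausted; the cursor is
 * left untouched in that case.
 */
static uint64_t demo_alloc_mmio_base(uint64_t *cursor)
{
    uint64_t base = *cursor;

    /* Make sure the whole device window still fits in the reserved range. */
    if (base + DEMO_MMIO_DEV_SIZE > DEMO_MMIO_BASE + DEMO_MMIO_SIZE)
        return 0;

    *cursor += DEMO_MMIO_DEV_SIZE;

    return base;
}
```

Keeping the cursor in a local variable of the caller, as V8 of the patch does, means every invocation of the configuration function starts again from the beginning of the range instead of inheriting state from a previous domain.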
  

[PATCH V8 0/2] Virtio support for toolstack on Arm (Was "IOREQ feature (+ virtio-mmio) on Arm")

2022-05-03 Thread Oleksandr Tyshchenko
From: Oleksandr Tyshchenko 

Hello all.

The purpose of this patch series is to add the missing virtio-mmio bits to the
Xen toolstack on Arm.
The Virtio support for the toolstack [1] was postponed because the main target
was to upstream IOREQ/DM support on Arm first. Now that IOREQ support is in, we
can resume the Virtio enabling work. You can find previous discussions at [2].

Patch series [3] is based on the recent "staging" branch
(fa6dc0879ffd3daea2837953c7a8761a9ba0 page_alloc: assert IRQs are enabled in heap alloc/free)
and was tested on a Renesas Salvator-X board + H3 ES3.0 SoC (Arm64) with a
virtio-mmio based virtio-disk backend [4] running in Dom0 (or a driver domain)
and an unmodified Linux guest using the existing virtio-blk driver (frontend).
No issues were observed. Guest domain 'reboot/destroy' use-cases work properly.

Any feedback/help would be highly appreciated.

[1]
https://lore.kernel.org/xen-devel/1610488352-18494-24-git-send-email-olekst...@gmail.com/
https://lore.kernel.org/xen-devel/1610488352-18494-25-git-send-email-olekst...@gmail.com/
[2]
https://lists.xenproject.org/archives/html/xen-devel/2021-01/msg02403.html
https://lists.xenproject.org/archives/html/xen-devel/2021-01/msg02536.html
https://lore.kernel.org/xen-devel/1621626361-29076-1-git-send-email-olekst...@gmail.com/
https://lore.kernel.org/xen-devel/1638982784-14390-1-git-send-email-olekst...@gmail.com/
https://lore.kernel.org/xen-devel/1649442065-8332-1-git-send-email-olekst...@gmail.com/

[3] https://github.com/otyshchenko1/xen/commits/libxl_virtio3
[4] https://github.com/otyshchenko1/virtio-disk/commits/virtio_grant

Julien Grall (1):
  libxl: Introduce basic virtio-mmio support on Arm

Oleksandr Tyshchenko (1):
  libxl: Add support for Virtio disk configuration

 docs/man/xl-disk-configuration.5.pod.in   |  38 +-
 tools/golang/xenlight/helpers.gen.go  |   8 +
 tools/golang/xenlight/types.gen.go|  18 +
 tools/include/libxl.h |   7 +
 tools/libs/light/libxl_arm.c  | 118 +++-
 tools/libs/light/libxl_device.c   |  62 +-
 tools/libs/light/libxl_disk.c | 136 -
 tools/libs/light/libxl_internal.h |   2 +
 tools/libs/light/libxl_types.idl  |  16 +
 tools/libs/light/libxl_types_internal.idl |   1 +
 tools/libs/light/libxl_utils.c|   2 +
 tools/libs/util/libxlu_disk_l.c   | 959 +++---
 tools/libs/util/libxlu_disk_l.h   |   2 +-
 tools/libs/util/libxlu_disk_l.l   |   9 +
 tools/xl/xl_block.c   |  11 +
 xen/include/public/arch-arm.h |   7 +
 16 files changed, 914 insertions(+), 482 deletions(-)

-- 
2.7.4




[ovmf test] 170052: regressions - FAIL

2022-05-03 Thread osstest service owner
flight 170052 ovmf real [real]
http://logs.test-lab.xenproject.org/osstest/logs/170052/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-amd64-xsm               6 xen-build                fail REGR. vs. 168254
 build-amd64                   6 xen-build                fail REGR. vs. 168254
 build-i386                    6 xen-build                fail REGR. vs. 168254
 build-i386-xsm                6 xen-build                fail REGR. vs. 168254

Tests which did not succeed, but are not blocking:
 build-amd64-libvirt                   1 build-check(1)   blocked  n/a
 build-i386-libvirt                    1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemuu-ovmf-amd64  1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-qemuu-ovmf-amd64   1 build-check(1)   blocked  n/a

version targeted for testing:
 ovmf 101f4c789221716585b972f2c2a22a85c078ef1d
baseline version:
 ovmf b1b89f9009f2390652e0061bd7b24fc40732bc70

Last test of basis   168254  2022-02-28 10:41:46 Z   64 days
Failing since        168258  2022-03-01 01:55:31 Z   63 days  784 attempts
Testing same since   170038  2022-05-03 10:12:47 Z    0 days    8 attempts


People who touched revisions under test:
  Abdul Lateef Attar 
  Abdul Lateef Attar via groups.io 
  Abner Chang 
  Akihiko Odaki 
  Anthony PERARD 
  Bo Chang Ke 
  Bob Feng 
  Chen Lin Z 
  Chen, Lin Z 
  Corvin Köhne 
  Dandan Bi 
  Dun Tan 
  Feng, Bob C 
  Gerd Hoffmann 
  Guo Dong 
  Guomin Jiang 
  Hao A Wu 
  Heng Luo 
  Hua Ma 
  Huang, Li-Xia 
  Jagadeesh Ujja 
  Jake Garver 
  Jake Garver via groups.io 
  Jason 
  Jason Lou 
  Jiewen Yao 
  Ke, Bo-ChangX 
  Ken Lautner 
  Kenneth Lautner 
  Kuo, Ted 
  Laszlo Ersek 
  Lean Sheng Tan 
  Leif Lindholm 
  Li, Yi1 
  Li, Zhihao 
  Liming Gao 
  Liu 
  Liu Yun 
  Liu Yun Y 
  Lixia Huang 
  Lou, Yun 
  Ma, Hua 
  Mara Sophie Grosch 
  Mara Sophie Grosch via groups.io 
  Matt DeVillier 
  Michael D Kinney 
  Michael Kubacki 
  Michael Kubacki 
  Min Xu 
  Oliver Steffen 
  Patrick Rudolph 
  Peter Grehan 
  Purna Chandra Rao Bandaru 
  Ray Ni 
  Rebecca Cran 
  Rebecca Cran 
  Sami Mujawar 
  Sean Rhodes 
  Sean Rhodes sean@starlabs.systems
  Sebastien Boeuf 
  Sunny Wang 
  Tan, Dun 
  Ted Kuo 
  Wenyi Xie 
  wenyi,xie via groups.io 
  Xiaolu.Jiang 
  Xie, Yuanhao 
  Yi Li 
  yi1 li 
  Yuanhao Xie 
  Zhihao Li 

jobs:
 build-amd64-xsm                       fail
 build-i386-xsm                        fail
 build-amd64                           fail
 build-i386                            fail
 build-amd64-libvirt                   blocked
 build-i386-libvirt                    blocked
 build-amd64-pvops                     pass
 build-i386-pvops                      pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64  blocked
 test-amd64-i386-xl-qemuu-ovmf-amd64   blocked



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Not pushing.

(No revision log; it would be 5915 lines long.)



Re: [PATCH V1 4/6] dt-bindings: Add xen,dev-domid property description for xen-grant DMA ops

2022-05-03 Thread Oleksandr



On 03.05.22 00:59, Rob Herring wrote:

Hello Rob



On Fri, Apr 22, 2022 at 07:51:01PM +0300, Oleksandr Tyshchenko wrote:

From: Oleksandr Tyshchenko 

Introduce Xen specific binding for the virtualized device (e.g. virtio)
to be used by Xen grant DMA-mapping layer in the subsequent commit.

This binding indicates that Xen grant mappings scheme needs to be
enabled for the device which DT node contains that property and specifies
the ID of Xen domain where the corresponding backend resides. The ID
(domid) is used as an argument to the grant mapping APIs.

This is needed for the option to restrict memory access using Xen grant
mappings to work which primary goal is to enable using virtio devices
in Xen guests.

Signed-off-by: Oleksandr Tyshchenko 
---
Changes RFC -> V1:
- update commit subject/description and text in description
- move to devicetree/bindings/arm/
---
  .../devicetree/bindings/arm/xen,dev-domid.yaml | 37 ++
  1 file changed, 37 insertions(+)
  create mode 100644 Documentation/devicetree/bindings/arm/xen,dev-domid.yaml

diff --git a/Documentation/devicetree/bindings/arm/xen,dev-domid.yaml 
b/Documentation/devicetree/bindings/arm/xen,dev-domid.yaml
new file mode 100644
index ..ef0f747
--- /dev/null
+++ b/Documentation/devicetree/bindings/arm/xen,dev-domid.yaml
@@ -0,0 +1,37 @@
+# SPDX-License-Identifier: (GPL-2.0-only or BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/arm/xen,dev-domid.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Xen specific binding for the virtualized device (e.g. virtio)
+
+maintainers:
+  - Oleksandr Tyshchenko 
+
+select: true

Do we really need to support this property everywhere?


From my understanding - yes.

I think any device node describing a virtualized device in the guest 
device tree can have this property. Initially (in the RFC series) the 
"solution to restrict memory access using Xen grant mappings" was 
virtio-specific.


Although virtio support is the primary target of this series, we 
decided to generalize this work and expand it to any device [1]. So the 
Xen grant mappings scheme (which this property is used for) can 
theoretically be used for any device emulated by the Xen backend.




+
+description:
+  This binding indicates that Xen grant mappings scheme needs to be enabled
+  for that device and specifies the ID of Xen domain where the corresponding
+  device (backend) resides. This is needed for the option to restrict memory
+  access using Xen grant mappings to work.
+
+properties:
+  xen,dev-domid:
+$ref: /schemas/types.yaml#/definitions/uint32
+description:
+  The domid (domain ID) of the domain where the device (backend) is 
running.
+
+additionalProperties: true
+
+examples:
+  - |
+virtio_block@3000 {

virtio@3000


ok, will change





+compatible = "virtio,mmio";
+reg = <0x3000 0x100>;
+interrupts = <41>;
+
+/* The device is located in Xen domain with ID 1 */
+xen,dev-domid = <1>;

This fails validation:

Documentation/devicetree/bindings/arm/xen,dev-domid.example.dtb: 
virtio_block@3000: xen,dev-domid: [[1]] is not of type 'object'
 From schema: 
/home/rob/proj/git/linux-dt/Documentation/devicetree/bindings/virtio/mmio.yaml


Thank you for pointing this out. My fault, I hadn't properly checked 
this before. I think we need to remove the compatible = "virtio,mmio"; line here:



diff --git a/Documentation/devicetree/bindings/arm/xen,dev-domid.yaml 
b/Documentation/devicetree/bindings/arm/xen,dev-domid.yaml

index 2daa8aa..d2f2140 100644
--- a/Documentation/devicetree/bindings/arm/xen,dev-domid.yaml
+++ b/Documentation/devicetree/bindings/arm/xen,dev-domid.yaml
@@ -28,7 +28,7 @@ additionalProperties: true
 examples:
   - |
 virtio_block@3000 {
-    compatible = "virtio,mmio";
+    /* ... */
 reg = <0x3000 0x100>;
 interrupts = <41>;





The property has to be added to the virtio/mmio.yaml schema. If it is
not needed elsewhere, then *just* add the property there.


As I described above, the property is not virtio specific and can be 
used for any virtualized device for which Xen grant mappings scheme 
needs to be enabled (xen-grant DMA-mapping layer).



[1] 
https://lore.kernel.org/xen-devel/alpine.DEB.2.22.394.2204181202080.915916@ubuntu-linux-20-04-desktop/





Rob


--
Regards,

Oleksandr Tyshchenko




Re: fix and cleanup discard_alignment handling

2022-05-03 Thread Jens Axboe
On Mon, 18 Apr 2022 06:53:03 +0200, Christoph Hellwig wrote:
> the somewhat confusing name of the discard_alignment queue limit, that
> really is an offset for the discard granularity mislead a lot of driver
> authors to set it to an incorrect value.  This series tries to fix up
> all these cases.
> 
> Diffstat:
>  arch/um/drivers/ubd_kern.c |1 -
>  drivers/block/loop.c   |1 -
>  drivers/block/nbd.c|3 ---
>  drivers/block/null_blk/main.c  |1 -
>  drivers/block/rnbd/rnbd-srv-dev.h  |2 +-
>  drivers/block/virtio_blk.c |7 ---
>  drivers/block/xen-blkback/xenbus.c |4 ++--
>  drivers/md/dm-zoned-target.c   |2 +-
>  drivers/md/raid5.c |1 -
>  drivers/nvme/host/core.c   |1 -
>  drivers/s390/block/dasd_fba.c  |1 -
>  11 files changed, 8 insertions(+), 16 deletions(-)
> 
> [...]

Applied, thanks!

[01/11] ubd: don't set the discard_alignment queue limit
commit: 07c6e92a8478770a7302f7dde72f03a5465901bd
[02/11] nbd: don't set the discard_alignment queue limit
commit: 4a04d517c56e0616c6f69afc226ee2691e543712
[03/11] null_blk: don't set the discard_alignment queue limit
commit: fb749a87f4536d2fa86ea135ae4eff1072903438
[04/11] virtio_blk: fix the discard_granularity and discard_alignment queue limits
commit: 62952cc5bccd89b76d710de1d0b43244af0f2903
[05/11] dm-zoned: don't set the discard_alignment queue limit
commit: 44d583702f4429763c558624fac763650a1f05bf
[06/11] raid5: don't set the discard_alignment queue limit
commit: 3d50d368c92ade2f98a3d0d28b842a57c35284e9
[07/11] dasd: don't set the discard_alignment queue limit
commit: c3f765299632727fa5ea5a0acf118665227a4f1a
[08/11] loop: remove a spurious clear of discard_alignment
commit: 4418bfd8fb9602d9cd8747c3ad52fdbaa02e2ffd
[09/11] nvme: remove a spurious clear of discard_alignment
commit: 4e7f0ece41e1be8f876f320a0972a715daec0a50
[10/11] rnbd-srv: use bdev_discard_alignment
commit: 18292faa89d2bff3bdd33ab9c065f45fb6710e47
[11/11] xen-blkback: use bdev_discard_alignment
commit: c899b23533866910c90ef4386b501af50270d320

Best regards,
-- 
Jens Axboe





Re: [PATCH v4 02/21] IOMMU: simplify unmap-on-error in iommu_map()

2022-05-03 Thread Roger Pau Monné
On Tue, May 03, 2022 at 04:37:29PM +0200, Jan Beulich wrote:
> On 03.05.2022 12:25, Roger Pau Monné wrote:
> > On Mon, Apr 25, 2022 at 10:32:10AM +0200, Jan Beulich wrote:
> >> As of 68a8aa5d7264 ("iommu: make map and unmap take a page count,
> >> similar to flush") there's no need anymore to have a loop here.
> >>
> >> Suggested-by: Roger Pau Monné 
> >> Signed-off-by: Jan Beulich 
> > 
> > Reviewed-by: Roger Pau Monné 
> 
> Thanks.
> 
> > I wonder whether we should have a macro to ignore returns from
> > __must_check attributed functions.  Ie:
> > 
> > #define IGNORE_RETURN(exp) while ( exp ) break;
> > 
> > As to avoid confusion (and having to reason) whether the usage of
> > while is correct.  I always find it confusing to assert such loop
> > expressions are correct.
> 
> I've been considering some form of wrapper macro (not specifically
> the one you suggest), but I'm of two minds: On one hand I agree it
> would help readers, but otoh I fear it may make it more attractive
> to actually override the __must_check (which really ought to be an
> exception).

Well, I think anyone reviewing the code would realize that the error
is being ignored, and hence check that this is actually intended.

Thanks, Roger.
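
For reference, the wrapper idea discussed above can be tried outside of Xen with a small sketch like the one below. It assumes a GCC or Clang compiler, where an attribute such as warn_unused_result plays the role of __must_check; the DEMO_* names and demo_unmap() are hypothetical and exist only for this illustration. The macro body is the "while ( exp ) break;" form quoted above, wrapped in do { } while ( 0 ) so that it behaves as a single statement.

```c
/* Assumption: stand-in for __must_check on GCC/Clang. */
#define DEMO_MUST_CHECK __attribute__((__warn_unused_result__))

/*
 * Evaluate exp exactly once and discard the result. Using the value as a
 * loop condition counts as a "use", so no unused-result warning is raised.
 */
#define DEMO_IGNORE_RETURN(exp) do { while ( exp ) break; } while ( 0 )

static int demo_calls;

/* Hypothetical function whose error return the caller chooses to ignore. */
static DEMO_MUST_CHECK int demo_unmap(void)
{
    demo_calls++;
    return -1; /* pretend the operation failed */
}

/* Error path that deliberately ignores the failure of demo_unmap(). */
static int demo_cleanup(void)
{
    DEMO_IGNORE_RETURN(demo_unmap());
    return demo_calls;
}
```

A plain (void) cast is not sufficient to silence this warning with GCC, which is why the loop form is used in the first place. Naming the macro makes the intent greppable, which is relevant to the concern about overrides becoming too attractive.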



Re: [PATCH v4 07/21] IOMMU/x86: support freeing of pagetables

2022-05-03 Thread Roger Pau Monné
On Mon, Apr 25, 2022 at 10:35:45AM +0200, Jan Beulich wrote:
> For vendor specific code to support superpages we need to be able to
> deal with a superpage mapping replacing an intermediate page table (or
> hierarchy thereof). Consequently an iommu_alloc_pgtable() counterpart is
> needed to free individual page tables while a domain is still alive.
> Since the freeing needs to be deferred until after a suitable IOTLB
> flush was performed, released page tables get queued for processing by a
> tasklet.
> 
> Signed-off-by: Jan Beulich 
> ---
> I was considering whether to use a softirq-tasklet instead. This would
> have the benefit of avoiding extra scheduling operations, but come with
> the risk of the freeing happening prematurely because of a
> process_pending_softirqs() somewhere.

I'm sorry again if I already raised this, I don't seem to find a
reference.

What about doing the freeing before resuming the guest execution in
guest vCPU context?

We already have a hook like this on HVM in hvm_do_resume() calling
vpci_process_pending().  I wonder whether we could have a similar hook
for PV and keep the pages to be freed in the vCPU instead of the pCPU.
This would have the benefit of being able to context switch the vCPU
in case the operation takes too long.

Not that the current approach is wrong, but doing it in the guest
resume path we could likely prevent guests doing heavy p2m
modifications from hogging CPU time.

> ---
> v4: Change type of iommu_queue_free_pgtable()'s 1st parameter. Re-base.
> v3: Call process_pending_softirqs() from free_queued_pgtables().
> 
> --- a/xen/arch/x86/include/asm/iommu.h
> +++ b/xen/arch/x86/include/asm/iommu.h
> @@ -147,6 +147,7 @@ void iommu_free_domid(domid_t domid, uns
>  int __must_check iommu_free_pgtables(struct domain *d);
>  struct domain_iommu;
>  struct page_info *__must_check iommu_alloc_pgtable(struct domain_iommu *hd);
> +void iommu_queue_free_pgtable(struct domain_iommu *hd, struct page_info *pg);
>  
>  #endif /* !__ARCH_X86_IOMMU_H__ */
>  /*
> --- a/xen/drivers/passthrough/x86/iommu.c
> +++ b/xen/drivers/passthrough/x86/iommu.c
> @@ -12,6 +12,7 @@
>   * this program; If not, see .
>   */
>  
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -550,6 +551,91 @@ struct page_info *iommu_alloc_pgtable(st
>  return pg;
>  }
>  
> +/*
> + * Intermediate page tables which get replaced by large pages may only be
> + * freed after a suitable IOTLB flush. Hence such pages get queued on a
> + * per-CPU list, with a per-CPU tasklet processing the list on the assumption
> + * that the necessary IOTLB flush will have occurred by the time tasklets get
> + * to run. (List and tasklet being per-CPU has the benefit of accesses not
> + * requiring any locking.)
> + */
> +static DEFINE_PER_CPU(struct page_list_head, free_pgt_list);
> +static DEFINE_PER_CPU(struct tasklet, free_pgt_tasklet);
> +
> +static void free_queued_pgtables(void *arg)
> +{
> +struct page_list_head *list = arg;
> +struct page_info *pg;
> +unsigned int done = 0;
> +

With the current logic I think it might be helpful to assert that the
list is not empty when we get here?

Given the operation requires a context switch we would like to avoid
such unless there's indeed pending work to do.

> +while ( (pg = page_list_remove_head(list)) )
> +{
> +free_domheap_page(pg);
> +
> +/* Granularity of checking somewhat arbitrary. */
> +if ( !(++done & 0x1ff) )
> + process_pending_softirqs();
> +}
> +}
> +
> +void iommu_queue_free_pgtable(struct domain_iommu *hd, struct page_info *pg)
> +{
> +unsigned int cpu = smp_processor_id();
> +
> +spin_lock(&hd->arch.pgtables.lock);
> +page_list_del(pg, &hd->arch.pgtables.list);
> +spin_unlock(&hd->arch.pgtables.lock);
> +
> +page_list_add_tail(pg, &per_cpu(free_pgt_list, cpu));
> +
> +tasklet_schedule(&per_cpu(free_pgt_tasklet, cpu));
> +}
> +
> +static int cf_check cpu_callback(
> +struct notifier_block *nfb, unsigned long action, void *hcpu)
> +{
> +unsigned int cpu = (unsigned long)hcpu;
> +struct page_list_head *list = &per_cpu(free_pgt_list, cpu);
> +struct tasklet *tasklet = &per_cpu(free_pgt_tasklet, cpu);
> +
> +switch ( action )
> +{
> +case CPU_DOWN_PREPARE:
> +tasklet_kill(tasklet);
> +break;
> +
> +case CPU_DEAD:
> +page_list_splice(list, &this_cpu(free_pgt_list));

I think you could check whether list is empty before queuing it?

Thanks, Roger.
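
As an aside on the "granularity of checking" in free_queued_pgtables(): the mask 0x1ff makes the check fire once every 0x200 (512) processed entries. A minimal standalone sketch of that pattern follows; demo_yield() is a hypothetical stand-in for process_pending_softirqs() that merely counts invocations, and demo_drain() is not part of the patch.

```c
/* Hypothetical stand-in for process_pending_softirqs(); counts calls only. */
static unsigned int demo_yields;

static void demo_yield(void)
{
    demo_yields++;
}

/*
 * Process `count` queued items, yielding once per 512 items processed.
 * Returns the number of items handled.
 */
static unsigned int demo_drain(unsigned int count)
{
    unsigned int done = 0;

    while ( count-- )
    {
        /* ... free one queued item here ... */

        /* The low nine bits of done are all zero on every 512th item. */
        if ( !(++done & 0x1ff) )
            demo_yield();
    }

    return done;
}
```

With this shape a queue shorter than 512 entries never yields at all, and longer queues bound the amount of work done between two checks.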



Re: [PATCH RFC] x86/lld: fix symbol map generation

2022-05-03 Thread Jan Beulich
On 03.05.2022 11:15, Roger Pau Monné wrote:
> On Tue, May 03, 2022 at 10:17:44AM +0200, Jan Beulich wrote:
>> On 02.05.2022 17:20, Roger Pau Monne wrote:
>>> The symbol map generation (and thus the debug info attached to Xen) is
>>> partially broken when using LLVM LD.  That's due to LLD converting
>>> almost all symbols from global to local in the last linking step, and
>>
>> I'm puzzled by "almost" - is there a pattern of which ones aren't
>> converted?
> 
> This is the list of the ones that aren't converted:
> 
> __x86_indirect_thunk_r11
> s3_resume
> start
> __image_base__
> __high_start
> wakeup_stack
> wakeup_stack_start
> handle_exception
> dom_crash_sync_extable
> common_interrupt
> __x86_indirect_thunk_rbx
> __x86_indirect_thunk_rcx
> __x86_indirect_thunk_rax
> __x86_indirect_thunk_rdx
> __x86_indirect_thunk_rbp
> __x86_indirect_thunk_rsi
> __x86_indirect_thunk_rdi
> __x86_indirect_thunk_r8
> __x86_indirect_thunk_r9
> __x86_indirect_thunk_r10
> __x86_indirect_thunk_r12
> __x86_indirect_thunk_r13
> __x86_indirect_thunk_r14
> __x86_indirect_thunk_r15
> 
> I assume there's some kind of pattern, but I haven't yet been able to
> spot where triggers the conversion from global to local in lld.

At least this looks to all be symbols defined in assembly files, which
don't have a C-visible declaration.

>>> Not applied to EFI, partially because I don't have an environment with
>>> LLD capable of generating the EFI binary.
>>>
>>> Obtaining the global symbol list could likely be a target on itself,
>>> if it is to be shared between the ELF and the EFI binary generation.
>>
>> If, as the last paragraph of the description is worded, you did this
>> just once (as a prereq), I could see this working.
> 
> Yes, my comment was about splitting the:
> 
> $(NM) -pa --format=bsd $< | awk '{ if($$2 == "T") print $$3}' \
>   > $(@D)/.$(@F).global-syms
> 
> rune into a separate $(TARGET)-syms.global-syms target or some such.
> Not sure it's really worth it.

Probably indeed only when splitting up the rule as a whole.

>>> --- a/xen/arch/x86/Makefile
>>> +++ b/xen/arch/x86/Makefile
>>> @@ -134,24 +134,34 @@ $(TARGET): $(TARGET)-syms $(efi-y) $(obj)/boot/mkelf32
>>>  CFLAGS-$(XEN_BUILD_EFI) += -DXEN_BUILD_EFI
>>>  
>>>  $(TARGET)-syms: $(objtree)/prelink.o $(obj)/xen.lds
>>> +   # Dump global text symbols before the linking step
>>> +   $(NM) -pa --format=bsd $< | awk '{ if($$2 == "T") print $$3}' \
>>> +   > $(@D)/.$(@F).global-syms
>>> $(LD) $(XEN_LDFLAGS) -T $(obj)/xen.lds -N $< $(build_id_linker) \
>>> -   $(objtree)/common/symbols-dummy.o -o $(@D)/.$(@F).0
>>> +   $(objtree)/common/symbols-dummy.o -o $(@D)/.$(@F).0.tmp
>>> +   # LLVM LD has converted global symbols into local ones as part of the
>>> +   # linking step, convert those back to global before using tools/symbols.
>>> +   $(OBJCOPY) --globalize-symbols=$(@D)/.$(@F).global-syms \
>>> +   $(@D)/.$(@F).0.tmp $(@D)/.$(@F).0
>>> $(NM) -pa --format=sysv $(@D)/.$(@F).0 \
>>> | $(objtree)/tools/symbols $(all_symbols) --sysv --sort \
>>> >$(@D)/.$(@F).0.S
>>> $(MAKE) $(build)=$(@D) $(@D)/.$(@F).0.o
>>> $(LD) $(XEN_LDFLAGS) -T $(obj)/xen.lds -N $< $(build_id_linker) \
>>> -   $(@D)/.$(@F).0.o -o $(@D)/.$(@F).1
>>> +   $(@D)/.$(@F).0.o -o $(@D)/.$(@F).1.tmp
>>> +   $(OBJCOPY) --globalize-symbols=$(@D)/.$(@F).global-syms \
>>> +   $(@D)/.$(@F).1.tmp $(@D)/.$(@F).1
>>> $(NM) -pa --format=sysv $(@D)/.$(@F).1 \
>>> | $(objtree)/tools/symbols $(all_symbols) --sysv --sort $(syms-warn-dup-y) \
>>> >$(@D)/.$(@F).1.S
>>> $(MAKE) $(build)=$(@D) $(@D)/.$(@F).1.o
>>> $(LD) $(XEN_LDFLAGS) -T $(obj)/xen.lds -N $< $(build_id_linker) \
>>> -   $(orphan-handling-y) $(@D)/.$(@F).1.o -o $@
>>> +   $(orphan-handling-y) $(@D)/.$(@F).1.o -o $@.tmp
>>> +   $(OBJCOPY) --globalize-symbols=$(@D)/.$(@F).global-syms $@.tmp $@
>>
>> Is this very useful? It only affects ...
>>
>>> $(NM) -pa --format=sysv $(@D)/$(@F) \
>>> | $(objtree)/tools/symbols --all-symbols --xensyms --sysv --sort \
>>> >$(@D)/$(@F).map
>>
>> ... the actual map file; what's in the binary and in this map file doesn't
>> depend on local vs global anymore (and you limit this to text symbols
>> anyway; I wonder in how far livepatching might also be affected by the
>> same issue with data symbols).
> 
> If I don't add this step then the map file will also end up with lines
> like:
> 
> 0x82d0405b6968 b lib/xxhash64.c#iommuv2_enabled
> 0x82d0405b6970 b lib/xxhash64.c#nr_ioapic_sbdf
> 0x82d0405b6980 b lib/xxhash64.c#ioapic_sbdf
> 
> I see the same happen with other non-text symbols, so I would likely
> need to extend the fixing to preserve all global symbols from the
> input file, not just text ones.

Oh, I see - yes, this wants avoiding.

Jan




[ovmf test] 170050: regressions - FAIL

2022-05-03 Thread osstest service owner
flight 170050 ovmf real [real]
http://logs.test-lab.xenproject.org/osstest/logs/170050/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-amd64-xsm               6 xen-build                fail REGR. vs. 168254
 build-amd64                   6 xen-build                fail REGR. vs. 168254
 build-i386                    6 xen-build                fail REGR. vs. 168254
 build-i386-xsm                6 xen-build                fail REGR. vs. 168254

Tests which did not succeed, but are not blocking:
 build-amd64-libvirt                   1 build-check(1)   blocked  n/a
 build-i386-libvirt                    1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemuu-ovmf-amd64  1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-qemuu-ovmf-amd64   1 build-check(1)   blocked  n/a

version targeted for testing:
 ovmf 101f4c789221716585b972f2c2a22a85c078ef1d
baseline version:
 ovmf b1b89f9009f2390652e0061bd7b24fc40732bc70

Last test of basis   168254  2022-02-28 10:41:46 Z   64 days
Failing since        168258  2022-03-01 01:55:31 Z   63 days  783 attempts
Testing same since   170038  2022-05-03 10:12:47 Z    0 days    7 attempts


People who touched revisions under test:
  Abdul Lateef Attar 
  Abdul Lateef Attar via groups.io 
  Abner Chang 
  Akihiko Odaki 
  Anthony PERARD 
  Bo Chang Ke 
  Bob Feng 
  Chen Lin Z 
  Chen, Lin Z 
  Corvin Köhne 
  Dandan Bi 
  Dun Tan 
  Feng, Bob C 
  Gerd Hoffmann 
  Guo Dong 
  Guomin Jiang 
  Hao A Wu 
  Heng Luo 
  Hua Ma 
  Huang, Li-Xia 
  Jagadeesh Ujja 
  Jake Garver 
  Jake Garver via groups.io 
  Jason 
  Jason Lou 
  Jiewen Yao 
  Ke, Bo-ChangX 
  Ken Lautner 
  Kenneth Lautner 
  Kuo, Ted 
  Laszlo Ersek 
  Lean Sheng Tan 
  Leif Lindholm 
  Li, Yi1 
  Li, Zhihao 
  Liming Gao 
  Liu 
  Liu Yun 
  Liu Yun Y 
  Lixia Huang 
  Lou, Yun 
  Ma, Hua 
  Mara Sophie Grosch 
  Mara Sophie Grosch via groups.io 
  Matt DeVillier 
  Michael D Kinney 
  Michael Kubacki 
  Michael Kubacki 
  Min Xu 
  Oliver Steffen 
  Patrick Rudolph 
  Peter Grehan 
  Purna Chandra Rao Bandaru 
  Ray Ni 
  Rebecca Cran 
  Rebecca Cran 
  Sami Mujawar 
  Sean Rhodes 
  Sean Rhodes sean@starlabs.systems
  Sebastien Boeuf 
  Sunny Wang 
  Tan, Dun 
  Ted Kuo 
  Wenyi Xie 
  wenyi,xie via groups.io 
  Xiaolu.Jiang 
  Xie, Yuanhao 
  Yi Li 
  yi1 li 
  Yuanhao Xie 
  Zhihao Li 

jobs:
 build-amd64-xsm                       fail
 build-i386-xsm                        fail
 build-amd64                           fail
 build-i386                            fail
 build-amd64-libvirt                   blocked
 build-i386-libvirt                    blocked
 build-amd64-pvops                     pass
 build-i386-pvops                      pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64  blocked
 test-amd64-i386-xl-qemuu-ovmf-amd64   blocked



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Not pushing.

(No revision log; it would be 5915 lines long.)



Re: [PATCH] arm/its: enable LPIs before mapping the collection table

2022-05-03 Thread Julien Grall

Hi Rahul,

On 27/04/2022 17:14, Rahul Singh wrote:

MAPC_LPI_OFF ITS command error can be reported to software if LPIs are
not enabled before mapping the collection table using MAPC command.

Enable the LPIs using GICR_CTLR.EnableLPIs before mapping the collection
table.

Signed-off-by: Rahul Singh 
---
  xen/arch/arm/gic-v3.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
index 3c472ed768..8fb0014b16 100644
--- a/xen/arch/arm/gic-v3.c
+++ b/xen/arch/arm/gic-v3.c
@@ -812,11 +812,11 @@ static int gicv3_cpu_init(void)
  /* If the host has any ITSes, enable LPIs now. */
  if ( gicv3_its_host_has_its() )
  {
+if ( !gicv3_enable_lpis() )
+return -EBUSY;


gicv3_enable_lpis() is using writel_relaxed(). So in theory, the write 
may not be visible before gicv3_its_setup_collection() sends the command.


So I think we need to add an smp_wmb() to ensure the ordering, with a 
comment explaining why this is necessary.


Cheers,

--
Julien Grall



Re: [PATCH] arm/its: enable LPIs before mapping the collection table

2022-05-03 Thread Julien Grall




On 28/04/2022 15:11, Rahul Singh wrote:

Hi Julien,


Hi Rahul,


On 28 Apr 2022, at 1:59 pm, Julien Grall  wrote:



On 28/04/2022 11:00, Rahul Singh wrote:

Hi Julien,

On 27 Apr 2022, at 6:59 pm, Julien Grall  wrote:

Hi Rahul,

On 27/04/2022 17:14, Rahul Singh wrote:

MAPC_LPI_OFF ITS command error can be reported to software if LPIs are


Looking at the spec (ARM IHI 0069H), I can't find a command error named 
MAPC_LPI_OFF. Is it something specific to your HW?

I found the issue on HW that implements GIC-600, and the GIC-600 TRM specifies the 
MAPC_LPI_OFF ITS command error.
https://developer.arm.com/documentation/100336/0106/introduction/about-the-gic-600
{Table 3-15 ITS command and translation errors, records 13+ page 3-89}


Please provide a pointer to the spec in the commit message. This would help the 
reviewer to know where MAPC_LPI_OFF comes from.

Ok.





not enabled before mapping the collection table using MAPC command.
Enable the LPIs using GICR_CTLR.EnableLPIs before mapping the collection
table.
Signed-off-by: Rahul Singh 
---
xen/arch/arm/gic-v3.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
index 3c472ed768..8fb0014b16 100644
--- a/xen/arch/arm/gic-v3.c
+++ b/xen/arch/arm/gic-v3.c
@@ -812,11 +812,11 @@ static int gicv3_cpu_init(void)
/* If the host has any ITSes, enable LPIs now. */
if ( gicv3_its_host_has_its() )
{
+ if ( !gicv3_enable_lpis() )
+ return -EBUSY;
ret = gicv3_its_setup_collection(smp_processor_id());
if ( ret )
return ret;
- if ( !gicv3_enable_lpis() )
- return -EBUSY;


AFAICT, Linux is using the same ordering as you are proposing. It seems to 
have been introduced from the start, so it is not clear why we chose this 
approach.

Yes I also confirmed that before sending the patch for review. I think this is 
okay if we enable the enable LPIs before mapping the collection table.


In general, I expect changes touching the GICv3 code to be based on the specification 
rather than "we think this is okay". This reduces the risk of making modifications that 
could break other platforms (we can't possibly test all of them).

Reading through the spec, the definition of GICR.EnableLPIs contains the 
following:

"
0b0 LPI support is disabled. Any doorbell interrupt generated as a result of a 
write to a virtual LPI register must be discarded, and any ITS translation 
requests or commands involving LPIs in this Redistributor are ignored.

0b1 LPI support is enabled.
"

So your change is correct. But the commit message needs to be updated with more 
details on which GIC HW the issue was seen and why your proposal is correct 
(i.e. quoting the spec).


Ok. I will modify the commit msg as below. Please let me know if it is okay.

arm/its: enable LPIs before mapping the collection table

When Xen boots on the platform that implements the GIC 600, ITS
MAPC_LPI_OFF uncorrectable command error issue is oberved.


s/oberved/observed/



As per the GIC-600 TRM (Revision: r1p6) MAPC_LPI_OFF command error can
be reported if the ITS MAPC command has tried to map a collection to a core
that does not have LPIs enabled.


Please add a quote from the GICv3 specification (see my previous reply).



To fix this issue, enable the LPIs using GICR_CTLR.EnableLPIs before
mapping the collection table.


Cheers,

--
Julien Grall



Re: [PATCH] xen/arm: smmuv1: remove iommu group when deassign a device

2022-05-03 Thread Julien Grall




On 29/04/2022 15:33, Rahul Singh wrote:

Hi Julien,


Hi Rahul,


On 27 Apr 2022, at 6:42 pm, Julien Grall  wrote:

Hi,

On 27/04/2022 17:15, Rahul Singh wrote:

When a device is deassigned from the domain it is required to remove the
iommu group.


This reads wrong to me. We should not need to re-create the IOMMU group (and
call arm_smmu_add_device()) every time a device is re-assigned.

Ok.



If we don't remove the group, the next time we assign
a device, SME and S2CR will not be set up correctly for the device,
and as a result an SMMU fault will be observed.


I think this is a bug fix for 0435784cc75dcfef3b5f59c29deb1dbb84265ddb. If so, 
please add a Fixes tag.


Ok, let me add the Fixes tag in the next version.



Signed-off-by: Rahul Singh 
---
xen/drivers/passthrough/arm/smmu.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/xen/drivers/passthrough/arm/smmu.c 
b/xen/drivers/passthrough/arm/smmu.c
index 5cacb2dd99..9a31c332d0 100644
--- a/xen/drivers/passthrough/arm/smmu.c
+++ b/xen/drivers/passthrough/arm/smmu.c
@@ -1690,6 +1690,8 @@ static void arm_smmu_detach_dev(struct iommu_domain 
*domain, struct device *dev)
if (cfg)
arm_smmu_master_free_smes(cfg);
+   iommu_group_put(dev_iommu_group(dev));
+   dev_iommu_group(dev) = NULL;
}


The goal of arm_smmu_detach_dev() is to revert the change made in 
arm_smmu_attach_dev(). But looking at the code, neither the IOMMU group nor the 
smes are allocated in arm_smmu_attach_dev().

Are the SMES meant to be re-allocated every time we assign to a different
domain? If yes, the allocation should be done in arm_smmu_attach_dev().


Yes SMES have to be re-allocated every time a device is assigned.


Hmm... Looking at the code, arm_smmu_alloc_smes() doesn't seem to use
the domain information. So why would it need to be done every time it is
assigned?




Would it be okay if I move the call to arm_smmu_master_alloc_smes() from
arm_smmu_add_device() to arm_smmu_attach_dev()?
In this case we don’t need to remove the IOMMU group, and
arm_smmu_detach_dev() will revert the change made in
arm_smmu_attach_dev().

diff --git a/xen/drivers/passthrough/arm/smmu.c 
b/xen/drivers/passthrough/arm/smmu.c
index 5cacb2dd99..ff1b73d3d8 100644
--- a/xen/drivers/passthrough/arm/smmu.c
+++ b/xen/drivers/passthrough/arm/smmu.c
@@ -1680,6 +1680,10 @@ static int arm_smmu_attach_dev(struct iommu_domain 
*domain, struct device *dev)
 if (!cfg)
 return -ENODEV;
  
+   ret = arm_smmu_master_alloc_smes(dev);

+   if (ret)
+   return ret;
+
 return arm_smmu_domain_add_master(smmu_domain, cfg);


If we go down this route, then you will likely need to revert the change 
made by arm_smmu_master_alloc_smes().



  }
  
@@ -2075,7 +2079,7 @@ static int arm_smmu_add_device(struct device *dev)

 iommu_group_add_device(group, dev);
 iommu_group_put(group);
  
-   return arm_smmu_master_alloc_smes(dev);

+   return 0;
  }

Regards,
Rahul


If not, then we should not free the SMES here

IIUC, the SMES have to be re-allocated every time a device is assigned. 
Therefore, I think we should move the call to arm_smmu_master_alloc_smes() out 
of the detach callback and in a helper that would be used when removing a 
device (not yet supported by Xen).
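
For reference, the variant discussed in this thread in which SME allocation moves into the attach path can be modelled as below. This is a hedged sketch, not the SMMU driver: struct mock_dev and the mock_* functions are hypothetical, and the only property shown is that detach undoes exactly what attach did while group membership, set up once at add time, survives re-assignment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-device state; not the driver's real bookkeeping. */
struct mock_dev {
    bool in_group;   /* set by add_device, survives re-assignment */
    bool smes_valid; /* stream mapping entries currently programmed */
    int  domain;     /* -1 when not attached to any domain */
};

/* Done once per device: join the IOMMU group, nothing else. */
void mock_add_device(struct mock_dev *d)
{
    assert(d != NULL);
    d->in_group = true;
    d->smes_valid = false;
    d->domain = -1;
}

/* Attach allocates the SMEs, so that detach has something to revert. */
int mock_attach(struct mock_dev *d, int domain)
{
    assert(d != NULL);
    if ( !d->in_group || d->domain != -1 )
        return -1;
    d->smes_valid = true;
    d->domain = domain;
    return 0;
}

/* Detach reverts only what attach did; the group is left alone. */
void mock_detach(struct mock_dev *d)
{
    assert(d != NULL);
    d->smes_valid = false;
    d->domain = -1;
}
```

With this split, an attach / detach / attach sequence works without calling mock_add_device() again, which is the behaviour the review asks for.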

Cheers,

--
Julien Grall




--
Julien Grall



Re: x86/PV: (lack of) MTRR exposure

2022-05-03 Thread Juergen Gross

On 28.04.22 17:53, Jan Beulich wrote:

Hello,

in the course of analyzing the i915 driver causing boot to fail in
Linux 5.18 I found that Linux, for all the years, has been running
in PV mode as if PAT was (mostly) disabled. This is a result of
them tying PAT initialization to MTRR initialization, while we
offer PAT but not MTRR in CPUID output. This was different before
our moving to CPU featuresets, and as such one could view this
behavior as a regression from that change.

The first question here is whether not exposing MTRR as a feature
was really intended, in particular also for PV Dom0. The XenoLinux
kernel and its forward ports did make use of XENPF_*_memtype to
deal with MTRRs. That's functionality which (maybe for a good
reason) never made it into the pvops kernel. Note that PVH Dom0
does have access to the original settings, as the host values are
used as initial state there.

The next question would be how we could go about improving the
situation. For the particular issue in 5.18 I've found a relatively
simple solution [1] (which also looks to help graphics performance
on other systems, according to my initial observations with using
the change), albeit its simplicity likely means it either is wrong
in some way, or might not be liked for looking hacky and/or abusive.
We can't, for example, simply turn on the MTRR bit in CPUID, as that
would implicitly lead to the kernel trying to write the PAT MSR (if,
see below, it didn't itself zap the bit). We also can't simply
ignore PAT MSR writes, as the kernel won't check whether writes
actually took effect. (All of that did work on top of old Xen
apparently only because xen_init_capabilities() itself also forces
the MTRR feature to off.)


I've sent an alternative patch addressing this problem:

https://lore.kernel.org/lkml/20220503132207.17234-3-jgr...@suse.com/T/#u

Let's see whether it is accepted.


Juergen



Re: [PATCH] arm/acpi: don't expose the ACPI IORT SMMUv3 entry to dom0

2022-05-03 Thread Julien Grall

On 29/04/2022 19:18, Rahul Singh wrote:

Hi Julien,


Hi Rahul,


On 27 Apr 2022, at 7:26 pm, Julien Grall  wrote:

Hi Rahul,

On 27/04/2022 17:12, Rahul Singh wrote:

Xen should control the SMMUv3 devices therefore, don't expose the
SMMUv3 devices to dom0. Deny iomem access to SMMUv3 address space for
dom0 and also make ACPI IORT SMMUv3 node type to 0xff.


Looking at the IORT spec (ARM DEN 0049E), 255 (0xff) is marked as reserved. So I don't
think we can "allocate" 0xff to mean invalid without updating the spec. Did you
engage with whoever owns the spec?


Yes I agree with you, 0xff is reserved for future use. I didn’t find any other
value to make the node invalid.
The Linux kernel mostly uses node->type to process the SMMUv3 or other IORT
nodes, so I thought this was the only possible solution to hide the SMMUv3 from dom0.
If you have any other suggestion to hide the SMMUv3 node I am okay to use that.
The other solution is to remove completely the SMMUv3 node from the 
IORT. This would require more work as you would need to fully rewrite 
the IORT.


Hence why I suggested to speak with the spec owner (it seems to be Arm) 
to reserve 0xff as "Invalid/Ignore".



+ smmu = (struct acpi_iort_smmu_v3 *)node->node_data;
+ mfn = paddr_to_pfn(smmu->base_address);
+ rc = iomem_deny_access(d, mfn, mfn + PFN_UP(SZ_128K));
+ if ( rc )
+ printk("iomem_deny_access failed for SMMUv3\n");
+ node->type = 0xff;


'node' points to the Xen copy of the ACPI table. We should really not touch 
this copy. Instead, we should modify the version that will be used by dom0.


As of now IORT is untouched by Xen and mapped to dom0. I will create the IORT 
table for dom0 and modify the node SMMUv3 that will be used by dom0.


Furthermore, if we go down the road to update node->type, we should 0 the node 
to avoid leaking the information to dom0.


I am not sure if we can zero the node, let me check and come back to you.


By writing node->type, you already invalidate the content because the
software cannot know how to interpret it. At which point, zeroing it
should make no difference for software parsing the table afterwards.
This may be a problem for software parsing beforehand and keeping a
pointer to the entry. But then, this is yet another reason to not update
the host IORT and create a copy for dom0.
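
The "copy for dom0, leave the host table alone" approach can be sketched as follows. This is an illustration under assumptions rather than Xen code: struct mock_node is a simplified, hypothetical stand-in for an IORT node (the real layout has more header fields and variable-length nodes), and 0xff is the reserved value discussed above, not one the spec currently assigns.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MOCK_NODE_SMMU_V3 0x04 /* IORT node type for SMMUv3 */
#define MOCK_NODE_HIDDEN  0xff /* reserved value proposed in the thread */

/* Simplified, hypothetical node; the real IORT node header has more fields. */
struct mock_node {
    uint8_t  type;
    uint16_t length;   /* total node size in bytes, header included */
    uint8_t  data[12]; /* stand-in for the node-specific payload */
};

/*
 * Returns a heap-allocated copy of the table for dom0 with every SMMUv3
 * node hidden, or NULL on allocation failure. The host array is only read.
 */
struct mock_node *mock_build_dom0_copy(const struct mock_node *host, size_t nr)
{
    struct mock_node *copy;
    size_t i;

    assert(host != NULL);
    copy = malloc(nr * sizeof(*copy));
    if ( !copy )
        return NULL;
    memcpy(copy, host, nr * sizeof(*copy));

    for ( i = 0; i < nr; i++ )
    {
        if ( copy[i].type != MOCK_NODE_SMMU_V3 )
            continue;
        /* Zero the payload so nothing about the SMMU leaks to dom0. */
        memset(copy[i].data, 0, sizeof(copy[i].data));
        /* Keep the length so a parser can still skip over the node. */
        copy[i].type = MOCK_NODE_HIDDEN;
    }
    return copy;
}
```

Because only the copy is patched, any code that parsed the host table earlier and kept a pointer to an entry still sees the original contents.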


Cheers,

--
Julien Grall



[ovmf test] 170049: regressions - FAIL

2022-05-03 Thread osstest service owner
flight 170049 ovmf real [real]
http://logs.test-lab.xenproject.org/osstest/logs/170049/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-amd64-xsm               6 xen-build                fail REGR. vs. 168254
 build-amd64                   6 xen-build                fail REGR. vs. 168254
 build-i386                    6 xen-build                fail REGR. vs. 168254
 build-i386-xsm                6 xen-build                fail REGR. vs. 168254

Tests which did not succeed, but are not blocking:
 build-amd64-libvirt                   1 build-check(1)  blocked  n/a
 build-i386-libvirt                    1 build-check(1)  blocked  n/a
 test-amd64-amd64-xl-qemuu-ovmf-amd64  1 build-check(1)  blocked  n/a
 test-amd64-i386-xl-qemuu-ovmf-amd64   1 build-check(1)  blocked  n/a

version targeted for testing:
 ovmf 101f4c789221716585b972f2c2a22a85c078ef1d
baseline version:
 ovmf b1b89f9009f2390652e0061bd7b24fc40732bc70

Last test of basis   168254  2022-02-28 10:41:46 Z   64 days
Failing since        168258  2022-03-01 01:55:31 Z   63 days  782 attempts
Testing same since   170038  2022-05-03 10:12:47 Z    0 days    6 attempts


People who touched revisions under test:
  Abdul Lateef Attar 
  Abdul Lateef Attar via groups.io 
  Abner Chang 
  Akihiko Odaki 
  Anthony PERARD 
  Bo Chang Ke 
  Bob Feng 
  Chen Lin Z 
  Chen, Lin Z 
  Corvin Köhne 
  Dandan Bi 
  Dun Tan 
  Feng, Bob C 
  Gerd Hoffmann 
  Guo Dong 
  Guomin Jiang 
  Hao A Wu 
  Heng Luo 
  Hua Ma 
  Huang, Li-Xia 
  Jagadeesh Ujja 
  Jake Garver 
  Jake Garver via groups.io 
  Jason 
  Jason Lou 
  Jiewen Yao 
  Ke, Bo-ChangX 
  Ken Lautner 
  Kenneth Lautner 
  Kuo, Ted 
  Laszlo Ersek 
  Lean Sheng Tan 
  Leif Lindholm 
  Li, Yi1 
  Li, Zhihao 
  Liming Gao 
  Liu 
  Liu Yun 
  Liu Yun Y 
  Lixia Huang 
  Lou, Yun 
  Ma, Hua 
  Mara Sophie Grosch 
  Mara Sophie Grosch via groups.io 
  Matt DeVillier 
  Michael D Kinney 
  Michael Kubacki 
  Michael Kubacki 
  Min Xu 
  Oliver Steffen 
  Patrick Rudolph 
  Peter Grehan 
  Purna Chandra Rao Bandaru 
  Ray Ni 
  Rebecca Cran 
  Rebecca Cran 
  Sami Mujawar 
  Sean Rhodes 
  Sean Rhodes sean@starlabs.systems
  Sebastien Boeuf 
  Sunny Wang 
  Tan, Dun 
  Ted Kuo 
  Wenyi Xie 
  wenyi,xie via groups.io 
  Xiaolu.Jiang 
  Xie, Yuanhao 
  Yi Li 
  yi1 li 
  Yuanhao Xie 
  Zhihao Li 

jobs:
 build-amd64-xsm                       fail
 build-i386-xsm                        fail
 build-amd64                           fail
 build-i386                            fail
 build-amd64-libvirt                   blocked
 build-i386-libvirt                    blocked
 build-amd64-pvops                     pass
 build-i386-pvops                      pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64  blocked
 test-amd64-i386-xl-qemuu-ovmf-amd64   blocked



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Not pushing.

(No revision log; it would be 5915 lines long.)



Re: [PATCH v4 05/21] IOMMU/x86: restrict IO-APIC mappings for PV Dom0

2022-05-03 Thread Jan Beulich
On 03.05.2022 15:00, Roger Pau Monné wrote:
> On Mon, Apr 25, 2022 at 10:34:23AM +0200, Jan Beulich wrote:
>> While already the case for PVH, there's no reason to treat PV
>> differently here, though of course the addresses get taken from another
>> source in this case. Except that, to match CPU side mappings, by default
>> we permit r/o ones. This then also means we now deal consistently with
>> IO-APICs whose MMIO is or is not covered by E820 reserved regions.
>>
>> Signed-off-by: Jan Beulich 
>> ---
>> [integrated] v1: Integrate into series.
>> [standalone] v2: Keep IOMMU mappings in sync with CPU ones.
>>
>> --- a/xen/drivers/passthrough/x86/iommu.c
>> +++ b/xen/drivers/passthrough/x86/iommu.c
>> @@ -275,12 +275,12 @@ void iommu_identity_map_teardown(struct
>>  }
>>  }
>>  
>> -static bool __hwdom_init hwdom_iommu_map(const struct domain *d,
>> - unsigned long pfn,
>> - unsigned long max_pfn)
>> +static unsigned int __hwdom_init hwdom_iommu_map(const struct domain *d,
>> + unsigned long pfn,
>> + unsigned long max_pfn)
>>  {
>>  mfn_t mfn = _mfn(pfn);
>> -unsigned int i, type;
>> +unsigned int i, type, perms = IOMMUF_readable | IOMMUF_writable;
>>  
>>  /*
>>   * Set up 1:1 mapping for dom0. Default to include only conventional RAM
>> @@ -289,44 +289,60 @@ static bool __hwdom_init hwdom_iommu_map
>>   * that fall in unusable ranges for PV Dom0.
>>   */
>>  if ( (pfn > max_pfn && !mfn_valid(mfn)) || xen_in_range(pfn) )
>> -return false;
>> +return 0;
>>  
>>  switch ( type = page_get_ram_type(mfn) )
>>  {
>>  case RAM_TYPE_UNUSABLE:
>> -return false;
>> +return 0;
>>  
>>  case RAM_TYPE_CONVENTIONAL:
>>  if ( iommu_hwdom_strict )
>> -return false;
>> +return 0;
>>  break;
>>  
>>  default:
>>  if ( type & RAM_TYPE_RESERVED )
>>  {
>>  if ( !iommu_hwdom_inclusive && !iommu_hwdom_reserved )
>> -return false;
>> +perms = 0;
>>  }
>> -else if ( is_hvm_domain(d) || !iommu_hwdom_inclusive || pfn > max_pfn )
>> -return false;
>> +else if ( is_hvm_domain(d) )
>> +return 0;
>> +else if ( !iommu_hwdom_inclusive || pfn > max_pfn )
>> +perms = 0;
>>  }
>>  
>>  /* Check that it doesn't overlap with the Interrupt Address Range. */
>>  if ( pfn >= 0xfee00 && pfn <= 0xfeeff )
>> -return false;
>> +return 0;
>>  /* ... or the IO-APIC */
>> -for ( i = 0; has_vioapic(d) && i < d->arch.hvm.nr_vioapics; i++ )
>> -if ( pfn == PFN_DOWN(domain_vioapic(d, i)->base_address) )
>> -return false;
>> +if ( has_vioapic(d) )
>> +{
>> +for ( i = 0; i < d->arch.hvm.nr_vioapics; i++ )
>> +if ( pfn == PFN_DOWN(domain_vioapic(d, i)->base_address) )
>> +return 0;
>> +}
>> +else if ( is_pv_domain(d) )
>> +{
>> +/*
>> + * Be consistent with CPU mappings: Dom0 is permitted to establish r/o
>> + * ones there, so it should also have such established for IOMMUs.
>> + */
>> +for ( i = 0; i < nr_ioapics; i++ )
>> +if ( pfn == PFN_DOWN(mp_ioapics[i].mpc_apicaddr) )
>> +return rangeset_contains_singleton(mmio_ro_ranges, pfn)
>> +   ? IOMMUF_readable : 0;
> 
> If we really are after consistency with CPU side mappings, we should
> likely take the whole contents of mmio_ro_ranges and d->iomem_caps
> into account, not just the pages belonging to the IO-APIC?
> 
> There could also be HPET pages mapped as RO for PV.

Hmm. This would be a yet bigger functional change, but indeed would further
improve consistency. But shouldn't we then also establish r/w mappings for
stuff in ->iomem_caps but not in mmio_ro_ranges? This would feel like going
too far ...
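
To make the return-value change in the quoted hunk easier to follow, here is a stand-alone sketch of how a permission mask (rather than a bool) can be derived for a PV Dom0 page. It is a model under assumptions: MOCK_READABLE and MOCK_WRITABLE stand in for IOMMUF_readable and IOMMUF_writable, the plain arrays stand in for mp_ioapics[] and mmio_ro_ranges, and none of the other checks in hwdom_iommu_map() are reproduced.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MOCK_READABLE (1u << 0) /* hypothetical stand-in for IOMMUF_readable */
#define MOCK_WRITABLE (1u << 1) /* hypothetical stand-in for IOMMUF_writable */

/* Linear search: is pfn present in the list? */
bool mock_contains(const unsigned long *list, size_t nr, unsigned long pfn)
{
    size_t i;

    for ( i = 0; i < nr; i++ )
        if ( list[i] == pfn )
            return true;
    return false;
}

/*
 * Returns the IOMMU permissions for pfn: IO-APIC pages are mapped r/o only
 * if the CPU side also permits r/o mappings of them, otherwise not at all;
 * every other page gets the default r/w permissions in this model.
 */
unsigned int mock_pv_dom0_perms(unsigned long pfn,
                                const unsigned long *ioapic_pfns,
                                size_t nr_ioapic,
                                const unsigned long *ro_pfns,
                                size_t nr_ro)
{
    assert(ioapic_pfns != NULL && ro_pfns != NULL);
    if ( mock_contains(ioapic_pfns, nr_ioapic, pfn) )
        return mock_contains(ro_pfns, nr_ro, pfn) ? MOCK_READABLE : 0;
    return MOCK_READABLE | MOCK_WRITABLE;
}
```

Extending consistency to everything in mmio_ro_ranges, as suggested in the review, would amount to consulting the read-only list for every pfn rather than only for IO-APIC pages.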

Jan




Re: [PATCH v4 06/21] IOMMU/x86: perform PV Dom0 mappings in batches

2022-05-03 Thread Roger Pau Monné
On Mon, Apr 25, 2022 at 10:34:59AM +0200, Jan Beulich wrote:
> For large page mappings to be easily usable (i.e. in particular without
> un-shattering of smaller page mappings) and for mapping operations to
> then also be more efficient, pass batches of Dom0 memory to iommu_map().
> In dom0_construct_pv() and its helpers (covering strict mode) this
> additionally requires establishing the type of those pages (albeit with
> zero type references).

I think it's possible I've already asked this.  Would it make sense to
add the IOMMU mappings in alloc_domheap_pages(), maybe by passing a
specific flag?

It would seem to me that doing it that way would also allow the
mappings to get established in blocks for domUs.

And be less error prone in having to match memory allocation with
iommu_memory_setup() calls in order for the pages to be added to the
IOMMU page tables.

> The earlier establishing of PGT_writable_page | PGT_validated requires
> the existing places where this gets done (through get_page_and_type())
> to be updated: For pages which actually have a mapping, the type
> refcount needs to be 1.
> 
> There is actually a related bug that gets fixed here as a side effect:
> Typically the last L1 table would get marked as such only after
> get_page_and_type(..., PGT_writable_page). While this is fine as far as
> refcounting goes, the page did remain mapped in the IOMMU in this case
> (when "iommu=dom0-strict").
> 
> Signed-off-by: Jan Beulich 
> ---
> Subsequently p2m_add_identity_entry() may want to also gain an order
> parameter, for arch_iommu_hwdom_init() to use. While this only affects
> non-RAM regions, systems typically have 2-16Mb of reserved space
> immediately below 4Gb, which hence could be mapped more efficiently.

Indeed.

> The installing of zero-ref writable types has in fact shown (observed
> while putting together the change) that despite the intention by the
> XSA-288 changes (affecting DomU-s only) for Dom0 a number of
> sufficiently ordinary pages (at the very least initrd and P2M ones as
> well as pages that are part of the initial allocation but not part of
> the initial mapping) still have been starting out as PGT_none, meaning
> that they would have gained IOMMU mappings only the first time these
> pages would get mapped writably. Consequently an open question is
> whether iommu_memory_setup() should set the pages to PGT_writable_page
> independent of need_iommu_pt_sync().

I think I'm confused, doesn't the setting of PGT_writable_page happen
as a result of need_iommu_pt_sync() and having those pages added to
the IOMMU page tables? (so they can be properly tracked and IOMMU
mappings are removed if the page is also removed)

If the pages are not added here (because dom0 is not running in strict
mode) then setting PGT_writable_page is not required?

> I didn't think I need to address the bug mentioned in the description in
> a separate (prereq) patch, but if others disagree I could certainly
> break out that part (needing to first use iommu_legacy_unmap() then).
> 
> Note that 4k P2M pages don't get (pre-)mapped in setup_pv_physmap():
> They'll end up mapped via the later get_page_and_type().
> 
> As to the way these refs get installed: I've chosen to avoid the more
> expensive {get,put}_page_and_type(), favoring to put in place the
> intended type directly. I guess I could be convinced to avoid this
> bypassing of the actual logic; I merely think it's unnecessarily
> expensive.

In a different piece of code I would have asked to avoid open-coding
the type changes.  But there are already open-coded type changes in
dom0_construct_pv(), so adding those doesn't make the current status
worse.

> Note also that strictly speaking the iommu_iotlb_flush_all() here (as
> well as the pre-existing one in arch_iommu_hwdom_init()) shouldn't be
> needed: Actual hooking up (AMD) or enabling of translation (VT-d)
> occurs only afterwards anyway, so nothing can have made it into TLBs
> just yet.

Hm, indeed. I think the one in arch_iommu_hwdom_init can surely go
away, as we must strictly do the hwdom init before enabling the iommu
itself.

I'm less convinced about the one in the dom0 build, just to be on the safe
side if we ever change the order of IOMMU init and memory setup.  I would
expect flushing an empty TLB to not be very expensive?

> --- a/xen/drivers/passthrough/x86/iommu.c
> +++ b/xen/drivers/passthrough/x86/iommu.c
> @@ -347,8 +347,8 @@ static unsigned int __hwdom_init hwdom_i
>  
>  void __hwdom_init arch_iommu_hwdom_init(struct domain *d)
>  {
> -unsigned long i, top, max_pfn;
> -unsigned int flush_flags = 0;
> +unsigned long i, top, max_pfn, start, count;
> +unsigned int flush_flags = 0, start_perms = 0;
>  
>  BUG_ON(!is_hardware_domain(d));
>  
> @@ -379,9 +379,9 @@ void __hwdom_init arch_iommu_hwdom_init(
>   * First Mb will get mapped in one go by pvh_populate_p2m(). Avoid
>   * setting up potentially conflicting mappings here.
>   */
> -i = 

Re: [PATCH v4 04/21] IOMMU: have iommu_{,un}map() split requests into largest possible chunks

2022-05-03 Thread Jan Beulich
On 03.05.2022 14:37, Roger Pau Monné wrote:
> On Mon, Apr 25, 2022 at 10:33:32AM +0200, Jan Beulich wrote:
>> --- a/xen/drivers/passthrough/iommu.c
>> +++ b/xen/drivers/passthrough/iommu.c
>> @@ -307,11 +338,10 @@ int iommu_map(struct domain *d, dfn_t df
>>  if ( !d->is_shutting_down && printk_ratelimit() )
>>  printk(XENLOG_ERR
>> "d%d: IOMMU mapping dfn %"PRI_dfn" to mfn %"PRI_mfn" failed: %d\n",
>> -   d->domain_id, dfn_x(dfn_add(dfn, i)),
>> -   mfn_x(mfn_add(mfn, i)), rc);
>> +   d->domain_id, dfn_x(dfn), mfn_x(mfn), rc);
> 
> Since you are already adjusting the line, I wouldn't mind if you also
> switched to use %pd at once (and in the same adjustment done to
> iommu_unmap).

I did consider doing so, but decided against since this would lead
to also touching the format string (which right now is unaltered).

>>  
>>  /* while statement to satisfy __must_check */
>> -while ( iommu_unmap(d, dfn, i, flush_flags) )
>> +while ( iommu_unmap(d, dfn0, i, flush_flags) )
> 
> To match previous behavior you likely need to use i + (1UL << order),
> so pages covered by the map_page call above are also taken care in the
> unmap request?

I'm afraid I don't follow: Prior behavior was to unmap only what
was mapped on earlier iterations. This continues to be that way.
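
The "unmap only what earlier iterations mapped" behaviour can be shown with a tiny model. This is a sketch, not iommu_map(): it maps single pages instead of order-sized chunks, and mock_mapped, mock_fail_at and the mock_* functions are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stdbool.h>

#define MOCK_NR_PAGES 16

bool mock_mapped[MOCK_NR_PAGES];            /* hypothetical page-table model */
unsigned long mock_fail_at = MOCK_NR_PAGES; /* dfn whose mapping fails */

/* Map one page; fails for out-of-range dfns and for mock_fail_at. */
int mock_map_page(unsigned long dfn)
{
    if ( dfn >= MOCK_NR_PAGES || dfn == mock_fail_at )
        return -1;
    mock_mapped[dfn] = true;
    return 0;
}

/* Unmap count pages starting at dfn0. */
void mock_unmap(unsigned long dfn0, unsigned long count)
{
    unsigned long i;

    for ( i = 0; i < count; i++ )
        mock_mapped[dfn0 + i] = false;
}

/* Map [dfn0, dfn0 + count); on error roll back the pages mapped so far. */
int mock_map(unsigned long dfn0, unsigned long count)
{
    unsigned long i;

    assert(dfn0 + count <= MOCK_NR_PAGES);
    for ( i = 0; i < count; i++ )
    {
        int rc = mock_map_page(dfn0 + i);

        if ( rc )
        {
            /* i pages were mapped by earlier iterations; undo exactly those. */
            mock_unmap(dfn0, i);
            return rc;
        }
    }
    return 0;
}
```

In this model the failing page itself is never marked as mapped, so the rollback range ends just before it; whether a partially completed chunk needs extra handling in the real code is the question raised in the review.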

> With that fixed:
> 
> Reviewed-by: Roger Pau Monné 

Thanks, but I'll wait with applying this.

Jan




Re: [PATCH v4 02/21] IOMMU: simplify unmap-on-error in iommu_map()

2022-05-03 Thread Jan Beulich
On 03.05.2022 12:25, Roger Pau Monné wrote:
> On Mon, Apr 25, 2022 at 10:32:10AM +0200, Jan Beulich wrote:
>> As of 68a8aa5d7264 ("iommu: make map and unmap take a page count,
>> similar to flush") there's no need anymore to have a loop here.
>>
>> Suggested-by: Roger Pau Monné 
>> Signed-off-by: Jan Beulich 
> 
> Reviewed-by: Roger Pau Monné 

Thanks.

> I wonder whether we should have a macro to ignore returns from
> __must_check attributed functions.  Ie:
> 
> #define IGNORE_RETURN(exp) while ( exp ) break;
> 
> As to avoid confusion (and having to reason) whether the usage of
> while is correct.  I always find it confusing to assert such loop
> expressions are correct.

I've been considering some form of wrapper macro (not specifically
the one you suggest), but I'm of two minds: On one hand I agree it
would help readers, but otoh I fear it may make it more attractive
to actually override the __must_check (which really ought to be an
exception).
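
For illustration, one possible shape of such a wrapper is sketched below. It is not a proposal for Xen: MOCK_MUST_CHECK and MOCK_IGNORE_RETURN are hypothetical names, the attribute spelling assumes GCC or Clang, and the do/while(0) wrapper is an addition of this sketch so the macro behaves as a single statement.

```c
#include <assert.h>

/* GCC/Clang attribute behind __must_check; an assumption for other compilers. */
#define MOCK_MUST_CHECK __attribute__((warn_unused_result))

/*
 * Evaluate exp exactly once and discard the result. The inner while/break
 * is the idiom from the thread; it counts as a use of the value, so the
 * compiler does not warn about an ignored result.
 */
#define MOCK_IGNORE_RETURN(exp) do { while ( exp ) break; } while ( 0 )

int mock_calls; /* counts invocations of mock_unmap() */

/* Stand-in for a __must_check function that reports an error. */
MOCK_MUST_CHECK int mock_unmap(void)
{
    return ++mock_calls; /* always non-zero, i.e. "failed" */
}

/* Error path where the result of the cleanup call is deliberately ignored. */
void mock_cleanup(void)
{
    int before = mock_calls;

    MOCK_IGNORE_RETURN(mock_unmap());
    assert(mock_calls == before + 1); /* evaluated exactly once */
}
```

The concern voiced above still applies: a convenient macro makes it easier to bypass __must_check, so any such wrapper would presumably need to stay rare and well justified.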

Jan




Re: [PATCH v4 01/21] AMD/IOMMU: correct potentially-UB shifts

2022-05-03 Thread Jan Beulich
On 03.05.2022 12:10, Roger Pau Monné wrote:
> On Mon, Apr 25, 2022 at 10:30:33AM +0200, Jan Beulich wrote:
>> Recent changes (likely 5fafa6cf529a ["AMD/IOMMU: have callers specify
>> the target level for page table walks"]) have made Coverity notice a
>> shift count in iommu_pde_from_dfn() which might in theory grow too
>> large. While this isn't a problem in practice, address the concern
>> nevertheless to not leave dangling breakage in case very large
>> superpages would be enabled at some point.
>>
>> Coverity ID: 1504264
>>
>> While there also address a similar issue in set_iommu_ptes_present().
>> It's not clear to me why Coverity hasn't spotted that one.
>>
>> Signed-off-by: Jan Beulich 
> 
> Reviewed-by: Roger Pau Monné 

Thanks.

>> --- a/xen/drivers/passthrough/amd/iommu_map.c
>> +++ b/xen/drivers/passthrough/amd/iommu_map.c
>> @@ -89,11 +89,11 @@ static unsigned int set_iommu_ptes_prese
>> bool iw, bool ir)
>>  {
>>  union amd_iommu_pte *table, *pde;
>> -unsigned int page_sz, flush_flags = 0;
>> +unsigned long page_sz = 1UL << (PTE_PER_TABLE_SHIFT * (pde_level - 1));
> 
> Seeing the discussion from Andrew's reply, nr_pages might be more
> appropriate while still quite short.

Yes and no - it then would be ambiguous as to what size pages are
meant.

> I'm not making my Rb conditional to that change though.

Good, thanks. But I guess I'm still somewhat stuck unless hearing
back from Andrew (although one might not count a conditional R-b
as a "pending objection"). I'll give him a few more days, but I
continue to think this ought to be a separate change (if renaming
is really needed in the 1st place) ...
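
As a side note on the shift itself, the sketch below shows why the width of the shifted constant matters. It is an illustration only: mock_pages_per_entry() is a hypothetical helper, MOCK_PTE_PER_TABLE_SHIFT is assumed to be 9 (512 entries per table), and uint64_t is used instead of unsigned long so the example behaves the same on platforms where long is 32 bits wide.

```c
#include <assert.h>
#include <stdint.h>

#define MOCK_PTE_PER_TABLE_SHIFT 9 /* assumed: 512 entries per table */

/* Number of 4k pages covered by one entry at the given level (1-based). */
uint64_t mock_pages_per_entry(unsigned int level)
{
    unsigned int shift;

    assert(level >= 1);
    shift = MOCK_PTE_PER_TABLE_SHIFT * (level - 1);
    /* Shifting by the type's width or more is undefined behaviour. */
    assert(shift < 64);
    /* A 32-bit "1U" would already be undefined for shift >= 32 (level 5). */
    return UINT64_C(1) << shift;
}
```

With a 32-bit unsigned int the result for level 5 (a shift by 36) would not be representable, which is the kind of case the static analyser flags even if such levels are not used for superpages today.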

Jan




[ovmf test] 170048: regressions - FAIL

2022-05-03 Thread osstest service owner
flight 170048 ovmf real [real]
http://logs.test-lab.xenproject.org/osstest/logs/170048/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-amd64-xsm               6 xen-build                fail REGR. vs. 168254
 build-amd64                   6 xen-build                fail REGR. vs. 168254
 build-i386                    6 xen-build                fail REGR. vs. 168254
 build-i386-xsm                6 xen-build                fail REGR. vs. 168254

Tests which did not succeed, but are not blocking:
 build-amd64-libvirt                   1 build-check(1)  blocked  n/a
 build-i386-libvirt                    1 build-check(1)  blocked  n/a
 test-amd64-amd64-xl-qemuu-ovmf-amd64  1 build-check(1)  blocked  n/a
 test-amd64-i386-xl-qemuu-ovmf-amd64   1 build-check(1)  blocked  n/a

version targeted for testing:
 ovmf 101f4c789221716585b972f2c2a22a85c078ef1d
baseline version:
 ovmf b1b89f9009f2390652e0061bd7b24fc40732bc70

Last test of basis   168254  2022-02-28 10:41:46 Z   64 days
Failing since        168258  2022-03-01 01:55:31 Z   63 days  781 attempts
Testing same since   170038  2022-05-03 10:12:47 Z    0 days    5 attempts


People who touched revisions under test:
  Abdul Lateef Attar 
  Abdul Lateef Attar via groups.io 
  Abner Chang 
  Akihiko Odaki 
  Anthony PERARD 
  Bo Chang Ke 
  Bob Feng 
  Chen Lin Z 
  Chen, Lin Z 
  Corvin Köhne 
  Dandan Bi 
  Dun Tan 
  Feng, Bob C 
  Gerd Hoffmann 
  Guo Dong 
  Guomin Jiang 
  Hao A Wu 
  Heng Luo 
  Hua Ma 
  Huang, Li-Xia 
  Jagadeesh Ujja 
  Jake Garver 
  Jake Garver via groups.io 
  Jason 
  Jason Lou 
  Jiewen Yao 
  Ke, Bo-ChangX 
  Ken Lautner 
  Kenneth Lautner 
  Kuo, Ted 
  Laszlo Ersek 
  Lean Sheng Tan 
  Leif Lindholm 
  Li, Yi1 
  Li, Zhihao 
  Liming Gao 
  Liu 
  Liu Yun 
  Liu Yun Y 
  Lixia Huang 
  Lou, Yun 
  Ma, Hua 
  Mara Sophie Grosch 
  Mara Sophie Grosch via groups.io 
  Matt DeVillier 
  Michael D Kinney 
  Michael Kubacki 
  Michael Kubacki 
  Min Xu 
  Oliver Steffen 
  Patrick Rudolph 
  Peter Grehan 
  Purna Chandra Rao Bandaru 
  Ray Ni 
  Rebecca Cran 
  Rebecca Cran 
  Sami Mujawar 
  Sean Rhodes 
  Sean Rhodes sean@starlabs.systems
  Sebastien Boeuf 
  Sunny Wang 
  Tan, Dun 
  Ted Kuo 
  Wenyi Xie 
  wenyi,xie via groups.io 
  Xiaolu.Jiang 
  Xie, Yuanhao 
  Yi Li 
  yi1 li 
  Yuanhao Xie 
  Zhihao Li 

jobs:
 build-amd64-xsm                       fail
 build-i386-xsm                        fail
 build-amd64                           fail
 build-i386                            fail
 build-amd64-libvirt                   blocked
 build-i386-libvirt                    blocked
 build-amd64-pvops                     pass
 build-i386-pvops                      pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64  blocked
 test-amd64-i386-xl-qemuu-ovmf-amd64   blocked



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Not pushing.

(No revision log; it would be 5915 lines long.)



Re: [PATCH v2] xen/arm: p2m don't fall over on FEAT_LPA enabled hw

2022-05-03 Thread Luca Fancellu


> On 28 Apr 2022, at 11:34, Alex Bennée  wrote:
> 
> When we introduced FEAT_LPA to QEMU's -cpu max we discovered older
> kernels had a bug where the physical address was copied directly from
> ID_AA64MMFR0_EL1.PARange field. The early cpu_init code of Xen commits
> the same error by blindly copying across the max supported range.
> 
> Unsurprisingly when the page tables aren't set up for these greater
> ranges hilarity ensues and the hypervisor crashes fairly early on in
> the boot-up sequence. This happens when we write to the control
> register in enable_mmu().
> 
> Attempt to fix this the same way as the Linux kernel does by gating
> PARange to the maximum the hypervisor can handle. I also had to fix up
> code in p2m which panics when it sees an "invalid" entry in PARange.
> 
> Signed-off-by: Alex Bennée 
> Cc: Richard Henderson 
> Cc: Stefano Stabellini 
> Cc: Julien Grall 
> Cc: Volodymyr Babchuk 
> Cc: Bertrand Marquis 
> 
> ---
> v2
>  - clamp p2m_ipa_bits = PADDR_BIT instead
> ---

Hi Alex,

I’ve tested the patch on fvp and Xen+Dom0 runs fine.

Tested-by: Luca Fancellu 

Cheers,
Luca
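
For readers following along, the clamping idea from the commit message can be sketched as below. This is not the patch: the mock_* functions are hypothetical, MOCK_PADDR_BITS = 48 is an assumed hypervisor limit, and the table lists the ID_AA64MMFR0_EL1.PARange encodings 0 to 6 with their physical address sizes.

```c
#include <assert.h>

#define MOCK_PADDR_BITS 48 /* assumed maximum the hypervisor can handle */

/* PARange encoding -> physical address bits; 0 for unknown encodings. */
unsigned int mock_parange_to_bits(unsigned int parange)
{
    static const unsigned int bits[] = { 32, 36, 40, 42, 44, 48, 52 };

    if ( parange >= sizeof(bits) / sizeof(bits[0]) )
        return 0;
    return bits[parange];
}

/*
 * Physical address bits to actually use: whatever the hardware reports,
 * capped at what the page-table code supports. Returns 0 for an unknown
 * encoding instead of panicking.
 */
unsigned int mock_usable_pa_bits(unsigned int parange)
{
    unsigned int hw = mock_parange_to_bits(parange);
    unsigned int result;

    if ( hw == 0 )
        return 0;
    result = hw > MOCK_PADDR_BITS ? MOCK_PADDR_BITS : hw;
    assert(result <= MOCK_PADDR_BITS);
    return result;
}
```

A CPU with FEAT_LPA reports encoding 6 (52 bits); with the clamp the model uses 48 bits, which is the behaviour the patch aims for instead of copying the field blindly.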



[ovmf bisection] complete build-i386

2022-05-03 Thread osstest service owner
branch xen-unstable
xenbranch xen-unstable
job build-i386
testid xen-build

Tree: ovmf https://github.com/tianocore/edk2.git
Tree: qemu git://xenbits.xen.org/qemu-xen-traditional.git
Tree: qemuu git://xenbits.xen.org/qemu-xen.git
Tree: seabios git://xenbits.xen.org/osstest/seabios.git
Tree: xen git://xenbits.xen.org/xen.git

*** Found and reproduced problem changeset ***

  Bug is in tree:  ovmf https://github.com/tianocore/edk2.git
  Bug introduced:  d3febfd9ade35dc552df6b3607c2b15d26b82867
  Bug not present: 84338c0d498555f860a480693ee8647a1795fba3
  Last fail repro: http://logs.test-lab.xenproject.org/osstest/logs/170047/


  commit d3febfd9ade35dc552df6b3607c2b15d26b82867
  Author: Jason 
  Date:   Mon Jan 10 21:46:27 2022 +0800
  
  MdePkg: Replace Opcode with the corresponding instructions.
  
  REF: https://bugzilla.tianocore.org/show_bug.cgi?id=3790
  
  Replace Opcode with the corresponding instructions.
  The code changes have been verified with CompareBuild.py tool, which
  can be used to compare the results of two different EDK II builds to
  determine if they generate the same binaries.
  (tool link: https://github.com/mdkinney/edk2/tree/sandbox/CompareBuild)
  
  Signed-off-by: Jason Lou 
  Cc: Michael D Kinney 
  Reviewed-by: Liming Gao 
  Cc: Zhiguang Liu 


For bisection revision-tuple graph see:
   
http://logs.test-lab.xenproject.org/osstest/results/bisect/ovmf/build-i386.xen-build.html
Revision IDs in each graph node refer, respectively, to the Trees above.


Running cs-bisection-step 
--graph-out=/home/logs/results/bisect/ovmf/build-i386.xen-build 
--summary-out=tmp/170047.bisection-summary --basis-template=168254 
--blessings=real,real-bisect,real-retry ovmf build-i386 xen-build
Searching for failure / basis pass:
 170045 fail [host=elbling1] / 168254 [host=albana0] 168249 [host=huxelrebe0] 
168232 [host=huxelrebe0] 168185 [host=huxelrebe0] 168131 [host=albana0] 168127 
[host=huxelrebe0] 168119 [host=albana0] 168115 [host=albana1] 168074 
[host=huxelrebe0] 168048 [host=albana0] 168046 [host=huxelrebe0] 168043 
[host=huxelrebe0] 168042 [host=chardonnay1] 168038 [host=huxelrebe0] 168017 
[host=albana0] 167989 [host=huxelrebe1] 167980 [host=albana1] 167976 
[host=huxelrebe0] 167956 [host=huxelrebe1] 167950 [host=a\
 lbana0] 167946 [host=fiano0] 167940 [host=albana0] 167933 [host=albana0] 
167929 [host=huxelrebe1] 167919 [host=fiano1] 167907 [host=albana0] 167803 
[host=huxelrebe0] 167775 [host=albana0] 167760 [host=fiano0] 167754 
[host=albana0] 167729 [host=albana1] 167727 [host=huxelrebe0] 167689 
[host=fiano0] 167685 [host=chardonnay1] 167651 [host=albana0] 167636 
[host=fiano0] 167627 [host=albana0] 167601 [host=albana1] 167598 
[host=huxelrebe0] 167559 [host=huxelrebe0] 167555 [host=huxelrebe0] 167552 
[host=\
 albana0] 167535 [host=chardonnay1] 167527 [host=chardonnay1] 167522 
[host=huxelrebe0] 167513 [host=albana1] 167487 [host=huxelrebe1] 167465 
[host=albana1] 167463 [host=huxelrebe0] 167450 [host=fiano1] 167445 
[host=chardonnay0] 167436 [host=fiano1] 167419 [host=pinot1] 167414 
[host=albana1] 167409 [host=albana0] 167394 [host=albana1] 167393 
[host=albana1] 167392 [host=albana1] 167391 [host=albana1] 167379 
[host=huxelrebe0] 167377 [host=huxelrebe1] 167239 [host=huxelrebe0] 167237 
[host=albana0] 16\
 7231 [host=albana0] 167225 [host=albana0] 167122 [host=huxelrebe0] 167104 
[host=huxelrebe0] 167081 [host=albana0] 166961 [host=albana0] 166951 
[host=pinot0] 166949 [host=pinot0] 166826 [host=albana0] 166360 [host=fiano0] 
166133 [host=albana1] 166130 [host=huxelrebe0] 166126 [host=huxelrebe1] 166123 
[host=albana0] 166120 [host=huxelrebe1] 166114 [host=huxelrebe1] 166108 
[host=huxelrebe0] 166105 [host=albana0] 166102 [host=huxelrebe0] 166097 
[host=huxelrebe1] 166093 [host=huxelrebe1] 166090 [host=huxelrebe1]
166087 [host=fiano0] 166083 [host=albana0] 166081 
[host=huxelrebe0] 166063 [host=albana0] 166042 [host=huxelrebe0] 166035 
[host=albana0] 165969 [host=fiano0] 165962 [host=fiano0] 165950 [host=fiano0] 
165948 [host=fiano1] 165934 [host=fiano1] 165921 [host=albana0] 165899 
[host=huxelrebe0] 165873 [host=chardonnay0] 165862 [host=albana0] 165827 
[host=fiano0] 165808 [host=albana0] 165767 [host=fiano0] 165714 [host=fiano0] 
165701 [host=fiano0] 165690 [host=huxelrebe0] 165688 [host=huxelrebe0]
165685 [host=albana1] 165671 [host=albana0] 165657 [host=albana0] 165652 
[host=albana0] 165637 [host=fiano1] 165531 [host=albana1] 165523 [host=albana0] 
165508 [host=fiano1] 165505 [host=huxelrebe0] 165502 [host=fiano1] 165494 
[host=albana0] 165487 [host=albana1] 165474 [host=huxelrebe0] 165462 
[host=chardonnay0] 165433 [host=huxelrebe0] 165425 [host=albana0] 165398 
[host=albana1] 165382 [host=huxelrebe0] 165377 [host=albana0] 165347 
[host=chardonnay0] 165321 [host=elbling0] 165200 [host=chardonnay0]
165175 [host=albana1] 165170 [host=albana1] 165155 
[host=huxelrebe0] 

Re: [PATCH v6 2/2] flask: implement xsm_set_system_active

2022-05-03 Thread Luca Fancellu


> On 3 May 2022, at 12:17, Daniel P. Smith  wrote:
> 
> This commit implements full support for starting the idle domain privileged by
> introducing a new flask label xenboot_t which the idle domain is labeled with
> at creation.  It then provides the implementation for the XSM hook
> xsm_set_system_active to relabel the idle domain to the existing xen_t flask
> label.
> 
> In the reference flask policy a new macro, xen_build_domain(target), is
> introduced for creating policies for dom0less/hyperlaunch allowing the
> hypervisor to create and assign the necessary resources for domain
> construction.
> 
> Signed-off-by: Daniel P. Smith 
> Reviewed-by: Jason Andryuk 

Hi Daniel,

I’ve built and tested the whole series on arm, checked SILO and FLASK with the
built-in flask policy, and tested that Dom0 boots fine.

So for me:

Reviewed-by: Luca Fancellu 
Tested-by: Luca Fancellu 

Cheers,
Luca

[PATCH 0/2] x86/pat: fix querying available caching modes

2022-05-03 Thread Juergen Gross
Fix some issues with querying caching modes being available for memory
mappings.

This is a replacement for the patch of Jan sent recently:

https://lists.xen.org/archives/html/xen-devel/2022-04/msg02392.html

Juergen Gross (2):
  x86/pat: fix x86_has_pat_wp()
  x86/pat: add functions to query specific cache mode availability

 arch/x86/include/asm/memtype.h   |  2 ++
 arch/x86/include/asm/pci.h   |  2 +-
 arch/x86/mm/init.c   | 24 ++--
 drivers/gpu/drm/i915/gem/i915_gem_mman.c |  8 
 4 files changed, 29 insertions(+), 7 deletions(-)

-- 
2.35.3




[PATCH 2/2] x86/pat: add functions to query specific cache mode availability

2022-05-03 Thread Juergen Gross
Some drivers are using pat_enabled() in order to test availability of
special caching modes (WC and UC-). This will lead to false negatives
in case the system was booted e.g. with the "nopat" kernel parameter and
the BIOS did set up the PAT MSR supporting the queried mode, or if the
system is running as a Xen PV guest.

Add test functions for those caching modes instead and use them at the
appropriate places.

For symmetry reasons export the already existing x86_has_pat_wp() for
modules, too.

Fixes: bdd8b6c98239 ("drm/i915: replace X86_FEATURE_PAT with pat_enabled()")
Fixes: ae749c7ab475 ("PCI: Add arch_can_pci_mmap_wc() macro")
Signed-off-by: Juergen Gross 
---
 arch/x86/include/asm/memtype.h   |  2 ++
 arch/x86/include/asm/pci.h   |  2 +-
 arch/x86/mm/init.c   | 25 +---
 drivers/gpu/drm/i915/gem/i915_gem_mman.c |  8 
 4 files changed, 29 insertions(+), 8 deletions(-)

diff --git a/arch/x86/include/asm/memtype.h b/arch/x86/include/asm/memtype.h
index 9ca760e430b9..d00e0be854d4 100644
--- a/arch/x86/include/asm/memtype.h
+++ b/arch/x86/include/asm/memtype.h
@@ -25,6 +25,8 @@ extern void memtype_free_io(resource_size_t start, resource_size_t end);
 extern bool pat_pfn_immune_to_uc_mtrr(unsigned long pfn);
 
 bool x86_has_pat_wp(void);
+bool x86_has_pat_wc(void);
+bool x86_has_pat_uc_minus(void);
 enum page_cache_mode pgprot2cachemode(pgprot_t pgprot);
 
 #endif /* _ASM_X86_MEMTYPE_H */
diff --git a/arch/x86/include/asm/pci.h b/arch/x86/include/asm/pci.h
index f3fd5928bcbb..a5742268dec1 100644
--- a/arch/x86/include/asm/pci.h
+++ b/arch/x86/include/asm/pci.h
@@ -94,7 +94,7 @@ int pcibios_set_irq_routing(struct pci_dev *dev, int pin, int irq);
 
 
 #define HAVE_PCI_MMAP
-#define arch_can_pci_mmap_wc() pat_enabled()
+#define arch_can_pci_mmap_wc() x86_has_pat_wc()
 #define ARCH_GENERIC_PCI_MMAP_RESOURCE
 
 #ifdef CONFIG_PCI
diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
index 71e182ebced3..b6431f714dc2 100644
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -77,12 +77,31 @@ static uint8_t __pte2cachemode_tbl[8] = {
[__pte2cm_idx(_PAGE_PWT | _PAGE_PCD | _PAGE_PAT)] = _PAGE_CACHE_MODE_UC,
 };
 
-/* Check that the write-protect PAT entry is set for write-protect */
+static bool x86_has_pat_mode(unsigned int mode)
+{
+   return __pte2cachemode_tbl[__cachemode2pte_tbl[mode]] == mode;
+}
+
+/* Check that PAT supports write-protect */
 bool x86_has_pat_wp(void)
 {
-   return __pte2cachemode_tbl[__cachemode2pte_tbl[_PAGE_CACHE_MODE_WP]] ==
-  _PAGE_CACHE_MODE_WP;
+   return x86_has_pat_mode(_PAGE_CACHE_MODE_WP);
+}
+EXPORT_SYMBOL_GPL(x86_has_pat_wp);
+
+/* Check that PAT supports WC */
+bool x86_has_pat_wc(void)
+{
+   return x86_has_pat_mode(_PAGE_CACHE_MODE_WC);
+}
+EXPORT_SYMBOL_GPL(x86_has_pat_wc);
+
+/* Check that PAT supports UC- */
+bool x86_has_pat_uc_minus(void)
+{
+   return x86_has_pat_mode(_PAGE_CACHE_MODE_UC_MINUS);
 }
+EXPORT_SYMBOL_GPL(x86_has_pat_uc_minus);
 
 enum page_cache_mode pgprot2cachemode(pgprot_t pgprot)
 {
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_mman.c b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
index 0c5c43852e24..f43ecf3f63eb 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_mman.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
@@ -76,7 +76,7 @@ i915_gem_mmap_ioctl(struct drm_device *dev, void *data,
if (args->flags & ~(I915_MMAP_WC))
return -EINVAL;
 
-   if (args->flags & I915_MMAP_WC && !pat_enabled())
+   if (args->flags & I915_MMAP_WC && !x86_has_pat_wc())
return -ENODEV;
 
obj = i915_gem_object_lookup(file, args->handle);
@@ -757,7 +757,7 @@ i915_gem_dumb_mmap_offset(struct drm_file *file,
 
if (HAS_LMEM(to_i915(dev)))
mmap_type = I915_MMAP_TYPE_FIXED;
-   else if (pat_enabled())
+   else if (x86_has_pat_wc())
mmap_type = I915_MMAP_TYPE_WC;
else if (!i915_ggtt_has_aperture(to_gt(i915)->ggtt))
return -ENODEV;
@@ -813,7 +813,7 @@ i915_gem_mmap_offset_ioctl(struct drm_device *dev, void *data,
break;
 
case I915_MMAP_OFFSET_WC:
-   if (!pat_enabled())
+   if (!x86_has_pat_wc())
return -ENODEV;
type = I915_MMAP_TYPE_WC;
break;
@@ -823,7 +823,7 @@ i915_gem_mmap_offset_ioctl(struct drm_device *dev, void *data,
break;
 
case I915_MMAP_OFFSET_UC:
-   if (!pat_enabled())
+   if (!x86_has_pat_uc_minus())
return -ENODEV;
type = I915_MMAP_TYPE_UC;
break;
-- 
2.35.3




[PATCH 1/2] x86/pat: fix x86_has_pat_wp()

2022-05-03 Thread Juergen Gross
x86_has_pat_wp() is using a wrong test, as it relies on the normal
PAT configuration used by the kernel. In case the PAT MSR has been
set up by another entity (e.g. BIOS or Xen hypervisor) it might return
false even if the PAT configuration is allowing WP mappings.

Fixes: 1f6f655e01ad ("x86/mm: Add a x86_has_pat_wp() helper")
Signed-off-by: Juergen Gross 
---
 arch/x86/mm/init.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
index d8cfce221275..71e182ebced3 100644
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -80,7 +80,8 @@ static uint8_t __pte2cachemode_tbl[8] = {
 /* Check that the write-protect PAT entry is set for write-protect */
 bool x86_has_pat_wp(void)
 {
-   return __pte2cachemode_tbl[_PAGE_CACHE_MODE_WP] == _PAGE_CACHE_MODE_WP;
+   return __pte2cachemode_tbl[__cachemode2pte_tbl[_PAGE_CACHE_MODE_WP]] ==
+  _PAGE_CACHE_MODE_WP;
 }
 
 enum page_cache_mode pgprot2cachemode(pgprot_t pgprot)
-- 
2.35.3
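
A note on the lookup that the patch above changes. What follows is a minimal
userspace sketch, not kernel code: the enum, the two tables and every function
name are made up for illustration, and the forward table here stores a slot
index directly, which simplifies whatever encoding the real tables use. It
only shows why "mode -> slot -> mode" answers the question "is this mode
available?", while indexing the slot table with the mode number does not once
someone else has chosen the layout.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative cache modes; the numbering is an assumption of this model. */
enum cache_mode {
    MODE_WB, MODE_WC, MODE_UC_MINUS, MODE_UC, MODE_WT, MODE_WP, MODE_NUM
};

#define NSLOTS 8

/* slot -> mode, as if decoded from a register programmed by another entity */
static enum cache_mode slot_to_mode[NSLOTS];
/* mode -> slot to use for that mode, derived from the table above */
static unsigned int mode_to_slot[MODE_NUM];

/* Record a layout and derive the forward (mode -> slot) table from it. */
static void load_layout(const enum cache_mode layout[NSLOTS])
{
    unsigned int uc_slot = 0;

    for (unsigned int s = 0; s < NSLOTS; s++) {
        assert(layout[s] < MODE_NUM);
        slot_to_mode[s] = layout[s];
    }

    /* Modes that have no slot fall back to the lowest UC slot. */
    for (unsigned int s = NSLOTS; s-- > 0; )
        if (slot_to_mode[s] == MODE_UC)
            uc_slot = s;

    for (unsigned int m = 0; m < MODE_NUM; m++) {
        mode_to_slot[m] = uc_slot;
        /* Walk downwards so that the lowest matching slot wins. */
        for (unsigned int s = NSLOTS; s-- > 0; )
            if (slot_to_mode[s] == (enum cache_mode)m)
                mode_to_slot[m] = s;
    }
}

/* Flawed check: treats the mode number as if it were a slot number. */
static bool has_mode_direct(enum cache_mode mode)
{
    return slot_to_mode[mode] == mode;
}

/* Round-trip check: mode -> slot -> mode must give the mode back. */
static bool has_mode_roundtrip(enum cache_mode mode)
{
    return slot_to_mode[mode_to_slot[mode]] == mode;
}

/* A made-up layout in which WP lives in slot 1 and WT is absent. */
static const enum cache_mode example_layout[NSLOTS] = {
    MODE_WB, MODE_WP, MODE_UC_MINUS, MODE_UC,
    MODE_WB, MODE_WC, MODE_UC_MINUS, MODE_UC,
};
```

After load_layout(example_layout), has_mode_roundtrip(MODE_WP) is true while
has_mode_direct(MODE_WP) is false, which is the kind of false negative the
commit message describes. A mode with no slot at all (WT in this layout) is
reported as unavailable by the round-trip check.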




[ovmf test] 170045: regressions - FAIL

2022-05-03 Thread osstest service owner
flight 170045 ovmf real [real]
http://logs.test-lab.xenproject.org/osstest/logs/170045/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-amd64-xsm               6 xen-build                fail REGR. vs. 168254
 build-amd64                   6 xen-build                fail REGR. vs. 168254
 build-i386                    6 xen-build                fail REGR. vs. 168254
 build-i386-xsm                6 xen-build                fail REGR. vs. 168254

Tests which did not succeed, but are not blocking:
 build-amd64-libvirt   1 build-check(1)   blocked  n/a
 build-i386-libvirt1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemuu-ovmf-amd64  1 build-check(1) blocked n/a
 test-amd64-i386-xl-qemuu-ovmf-amd64  1 build-check(1)  blocked n/a

version targeted for testing:
 ovmf 101f4c789221716585b972f2c2a22a85c078ef1d
baseline version:
 ovmf b1b89f9009f2390652e0061bd7b24fc40732bc70

Last test of basis   168254  2022-02-28 10:41:46 Z   64 days
Failing since        168258  2022-03-01 01:55:31 Z   63 days  780 attempts
Testing same since   170038  2022-05-03 10:12:47 Z    0 days    4 attempts


People who touched revisions under test:
  Abdul Lateef Attar 
  Abdul Lateef Attar via groups.io 
  Abner Chang 
  Akihiko Odaki 
  Anthony PERARD 
  Bo Chang Ke 
  Bob Feng 
  Chen Lin Z 
  Chen, Lin Z 
  Corvin Köhne 
  Dandan Bi 
  Dun Tan 
  Feng, Bob C 
  Gerd Hoffmann 
  Guo Dong 
  Guomin Jiang 
  Hao A Wu 
  Heng Luo 
  Hua Ma 
  Huang, Li-Xia 
  Jagadeesh Ujja 
  Jake Garver 
  Jake Garver via groups.io 
  Jason 
  Jason Lou 
  Jiewen Yao 
  Ke, Bo-ChangX 
  Ken Lautner 
  Kenneth Lautner 
  Kuo, Ted 
  Laszlo Ersek 
  Lean Sheng Tan 
  Leif Lindholm 
  Li, Yi1 
  Li, Zhihao 
  Liming Gao 
  Liu 
  Liu Yun 
  Liu Yun Y 
  Lixia Huang 
  Lou, Yun 
  Ma, Hua 
  Mara Sophie Grosch 
  Mara Sophie Grosch via groups.io 
  Matt DeVillier 
  Michael D Kinney 
  Michael Kubacki 
  Michael Kubacki 
  Min Xu 
  Oliver Steffen 
  Patrick Rudolph 
  Peter Grehan 
  Purna Chandra Rao Bandaru 
  Ray Ni 
  Rebecca Cran 
  Rebecca Cran 
  Sami Mujawar 
  Sean Rhodes 
  Sean Rhodes sean@starlabs.systems
  Sebastien Boeuf 
  Sunny Wang 
  Tan, Dun 
  Ted Kuo 
  Wenyi Xie 
  wenyi,xie via groups.io 
  Xiaolu.Jiang 
  Xie, Yuanhao 
  Yi Li 
  yi1 li 
  Yuanhao Xie 
  Zhihao Li 

jobs:
 build-amd64-xsm                                              fail
 build-i386-xsm                                               fail
 build-amd64                                                  fail
 build-i386                                                   fail
 build-amd64-libvirt                                          blocked
 build-i386-libvirt                                           blocked
 build-amd64-pvops                                            pass
 build-i386-pvops                                             pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64                         blocked
 test-amd64-i386-xl-qemuu-ovmf-amd64                          blocked



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Not pushing.

(No revision log; it would be 5915 lines long.)



Re: [PATCH v6 1/2] xsm: create idle domain privileged and demote after setup

2022-05-03 Thread Luca Fancellu
Hi Daniel,

> diff --git a/xen/xsm/flask/hooks.c b/xen/xsm/flask/hooks.c
> index 0bf63ffa84..b93101191e 100644
> --- a/xen/xsm/flask/hooks.c
> +++ b/xen/xsm/flask/hooks.c
> @@ -186,6 +186,28 @@ static int cf_check flask_domain_alloc_security(struct domain *d)
> return 0;
> }
> 
> +static int cf_check flask_set_system_active(void)
> +{
> +struct domain *d = current->domain;
> +
> +ASSERT(d->is_privileged);
> +
> +if ( d->domain_id != DOMID_IDLE )
> +{
> +printk("xsm_set_system_active should only be called by idle domain\n");

Sorry, I only spotted this now: in the printk you probably mean
“flask_set_system_active” instead of “xsm_set_system_active”. You can keep my
R-by after this change.

Cheers,
Luca




Re: [PATCH v4 05/21] IOMMU/x86: restrict IO-APIC mappings for PV Dom0

2022-05-03 Thread Roger Pau Monné
On Mon, Apr 25, 2022 at 10:34:23AM +0200, Jan Beulich wrote:
> While already the case for PVH, there's no reason to treat PV
> differently here, though of course the addresses get taken from another
> source in this case. Except that, to match CPU side mappings, by default
> we permit r/o ones. This then also means we now deal consistently with
> IO-APICs whose MMIO is or is not covered by E820 reserved regions.
> 
> Signed-off-by: Jan Beulich 
> ---
> [integrated] v1: Integrate into series.
> [standalone] v2: Keep IOMMU mappings in sync with CPU ones.
> 
> --- a/xen/drivers/passthrough/x86/iommu.c
> +++ b/xen/drivers/passthrough/x86/iommu.c
> @@ -275,12 +275,12 @@ void iommu_identity_map_teardown(struct
>  }
>  }
>  
> -static bool __hwdom_init hwdom_iommu_map(const struct domain *d,
> - unsigned long pfn,
> - unsigned long max_pfn)
> +static unsigned int __hwdom_init hwdom_iommu_map(const struct domain *d,
> + unsigned long pfn,
> + unsigned long max_pfn)
>  {
>  mfn_t mfn = _mfn(pfn);
> -unsigned int i, type;
> +unsigned int i, type, perms = IOMMUF_readable | IOMMUF_writable;
>  
>  /*
>   * Set up 1:1 mapping for dom0. Default to include only conventional RAM
> @@ -289,44 +289,60 @@ static bool __hwdom_init hwdom_iommu_map
>   * that fall in unusable ranges for PV Dom0.
>   */
>  if ( (pfn > max_pfn && !mfn_valid(mfn)) || xen_in_range(pfn) )
> -return false;
> +return 0;
>  
>  switch ( type = page_get_ram_type(mfn) )
>  {
>  case RAM_TYPE_UNUSABLE:
> -return false;
> +return 0;
>  
>  case RAM_TYPE_CONVENTIONAL:
>  if ( iommu_hwdom_strict )
> -return false;
> +return 0;
>  break;
>  
>  default:
>  if ( type & RAM_TYPE_RESERVED )
>  {
>  if ( !iommu_hwdom_inclusive && !iommu_hwdom_reserved )
> -return false;
> +perms = 0;
>  }
> -else if ( is_hvm_domain(d) || !iommu_hwdom_inclusive || pfn > max_pfn )
> -return false;
> +else if ( is_hvm_domain(d) )
> +return 0;
> +else if ( !iommu_hwdom_inclusive || pfn > max_pfn )
> +perms = 0;
>  }
>  
>  /* Check that it doesn't overlap with the Interrupt Address Range. */
>  if ( pfn >= 0xfee00 && pfn <= 0xfeeff )
> -return false;
> +return 0;
>  /* ... or the IO-APIC */
> -for ( i = 0; has_vioapic(d) && i < d->arch.hvm.nr_vioapics; i++ )
> -if ( pfn == PFN_DOWN(domain_vioapic(d, i)->base_address) )
> -return false;
> +if ( has_vioapic(d) )
> +{
> +for ( i = 0; i < d->arch.hvm.nr_vioapics; i++ )
> +if ( pfn == PFN_DOWN(domain_vioapic(d, i)->base_address) )
> +return 0;
> +}
> +else if ( is_pv_domain(d) )
> +{
> +/*
> + * Be consistent with CPU mappings: Dom0 is permitted to establish r/o
> + * ones there, so it should also have such established for IOMMUs.
> + */
> +for ( i = 0; i < nr_ioapics; i++ )
> +if ( pfn == PFN_DOWN(mp_ioapics[i].mpc_apicaddr) )
> +return rangeset_contains_singleton(mmio_ro_ranges, pfn)
> +   ? IOMMUF_readable : 0;

If we really are after consistency with CPU side mappings, we should
likely take the whole contents of mmio_ro_ranges and d->iomem_caps
into account, not just the pages belonging to the IO-APIC?

There could also be HPET pages mapped as RO for PV.

Thanks, Roger.
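
The question raised above can be illustrated with a small standalone model.
This is only a sketch under stated assumptions, not Xen code: struct
pfn_range, in_ranges() and iommu_perms() are hypothetical names, the two
range lists merely stand in for "what the domain may access" and "what the
CPU side maps read-only", and the example page frame numbers are picked to
resemble the conventional IO-APIC and HPET MMIO pages. The idea shown is
that the IOMMU permission is derived from the same two sets for every page,
rather than from an IO-APIC-only special case.

```c
#include <stdbool.h>
#include <stddef.h>

#define PERM_READ  1u
#define PERM_WRITE 2u

/* An inclusive range of page frame numbers (hypothetical helper type). */
struct pfn_range {
    unsigned long start, end;
};

/* Return true if pfn falls inside any of the n ranges. */
static bool in_ranges(const struct pfn_range *r, size_t n, unsigned long pfn)
{
    for (size_t i = 0; i < n; i++)
        if (pfn >= r[i].start && pfn <= r[i].end)
            return true;
    return false;
}

/*
 * Permissions for a device-side mapping, chosen so that they never exceed
 * what the CPU-side mapping of the same page would allow in this model:
 *   not accessible at all -> 0
 *   accessible, read-only -> PERM_READ
 *   accessible otherwise  -> PERM_READ | PERM_WRITE
 */
static unsigned int iommu_perms(unsigned long pfn,
                                const struct pfn_range *allowed,
                                size_t n_allowed,
                                const struct pfn_range *read_only,
                                size_t n_read_only)
{
    if (!in_ranges(allowed, n_allowed, pfn))
        return 0;
    if (in_ranges(read_only, n_read_only, pfn))
        return PERM_READ;
    return PERM_READ | PERM_WRITE;
}

/* Example data: two MMIO pages plus a block of ordinary pages. */
static const struct pfn_range example_allowed[] = {
    { 0x00100, 0x001ff },   /* ordinary pages */
    { 0xfec00, 0xfec00 },   /* IO-APIC-like page */
    { 0xfed00, 0xfed00 },   /* HPET-like page */
};
static const struct pfn_range example_read_only[] = {
    { 0xfec00, 0xfec00 },
    { 0xfed00, 0xfed00 },
};
```

With this data, both the IO-APIC-like and the HPET-like page come out as
read-only, ordinary pages as read/write, and anything outside the allowed
set gets no mapping.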



Re: [PATCH] x86/PAT: have pat_enabled() properly reflect state when running on e.g. Xen

2022-05-03 Thread Juergen Gross

On 28.04.22 16:50, Jan Beulich wrote:

> The latest with commit bdd8b6c98239 ("drm/i915: replace X86_FEATURE_PAT
> with pat_enabled()") pat_enabled() returning false (because of PAT
> initialization being suppressed in the absence of MTRRs being announced
> to be available) has become a problem: The i915 driver now fails to
> initialize when running PV on Xen (i915_gem_object_pin_map() is where I
> located the induced failure), and its error handling is flaky enough to
> (at least sometimes) result in a hung system.
> 
> Yet even beyond that problem the keying of the use of WC mappings to
> pat_enabled() (see arch_can_pci_mmap_wc()) means that in particular
> graphics frame buffer accesses would have been quite a bit less
> performant than possible.
> 
> Arrange for the function to return true in such environments, without
> undermining the rest of PAT MSR management logic considering PAT to be
> disabled: Specifically, no writes to the PAT MSR should occur.
> 
> For the new boolean to live in .init.data, init_cache_modes() also needs
> moving to .init.text (where it could/should have lived already before).
> 
> Signed-off-by: Jan Beulich 


I think this approach isn't the best way to tackle the issue.

It can be solved rather easily by not deriving the supported caching
modes via pat_enabled(), but by adding specific functions to query
the needed caching mode from the PAT translation tables, and to use
those functions instead of pat_enabled().

I'm preparing a patch for that purpose.


Juergen




[ovmf test] 170043: regressions - FAIL

2022-05-03 Thread osstest service owner
flight 170043 ovmf real [real]
http://logs.test-lab.xenproject.org/osstest/logs/170043/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-amd64-xsm               6 xen-build                fail REGR. vs. 168254
 build-amd64                   6 xen-build                fail REGR. vs. 168254
 build-i386                    6 xen-build                fail REGR. vs. 168254
 build-i386-xsm                6 xen-build                fail REGR. vs. 168254

Tests which did not succeed, but are not blocking:
 build-amd64-libvirt   1 build-check(1)   blocked  n/a
 build-i386-libvirt1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemuu-ovmf-amd64  1 build-check(1) blocked n/a
 test-amd64-i386-xl-qemuu-ovmf-amd64  1 build-check(1)  blocked n/a

version targeted for testing:
 ovmf 101f4c789221716585b972f2c2a22a85c078ef1d
baseline version:
 ovmf b1b89f9009f2390652e0061bd7b24fc40732bc70

Last test of basis   168254  2022-02-28 10:41:46 Z   64 days
Failing since        168258  2022-03-01 01:55:31 Z   63 days  779 attempts
Testing same since   170038  2022-05-03 10:12:47 Z    0 days    3 attempts


People who touched revisions under test:
  Abdul Lateef Attar 
  Abdul Lateef Attar via groups.io 
  Abner Chang 
  Akihiko Odaki 
  Anthony PERARD 
  Bo Chang Ke 
  Bob Feng 
  Chen Lin Z 
  Chen, Lin Z 
  Corvin Köhne 
  Dandan Bi 
  Dun Tan 
  Feng, Bob C 
  Gerd Hoffmann 
  Guo Dong 
  Guomin Jiang 
  Hao A Wu 
  Heng Luo 
  Hua Ma 
  Huang, Li-Xia 
  Jagadeesh Ujja 
  Jake Garver 
  Jake Garver via groups.io 
  Jason 
  Jason Lou 
  Jiewen Yao 
  Ke, Bo-ChangX 
  Ken Lautner 
  Kenneth Lautner 
  Kuo, Ted 
  Laszlo Ersek 
  Lean Sheng Tan 
  Leif Lindholm 
  Li, Yi1 
  Li, Zhihao 
  Liming Gao 
  Liu 
  Liu Yun 
  Liu Yun Y 
  Lixia Huang 
  Lou, Yun 
  Ma, Hua 
  Mara Sophie Grosch 
  Mara Sophie Grosch via groups.io 
  Matt DeVillier 
  Michael D Kinney 
  Michael Kubacki 
  Michael Kubacki 
  Min Xu 
  Oliver Steffen 
  Patrick Rudolph 
  Peter Grehan 
  Purna Chandra Rao Bandaru 
  Ray Ni 
  Rebecca Cran 
  Rebecca Cran 
  Sami Mujawar 
  Sean Rhodes 
  Sean Rhodes sean@starlabs.systems
  Sebastien Boeuf 
  Sunny Wang 
  Tan, Dun 
  Ted Kuo 
  Wenyi Xie 
  wenyi,xie via groups.io 
  Xiaolu.Jiang 
  Xie, Yuanhao 
  Yi Li 
  yi1 li 
  Yuanhao Xie 
  Zhihao Li 

jobs:
 build-amd64-xsm                                              fail
 build-i386-xsm                                               fail
 build-amd64                                                  fail
 build-i386                                                   fail
 build-amd64-libvirt                                          blocked
 build-i386-libvirt                                           blocked
 build-amd64-pvops                                            pass
 build-i386-pvops                                             pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64                         blocked
 test-amd64-i386-xl-qemuu-ovmf-amd64                          blocked



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Not pushing.

(No revision log; it would be 5915 lines long.)



Re: [PATCH v4 04/21] IOMMU: have iommu_{,un}map() split requests into largest possible chunks

2022-05-03 Thread Roger Pau Monné
On Mon, Apr 25, 2022 at 10:33:32AM +0200, Jan Beulich wrote:
> --- a/xen/drivers/passthrough/iommu.c
> +++ b/xen/drivers/passthrough/iommu.c
> @@ -307,11 +338,10 @@ int iommu_map(struct domain *d, dfn_t df
>  if ( !d->is_shutting_down && printk_ratelimit() )
>  printk(XENLOG_ERR
> "d%d: IOMMU mapping dfn %"PRI_dfn" to mfn %"PRI_mfn" failed: %d\n",
> -   d->domain_id, dfn_x(dfn_add(dfn, i)),
> -   mfn_x(mfn_add(mfn, i)), rc);
> +   d->domain_id, dfn_x(dfn), mfn_x(mfn), rc);

Since you are already adjusting the line, I wouldn't mind if you also
switched to use %pd at once (and in the same adjustment done to
iommu_unmap).

>  
>  /* while statement to satisfy __must_check */
> -while ( iommu_unmap(d, dfn, i, flush_flags) )
> +while ( iommu_unmap(d, dfn0, i, flush_flags) )

To match previous behavior you likely need to use i + (1UL << order),
so pages covered by the map_page call above are also taken care of in
the unmap request?

With that fixed:

Reviewed-by: Roger Pau Monné 

(Feel free to adjust the printks to use %pd or not, that's not a
requirement for the Rb)

Thanks, Roger.
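
The rollback concern raised in this review can be modelled in a few lines.
This is a minimal sketch under stated assumptions, not the Xen
implementation: mapped[], fail_at, map_chunk(), unmap_range(), pick_order()
and map_range() are all hypothetical, and the model assumes a chunk mapping
may fail after some of its pages were already established. It shows why the
unmap on the error path covers i + (1UL << order) pages rather than only i.

```c
#include <stdbool.h>

#define NPAGES    64
#define MAX_ORDER 4

static bool mapped[NPAGES];             /* toy "page table" */
static unsigned long fail_at = NPAGES;  /* first page whose mapping fails */

/* Map 2^order pages from start; may fail part-way through the chunk. */
static int map_chunk(unsigned long start, unsigned int order)
{
    for (unsigned long p = start; p < start + (1UL << order); p++) {
        if (p == fail_at)
            return -1;
        mapped[p] = true;
    }
    return 0;
}

/* Remove count pages starting at start. */
static void unmap_range(unsigned long start, unsigned long count)
{
    for (unsigned long p = start; p < start + count; p++)
        mapped[p] = false;
}

/* Largest order (up to MAX_ORDER) that is aligned at start and still fits. */
static unsigned int pick_order(unsigned long start, unsigned long left)
{
    unsigned int order = 0;

    while (order < MAX_ORDER &&
           !(start & ((2UL << order) - 1)) && (2UL << order) <= left)
        order++;
    return order;
}

/*
 * Map pages [0, count) in the largest possible chunks. On failure, undo
 * everything done so far; include_failed_chunk selects whether the chunk
 * that just failed is part of the undo.
 */
static int map_range(unsigned long count, bool include_failed_chunk)
{
    unsigned long i = 0;

    while (i < count) {
        unsigned int order = pick_order(i, count - i);

        if (map_chunk(i, order)) {
            unmap_range(0, include_failed_chunk ? i + (1UL << order) : i);
            return -1;
        }
        i += 1UL << order;
    }
    return 0;
}

/* Count pages currently mapped in the toy page table. */
static unsigned long count_mapped(void)
{
    unsigned long n = 0;

    for (unsigned long p = 0; p < NPAGES; p++)
        n += mapped[p];
    return n;
}
```

With fail_at set to 21, map_range(32, false) leaves pages 16 to 20 behind
because they belong to the chunk that failed, while map_range(32, true)
leaves nothing mapped.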



Re: [PATCH v6 1/2] xsm: create idle domain privileged and demote after setup

2022-05-03 Thread Luca Fancellu



> On 3 May 2022, at 12:17, Daniel P. Smith  wrote:
> 
> There are new capabilities, dom0less and hyperlaunch, that introduce internal
> hypervisor logic which needs to make resource allocation calls that are
> protected by XSM access checks. This creates an issue as a subset of the
> hypervisor code is executed under a system domain, the idle domain, that is
> represented by a per-CPU non-privileged struct domain. To enable these new
> capabilities to function correctly but in a controlled manner, this commit
> changes the idle system domain to be created as a privileged domain under the
> default policy and demoted before transitioning to running. A new XSM hook,
> xsm_set_system_active(), is introduced to allow each XSM policy type to demote
> the idle domain appropriately for that policy type. In the case of SILO, it
> inherits the default policy's hook for xsm_set_system_active().
> 
> For flask a stub is added to ensure that flask policy system will function
> correctly with this patch until flask is extended with support for starting
> the idle domain privileged and properly demoting it on the call to
> xsm_set_system_active().
> 
> Signed-off-by: Daniel P. Smith 
> Reviewed-by: Jason Andryuk 


Reviewed-by: Luca Fancellu 

Cheers,
Luca
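
For readers following the thread, the "create privileged, demote after
setup" flow described in the quoted commit message can be sketched as a
standalone model. Everything here is an assumption made for illustration:
struct model_domain, MODEL_DOMID_IDLE, model_set_system_active() and the
-EPERM return value are not taken from the patch, they only mirror the
shape of the hook quoted elsewhere in this thread.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical ID marking the idle domain in this model. */
#define MODEL_DOMID_IDLE 0x7fffu

/* Hypothetical, heavily reduced stand-in for a domain structure. */
struct model_domain {
    uint16_t domain_id;
    bool is_privileged;
};

/*
 * Demote the calling domain once setup has finished. Only the idle domain
 * may do this; the caller is expected to still be privileged at this point.
 */
static int model_set_system_active(struct model_domain *d)
{
    assert(d->is_privileged);

    if (d->domain_id != MODEL_DOMID_IDLE)
        return -EPERM;          /* assumed error code for this model */

    d->is_privileged = false;   /* from here on, checks apply as usual */
    return 0;
}
```

In this model the idle domain starts with is_privileged set, performs its
setup work, and then calls model_set_system_active() exactly once; any
other domain making the call gets an error and keeps its state unchanged.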




Re: [PATCH v5 1/2] xsm: create idle domain privileged and demote after setup

2022-05-03 Thread Luca Fancellu


> On 3 May 2022, at 12:30, Daniel P. Smith  wrote:
> 
> On 5/3/22 05:43, Luca Fancellu wrote:
>> 
>> 
>>> On 2 May 2022, at 14:53, Daniel P. Smith wrote:
>>> 
>>> On 5/2/22 09:49, Daniel P. Smith wrote:
>>>> On 5/2/22 09:42, Jason Andryuk wrote:
>>>>> On Mon, May 2, 2022 at 9:31 AM Daniel P. Smith
>>>>>  wrote:
>>>>>> diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c
>>>>>> index d5d0792ed4..b9057222d6 100644
>>>>>> --- a/xen/arch/arm/setup.c
>>>>>> +++ b/xen/arch/arm/setup.c
>>>>>> @@ -1048,6 +1048,10 @@ void __init start_xen(unsigned long boot_phys_offset,
>>>>>>    /* Hide UART from DOM0 if we're using it */
>>>>>>    serial_endboot();
>>>>>> 
>>>>>> +if ( (rc = xsm_set_system_active()) != 0 )
>>>>>> +panic("xsm(err=%d): "
>>>>>> +  "unable to set hypervisor to SYSTEM_ACTIVE privilege\n", err);
>>>>> 
>>>>> You want to print rc in this message.
>>>> 
>>>> Thanks, but now I want to figure out how that compile
>>> 
>>> Ah, arm which I do not have a build env to do build tests.
>> 
>> I’ve built this patch on arm (changing err to rc), everything looks fine, so
>> with that addressed:
>> 
>> Reviewed-by: Luca Fancellu 
> 
> Thank you and my apologies for not adding your review-by this morning. I
> had v6 queued to go out last night and missed this email before releasing.
> 

Hi Daniel,

It’s OK, I will add it again on the new series.

Cheers,
Luca

[seabios test] 170031: tolerable FAIL - PUSHED

2022-05-03 Thread osstest service owner
flight 170031 seabios real [real]
http://logs.test-lab.xenproject.org/osstest/logs/170031/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stop            fail like 169167
 test-amd64-i386-xl-qemuu-win7-amd64 19 guest-stop             fail like 169167
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 169167
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stop            fail like 169167
 test-amd64-i386-xl-qemuu-ws16-amd64 19 guest-stop             fail like 169167
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check fail never pass

version targeted for testing:
 seabios  dc88f9b72df52b22c35b127b80c487e0b6fca4af
baseline version:
 seabios  01774004c7f7fdc9c1e8f1715f70d3b913f8d491

Last test of basis   169167  2022-04-04 21:41:47 Z   28 days
Testing same since   170031  2022-05-03 08:44:11 Z    0 days    1 attempts


People who touched revisions under test:
  Gerd Hoffmann 

jobs:
 build-amd64-xsm                                              pass
 build-i386-xsm                                               pass
 build-amd64                                                  pass
 build-i386                                                   pass
 build-amd64-libvirt                                          pass
 build-i386-libvirt                                           pass
 build-amd64-pvops                                            pass
 build-i386-pvops                                             pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm           pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm            pass
 test-amd64-amd64-xl-qemuu-debianhvm-i386-xsm                 pass
 test-amd64-i386-xl-qemuu-debianhvm-i386-xsm                  pass
 test-amd64-amd64-qemuu-nested-amd                            fail
 test-amd64-i386-qemuu-rhel6hvm-amd                           pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64                    pass
 test-amd64-i386-xl-qemuu-debianhvm-amd64                     pass
 test-amd64-amd64-qemuu-freebsd11-amd64                       pass
 test-amd64-amd64-qemuu-freebsd12-amd64                       pass
 test-amd64-amd64-xl-qemuu-win7-amd64                         fail
 test-amd64-i386-xl-qemuu-win7-amd64                          fail
 test-amd64-amd64-xl-qemuu-ws16-amd64                         fail
 test-amd64-i386-xl-qemuu-ws16-amd64                          fail
 test-amd64-amd64-xl-qemuu-dmrestrict-amd64-dmrestrict        pass
 test-amd64-i386-xl-qemuu-dmrestrict-amd64-dmrestrict         pass
 test-amd64-amd64-qemuu-nested-intel                          pass
 test-amd64-i386-qemuu-rhel6hvm-intel                         pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-shadow             pass
 test-amd64-i386-xl-qemuu-debianhvm-amd64-shadow              pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/osstest/seabios.git
   0177400..dc88f9b  dc88f9b72df52b22c35b127b80c487e0b6fca4af -> xen-tested-master


