date:20230110

Re: Questions about how block devices use snapshots

2023-01-10 Thread Zhiyong Ye


Hi Kevin,

Thank you for your reply and detailed answers.

My scenario is an iSCSI SAN environment. The network storage device 
is connected to the physical machine via iSCSI, and LVM is used as the 
middle layer between the storage device and the VM for storage 
management and metadata synchronization. Every VM uses both raw and 
qcow2 formats, with the system disk being qcow2 and the data disk being 
raw. Therefore block devices need to support snapshots for both the raw 
and qcow2 formats. In addition, snapshot images should also be 
stored on the iSCSI storage, which is a block device.


Both internal and external snapshots can implement snapshots of block 
devices, but they both have their drawbacks when multiple snapshots are 
required.


Internal snapshots can only be used with the qcow2 format and do not 
require creating additional block devices. As you said, the block device 
must then be much larger than the virtual disk. It is impossible to tell 
in advance when that space will run out as more snapshots are created.


External snapshots require the creation of additional block devices to 
store the overlay images, but it is not clear how large these devices 
need to be. If each one is the same size as the virtual disk, multiple 
snapshots waste a lot of disk space, because each time a new snapshot is 
created the previous overlay becomes read-only. However, if the device 
is too small, snapshot data can no longer be written once it is full.


The problem with both is that the required size of the block device is 
unknown at creation time. Of course, we can rely on LVM's resize 
function to grow the block device dynamically, but I think this is more 
of a workaround.
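
For the external-snapshot case, at least the upper bound of an overlay's 
size can be estimated ahead of time. The following is only a minimal 
sketch in C, assuming default qcow2 parameters (64 KiB clusters, 16-bit 
refcounts, 8-byte L1/L2 entries) and no internal snapshots. The function 
name qcow2_fully_allocated_size is made up for illustration. In practice 
`qemu-img measure` should be preferred, since it knows the real format 
options.

```c
#include <assert.h>
#include <stdint.h>

/* Integer division rounding up. */
static uint64_t div_round_up(uint64_t a, uint64_t b)
{
    return (a + b - 1) / b;
}

/*
 * Hypothetical helper: estimate the size in bytes of a qcow2 image whose
 * virtual disk has been completely written. Assumes 8-byte L1/L2 entries,
 * 16-bit refcounts and no internal snapshots.
 */
uint64_t qcow2_fully_allocated_size(uint64_t virtual_size,
                                    uint64_t cluster_size)
{
    /* Cluster size must be a power of two of at least 512 bytes. */
    assert(cluster_size >= 512 && (cluster_size & (cluster_size - 1)) == 0);

    uint64_t data = div_round_up(virtual_size, cluster_size);
    uint64_t l2_tables = div_round_up(data, cluster_size / 8);
    uint64_t l1 = div_round_up(l2_tables * 8, cluster_size);
    uint64_t clusters = 1 /* header */ + l1 + l2_tables + data;

    /*
     * Refcount blocks and the refcount table are themselves refcounted,
     * so iterate until the count stops growing.
     */
    uint64_t per_block = cluster_size / 2; /* 16-bit refcounts */
    uint64_t blocks = 0, table = 0, prev;
    do {
        prev = blocks + table;
        blocks = div_round_up(clusters + prev, per_block);
        table = div_round_up(blocks * 8, cluster_size);
    } while (blocks + table != prev);

    return (clusters + blocks + table) * cluster_size;
}
```

Under these assumptions a 1 GiB virtual disk needs 1074135040 bytes, 
i.e. 384 KiB of metadata on top of the data, so an LV meant to hold a 
full overlay only has to be slightly larger than the virtual disk.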


The QEMU documentation mentions, under "QEMU disk image utility", that 
qemu-img rebase can be used to perform a “diff” operation on two disk 
images.


Say that base.img has been cloned as modified.img by copying it, and 
that the modified.img guest has run so there are now some changes 
compared to base.img. To construct a thin image called diff.qcow2 that 
contains just the differences, do:


qemu-img create -f qcow2 -b modified.img diff.qcow2
qemu-img rebase -b base.img diff.qcow2

At this point, modified.img can be discarded, since base.img + 
diff.qcow2 contains the same information.


Can this “diff” operation be used on snapshots of block devices? The 
first snapshot is a copy of the original disk (to save space we can copy 
only the data that has already been used), while the subsequent 
snapshots are based on the diff of the previous snapshot, so that the 
space required for the created block device is known at the time of the 
snapshot.


Regards

Zhiyong

On 1/9/23 9:57 PM, Kevin Wolf wrote:

Am 09.01.2023 um 13:45 hat Zhiyong Ye geschrieben:

Qemu provides powerful snapshot capabilities for different file
formats. But this is limited to the block backend being a file, and
support is not good enough when it is a block device. When creating
snapshots based on files, there is no need to specify the size of the
snapshot image, which can grow dynamically as the virtual machine is
used. But block devices are fixed in size at creation and cannot be
dynamically grown at a later time.

So is there any way to support snapshots when the block backend is a
block device?


In order to have snapshots, you need to have an image format like qcow2.

A qcow2 file can have a raw block device as its backing file, so even if
you store the overlay image on a filesystem, you have technically
snapshotted a block device. This may or may not be enough for your use
case.

It is also possible to store qcow2 files on block devices, though
depending on your requirements, it can get very tricky because then
you're responsible for making sure that there is always enough free
space on the block device.

So a second, still very simple, approach could be taking a second block
device that is a little bit larger than the virtual disk (for the qcow2
metadata) and using that as the external snapshot. Obviously, you require
a lot of disk space this way, because each snapshot needs to be able to
store the full image.

You could also use internal snapshots. In this case, you just need to
make sure that the block device is a lot larger than the virtual disk,
so that there is enough space left for storing the snapshots. At some
point it will be full.

And finally, for example if your block devices are actually LVs, you
could start resizing the block device dynamically as needed. This becomes
very complex quickly and you're on your own, but it is possible and has
been done by oVirt.

Kevin

[PULL 2/6] hw/nvme: rename shadow doorbell related trace events

2023-01-10 Thread Klaus Jensen

From: Klaus Jensen 

Rename the trace events related to writing the event index and reading
the doorbell value to make it more clear that the event is associated
with an actual update (write or read respectively).

Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Keith Busch 
Signed-off-by: Klaus Jensen 
---
 hw/nvme/ctrl.c   | 11 +++
 hw/nvme/trace-events |  8 
 2 files changed, 11 insertions(+), 8 deletions(-)

diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c
index 78c2f4e39d0a..cfe16476f0a4 100644
--- a/hw/nvme/ctrl.c
+++ b/hw/nvme/ctrl.c
@@ -1337,8 +1337,9 @@ static inline void nvme_blk_write(BlockBackend *blk, 
int64_t offset,
 static void nvme_update_cq_head(NvmeCQueue *cq)
 {
 pci_dma_read(PCI_DEVICE(cq->ctrl), cq->db_addr, &cq->head,
-sizeof(cq->head));
-trace_pci_nvme_shadow_doorbell_cq(cq->cqid, cq->head);
+ sizeof(cq->head));
+
+trace_pci_nvme_update_cq_head(cq->cqid, cq->head);
 }
 
 static void nvme_post_cqes(void *opaque)
@@ -6147,16 +6148,18 @@ static uint16_t nvme_admin_cmd(NvmeCtrl *n, NvmeRequest 
*req)
 
 static void nvme_update_sq_eventidx(const NvmeSQueue *sq)
 {
+trace_pci_nvme_update_sq_eventidx(sq->sqid, sq->tail);
+
 pci_dma_write(PCI_DEVICE(sq->ctrl), sq->ei_addr, &sq->tail,
   sizeof(sq->tail));
-trace_pci_nvme_eventidx_sq(sq->sqid, sq->tail);
 }
 
 static void nvme_update_sq_tail(NvmeSQueue *sq)
 {
 pci_dma_read(PCI_DEVICE(sq->ctrl), sq->db_addr, &sq->tail,
  sizeof(sq->tail));
-trace_pci_nvme_shadow_doorbell_sq(sq->sqid, sq->tail);
+
+trace_pci_nvme_update_sq_tail(sq->sqid, sq->tail);
 }
 
 static void nvme_process_sq(void *opaque)
diff --git a/hw/nvme/trace-events b/hw/nvme/trace-events
index fccb79f48973..b16f2260b4fd 100644
--- a/hw/nvme/trace-events
+++ b/hw/nvme/trace-events
@@ -84,8 +84,8 @@ pci_nvme_enqueue_event_noqueue(int queued) "queued %d"
 pci_nvme_enqueue_event_masked(uint8_t typ) "type 0x%"PRIx8""
 pci_nvme_no_outstanding_aers(void) "ignoring event; no outstanding AERs"
 pci_nvme_enqueue_req_completion(uint16_t cid, uint16_t cqid, uint32_t dw0, 
uint32_t dw1, uint16_t status) "cid %"PRIu16" cqid %"PRIu16" dw0 0x%"PRIx32" 
dw1 0x%"PRIx32" status 0x%"PRIx16""
-pci_nvme_eventidx_cq(uint16_t cqid, uint16_t new_eventidx) "cqid %"PRIu16" 
new_eventidx %"PRIu16""
-pci_nvme_eventidx_sq(uint16_t sqid, uint16_t new_eventidx) "sqid %"PRIu16" 
new_eventidx %"PRIu16""
+pci_nvme_update_cq_eventidx(uint16_t cqid, uint16_t new_eventidx) "cqid 
%"PRIu16" new_eventidx %"PRIu16""
+pci_nvme_update_sq_eventidx(uint16_t sqid, uint16_t new_eventidx) "sqid 
%"PRIu16" new_eventidx %"PRIu16""
 pci_nvme_mmio_read(uint64_t addr, unsigned size) "addr 0x%"PRIx64" size %d"
 pci_nvme_mmio_write(uint64_t addr, uint64_t data, unsigned size) "addr 
0x%"PRIx64" data 0x%"PRIx64" size %d"
 pci_nvme_mmio_doorbell_cq(uint16_t cqid, uint16_t new_head) "cqid %"PRIu16" 
new_head %"PRIu16""
@@ -102,8 +102,8 @@ pci_nvme_mmio_start_success(void) "setting controller 
enable bit succeeded"
 pci_nvme_mmio_stopped(void) "cleared controller enable bit"
 pci_nvme_mmio_shutdown_set(void) "shutdown bit set"
 pci_nvme_mmio_shutdown_cleared(void) "shutdown bit cleared"
-pci_nvme_shadow_doorbell_cq(uint16_t cqid, uint16_t new_shadow_doorbell) "cqid 
%"PRIu16" new_shadow_doorbell %"PRIu16""
-pci_nvme_shadow_doorbell_sq(uint16_t sqid, uint16_t new_shadow_doorbell) "sqid 
%"PRIu16" new_shadow_doorbell %"PRIu16""
+pci_nvme_update_cq_head(uint16_t cqid, uint16_t new_head) "cqid %"PRIu16" 
new_head %"PRIu16""
+pci_nvme_update_sq_tail(uint16_t sqid, uint16_t new_tail) "sqid %"PRIu16" 
new_tail %"PRIu16""
 pci_nvme_open_zone(uint64_t slba, uint32_t zone_idx, int all) "open zone, 
slba=%"PRIu64", idx=%"PRIu32", all=%"PRIi32""
 pci_nvme_close_zone(uint64_t slba, uint32_t zone_idx, int all) "close zone, 
slba=%"PRIu64", idx=%"PRIu32", all=%"PRIi32""
 pci_nvme_finish_zone(uint64_t slba, uint32_t zone_idx, int all) "finish zone, 
slba=%"PRIu64", idx=%"PRIu32", all=%"PRIi32""
-- 
2.39.0

[PULL 6/6] hw/nvme: cleanup error reporting in nvme_init_pci()

2023-01-10 Thread Klaus Jensen

From: Klaus Jensen 

Replace the local Error variable with errp and ERRP_GUARD() and change
the return value to bool.

Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Klaus Jensen 
---
 hw/nvme/ctrl.c | 25 -
 1 file changed, 12 insertions(+), 13 deletions(-)

diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c
index b21455ada660..f25cc2c235e9 100644
--- a/hw/nvme/ctrl.c
+++ b/hw/nvme/ctrl.c
@@ -7290,15 +7290,14 @@ static int nvme_add_pm_capability(PCIDevice *pci_dev, 
uint8_t offset)
 return 0;
 }
 
-static int nvme_init_pci(NvmeCtrl *n, PCIDevice *pci_dev, Error **errp)
+static bool nvme_init_pci(NvmeCtrl *n, PCIDevice *pci_dev, Error **errp)
 {
+ERRP_GUARD();
 uint8_t *pci_conf = pci_dev->config;
 uint64_t bar_size;
 unsigned msix_table_offset, msix_pba_offset;
 int ret;
 
-Error *err = NULL;
-
 pci_conf[PCI_INTERRUPT_PIN] = 1;
 pci_config_set_prog_interface(pci_conf, 0x2);
 
@@ -7335,14 +7334,14 @@ static int nvme_init_pci(NvmeCtrl *n, PCIDevice 
*pci_dev, Error **errp)
 }
 ret = msix_init(pci_dev, n->params.msix_qsize,
 &n->bar0, 0, msix_table_offset,
-&n->bar0, 0, msix_pba_offset, 0, &err);
-if (ret < 0) {
-if (ret == -ENOTSUP) {
-warn_report_err(err);
-} else {
-error_propagate(errp, err);
-return ret;
-}
+&n->bar0, 0, msix_pba_offset, 0, errp);
+if (ret == -ENOTSUP) {
+/* report that msix is not supported, but do not error out */
+warn_report_err(*errp);
+*errp = NULL;
+} else if (ret < 0) {
+/* propagate error to caller */
+return false;
 }
 
 nvme_update_msixcap_ts(pci_dev, n->conf_msix_qsize);
@@ -7359,7 +7358,7 @@ static int nvme_init_pci(NvmeCtrl *n, PCIDevice *pci_dev, 
Error **errp)
 nvme_init_sriov(n, pci_dev, 0x120);
 }
 
-return 0;
+return true;
 }
 
 static void nvme_init_subnqn(NvmeCtrl *n)
@@ -7535,7 +7534,7 @@ static void nvme_realize(PCIDevice *pci_dev, Error **errp)
 return;
 }
 nvme_init_state(n);
-if (nvme_init_pci(n, pci_dev, errp)) {
+if (!nvme_init_pci(n, pci_dev, errp)) {
 return;
 }
 nvme_init_ctrl(n, pci_dev);
-- 
2.39.0
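
The control flow of the change above can be illustrated outside of QEMU. 
This is only a sketch: `Error`, `set_error()`, `mock_msix_init()` and 
`init_pci_sketch()` are hypothetical stand-ins, not QEMU's error API, and 
the sketch assumes errp is never NULL (which is what ERRP_GUARD() 
guarantees in the real function).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for QEMU's Error object. */
typedef struct Error {
    char msg[64];
} Error;

/* Store a new error in *errp; the slot must be empty. */
static void set_error(Error **errp, const char *msg)
{
    assert(errp && *errp == NULL);
    Error *err = calloc(1, sizeof(*err));
    assert(err);
    snprintf(err->msg, sizeof(err->msg), "%s", msg);
    *errp = err;
}

/* Mock sub-step: returns 0 on success or the given negative errno. */
static int mock_msix_init(int simulated_ret, Error **errp)
{
    if (simulated_ret < 0) {
        set_error(errp, "mock msix_init failure");
    }
    return simulated_ret;
}

/*
 * Returns true on success. On failure, returns false and leaves the
 * error in *errp for the caller. errp must not be NULL in this sketch.
 */
bool init_pci_sketch(int simulated_ret, Error **errp)
{
    int ret = mock_msix_init(simulated_ret, errp);

    if (ret == -ENOTSUP) {
        /* Not fatal: warn, then consume the error so the caller sees none. */
        fprintf(stderr, "warning: %s\n", (*errp)->msg);
        free(*errp);
        *errp = NULL;
    } else if (ret < 0) {
        return false;
    }
    return true;
}
```

With a bool return value the caller only has to test the result, as in 
`if (!init_pci_sketch(ret, &err)) { return; }`, and never has to inspect 
a separate local error variable.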

[PULL 4/6] hw/nvme: fix missing cq eventidx update

2023-01-10 Thread Klaus Jensen

From: Klaus Jensen 

Prior to reading the shadow doorbell cq head, we have to update the
eventidx. Otherwise, we risk that the driver will skip an mmio doorbell
write. This happens on riscv64, as reported by Guenter.

Adding the missing update to the cq eventidx fixes the issue.

Fixes: 3f7fe8de3d49 ("hw/nvme: Implement shadow doorbell buffer support")
Cc: qemu-sta...@nongnu.org
Cc: qemu-ri...@nongnu.org
Reported-by: Guenter Roeck 
Reviewed-by: Keith Busch 
Signed-off-by: Klaus Jensen 
---
 hw/nvme/ctrl.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c
index 28e02ec7baa6..226480033771 100644
--- a/hw/nvme/ctrl.c
+++ b/hw/nvme/ctrl.c
@@ -1334,6 +1334,15 @@ static inline void nvme_blk_write(BlockBackend *blk, 
int64_t offset,
 }
 }
 
+static void nvme_update_cq_eventidx(const NvmeCQueue *cq)
+{
+uint32_t v = cpu_to_le32(cq->head);
+
+trace_pci_nvme_update_cq_eventidx(cq->cqid, cq->head);
+
+pci_dma_write(PCI_DEVICE(cq->ctrl), cq->ei_addr, &v, sizeof(v));
+}
+
 static void nvme_update_cq_head(NvmeCQueue *cq)
 {
 uint32_t v;
@@ -1358,6 +1367,7 @@ static void nvme_post_cqes(void *opaque)
 hwaddr addr;
 
 if (n->dbbuf_enabled) {
+nvme_update_cq_eventidx(cq);
 nvme_update_cq_head(cq);
 }
 
-- 
2.39.0
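
To see why a stale eventidx makes the driver skip an mmio doorbell 
write, consider the kind of check a driver performs before ringing the 
doorbell. The sketch below is modeled on the virtio-style event index 
comparison, which, as far as I know, is also what the Linux NVMe driver 
uses for shadow doorbells. Treat that as an assumption; the function 
names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Driver-side check (hypothetical name): after moving a doorbell value
 * from old_idx to new_idx, an MMIO write is needed only if the device's
 * event index lies in the range [old_idx, new_idx). All arithmetic is
 * modulo 2^16, so wrap-around is handled.
 */
bool dbbuf_need_mmio(uint16_t event_idx, uint16_t new_idx, uint16_t old_idx)
{
    return (uint16_t)(new_idx - event_idx - 1) < (uint16_t)(new_idx - old_idx);
}

/* The two cases discussed above, as executable documentation. */
void dbbuf_examples(void)
{
    /* Event index kept up to date at 6: the update 6 -> 7 rings MMIO. */
    assert(dbbuf_need_mmio(6, 7, 6));
    /* Stale event index 5: the same update is silently skipped. */
    assert(!dbbuf_need_mmio(5, 7, 6));
}
```

If the device never refreshes the cq eventidx, the second case is the 
one the driver keeps hitting, so the device is not notified of the new 
head value.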

[PULL 1/6] hw/nvme: use QOM accessors

2023-01-10 Thread Klaus Jensen

From: Klaus Jensen 

Replace various ->parent_obj use with the equivalent QOM accessors.

Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Klaus Jensen 
---
 hw/nvme/ctrl.c | 89 +++---
 1 file changed, 48 insertions(+), 41 deletions(-)

diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c
index 4a0c51a9477e..78c2f4e39d0a 100644
--- a/hw/nvme/ctrl.c
+++ b/hw/nvme/ctrl.c
@@ -449,7 +449,7 @@ static int nvme_addr_read(NvmeCtrl *n, hwaddr addr, void 
*buf, int size)
 return 0;
 }
 
-return pci_dma_read(&n->parent_obj, addr, buf, size);
+return pci_dma_read(PCI_DEVICE(n), addr, buf, size);
 }
 
 static int nvme_addr_write(NvmeCtrl *n, hwaddr addr, const void *buf, int size)
@@ -469,7 +469,7 @@ static int nvme_addr_write(NvmeCtrl *n, hwaddr addr, const 
void *buf, int size)
 return 0;
 }
 
-return pci_dma_write(&n->parent_obj, addr, buf, size);
+return pci_dma_write(PCI_DEVICE(n), addr, buf, size);
 }
 
 static bool nvme_nsid_valid(NvmeCtrl *n, uint32_t nsid)
@@ -514,24 +514,27 @@ static uint8_t nvme_sq_empty(NvmeSQueue *sq)
 
 static void nvme_irq_check(NvmeCtrl *n)
 {
+PCIDevice *pci = PCI_DEVICE(n);
 uint32_t intms = ldl_le_p(&n->bar.intms);
 
-if (msix_enabled(&(n->parent_obj))) {
+if (msix_enabled(pci)) {
 return;
 }
 if (~intms & n->irq_status) {
-pci_irq_assert(&n->parent_obj);
+pci_irq_assert(pci);
 } else {
-pci_irq_deassert(&n->parent_obj);
+pci_irq_deassert(pci);
 }
 }
 
 static void nvme_irq_assert(NvmeCtrl *n, NvmeCQueue *cq)
 {
+PCIDevice *pci = PCI_DEVICE(n);
+
 if (cq->irq_enabled) {
-if (msix_enabled(&(n->parent_obj))) {
+if (msix_enabled(pci)) {
 trace_pci_nvme_irq_msix(cq->vector);
-msix_notify(&(n->parent_obj), cq->vector);
+msix_notify(pci, cq->vector);
 } else {
 trace_pci_nvme_irq_pin();
 assert(cq->vector < 32);
@@ -546,7 +549,7 @@ static void nvme_irq_assert(NvmeCtrl *n, NvmeCQueue *cq)
 static void nvme_irq_deassert(NvmeCtrl *n, NvmeCQueue *cq)
 {
 if (cq->irq_enabled) {
-if (msix_enabled(&(n->parent_obj))) {
+if (msix_enabled(PCI_DEVICE(n))) {
 return;
 } else {
 assert(cq->vector < 32);
@@ -570,7 +573,7 @@ static void nvme_req_clear(NvmeRequest *req)
 static inline void nvme_sg_init(NvmeCtrl *n, NvmeSg *sg, bool dma)
 {
 if (dma) {
-pci_dma_sglist_init(&sg->qsg, &n->parent_obj, 0);
+pci_dma_sglist_init(&sg->qsg, PCI_DEVICE(n), 0);
 sg->flags = NVME_SG_DMA;
 } else {
 qemu_iovec_init(&sg->iov, 0);
@@ -1333,7 +1336,7 @@ static inline void nvme_blk_write(BlockBackend *blk, 
int64_t offset,
 
 static void nvme_update_cq_head(NvmeCQueue *cq)
 {
-pci_dma_read(&cq->ctrl->parent_obj, cq->db_addr, &cq->head,
+pci_dma_read(PCI_DEVICE(cq->ctrl), cq->db_addr, &cq->head,
 sizeof(cq->head));
 trace_pci_nvme_shadow_doorbell_cq(cq->cqid, cq->head);
 }
@@ -1363,7 +1366,7 @@ static void nvme_post_cqes(void *opaque)
 req->cqe.sq_id = cpu_to_le16(sq->sqid);
 req->cqe.sq_head = cpu_to_le16(sq->head);
 addr = cq->dma_addr + cq->tail * n->cqe_size;
-ret = pci_dma_write(&n->parent_obj, addr, (void *)&req->cqe,
+ret = pci_dma_write(PCI_DEVICE(n), addr, (void *)&req->cqe,
 sizeof(req->cqe));
 if (ret) {
 trace_pci_nvme_err_addr_write(addr);
@@ -4615,6 +4618,7 @@ static uint16_t nvme_get_log(NvmeCtrl *n, NvmeRequest 
*req)
 
 static void nvme_free_cq(NvmeCQueue *cq, NvmeCtrl *n)
 {
+PCIDevice *pci = PCI_DEVICE(n);
 uint16_t offset = (cq->cqid << 3) + (1 << 2);
 
 n->cq[cq->cqid] = NULL;
@@ -4625,8 +4629,8 @@ static void nvme_free_cq(NvmeCQueue *cq, NvmeCtrl *n)
 event_notifier_set_handler(&cq->notifier, NULL);
 event_notifier_cleanup(&cq->notifier);
 }
-if (msix_enabled(&n->parent_obj)) {
-msix_vector_unuse(&n->parent_obj, cq->vector);
+if (msix_enabled(pci)) {
+msix_vector_unuse(pci, cq->vector);
 }
 if (cq->cqid) {
 g_free(cq);
@@ -4664,8 +4668,10 @@ static void nvme_init_cq(NvmeCQueue *cq, NvmeCtrl *n, 
uint64_t dma_addr,
  uint16_t cqid, uint16_t vector, uint16_t size,
  uint16_t irq_enabled)
 {
-if (msix_enabled(&n->parent_obj)) {
-msix_vector_use(&n->parent_obj, vector);
+PCIDevice *pci = PCI_DEVICE(n);
+
+if (msix_enabled(pci)) {
+msix_vector_use(pci, vector);
 }
 cq->ctrl = n;
 cq->cqid = cqid;
@@ -4716,7 +4722,7 @@ static uint16_t nvme_create_cq(NvmeCtrl *n, NvmeRequest 
*req)
 trace_pci_nvme_err_invalid_create_cq_addr(prp1);
 return NVME_INVALID_PRP_OFFSET | NVME_DNR;
 }
-if (unlikely(!msix_enabled(&n->parent_obj) && vector)) {
+if (unlikely(!msix_enabled(PCI_DEVICE(n)) && vector

[PULL 3/6] hw/nvme: fix missing endian conversions for doorbell buffers

2023-01-10 Thread Klaus Jensen

From: Klaus Jensen 

The eventidx and doorbell value are not handling endianness correctly.
Fix this.

Fixes: 3f7fe8de3d49 ("hw/nvme: Implement shadow doorbell buffer support")
Cc: qemu-sta...@nongnu.org
Reported-by: Guenter Roeck 
Reviewed-by: Keith Busch 
Signed-off-by: Klaus Jensen 
---
 hw/nvme/ctrl.c | 19 +--
 1 file changed, 13 insertions(+), 6 deletions(-)

diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c
index cfe16476f0a4..28e02ec7baa6 100644
--- a/hw/nvme/ctrl.c
+++ b/hw/nvme/ctrl.c
@@ -1336,8 +1336,11 @@ static inline void nvme_blk_write(BlockBackend *blk, 
int64_t offset,
 
 static void nvme_update_cq_head(NvmeCQueue *cq)
 {
-pci_dma_read(PCI_DEVICE(cq->ctrl), cq->db_addr, &cq->head,
- sizeof(cq->head));
+uint32_t v;
+
+pci_dma_read(PCI_DEVICE(cq->ctrl), cq->db_addr, &v, sizeof(v));
+
+cq->head = le32_to_cpu(v);
 
 trace_pci_nvme_update_cq_head(cq->cqid, cq->head);
 }
@@ -6148,16 +6151,20 @@ static uint16_t nvme_admin_cmd(NvmeCtrl *n, NvmeRequest 
*req)
 
 static void nvme_update_sq_eventidx(const NvmeSQueue *sq)
 {
+uint32_t v = cpu_to_le32(sq->tail);
+
 trace_pci_nvme_update_sq_eventidx(sq->sqid, sq->tail);
 
-pci_dma_write(PCI_DEVICE(sq->ctrl), sq->ei_addr, &sq->tail,
-  sizeof(sq->tail));
+pci_dma_write(PCI_DEVICE(sq->ctrl), sq->ei_addr, &v, sizeof(v));
 }
 
 static void nvme_update_sq_tail(NvmeSQueue *sq)
 {
-pci_dma_read(PCI_DEVICE(sq->ctrl), sq->db_addr, &sq->tail,
- sizeof(sq->tail));
+uint32_t v;
+
+pci_dma_read(PCI_DEVICE(sq->ctrl), sq->db_addr, &v, sizeof(v));
+
+sq->tail = le32_to_cpu(v);
 
 trace_pci_nvme_update_sq_tail(sq->sqid, sq->tail);
 }
-- 
2.39.0
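
The patch above relies on cpu_to_le32() and le32_to_cpu(). As a rough, 
host-independent illustration of what such conversions achieve, the 
sketch below stores and loads a 32-bit value byte by byte in 
little-endian order. The names put_le32 and get_le32 are made up for 
this example and are not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Store a host-order value as four little-endian bytes. */
void put_le32(uint8_t *buf, uint32_t v)
{
    assert(buf);
    buf[0] = (uint8_t)(v & 0xff);
    buf[1] = (uint8_t)((v >> 8) & 0xff);
    buf[2] = (uint8_t)((v >> 16) & 0xff);
    buf[3] = (uint8_t)((v >> 24) & 0xff);
}

/* Load four little-endian bytes into a host-order value. */
uint32_t get_le32(const uint8_t *buf)
{
    assert(buf);
    return (uint32_t)buf[0] | ((uint32_t)buf[1] << 8) |
           ((uint32_t)buf[2] << 16) | ((uint32_t)buf[3] << 24);
}
```

Copying the raw host-order variable into guest memory, as the old code 
did, yields the same bytes only on a little-endian host. On a big-endian 
host the guest would read a byte-swapped doorbell or eventidx value.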

[PULL 0/6] hw/nvme updates

2023-01-10 Thread Klaus Jensen

From: Klaus Jensen 

Hi Peter,

The following changes since commit 528d9f33cad5245c1099d77084c78bb2244d5143:

  Merge tag 'pull-tcg-20230106' of https://gitlab.com/rth7680/qemu into staging 
(2023-01-08 11:23:17 +)

are available in the Git repository at:

  https://gitlab.com/birkelund/qemu.git tags/nvme-next-pull-request

for you to fetch changes up to 973f76cf7743545a5d8a0a8bfdfe2cd02aa3e238:

  hw/nvme: cleanup error reporting in nvme_init_pci() (2023-01-11 08:41:19 
+0100)


hw/nvme updates



Klaus Jensen (6):
  hw/nvme: use QOM accessors
  hw/nvme: rename shadow doorbell related trace events
  hw/nvme: fix missing endian conversions for doorbell buffers
  hw/nvme: fix missing cq eventidx update
  hw/nvme: clean up confusing use of errp/local_err
  hw/nvme: cleanup error reporting in nvme_init_pci()

 hw/nvme/ctrl.c   | 194 ---
 hw/nvme/trace-events |   8 +-
 2 files changed, 113 insertions(+), 89 deletions(-)

-- 
2.39.0

[PULL 5/6] hw/nvme: clean up confusing use of errp/local_err

2023-01-10 Thread Klaus Jensen

From: Klaus Jensen 

Remove an unnecessary local Error value in nvme_realize(). In the
process, change nvme_check_constraints() to return a bool.

Reviewed-by: Markus Armbruster 
Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Klaus Jensen 
---
 hw/nvme/ctrl.c | 48 +++-
 1 file changed, 23 insertions(+), 25 deletions(-)

diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c
index 226480033771..b21455ada660 100644
--- a/hw/nvme/ctrl.c
+++ b/hw/nvme/ctrl.c
@@ -6981,7 +6981,7 @@ static const MemoryRegionOps nvme_cmb_ops = {
 },
 };
 
-static void nvme_check_constraints(NvmeCtrl *n, Error **errp)
+static bool nvme_check_params(NvmeCtrl *n, Error **errp)
 {
 NvmeParams *params = &n->params;
 
@@ -6995,38 +6995,38 @@ static void nvme_check_constraints(NvmeCtrl *n, Error 
**errp)
 if (n->namespace.blkconf.blk && n->subsys) {
 error_setg(errp, "subsystem support is unavailable with legacy "
"namespace ('drive' property)");
-return;
+return false;
 }
 
 if (params->max_ioqpairs < 1 ||
 params->max_ioqpairs > NVME_MAX_IOQPAIRS) {
 error_setg(errp, "max_ioqpairs must be between 1 and %d",
NVME_MAX_IOQPAIRS);
-return;
+return false;
 }
 
 if (params->msix_qsize < 1 ||
 params->msix_qsize > PCI_MSIX_FLAGS_QSIZE + 1) {
 error_setg(errp, "msix_qsize must be between 1 and %d",
PCI_MSIX_FLAGS_QSIZE + 1);
-return;
+return false;
 }
 
 if (!params->serial) {
 error_setg(errp, "serial property not set");
-return;
+return false;
 }
 
 if (n->pmr.dev) {
 if (host_memory_backend_is_mapped(n->pmr.dev)) {
 error_setg(errp, "can't use already busy memdev: %s",

object_get_canonical_path_component(OBJECT(n->pmr.dev)));
-return;
+return false;
 }
 
 if (!is_power_of_2(n->pmr.dev->size)) {
 error_setg(errp, "pmr backend size needs to be power of 2 in 
size");
-return;
+return false;
 }
 
 host_memory_backend_set_mapped(n->pmr.dev, true);
@@ -7035,64 +7035,64 @@ static void nvme_check_constraints(NvmeCtrl *n, Error 
**errp)
 if (n->params.zasl > n->params.mdts) {
 error_setg(errp, "zoned.zasl (Zone Append Size Limit) must be less "
"than or equal to mdts (Maximum Data Transfer Size)");
-return;
+return false;
 }
 
 if (!n->params.vsl) {
 error_setg(errp, "vsl must be non-zero");
-return;
+return false;
 }
 
 if (params->sriov_max_vfs) {
 if (!n->subsys) {
 error_setg(errp, "subsystem is required for the use of SR-IOV");
-return;
+return false;
 }
 
 if (params->sriov_max_vfs > NVME_MAX_VFS) {
 error_setg(errp, "sriov_max_vfs must be between 0 and %d",
NVME_MAX_VFS);
-return;
+return false;
 }
 
 if (params->cmb_size_mb) {
 error_setg(errp, "CMB is not supported with SR-IOV");
-return;
+return false;
 }
 
 if (n->pmr.dev) {
 error_setg(errp, "PMR is not supported with SR-IOV");
-return;
+return false;
 }
 
 if (!params->sriov_vq_flexible || !params->sriov_vi_flexible) {
 error_setg(errp, "both sriov_vq_flexible and sriov_vi_flexible"
" must be set for the use of SR-IOV");
-return;
+return false;
 }
 
 if (params->sriov_vq_flexible < params->sriov_max_vfs * 2) {
 error_setg(errp, "sriov_vq_flexible must be greater than or equal"
" to %d (sriov_max_vfs * 2)", params->sriov_max_vfs * 
2);
-return;
+return false;
 }
 
 if (params->max_ioqpairs < params->sriov_vq_flexible + 2) {
 error_setg(errp, "(max_ioqpairs - sriov_vq_flexible) must be"
" greater than or equal to 2");
-return;
+return false;
 }
 
 if (params->sriov_vi_flexible < params->sriov_max_vfs) {
 error_setg(errp, "sriov_vi_flexible must be greater than or equal"
" to %d (sriov_max_vfs)", params->sriov_max_vfs);
-return;
+return false;
 }
 
 if (params->msix_qsize < params->sriov_vi_flexible + 1) {
 error_setg(errp, "(msix_qsize - sriov_vi_flexible) must be"
" greater than or equal to 1");
-return;
+return false;
 }
 
 if (params->sriov_max_vi_per_vf &&
@@ -7100,7 +7100,7 @@ static void nvme_check_constraints(NvmeCtrl *n, Error 
**errp)
 error_setg(errp, "sriov_max_vi_per_vf must meet:"

Re: [PATCH v9 0/9] support subsets of code size reduction extension

2023-01-10 Thread weiwei




On 2023/1/11 13:00, Alistair Francis wrote:

On Wed, Dec 28, 2022 at 4:23 PM Weiwei Li  wrote:

This patchset implements RISC-V Zc* extension v1.0.0.RC5.7 version instructions.

Specification:
https://github.com/riscv/riscv-code-size-reduction/tree/main/Zc-specification

The port is available here:
https://github.com/plctlab/plct-qemu/tree/plct-zce-upstream-v9

To test Zc* implementation, specify cpu argument with "x-zca=true,x-zcb=true,x-zcf=true,f=true" and 
"x-zcd=true,d=true" (or "x-zcmp=true,x-zcmt=true" with c or d=false) to enable 
Zca/Zcb/Zcf and Zcd (or Zcmp, Zcmt) extensions support.


This implementation can pass the basic zc tests from 
https://github.com/yulong-plct/zc-test

v9:
* rebase on riscv-to-apply.next

v8:
* improve disas support in Patch 9

v7:
* Fix description for Zca

v6:
* fix base address for jump table in Patch 7
* rebase on riscv-to-apply.next

v5:
* fix exception unwind problem for cpu_ld*_code in helper of cm_jalt

v4:
* improve Zcmp suggested by Richard
* fix stateen related check for Zcmt

v3:
* update the solution for Zcf to the way of Zcd
* update Zcb to reuse gen_load/store
* use trans function instead of helper for push/pop

v2:
* add check for relationship between Zca/Zcf/Zcd with C/F/D based on related 
discussion in review of Zc* spec
* separate c.fld{sp}/fsd{sp} with fld{sp}/fsd{sp} before support of zcmp/zcmt

Weiwei Li (9):
   target/riscv: add cfg properties for Zc* extension
   target/riscv: add support for Zca extension
   target/riscv: add support for Zcf extension
   target/riscv: add support for Zcd extension
   target/riscv: add support for Zcb extension
   target/riscv: add support for Zcmp extension
   target/riscv: add support for Zcmt extension
   target/riscv: expose properties for Zc* extension
   disas/riscv.c: add disasm support for Zc*

This series broke a range of boards that use specific CPUs. I have
dropped it from my tree.

Daniel has sent a series that should fix it though
(https://www.mail-archive.com/qemu-devel@nongnu.org/msg930952.html). I
have applied his fixes. Can you rebase this series on
https://github.com/alistair23/qemu/tree/riscv-to-apply.next, test to
ensure the SiFive boards continue to work and then re-send the series?

Alistair


It seems that "C implies Zca" is not applied on specific CPUs. This will 
be fixed if the Zc*-related check is moved to 
riscv_cpu_validate_set_extensions, as in Daniel's series.

I'll rebase on it and test the CPUs in the next version.
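
A rough sketch of the intended behaviour, assuming my reading of the Zc* 
spec is right (C implies Zca, plus Zcf on RV32 when F is present, and 
Zcd when D is present). The struct and function names here are 
hypothetical and much simpler than the real CPU configuration code. The 
point is only that the implication is derived from the C/F/D flags in 
one validation step, so it also applies to CPUs with a fixed extension 
set.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified view of a CPU's extension flags. */
struct ext_cfg {
    bool rv32;
    bool ext_c, ext_f, ext_d;
    bool ext_zca, ext_zcf, ext_zcd;
};

/* Derive the Zc* subsets implied by C, independent of the CPU model. */
void apply_c_implications(struct ext_cfg *cfg)
{
    assert(cfg);
    if (!cfg->ext_c) {
        return;
    }
    cfg->ext_zca = true;
    if (cfg->ext_f && cfg->rv32) {
        cfg->ext_zcf = true;   /* Zcf only exists on RV32 */
    }
    if (cfg->ext_d) {
        cfg->ext_zcd = true;
    }
}
```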

Regards,

Weiwei Li


  disas/riscv.c                             | 228 +++-
  target/riscv/cpu.c                        |  56
  target/riscv/cpu.h                        |  10 +
  target/riscv/cpu_bits.h                   |   7 +
  target/riscv/csr.c                        |  38 ++-
  target/riscv/helper.h                     |   3 +
  target/riscv/insn16.decode                |  63 -
  target/riscv/insn_trans/trans_rvd.c.inc   |  18 ++
  target/riscv/insn_trans/trans_rvf.c.inc   |  18 ++
  target/riscv/insn_trans/trans_rvi.c.inc   |   4 +-
  target/riscv/insn_trans/trans_rvzce.c.inc | 313 ++
  target/riscv/machine.c                    |  19 ++
  target/riscv/meson.build                  |   3 +-
  target/riscv/translate.c                  |  15 +-
  target/riscv/zce_helper.c                 |  55
  15 files changed, 834 insertions(+), 16 deletions(-)
  create mode 100644 target/riscv/insn_trans/trans_rvzce.c.inc
  create mode 100644 target/riscv/zce_helper.c

--
2.25.1

Re: virtio-iommu issue with VFIO device downstream to a PCIe-to-PCI bridge: VFIO devices are not assigned any iommu group

2023-01-10 Thread Jason Wang

On Tue, Jan 10, 2023 at 5:11 AM Eric Auger  wrote:
>
> Hi,
>
> On 1/9/23 14:24, Eric Auger wrote:
> > Hi,
> >
> > we have a trouble with virtio-iommu and protected assigned devices
> > downstream to a pcie-to-pci bridge. In that use case we observe the
> > assigned devices are not put to any group. This is true on both x86 and
> > aarch64. This use case works with intel-iommu.
> >
> > *** Guest PCI topology is:
> > lspci -tv
> > -[:00]-+-00.0  Intel Corporation 82G33/G31/P35/P31 Express DRAM
> > Controller
> >+-01.0  Device 1234:
> >+-02.0-[01-02]00.0-[02]01.0  Broadcom Inc. and
> > subsidiaries BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet Controller
> >+-02.1-[03]--
> >+-02.2-[04]00.0  Red Hat, Inc. Virtio block device
> >+-0a.0  Red Hat, Inc. Device 1057
> >+-1f.0  Intel Corporation 82801IB (ICH9) LPC Interface Controller
> >+-1f.2  Intel Corporation 82801IR/IO/IH (ICH9R/DO/DH) 6 port
> > SATA Controller [AHCI mode]
> >\-1f.3  Intel Corporation 82801I (ICH9 Family) SMBus Controller
> >
> >
> > All the assigned devices are aliased and they get devfn=0x0.
> > see qemu pci_device_iommu_address_space in hw/pci.c
> >
> > Initially I see the following traces
> > pci_device_iommu_address_space name=vfio-pci BDF=0x8 bus=0 devfn=0x8
> > pci_device_iommu_address_space name=vfio-pci BDF=0x8 bus=0 devfn=0x8
> > call iommu_fn with bus=0x55f556dde180 and devfn=0
> > virtio_iommu_init_iommu_mr init virtio-iommu-memory-region-0-0
> >
> > Note the bus is 0 at this time and devfn that is used in the
> > virtio-iommu is 0. So an associated IOMMU MR is created with this bus at
> > devfn=0 slot. This is before bus actual numbering.
> >
> > However later on, I see virtio_iommu_probe() and virtio_iommu_attach()
> > getting called with ep_id=520
> > because in the qemu virtio-iommu device, virtio_iommu_mr(pe_id) fails to
> > find the iommu_mr and returns -ENOENT
> >
> > On guest side I see that
> > acpi_iommu_configure_id/iommu_probe_device() fails
> > (__iommu_probe_device) and also __iommu_attach_device would also fail
> > anyway.
> >
> > I guess those get called before actual bus number recomputation?
> >
> > on aarch64 I eventually see the "good" MR being created, i.e. featuring
> > the right bus number:
> > qemu-system-aarch64: pci_device_iommu_address_space name=vfio-pci
> > BDF=0x208 bus=2 devfn=0x8
> > qemu-system-aarch64: pci_device_iommu_address_space name=vfio-pci
> > BDF=0x208 bus=2 devfn=0x8 call iommu_fn with bus=0xef12c450 and devfn=0
> >
> > But this does not happen on x86.
> >
> > Jean, do you have any idea about how to fix that? Do you think we have a
> > trouble in the acpi/viot setup or virtio-iommu probe sequence. It looks
> > like virtio probe and attach commands are called too early, before the
> > bus is actually correctly numbered.
>
> So after further investigation, it looks like this is not a problem with
> the bus number, which is correct at the time of the virtio cmd calls, but
> rather a problem related to the devfn (0 was used when creating the IOMMU
> MR) whereas the virtio-iommu cmds look for the non-aliased devfn. With that
> fixed, the probe and attach at least succeed. The device still does not
> work for me but I will continue my investigations and send a tentative fix.

I haven't thought about this deeply, but one thing comes to mind that
may help:

intel-iommu doesn't use the bus number as the key for hashing address
spaces, since the bus number can be configured by the guest:

/*
 * Note that we use pointer to PCIBus as the key, so hashing/shifting
 * based on the pointer value is intended. Note that we deal with
 * collisions through vtd_as_equal().
 */
static guint vtd_as_hash(gconstpointer v)
{
const struct vtd_as_key *key = v;
guint value = (guint)(uintptr_t)key->bus;

return (guint)(value << 8 | key->devfn);
}
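
A minimal standalone sketch of the same idea without GLib, to show the 
key shape more than the exact code. The names as_key, as_key_hash and 
as_key_equal are hypothetical; the comparison function is my assumption 
of what an equality check for such a key looks like, not a copy of 
vtd_as_equal().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical key: identity of the bus object plus devfn. */
struct as_key {
    const void *bus;   /* stable for the lifetime of the bus object */
    uint8_t devfn;
};

unsigned as_key_hash(const struct as_key *key)
{
    assert(key);
    unsigned value = (unsigned)(uintptr_t)key->bus;
    return value << 8 | key->devfn;
}

/* Hash collisions are resolved by comparing both fields exactly. */
bool as_key_equal(const struct as_key *a, const struct as_key *b)
{
    assert(a && b);
    return a->bus == b->bus && a->devfn == b->devfn;
}
```

Because the key never contains the bus number, a lookup made before the 
guest programs the bus numbers finds the same entry as a lookup made 
afterwards.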

Thanks

>
> Thanks
>
> Eric
> >
> > Thanks
> >
> > Eric
> >
> >
> >
> >
> >
> >
> >
>

[PATCH] hw/net/lan9118: log [read|write]b when mode_16bit is enabled rather than abort

2023-01-10 Thread Qiang Liu

This patch replaces hw_error with a guest error log for [read|write]b
accesses when mode_16bit is enabled. This avoids aborting QEMU.

Fixes: 1248f8d4cbc3 ("hw/lan9118: Add basic 16-bit mode support.")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1433
Reported-by: Qiang Liu 
Signed-off-by: Qiang Liu 
---
 hw/net/lan9118.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/hw/net/lan9118.c b/hw/net/lan9118.c
index f1cba55967..7f35715f27 100644
--- a/hw/net/lan9118.c
+++ b/hw/net/lan9118.c
@@ -1209,7 +1209,8 @@ static void lan9118_16bit_mode_write(void *opaque, hwaddr 
offset,
 return;
 }
 
-hw_error("lan9118_write: Bad size 0x%x\n", size);
+qemu_log_mask(LOG_GUEST_ERROR,
+  "lan9118_16bit_mode_write: Bad size 0x%x\n", size);
 }
 
 static uint64_t lan9118_readl(void *opaque, hwaddr offset,
@@ -1324,7 +1325,8 @@ static uint64_t lan9118_16bit_mode_read(void *opaque, 
hwaddr offset,
 return lan9118_readl(opaque, offset, size);
 }
 
-hw_error("lan9118_read: Bad size 0x%x\n", size);
+qemu_log_mask(LOG_GUEST_ERROR,
+  "lan9118_16bit_mode_read: Bad size 0x%x\n", size);
 return 0;
 }
 
-- 
2.25.1

Re: [PATCH] bulk: Rename TARGET_FMT_plx -> HWADDR_FMT_plx

2023-01-10 Thread Philippe Mathieu-Daudé


On 10/1/23 23:01, BALATON Zoltan wrote:
> On Tue, 10 Jan 2023, Philippe Mathieu-Daudé wrote:
>> The 'hwaddr' type is defined in "exec/hwaddr.h" as:
>>
>>     hwaddr is the type of a physical address
>>     (its size can be different from 'target_ulong').
>>
>> All definitions use the 'HWADDR_' prefix, except TARGET_FMT_plx:
>>
>> $ fgrep define include/exec/hwaddr.h
>> #define HWADDR_H
>> #define HWADDR_BITS 64
>> #define HWADDR_MAX UINT64_MAX
>> #define TARGET_FMT_plx "%016" PRIx64
>>         ^^
>> #define HWADDR_PRId PRId64
>> #define HWADDR_PRIi PRIi64
>> #define HWADDR_PRIo PRIo64
>> #define HWADDR_PRIu PRIu64
>> #define HWADDR_PRIx PRIx64
>
> Why are there both TARGET_FMT_plx and HWADDR_PRIx? Why not just use
> HWADDR_PRIx instead?


Too lazy to specify the 0-digit alignment format I presume?
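
As a small illustration of the difference, the sketch below formats the same address with a zero-padded 16-digit macro and with the bare PRIx64-style macro. The macro definitions mirror those listed in the grep output above (using the proposed HWADDR_FMT_plx name); the helper functions are hypothetical.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>
#include <string.h>

typedef uint64_t hwaddr;

#define HWADDR_PRIx PRIx64
#define HWADDR_FMT_plx "%016" PRIx64   /* full format string, zero padded */

/* Hypothetical helper: fixed-width, zero-padded rendering. */
static void format_padded(char *buf, size_t len, hwaddr addr)
{
    snprintf(buf, len, HWADDR_FMT_plx, addr);
}

/* Hypothetical helper: caller supplies the '%' and any width itself. */
static void format_plain(char *buf, size_t len, hwaddr addr)
{
    snprintf(buf, len, "%" HWADDR_PRIx, addr);
}
```

For address 0x1000 the padded form gives "0000000000001000" and the plain form gives "1000", which is the convenience the dedicated format macro buys.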

[RFC v1 0/2] spice: Add an option to forward the dmabuf directly to the encoder

2023-01-10 Thread Vivek Kasireddy

This patch series adds options to select a preferred codec and also
to forward a dmabuf directly to the encoder module that is part of
the Spice server. Currently, gstreamer:h264 is the only combination
tested but additional work is ongoing to test other combinations. 

Tested with: -device virtio-gpu-pci,max_outputs=1,blob=true,xres=1920,yres=1080
 -spice port=3001,gl=on,disable-ticketing=on,dmabuf-encode=on,
  preferred-codec=gstreamer:h264

and remote-viewer --spice-debug spice://x.x.x.x:3001 on the client side.

Associated Spice server patches can be found here:
https://lists.freedesktop.org/archives/spice-devel/2023-January/052927.html

Cc: Gerd Hoffmann 
Cc: Marc-André Lureau 
Cc: Dongwon Kim 

Vivek Kasireddy (2):
  spice: Add an option for users to provide a preferred codec
  spice: Add an option to forward the dmabuf directly to the encoder

 include/ui/spice-display.h |   2 +
 qemu-options.hx            |  11 +++-
 ui/spice-core.c            |  36 +++--
 ui/spice-display.c         | 106 ++---
 4 files changed, 120 insertions(+), 35 deletions(-)

-- 
2.37.2

[RFC v1 1/2] spice: Add an option for users to provide a preferred codec

2023-01-10 Thread Vivek Kasireddy

Giving users an option to choose a particular codec will enable
them to make an appropriate decision based on their hardware and
use-case.

Cc: Gerd Hoffmann 
Cc: Marc-André Lureau 
Cc: Dongwon Kim 
Signed-off-by: Vivek Kasireddy 
---
 qemu-options.hx |  5 +
 ui/spice-core.c | 14 ++
 2 files changed, 19 insertions(+)

diff --git a/qemu-options.hx b/qemu-options.hx
index 3aa3a2f5a3..aab8df0922 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -2142,6 +2142,7 @@ DEF("spice", HAS_ARG, QEMU_OPTION_spice,
 "   [,streaming-video=[off|all|filter]][,disable-copy-paste=on|off]\n"
 "   [,disable-agent-file-xfer=on|off][,agent-mouse=[on|off]]\n"
 "   [,playback-compression=[on|off]][,seamless-migration=[on|off]]\n"
+"   [,preferred-codec=:\n"
 "   [,gl=[on|off]][,rendernode=]\n"
 "   enable spice\n"
 "   at least one of {port, tls-port} is mandatory\n",
@@ -2237,6 +2238,10 @@ SRST
 ``seamless-migration=[on|off]``
 Enable/disable spice seamless migration. Default is off.
 
+``preferred-codec=:``
+Provide the preferred codec the Spice server should use.
+Default would be spice:mjpeg.
+
 ``gl=[on|off]``
 Enable/disable OpenGL context. Default is off.
 
diff --git a/ui/spice-core.c b/ui/spice-core.c
index 72f8f1681c..6e00211e3a 100644
--- a/ui/spice-core.c
+++ b/ui/spice-core.c
@@ -469,6 +469,9 @@ static QemuOptsList qemu_spice_opts = {
 },{
 .name = "streaming-video",
 .type = QEMU_OPT_STRING,
+},{
+.name = "preferred-codec",
+.type = QEMU_OPT_STRING,
 },{
 .name = "agent-mouse",
 .type = QEMU_OPT_BOOL,
@@ -644,6 +647,7 @@ static void qemu_spice_init(void)
 char *x509_key_file = NULL,
 *x509_cert_file = NULL,
 *x509_cacert_file = NULL;
+const char *preferred_codec = NULL;
 int port, tls_port, addr_flags;
 spice_image_compression_t compression;
 spice_wan_compression_t wan_compr;
@@ -795,6 +799,16 @@ static void qemu_spice_init(void)
 spice_server_set_streaming_video(spice_server, SPICE_STREAM_VIDEO_OFF);
 }
 
+preferred_codec = qemu_opt_get(opts, "preferred-codec");
+if (preferred_codec) {
+if (spice_server_set_video_codecs(spice_server, preferred_codec)) {
+error_report("Preferred codec name is not valid");
+exit(1);
+}
+} else {
+spice_server_set_video_codecs(spice_server, "spice:mjpeg");
+}
+
 spice_server_set_agent_mouse
 (spice_server, qemu_opt_get_bool(opts, "agent-mouse", 1));
 spice_server_set_playback_compression
-- 
2.37.2

[RFC v1 2/2] spice: Add an option to forward the dmabuf directly to the encoder

2023-01-10 Thread Vivek Kasireddy

This patch adds support for gl=on and port != 0. In other words,
with this option enabled, it should be possible to stream the
content associated with the dmabuf without making any additional
copies.

The encoder (that is part of Spice Server) extracts the dmabuf
fd from the drawable (RedDrawable) which in turn gets it from
the scanout. Once the encoder is done encoding the dmabuf, it
triggers an async that would indicate to Qemu to unblock the
pipeline.
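
The option checks this patch adds to qemu_spice_init() for the gl=on case can be summarised as a pure function. This is a sketch for discussion only; the enum and function names are hypothetical, and the real code reports errors and exits instead of returning a status.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical result codes, one per error path in the gl=on branch. */
enum gl_check {
    GL_OK,
    GL_NEEDS_DMABUF_ENCODE,   /* remote port given without dmabuf-encode=on */
    GL_DMABUF_NEEDS_REMOTE,   /* dmabuf-encode=on but no port/tls-port */
    GL_CODEC_UNSUPPORTED,     /* anything other than gstreamer:h264 */
};

static enum gl_check check_gl_options(int port, int tls_port,
                                      bool dmabuf_encode, const char *codec)
{
    bool remote = port != 0 || tls_port != 0;

    if (remote && !dmabuf_encode) {
        return GL_NEEDS_DMABUF_ENCODE;
    }
    if (dmabuf_encode) {
        if (!remote) {
            return GL_DMABUF_NEEDS_REMOTE;
        }
        /* A missing codec is rejected too, as g_strcmp0() would do. */
        if (!codec || strcmp(codec, "gstreamer:h264") != 0) {
            return GL_CODEC_UNSUPPORTED;
        }
    }
    return GL_OK;
}
```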

Cc: Gerd Hoffmann 
Cc: Marc-André Lureau 
Cc: Dongwon Kim 
Signed-off-by: Vivek Kasireddy 
---
 include/ui/spice-display.h |   2 +
 qemu-options.hx            |   6 ++-
 ui/spice-core.c            |  22 ++--
 ui/spice-display.c         | 106 ++---
 4 files changed, 101 insertions(+), 35 deletions(-)

diff --git a/include/ui/spice-display.h b/include/ui/spice-display.h
index e271e011da..4f9b3aa2d9 100644
--- a/include/ui/spice-display.h
+++ b/include/ui/spice-display.h
@@ -62,6 +62,7 @@ enum {
 QXL_COOKIE_TYPE_RENDER_UPDATE_AREA,
 QXL_COOKIE_TYPE_POST_LOAD_MONITORS_CONFIG,
 QXL_COOKIE_TYPE_GL_DRAW_DONE,
+QXL_COOKIE_TYPE_DMABUF_ENCODE_DONE,
 };
 
 typedef struct QXLCookie {
@@ -153,6 +154,7 @@ struct SimpleSpiceCursor {
 };
 
 extern bool spice_opengl;
+extern bool spice_dmabuf_encode;
 
 int qemu_spice_rect_is_empty(const QXLRect* r);
 void qemu_spice_rect_union(QXLRect *dest, const QXLRect *r);
diff --git a/qemu-options.hx b/qemu-options.hx
index aab8df0922..3016f8a6f7 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -2143,7 +2143,7 @@ DEF("spice", HAS_ARG, QEMU_OPTION_spice,
 "   [,disable-agent-file-xfer=on|off][,agent-mouse=[on|off]]\n"
 "   [,playback-compression=[on|off]][,seamless-migration=[on|off]]\n"
 "   [,preferred-codec=:\n"
-"   [,gl=[on|off]][,rendernode=]\n"
+"   [,gl=[on|off]][,rendernode=][,dmabuf-encode=[on|off]]\n"
 "   enable spice\n"
 "   at least one of {port, tls-port} is mandatory\n",
 QEMU_ARCH_ALL)
@@ -2248,6 +2248,10 @@ SRST
 ``rendernode=``
 DRM render node for OpenGL rendering. If not specified, it will
 pick the first available. (Since 2.9)
+
+``dmabuf-encode=[on|off]``
+Forward the dmabuf directly to the encoder (Gstreamer).
+Default is off.
 ERST
 
 DEF("portrait", 0, QEMU_OPTION_portrait,
diff --git a/ui/spice-core.c b/ui/spice-core.c
index 6e00211e3a..c9b856b056 100644
--- a/ui/spice-core.c
+++ b/ui/spice-core.c
@@ -494,6 +494,9 @@ static QemuOptsList qemu_spice_opts = {
 },{
 .name = "rendernode",
 .type = QEMU_OPT_STRING,
+},{
+.name = "dmabuf-encode",
+.type = QEMU_OPT_BOOL,
 #endif
 },
 { /* end of list */ }
@@ -843,11 +846,24 @@ static void qemu_spice_init(void)
 g_free(password);
 
 #ifdef HAVE_SPICE_GL
+if (qemu_opt_get_bool(opts, "dmabuf-encode", 0)) {
+spice_dmabuf_encode = 1;
+}
 if (qemu_opt_get_bool(opts, "gl", 0)) {
-if ((port != 0) || (tls_port != 0)) {
-error_report("SPICE GL support is local-only for now and "
- "incompatible with -spice port/tls-port");
+if (((port != 0) || (tls_port != 0)) && !spice_dmabuf_encode) {
+error_report("Add dmabuf-encode=on option to enable GL streaming");
 exit(1);
+} else if (spice_dmabuf_encode) {
+if (port == 0 && tls_port == 0) {
+error_report("dmabuf-encode=on is only meant to be used for "
+ "non-local displays");
+exit(1);
+}
+if (g_strcmp0(preferred_codec, "gstreamer:h264")) {
+error_report("dmabuf-encode=on currently only works and is tested "
+ "with gstreamer:h264");
+exit(1);
+}
 }
 if (egl_rendernode_init(qemu_opt_get(opts, "rendernode"),
 DISPLAYGL_MODE_ON) != 0) {
diff --git a/ui/spice-display.c b/ui/spice-display.c
index 494168e7fe..d02ebd7f24 100644
--- a/ui/spice-display.c
+++ b/ui/spice-display.c
@@ -28,6 +28,7 @@
 #include "ui/spice-display.h"
 
 bool spice_opengl;
+bool spice_dmabuf_encode;
 
 int qemu_spice_rect_is_empty(const QXLRect* r)
 {
@@ -117,7 +118,7 @@ void qemu_spice_wakeup(SimpleSpiceDisplay *ssd)
 }
 
 static void qemu_spice_create_one_update(SimpleSpiceDisplay *ssd,
- QXLRect *rect)
+ QXLRect *rect, int fd)
 {
 SimpleSpiceUpdate *update;
 QXLDrawable *drawable;
@@ -168,15 +169,17 @@ static void qemu_spice_create_one_update(SimpleSpiceDisplay *ssd,
 image->bitmap.palette = 0;
 image->bitmap.format = SPICE_BITMAP_FMT_32BIT;
 
-dest = pixman_image_create_bits(PIXMAN_LE_x8r8g8b8, bw, bh,
-(void *)update->bitmap, bw * 4);
-pixman_image_composite(PIXMAN_OP_SRC, ssd->surf

Re: [PATCH 2/2] target/riscv/cpu.c: do not skip misa logic in riscv_cpu_realize()

2023-01-10 Thread Bin Meng

On Wed, Jan 11, 2023 at 4:17 AM Daniel Henrique Barboza
 wrote:
>
> All RISCV CPUs are setting cpu->cfg during their cpu_init() functions,
> meaning that there's no reason to skip all the misa validation and setup
> if misa_ext was set beforehand - especially since we're setting an
> updated value in set_misa() in the end.
>
> Put this code chunk into a new riscv_cpu_validate_set_extensions()
> helper and always execute it regardless of what the board set in
> env->misa_ext.
>
> This will put more responsibility in how each board is going to init
> their attributes and extensions if they're not using the defaults.
> It'll also allow realize() to do its job looking only at the extensions
> enabled per se, not corner cases that some CPUs might have, and we won't
> have to change multiple code paths to fix or change how extensions work.
>
> Signed-off-by: Daniel Henrique Barboza 
> ---
>  target/riscv/cpu.c | 485 +++--
>  1 file changed, 248 insertions(+), 237 deletions(-)
>

Reviewed-by: Bin Meng

Re: [PATCH 1/2] target/riscv/cpu: set cpu->cfg in register_cpu_props()

2023-01-10 Thread Bin Meng

On Wed, Jan 11, 2023 at 4:17 AM Daniel Henrique Barboza
 wrote:
>
> There is an informal contract between the cpu_init() functions and
> riscv_cpu_realize(): if cpu->env.misa_ext is zero, assume that the
> default settings were loaded via register_cpu_props() and do validations
> to set env.misa_ext.  If it's not zero, skip this whole process and
> assume that the board somehow did everything.
>
> At this moment, all SiFive CPUs are setting a non-zero misa_ext during
> their cpu_init() and skipping a good chunk of riscv_cpu_realize().
> This causes problems when the code being skipped in riscv_cpu_realize()
> contains fixes or assumptions that affects all CPUs, meaning that SiFive
> CPUs are missing out.
>
> To allow this code to not be skipped anymore, all the cpu->cfg.ext_* attributes
> need to be set during cpu_init() time. At this moment this is being done in
> register_cpu_props(). The SiFive oards are setting their own extensions during

The SiFive boards

> cpu_init() though, meaning that they don't want all the defaults from
> register_cpu_props().
>
> Let's move the contract between *_cpu_init() and riscv_cpu_realize() to
> register_cpu_props(). Inside this function we'll check if cpu->env.misa_ext
> was set and, if that's the case, set all relevant cpu->cfg.ext_*
> attributes, and only that. Leave the 'misa_ext' = 0 case as is today,
> i.e. loading all the defaults from riscv_cpu_extensions[].
>
> register_cpu_props() can then be called by all the cpu_init() functions,
> including the SiFive ones. This will make all CPUs behave more in line
> with that riscv_cpu_realize() expects.

with what

>
> Signed-off-by: Daniel Henrique Barboza 
> ---
>  target/riscv/cpu.c | 40 
>  target/riscv/cpu.h |  4 
>  2 files changed, 44 insertions(+)
>

Regards,
Bin
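
For anyone following along, here is a minimal standalone sketch of the contract described in the commit message: a non-zero misa_ext selects exactly the matching config flags, while zero means "load the defaults". The struct, the set of flags, and the default values are hypothetical simplifications; only the bit encoding (bit 0 is 'A') follows the misa definition.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* misa bit for a single-letter extension: bit 0 is 'A', bit 2 is 'C', ... */
#define RV(x) ((uint32_t)1 << ((x) - 'A'))

/* Hypothetical, much reduced version of the cpu->cfg.ext_* flags. */
struct demo_cfg {
    bool ext_i, ext_m, ext_a, ext_f, ext_d, ext_c;
};

static void demo_register_cpu_props(uint32_t misa_ext, struct demo_cfg *cfg)
{
    if (misa_ext != 0) {
        /* The CPU chose its own extensions: mirror only those. */
        cfg->ext_i = misa_ext & RV('I');
        cfg->ext_m = misa_ext & RV('M');
        cfg->ext_a = misa_ext & RV('A');
        cfg->ext_f = misa_ext & RV('F');
        cfg->ext_d = misa_ext & RV('D');
        cfg->ext_c = misa_ext & RV('C');
        return;
    }
    /* misa_ext == 0: fall back to defaults (assumed all-on for this demo). */
    *cfg = (struct demo_cfg){ true, true, true, true, true, true };
}
```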

Re: [RFC PATCH for 8.0 10/13] virtio-net: Migrate vhost inflight descriptors

2023-01-10 Thread Jason Wang

On Wed, Jan 11, 2023 at 12:40 PM Parav Pandit  wrote:
>
>
> > From: Jason Wang 
> > Sent: Tuesday, January 10, 2023 11:35 PM
> >
> > On Tue, Jan 10, 2023 at 11:02 AM Parav Pandit  wrote:
> > >
> > > Hi Jason,
> > >
> > > > From: Jason Wang 
> > > > Sent: Monday, December 5, 2022 10:25 PM
> > >
> > > >
> > > > A dumb question, any reason we need bother with virtio-net? It looks
> > > > to me it's not a must and would complicate migration compatibility.
> > >
> > > Virtio net vdpa device is processing the descriptors out of order.
> > > This vdpa device doesn’t offer IN_ORDER flag.
> > >
> > > And when a VQ is suspended it cannot complete these descriptors as some
> > > dummy zero length completions.
> > > The guest VM is flooded with [1].
> >
> > Yes, but any reason for the device to do out-of-order for RX?
> >
> For some devices it is more optimal to process them out of order.
> And it's not limited to RX.

TX should be fine, since the device can anyhow pretend to send all
packets, so we won't have any in-flight descriptors.

>
> > >
> > > So it is needed for the devices that don't offer IN_ORDER feature.
> > >
> > > [1]
> > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/net/virtio_net.c?h=v6.2-rc3#n1252
> >
> > It is only enabled in a debug kernel which should be harmless?
> It is the KERN_DEBUG log level. It is not a debug kernel, just the debug log level.

Ok, but the production environment should not use that level anyhow.

> And regardless, generating zero-length packets for a debug kernel is even more
> confusing.

Note that it is allowed in the virtio-spec[1] (we probably can fix
that in the driver) and we have pr_debug() all over this driver and
other places. It doesn't cause any side effects apart from the
debugging output.

So I think having inflight tracking is useful, but I'm not sure it's
worth bothering with for virtio-net (or worth bothering with now):

- zero length is allowed
- it only helps for debugging
- may cause issues for migration compatibility
- requires new infrastructure to be invented

Thanks

[1] spec said

"
Note: len is particularly useful for drivers using untrusted buffers:
if a driver does not know exactly how much has been written by the
device, the driver would have to zero the buffer in advance to ensure
no data leakage occurs.
"

Re: make vm-build-freebsd appears to require . in PATH

2023-01-10 Thread Markus Armbruster

Peter Maydell  writes:

> On Tue, 10 Jan 2023 at 16:26, Markus Armbruster  wrote:
>> Peter Maydell  writes:
>> > Does it actually require '.' on the PATH, or does it just want
>> > a qemu-img binary on the PATH? (eg your distro one in /usr/bin).
>> > I don't have '.' on my PATH and it works for me.
>>
>> Do we want to use qemu-img, qemu-system-x86_64 and so forth from PATH,
>> or the one in the build tree?
>
> There's no guarantee there is one in the build tree at all.
> I usually use these like
>  (cd build && ../configure)
>  make -C build  vm-build-openbsd
>
> in which case it doesn't need to build anything in the build
> tree at all (neither qemu-system-x86_64 nor qemu-img).
>
> It's nice to be able to do "test this build on *BSD" with
> a known-good QEMU running the VM rather than having the
> code-under-test affecting both the outer QEMU and the
> build-and-make-check running inside the VM.

True.

>> The former could well be old, which feels like a potential source of
>> problems.
>
> In practice we only use it for very simple operations
> ("create a qcow2 image" and "resize this qcow2 file"),
> so using the distro qemu-img has never been an issue for me.
>
> I think I have in the past run into problems because the
> system's qemu-system-x86_64 was super-old, but it was easy
> to just build a known-good QEMU version and put that on
> the PATH. (And now that system has had a host distro
> upgrade, so I have gone back to using the system binary.)

I since came to understand this line in output of vm-help:

QEMU_LOCAL=1 - Use QEMU binary local to this build.

So the intent appears to be "use (presumably known-good) QEMU tooling
from $PATH by default, pass QEMU_LOCAL=1 to use the build tree instead,
and pass QEMU=... QEMU_IMG=... QEMU_CONFIG=... when you need even more
control."

Thanks again!

Re: [RFC PATCH v2 17/19] target/arm: Move regime_using_lpae_format into internal.h

2023-01-10 Thread Richard Henderson


On 1/9/23 14:42, Fabiano Rosas wrote:
> This function is needed by common code (ptw.c), so move it along with
> the other regime_* functions in internal.h. When we enable the build
> without TCG, the tlb_helper.c file will not be present.
>
> Signed-off-by: Fabiano Rosas
> ---
> Richard: this cannot go into ptw.c because that file is softmmu only
> ---
>   target/arm/internals.h      | 21 ++---
>   target/arm/tcg/tlb_helper.c | 18 --
>   2 files changed, 18 insertions(+), 21 deletions(-)


Not thrilled, because of the size of the function, but I have no better 
suggestions.

Reviewed-by: Richard Henderson 


r~

Re: [PATCH 1/2] target/riscv/cpu: set cpu->cfg in register_cpu_props()

2023-01-10 Thread Richard Henderson


On 1/10/23 12:14, Daniel Henrique Barboza wrote:
> +/*
> + * Register CPU props based on env.misa_ext. If a non-zero
> + * value was set, register only the required cpu->cfg.ext_*
> + * properties and leave. env.misa_ext = 0 means that we want
> + * all the default properties to be registered.
> + */
>   static void register_cpu_props(DeviceState *dev)


Suggest invoking this as .instance_post_init hook on TYPE_RISCV_CPU.
Then you don't need to manually call it on every cpu class.


r~

Re: [PATCH v5 00/11] riscv: OpenSBI boot test and cleanups

2023-01-10 Thread Alistair Francis

On Mon, Jan 2, 2023 at 9:54 PM Daniel Henrique Barboza
 wrote:
>
> Hi,
>
> This new version is still rebased on top of [1]:
>
> "[PATCH v2 00/12] hw/riscv: Improve Spike HTIF emulation fidelity"
>
> from Bin Meng.
>
> The change from v4 is on patch 9 where we added an extra flag in
> riscv_load_kernel() to allow for boards that don't load initrd
> (e.g. opentitan and sifive_e) to opt out from loading it altogether.
>
> * Patch without reviews: 9
>
> Changes from v4:
> - patch 9:
>   - added a 'load_init' flag in riscv_load_kernel() to control whether
> the function should execute riscv_load_initrd() or not
> v4 link: https://lists.gnu.org/archive/html/qemu-devel/2022-12/msg04652.html
>
> Changes from v3:
> - patch 1:
>   - fixed more instances of 'opensbi' and 'Opensbi' to 'OpenSBI'
>   - changed tests order
> - patch 4 (new):
>   - added a g_assert(filename) guard in riscv_load_initrd() and
> riscv_load_kernel()
> v3 link: https://mail.gnu.org/archive/html/qemu-devel/2022-12/msg04491.html
>
> Changes from v2:
> - patch 1:
>   - reduced code repetition with a boot_opensbi() helper
>   - renamed 'opensbi' to 'OpenSBI' in the file header
> - patch 9:
>   - renamed riscv_load_kernel() to riscv_load_kernel_and_initrd()
> v2 link: https://mail.gnu.org/archive/html/qemu-devel/2022-12/msg04466.html
>
>
> Changes from v1:
> - patches were rebased with [1]
> - patches 13-15: removed
>   * will be re-sent in a follow-up series
> - patches 4-5: removed since they're picked by Bin in [1]
> - patch 1:
>   - added a 'skip' riscv32 spike test
> v1 link: https://mail.gnu.org/archive/html/qemu-devel/2022-12/msg03860.html
>
>
> Based-on: <20221227064812.1903326-1-bm...@tinylab.org>
>
> Cc: Alistair Francis 
> Cc: Bin Meng 
>
> [1] https://patchwork.ozlabs.org/project/qemu-devel/list/?series=334352
>
> Daniel Henrique Barboza (11):
>   tests/avocado: add RISC-V OpenSBI boot test
>   hw/riscv/spike: use 'fdt' from MachineState
>   hw/riscv/sifive_u: use 'fdt' from MachineState
>   hw/riscv/boot.c: exit early if filename is NULL in load functions
>   hw/riscv/spike.c: load initrd right after riscv_load_kernel()
>   hw/riscv: write initrd 'chosen' FDT inside riscv_load_initrd()
>   hw/riscv: write bootargs 'chosen' FDT after riscv_load_kernel()
>   hw/riscv/boot.c: use MachineState in riscv_load_initrd()
>   hw/riscv/boot.c: use MachineState in riscv_load_kernel()
>   hw/riscv/boot.c: consolidate all kernel init in riscv_load_kernel()
>   hw/riscv/boot.c: make riscv_load_initrd() static

Thanks!

Applied to riscv-to-apply.next

Alistair

>
>  hw/riscv/boot.c                | 91 +++---
>  hw/riscv/microchip_pfsoc.c     | 20 +---
>  hw/riscv/opentitan.c           |  3 +-
>  hw/riscv/sifive_e.c            |  4 +-
>  hw/riscv/sifive_u.c            | 32 +++-
>  hw/riscv/spike.c               | 37 --
>  hw/riscv/virt.c                | 21 +---
>  include/hw/riscv/boot.h        |  5 +-
>  include/hw/riscv/sifive_u.h    |  3 --
>  include/hw/riscv/spike.h       |  2 -
>  tests/avocado/riscv_opensbi.py | 65
>  11 files changed, 150 insertions(+), 133 deletions(-)
>  create mode 100644 tests/avocado/riscv_opensbi.py
>
> --
> 2.39.0
>
>

Re: [PATCH 0/2] target/riscv/cpu: fix sifive_u 32/64bits boot in riscv-to-apply.next

2023-01-10 Thread Alistair Francis

On Wed, Jan 11, 2023 at 6:17 AM Daniel Henrique Barboza
 wrote:
>
> Hi,
>
> I found this bug when testing my avocado changes in riscv-to-apply.next.
> The sifive_u board, both 32 and 64 bits, stopped booting OpenSBI. The
> guest hangs indefinitely.
>
> Git bisect points that this patch broke things:
>
> 8c3f35d25e7e98655c609b6c1e9f103b9240f8f8 is the first bad commit
> commit 8c3f35d25e7e98655c609b6c1e9f103b9240f8f8
> Author: Weiwei Li 
> Date:   Wed Dec 28 14:20:21 2022 +0800
>
> target/riscv: add support for Zca extension
>
> Modify the check for C extension to Zca (C implies Zca)
> (https://github.com/alistair23/qemu/commit/8c3f35d25e7e98655c609b6c1e9f103b9240f8f8)
>
>
> But this patch per se isn't doing anything wrong. The root of the
> problem is that this patch makes assumptions based on the previous
> patch:
>
> commit a2b409aa6cadc1ed9715e1ab916ddd3dade0ba85
> Author: Weiwei Li 
> Date:   Wed Dec 28 14:20:20 2022 +0800
>
> target/riscv: add cfg properties for Zc* extension
> (https://github.com/alistair23/qemu/commit/a2b409aa6cadc1ed9715e1ab916ddd3dade0ba85)
>
> Which added a lot of logic and assumptions that are being skipped by all
> the SiFive boards because, during riscv_cpu_realize(), we have this
> code:
>
> /* If only MISA_EXT is unset for misa, then set it from properties */
> if (env->misa_ext == 0) {
> uint32_t ext = 0;
> (...)
> }
>
> In short, we have a lot of code that is being skipped by all SiFive
> CPUs because these CPUs are setting a non-zero value in set_misa() in
> their respective cpu_init() functions.
>
> It's possible to just hack in and fix the SiFive problem in isolation, but
> I believe we can do better and allow all of riscv_cpu_realize() to be executed
> for all CPUs, regardless of what they've done during their cpu_init().
>
>
> Daniel Henrique Barboza (2):
>   target/riscv/cpu: set cpu->cfg in register_cpu_props()
>   target/riscv/cpu.c: do not skip misa logic in riscv_cpu_realize()

Thanks for the patches

I have rebased these onto the latest master and dropped the other
series. That way when the other series is applied we don't break
bisectability.

Alistair

>
>  target/riscv/cpu.c | 525 +
>  target/riscv/cpu.h |   4 +
>  2 files changed, 292 insertions(+), 237 deletions(-)
>
> --
> 2.39.0
>
>

Re: [PATCH v9 0/9] support subsets of code size reduction extension

2023-01-10 Thread Alistair Francis

On Wed, Dec 28, 2022 at 4:23 PM Weiwei Li  wrote:
>
> This patchset implements RISC-V Zc* extension v1.0.0.RC5.7 version
> instructions.
>
> Specification:
> https://github.com/riscv/riscv-code-size-reduction/tree/main/Zc-specification
>
> The port is available here:
> https://github.com/plctlab/plct-qemu/tree/plct-zce-upstream-v9
>
> To test Zc* implementation, specify cpu argument with
> "x-zca=true,x-zcb=true,x-zcf=true,f=true" and "x-zcd=true,d=true" (or
> "x-zcmp=true,x-zcmt=true" with c or d=false) to enable Zca/Zcb/Zcf and Zcd (or
> Zcmp, Zcmt) extensions support.
>
>
> This implementation can pass the basic zc tests from 
> https://github.com/yulong-plct/zc-test
>
> v9:
> * rebase on riscv-to-apply.next
>
> v8:
> * improve disas support in Patch 9
>
> v7:
> * Fix description for Zca
>
> v6:
> * fix base address for jump table in Patch 7
> * rebase on riscv-to-apply.next
>
> v5:
> * fix exception unwind problem for cpu_ld*_code in helper of cm_jalt
>
> v4:
> * improve Zcmp suggested by Richard
> * fix stateen related check for Zcmt
>
> v3:
> * update the solution for Zcf to the way of Zcd
> * update Zcb to reuse gen_load/store
> * use trans function instead of helper for push/pop
>
> v2:
> * add check for relationship between Zca/Zcf/Zcd with C/F/D based on related
>   discussion in review of Zc* spec
> * separate c.fld{sp}/fsd{sp} with fld{sp}/fsd{sp} before support of zcmp/zcmt
>
> Weiwei Li (9):
>   target/riscv: add cfg properties for Zc* extension
>   target/riscv: add support for Zca extension
>   target/riscv: add support for Zcf extension
>   target/riscv: add support for Zcd extension
>   target/riscv: add support for Zcb extension
>   target/riscv: add support for Zcmp extension
>   target/riscv: add support for Zcmt extension
>   target/riscv: expose properties for Zc* extension
>   disas/riscv.c: add disasm support for Zc*

This series broke a range of boards that use specific CPUs. I have
dropped it from my tree.

Daniel has sent a series that should fix it though
(https://www.mail-archive.com/qemu-devel@nongnu.org/msg930952.html). I
have applied his fixes. Can you rebase this series on
https://github.com/alistair23/qemu/tree/riscv-to-apply.next, test to
ensure the SiFive boards continue to work and then re-send the series?

Alistair

>
>  disas/riscv.c                             | 228 +++-
>  target/riscv/cpu.c                        |  56
>  target/riscv/cpu.h                        |  10 +
>  target/riscv/cpu_bits.h                   |   7 +
>  target/riscv/csr.c                        |  38 ++-
>  target/riscv/helper.h                     |   3 +
>  target/riscv/insn16.decode                |  63 -
>  target/riscv/insn_trans/trans_rvd.c.inc   |  18 ++
>  target/riscv/insn_trans/trans_rvf.c.inc   |  18 ++
>  target/riscv/insn_trans/trans_rvi.c.inc   |   4 +-
>  target/riscv/insn_trans/trans_rvzce.c.inc | 313 ++
>  target/riscv/machine.c                    |  19 ++
>  target/riscv/meson.build                  |   3 +-
>  target/riscv/translate.c                  |  15 +-
>  target/riscv/zce_helper.c                 |  55
>  15 files changed, 834 insertions(+), 16 deletions(-)
>  create mode 100644 target/riscv/insn_trans/trans_rvzce.c.inc
>  create mode 100644 target/riscv/zce_helper.c
>
> --
> 2.25.1
>
>

RE: [RFC PATCH for 8.0 10/13] virtio-net: Migrate vhost inflight descriptors

2023-01-10 Thread Parav Pandit


> From: Jason Wang 
> Sent: Tuesday, January 10, 2023 11:35 PM
> 
> On Tue, Jan 10, 2023 at 11:02 AM Parav Pandit  wrote:
> >
> > Hi Jason,
> >
> > > From: Jason Wang 
> > > Sent: Monday, December 5, 2022 10:25 PM
> >
> > >
> > > A dumb question, any reason we need bother with virtio-net? It looks
> > > to me it's not a must and would complicate migration compatibility.
> >
> > Virtio net vdpa device is processing the descriptors out of order.
> > This vdpa device doesn’t offer IN_ORDER flag.
> >
> > And when a VQ is suspended it cannot complete these descriptors as some
> > dummy zero length completions.
> > The guest VM is flooded with [1].
> 
> Yes, but any reason for the device to do out-of-order for RX?
>
For some devices it is more optimal to process them out of order.
And it's not limited to RX.
 
> >
> > So it is needed for the devices that don't offer IN_ORDER feature.
> >
> > [1]
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/net/virtio_net.c?h=v6.2-rc3#n1252
> 
> It is only enabled in a debug kernel which should be harmless?
It is the KERN_DEBUG log level. It is not a debug kernel, just the debug log level.
And regardless, generating zero-length packets for a debug kernel is even more
confusing.

Re: [RFC PATCH for 8.0 10/13] virtio-net: Migrate vhost inflight descriptors

2023-01-10 Thread Jason Wang

On Tue, Jan 10, 2023 at 11:02 AM Parav Pandit  wrote:
>
> Hi Jason,
>
> > From: Jason Wang 
> > Sent: Monday, December 5, 2022 10:25 PM
>
> >
> > A dumb question, any reason we need bother with virtio-net? It looks to me
> > it's not a must and would complicate migration compatibility.
>
> Virtio net vdpa device is processing the descriptors out of order.
> This vdpa device doesn’t offer IN_ORDER flag.
>
> And when a VQ is suspended it cannot complete these descriptors as some dummy
> zero length completions.
> The guest VM is flooded with [1].

Yes, but any reason for the device to do out-of-order for RX?

>
> So it is needed for the devices that don't offer IN_ORDER feature.
>
> [1] 
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/net/virtio_net.c?h=v6.2-rc3#n1252

It is only enabled in a debug kernel which should be harmless?

Thanks

>

[PATCH 2/2] tcg: use QTree instead of GTree

2023-01-10 Thread Emilio Cota

qemu-user can hang in a multi-threaded fork. One common
reason is that when creating a TB, between fork and exec
we manipulate a GTree whose memory allocator (GSlice) is
not fork-safe.

Although POSIX does not mandate it, the system's allocator
(e.g. tcmalloc, libc malloc) is probably fork-safe.

Fix some of these hangs by using QTree, which uses
the system's allocator.

For more details, see:
  https://gitlab.com/qemu-project/qemu/-/issues/285

Performance impact on linux-user:
- ~2% slowdown in spec06
- 1.05% slowdown in Nbench-int
- 4.51% slowdown in Nbench-fp

x86_64 spec06
  Host: AMD Ryzen 7 PRO 5850U with Radeon Graphics
  [ASCII bar chart not recoverable from the archive: series qtree-gcc1 (int),
   y-axis from 0.9 to 1.04, x-axis covering the spec06 integer benchmarks
   400.perlbench through 483.xalancbmk plus geomean]

 aarch64 NBench Integer Performance
  Host: AMD Ryzen 7 PRO 5850U with Radeon Graphics
  [ASCII plot truncated in the archive: the visible y-axis spans 80.2 to 81.2]

[PATCH 1/2] util: import GTree as QTree

2023-01-10 Thread Emilio Cota

The only reason to add this tree is to control the memory allocator
used. Some users (e.g. TCG) cannot work reliably in multi-threaded
environments (e.g. forking in user-mode) with GTree's allocator, GSlice.
See https://gitlab.com/qemu-project/qemu/-/issues/285 for details.

Importing GTree is a temporary workaround until GTree migrates away
from GSlice.

This implementation is identical to that in glib v2.75.0.
I've imported tests from glib and added a benchmark just to
make sure that performance is similar (Note: it cannot be identical
because we are not using GSlice).

$ taskset -c 2 tests/bench/qtree-bench

- With libc's allocator:

Tree Op              32            1024            4096          131072         1048576

GTree Lookup      14.01           15.17           24.93           18.99           15.28
QTree Lookup      22.50 (1.61x)   32.49 (2.14x)   29.84 (1.20x)   16.77 (0.88x)   12.21 (0.80x)
GTree Insert      19.24           15.72           25.24           17.87           16.55
QTree Insert      15.07 (0.78x)   26.70 (1.70x)   25.68 (1.02x)   17.20 (0.96x)   12.49 (0.75x)
GTree Remove      11.57           31.44           29.77           20.88           16.60
QTree Remove      14.01 (1.21x)   34.54 (1.10x)   33.52 (1.13x)   26.64 (1.28x)   14.95 (0.90x)
GTree RemoveAll   57.97          119.13          118.16          112.82           61.63
QTree RemoveAll   46.31 (0.80x)  108.04 (0.91x)  113.85 (0.96x)   77.88 (0.69x)   41.69 (0.68x)
GTree Traverse    72.56          232.83          243.20          254.22           97.44
QTree Traverse    66.53 (0.92x)  394.76 (1.70x)  357.07 (1.47x)  289.09 (1.14x)   45.64 (0.47x)


- With tcmalloc:

Tree Op              32            1024            4096          131072         1048576

GTree Lookup      24.56           27.69           25.78           17.14           15.90
QTree Lookup      40.92 (1.67x)   34.04 (1.23x)   30.15 (1.17x)   22.93 (1.34x)   20.31 (1.28x)
GTree Insert      33.97           28.22           25.66           17.37           16.07
QTree Insert      34.01 (1.00x)   36.35 (1.29x)   32.29 (1.26x)   22.32 (1.28x)   17.62 (1.10x)
GTree Remove      20.61           32.42           30.80           16.96           15.93
QTree Remove      20.61 (1.00x)   43.60 (1.35x)   41.71 (1.35x)   25.04 (1.48x)   16.33 (1.03x)
GTree RemoveAll  106.31          125.72          126.49           70.89           54.60
QTree RemoveAll   83.99 (0.79x)  207.75 (1.65x)  206.17 (1.63x)   53.35 (0.75x)   46.38 (0.85x)
GTree Traverse   128.00          243.93          255.20          140.39           90.94
QTree Traverse   110.34 (0.86x)  325.49 (1.33x)  376.82 (1.48x)  118.22 (0.84x)   62.25 (0.68x)


Signed-off-by: Emilio Cota 
---
 include/qemu/qtree.h  |  119 +++
 tests/bench/meson.build   |4 +
 tests/bench/qtree-bench.c |  273 ++
 tests/unit/meson.build|1 +
 tests/unit/test-qtree.c   |  701 +++
 util/meson.build  |1 +
 util/qtree.c  | 1776 +
 7 files changed, 2875 insertions(+)
 create mode 100644 include/qemu/qtree.h
 create mode 100644 tests/bench/qtree-bench.c
 create mode 100644 tests/unit/test-qtree.c
 create mode 100644 util/qtree.c

diff --git a/include/qemu/qtree.h b/include/qemu/qtree.h
new file mode 100644
index 00..4679457758
--- /dev/null
+++ b/include/qemu/qtree.h
@@ -0,0 +1,119 @@
+/*
+ * GLIB - Library of useful routines for C programming
+ * Copyright (C) 1995-1997  Peter Mattis, Spencer Kimball and Josh MacDonald
+ *
+ * SPDX-License-Identifier: LGPL-2.1-or-later
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see .
+ */
+
+/*
+ * Modified by the GLib Team and others 1997-2000.  See the AUTHORS
+ * file for a list of people on the GLib Team.  See the ChangeLog
+ * files for a list of changes.  These files are distributed with
+ * GLib at ftp://ftp.gtk.org/pub/gtk/.
+ */
+
+#ifndef QEMU_QTREE_H
+#define QEMU_QTREE_H

[PATCH 0/2] fix for #285

2023-01-10 Thread Emilio Cota

Context:
  https://gitlab.com/qemu-project/qemu/-/issues/285

So far the only fix that we have had posted on the list is
  https://lists.gnu.org/archive/html/qemu-devel/2022-10/msg00391.html
by Daniel. The approach that I'm following here should have
the same outcome, except that it doesn't change the guest's
environment. The approach is to import GTree (sans GSlice)
into QEMU, and use that for TCG.

Daniel: what is the testing that you're using? Could you test
these patches to confirm they fix the issue?

Regarding performance, it looks like GSlice does buy us
something, which might explain why GLib's maintainers don't
want to change it. But I'd put correctness over performance
any day. Furthermore, we could use an alternative tree
implementation; I've tried CCAN's AVL and the performance impact
is lower (I believe due to faster traversals), although I'm
going with a straight import of GTree here to keep the API
identical (and also avoid any potential correctness concerns).

Thanks,
Emilio

Re: [RFC PATCH v2 07/19] target/arm: Move helper_set_pstate_* into cpregs.c

2023-01-10 Thread Richard Henderson


On 1/9/23 21:52, Richard Henderson wrote:

On 1/9/23 14:42, Fabiano Rosas wrote:

We want to move sme_helper into the tcg directory, but the cpregs
accessor functions cannot go along, otherwise they would be separate
from the respective ARMCPRegInfo definition which needs to be compiled
with CONFIG_TCG=n as well.


Hmm.  I would have hoped these could stay tcg-only, somehow.
I wonder if it warrants being an ARM_CP_SPECIAL_MASK value instead of 
svcr_write.


To answer my own question, ARM_CP_SPECIAL_MASK forces NO_RAW, which is not what we want 
for migration.


I'll think of something better here though.


r~

Re: [PATCH v2 4/5] hw/i2c/bitbang_i2c: Trace state changes

2023-01-10 Thread Richard Henderson


On 1/10/23 00:29, Philippe Mathieu-Daudé wrote:

+static const char *sname[] = {


Oh,

  const char * const sname[]

should have caught that the first time.
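
For anyone following along, here is a minimal sketch of what the second
const buys. The state names and the helper are made up for illustration;
they are not the ones in bitbang_i2c.c.

```c
#include <stddef.h>

/*
 * Hypothetical state names, for illustration only.
 * The first const makes the characters read-only; the second const makes
 * the array slots read-only too, so an assignment such as
 * "state_names[0] = other;" is rejected at compile time and the whole
 * table can live in read-only data.
 */
static const char * const state_names[] = {
    "STOPPED",
    "SENDING",
    "WAITING",
};

/* Return a printable name for a state index, or "?" if out of range. */
static const char *state_name(size_t idx)
{
    size_t n = sizeof(state_names) / sizeof(state_names[0]);

    return idx < n ? state_names[idx] : "?";
}
```

With only `const char *sname[]`, the strings are protected but the pointer
slots remain writable.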


r~

Re: [PATCH v2 4/5] util/qht: use striped locks under TSAN

2023-01-10 Thread Emilio Cota

On Tue, Jan 10, 2023 at 20:58:01 +, Alex Bennée wrote:
> Emilio Cota  writes:
(snip)
> > +static inline void qht_do_if_first_in_stripe(const struct qht_map *map,
> > + struct qht_bucket *b,
> > + void (*func)(QemuSpin *spin))
> > +{
> > +#ifdef CONFIG_TSAN
> > +unsigned long bucket_idx = b - map->buckets;
> > +bool is_first_in_stripe = (bucket_idx >> QHT_TSAN_BUCKET_LOCKS_BITS) 
> > == 0;
> > +if (is_first_in_stripe) {
> > +unsigned long lock_idx = bucket_idx & (QHT_TSAN_BUCKET_LOCKS - 1);
> > +func(&map->tsan_bucket_locks[lock_idx]);
> 
> Hmm I ran into an issue with:
> 
>  ../util/qht.c:286:10: error: incompatible pointer types passing 'const 
> struct qht_tsan_lock *' to parameter of type 'QemuSpin *' (aka 'struct 
> QemuSpin *') [-Werror,-Wincompatible-pointer-types]

Gaah, sorry. I didn't notice this because of unrelated noise due
to having to configure with --disable-werror. Fixed now by removing
a bunch of const's and also using .lock.
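
As a reminder of the index arithmetic in the quoted hunk, here is a
stand-alone sketch. The constant values and function names are assumptions
chosen for illustration (the real QHT_TSAN_BUCKET_LOCKS_BITS value is not
shown in this thread), and no locking is done here.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumed values for illustration; the real constants live in util/qht.c. */
#define STRIPE_LOCKS_BITS 4
#define STRIPE_LOCKS      (1u << STRIPE_LOCKS_BITS)

/*
 * Index of the lock guarding a bucket. Buckets whose indices are equal
 * modulo STRIPE_LOCKS share one lock.
 */
static size_t stripe_lock_index(size_t bucket_idx)
{
    return bucket_idx & (STRIPE_LOCKS - 1);
}

/*
 * True only for bucket indices below STRIPE_LOCKS, that is, for exactly
 * one bucket per lock. Per-lock work such as initialisation can be keyed
 * on this so that it runs once per lock rather than once per bucket.
 */
static bool is_first_in_stripe(size_t bucket_idx)
{
    return (bucket_idx >> STRIPE_LOCKS_BITS) == 0;
}
```

With these assumed values, buckets 1, 17 and 33 all map to lock 1, and only
bucket 1 counts as first in its stripe.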

> > +static inline void qht_bucket_lock_destroy(const struct qht_map *map,
> > +   struct qht_bucket *b)
> > +{
> > +qht_do_if_first_in_stripe(map, b, qemu_spin_destroy);
> > +}
> 
> Who is meant to be calling this?

Should have been removed in v2; fixed now.

I've uploaded the v3 series to https://github.com/cota/qemu/tree/tsan-v3

Please let me know if you want me to also mail it to the list.
Thanks,

Emilio

Re: [PATCH] hw/display/xlnx_dp: fix overflow in xlnx_dp_aux_push_tx_fifo()

2023-01-10 Thread Qiang Liu

Dear Fred,

On Tue, Jan 10, 2023 at 9:57 PM Konrad, Frederic 
wrote:

> Hi,
>
> > -Original Message-
> > From: qemu-devel-bounces+fkonrad=amd@nongnu.org
>  On Behalf Of
> > Qiang Liu
> > Sent: 09 January 2023 07:00
> > To: qemu-devel@nongnu.org
> > Cc: Qiang Liu ; Alistair Francis <
> alist...@alistair23.me>; Edgar E. Iglesias ;
> Peter
> > Maydell ; open list:Xilinx ZynqMP and... <
> qemu-...@nongnu.org>
> > Subject: [PATCH] hw/display/xlnx_dp: fix overflow in
> xlnx_dp_aux_push_tx_fifo()
> >
> > This patch checks if the s->tx_fifo is full.
> >
> > Fixes: 58ac482a66de ("introduce xlnx-dp")
> > Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1424
> > Reported-by: Qiang Liu 
> > Signed-off-by: Qiang Liu 
> > ---
> >  hw/display/xlnx_dp.c | 6 +-
> >  1 file changed, 5 insertions(+), 1 deletion(-)
> >
> > diff --git a/hw/display/xlnx_dp.c b/hw/display/xlnx_dp.c
> > index 972473d94f..617b394af2 100644
> > --- a/hw/display/xlnx_dp.c
> > +++ b/hw/display/xlnx_dp.c
> > @@ -854,7 +854,11 @@ static void xlnx_dp_write(void *opaque, hwaddr
> offset, uint64_t value,
> >  break;
> >  case DP_AUX_WRITE_FIFO: {
> >  uint8_t c = value;
> > -xlnx_dp_aux_push_tx_fifo(s, &c, 1);
> > +if (fifo8_is_full(&s->tx_fifo)) {
> > +qemu_log_mask(LOG_GUEST_ERROR, "xlnx_dp: TX fifo is full");
> > +} else {
> > +xlnx_dp_aux_push_tx_fifo(s, &c, 1);
> > +}
>
> I'd rather move the check in xlnx_dp_aux_push_tx_fifo, like
> xlnx_dp_aux_pop_tx_fifo.
> Otherwise looks good to me.
>

Sounds fine. Let me resend a patch.
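
Roughly what I understand the suggestion to be, as a stand-alone sketch:
the push helper itself refuses bytes once the FIFO is full, so callers need
no check of their own. The tiny_fifo type and its functions are
hypothetical stand-ins, not QEMU's Fifo8 API or the actual resend.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TINY_FIFO_CAPACITY 16   /* hypothetical size */

struct tiny_fifo {
    uint8_t data[TINY_FIFO_CAPACITY];
    size_t num;                 /* bytes currently stored */
};

static bool tiny_fifo_is_full(const struct tiny_fifo *f)
{
    return f->num == TINY_FIFO_CAPACITY;
}

/*
 * Push up to len bytes and return how many were accepted. The fullness
 * check lives here, so a guest write to a full FIFO can never overflow
 * the buffer regardless of the caller.
 */
static size_t tiny_fifo_push(struct tiny_fifo *f, const uint8_t *buf,
                             size_t len)
{
    size_t i;

    for (i = 0; i < len; i++) {
        if (tiny_fifo_is_full(f)) {
            break;   /* a device model would log a guest error here */
        }
        f->data[f->num++] = buf[i];
    }
    return i;
}
```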

Best,
Qiang

[PATCH v3 3/3] tcg: add perfmap and jitdump

2023-01-10 Thread Ilya Leoshkevich

Add ability to dump /tmp/perf-<pid>.map and jit-<pid>.dump.
The first one allows the perf tool to map samples to each individual
translation block. The second one adds the ability to resolve symbol
names, line numbers and inspect JITed code.

Example of use:

perf record qemu-x86_64 -perfmap ./a.out
perf report

or

perf record -k 1 qemu-x86_64 -jitdump ./a.out
DEBUGINFOD_URLS= perf inject -j -i perf.data -o perf.data.jitted
perf report -i perf.data.jitted
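
For reference, each perf map line has the form "start size name", with
start and size in hex and no 0x prefix. A minimal sketch of producing one
such line follows; format_perfmap_line is a made-up helper name, and the
patch itself writes straight to the map file with fprintf().

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Format one perf map line into buf: "<start> <size> <name>\n".
 * Returns what snprintf() returns, i.e. the length the full line needs.
 */
static int format_perfmap_line(char *buf, size_t buflen,
                               uintptr_t start, uint16_t size,
                               const char *name)
{
    return snprintf(buf, buflen, "%" PRIxPTR " %" PRIx16 " %s\n",
                    start, size, name);
}
```

For a block at 0x7f001000 of 0x40 bytes named "main+0x10", this yields the
line "7f001000 40 main+0x10".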

Co-developed-by: Vanderson M. do Rosario 
Co-developed-by: Alex Bennée 
Signed-off-by: Ilya Leoshkevich 
---
 accel/tcg/meson.build |   1 +
 accel/tcg/perf.c  | 366 ++
 accel/tcg/perf.h  |  49 +
 accel/tcg/translate-all.c |   8 +
 docs/devel/tcg.rst|  23 +++
 linux-user/exit.c |   2 +
 linux-user/main.c |  15 ++
 qemu-options.hx   |  20 +++
 softmmu/vl.c  |  11 ++
 tcg/tcg.c |   2 +
 10 files changed, 497 insertions(+)
 create mode 100644 accel/tcg/perf.c
 create mode 100644 accel/tcg/perf.h

diff --git a/accel/tcg/meson.build b/accel/tcg/meson.build
index 55b3b4dd7e3..77740b1a0d7 100644
--- a/accel/tcg/meson.build
+++ b/accel/tcg/meson.build
@@ -13,6 +13,7 @@ tcg_ss.add(when: 'CONFIG_USER_ONLY', if_true: files('user-exec.c'))
 tcg_ss.add(when: 'CONFIG_SOFTMMU', if_false: files('user-exec-stub.c'))
 tcg_ss.add(when: 'CONFIG_PLUGIN', if_true: [files('plugin-gen.c')])
 tcg_ss.add(when: libdw, if_true: files('debuginfo.c'))
+tcg_ss.add(when: 'CONFIG_LINUX', if_true: files('perf.c'))
 specific_ss.add_all(when: 'CONFIG_TCG', if_true: tcg_ss)
 
 specific_ss.add(when: ['CONFIG_SOFTMMU', 'CONFIG_TCG'], if_true: files(
diff --git a/accel/tcg/perf.c b/accel/tcg/perf.c
new file mode 100644
index 000..427ccbe80e1
--- /dev/null
+++ b/accel/tcg/perf.c
@@ -0,0 +1,366 @@
+/*
+ * Linux perf perf-<pid>.map and jit-<pid>.dump integration.
+ *
+ * The jitdump spec can be found at [1].
+ *
+ * [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/plain/tools/perf/Documentation/jitdump-specification.txt
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+
+#include "qemu/osdep.h"
+#include "elf.h"
+#include "qemu/timer.h"
+#include "tcg/tcg.h"
+
+#include "debuginfo.h"
+#include "perf.h"
+
+static FILE *safe_fopen_w(const char *path)
+{
+int saved_errno;
+FILE *f;
+int fd;
+
+/* Delete the old file, if any. */
+unlink(path);
+
+/* Avoid symlink attacks by using O_CREAT | O_EXCL. */
+fd = open(path, O_RDWR | O_CREAT | O_EXCL, S_IRUSR | S_IWUSR);
+if (fd == -1) {
+return NULL;
+}
+
+/* Convert fd to FILE*. */
+f = fdopen(fd, "w");
+if (f == NULL) {
+saved_errno = errno;
+close(fd);
+errno = saved_errno;
+return NULL;
+}
+
+return f;
+}
+
+static FILE *perfmap;
+
+void perf_enable_perfmap(void)
+{
+char map_file[32];
+
+snprintf(map_file, sizeof(map_file), "/tmp/perf-%d.map", getpid());
+perfmap = safe_fopen_w(map_file);
+if (perfmap == NULL) {
+warn_report("Could not open %s: %s, proceeding without perfmap",
+map_file, strerror(errno));
+}
+}
+
+/* Get PC and size of code JITed for guest instruction #INSN. */
+static void get_host_pc_size(uintptr_t *host_pc, uint16_t *host_size,
+ const void *start, size_t insn)
+{
+uint16_t start_off = insn ? tcg_ctx->gen_insn_end_off[insn - 1] : 0;
+
+if (host_pc) {
+*host_pc = (uintptr_t)start + start_off;
+}
+if (host_size) {
+*host_size = tcg_ctx->gen_insn_end_off[insn] - start_off;
+}
+}
+
+static const char *pretty_symbol(const struct debuginfo_query *q, size_t *len)
+{
+static __thread char buf[64];
+int tmp;
+
+if (!q->symbol) {
+tmp = snprintf(buf, sizeof(buf), "guest-0x%llx", q->address);
+if (len) {
+*len = MIN(tmp + 1, sizeof(buf));
+}
+return buf;
+}
+
+if (!q->offset) {
+if (len) {
+*len = strlen(q->symbol) + 1;
+}
+return q->symbol;
+}
+
+tmp = snprintf(buf, sizeof(buf), "%s+0x%llx", q->symbol, q->offset);
+if (len) {
+*len = MIN(tmp + 1, sizeof(buf));
+}
+return buf;
+}
+
+static void write_perfmap_entry(const void *start, size_t insn,
+const struct debuginfo_query *q)
+{
+uint16_t host_size;
+uintptr_t host_pc;
+
+get_host_pc_size(&host_pc, &host_size, start, insn);
+fprintf(perfmap, "%"PRIxPTR" %"PRIx16" %s\n",
+host_pc, host_size, pretty_symbol(q, NULL));
+}
+
+static FILE *jitdump;
+
+#define JITHEADER_MAGIC 0x4A695444
+#define JITHEADER_VERSION 1
+
+struct jitheader {
+uint32_t magic;
+uint32_t version;
+uint32_t total_size;
+uint32_t elf_mach;
+uint32_t pad1;
+uint32_t pid;
+uint64_t timestamp;
+uint64_t flags;
+};
+
+enum jit_record_type {
+JIT_COD

[PATCH v3 2/3] accel/tcg: Add debuginfo support

2023-01-10 Thread Ilya Leoshkevich

Add libdw-based functions for loading and querying debuginfo. Load
debuginfo from the system and the linux-user loaders.

This is useful for the upcoming perf support, which can then put
human-readable guest symbols instead of raw guest PCs into perfmap and
jitdump files.
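
To illustrate what "human-readable" means here, this is a stand-alone
sketch of the naming policy that the perf patch in this series applies to
a query result. struct sym_info and describe_address() are simplified
stand-ins invented for this example, not the API added by this patch.

```c
#include <stddef.h>
#include <stdio.h>

/* Simplified stand-in for a filled-in debuginfo query. */
struct sym_info {
    unsigned long long address;   /* guest address that was looked up */
    const char *symbol;           /* NULL if no symbol was found */
    unsigned long long offset;    /* offset of address from symbol */
};

/*
 * Write a printable name for q into buf and return buf:
 *   no symbol        -> "guest-0x<address>"
 *   symbol, offset 0 -> "<symbol>"
 *   otherwise        -> "<symbol>+0x<offset>"
 */
static const char *describe_address(const struct sym_info *q,
                                    char *buf, size_t buflen)
{
    if (!q->symbol) {
        snprintf(buf, buflen, "guest-0x%llx", q->address);
    } else if (!q->offset) {
        snprintf(buf, buflen, "%s", q->symbol);
    } else {
        snprintf(buf, buflen, "%s+0x%llx", q->symbol, q->offset);
    }
    return buf;
}
```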

Signed-off-by: Ilya Leoshkevich 
---
 accel/tcg/debuginfo.c  | 96 ++
 accel/tcg/debuginfo.h  | 77 +
 accel/tcg/meson.build  |  1 +
 hw/core/loader.c   |  5 +++
 linux-user/elfload.c   |  3 ++
 linux-user/meson.build |  1 +
 meson.build|  8 
 7 files changed, 191 insertions(+)
 create mode 100644 accel/tcg/debuginfo.c
 create mode 100644 accel/tcg/debuginfo.h

diff --git a/accel/tcg/debuginfo.c b/accel/tcg/debuginfo.c
new file mode 100644
index 000..fee98a8e867
--- /dev/null
+++ b/accel/tcg/debuginfo.c
@@ -0,0 +1,96 @@
+/*
+ * Debug information support.
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/lockable.h"
+
+#include 
+
+#include "debuginfo.h"
+
+static QemuMutex lock;
+static Dwfl *dwfl;
+static const Dwfl_Callbacks dwfl_callbacks = {
+.find_elf = NULL,
+.find_debuginfo = dwfl_standard_find_debuginfo,
+.section_address = NULL,
+.debuginfo_path = NULL,
+};
+
+__attribute__((constructor))
+static void debuginfo_init(void)
+{
+qemu_mutex_init(&lock);
+}
+
+void debuginfo_report_elf(const char *name, int fd, unsigned long long bias)
+{
+QEMU_LOCK_GUARD(&lock);
+
+if (dwfl) {
+dwfl_report_begin_add(dwfl);
+} else {
+dwfl = dwfl_begin(&dwfl_callbacks);
+}
+
+if (dwfl) {
+dwfl_report_elf(dwfl, name, name, fd, bias, true);
+dwfl_report_end(dwfl, NULL, NULL);
+}
+}
+
+void debuginfo_lock(void)
+{
+qemu_mutex_lock(&lock);
+}
+
+void debuginfo_query(struct debuginfo_query *q, size_t n)
+{
+const char *symbol, *file;
+Dwfl_Module *dwfl_module;
+Dwfl_Line *dwfl_line;
+GElf_Off dwfl_offset;
+GElf_Sym dwfl_sym;
+size_t i;
+int line;
+
+if (!dwfl) {
+return;
+}
+
+for (i = 0; i < n; i++) {
+dwfl_module = dwfl_addrmodule(dwfl, q[i].address);
+if (!dwfl_module) {
+continue;
+}
+
+if (q[i].flags & DEBUGINFO_SYMBOL) {
+symbol = dwfl_module_addrinfo(dwfl_module, q[i].address,
+  &dwfl_offset, &dwfl_sym,
+  NULL, NULL, NULL);
+if (symbol) {
+q[i].symbol = symbol;
+q[i].offset = dwfl_offset;
+}
+}
+
+if (q[i].flags & DEBUGINFO_LINE) {
+dwfl_line = dwfl_module_getsrc(dwfl_module, q[i].address);
+if (dwfl_line) {
+file = dwfl_lineinfo(dwfl_line, NULL, &line, 0, NULL, NULL);
+if (file) {
+q[i].file = file;
+q[i].line = line;
+}
+}
+}
+}
+}
+
+void debuginfo_unlock(void)
+{
+qemu_mutex_unlock(&lock);
+}
diff --git a/accel/tcg/debuginfo.h b/accel/tcg/debuginfo.h
new file mode 100644
index 000..50b7a0f8471
--- /dev/null
+++ b/accel/tcg/debuginfo.h
@@ -0,0 +1,77 @@
+/*
+ * Debug information support.
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+
+#ifndef ACCEL_TCG_DEBUGINFO_H
+#define ACCEL_TCG_DEBUGINFO_H
+
+/*
+ * Debuginfo describing a certain address.
+ */
+struct debuginfo_query {
+unsigned long long address;  /* Input: address. */
+int flags;   /* Input: debuginfo subset. */
+const char *symbol;  /* Symbol that the address is part of. */
+unsigned long long offset;   /* Offset from the symbol. */
+const char *file;/* Source file associated with the address. */
+int line;/* Line number in the source file. */
+};
+
+/*
+ * Debuginfo subsets.
+ */
+#define DEBUGINFO_SYMBOL BIT(1)
+#define DEBUGINFO_LINE   BIT(2)
+
+#if defined(CONFIG_TCG) && defined(CONFIG_LIBDW)
+/*
+ * Load debuginfo for the specified guest ELF image.
+ * Nothing is returned; failures are silently ignored.
+ */
+void debuginfo_report_elf(const char *name, int fd, unsigned long long bias);
+
+/*
+ * Take the debuginfo lock.
+ */
+void debuginfo_lock(void);
+
+/*
+ * Fill each of N Qs with the debuginfo about Q->ADDRESS as specified by
+ * Q->FLAGS:
+ *
+ * - DEBUGINFO_SYMBOL: update Q->SYMBOL and Q->OFFSET. If symbol debuginfo is
+ *   missing, then leave them as is.
+ * - DEBUGINFO_LINE: update Q->FILE and Q->LINE. If line debuginfo is missing,
+ *   then leave them as is.
+ *
+ * This function must be called under the debuginfo lock. The results can be
+ * accessed only until the debuginfo lock is released.
+ */
+void debuginfo_query(struct debuginfo_query *q, size_t n);
+
+/*
+ * Release the debuginfo lock.
+ */
+void debuginfo_unlock(void);
+#else
+stati

[PATCH v3 1/3] linux-user: Clean up when exiting due to a signal

2023-01-10 Thread Ilya Leoshkevich

When exiting due to an exit() syscall, qemu-user calls
preexit_cleanup(), but this is currently not the case when exiting due
to a signal. This leads to various buffers not being flushed (e.g.,
for gprof, for gcov, and for the upcoming perf support).

Add the missing call.
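
The shape of the fix, as a stand-alone sketch: both ways of leaving go
through the same cleanup hook, and the signal path reports 128 + signal
number, matching the value this patch passes to preexit_cleanup(). All
names below are hypothetical stand-ins, not qemu-user functions.

```c
#include <stdio.h>

/* Counts cleanup invocations; exists only to make the sketch observable. */
static int cleanup_calls;

/* Stand-in for a pre-exit hook: flush whatever is still buffered. */
static void cleanup_before_exit(int code)
{
    (void)code;
    fflush(NULL);   /* flush all open stdio output streams */
    cleanup_calls++;
}

/* Leaving through an exit() syscall: clean up, then report the code. */
static int leave_via_exit(int code)
{
    cleanup_before_exit(code);
    return code;
}

/* Leaving because of a signal: the same cleanup, with code 128 + sig. */
static int leave_via_signal(int sig)
{
    int code = 128 + sig;

    cleanup_before_exit(code);
    return code;
}
```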

Signed-off-by: Ilya Leoshkevich 
---
 linux-user/signal.c | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/linux-user/signal.c b/linux-user/signal.c
index 61c6fa3fcf1..098f3a787db 100644
--- a/linux-user/signal.c
+++ b/linux-user/signal.c
@@ -695,7 +695,7 @@ void cpu_loop_exit_sigbus(CPUState *cpu, target_ulong addr,
 
 /* abort execution with signal */
 static G_NORETURN
-void dump_core_and_abort(int target_sig)
+void dump_core_and_abort(CPUArchState *cpu_env, int target_sig)
 {
 CPUState *cpu = thread_cpu;
 CPUArchState *env = cpu->env_ptr;
@@ -724,6 +724,8 @@ void dump_core_and_abort(int target_sig)
 target_sig, strsignal(host_sig), "core dumped" );
 }
 
+preexit_cleanup(cpu_env, 128 + target_sig);
+
 /* The proper exit code for dying from an uncaught signal is
  * -<signal>.  The kernel doesn't allow exit() or _exit() to pass
  * a negative value.  To get the proper exit code we need to
@@ -1058,12 +1060,12 @@ static void handle_pending_signal(CPUArchState *cpu_env, int sig,
sig != TARGET_SIGURG &&
sig != TARGET_SIGWINCH &&
sig != TARGET_SIGCONT) {
-dump_core_and_abort(sig);
+dump_core_and_abort(cpu_env, sig);
 }
 } else if (handler == TARGET_SIG_IGN) {
 /* ignore sig */
 } else if (handler == TARGET_SIG_ERR) {
-dump_core_and_abort(sig);
+dump_core_and_abort(cpu_env, sig);
 } else {
 /* compute the blocked signals during the handler execution */
 sigset_t *blocked_set;
-- 
2.39.0

[PATCH v3 0/3] tcg: add perfmap and jitdump

2023-01-10 Thread Ilya Leoshkevich

v2:
https://lists.gnu.org/archive/html/qemu-devel/2022-11/msg02385.html
https://lists.gnu.org/archive/html/qemu-devel/2023-01/msg01026.html

v2 -> v3:
* Enable only for CONFIG_LINUX (Alex).
* Use qemu_get_thread_id() instead of gettid() (Alex).
* Fix CI (Alex).
  https://gitlab.com/iii-i/qemu/-/pipelines/743684604
* Drop unnecessary #includes (Alex).
* Drop the constification change (Alex/Richard).
* Split debuginfo support into a separate patch.
* Fix partial perfmap/jitdump files when terminating due to a signal.
* Fix debuginfo strings being accessed outside of debuginfo lock.
* Fix address resolution with TARGET_TB_PCREL.
* Add DEBUGINFOD_URLS= to the doc; without it perf inject is
  unacceptably slow.
* Note: it's better to test this with the latest perf
  (6.2.rc3.g7dd4b804e080 worked fine for me). There has been at least
  one breakage in the JIT area recently (fixed by 6d518ac7be62).

v1:
https://lists.nongnu.org/archive/html/qemu-devel/2022-10/msg01824.html
https://lists.nongnu.org/archive/html/qemu-devel/2022-11/msg01073.html

v1 -> v2:
* Use QEMU_LOCK_GUARD (Alex).
* Handle TARGET_TB_PCREL (Alex).
* Support ELF -kernels, add a note about this (Alex). Tested with
  qemu-system-x86_64 and Linux kernel - it's not fast, but it works.
* Minor const correctness and style improvements.

Ilya Leoshkevich (3):
  linux-user: Clean up when exiting due to a signal
  accel/tcg: Add debuginfo support
  tcg: add perfmap and jitdump

 accel/tcg/debuginfo.c |  96 ++
 accel/tcg/debuginfo.h |  77 
 accel/tcg/meson.build |   2 +
 accel/tcg/perf.c  | 366 ++
 accel/tcg/perf.h  |  49 +
 accel/tcg/translate-all.c |   8 +
 docs/devel/tcg.rst|  23 +++
 hw/core/loader.c  |   5 +
 linux-user/elfload.c  |   3 +
 linux-user/exit.c |   2 +
 linux-user/main.c |  15 ++
 linux-user/meson.build|   1 +
 linux-user/signal.c   |   8 +-
 meson.build   |   8 +
 qemu-options.hx   |  20 +++
 softmmu/vl.c  |  11 ++
 tcg/tcg.c |   2 +
 17 files changed, 693 insertions(+), 3 deletions(-)
 create mode 100644 accel/tcg/debuginfo.c
 create mode 100644 accel/tcg/debuginfo.h
 create mode 100644 accel/tcg/perf.c
 create mode 100644 accel/tcg/perf.h

-- 
2.39.0

Re: [PATCH v7 3/7] mac_{old,new}world: Pass MacOS VGA NDRV in card ROM instead of fw_cfg

2023-01-10 Thread BALATON Zoltan


On Tue, 10 Jan 2023, Mark Cave-Ayland wrote:

On 04/01/2023 21:59, BALATON Zoltan wrote:

OpenBIOS cannot run FCode ROMs yet but it can detect NDRV in VGA card
ROM and add it to the device tree for MacOS. Pass the NDRV this way
instead of via fw_cfg. This solves the problem with OpenBIOS also
adding the NDRV to ati-vga which it does not work with. This does not
need any changes to OpenBIOS as this NDRV ROM handling is already
there but this patch also allows simplifying OpenBIOS later to remove
the fw_cfg ndrv handling from the vga FCode and also drop the
vga-ndrv? option which is not needed any more as users can disable the
ndrv with -device VGA,romfile="" (or override it with their own NDRV
or ROM). Once FCode support is implemented in OpenBIOS, the proper
FCode ROM can be set the same way so this paves the way to remove some
hacks.

Signed-off-by: BALATON Zoltan 
---
  hw/ppc/mac_newworld.c | 18 ++
  hw/ppc/mac_oldworld.c | 18 ++
  2 files changed, 12 insertions(+), 24 deletions(-)

diff --git a/hw/ppc/mac_newworld.c b/hw/ppc/mac_newworld.c
index 460c14b5e3..60c9c27986 100644
--- a/hw/ppc/mac_newworld.c
+++ b/hw/ppc/mac_newworld.c
@@ -510,18 +510,6 @@ static void ppc_core99_init(MachineState *machine)
  fw_cfg_add_i32(fw_cfg, FW_CFG_PPC_BUSFREQ, BUSFREQ);
  fw_cfg_add_i32(fw_cfg, FW_CFG_PPC_NVRAM_ADDR, nvram_addr);
  -/* MacOS NDRV VGA driver */
-filename = qemu_find_file(QEMU_FILE_TYPE_BIOS, NDRV_VGA_FILENAME);
-if (filename) {
-gchar *ndrv_file;
-gsize ndrv_size;
-
-if (g_file_get_contents(filename, &ndrv_file, &ndrv_size, NULL)) {
-fw_cfg_add_file(fw_cfg, "ndrv/qemu_vga.ndrv", ndrv_file, 
ndrv_size);

-}
-g_free(filename);
-}
-
  qemu_register_boot_set(fw_cfg_boot_set, fw_cfg);
  }
  @@ -565,6 +553,11 @@ static int core99_kvm_type(MachineState *machine, 
const char *arg)

  return 2;
  }
  +static GlobalProperty props[] = {
+/* MacOS NDRV VGA driver */
+{ "VGA", "romfile", NDRV_VGA_FILENAME },
+};
+
  static void core99_machine_class_init(ObjectClass *oc, void *data)
  {
  MachineClass *mc = MACHINE_CLASS(oc);
@@ -585,6 +578,7 @@ static void core99_machine_class_init(ObjectClass *oc, 
void *data)

  #endif
  mc->default_ram_id = "ppc_core99.ram";
  mc->ignore_boot_device_suffixes = true;
+compat_props_add(mc->compat_props, props, G_N_ELEMENTS(props));
  fwc->get_dev_path = core99_fw_dev_path;
  }
  diff --git a/hw/ppc/mac_oldworld.c b/hw/ppc/mac_oldworld.c
index 5a7b25a4a8..6a1b1ad47a 100644
--- a/hw/ppc/mac_oldworld.c
+++ b/hw/ppc/mac_oldworld.c
@@ -344,18 +344,6 @@ static void ppc_heathrow_init(MachineState *machine)
  fw_cfg_add_i32(fw_cfg, FW_CFG_PPC_CLOCKFREQ, CLOCKFREQ);
  fw_cfg_add_i32(fw_cfg, FW_CFG_PPC_BUSFREQ, BUSFREQ);
  -/* MacOS NDRV VGA driver */
-filename = qemu_find_file(QEMU_FILE_TYPE_BIOS, NDRV_VGA_FILENAME);
-if (filename) {
-gchar *ndrv_file;
-gsize ndrv_size;
-
-if (g_file_get_contents(filename, &ndrv_file, &ndrv_size, NULL)) {
-fw_cfg_add_file(fw_cfg, "ndrv/qemu_vga.ndrv", ndrv_file, 
ndrv_size);

-}
-g_free(filename);
-}
-
  qemu_register_boot_set(fw_cfg_boot_set, fw_cfg);
  }
  @@ -400,6 +388,11 @@ static int heathrow_kvm_type(MachineState *machine, 
const char *arg)

  return 2;
  }
  +static GlobalProperty props[] = {
+/* MacOS NDRV VGA driver */
+{ "VGA", "romfile", NDRV_VGA_FILENAME },
+};
+
  static void heathrow_class_init(ObjectClass *oc, void *data)
  {
  MachineClass *mc = MACHINE_CLASS(oc);
@@ -420,6 +413,7 @@ static void heathrow_class_init(ObjectClass *oc, void 
*data)

  mc->default_display = "std";
  mc->ignore_boot_device_suffixes = true;
  mc->default_ram_id = "ppc_heathrow.ram";
+compat_props_add(mc->compat_props, props, G_N_ELEMENTS(props));
  fwc->get_dev_path = heathrow_fw_dev_path;
  }


The qemu_vga.ndrv is deliberately kept separate from the PCI option ROM 
because it is a binary generated by a separate project: otherwise you'd end 
up creating a dependency between OpenBIOS and QemuMacDrivers, which is almost 
impossible to achieve since qemu_vga.ndrv can only (currently) be built in an 
emulated MacOS 9 guest.


I don't get this. The dependency is already there, as qemu_vga.ndrv ships
with QEMU just like all the vgabios-*.bin and SeaBIOS binaries, which are
also built from different projects. The qemu_vga.ndrv would also still be
part of an FCode ROM together with vga.fs if OpenBIOS could run that, so
this patch solely changes the way of passing the ROM binary to OpenBIOS
from fw_cfg to the card ROM. That is closer to how it should be, and it can
directly be replaced with the FCode ROM later, once OpenBIOS has advanced
to that point.


The best way to do this would be to extract the PCI config words from your 
ATI OpenBIOS patches and the alter drivers/vga.fs so that it only generates 
the drive

Re: [PATCH v7 4/7] mac_newworld: Add machine types for different mac99 configs

2023-01-10 Thread BALATON Zoltan


On Tue, 10 Jan 2023, Mark Cave-Ayland wrote:

On 04/01/2023 21:59, BALATON Zoltan wrote:

The mac99 machine emulates different machines depending on machine
properties or even if it is run as qemu-system-ppc64 or
qemu-system-ppc. This is very confusing for users and many hours were
lost trying to explain it or finding out why commands users came up
with are not working as expected. (E.g. Windows users might think
qemu-system-ppc64 is just the 64 bit version of qemu-system-ppc and
then fail to boot a 32 bit OS with -M mac99 trying to follow an
example that had qemu-system-ppc.) To avoid such confusion, add
explicit machine types for the different configs which will work the
same with both qemu-system-ppc and qemu-system-ppc64 and also make the
command line clearer for new users.

Signed-off-by: BALATON Zoltan 


Some thoughts on this: the first is that not everyone agrees that the X in
qemu-system-X represents the target. There was a previous discussion
where some KVM people assumed X represented the host, i.e. ppc64 was the
binary that ran all PPC guests but with hardware acceleration for ppc64
guests on ppc64 hosts. This was a while ago, so it may be worth starting a
thread on qemu-devel to see what the current consensus is.


I don't see how this is relevant to this series. It is also likely not the
case any more, as qemu-system-ppc and qemu-system-ppc64 have shared most of
their code for a while, with ppc64 including the ppc config and adding more
machines.


Secondly it's not clear to me why you've chosen names like "powermac_3_1" 
instead of "g4agp"? Does powermac_3_1 uniquely identify the G4 AGP Sawtooth 
model? For QEMU it is always best to emulate real machines, and whilst I 
understand you want to separate out the two versions of the mac99 machine, 
having "powermac_X_Y" seems less clear to me.


These machine model identifiers have been used by Apple to uniquely identify
(all of) their machines since the new-world Macs (even modern iPads and Macs
have them), so for Mac people they should be clearer than the informal
names, which can get long and confusing as there may be slight
differences within a family. In any case, qemu-system-ppc -M mac99 does not
correspond to any real Mac, so I'd like the options which do emulate
real Macs to have names that show which Mac each one is. For the PPC
Macs there's some info here, for example:


https://en.wikipedia.org/wiki/Power_Mac_G4

And everymac.com also has info on all Macs. There was actually more than
one G4 PowerMac with AGP, but the other one was informally called gigabit
ethernet. So the model ID is a shorter and better way to clearly identify
which hardware it is (and it's also referenced in the device-tree of these
Macs).


Finally can you post links to the device trees that you are using for each of 
the new machine types so that we have a clear reference point for future 
changes to the QEMU Mac machines? Even better include the links in the 
comments for each machine so that the information is easily visible for 
developers.


I still have the ones I posted over the past 8 years when I made changes
to OpenBIOS to bring the device-tree closer to the real machine. I
downloaded them back then and don't know where to find them now, but
searching for e.g. "PowerMac3,1" "device-tree" should get some results.


Regards,
BALATON Zoltan


---
  hw/ppc/mac_newworld.c | 94 +++
  1 file changed, 94 insertions(+)

diff --git a/hw/ppc/mac_newworld.c b/hw/ppc/mac_newworld.c
index 60c9c27986..3f5d1ec097 100644
--- a/hw/ppc/mac_newworld.c
+++ b/hw/ppc/mac_newworld.c
@@ -642,9 +642,103 @@ static const TypeInfo core99_machine_info = {
  },
  };
  +static void powermac3_1_machine_class_init(ObjectClass *oc, void *data)
+{
+MachineClass *mc = MACHINE_CLASS(oc);
+
+core99_machine_class_init(oc, data);
+mc->desc = "Apple Power Mac G4 AGP (Sawtooth)";
+mc->default_cpu_type = POWERPC_CPU_TYPE_NAME("7400_v2.9");
+}
+
+static void powermac3_1_instance_init(Object *obj)
+{
+Core99MachineState *cms = CORE99_MACHINE(obj);
+
+cms->via_config = CORE99_VIA_CONFIG_PMU;
+return;
+}
+
+static const TypeInfo powermac3_1_machine_info = {
+.name  = MACHINE_TYPE_NAME("powermac3_1"),
+.parent= TYPE_MACHINE,
+.class_init= powermac3_1_machine_class_init,
+.instance_init = powermac3_1_instance_init,
+.instance_size = sizeof(Core99MachineState),
+.interfaces = (InterfaceInfo[]) {
+{ TYPE_FW_PATH_PROVIDER },
+{ }
+},
+};
+
+static void powerbook3_2_machine_class_init(ObjectClass *oc, void *data)
+{
+MachineClass *mc = MACHINE_CLASS(oc);
+
+core99_machine_class_init(oc, data);
+mc->desc = "Apple PowerBook G4 Titanium (Mercury)";
+mc->default_cpu_type = POWERPC_CPU_TYPE_NAME("7400_v2.9");
+}
+
+static void powerbook3_2_instance_init(Object *obj)
+{
+Core99MachineState *cms = CORE99_MACHINE(obj);
+
+cms->via_config = CORE99_VIA_CONFI

Re: [PATCH v7 6/7] mac_newworld: Deprecate mac99 "via" option

2023-01-10 Thread BALATON Zoltan


On Tue, 10 Jan 2023, Mark Cave-Ayland wrote:

On 04/01/2023 21:59, BALATON Zoltan wrote:


Setting emulated machine type with a property called "via" is
confusing users so deprecate the "via" option in favour of newly added
explicit machine types. The default via=cuda option is not a valid
config (no real Mac has this combination of hardware) so no machine
type could be defined for that therefore it is kept for backwards
compatibility with older QEMU versions for now but other options
resembling real machines are deprecated.

Signed-off-by: BALATON Zoltan 


I believe that people do use -M mac99,via=cuda to run some rare versions of 
MacOS in QEMU (I think possibly OS X DP and Workgroup Server?), so we would 
want to keep this option somewhere.


The idea is that after the previous patches we now have machine types for all 
via option values (which also match real Mac machines) other than 
via=cuda. That one is the default for mac99, so after the deprecation period, 
when the via option is removed, mac99 (which is the same as mac99,via=cuda) 
can remain for this use case (and for backward compatibility) until the 
other machines are fixed to not need this any more. So all via options are 
still available, but as different machine types.


Regards,
BALATON Zoltan


---
  hw/ppc/mac_newworld.c | 9 +
  1 file changed, 9 insertions(+)

diff --git a/hw/ppc/mac_newworld.c b/hw/ppc/mac_newworld.c
index f07c37328b..adf185bd3a 100644
--- a/hw/ppc/mac_newworld.c
+++ b/hw/ppc/mac_newworld.c
@@ -169,6 +169,15 @@ static void ppc_core99_init(MachineState *machine)
  if (PPC_INPUT(env) == PPC_FLAGS_INPUT_970) {
  warn_report("mac99 with G5 CPU is deprecated, "
  "use powermac7_3 instead");
+} else {
+if (core99_machine->via_config == CORE99_VIA_CONFIG_PMU) {
+warn_report("mac99,via=pmu is deprecated, "
+"use powermac3_1 instead");
+}
+if (core99_machine->via_config == CORE99_VIA_CONFIG_PMU_ADB) {
+warn_report("mac99,via=pmu-adb is deprecated, "
+"use powerbook3_2 instead");
+}
  }
  }
  /* allocate RAM */



ATB,

Mark.
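
A minimal sketch of the mapping discussed in this message, written in Python
purely for illustration (the patch itself is C). Each deprecated "via" value
on mac99 is paired with the machine type named in the warning text, while
via=cuda has no replacement. The names REPLACEMENTS and deprecation_warning
are hypothetical and not part of QEMU.

```python
# Hypothetical illustration only; the real check lives in ppc_core99_init().
from typing import Optional

# "via" values that have a dedicated machine type, per the warnings in the diff.
REPLACEMENTS = {
    "pmu": "powermac3_1",
    "pmu-adb": "powerbook3_2",
}


def deprecation_warning(via: str) -> Optional[str]:
    """Return the warning text for a deprecated mac99 "via" value, else None."""
    machine = REPLACEMENTS.get(via)
    if machine is None:
        # via=cuda (the mac99 default) has no matching real machine type.
        return None
    return f"mac99,via={via} is deprecated, use {machine} instead"


print(deprecation_warning("pmu"))
print(deprecation_warning("cuda"))
```

Keeping the mapping in one table makes it easy to see that every value except
the default has a named replacement.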

Re: [PATCH v4 01/36] tcg: Define TCG_TYPE_I128 and related helper macros

2023-01-10 Thread Alex Bennée



Richard Henderson  writes:

> Begin staging in support for TCGv_i128 with Int128.
> Define the type enumerator, the typedef, and the
> helper-head.h macros.
>
> This cannot yet be used, because you can't allocate
> temporaries of this new type.
>
> Signed-off-by: Richard Henderson 

Reviewed-by: Alex Bennée 

-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro

Re: [PATCH v4 1/1] python/machine: Fix AF_UNIX path too long on macOS

2023-01-10 Thread Peter Delevoryas

On Tue, Jan 10, 2023 at 06:18:29PM -0500, John Snow wrote:
> On Tue, Jan 10, 2023 at 3:34 AM Peter Delevoryas  wrote:
> >
> > On macOS, private $TMPDIR's are the default. These $TMPDIR's are
> > generated from a user's unix UID and UUID [1], which can create a
> > relatively long path:
> >
> > /var/folders/d7/rz20f6hd709c1ty8f6_6y_z4gn/T/
> >
> > QEMU's avocado tests create a temporary directory prefixed by
> > "avo_qemu_sock_", and create QMP sockets within _that_ as well.
> > The QMP socket is unnecessarily long, because a temporary directory
> > is created for every QEMUMachine object.
> >
> > /avo_qemu_sock_uh3w_dgc/qemu-37331-10bacf110-monitor.sock
> >
> > The path limit for unix sockets on macOS is 104: [2]
> >
> > /*
> >  * [XSI] Definitions for UNIX IPC domain.
> >  */
> > struct  sockaddr_un {
> >     unsigned char   sun_len;        /* sockaddr len including null */
> >     sa_family_t     sun_family;     /* [XSI] AF_UNIX */
> >     char            sun_path[104];  /* [XSI] path name (gag) */
> > };
> >
> > This results in avocado tests failing on macOS because the QMP unix
> > socket can't be created, because the path is too long:
> >
> > ERROR| Failed to establish connection: OSError: AF_UNIX path too long
> >
> > This change resolves this by reducing the size of the socket directory
> > prefix and the suffix on the QMP and console socket names.
> >
> > The result is paths like this:
> >
> > pdel@pdel-mbp:/var/folders/d7/rz20f6hd709c1ty8f6_6y_z4gn/T
> > $ tree qemu*
> > qemu_df4evjeq
> > qemu_jbxel3gy
> > qemu_ml9s_gg7
> > qemu_oc7h7f3u
> > qemu_oqb1yf97
> > ├── 10a004050.con
> > └── 10a004050.qmp
> >
> > [1] 
> > https://apple.stackexchange.com/questions/353832/why-is-mac-osx-temp-directory-in-weird-path
> > [2] 
> > /Library/Developer/CommandLineTools/SDKs/MacOSX12.3.sdk/usr/include/sys/un.h
> >
> > Signed-off-by: Peter Delevoryas 
> 
> I'm tentatively staging this with a benefit-of-the-doubt [1] -- my
> tests are still running -- but I do have a question:
> 
> > ---
> >  python/qemu/machine/machine.py | 6 +++---
> >  tests/avocado/avocado_qemu/__init__.py | 2 +-
> >  2 files changed, 4 insertions(+), 4 deletions(-)
> >
> > diff --git a/python/qemu/machine/machine.py b/python/qemu/machine/machine.py
> > index 748a0d807c9d..d70977378305 100644
> > --- a/python/qemu/machine/machine.py
> > +++ b/python/qemu/machine/machine.py
> > @@ -157,7 +157,7 @@ def __init__(self,
> >  self._wrapper = wrapper
> >  self._qmp_timer = qmp_timer
> >
> > -self._name = name or f"qemu-{os.getpid()}-{id(self):02x}"
> > +self._name = name or f"{id(self):x}"
> 
> Why is it safe to not differentiate based on the process ID?
> 
> ... I suppose the thinking is: by default, in machine.py, this is a
> temp dir created by tempfile.mkdtemp which will be unique per-process.
> I suppose there's no protection against a caller supplying the same
> tempdir (or sockdir) to multiple instances, but I suppose in those
> cases we get to argue that "Well, don't do that, then."
> 
> Does that sound about right?

Yeah, I think that's it

> 
> --js
> 
> [1] staged @ https://gitlab.com/jsnow/qemu/-/commits/python
> 
> 
> >  self._temp_dir: Optional[str] = None
> >  self._base_temp_dir = base_temp_dir
> >  self._sock_dir = sock_dir
> > @@ -167,7 +167,7 @@ def __init__(self,
> >  self._monitor_address = monitor_address
> >  else:
> >  self._monitor_address = os.path.join(
> > -self.sock_dir, f"{self._name}-monitor.sock"
> > +self.sock_dir, f"{self._name}.qmp"
> >  )
> >
> >  self._console_log_path = console_log
> > @@ -192,7 +192,7 @@ def __init__(self,
> >  self._console_set = False
> >  self._console_device_type: Optional[str] = None
> >  self._console_address = os.path.join(
> > -self.sock_dir, f"{self._name}-console.sock"
> > +self.sock_dir, f"{self._name}.con"
> >  )
> >  self._console_socket: Optional[socket.socket] = None
> >  self._remove_files: List[str] = []
> > diff --git a/tests/avocado/avocado_qemu/__init__.py 
> > b/tests/avocado/avocado_qemu/__init__.py
> > index 910f3ba1eab8..25a546842fab 100644
> > --- a/tests/avocado/avocado_qemu/__init__.py
> > +++ b/tests/avocado/avocado_qemu/__init__.py
> > @@ -306,7 +306,7 @@ def require_netdev(self, netdevname):
> >  self.cancel('no support for user networking')
> >
> >  def _new_vm(self, name, *args):
> > -self._sd = tempfile.TemporaryDirectory(prefix="avo_qemu_sock_")
> > +self._sd = tempfile.TemporaryDirectory(prefix="qemu_")
> >  vm = QEMUMachine(self.qemu_bin, base_temp_dir=self.workdir,
> >   sock_dir=self._sd.name, log_dir=self.logdir)
> >  self.log.debug('QEMUMachine "%s" created', name)
> > --
> > 2.39.0
> >
>
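
To illustrate the length budget discussed in the commit message, here is a
small Python sketch. It assumes that the 104-byte sun_path limit quoted above
includes the terminating NUL, so a usable path is at most 103 bytes.
SUN_PATH_MAX, fits_sun_path, old_name and new_name are hypothetical helpers,
and the pid and id values are taken from the example path in the message.

```python
import os

SUN_PATH_MAX = 104  # size of sun_path[] in the macOS header quoted above


def fits_sun_path(path: str) -> bool:
    """True if the encoded path plus a trailing NUL fits in sun_path[]."""
    return len(os.fsencode(path)) < SUN_PATH_MAX


def old_name(pid: int, obj_id: int) -> str:
    # Naming scheme before the patch: default name plus "-monitor.sock".
    return f"qemu-{pid}-{obj_id:02x}-monitor.sock"


def new_name(obj_id: int) -> str:
    # Naming scheme after the patch: shorter default name plus ".qmp".
    return f"{obj_id:x}.qmp"


old = old_name(37331, 0x10BACF110)
new = new_name(0x10BACF110)
print(old, len(old))
print(new, len(new))
print("bytes saved per socket name:", len(old) - len(new))
```

The same check can be applied to a full path by joining $TMPDIR, the socket
directory and the socket name before calling fits_sun_path().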

Re: [PATCH v4 1/1] python/machine: Fix AF_UNIX path too long on macOS

2023-01-10 Thread John Snow

On Tue, Jan 10, 2023 at 3:34 AM Peter Delevoryas  wrote:
>
> On macOS, private $TMPDIR's are the default. These $TMPDIR's are
> generated from a user's unix UID and UUID [1], which can create a
> relatively long path:
>
> /var/folders/d7/rz20f6hd709c1ty8f6_6y_z4gn/T/
>
> QEMU's avocado tests create a temporary directory prefixed by
> "avo_qemu_sock_", and create QMP sockets within _that_ as well.
> The QMP socket is unnecessarily long, because a temporary directory
> is created for every QEMUMachine object.
>
> /avo_qemu_sock_uh3w_dgc/qemu-37331-10bacf110-monitor.sock
>
> The path limit for unix sockets on macOS is 104: [2]
>
> /*
>  * [XSI] Definitions for UNIX IPC domain.
>  */
> struct  sockaddr_un {
>     unsigned char   sun_len;        /* sockaddr len including null */
>     sa_family_t     sun_family;     /* [XSI] AF_UNIX */
>     char            sun_path[104];  /* [XSI] path name (gag) */
> };
>
> This results in avocado tests failing on macOS because the QMP unix
> socket can't be created, because the path is too long:
>
> ERROR| Failed to establish connection: OSError: AF_UNIX path too long
>
> This change resolves this by reducing the size of the socket directory
> prefix and the suffix on the QMP and console socket names.
>
> The result is paths like this:
>
> pdel@pdel-mbp:/var/folders/d7/rz20f6hd709c1ty8f6_6y_z4gn/T
> $ tree qemu*
> qemu_df4evjeq
> qemu_jbxel3gy
> qemu_ml9s_gg7
> qemu_oc7h7f3u
> qemu_oqb1yf97
> ├── 10a004050.con
> └── 10a004050.qmp
>
> [1] 
> https://apple.stackexchange.com/questions/353832/why-is-mac-osx-temp-directory-in-weird-path
> [2] 
> /Library/Developer/CommandLineTools/SDKs/MacOSX12.3.sdk/usr/include/sys/un.h
>
> Signed-off-by: Peter Delevoryas 

I'm tentatively staging this with a benefit-of-the-doubt [1] -- my
tests are still running -- but I do have a question:

> ---
>  python/qemu/machine/machine.py | 6 +++---
>  tests/avocado/avocado_qemu/__init__.py | 2 +-
>  2 files changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/python/qemu/machine/machine.py b/python/qemu/machine/machine.py
> index 748a0d807c9d..d70977378305 100644
> --- a/python/qemu/machine/machine.py
> +++ b/python/qemu/machine/machine.py
> @@ -157,7 +157,7 @@ def __init__(self,
>  self._wrapper = wrapper
>  self._qmp_timer = qmp_timer
>
> -self._name = name or f"qemu-{os.getpid()}-{id(self):02x}"
> +self._name = name or f"{id(self):x}"

Why is it safe to not differentiate based on the process ID?

... I suppose the thinking is: by default, in machine.py, this is a
temp dir created by tempfile.mkdtemp which will be unique per-process.
I suppose there's no protection against a caller supplying the same
tempdir (or sockdir) to multiple instances, but I suppose in those
cases we get to argue that "Well, don't do that, then."

Does that sound about right?

--js

[1] staged @ https://gitlab.com/jsnow/qemu/-/commits/python


>  self._temp_dir: Optional[str] = None
>  self._base_temp_dir = base_temp_dir
>  self._sock_dir = sock_dir
> @@ -167,7 +167,7 @@ def __init__(self,
>  self._monitor_address = monitor_address
>  else:
>  self._monitor_address = os.path.join(
> -self.sock_dir, f"{self._name}-monitor.sock"
> +self.sock_dir, f"{self._name}.qmp"
>  )
>
>  self._console_log_path = console_log
> @@ -192,7 +192,7 @@ def __init__(self,
>  self._console_set = False
>  self._console_device_type: Optional[str] = None
>  self._console_address = os.path.join(
> -self.sock_dir, f"{self._name}-console.sock"
> +self.sock_dir, f"{self._name}.con"
>  )
>  self._console_socket: Optional[socket.socket] = None
>  self._remove_files: List[str] = []
> diff --git a/tests/avocado/avocado_qemu/__init__.py 
> b/tests/avocado/avocado_qemu/__init__.py
> index 910f3ba1eab8..25a546842fab 100644
> --- a/tests/avocado/avocado_qemu/__init__.py
> +++ b/tests/avocado/avocado_qemu/__init__.py
> @@ -306,7 +306,7 @@ def require_netdev(self, netdevname):
>  self.cancel('no support for user networking')
>
>  def _new_vm(self, name, *args):
> -self._sd = tempfile.TemporaryDirectory(prefix="avo_qemu_sock_")
> +self._sd = tempfile.TemporaryDirectory(prefix="qemu_")
>  vm = QEMUMachine(self.qemu_bin, base_temp_dir=self.workdir,
>   sock_dir=self._sd.name, log_dir=self.logdir)
>  self.log.debug('QEMUMachine "%s" created', name)
> --
> 2.39.0
>
>
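
The uniqueness argument raised in this review can be checked with a short
Python sketch. It relies only on the standard tempfile module. The socket name
"10a004050.qmp" is copied from the example listing, and the two directories
stand in for two QEMUMachine instances that happen to share the same default
name. This is an illustration of the reasoning, not code from the patch.

```python
import os
import tempfile

# Two socket directories, as two separate test VMs (or processes) would create.
with tempfile.TemporaryDirectory(prefix="qemu_") as dir_a, \
        tempfile.TemporaryDirectory(prefix="qemu_") as dir_b:
    # Same file name in both, e.g. if id(self) happened to coincide.
    path_a = os.path.join(dir_a, "10a004050.qmp")
    path_b = os.path.join(dir_b, "10a004050.qmp")

    # Each directory is created exclusively, so the parents differ and the
    # full socket paths cannot collide.
    print(path_a != path_b)
```

The caveat from the review still applies: if a caller passes the same
sock_dir to instances in different processes, only id(self) keeps the names
apart, and that value is not guaranteed to differ across processes.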

Re: [PATCH v6 0/4] Make the mc146818 RTC device target independent

2023-01-10 Thread Mark Cave-Ayland


On 10/01/2023 09:53, Thomas Huth wrote:


The basic idea of this patch set is to change hw/rtc/mc146818rtc.c into
target independent code so that the file only has to be compiled once
instead of multiple times (and that it can be used in a qemu-system-all
binary once we get there).

The first patch extracts some functions from the APIC code that will be
required for linking when the mc146818rtc becomes target-independent.

The second patch adds a new way for checking whether the "driftfix=slew"
policy is available or not (since the corresponding #ifdefs in the
mc146818rtc code will be removed).

The third patch then removes the "#ifdef TARGET" switches and turns
the mc146818rtc code into a target-independent file.

The fourth patch just fixes a small cosmetic nit that I discovered along
the way: On systems without mc146818, the "-rtc driftfix=slew" simply
got ignored silently. We should at least emit a warning in this case.

Changes since last iteration:
- Dropped the approach of using a new "slew-tick-policy-available"
   property that needs to be set by the machine code (and thus dropped
   the clean-up patches from Bernhard from this series since they are
   no longer required here now)
- Use a new check in hw/core/qdev-properties-system.c instead
   (see the second patch)

Thomas Huth (4):
   hw/intc: Extract the IRQ counting functions into a separate file
   hw/core/qdev-properties-system: Allow the 'slew' policy only on x86
   hw/rtc/mc146818rtc: Make the mc146818 RTC device target independent
   softmmu/rtc: Emit warning when using driftfix=slew on systems without
 mc146818

  include/hw/i386/apic.h   |  2 --
  include/hw/i386/apic_internal.h  |  1 -
  include/hw/intc/kvm_irqcount.h   | 10 +++
  include/hw/rtc/mc146818rtc.h |  1 +
  hw/core/qdev-properties-system.c | 28 +-
  hw/i386/kvm/i8259.c  |  4 +--
  hw/i386/kvm/ioapic.c |  4 +--
  hw/intc/apic.c   |  3 +-
  hw/intc/apic_common.c| 30 ++-
  hw/intc/kvm_irqcount.c   | 49 
  hw/rtc/mc146818rtc.c | 20 ++---
  softmmu/rtc.c|  6 +++-
  hw/intc/meson.build  |  6 
  hw/intc/trace-events |  9 +++---
  hw/rtc/meson.build   |  3 +-
  15 files changed, 115 insertions(+), 61 deletions(-)
  create mode 100644 include/hw/intc/kvm_irqcount.h
  create mode 100644 hw/intc/kvm_irqcount.c


This looks much better than the previous approaches - thanks for working on this! 
Looks good to me, so:


Reviewed-by: Mark Cave-Ayland 


ATB,

Mark.
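
A rough Python sketch of the two behaviours described in the cover letter:
accepting the "slew" policy only on x86, and warning instead of silently
ignoring "driftfix=slew" when no mc146818 is present. The actual series is C
and its implementation is not shown in this message, so every name below
(X86_TARGETS, check_lost_tick_policy, apply_driftfix) and both message strings
are assumptions made for illustration.

```python
import warnings

# Hypothetical stand-ins; not QEMU identifiers.
X86_TARGETS = {"i386", "x86_64"}


def check_lost_tick_policy(policy: str, target: str) -> None:
    """Reject the 'slew' policy on targets other than x86."""
    if policy == "slew" and target not in X86_TARGETS:
        raise ValueError("the 'slew' policy is only available for x86")


def apply_driftfix(driftfix: str, has_mc146818: bool) -> bool:
    """Return True if the option takes effect; warn instead of ignoring it."""
    if driftfix == "slew" and not has_mc146818:
        warnings.warn("driftfix=slew has no effect without an mc146818 RTC")
        return False
    return True
```

Doing the target check at property-parsing time is what lets the device code
itself drop its target-specific #ifdefs.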

Re: [PATCH v4 00/36] tcg: Support for Int128 with helpers

2023-01-10 Thread Mark Cave-Ayland


On 08/01/2023 02:36, Richard Henderson wrote:


Changes for v4:
   * About half of the v3 series has been merged,
   * AArch64 host requires even argument register.
   * target/{arm,ppc,s390x,i386} uses included here.

Patches requiring review:
   01-tcg-Define-TCG_TYPE_I128-and-related-helper-macro.patch
   02-tcg-Handle-dh_typecode_i128-with-TCG_CALL_-RET-AR.patch
   03-tcg-Allocate-objects-contiguously-in-temp_allocat.patch
   05-tcg-Add-TCG_CALL_-RET-ARG-_BY_REF.patch
   07-tcg-Add-TCG_CALL_RET_BY_VEC.patch
   08-include-qemu-int128-Use-Int128-structure-for-TCI.patch
   09-tcg-i386-Add-TCG_TARGET_CALL_-RET-ARG-_I128.patch
   10-tcg-tci-Fix-big-endian-return-register-ordering.patch
   11-tcg-tci-Add-TCG_TARGET_CALL_-RET-ARG-_I128.patch
   13-tcg-Add-temp-allocation-for-TCGv_i128.patch
   14-tcg-Add-basic-data-movement-for-TCGv_i128.patch
   15-tcg-Add-guest-load-store-primitives-for-TCGv_i128.patch
   16-tcg-Add-tcg_gen_-non-atomic_cmpxchg_i128.patch
   17-tcg-Split-out-tcg_gen_nonatomic_cmpxchg_i-32-64.patch
   24-target-s390x-Use-a-single-return-for-helper_divs3.patch
   31-target-s390x-Use-Int128-for-passing-float128.patch
   32-target-s390x-Use-tcg_gen_atomic_cmpxchg_i128-for-.patch
   33-target-s390x-Implement-CC_OP_NZ-in-gen_op_calc_cc.patch
   34-target-i386-Split-out-gen_cmpxchg8b-gen_cmpxchg16.patch
   35-target-i386-Inline-cmpxchg8b.patch
   36-target-i386-Inline-cmpxchg16b.patch


r~


Ilya Leoshkevich (2):
   tests/tcg/s390x: Add div.c
   tests/tcg/s390x: Add clst.c

Richard Henderson (34):
   tcg: Define TCG_TYPE_I128 and related helper macros
   tcg: Handle dh_typecode_i128 with TCG_CALL_{RET,ARG}_NORMAL
   tcg: Allocate objects contiguously in temp_allocate_frame
   tcg: Introduce tcg_out_addi_ptr
   tcg: Add TCG_CALL_{RET,ARG}_BY_REF
   tcg: Introduce tcg_target_call_oarg_reg
   tcg: Add TCG_CALL_RET_BY_VEC
   include/qemu/int128: Use Int128 structure for TCI
   tcg/i386: Add TCG_TARGET_CALL_{RET,ARG}_I128
   tcg/tci: Fix big-endian return register ordering
   tcg/tci: Add TCG_TARGET_CALL_{RET,ARG}_I128
   tcg: Add TCG_TARGET_CALL_{RET,ARG}_I128
   tcg: Add temp allocation for TCGv_i128
   tcg: Add basic data movement for TCGv_i128
   tcg: Add guest load/store primitives for TCGv_i128
   tcg: Add tcg_gen_{non}atomic_cmpxchg_i128
   tcg: Split out tcg_gen_nonatomic_cmpxchg_i{32,64}
   target/arm: Use tcg_gen_atomic_cmpxchg_i128 for STXP
   target/arm: Use tcg_gen_atomic_cmpxchg_i128 for CASP
   target/ppc: Use tcg_gen_atomic_cmpxchg_i128 for STQCX
   tests/tcg/s390x: Add long-double.c
   target/s390x: Use a single return for helper_divs32/u32
   target/s390x: Use a single return for helper_divs64/u64
   target/s390x: Use Int128 for return from CLST
   target/s390x: Use Int128 for return from CKSM
   target/s390x: Use Int128 for return from TRE
   target/s390x: Copy wout_x1 to wout_x1_P
   target/s390x: Use Int128 for returning float128
   target/s390x: Use Int128 for passing float128
   target/s390x: Use tcg_gen_atomic_cmpxchg_i128 for CDSG
   target/s390x: Implement CC_OP_NZ in gen_op_calc_cc
   target/i386: Split out gen_cmpxchg8b, gen_cmpxchg16b
   target/i386: Inline cmpxchg8b
   target/i386: Inline cmpxchg16b

  accel/tcg/tcg-runtime.h  |  11 +
  include/exec/cpu_ldst.h  |  10 +
  include/exec/helper-head.h   |   7 +
  include/qemu/atomic128.h |  29 ++-
  include/qemu/int128.h|  25 +-
  include/tcg/tcg-op.h |  15 ++
  include/tcg/tcg.h|  49 +++-
  target/arm/helper-a64.h  |   8 -
  target/i386/helper.h |   6 -
  target/ppc/helper.h  |   2 -
  target/s390x/helper.h|  54 ++---
  tcg/aarch64/tcg-target.h |   2 +
  tcg/arm/tcg-target.h |   2 +
  tcg/i386/tcg-target.h|  10 +
  tcg/loongarch64/tcg-target.h |   2 +
  tcg/mips/tcg-target.h|   2 +
  tcg/riscv/tcg-target.h   |   3 +
  tcg/s390x/tcg-target.h   |   2 +
  tcg/sparc64/tcg-target.h |   2 +
  tcg/tcg-internal.h   |  17 ++
  tcg/tci/tcg-target.h |   3 +
  target/s390x/tcg/insn-data.h.inc |  60 ++---
  accel/tcg/cputlb.c   | 112 +
  accel/tcg/user-exec.c|  66 ++
  target/arm/helper-a64.c  | 147 
  target/arm/translate-a64.c   | 121 +-
  target/i386/tcg/mem_helper.c | 126 --
  target/i386/tcg/translate.c  | 126 --
  target/ppc/mem_helper.c  |  44 
  target/ppc/translate.c   | 102 
  target/s390x/tcg/fpu_helper.c| 103 
  target/s390x/tcg/int_helper.c|  64 ++---
  target/s390x/tcg/mem_helper.c|  77 +-
  target/s390x/tcg/translate.c | 217 +++--
  tcg/tcg-op.c | 393 ++-
  tcg/tcg.c| 303 +---
  tcg/tci.c|  65 ++---
  tests/tcg/s390x/clst.c   |  82 +++

Re: [PATCH qemu v3 1/1] Emulating sun keyboard language layout dip switches, taking the value for the dip switches from the "-k" option to qemu.

2023-01-10 Thread Mark Cave-Ayland


On 06/01/2023 21:33, ~henca wrote:


From: Henrik Carlqvist 

SUN Type 4, 5 and 5c keyboards have dip switches to choose the language
layout of the keyboard. Solaris makes an ioctl to query the value of the
dip switches and uses that value to select the keyboard layout. The SUN
BIOS, such as the one in the file ss5.bin, also uses this value to support at
least some keyboard layouts. However, the OpenBIOS provided with qemu is
hardcoded to always use a US keyboard layout.

Before this patch, qemu always gave dip switch value 0x21 (US keyboard).
This patch uses the command line switch "-k" (keyboard layout) to select the
dip switch value. A table is used to look up values from arguments like:

-k fr
-k es

But the patch also accepts numeric dip switch values directly to the -k
switch:

-k 0x2b
-k 43

Both values above are the same and select the Swedish keyboard layout, as
explained in table 3-15 at
https://docs.oracle.com/cd/E19683-01/806-6642/new-43/index.html

If you do not want to do a full Solaris installation but happen to have
access to a BIOS file, the easiest way to test that the patch works is:

qemu-system-sparc -k sv -bios /path/to/ss5.bin

If you already happen to have a Solaris installation in a qemu disk image
file you can easily try different keyboard layouts after this patch is
applied.
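
The lookup order described above (language code first, numeric value as a
fallback) can be sketched in Python as follows. This is an illustration, not
the patch: LAYOUT_DIP holds only a subset of the table from the diff,
dip_switch is a hypothetical name, and falling back to 0x21 for unrecognised
input is an assumption, since the quoted patch is cut off before that point.

```python
# Subset of the table in the patch; the full list is in the diff.
LAYOUT_DIP = {
    "en-us": 0x21,
    "fr": 0x23,
    "de": 0x25,
    "es": 0x2A,
    "sv": 0x2B,
    "en-gb": 0x2E,
}

DEFAULT_DIP = 0x21  # value QEMU reported before the patch (US keyboard)


def dip_switch(layout):
    """Map a -k argument to a dip switch value (illustrative only)."""
    if layout is None:
        return DEFAULT_DIP
    if layout in LAYOUT_DIP:
        return LAYOUT_DIP[layout]
    if layout[:1].isdigit():
        try:
            # Base 0 accepts both "0x2b" and "43".
            return int(layout, 0)
        except ValueError:
            pass
    # Assumption: fall back to the US layout for unknown input.
    return DEFAULT_DIP


print(hex(dip_switch("sv")), hex(dip_switch("0x2b")), hex(dip_switch("43")))
```

One difference from a C strtoul-style parser: Python's int(..., 0) rejects
decimal strings with a leading zero such as "043", which this sketch treats
as unknown input.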
---
  hw/char/escc.c | 74 +-
  1 file changed, 73 insertions(+), 1 deletion(-)

diff --git a/hw/char/escc.c b/hw/char/escc.c
index 17a908c59b..53022ccf39 100644
--- a/hw/char/escc.c
+++ b/hw/char/escc.c
@@ -31,6 +31,8 @@
  #include "qemu/module.h"
  #include "hw/char/escc.h"
  #include "ui/console.h"
+#include "sysemu/sysemu.h"
+#include "qemu/cutils.h"
  #include "trace.h"
  
  /*

@@ -190,6 +192,7 @@
  #define R_MISC1I 14
  #define R_EXTINT 15
  
+static unsigned char sun_keyboard_layout_dip_switch(void);

  static void handle_kbd_command(ESCCChannelState *s, int val);
  static int serial_can_receive(void *opaque);
  static void serial_receive_byte(ESCCChannelState *s, int ch);
@@ -846,6 +849,75 @@ static QemuInputHandler sunkbd_handler = {
  .event = sunkbd_handle_event,
  };
  
+static unsigned char sun_keyboard_layout_dip_switch(void)

+{
+/* Return the value of the dip-switches in a SUN Type 5 keyboard */
+static unsigned char ret = 0xff;
+
+if ((ret == 0xff) && keyboard_layout) {
+int i;
+struct layout_values {
+const char *lang;
+unsigned char dip;
+} languages[] =
+/* Dip values from table 3-16 Layouts for Type 4, 5, and 5c Keyboards */
+{
+{"en-us", 0x21}, /* U.S.A. (US5.kt) */
+ /* 0x22 is some other US (US_UNIX5.kt)*/
+{"fr",0x23}, /* France (France5.kt) */
+{"da",0x24}, /* Denmark (Denmark5.kt) */
+{"de",0x25}, /* Germany (Germany5.kt) */
+{"it",0x26}, /* Italy (Italy5.kt) */
+{"nl",0x27}, /* The Netherlands (Netherland5.kt) */
+{"no",0x28}, /* Norway (Norway.kt) */
+{"pt",0x29}, /* Portugal (Portugal5.kt) */
+{"es",0x2a}, /* Spain (Spain5.kt) */
+{"sv",0x2b}, /* Sweden (Sweden5.kt) */
+{"fr-ch", 0x2c}, /* Switzerland/French (Switzer_Fr5.kt) */
+{"de-ch", 0x2d}, /* Switzerland/German (Switzer_Ge5.kt) */
+{"en-gb", 0x2e}, /* Great Britain (UK5.kt) */
+{"ko",0x2f}, /* Korea (Korea5.kt) */
+{"tw",0x30}, /* Taiwan (Taiwan5.kt) */
+{"ja",0x31}, /* Japan (Japan5.kt) */
+{"fr-ca", 0x32}, /* Canada/French (Canada_Fr5.kt) */
+{"hu",0x33}, /* Hungary (Hungary5.kt) */
+{"pl",0x34}, /* Poland (Poland5.kt) */
+{"cz",0x35}, /* Czech (Czech5.kt) */
+{"ru",0x36}, /* Russia (Russia5.kt) */
+{"lv",0x37}, /* Latvia (Latvia5.kt) */
+{"tr",0x38}, /* Turkey-Q5 (TurkeyQ5.kt) */
+{"gr",0x39}, /* Greece (Greece5.kt) */
+{"ar",0x3a}, /* Arabic (Arabic5.kt) */
+{"lt",0x3b}, /* Lithuania (Lithuania5.kt) */
+{"nl-be", 0x3c}, /* Belgium (Belgian5.kt) */
+{"be",0x3c}, /* Belgium (Belgian5.kt) */
+};
+
+for (i = 0;
+ i < sizeof(languages) / sizeof(struct layout_values);
+ i++) {
+if (!strcmp(keyboard_layout, languages[i].lang)) {
+ret = languages[i].dip;
+return ret;
+}
+}
+/* Found no known language code */
+
+if ((keyboard_layout[0] >= '0') && (keyboard_layout[0] <= '9')) {
+unsigned int tmp;
+/* As a fallback we also accept numeric dip switch value */
+if (!qemu_strtoui(keyboard_layout, NULL, 0, &tmp)) {
+

Re: [PATCH 0/1] hw/ide: share bmdma read and write functions

2023-01-10 Thread Bernhard Beschow




On 9 January 2023 19:24:16 UTC, John Snow wrote:
>On Tue, Sep 6, 2022 at 10:27 AM Bernhard Beschow  wrote:
>>
>> On 19 February 2022 08:08:17 UTC, Liav Albani wrote:
>> >This is a preparation before I send v3 of the ich6-ide controller emulation
>> >patch. I figured that it's more trivial to split the changes this way, by
>> >extracting the bmdma functions from via.c and piix.c and sharing them
>> >together. Then, I could easily put these into use when I send v3 of the
>> >ich6-ide patch by just using the already separated functions. This was
>> >suggested by BALATON Zoltan when he submitted a code review on my ich6-ide
>> >controller emulation patch.
>>
>> Ping. Any news?
>
>*cough*.
>
>Has this been folded into subsequent series, or does this still need attention?

Both piix and via still have their own bmdma implementations. This patch might 
be worth having.

Best regards,
Bernhard

>
>>
>> >Liav Albani (1):
>> >  hw/ide: share bmdma read and write functions between piix.c and via.c
>> >
>> > hw/ide/pci.c | 47 
>> > hw/ide/piix.c| 50 ++-
>> > hw/ide/via.c | 51 ++--
>> > include/hw/ide/pci.h |  4 
>> > 4 files changed, 55 insertions(+), 97 deletions(-)
>> >
>>
>

Re: [PATCH 2/2] target/riscv/cpu.c: do not skip misa logic in riscv_cpu_realize()

2023-01-10 Thread Alistair Francis

On Wed, Jan 11, 2023 at 6:17 AM Daniel Henrique Barboza
 wrote:
>
> All RISCV CPUs are setting cpu->cfg during their cpu_init() functions,
> meaning that there's no reason to skip all the misa validation and setup
> if misa_ext was set beforehand - especially since we're setting an
> updated value in set_misa() in the end.
>
> Put this code chunk into a new riscv_cpu_validate_set_extensions()
> helper and always execute it regardless of what the board set in
> env->misa_ext.
>
> This will put more responsibility in how each board is going to init
> their attributes and extensions if they're not using the defaults.
> It'll also allow realize() to do its job looking only at the extensions
> enabled per se, not corner cases that some CPUs might have, and we won't
> have to change multiple code paths to fix or change how extensions work.
>
> Signed-off-by: Daniel Henrique Barboza 

Reviewed-by: Alistair Francis 

Alistair
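
For readers following the validation rules in the quoted diff, here is a
compact Python sketch of the same kind of consistency check. It covers only a
subset of the rules visible in the diff, reuses their error strings, and
represents the configuration as a plain set of lowercase extension names.
validate_extensions is a hypothetical name and the set-based representation is
an assumption, since the real code uses the cpu->cfg.ext_* booleans.

```python
def validate_extensions(ext):
    """Raise ValueError on the first inconsistent combination found.

    `ext` is a set of lowercase extension names, e.g. {"i", "m", "zicsr"}.
    """
    rules = [
        ("i" in ext and "e" in ext,
         "I and E extensions are incompatible"),
        ("i" not in ext and "e" not in ext,
         "Either I or E extension must be set"),
        ("s" in ext and "u" not in ext,
         "Setting S extension without U extension is illegal"),
        ("h" in ext and "i" not in ext,
         "H depends on an I base integer ISA with 32 x registers"),
        ("h" in ext and "s" not in ext,
         "H extension implicitly requires S-mode"),
        ("f" in ext and "zicsr" not in ext,
         "F extension requires Zicsr"),
        ("d" in ext and "f" not in ext,
         "D extension requires F extension"),
        ("v" in ext and "d" not in ext,
         "V extension requires D extension"),
    ]
    for violated, message in rules:
        if violated:
            raise ValueError(message)


# A consistent configuration passes silently.
validate_extensions({"i", "m", "a", "f", "d", "c", "s", "u", "zicsr"})
```

Running every CPU through one such function, rather than skipping it when
misa_ext is already set, is the behaviour change the commit message argues
for.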

> ---
>  target/riscv/cpu.c | 485 +++--
>  1 file changed, 248 insertions(+), 237 deletions(-)
>
> diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
> index b8c1edb7c2..33ed59a1b6 100644
> --- a/target/riscv/cpu.c
> +++ b/target/riscv/cpu.c
> @@ -631,6 +631,250 @@ static void riscv_cpu_disas_set_info(CPUState *s, 
> disassemble_info *info)
>  }
>  }
>
> +/*
> + * Check consistency between chosen extensions while setting
> + * cpu->cfg accordingly, doing a set_misa() in the end.
> + */
> +static void riscv_cpu_validate_set_extensions(RISCVCPU *cpu, Error **errp)
> +{
> +CPURISCVState *env = &cpu->env;
> +uint32_t ext = 0;
> +
> +/* Do some ISA extension error checking */
> +if (cpu->cfg.ext_g && !(cpu->cfg.ext_i && cpu->cfg.ext_m &&
> +cpu->cfg.ext_a && cpu->cfg.ext_f &&
> +cpu->cfg.ext_d &&
> +cpu->cfg.ext_icsr && cpu->cfg.ext_ifencei)) {
> +warn_report("Setting G will also set IMAFD_Zicsr_Zifencei");
> +cpu->cfg.ext_i = true;
> +cpu->cfg.ext_m = true;
> +cpu->cfg.ext_a = true;
> +cpu->cfg.ext_f = true;
> +cpu->cfg.ext_d = true;
> +cpu->cfg.ext_icsr = true;
> +cpu->cfg.ext_ifencei = true;
> +}
> +
> +if (cpu->cfg.ext_i && cpu->cfg.ext_e) {
> +error_setg(errp,
> +   "I and E extensions are incompatible");
> +return;
> +}
> +
> +if (!cpu->cfg.ext_i && !cpu->cfg.ext_e) {
> +error_setg(errp,
> +   "Either I or E extension must be set");
> +return;
> +}
> +
> +if (cpu->cfg.ext_s && !cpu->cfg.ext_u) {
> +error_setg(errp,
> +   "Setting S extension without U extension is illegal");
> +return;
> +}
> +
> +if (cpu->cfg.ext_h && !cpu->cfg.ext_i) {
> +error_setg(errp,
> +   "H depends on an I base integer ISA with 32 x registers");
> +return;
> +}
> +
> +if (cpu->cfg.ext_h && !cpu->cfg.ext_s) {
> +error_setg(errp, "H extension implicitly requires S-mode");
> +return;
> +}
> +
> +if (cpu->cfg.ext_f && !cpu->cfg.ext_icsr) {
> +error_setg(errp, "F extension requires Zicsr");
> +return;
> +}
> +
> +if ((cpu->cfg.ext_zawrs) && !cpu->cfg.ext_a) {
> +error_setg(errp, "Zawrs extension requires A extension");
> +return;
> +}
> +
> +if ((cpu->cfg.ext_zfh || cpu->cfg.ext_zfhmin) && !cpu->cfg.ext_f) {
> +error_setg(errp, "Zfh/Zfhmin extensions require F extension");
> +return;
> +}
> +
> +if (cpu->cfg.ext_d && !cpu->cfg.ext_f) {
> +error_setg(errp, "D extension requires F extension");
> +return;
> +}
> +
> +if (cpu->cfg.ext_v && !cpu->cfg.ext_d) {
> +error_setg(errp, "V extension requires D extension");
> +return;
> +}
> +
> +if ((cpu->cfg.ext_zve32f || cpu->cfg.ext_zve64f) && !cpu->cfg.ext_f) {
> +error_setg(errp, "Zve32f/Zve64f extensions require F extension");
> +return;
> +}
> +
> +/* Set the ISA extensions, checks should have happened above */
> +if (cpu->cfg.ext_zdinx || cpu->cfg.ext_zhinx ||
> +cpu->cfg.ext_zhinxmin) {
> +cpu->cfg.ext_zfinx = true;
> +}
> +
> +if (cpu->cfg.ext_zfinx) {
> +if (!cpu->cfg.ext_icsr) {
> +error_setg(errp, "Zfinx extension requires Zicsr");
> +return;
> +}
> +if (cpu->cfg.ext_f) {
> +error_setg(errp,
> +"Zfinx cannot be supported together with F extension");
> +return;
> +}
> +}
> +
> +if (cpu->cfg.ext_c) {
> +cpu->cfg.ext_zca = true;
> +if (cpu->cfg.ext_f && env->misa_mxl_max == MXL_RV32) {
> +cpu->cfg.ext_zcf = true;
> +}
> +if (cpu->cfg.ext_d) {
> +cpu->cfg.ext_zcd = true;
> +}
> +}
> +
> +if (env->misa_mxl_max != MXL_RV32 && cpu->cfg.ext_z

Re: [PATCH 1/2] target/riscv/cpu: set cpu->cfg in register_cpu_props()

2023-01-10 Thread Alistair Francis

On Wed, Jan 11, 2023 at 6:17 AM Daniel Henrique Barboza
 wrote:
>
> There is an informal contract between the cpu_init() functions and
> riscv_cpu_realize(): if cpu->env.misa_ext is zero, assume that the
> default settings were loaded via register_cpu_props() and do validations
> to set env.misa_ext.  If it's not zero, skip this whole process and
> assume that the board somehow did everything.
>
> At this moment, all SiFive CPUs are setting a non-zero misa_ext during
> their cpu_init() and skipping a good chunk of riscv_cpu_realize().
> This causes problems when the code being skipped in riscv_cpu_realize()
> contains fixes or assumptions that affect all CPUs, meaning that SiFive
> CPUs are missing out.
>
> To allow this code to not be skipped anymore, all the cpu->cfg.ext_*
> attributes need to be set during cpu_init() time. At this moment this is
> being done in register_cpu_props(). The SiFive boards are setting their own
> extensions during cpu_init() though, meaning that they don't want all the
> defaults from register_cpu_props().
>
> Let's move the contract between *_cpu_init() and riscv_cpu_realize() to
> register_cpu_props(). Inside this function we'll check if cpu->env.misa_ext
> was set and, if that's the case, set all relevant cpu->cfg.ext_*
> attributes, and only that. Leave the 'misa_ext' = 0 case as is today,
> i.e. loading all the defaults from riscv_cpu_extensions[].
>
> register_cpu_props() can then be called by all the cpu_init() functions,
> including the SiFive ones. This will make all CPUs behave more in line
> with what riscv_cpu_realize() expects.
>
> Signed-off-by: Daniel Henrique Barboza 

Reviewed-by: Alistair Francis 

Alistair
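
The misa_ext to cpu->cfg.ext_* translation described in the commit message can
be sketched in Python. The sketch assumes the usual misa encoding, in which
each single-letter extension occupies the bit given by its position in the
alphabet ('A' is bit 0). rv and cfg_from_misa are hypothetical helper names,
and only the letters whose assignments are fully visible in the quoted hunk
are included.

```python
def rv(letter):
    """Bit for a single-letter extension: 'A' is bit 0, 'Z' is bit 25."""
    return 1 << (ord(letter) - ord("A"))


# Letters with complete assignments in the quoted hunk; the hunk is cut off
# after these, so nothing further is assumed here.
LETTERS = "IEMAFDVCSU"


def cfg_from_misa(misa_ext):
    """Derive ext_* booleans from a misa extension bitmask."""
    return {f"ext_{c.lower()}": bool(misa_ext & rv(c)) for c in LETTERS}


# Extension set used by rv64_sifive_u_cpu_init() in the diff.
sifive_u = sum(rv(c) for c in "IMAFDCSU")
cfg = cfg_from_misa(sifive_u)
print(sorted(name for name, enabled in cfg.items() if enabled))
```

With the flags derived this way, a CPU that sets misa_ext in its init function
exposes the same cfg view that the default-properties path would have
produced for those letters.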

> ---
>  target/riscv/cpu.c | 40 
>  target/riscv/cpu.h |  4 
>  2 files changed, 44 insertions(+)
>
> diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
> index ee3659cc7e..b8c1edb7c2 100644
> --- a/target/riscv/cpu.c
> +++ b/target/riscv/cpu.c
> @@ -262,6 +262,7 @@ static void rv64_sifive_u_cpu_init(Object *obj)
>  {
>  CPURISCVState *env = &RISCV_CPU(obj)->env;
>  set_misa(env, MXL_RV64, RVI | RVM | RVA | RVF | RVD | RVC | RVS | RVU);
> +register_cpu_props(DEVICE(obj));
>  set_priv_version(env, PRIV_VERSION_1_10_0);
>  }
>
> @@ -271,6 +272,7 @@ static void rv64_sifive_e_cpu_init(Object *obj)
>  RISCVCPU *cpu = RISCV_CPU(obj);
>
>  set_misa(env, MXL_RV64, RVI | RVM | RVA | RVC | RVU);
> +register_cpu_props(DEVICE(obj));
>  set_priv_version(env, PRIV_VERSION_1_10_0);
>  cpu->cfg.mmu = false;
>  }
> @@ -305,6 +307,7 @@ static void rv32_sifive_u_cpu_init(Object *obj)
>  {
>  CPURISCVState *env = &RISCV_CPU(obj)->env;
>  set_misa(env, MXL_RV32, RVI | RVM | RVA | RVF | RVD | RVC | RVS | RVU);
> +register_cpu_props(DEVICE(obj));
>  set_priv_version(env, PRIV_VERSION_1_10_0);
>  }
>
> @@ -314,6 +317,7 @@ static void rv32_sifive_e_cpu_init(Object *obj)
>  RISCVCPU *cpu = RISCV_CPU(obj);
>
>  set_misa(env, MXL_RV32, RVI | RVM | RVA | RVC | RVU);
> +register_cpu_props(DEVICE(obj));
>  set_priv_version(env, PRIV_VERSION_1_10_0);
>  cpu->cfg.mmu = false;
>  }
> @@ -324,6 +328,7 @@ static void rv32_ibex_cpu_init(Object *obj)
>  RISCVCPU *cpu = RISCV_CPU(obj);
>
>  set_misa(env, MXL_RV32, RVI | RVM | RVC | RVU);
> +register_cpu_props(DEVICE(obj));
>  set_priv_version(env, PRIV_VERSION_1_11_0);
>  cpu->cfg.mmu = false;
>  cpu->cfg.epmp = true;
> @@ -335,6 +340,7 @@ static void rv32_imafcu_nommu_cpu_init(Object *obj)
>  RISCVCPU *cpu = RISCV_CPU(obj);
>
>  set_misa(env, MXL_RV32, RVI | RVM | RVA | RVF | RVC | RVU);
> +register_cpu_props(DEVICE(obj));
>  set_priv_version(env, PRIV_VERSION_1_10_0);
>  cpu->cfg.mmu = false;
>  }
> @@ -1139,10 +1145,44 @@ static Property riscv_cpu_extensions[] = {
>  DEFINE_PROP_END_OF_LIST(),
>  };
>
> +/*
> + * Register CPU props based on env.misa_ext. If a non-zero
> + * value was set, register only the required cpu->cfg.ext_*
> + * properties and leave. env.misa_ext = 0 means that we want
> + * all the default properties to be registered.
> + */
>  static void register_cpu_props(DeviceState *dev)
>  {
> +RISCVCPU *cpu = RISCV_CPU(OBJECT(dev));
> +uint32_t misa_ext = cpu->env.misa_ext;
>  Property *prop;
>
> +/*
> + * If misa_ext is not zero, set cfg properties now to
> + * allow them to be read during riscv_cpu_realize()
> + * later on.
> + */
> +if (cpu->env.misa_ext != 0) {
> +cpu->cfg.ext_i = misa_ext & RVI;
> +cpu->cfg.ext_e = misa_ext & RVE;
> +cpu->cfg.ext_m = misa_ext & RVM;
> +cpu->cfg.ext_a = misa_ext & RVA;
> +cpu->cfg.ext_f = misa_ext & RVF;
> +cpu->cfg.ext_d = misa_ext & RVD;
> +cpu->cfg.ext_v = misa_ext & RVV;
> +cpu->cfg.ext_c = misa_ext & RVC;
> +cpu->cfg.ext_s = misa_ext & RVS;
> +cpu->cfg.ext_u = misa_ext & RVU;
> +cpu->cfg.ext_h
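
A note on the quoted register_cpu_props() hunk above: each cfg.ext_* field
receives the result of masking misa_ext with a single extension bit. The
sketch below is a minimal, self-contained illustration of that mapping. It
assumes the cfg fields are bool and that the bit for extension letter X is
bit (X - 'A'), as in the RISC-V misa layout. struct ext_cfg and
cfg_from_misa() are hypothetical names for illustration, not QEMU code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: the misa bit for extension letter X is bit (X - 'A'). */
#define RV(x) ((uint32_t)1 << ((x) - 'A'))
#define RVI RV('I')
#define RVM RV('M')
#define RVF RV('F')
#define RVC RV('C')
#define RVU RV('U')

/* Hypothetical stand-in for a few of the cpu->cfg.ext_* fields. */
struct ext_cfg {
    bool ext_i;
    bool ext_m;
    bool ext_f;
    bool ext_c;
    bool ext_u;
};

/* Derive the per-extension flags from a misa_ext bitmask. */
static void cfg_from_misa(struct ext_cfg *cfg, uint32_t misa_ext)
{
    assert(cfg != NULL);

    /* A non-zero masked value converts to true when stored in a bool. */
    cfg->ext_i = misa_ext & RVI;
    cfg->ext_m = misa_ext & RVM;
    cfg->ext_f = misa_ext & RVF;
    cfg->ext_c = misa_ext & RVC;
    cfg->ext_u = misa_ext & RVU;
}
```

Calling cfg_from_misa(&cfg, RVI | RVM | RVC | RVU) sets ext_i, ext_m, ext_c
and ext_u and leaves ext_f false. If the fields were a narrower integer type
instead of bool, the masked value could be truncated, so the bool assumption
matters here.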

Re: [PATCH v7 6/7] mac_newworld: Deprecate mac99 "via" option

2023-01-10 Thread Mark Cave-Ayland


On 04/01/2023 21:59, BALATON Zoltan wrote:


Setting emulated machine type with a property called "via" is
confusing users so deprecate the "via" option in favour of newly added
explicit machine types. The default via=cuda option is not a valid
config (no real Mac has this combination of hardware) so no machine
type could be defined for that therefore it is kept for backwards
compatibility with older QEMU versions for now but other options
resembling real machines are deprecated.

Signed-off-by: BALATON Zoltan 


I believe that people do use -M mac99,via=cuda to run some rare versions of MacOS in 
QEMU (I think possibly OS X DP and Workgroup Server?), so we would want to keep this 
option somewhere.



---
  hw/ppc/mac_newworld.c | 9 +
  1 file changed, 9 insertions(+)

diff --git a/hw/ppc/mac_newworld.c b/hw/ppc/mac_newworld.c
index f07c37328b..adf185bd3a 100644
--- a/hw/ppc/mac_newworld.c
+++ b/hw/ppc/mac_newworld.c
@@ -169,6 +169,15 @@ static void ppc_core99_init(MachineState *machine)
  if (PPC_INPUT(env) == PPC_FLAGS_INPUT_970) {
  warn_report("mac99 with G5 CPU is deprecated, "
  "use powermac7_3 instead");
+} else {
+if (core99_machine->via_config == CORE99_VIA_CONFIG_PMU) {
+warn_report("mac99,via=pmu is deprecated, "
+"use powermac3_1 instead");
+}
+if (core99_machine->via_config == CORE99_VIA_CONFIG_PMU_ADB) {
+warn_report("mac99,via=pmu-adb is deprecated, "
+"use powerbook3_2 instead");
+}
  }
  }
  /* allocate RAM */



ATB,

Mark.
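
For reference, the mapping that the quoted hunk encodes can be sketched as a
small lookup. This is only an illustration under stated assumptions: enum
via_config and via_replacement() are hypothetical names, and the enum merely
mirrors the CORE99_VIA_CONFIG_* values seen in the patch. The machine names
are the ones used in the warn_report() strings above.

```c
#include <stddef.h>

/* Hypothetical mirror of the CORE99_VIA_CONFIG_* values in the patch. */
enum via_config {
    VIA_CUDA,
    VIA_PMU,
    VIA_PMU_ADB
};

/*
 * Return the machine type suggested as a replacement for a given
 * mac99 "via" setting, or NULL when the patch keeps the option
 * without a deprecation warning (via=cuda).
 */
static const char *via_replacement(enum via_config via)
{
    switch (via) {
    case VIA_PMU:
        return "powermac3_1";
    case VIA_PMU_ADB:
        return "powerbook3_2";
    case VIA_CUDA:
    default:
        return NULL;
    }
}
```

With this shape, via=cuda yields NULL, which matches the patch leaving that
option in place for backwards compatibility.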

Re: [PATCH v10 3/9] KVM: Extend the memslot to support fd-based private memory

2023-01-10 Thread Vishal Annapurve

On Tue, Jan 10, 2023 at 1:19 AM Chao Peng  wrote:
> >
> > Regarding the userspace side of things, please include Vishal's selftests in v11,
> > it's impossible to properly review the uAPI changes without seeing the userspace
> > side of things.  I'm in the process of reviewing Vishal's v2[*], I'll try to
> > massage it into a set of patches that you can incorporate into your series.
>
> Previously I included Vishal's selftests in the github repo, but did not
> include them in this patch series. It's OK for me to incorporate them
> directly into this series and review them together if Vishal is fine with that.
>

Yeah, I am ok with incorporating selftest patches into this series and
reviewing them together.

Regards,
Vishal

> Chao
> >
> > [*] 
> > https://lore.kernel.org/all/20221205232341.4131240-1-vannapu...@google.com

Re: [PATCH v7 4/7] mac_newworld: Add machine types for different mac99 configs

2023-01-10 Thread Mark Cave-Ayland


On 04/01/2023 21:59, BALATON Zoltan wrote:


The mac99 machine emulates different machines depending on machine
properties or even if it is run as qemu-system-ppc64 or
qemu-system-ppc. This is very confusing for users and many hours were
lost trying to explain it or finding out why commands users came up
with are not working as expected. (E.g. Windows users might think
qemu-system-ppc64 is just the 64 bit version of qemu-system-ppc and
then fail to boot a 32 bit OS with -M mac99 trying to follow an
example that had qemu-system-ppc.) To avoid such confusion, add
explicit machine types for the different configs which will work the
same with both qemu-system-ppc and qemu-system-ppc64 and also make the
command line clearer for new users.

Signed-off-by: BALATON Zoltan 


Some thoughts on this: the first is that not everyone agrees that the X in 
qemu-system-X represents the target. There were previous discussions where some KVM 
people assumed X represented the host, i.e. ppc64 was the binary that ran all PPC 
guests but with hardware acceleration for ppc64 guests on ppc64 hosts. This was a 
while ago, so it may be worth starting a thread on qemu-devel to see what the current 
consensus is.


Secondly, it's not clear to me why you've chosen names like "powermac3_1" instead of 
"g4agp". Does powermac3_1 uniquely identify the G4 AGP Sawtooth model? For QEMU it 
is always best to emulate real machines, and whilst I understand you want to separate 
out the two versions of the mac99 machine, having "powermacX_Y" seems less clear to me.


Finally, can you post links to the device trees that you are using for each of the new 
machine types so that we have a clear reference point for future changes to the QEMU 
Mac machines? Even better, include the links in the comments for each machine so that 
the information is easily visible to developers.



---
  hw/ppc/mac_newworld.c | 94 +++
  1 file changed, 94 insertions(+)

diff --git a/hw/ppc/mac_newworld.c b/hw/ppc/mac_newworld.c
index 60c9c27986..3f5d1ec097 100644
--- a/hw/ppc/mac_newworld.c
+++ b/hw/ppc/mac_newworld.c
@@ -642,9 +642,103 @@ static const TypeInfo core99_machine_info = {
  },
  };
  
+static void powermac3_1_machine_class_init(ObjectClass *oc, void *data)
+{
+MachineClass *mc = MACHINE_CLASS(oc);
+
+core99_machine_class_init(oc, data);
+mc->desc = "Apple Power Mac G4 AGP (Sawtooth)";
+mc->default_cpu_type = POWERPC_CPU_TYPE_NAME("7400_v2.9");
+}
+
+static void powermac3_1_instance_init(Object *obj)
+{
+Core99MachineState *cms = CORE99_MACHINE(obj);
+
+cms->via_config = CORE99_VIA_CONFIG_PMU;
+return;
+}
+
+static const TypeInfo powermac3_1_machine_info = {
+.name  = MACHINE_TYPE_NAME("powermac3_1"),
+.parent= TYPE_MACHINE,
+.class_init= powermac3_1_machine_class_init,
+.instance_init = powermac3_1_instance_init,
+.instance_size = sizeof(Core99MachineState),
+.interfaces = (InterfaceInfo[]) {
+{ TYPE_FW_PATH_PROVIDER },
+{ }
+},
+};
+
+static void powerbook3_2_machine_class_init(ObjectClass *oc, void *data)
+{
+MachineClass *mc = MACHINE_CLASS(oc);
+
+core99_machine_class_init(oc, data);
+mc->desc = "Apple PowerBook G4 Titanium (Mercury)";
+mc->default_cpu_type = POWERPC_CPU_TYPE_NAME("7400_v2.9");
+}
+
+static void powerbook3_2_instance_init(Object *obj)
+{
+Core99MachineState *cms = CORE99_MACHINE(obj);
+
+cms->via_config = CORE99_VIA_CONFIG_PMU_ADB;
+return;
+}
+
+static const TypeInfo powerbook3_2_machine_info = {
+.name  = MACHINE_TYPE_NAME("powerbook3_2"),
+.parent= TYPE_MACHINE,
+.class_init= powerbook3_2_machine_class_init,
+.instance_init = powerbook3_2_instance_init,
+.instance_size = sizeof(Core99MachineState),
+.interfaces = (InterfaceInfo[]) {
+{ TYPE_FW_PATH_PROVIDER },
+{ }
+},
+};
+
+#ifdef TARGET_PPC64
+static void powermac7_3_machine_class_init(ObjectClass *oc, void *data)
+{
+MachineClass *mc = MACHINE_CLASS(oc);
+
+core99_machine_class_init(oc, data);
+mc->desc = "Apple Power Mac G5 (Niagara)";
+mc->default_cpu_type = POWERPC_CPU_TYPE_NAME("970fx_v3.1");
+}
+
+static void powermac7_3_instance_init(Object *obj)
+{
+Core99MachineState *cms = CORE99_MACHINE(obj);
+
+cms->via_config = CORE99_VIA_CONFIG_PMU;
+return;
+}
+
+static const TypeInfo powermac7_3_machine_info = {
+.name  = MACHINE_TYPE_NAME("powermac7_3"),
+.parent= TYPE_MACHINE,
+.class_init= powermac7_3_machine_class_init,
+.instance_init = powermac7_3_instance_init,
+.instance_size = sizeof(Core99MachineState),
+.interfaces = (InterfaceInfo[]) {
+{ TYPE_FW_PATH_PROVIDER },
+{ }
+},
+};
+#endif
+
  static void mac_machine_register_types(void)
  {
  type_register_static(&core99_machine_info);
+type_register_static(&powermac3_1_machine_info);
+type_regi

Re: [PATCH v5 10/11] hw/riscv/boot.c: consolidate all kernel init in riscv_load_kernel()

2023-01-10 Thread Alistair Francis

On Wed, Jan 11, 2023 at 6:21 AM Daniel Henrique Barboza
 wrote:
>
>
>
> On 1/10/23 08:43, Daniel Henrique Barboza wrote:
> >
> >
> > On 1/8/23 00:33, Bin Meng wrote:
> >> On Mon, Jan 2, 2023 at 7:55 PM Daniel Henrique Barboza
> >>  wrote:
> >>> The microchip_icicle_kit, sifive_u, spike and virt boards are now doing
> >>> the same steps when '-kernel' is used:
> >>>
> >>> - execute load_kernel()
> >>> - load init_rd()
> >>> - write kernel_cmdline
> >>>
> >>> Let's fold everything inside riscv_load_kernel() to avoid code
> >>> repetition. To not change the behavior of boards that aren't calling
> >>> riscv_load_init(), add an 'load_initrd' flag to riscv_load_kernel() and
> >> typo: should be riscv_load_initrd()
> >>
> >>> allow these boards to opt out from initrd loading.
> >>>
> >>> Cc: Palmer Dabbelt 
> >>> Signed-off-by: Daniel Henrique Barboza 
> >>> ---
> >>>   hw/riscv/boot.c| 22 +++---
> >>>   hw/riscv/microchip_pfsoc.c | 12 ++--
> >>>   hw/riscv/opentitan.c   |  2 +-
> >>>   hw/riscv/sifive_e.c|  3 ++-
> >>>   hw/riscv/sifive_u.c| 12 ++--
> >>>   hw/riscv/spike.c   | 11 +--
> >>>   hw/riscv/virt.c| 12 ++--
> >>>   include/hw/riscv/boot.h|  1 +
> >>>   8 files changed, 30 insertions(+), 45 deletions(-)
> >>>
> >> Otherwise,
> >> Reviewed-by: Bin Meng 
> >
> > Thanks!
> >
> > Alistair, let me know if you want me to send another version with the commit
> > message typo fixed. I might as well take the chance to rebase it on
> > riscv-to-apply.next.
>
> While rebasing these patches on top of riscv-to-apply.next, both sifive_u
> avocado tests that I introduced here started to fail:
>
> tests/avocado/riscv_opensbi.py:RiscvOpenSBI.test_riscv32_sifive_u: INTERRUPTED:
> Test interrupted by SIGTERM\nRunner error occurred: ... (5.07 s)
>   (09/18) tests/avocado/riscv_opensbi.py:RiscvOpenSBI.test_riscv64_sifive_u: INTERRUPTED:
> Test interrupted by SIGTERM\nRunner error occurred: ... (5.05 s)
>
>
> I proposed a fix here:
>
> https://lists.gnu.org/archive/html/qemu-devel/2023-01/msg02035.html

Thanks!

I generally push riscv-to-apply.next before running tests, so it's
possible for it to break. I'm seeing similar failures.

Generally when I see failures from a series I just drop the series,
but if you have a fix that's even better :)

Alistair

>
> I can re-send this series after we get that problem figured out. Otherwise we're
> going to add 2 avocado tests that are failing right from the start hehe.
>
> Thanks,
>
> Daniel
>
>
> >
> >
> > Daniel
> >
>
>

Re: [PATCH v5 10/11] hw/riscv/boot.c: consolidate all kernel init in riscv_load_kernel()

2023-01-10 Thread Alistair Francis

On Mon, Jan 2, 2023 at 9:55 PM Daniel Henrique Barboza
 wrote:
>
> The microchip_icicle_kit, sifive_u, spike and virt boards are now doing
> the same steps when '-kernel' is used:
>
> - execute load_kernel()
> - load init_rd()
> - write kernel_cmdline
>
> Let's fold everything inside riscv_load_kernel() to avoid code
> repetition. To not change the behavior of boards that aren't calling
> riscv_load_initrd(), add a 'load_initrd' flag to riscv_load_kernel() and
> allow these boards to opt out from initrd loading.
>
> Cc: Palmer Dabbelt 
> Signed-off-by: Daniel Henrique Barboza 

Reviewed-by: Alistair Francis 

Alistair
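
To make the consolidated flow easier to picture, here is a minimal
stand-alone sketch of the control flow described in the commit message. It
is not the QEMU implementation: struct machine and load_kernel() are
hypothetical stand-ins, the three image loaders are collapsed into "the load
succeeded at kernel_start_addr", and the FDT is reduced to a has_fdt flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, heavily reduced stand-in for MachineState. */
struct machine {
    const char *kernel_filename;
    const char *initrd_filename;
    const char *kernel_cmdline;
    bool has_fdt;
    /* Records of what the loader did, so callers can inspect it. */
    bool initrd_loaded;
    bool bootargs_written;
};

/*
 * Load the kernel, then optionally the initrd, then record bootargs.
 * Boards that never loaded an initrd pass load_initrd = false.
 */
static uint64_t load_kernel(struct machine *m, uint64_t kernel_start_addr,
                            bool load_initrd)
{
    uint64_t kernel_entry;

    assert(m->kernel_filename != NULL);

    /* Stand-in for the ELF/uImage/raw attempts: assume the load worked. */
    kernel_entry = kernel_start_addr;

    if (load_initrd && m->initrd_filename) {
        m->initrd_loaded = true;
    }

    /* bootargs only make sense with an FDT and a non-empty command line. */
    if (m->has_fdt && m->kernel_cmdline && *m->kernel_cmdline) {
        m->bootargs_written = true;
    }

    return kernel_entry;
}
```

A board in the style of virt would call load_kernel(&m, addr, true), while a
board without an FDT, in the style of opentitan, would pass false and end up
with neither the initrd nor bootargs touched.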

> ---
>  hw/riscv/boot.c| 22 +++---
>  hw/riscv/microchip_pfsoc.c | 12 ++--
>  hw/riscv/opentitan.c   |  2 +-
>  hw/riscv/sifive_e.c|  3 ++-
>  hw/riscv/sifive_u.c| 12 ++--
>  hw/riscv/spike.c   | 11 +--
>  hw/riscv/virt.c| 12 ++--
>  include/hw/riscv/boot.h|  1 +
>  8 files changed, 30 insertions(+), 45 deletions(-)
>
> diff --git a/hw/riscv/boot.c b/hw/riscv/boot.c
> index 2594276223..4888d5c1e0 100644
> --- a/hw/riscv/boot.c
> +++ b/hw/riscv/boot.c
> @@ -175,10 +175,12 @@ target_ulong riscv_load_firmware(const char 
> *firmware_filename,
>
>  target_ulong riscv_load_kernel(MachineState *machine,
> target_ulong kernel_start_addr,
> +   bool load_initrd,
> symbol_fn_t sym_cb)
>  {
>  const char *kernel_filename = machine->kernel_filename;
>  uint64_t kernel_load_base, kernel_entry;
> +void *fdt = machine->fdt;
>
>  g_assert(kernel_filename != NULL);
>
> @@ -192,21 +194,35 @@ target_ulong riscv_load_kernel(MachineState *machine,
>  if (load_elf_ram_sym(kernel_filename, NULL, NULL, NULL,
>   NULL, &kernel_load_base, NULL, NULL, 0,
>   EM_RISCV, 1, 0, NULL, true, sym_cb) > 0) {
> -return kernel_load_base;
> +kernel_entry = kernel_load_base;
> +goto out;
>  }
>
>  if (load_uimage_as(kernel_filename, &kernel_entry, NULL, NULL,
> NULL, NULL, NULL) > 0) {
> -return kernel_entry;
> +goto out;
>  }
>
>  if (load_image_targphys_as(kernel_filename, kernel_start_addr,
> current_machine->ram_size, NULL) > 0) {
> -return kernel_start_addr;
> +kernel_entry = kernel_start_addr;
> +goto out;
>  }
>
>  error_report("could not load kernel '%s'", kernel_filename);
>  exit(1);
> +
> +out:
> +if (load_initrd && machine->initrd_filename) {
> +riscv_load_initrd(machine, kernel_entry);
> +}
> +
> +if (fdt && machine->kernel_cmdline && *machine->kernel_cmdline) {
> +qemu_fdt_setprop_string(fdt, "/chosen", "bootargs",
> +machine->kernel_cmdline);
> +}
> +
> +return kernel_entry;
>  }
>
>  void riscv_load_initrd(MachineState *machine, uint64_t kernel_entry)
> diff --git a/hw/riscv/microchip_pfsoc.c b/hw/riscv/microchip_pfsoc.c
> index 82ae5e7023..c45023a2b1 100644
> --- a/hw/riscv/microchip_pfsoc.c
> +++ b/hw/riscv/microchip_pfsoc.c
> @@ -629,16 +629,8 @@ static void 
> microchip_icicle_kit_machine_init(MachineState *machine)
>  kernel_start_addr = riscv_calc_kernel_start_addr(&s->soc.u_cpus,
>   firmware_end_addr);
>
> -kernel_entry = riscv_load_kernel(machine, kernel_start_addr, NULL);
> -
> -if (machine->initrd_filename) {
> -riscv_load_initrd(machine, kernel_entry);
> -}
> -
> -if (machine->kernel_cmdline && *machine->kernel_cmdline) {
> -qemu_fdt_setprop_string(machine->fdt, "/chosen",
> -"bootargs", machine->kernel_cmdline);
> -}
> +kernel_entry = riscv_load_kernel(machine, kernel_start_addr,
> + true, NULL);
>
>  /* Compute the fdt load address in dram */
>  fdt_load_addr = riscv_load_fdt(memmap[MICROCHIP_PFSOC_DRAM_LO].base,
> diff --git a/hw/riscv/opentitan.c b/hw/riscv/opentitan.c
> index 64d5d435b9..f6fd9725a5 100644
> --- a/hw/riscv/opentitan.c
> +++ b/hw/riscv/opentitan.c
> @@ -101,7 +101,7 @@ static void opentitan_board_init(MachineState *machine)
>  }
>
>  if (machine->kernel_filename) {
> -riscv_load_kernel(machine, memmap[IBEX_DEV_RAM].base, NULL);
> +riscv_load_kernel(machine, memmap[IBEX_DEV_RAM].base, false, NULL);
>  }
>  }
>
> diff --git a/hw/riscv/sifive_e.c b/hw/riscv/sifive_e.c
> index 3e3f4b0088..6835d1c807 100644
> --- a/hw/riscv/sifive_e.c
> +++ b/hw/riscv/sifive_e.c
> @@ -114,7 +114,8 @@ static void sifive_e_machine_init(MachineState *machine)
>memmap[SIFIVE_E_DEV_MROM].base, 
> &address_space_memory);
>
>  if (machine->kernel_filename)

Re: [PATCH v5 11/11] hw/riscv/boot.c: make riscv_load_initrd() static

2023-01-10 Thread Alistair Francis

On Mon, Jan 2, 2023 at 9:57 PM Daniel Henrique Barboza
 wrote:
>
> The only remaining caller is riscv_load_kernel() which
> belongs to the same file.
>
> Signed-off-by: Daniel Henrique Barboza 
> Reviewed-by: Philippe Mathieu-Daudé 
> Reviewed-by: Bin Meng 

Reviewed-by: Alistair Francis 

Alistair

> ---
>  hw/riscv/boot.c | 80 -
>  include/hw/riscv/boot.h |  1 -
>  2 files changed, 40 insertions(+), 41 deletions(-)
>
> diff --git a/hw/riscv/boot.c b/hw/riscv/boot.c
> index 4888d5c1e0..e868fb6ade 100644
> --- a/hw/riscv/boot.c
> +++ b/hw/riscv/boot.c
> @@ -173,6 +173,46 @@ target_ulong riscv_load_firmware(const char 
> *firmware_filename,
>  exit(1);
>  }
>
> +static void riscv_load_initrd(MachineState *machine, uint64_t kernel_entry)
> +{
> +const char *filename = machine->initrd_filename;
> +uint64_t mem_size = machine->ram_size;
> +void *fdt = machine->fdt;
> +hwaddr start, end;
> +ssize_t size;
> +
> +g_assert(filename != NULL);
> +
> +/*
> + * We want to put the initrd far enough into RAM that when the
> + * kernel is uncompressed it will not clobber the initrd. However
> + * on boards without much RAM we must ensure that we still leave
> + * enough room for a decent sized initrd, and on boards with large
> + * amounts of RAM we must avoid the initrd being so far up in RAM
> + * that it is outside lowmem and inaccessible to the kernel.
> + * So for boards with less  than 256MB of RAM we put the initrd
> + * halfway into RAM, and for boards with 256MB of RAM or more we put
> + * the initrd at 128MB.
> + */
> +start = kernel_entry + MIN(mem_size / 2, 128 * MiB);
> +
> +size = load_ramdisk(filename, start, mem_size - start);
> +if (size == -1) {
> +size = load_image_targphys(filename, start, mem_size - start);
> +if (size == -1) {
> +error_report("could not load ramdisk '%s'", filename);
> +exit(1);
> +}
> +}
> +
> +/* Some RISC-V machines (e.g. opentitan) don't have a fdt. */
> +if (fdt) {
> +end = start + size;
> +qemu_fdt_setprop_cell(fdt, "/chosen", "linux,initrd-start", start);
> +qemu_fdt_setprop_cell(fdt, "/chosen", "linux,initrd-end", end);
> +}
> +}
> +
>  target_ulong riscv_load_kernel(MachineState *machine,
> target_ulong kernel_start_addr,
> bool load_initrd,
> @@ -225,46 +265,6 @@ out:
>  return kernel_entry;
>  }
>
> -void riscv_load_initrd(MachineState *machine, uint64_t kernel_entry)
> -{
> -const char *filename = machine->initrd_filename;
> -uint64_t mem_size = machine->ram_size;
> -void *fdt = machine->fdt;
> -hwaddr start, end;
> -ssize_t size;
> -
> -g_assert(filename != NULL);
> -
> -/*
> - * We want to put the initrd far enough into RAM that when the
> - * kernel is uncompressed it will not clobber the initrd. However
> - * on boards without much RAM we must ensure that we still leave
> - * enough room for a decent sized initrd, and on boards with large
> - * amounts of RAM we must avoid the initrd being so far up in RAM
> - * that it is outside lowmem and inaccessible to the kernel.
> - * So for boards with less  than 256MB of RAM we put the initrd
> - * halfway into RAM, and for boards with 256MB of RAM or more we put
> - * the initrd at 128MB.
> - */
> -start = kernel_entry + MIN(mem_size / 2, 128 * MiB);
> -
> -size = load_ramdisk(filename, start, mem_size - start);
> -if (size == -1) {
> -size = load_image_targphys(filename, start, mem_size - start);
> -if (size == -1) {
> -error_report("could not load ramdisk '%s'", filename);
> -exit(1);
> -}
> -}
> -
> -/* Some RISC-V machines (e.g. opentitan) don't have a fdt. */
> -if (fdt) {
> -end = start + size;
> -qemu_fdt_setprop_cell(fdt, "/chosen", "linux,initrd-start", start);
> -qemu_fdt_setprop_cell(fdt, "/chosen", "linux,initrd-end", end);
> -}
> -}
> -
>  uint64_t riscv_load_fdt(hwaddr dram_base, uint64_t mem_size, void *fdt)
>  {
>  uint64_t temp, fdt_addr;
> diff --git a/include/hw/riscv/boot.h b/include/hw/riscv/boot.h
> index c3de897371..cbd131bad7 100644
> --- a/include/hw/riscv/boot.h
> +++ b/include/hw/riscv/boot.h
> @@ -47,7 +47,6 @@ target_ulong riscv_load_kernel(MachineState *machine,
> target_ulong firmware_end_addr,
> bool load_initrd,
> symbol_fn_t sym_cb);
> -void riscv_load_initrd(MachineState *machine, uint64_t kernel_entry);
>  uint64_t riscv_load_fdt(hwaddr dram_start, uint64_t dram_size, void *fdt);
>  void riscv_setup_rom_reset_vec(MachineState *machine, RISCVHartArrayState 
> *harts,
> hwaddr saddr,
> --
> 2.39.0
>
>
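
As a worked example of the placement rule in the quoted comment, the start
address computation can be isolated as below. This is a sketch only:
initrd_start() and min_u64() are hypothetical helper names, and the formula
is copied from the quoted line
"start = kernel_entry + MIN(mem_size / 2, 128 * MiB)".

```c
#include <stdint.h>

#define MiB (1024ULL * 1024ULL)

/* Hypothetical helper: smaller of two 64-bit values. */
static uint64_t min_u64(uint64_t a, uint64_t b)
{
    return a < b ? a : b;
}

/*
 * Offset the initrd from the kernel entry by half of RAM, capped at
 * 128 MiB, following the formula in the quoted hunk.
 */
static uint64_t initrd_start(uint64_t kernel_entry, uint64_t mem_size)
{
    return kernel_entry + min_u64(mem_size / 2, 128 * MiB);
}
```

With kernel_entry = 0x80200000 and 128 MiB of RAM the offset is 64 MiB, giving
0x84200000. With 256 MiB or more the offset is capped at 128 MiB, giving
0x88200000.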

Re: [PATCH v5 09/11] hw/riscv/boot.c: use MachineState in riscv_load_kernel()

2023-01-10 Thread Alistair Francis

On Mon, Jan 2, 2023 at 9:55 PM Daniel Henrique Barboza
 wrote:
>
> All callers are using kernel_filename as machine->kernel_filename.
>
> This will also simplify the changes in riscv_load_kernel() that we're
> going to do next.
>
> Cc: Palmer Dabbelt 
> Signed-off-by: Daniel Henrique Barboza 
> Reviewed-by: Philippe Mathieu-Daudé 
> Reviewed-by: Bin Meng 

Reviewed-by: Alistair Francis 

Alistair

> ---
>  hw/riscv/boot.c| 3 ++-
>  hw/riscv/microchip_pfsoc.c | 3 +--
>  hw/riscv/opentitan.c   | 3 +--
>  hw/riscv/sifive_e.c| 3 +--
>  hw/riscv/sifive_u.c| 3 +--
>  hw/riscv/spike.c   | 3 +--
>  hw/riscv/virt.c| 3 +--
>  include/hw/riscv/boot.h| 2 +-
>  8 files changed, 9 insertions(+), 14 deletions(-)
>
> diff --git a/hw/riscv/boot.c b/hw/riscv/boot.c
> index d3e780c3b6..2594276223 100644
> --- a/hw/riscv/boot.c
> +++ b/hw/riscv/boot.c
> @@ -173,10 +173,11 @@ target_ulong riscv_load_firmware(const char 
> *firmware_filename,
>  exit(1);
>  }
>
> -target_ulong riscv_load_kernel(const char *kernel_filename,
> +target_ulong riscv_load_kernel(MachineState *machine,
> target_ulong kernel_start_addr,
> symbol_fn_t sym_cb)
>  {
> +const char *kernel_filename = machine->kernel_filename;
>  uint64_t kernel_load_base, kernel_entry;
>
>  g_assert(kernel_filename != NULL);
> diff --git a/hw/riscv/microchip_pfsoc.c b/hw/riscv/microchip_pfsoc.c
> index 1e9b0a420e..82ae5e7023 100644
> --- a/hw/riscv/microchip_pfsoc.c
> +++ b/hw/riscv/microchip_pfsoc.c
> @@ -629,8 +629,7 @@ static void 
> microchip_icicle_kit_machine_init(MachineState *machine)
>  kernel_start_addr = riscv_calc_kernel_start_addr(&s->soc.u_cpus,
>   firmware_end_addr);
>
> -kernel_entry = riscv_load_kernel(machine->kernel_filename,
> - kernel_start_addr, NULL);
> +kernel_entry = riscv_load_kernel(machine, kernel_start_addr, NULL);
>
>  if (machine->initrd_filename) {
>  riscv_load_initrd(machine, kernel_entry);
> diff --git a/hw/riscv/opentitan.c b/hw/riscv/opentitan.c
> index 85ffdac5be..64d5d435b9 100644
> --- a/hw/riscv/opentitan.c
> +++ b/hw/riscv/opentitan.c
> @@ -101,8 +101,7 @@ static void opentitan_board_init(MachineState *machine)
>  }
>
>  if (machine->kernel_filename) {
> -riscv_load_kernel(machine->kernel_filename,
> -  memmap[IBEX_DEV_RAM].base, NULL);
> +riscv_load_kernel(machine, memmap[IBEX_DEV_RAM].base, NULL);
>  }
>  }
>
> diff --git a/hw/riscv/sifive_e.c b/hw/riscv/sifive_e.c
> index d65d2fd869..3e3f4b0088 100644
> --- a/hw/riscv/sifive_e.c
> +++ b/hw/riscv/sifive_e.c
> @@ -114,8 +114,7 @@ static void sifive_e_machine_init(MachineState *machine)
>memmap[SIFIVE_E_DEV_MROM].base, 
> &address_space_memory);
>
>  if (machine->kernel_filename) {
> -riscv_load_kernel(machine->kernel_filename,
> -  memmap[SIFIVE_E_DEV_DTIM].base, NULL);
> +riscv_load_kernel(machine, memmap[SIFIVE_E_DEV_DTIM].base, NULL);
>  }
>  }
>
> diff --git a/hw/riscv/sifive_u.c b/hw/riscv/sifive_u.c
> index c40885ed5c..bac394c959 100644
> --- a/hw/riscv/sifive_u.c
> +++ b/hw/riscv/sifive_u.c
> @@ -598,8 +598,7 @@ static void sifive_u_machine_init(MachineState *machine)
>  kernel_start_addr = riscv_calc_kernel_start_addr(&s->soc.u_cpus,
>   firmware_end_addr);
>
> -kernel_entry = riscv_load_kernel(machine->kernel_filename,
> - kernel_start_addr, NULL);
> +kernel_entry = riscv_load_kernel(machine, kernel_start_addr, NULL);
>
>  if (machine->initrd_filename) {
>  riscv_load_initrd(machine, kernel_entry);
> diff --git a/hw/riscv/spike.c b/hw/riscv/spike.c
> index 99dec74fe8..bff9475686 100644
> --- a/hw/riscv/spike.c
> +++ b/hw/riscv/spike.c
> @@ -307,8 +307,7 @@ static void spike_board_init(MachineState *machine)
>  kernel_start_addr = riscv_calc_kernel_start_addr(&s->soc[0],
>   firmware_end_addr);
>
> -kernel_entry = riscv_load_kernel(machine->kernel_filename,
> - kernel_start_addr,
> +kernel_entry = riscv_load_kernel(machine, kernel_start_addr,
>   htif_symbol_callback);
>
>  if (machine->initrd_filename) {
> diff --git a/hw/riscv/virt.c b/hw/riscv/virt.c
> index 02f1369843..c8e35f861e 100644
> --- a/hw/riscv/virt.c
> +++ b/hw/riscv/virt.c
> @@ -1281,8 +1281,7 @@ static void virt_machine_done(Notifier *notifier, void 
> *data)
>  kernel_start_addr = riscv_calc_kernel_start_addr(&s->soc[0],
>   firmware_end_addr);

Re: [PATCH v5 08/11] hw/riscv/boot.c: use MachineState in riscv_load_initrd()

2023-01-10 Thread Alistair Francis

On Mon, Jan 2, 2023 at 9:55 PM Daniel Henrique Barboza
 wrote:
>
> 'filename', 'mem_size' and 'fdt' from riscv_load_initrd() can all be
> retrieved by the MachineState object for all callers.
>
> Cc: Palmer Dabbelt 
> Signed-off-by: Daniel Henrique Barboza 
> Reviewed-by: Philippe Mathieu-Daudé 
> Reviewed-by: Bin Meng 

Reviewed-by: Alistair Francis 

Alistair

> ---
>  hw/riscv/boot.c| 6 --
>  hw/riscv/microchip_pfsoc.c | 3 +--
>  hw/riscv/sifive_u.c| 3 +--
>  hw/riscv/spike.c   | 3 +--
>  hw/riscv/virt.c| 3 +--
>  include/hw/riscv/boot.h| 3 +--
>  6 files changed, 9 insertions(+), 12 deletions(-)
>
> diff --git a/hw/riscv/boot.c b/hw/riscv/boot.c
> index 6b948d1c9e..d3e780c3b6 100644
> --- a/hw/riscv/boot.c
> +++ b/hw/riscv/boot.c
> @@ -208,9 +208,11 @@ target_ulong riscv_load_kernel(const char 
> *kernel_filename,
>  exit(1);
>  }
>
> -void riscv_load_initrd(const char *filename, uint64_t mem_size,
> -   uint64_t kernel_entry, void *fdt)
> +void riscv_load_initrd(MachineState *machine, uint64_t kernel_entry)
>  {
> +const char *filename = machine->initrd_filename;
> +uint64_t mem_size = machine->ram_size;
> +void *fdt = machine->fdt;
>  hwaddr start, end;
>  ssize_t size;
>
> diff --git a/hw/riscv/microchip_pfsoc.c b/hw/riscv/microchip_pfsoc.c
> index 593a799549..1e9b0a420e 100644
> --- a/hw/riscv/microchip_pfsoc.c
> +++ b/hw/riscv/microchip_pfsoc.c
> @@ -633,8 +633,7 @@ static void 
> microchip_icicle_kit_machine_init(MachineState *machine)
>   kernel_start_addr, NULL);
>
>  if (machine->initrd_filename) {
> -riscv_load_initrd(machine->initrd_filename, machine->ram_size,
> -  kernel_entry, machine->fdt);
> +riscv_load_initrd(machine, kernel_entry);
>  }
>
>  if (machine->kernel_cmdline && *machine->kernel_cmdline) {
> diff --git a/hw/riscv/sifive_u.c b/hw/riscv/sifive_u.c
> index 3e6df87b5b..c40885ed5c 100644
> --- a/hw/riscv/sifive_u.c
> +++ b/hw/riscv/sifive_u.c
> @@ -602,8 +602,7 @@ static void sifive_u_machine_init(MachineState *machine)
>   kernel_start_addr, NULL);
>
>  if (machine->initrd_filename) {
> -riscv_load_initrd(machine->initrd_filename, machine->ram_size,
> -  kernel_entry, machine->fdt);
> +riscv_load_initrd(machine, kernel_entry);
>  }
>
>  if (machine->kernel_cmdline && *machine->kernel_cmdline) {
> diff --git a/hw/riscv/spike.c b/hw/riscv/spike.c
> index 60e2912be5..99dec74fe8 100644
> --- a/hw/riscv/spike.c
> +++ b/hw/riscv/spike.c
> @@ -312,8 +312,7 @@ static void spike_board_init(MachineState *machine)
>   htif_symbol_callback);
>
>  if (machine->initrd_filename) {
> -riscv_load_initrd(machine->initrd_filename, machine->ram_size,
> -  kernel_entry, machine->fdt);
> +riscv_load_initrd(machine, kernel_entry);
>  }
>
>  if (machine->kernel_cmdline && *machine->kernel_cmdline) {
> diff --git a/hw/riscv/virt.c b/hw/riscv/virt.c
> index 6c946b6def..02f1369843 100644
> --- a/hw/riscv/virt.c
> +++ b/hw/riscv/virt.c
> @@ -1285,8 +1285,7 @@ static void virt_machine_done(Notifier *notifier, void 
> *data)
>   kernel_start_addr, NULL);
>
>  if (machine->initrd_filename) {
> -riscv_load_initrd(machine->initrd_filename, machine->ram_size,
> -  kernel_entry, machine->fdt);
> +riscv_load_initrd(machine, kernel_entry);
>  }
>
>  if (machine->kernel_cmdline && *machine->kernel_cmdline) {
> diff --git a/include/hw/riscv/boot.h b/include/hw/riscv/boot.h
> index e37e1d1238..cfd72ecabf 100644
> --- a/include/hw/riscv/boot.h
> +++ b/include/hw/riscv/boot.h
> @@ -46,8 +46,7 @@ target_ulong riscv_load_firmware(const char 
> *firmware_filename,
>  target_ulong riscv_load_kernel(const char *kernel_filename,
> target_ulong firmware_end_addr,
> symbol_fn_t sym_cb);
> -void riscv_load_initrd(const char *filename, uint64_t mem_size,
> -   uint64_t kernel_entry, void *fdt);
> +void riscv_load_initrd(MachineState *machine, uint64_t kernel_entry);
>  uint64_t riscv_load_fdt(hwaddr dram_start, uint64_t dram_size, void *fdt);
>  void riscv_setup_rom_reset_vec(MachineState *machine, RISCVHartArrayState 
> *harts,
> hwaddr saddr,
> --
> 2.39.0
>
>

Re: [PATCH v5 07/11] hw/riscv: write bootargs 'chosen' FDT after riscv_load_kernel()

2023-01-10 Thread Alistair Francis

On Mon, Jan 2, 2023 at 9:55 PM Daniel Henrique Barboza
 wrote:
>
> The sifive_u, spike and virt machines are writing the 'bootargs' FDT
> node during their respective create_fdt().
>
> Given that bootargs is written only when '-append' is used, and this
> option is only allowed with the '-kernel' option, which in turn is
> already being checked before executing riscv_load_kernel(), write
> 'bootargs' in the same code path as riscv_load_kernel().
>
> Cc: Palmer Dabbelt 
> Signed-off-by: Daniel Henrique Barboza 
> Reviewed-by: Bin Meng 
> Reviewed-by: Philippe Mathieu-Daudé 

Reviewed-by: Alistair Francis 

Alistair

> ---
>  hw/riscv/sifive_u.c | 11 +--
>  hw/riscv/spike.c|  9 +
>  hw/riscv/virt.c | 11 +--
>  3 files changed, 15 insertions(+), 16 deletions(-)
>
> diff --git a/hw/riscv/sifive_u.c b/hw/riscv/sifive_u.c
> index 37f5087172..3e6df87b5b 100644
> --- a/hw/riscv/sifive_u.c
> +++ b/hw/riscv/sifive_u.c
> @@ -117,7 +117,6 @@ static void create_fdt(SiFiveUState *s, const MemMapEntry 
> *memmap,
>  error_report("load_device_tree() failed");
>  exit(1);
>  }
> -goto update_bootargs;
>  } else {
>  fdt = ms->fdt = create_device_tree(&fdt_size);
>  if (!fdt) {
> @@ -510,11 +509,6 @@ static void create_fdt(SiFiveUState *s, const 
> MemMapEntry *memmap,
>  qemu_fdt_setprop_string(fdt, "/aliases", "serial0", nodename);
>
>  g_free(nodename);
> -
> -update_bootargs:
> -if (cmdline && *cmdline) {
> -qemu_fdt_setprop_string(fdt, "/chosen", "bootargs", cmdline);
> -}
>  }
>
>  static void sifive_u_machine_reset(void *opaque, int n, int level)
> @@ -611,6 +605,11 @@ static void sifive_u_machine_init(MachineState *machine)
>  riscv_load_initrd(machine->initrd_filename, machine->ram_size,
>kernel_entry, machine->fdt);
>  }
> +
> +if (machine->kernel_cmdline && *machine->kernel_cmdline) {
> +qemu_fdt_setprop_string(machine->fdt, "/chosen", "bootargs",
> +machine->kernel_cmdline);
> +}
>  } else {
> /*
>  * If dynamic firmware is used, it doesn't know where is the next mode
> diff --git a/hw/riscv/spike.c b/hw/riscv/spike.c
> index 5668fe0694..60e2912be5 100644
> --- a/hw/riscv/spike.c
> +++ b/hw/riscv/spike.c
> @@ -179,10 +179,6 @@ static void create_fdt(SpikeState *s, const MemMapEntry 
> *memmap,
>
>  qemu_fdt_add_subnode(fdt, "/chosen");
>  qemu_fdt_setprop_string(fdt, "/chosen", "stdout-path", "/htif");
> -
> -if (cmdline && *cmdline) {
> -qemu_fdt_setprop_string(fdt, "/chosen", "bootargs", cmdline);
> -}
>  }
>
>  static bool spike_test_elf_image(char *filename)
> @@ -319,6 +315,11 @@ static void spike_board_init(MachineState *machine)
>  riscv_load_initrd(machine->initrd_filename, machine->ram_size,
>kernel_entry, machine->fdt);
>  }
> +
> +if (machine->kernel_cmdline && *machine->kernel_cmdline) {
> +qemu_fdt_setprop_string(machine->fdt, "/chosen", "bootargs",
> +machine->kernel_cmdline);
> +}
>  } else {
> /*
>  * If dynamic firmware is used, it doesn't know where is the next mode
> diff --git a/hw/riscv/virt.c b/hw/riscv/virt.c
> index 5967b136b4..6c946b6def 100644
> --- a/hw/riscv/virt.c
> +++ b/hw/riscv/virt.c
> @@ -1012,7 +1012,6 @@ static void create_fdt(RISCVVirtState *s, const 
> MemMapEntry *memmap,
>  error_report("load_device_tree() failed");
>  exit(1);
>  }
> -goto update_bootargs;
>  } else {
>  mc->fdt = create_device_tree(&s->fdt_size);
>  if (!mc->fdt) {
> @@ -1050,11 +1049,6 @@ static void create_fdt(RISCVVirtState *s, const 
> MemMapEntry *memmap,
>  create_fdt_fw_cfg(s, memmap);
>  create_fdt_pmu(s);
>
> -update_bootargs:
> -if (cmdline && *cmdline) {
> -qemu_fdt_setprop_string(mc->fdt, "/chosen", "bootargs", cmdline);
> -}
> -
>  /* Pass seed to RNG */
>  qemu_guest_getrandom_nofail(rng_seed, sizeof(rng_seed));
>  qemu_fdt_setprop(mc->fdt, "/chosen", "rng-seed", rng_seed, 
> sizeof(rng_seed));
> @@ -1294,6 +1288,11 @@ static void virt_machine_done(Notifier *notifier, void 
> *data)
>  riscv_load_initrd(machine->initrd_filename, machine->ram_size,
>kernel_entry, machine->fdt);
>  }
> +
> +if (machine->kernel_cmdline && *machine->kernel_cmdline) {
> +qemu_fdt_setprop_string(machine->fdt, "/chosen", "bootargs",
> +machine->kernel_cmdline);
> +}
>  } else {
> /*
>  * If dynamic firmware is used, it doesn't know where is the next mode
> --
> 2.39.0
>
>

Re: [PATCH v5 06/11] hw/riscv: write initrd 'chosen' FDT inside riscv_load_initrd()

2023-01-10 Thread Alistair Francis

On Mon, Jan 2, 2023 at 9:54 PM Daniel Henrique Barboza
 wrote:
>
> riscv_load_initrd() returns the initrd end addr while also writing a
> 'start' var to mark the addr start. This information is being used
> just to write the initrd FDT node. Every existing caller of
> riscv_load_initrd() is writing the FDT in the same manner.
>
> We can simplify things by writing the FDT inside riscv_load_initrd(),
> sparing callers from having to manage start/end addrs to write the FDT
> themselves.
>
> An 'if (fdt)' check is already inserted at the end of the function
> because we'll end up using it later on with other boards that don't
> have an FDT.
>
> Cc: Palmer Dabbelt 
> Signed-off-by: Daniel Henrique Barboza 
> Reviewed-by: Bin Meng 
> Reviewed-by: Philippe Mathieu-Daudé 

Reviewed-by: Alistair Francis 

Alistair

> ---
>  hw/riscv/boot.c| 18 --
>  hw/riscv/microchip_pfsoc.c | 10 ++
>  hw/riscv/sifive_u.c| 10 ++
>  hw/riscv/spike.c   | 10 ++
>  hw/riscv/virt.c| 10 ++
>  include/hw/riscv/boot.h|  4 ++--
>  6 files changed, 22 insertions(+), 40 deletions(-)
>
> diff --git a/hw/riscv/boot.c b/hw/riscv/boot.c
> index 31aa3385a0..6b948d1c9e 100644
> --- a/hw/riscv/boot.c
> +++ b/hw/riscv/boot.c
> @@ -208,9 +208,10 @@ target_ulong riscv_load_kernel(const char 
> *kernel_filename,
>  exit(1);
>  }
>
> -hwaddr riscv_load_initrd(const char *filename, uint64_t mem_size,
> - uint64_t kernel_entry, hwaddr *start)
> +void riscv_load_initrd(const char *filename, uint64_t mem_size,
> +   uint64_t kernel_entry, void *fdt)
>  {
> +hwaddr start, end;
>  ssize_t size;
>
>  g_assert(filename != NULL);
> @@ -226,18 +227,23 @@ hwaddr riscv_load_initrd(const char *filename, uint64_t 
> mem_size,
>   * halfway into RAM, and for boards with 256MB of RAM or more we put
>   * the initrd at 128MB.
>   */
> -*start = kernel_entry + MIN(mem_size / 2, 128 * MiB);
> +start = kernel_entry + MIN(mem_size / 2, 128 * MiB);
>
> -size = load_ramdisk(filename, *start, mem_size - *start);
> +size = load_ramdisk(filename, start, mem_size - start);
>  if (size == -1) {
> -size = load_image_targphys(filename, *start, mem_size - *start);
> +size = load_image_targphys(filename, start, mem_size - start);
>  if (size == -1) {
>  error_report("could not load ramdisk '%s'", filename);
>  exit(1);
>  }
>  }
>
> -return *start + size;
> +/* Some RISC-V machines (e.g. opentitan) don't have a fdt. */
> +if (fdt) {
> +end = start + size;
> +qemu_fdt_setprop_cell(fdt, "/chosen", "linux,initrd-start", start);
> +qemu_fdt_setprop_cell(fdt, "/chosen", "linux,initrd-end", end);
> +}
>  }
>
>  uint64_t riscv_load_fdt(hwaddr dram_base, uint64_t mem_size, void *fdt)
> diff --git a/hw/riscv/microchip_pfsoc.c b/hw/riscv/microchip_pfsoc.c
> index b10321b564..593a799549 100644
> --- a/hw/riscv/microchip_pfsoc.c
> +++ b/hw/riscv/microchip_pfsoc.c
> @@ -633,14 +633,8 @@ static void 
> microchip_icicle_kit_machine_init(MachineState *machine)
>   kernel_start_addr, NULL);
>
>  if (machine->initrd_filename) {
> -hwaddr start;
> -hwaddr end = riscv_load_initrd(machine->initrd_filename,
> -   machine->ram_size, kernel_entry,
> -   &start);
> -qemu_fdt_setprop_cell(machine->fdt, "/chosen",
> -  "linux,initrd-start", start);
> -qemu_fdt_setprop_cell(machine->fdt, "/chosen",
> -  "linux,initrd-end", end);
> +riscv_load_initrd(machine->initrd_filename, machine->ram_size,
> +  kernel_entry, machine->fdt);
>  }
>
>  if (machine->kernel_cmdline && *machine->kernel_cmdline) {
> diff --git a/hw/riscv/sifive_u.c b/hw/riscv/sifive_u.c
> index ddceb750ea..37f5087172 100644
> --- a/hw/riscv/sifive_u.c
> +++ b/hw/riscv/sifive_u.c
> @@ -608,14 +608,8 @@ static void sifive_u_machine_init(MachineState *machine)
>   kernel_start_addr, NULL);
>
>  if (machine->initrd_filename) {
> -hwaddr start;
> -hwaddr end = riscv_load_initrd(machine->initrd_filename,
> -   machine->ram_size, kernel_entry,
> -   &start);
> -qemu_fdt_setprop_cell(machine->fdt, "/chosen",
> -  "linux,initrd-start", start);
> -qemu_fdt_setprop_cell(machine->fdt, "/chosen", 
> "linux,initrd-end",
> -  end);
> +riscv_load_initrd(machine->initrd_filename, machine->ram_size,
> +  kernel_entry,
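
For illustration only, here is a standalone sketch of the placement
arithmetic and the 'if (fdt)' guard from the boot.c hunk above. It is not
QEMU code: chosen_sketch, initrd_start() and record_initrd() are made-up
names, and a plain struct stands in for the FDT '/chosen' node.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MiB (1024ULL * 1024ULL)
#define MIN(a, b) ((a) < (b) ? (a) : (b))

/* Hypothetical stand-in for the two '/chosen' properties. */
struct chosen_sketch {
    uint64_t initrd_start;
    uint64_t initrd_end;
};

/*
 * Same arithmetic as the patch: halfway into RAM for small boards,
 * capped at 128 MiB past the kernel entry for larger ones.
 */
uint64_t initrd_start(uint64_t kernel_entry, uint64_t mem_size)
{
    assert(mem_size > 0);
    return kernel_entry + MIN(mem_size / 2, 128 * MiB);
}

/* Record start/end only when the board actually has an FDT. */
void record_initrd(struct chosen_sketch *fdt, uint64_t start, uint64_t size)
{
    if (fdt) {
        fdt->initrd_start = start;
        fdt->initrd_end = start + size;
    }
}
```

With a kernel entry of 0x80200000 and 64 MiB of RAM, initrd_start() gives
0x80200000 + 32 MiB; with 256 MiB or more the offset is capped at 128 MiB.
Passing NULL to record_initrd() models a board without an FDT, in which
case nothing is recorded.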

Re: [PATCH v5 04/11] hw/riscv/boot.c: exit early if filename is NULL in load functions

2023-01-10 Thread Alistair Francis

On Mon, Jan 2, 2023 at 9:54 PM Daniel Henrique Barboza
 wrote:
>
> riscv_load_firmware(), riscv_load_initrd() and riscv_load_kernel() work
> under the assumption that the 'filename' parameter is never NULL.
>
> This is currently the case since all callers of these functions are
> checking for NULL before calling them. Add a g_assert() to make sure
> that a NULL value in these cases is considered a bug.
>
> Suggested-by: Alex Bennée 
> Reviewed-by: Philippe Mathieu-Daudé 
> Signed-off-by: Daniel Henrique Barboza 

Reviewed-by: Alistair Francis 

Alistair

> ---
>  hw/riscv/boot.c | 6 ++
>  1 file changed, 6 insertions(+)
>
> diff --git a/hw/riscv/boot.c b/hw/riscv/boot.c
> index 98b80af51b..31aa3385a0 100644
> --- a/hw/riscv/boot.c
> +++ b/hw/riscv/boot.c
> @@ -153,6 +153,8 @@ target_ulong riscv_load_firmware(const char 
> *firmware_filename,
>  uint64_t firmware_entry, firmware_end;
>  ssize_t firmware_size;
>
> +g_assert(firmware_filename != NULL);
> +
>  if (load_elf_ram_sym(firmware_filename, NULL, NULL, NULL,
>   &firmware_entry, NULL, &firmware_end, NULL,
>   0, EM_RISCV, 1, 0, NULL, true, sym_cb) > 0) {
> @@ -177,6 +179,8 @@ target_ulong riscv_load_kernel(const char 
> *kernel_filename,
>  {
>  uint64_t kernel_load_base, kernel_entry;
>
> +g_assert(kernel_filename != NULL);
> +
>  /*
>   * NB: Use low address not ELF entry point to ensure that the fw_dynamic
>   * behaviour when loading an ELF matches the fw_payload, fw_jump and BBL
> @@ -209,6 +213,8 @@ hwaddr riscv_load_initrd(const char *filename, uint64_t 
> mem_size,
>  {
>  ssize_t size;
>
> +g_assert(filename != NULL);
> +
>  /*
>   * We want to put the initrd far enough into RAM that when the
>   * kernel is uncompressed it will not clobber the initrd. However
> --
> 2.39.0
>
>

Re: [PATCH v5 01/11] tests/avocado: add RISC-V OpenSBI boot test

2023-01-10 Thread Alistair Francis

On Mon, Jan 2, 2023 at 9:53 PM Daniel Henrique Barboza
 wrote:
>
> This test is used to do a quick sanity check to ensure that we're able
> to run the existing QEMU FW image.
>
> 'sifive_u', 'spike' and 'virt' riscv64 machines, and 'sifive_u' and
> 'virt' 32 bit machines are able to run the default RISCV64_BIOS_BIN |
> RISCV32_BIOS_BIN firmware with minimal options.
>
> The riscv32 'spike' machine isn't bootable at this moment, requiring an
> OpenSBI fix [1] and QEMU side changes [2]. We could just leave at that
> or add a 'skip' test to remind us about it. To work as a reminder that
> we have a riscv32 'spike' test that should be enabled as soon as OpenSBI
> QEMU rom receives the fix, we're adding a 'skip' test:
>
> (06/18) tests/avocado/riscv_opensbi.py:RiscvOpenSBI.test_riscv32_spike:
> SKIP: requires OpenSBI fix to work
>
> [1] 
> https://patchwork.ozlabs.org/project/opensbi/patch/20221226033603.1860569-1-bm...@tinylab.org/
> [2] https://patchwork.ozlabs.org/project/qemu-devel/list/?series=334159
>
> Cc: Cleber Rosa 
> Cc: Philippe Mathieu-Daudé 
> Reviewed-by: Bin Meng 
> Tested-by: Bin Meng 
> Reviewed-by: Philippe Mathieu-Daudé 
> Signed-off-by: Daniel Henrique Barboza 

Acked-by: Alistair Francis 

Alistair

> ---
>  tests/avocado/riscv_opensbi.py | 65 ++
>  1 file changed, 65 insertions(+)
>  create mode 100644 tests/avocado/riscv_opensbi.py
>
> diff --git a/tests/avocado/riscv_opensbi.py b/tests/avocado/riscv_opensbi.py
> new file mode 100644
> index 00..e02f0d404a
> --- /dev/null
> +++ b/tests/avocado/riscv_opensbi.py
> @@ -0,0 +1,65 @@
> +# OpenSBI boot test for RISC-V machines
> +#
> +# Copyright (c) 2022, Ventana Micro
> +#
> +# This work is licensed under the terms of the GNU GPL, version 2 or
> +# later.  See the COPYING file in the top-level directory.
> +
> +from avocado_qemu import QemuSystemTest
> +from avocado import skip
> +from avocado_qemu import wait_for_console_pattern
> +
> +class RiscvOpenSBI(QemuSystemTest):
> +"""
> +:avocado: tags=accel:tcg
> +"""
> +timeout = 5
> +
> +def boot_opensbi(self):
> +self.vm.set_console()
> +self.vm.launch()
> +wait_for_console_pattern(self, 'Platform Name')
> +wait_for_console_pattern(self, 'Boot HART MEDELEG')
> +
> +@skip("requires OpenSBI fix to work")
> +def test_riscv32_spike(self):
> +"""
> +:avocado: tags=arch:riscv32
> +:avocado: tags=machine:spike
> +"""
> +self.boot_opensbi()
> +
> +def test_riscv64_spike(self):
> +"""
> +:avocado: tags=arch:riscv64
> +:avocado: tags=machine:spike
> +"""
> +self.boot_opensbi()
> +
> +def test_riscv32_sifive_u(self):
> +"""
> +:avocado: tags=arch:riscv32
> +:avocado: tags=machine:sifive_u
> +"""
> +self.boot_opensbi()
> +
> +def test_riscv64_sifive_u(self):
> +"""
> +:avocado: tags=arch:riscv64
> +:avocado: tags=machine:sifive_u
> +"""
> +self.boot_opensbi()
> +
> +def test_riscv32_virt(self):
> +"""
> +:avocado: tags=arch:riscv32
> +:avocado: tags=machine:virt
> +"""
> +self.boot_opensbi()
> +
> +def test_riscv64_virt(self):
> +"""
> +:avocado: tags=arch:riscv64
> +:avocado: tags=machine:virt
> +"""
> +self.boot_opensbi()
> --
> 2.39.0
>
>

Re: [PATCH 0/2] target/riscv/cpu: fix sifive_u 32/64bits boot in riscv-to-apply.next

2023-01-10 Thread Daniel Henrique Barboza


Hi,

I mentioned that the bug was found in riscv-to-apply.next but forgot to
mention that the patches are also based on top of it:

https://github.com/alistair23/qemu/tree/riscv-to-apply.next


Thanks,


Daniel

On 1/10/23 17:14, Daniel Henrique Barboza wrote:

Hi,

I found this bug when testing my avocado changes in riscv-to-apply.next.
The sifive_u board, both 32 and 64 bits, stopped booting OpenSBI. The
guest hangs indefinitely.

Git bisect points to this patch as the one that broke things:

8c3f35d25e7e98655c609b6c1e9f103b9240f8f8 is the first bad commit
commit 8c3f35d25e7e98655c609b6c1e9f103b9240f8f8
Author: Weiwei Li 
Date:   Wed Dec 28 14:20:21 2022 +0800

 target/riscv: add support for Zca extension
 
 Modify the check for C extension to Zca (C implies Zca)

(https://github.com/alistair23/qemu/commit/8c3f35d25e7e98655c609b6c1e9f103b9240f8f8)
 


But this patch per se isn't doing anything wrong. The root of the
problem is that this patch makes assumptions based on the previous
patch:

commit a2b409aa6cadc1ed9715e1ab916ddd3dade0ba85
Author: Weiwei Li 
Date:   Wed Dec 28 14:20:20 2022 +0800

 target/riscv: add cfg properties for Zc* extension
(https://github.com/alistair23/qemu/commit/a2b409aa6cadc1ed9715e1ab916ddd3dade0ba85)

Which added a lot of logic and assumptions that are being skipped by all
the SiFive boards because, during riscv_cpu_realize(), we have this
code:

 /* If only MISA_EXT is unset for misa, then set it from properties */
 if (env->misa_ext == 0) {
 uint32_t ext = 0;
 (...)
 }

In short, we have a lot of code that is being skipped by all SiFive
CPUs because these CPUs are setting a non-zero value in set_misa() in
their respective cpu_init() functions.

It's possible to just hack in and fix the SiFive problem in isolation, but
I believe we can do better and allow all of riscv_cpu_realize() to be executed
for all CPUs, regardless of what they've done during their cpu_init().
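
To illustrate the failure mode, here is a deliberately simplified,
standalone sketch. It is not the real riscv_cpu_realize(): cpu_sketch,
realize_gated() and realize_always() are hypothetical names, and the only
rule modelled is "C implies Zca" from the bisected commit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define EXT_C (1u << ('C' - 'A'))   /* misa bit for the C extension */

/* Hypothetical, minimal model of the CPU state involved. */
struct cpu_sketch {
    uint32_t misa_ext;   /* may already be set by a cpu_init() function */
    bool ext_c;          /* property-style flags */
    bool ext_zca;
};

/* Gated variant: the derivation is skipped whenever misa_ext was preset. */
void realize_gated(struct cpu_sketch *cpu)
{
    assert(cpu != NULL);
    if (cpu->misa_ext == 0) {
        if (cpu->ext_c) {
            cpu->ext_zca = true;
            cpu->misa_ext |= EXT_C;
        }
    }
}

/* Ungated variant: sync the flags from misa first, then always derive. */
void realize_always(struct cpu_sketch *cpu)
{
    assert(cpu != NULL);
    if (cpu->misa_ext & EXT_C) {
        cpu->ext_c = true;
    }
    if (cpu->ext_c) {
        cpu->ext_zca = true;
        cpu->misa_ext |= EXT_C;
    }
}
```

A CPU that starts with misa_ext = EXT_C (as a SiFive-like cpu_init() would
do) leaves realize_gated() with ext_zca still false, so any later check
that looks at Zca instead of C fails. realize_always() sets it for every
CPU, whatever cpu_init() did beforehand.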


Daniel Henrique Barboza (2):
   target/riscv/cpu: set cpu->cfg in register_cpu_props()
   target/riscv/cpu.c: do not skip misa logic in riscv_cpu_realize()

  target/riscv/cpu.c | 525 +
  target/riscv/cpu.h |   4 +
  2 files changed, 292 insertions(+), 237 deletions(-)

Re: [PATCH v7 3/7] mac_{old,new}world: Pass MacOS VGA NDRV in card ROM instead of fw_cfg

2023-01-10 Thread Mark Cave-Ayland


On 04/01/2023 21:59, BALATON Zoltan wrote:


OpenBIOS cannot run FCode ROMs yet but it can detect NDRV in VGA card
ROM and add it to the device tree for MacOS. Pass the NDRV this way
instead of via fw_cfg. This solves the problem with OpenBIOS also
adding the NDRV to ati-vga which it does not work with. This does not
need any changes to OpenBIOS as this NDRV ROM handling is already
there but this patch also allows simplifying OpenBIOS later to remove
the fw_cfg ndrv handling from the vga FCode and also drop the
vga-ndrv? option which is not needed any more as users can disable the
ndrv with -device VGA,romfile="" (or override it with their own NDRV
or ROM). Once FCode support is implemented in OpenBIOS, the proper
FCode ROM can be set the same way so this paves the way to remove some
hacks.

Signed-off-by: BALATON Zoltan 
---
  hw/ppc/mac_newworld.c | 18 ++
  hw/ppc/mac_oldworld.c | 18 ++
  2 files changed, 12 insertions(+), 24 deletions(-)

diff --git a/hw/ppc/mac_newworld.c b/hw/ppc/mac_newworld.c
index 460c14b5e3..60c9c27986 100644
--- a/hw/ppc/mac_newworld.c
+++ b/hw/ppc/mac_newworld.c
@@ -510,18 +510,6 @@ static void ppc_core99_init(MachineState *machine)
  fw_cfg_add_i32(fw_cfg, FW_CFG_PPC_BUSFREQ, BUSFREQ);
  fw_cfg_add_i32(fw_cfg, FW_CFG_PPC_NVRAM_ADDR, nvram_addr);
  
-/* MacOS NDRV VGA driver */
-filename = qemu_find_file(QEMU_FILE_TYPE_BIOS, NDRV_VGA_FILENAME);
-if (filename) {
-gchar *ndrv_file;
-gsize ndrv_size;
-
-if (g_file_get_contents(filename, &ndrv_file, &ndrv_size, NULL)) {
-fw_cfg_add_file(fw_cfg, "ndrv/qemu_vga.ndrv", ndrv_file, ndrv_size);
-}
-g_free(filename);
-}
-
  qemu_register_boot_set(fw_cfg_boot_set, fw_cfg);
  }
  
@@ -565,6 +553,11 @@ static int core99_kvm_type(MachineState *machine, const char *arg)

  return 2;
  }
  
+static GlobalProperty props[] = {
+/* MacOS NDRV VGA driver */
+{ "VGA", "romfile", NDRV_VGA_FILENAME },
+};
+
  static void core99_machine_class_init(ObjectClass *oc, void *data)
  {
  MachineClass *mc = MACHINE_CLASS(oc);
@@ -585,6 +578,7 @@ static void core99_machine_class_init(ObjectClass *oc, void 
*data)
  #endif
  mc->default_ram_id = "ppc_core99.ram";
  mc->ignore_boot_device_suffixes = true;
+compat_props_add(mc->compat_props, props, G_N_ELEMENTS(props));
  fwc->get_dev_path = core99_fw_dev_path;
  }
  
diff --git a/hw/ppc/mac_oldworld.c b/hw/ppc/mac_oldworld.c

index 5a7b25a4a8..6a1b1ad47a 100644
--- a/hw/ppc/mac_oldworld.c
+++ b/hw/ppc/mac_oldworld.c
@@ -344,18 +344,6 @@ static void ppc_heathrow_init(MachineState *machine)
  fw_cfg_add_i32(fw_cfg, FW_CFG_PPC_CLOCKFREQ, CLOCKFREQ);
  fw_cfg_add_i32(fw_cfg, FW_CFG_PPC_BUSFREQ, BUSFREQ);
  
-/* MacOS NDRV VGA driver */
-filename = qemu_find_file(QEMU_FILE_TYPE_BIOS, NDRV_VGA_FILENAME);
-if (filename) {
-gchar *ndrv_file;
-gsize ndrv_size;
-
-if (g_file_get_contents(filename, &ndrv_file, &ndrv_size, NULL)) {
-fw_cfg_add_file(fw_cfg, "ndrv/qemu_vga.ndrv", ndrv_file, ndrv_size);
-}
-g_free(filename);
-}
-
  qemu_register_boot_set(fw_cfg_boot_set, fw_cfg);
  }
  
@@ -400,6 +388,11 @@ static int heathrow_kvm_type(MachineState *machine, const char *arg)

  return 2;
  }
  
+static GlobalProperty props[] = {
+/* MacOS NDRV VGA driver */
+{ "VGA", "romfile", NDRV_VGA_FILENAME },
+};
+
  static void heathrow_class_init(ObjectClass *oc, void *data)
  {
  MachineClass *mc = MACHINE_CLASS(oc);
@@ -420,6 +413,7 @@ static void heathrow_class_init(ObjectClass *oc, void *data)
  mc->default_display = "std";
  mc->ignore_boot_device_suffixes = true;
  mc->default_ram_id = "ppc_heathrow.ram";
+compat_props_add(mc->compat_props, props, G_N_ELEMENTS(props));
  fwc->get_dev_path = heathrow_fw_dev_path;
  }


The qemu_vga.ndrv is deliberately kept separate from the PCI option ROM because it is 
a binary generated by a separate project: otherwise you'd end up creating a 
dependency between OpenBIOS and QemuMacDrivers, which is almost impossible to achieve 
since qemu_vga.ndrv can only (currently) be built in an emulated MacOS 9 guest.


The best way to do this would be to extract the PCI config words from your ATI
OpenBIOS patches and then alter drivers/vga.fs so that it only generates the
driver,AAPL,MacOS,PowerPC property if the device id and vendor id match those of the
QEMU VGA device.



ATB,

Mark.

Re: [PATCH v7 2/7] mac_{old, new}world: Use local variable instead of qdev_get_machine()

2023-01-10 Thread Mark Cave-Ayland


On 04/01/2023 21:59, BALATON Zoltan wrote:


We already have machine in a local variable so no need to use
qdev_get_machine(); also remove the now unneeded line break.

Signed-off-by: BALATON Zoltan 
---
  hw/ppc/mac_newworld.c | 3 +--
  hw/ppc/mac_oldworld.c | 3 +--
  2 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/hw/ppc/mac_newworld.c b/hw/ppc/mac_newworld.c
index 601ea518f8..460c14b5e3 100644
--- a/hw/ppc/mac_newworld.c
+++ b/hw/ppc/mac_newworld.c
@@ -466,8 +466,7 @@ static void ppc_core99_init(MachineState *machine)
  fw_cfg = FW_CFG(dev);
  qdev_prop_set_uint32(dev, "data_width", 1);
  qdev_prop_set_bit(dev, "dma_enabled", false);
-object_property_add_child(OBJECT(qdev_get_machine()), TYPE_FW_CFG,
-  OBJECT(fw_cfg));
+object_property_add_child(OBJECT(machine), TYPE_FW_CFG, OBJECT(fw_cfg));
  s = SYS_BUS_DEVICE(dev);
  sysbus_realize_and_unref(s, &error_fatal);
  sysbus_mmio_map(s, 0, CFG_ADDR);
diff --git a/hw/ppc/mac_oldworld.c b/hw/ppc/mac_oldworld.c
index 558c639202..5a7b25a4a8 100644
--- a/hw/ppc/mac_oldworld.c
+++ b/hw/ppc/mac_oldworld.c
@@ -303,8 +303,7 @@ static void ppc_heathrow_init(MachineState *machine)
  fw_cfg = FW_CFG(dev);
  qdev_prop_set_uint32(dev, "data_width", 1);
  qdev_prop_set_bit(dev, "dma_enabled", false);
-object_property_add_child(OBJECT(qdev_get_machine()), TYPE_FW_CFG,
-  OBJECT(fw_cfg));
+object_property_add_child(OBJECT(machine), TYPE_FW_CFG, OBJECT(fw_cfg));
  s = SYS_BUS_DEVICE(dev);
  sysbus_realize_and_unref(s, &error_fatal);
  sysbus_mmio_map(s, 0, CFG_ADDR);


Reviewed-by: Mark Cave-Ayland 


ATB,

Mark.

Re: [PATCH v7 1/7] input/adb: Only include header where needed

2023-01-10 Thread Mark Cave-Ayland


On 04/01/2023 21:59, BALATON Zoltan wrote:


The header hw/input/adb.h is included by some files that don't need
it. Clean it up and include only where necessary.

Signed-off-by: BALATON Zoltan 
---
  hw/misc/macio/cuda.c | 2 --
  hw/misc/macio/pmu.c  | 3 ---
  hw/misc/mos6522.c| 1 -
  include/hw/misc/mac_via.h| 1 +
  include/hw/misc/macio/cuda.h | 1 +
  include/hw/misc/macio/pmu.h  | 1 +
  include/hw/misc/mos6522.h| 3 +--
  7 files changed, 4 insertions(+), 8 deletions(-)

diff --git a/hw/misc/macio/cuda.c b/hw/misc/macio/cuda.c
index 853e88bfed..7208b90e12 100644
--- a/hw/misc/macio/cuda.c
+++ b/hw/misc/macio/cuda.c
@@ -27,8 +27,6 @@
  #include "hw/irq.h"
  #include "hw/qdev-properties.h"
  #include "migration/vmstate.h"
-#include "hw/input/adb.h"
-#include "hw/misc/mos6522.h"
  #include "hw/misc/macio/cuda.h"
  #include "qapi/error.h"
  #include "qemu/timer.h"
diff --git a/hw/misc/macio/pmu.c b/hw/misc/macio/pmu.c
index 97ef8c771b..8575bc1264 100644
--- a/hw/misc/macio/pmu.c
+++ b/hw/misc/macio/pmu.c
@@ -31,10 +31,7 @@
  #include "qemu/osdep.h"
  #include "hw/qdev-properties.h"
  #include "migration/vmstate.h"
-#include "hw/input/adb.h"
  #include "hw/irq.h"
-#include "hw/misc/mos6522.h"
-#include "hw/misc/macio/gpio.h"
  #include "hw/misc/macio/pmu.h"
  #include "qapi/error.h"
  #include "qemu/timer.h"
diff --git a/hw/misc/mos6522.c b/hw/misc/mos6522.c
index 0ed631186c..d6ba47bde9 100644
--- a/hw/misc/mos6522.c
+++ b/hw/misc/mos6522.c
@@ -25,7 +25,6 @@
   */
  
  #include "qemu/osdep.h"

-#include "hw/input/adb.h"
  #include "hw/irq.h"
  #include "hw/misc/mos6522.h"
  #include "hw/qdev-properties.h"
diff --git a/include/hw/misc/mac_via.h b/include/hw/misc/mac_via.h
index 5fe7a7f592..422da43bf9 100644
--- a/include/hw/misc/mac_via.h
+++ b/include/hw/misc/mac_via.h
@@ -12,6 +12,7 @@
  #include "exec/memory.h"
  #include "hw/sysbus.h"
  #include "hw/misc/mos6522.h"
+#include "hw/input/adb.h"
  #include "qom/object.h"
  
  
diff --git a/include/hw/misc/macio/cuda.h b/include/hw/misc/macio/cuda.h

index a71deec968..8a6678c749 100644
--- a/include/hw/misc/macio/cuda.h
+++ b/include/hw/misc/macio/cuda.h
@@ -26,6 +26,7 @@
  #ifndef CUDA_H
  #define CUDA_H
  
+#include "hw/input/adb.h"

  #include "hw/misc/mos6522.h"
  #include "qom/object.h"
  
diff --git a/include/hw/misc/macio/pmu.h b/include/hw/misc/macio/pmu.h

index 00fcdd23f5..ba76afb52a 100644
--- a/include/hw/misc/macio/pmu.h
+++ b/include/hw/misc/macio/pmu.h
@@ -10,6 +10,7 @@
  #ifndef PMU_H
  #define PMU_H
  
+#include "hw/input/adb.h"

  #include "hw/misc/mos6522.h"
  #include "hw/misc/macio/gpio.h"
  #include "qom/object.h"
diff --git a/include/hw/misc/mos6522.h b/include/hw/misc/mos6522.h
index 05872fffc9..fba45668ab 100644
--- a/include/hw/misc/mos6522.h
+++ b/include/hw/misc/mos6522.h
@@ -27,9 +27,8 @@
  #ifndef MOS6522_H
  #define MOS6522_H
  
-#include "exec/memory.h"

+#include "exec/hwaddr.h"
  #include "hw/sysbus.h"
-#include "hw/input/adb.h"
  #include "qom/object.h"
  
  #define MOS6522_NUM_REGS 16


Reviewed-by: Mark Cave-Ayland 


ATB,

Mark.

Re: intermittent hang, s390x host, bios-tables-test test, TPM

2023-01-10 Thread Peter Maydell

On Tue, 10 Jan 2023 at 19:25, Daniel P. Berrangé  wrote:
>
> On Fri, Jan 06, 2023 at 03:39:31PM +, Peter Maydell wrote:
> > Yeah. It would be good if we didn't deadlock without printing
> > the assertion, though...
> >
> > I guess we could improve qtest_kill_qemu() so it doesn't wait
> > indefinitely for QEMU to exit but instead sends a SIGKILL 20
> > seconds after the SIGTERM. (Annoyingly, there is no convenient
> > "waitpid but with a timeout" function...)
>
> We don't need to touch that. Instead the tpm-emu.c file needs to
> call  qtest_add_abrt_handler() passing a callback that will invoke
> qio_channel_close on its end of the socket. This will cause the
> QEMU process to get EOF on the other end of the socket. It then
> won't be stuck holding the iothread lock, and will be able to
> respond to SIGTERM.

That sounds straightforward and will fix this specific case
of "the QEMU process didn't exit on SIGTERM", but it would
be nice more generally if the test harness did not sit there
forever without printing the assertion in this situation.
"QEMU got permanently stuck" is something that can happen
in more than one way, after all...
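
As a rough idea of what "waitpid but with a timeout" could look like, here
is a standalone POSIX sketch. It is not the libqtest code and not a
proposed patch: wait_child_timeout(), stop_child() and spawn_stuck_child()
are hypothetical names, the 10 ms polling step is arbitrary, and the demo
helper assumes the parent uses the default SIGTERM disposition.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/*
 * Poll waitpid() with WNOHANG until the child has been reaped or
 * timeout_ms has elapsed.  Returns 1 if reaped, 0 on timeout, -1 on error.
 */
int wait_child_timeout(pid_t pid, int timeout_ms, int *status)
{
    const struct timespec step = { .tv_sec = 0, .tv_nsec = 10 * 1000 * 1000 };
    int waited_ms = 0;

    assert(status != NULL);
    for (;;) {
        pid_t r = waitpid(pid, status, WNOHANG);

        if (r == pid) {
            return 1;
        }
        if (r == -1 && errno != EINTR) {
            return -1;
        }
        if (waited_ms >= timeout_ms) {
            return 0;
        }
        nanosleep(&step, NULL);
        waited_ms += 10;
    }
}

/*
 * Send SIGTERM and escalate to SIGKILL if the child is still around after
 * grace_ms.  Returns 0 if SIGTERM was enough, 1 if SIGKILL was needed,
 * -1 on error.
 */
int stop_child(pid_t pid, int grace_ms, int *status)
{
    pid_t r;
    int reaped;

    if (kill(pid, SIGTERM) == -1) {
        return -1;
    }
    reaped = wait_child_timeout(pid, grace_ms, status);
    if (reaped != 0) {
        return reaped == 1 ? 0 : -1;
    }
    if (kill(pid, SIGKILL) == -1) {
        return -1;
    }
    do {
        r = waitpid(pid, status, 0);   /* SIGKILL cannot be ignored */
    } while (r == -1 && errno == EINTR);
    return r == pid ? 1 : -1;
}

/* Demo helper: fork a child that ignores SIGTERM and sleeps forever. */
pid_t spawn_stuck_child(void)
{
    pid_t pid;

    signal(SIGTERM, SIG_IGN);   /* disposition is inherited across fork() */
    pid = fork();
    if (pid == 0) {
        for (;;) {
            pause();
        }
    }
    signal(SIGTERM, SIG_DFL);   /* restore the default in the parent */
    return pid;
}
```

Calling stop_child(spawn_stuck_child(), 100, &status) models the "child
never reacts to SIGTERM" case: after the grace period the child is killed
and reaped, and the caller regains control to report the original failure
instead of blocking forever.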

thanks
-- PMM

Re: [PULL 00/29] Misc patches for 2023-01-10

2023-01-10 Thread Peter Maydell

On Tue, 10 Jan 2023 at 18:29, Paolo Bonzini  wrote:
>
> The following changes since commit 3d83b78285d6e96636130f7d449fd02e2d4deee0:
>
>   Merge tag 'for-upstream' of https://gitlab.com/bonzini/qemu into staging 
> (2023-01-08 14:27:40 +)
>
> are available in the Git repository at:
>
>   https://gitlab.com/bonzini/qemu.git tags/for-upstream
>
> for you to fetch changes up to cd78f1d264c1ac7dfd7fa50abce0dec71a1f41ac:
>
>   configure: remove backwards-compatibility code (2023-01-09 16:54:56 +0100)
>
> 
> * Atomic memslot updates for KVM (Emanuele, David)
> * libvhost-user/libvduse warnings fixes (Marcel)
> * i386 TCG fixes (Joe, myself)
> * Remove compilation errors when -Werror=maybe-uninitialized (Eric)
> * fix GLIB_VERSION for cross-compilation (Paolo)
>
> 

This provokes a new warning in compiling the testsuite on ppc:


../../tests/fp/berkeley-testfloat-3/source/fail.c: In function 'fail':
../../tests/fp/berkeley-testfloat-3/source/fail.c:53:5: warning:
function 'fail' might be a candidate for 'gnu_printf' format attribute
[-Wsuggest-attribute=format]
   53 | vfprintf( stderr, messagePtr, varArgs );
  | ^~~~


thanks
-- PMM

Re: [PATCH] bulk: Rename TARGET_FMT_plx -> HWADDR_FMT_plx

2023-01-10 Thread BALATON Zoltan


On Tue, 10 Jan 2023, Philippe Mathieu-Daudé wrote:

The 'hwaddr' type is defined in "exec/hwaddr.h" as:

   hwaddr is the type of a physical address
  (its size can be different from 'target_ulong').

All definitions use the 'HWADDR_' prefix, except TARGET_FMT_plx:

$ fgrep define include/exec/hwaddr.h
#define HWADDR_H
#define HWADDR_BITS 64
#define HWADDR_MAX UINT64_MAX
#define TARGET_FMT_plx "%016" PRIx64
^^
#define HWADDR_PRId PRId64
#define HWADDR_PRIi PRIi64
#define HWADDR_PRIo PRIo64
#define HWADDR_PRIu PRIu64
#define HWADDR_PRIx PRIx64


Why are there both TARGET_FMT_plx and HWADDR_PRIx? Why not just use 
HWADDR_PRIx instead?


Regards,
BALATON Zoltan

Re: intermittent hang, s390x host, bios-tables-test test, TPM

2023-01-10 Thread Stefan Berger





On 1/10/23 14:47, Stefan Berger wrote:



On 1/10/23 14:27, Daniel P. Berrangé wrote:

On Tue, Jan 10, 2023 at 01:50:26PM -0500, Stefan Berger wrote:



On 1/6/23 10:16, Stefan Berger wrote:

This here seems to be the root cause. An unknown control channel
command was received from the TPM emulator backend by the control channel 
thread and we end up in g_assert_not_reached().

https://github.com/qemu/qemu/blob/master/tests/qtest/tpm-emu.c#L189



      ret = qio_channel_read(ioc, (char *)&cmd, sizeof(cmd), NULL);
      if (ret <= 0) {
      break;
      }

      cmd = be32_to_cpu(cmd);
      switch (cmd) {
   [...]
      default:
      g_debug("unimplemented %u", cmd);
      g_assert_not_reached();    <--
      }

I will run this test case in an endless loop on an x86_64 host and see what we 
get there ...


I could not recreate the issue running the test on a ppc64 and an x86_64
host. There were >100k test runs on ppc64 and >40k on x86_64. Also,
simulating the reception of an unsupported command did not lead to a
hang like the one shown here.


Assuming your ppc64 host is running a little-endian OS, and
we're only seeing the test failure on s390x, then it points towards
the problem being an endianness issue in the TPM code. Something
missing a byteswap somewhere along the way?


Yes, my ppc64 machine is also little endian. If the issue were a permanent
rather than an intermittent failure, I would look for something like that.
I would think it's more some sort of initialization issue, like a value on
the stack that is occasionally set to an undesirable value -- maybe even in a
dependency.


I found I still had access to an s390x machine. ~2700 loops on this test case
so far but nothing... it would be good to be able to recreate the issue and
apply the fix but we'll have to do it without testing then I guess.

Does this look about right? From my tests with injecting an error it at least
seems to do what it is intended to do.

diff --git a/tests/qtest/tpm-emu.c b/tests/qtest/tpm-emu.c
index 2994d1cf42..dbc308a572 100644
--- a/tests/qtest/tpm-emu.c
+++ b/tests/qtest/tpm-emu.c
@@ -36,11 +36,19 @@ void tpm_emu_test_wait_cond(TPMTestState *s)
 g_mutex_unlock(&s->data_mutex);
 }

+static void tpm_emu_close_data_ioc(void *ioc)
+{
+g_debug("CLOSE DATA IOC");
+qio_channel_close(ioc, NULL);
+}
+
 static void *tpm_emu_tpm_thread(void *data)
 {
 TPMTestState *s = data;
 QIOChannel *ioc = s->tpm_ioc;

+qtest_add_abrt_handler(tpm_emu_close_data_ioc, ioc);
+
 s->tpm_msg = g_new(struct tpm_hdr, 1);
 while (true) {
 int minhlen = sizeof(s->tpm_msg->tag) + sizeof(s->tpm_msg->len);
@@ -77,12 +85,19 @@ static void *tpm_emu_tpm_thread(void *data)
   &error_abort);
 }

+qtest_remove_abrt_handler(ioc);
 g_free(s->tpm_msg);
 s->tpm_msg = NULL;
 object_unref(OBJECT(s->tpm_ioc));
 return NULL;
 }

+static void tpm_emu_close_ctrl_ioc(void *ioc)
+{
+g_debug("CLOSE CTRL IOC");
+qio_channel_close(ioc, NULL);
+}
+
 void *tpm_emu_ctrl_thread(void *data)
 {
 TPMTestState *s = data;
@@ -119,6 +134,8 @@ void *tpm_emu_ctrl_thread(void *data)
 s->emu_tpm_thread = g_thread_new(NULL, tpm_emu_tpm_thread, s);
 }

+qtest_add_abrt_handler(tpm_emu_close_ctrl_ioc, ioc);
+
 while (true) {
 uint32_t cmd;
 ssize_t ret;
@@ -129,6 +146,9 @@ void *tpm_emu_ctrl_thread(void *data)
 }

 cmd = be32_to_cpu(cmd);
+//g_debug("cmd=%u", cmd);
+//if (cmd == 14)
+//cmd = 100;
 switch (cmd) {
 case CMD_GET_CAPABILITY: {
 ptm_cap cap = cpu_to_be64(0x3fff);
@@ -190,6 +210,8 @@ void *tpm_emu_ctrl_thread(void *data)
 }
 }

+qtest_remove_abrt_handler(ioc);
+
 object_unref(OBJECT(ioc));
 object_unref(OBJECT(lioc));
 return NULL;



    Stefan




With regards,
Daniel

[PATCH] bulk: Rename TARGET_FMT_plx -> HWADDR_FMT_plx

2023-01-10 Thread Philippe Mathieu-Daudé

The 'hwaddr' type is defined in "exec/hwaddr.h" as:

hwaddr is the type of a physical address
   (its size can be different from 'target_ulong').

All definitions use the 'HWADDR_' prefix, except TARGET_FMT_plx:

 $ fgrep define include/exec/hwaddr.h
 #define HWADDR_H
 #define HWADDR_BITS 64
 #define HWADDR_MAX UINT64_MAX
 #define TARGET_FMT_plx "%016" PRIx64
 ^^
 #define HWADDR_PRId PRId64
 #define HWADDR_PRIi PRIi64
 #define HWADDR_PRIo PRIo64
 #define HWADDR_PRIu PRIu64
 #define HWADDR_PRIx PRIx64
 #define HWADDR_PRIX PRIX64

Since hwaddr's size can be *different* from target_ulong, it is
very confusing to read one of its format using the 'TARGET_FMT_'
prefix, normally used for the target_long / target_ulong types:

$ fgrep TARGET_FMT_ include/exec/cpu-defs.h
 #define TARGET_FMT_lx "%08x"
 #define TARGET_FMT_ld "%d"
 #define TARGET_FMT_lu "%u"
 #define TARGET_FMT_lx "%016" PRIx64
 #define TARGET_FMT_ld "%" PRId64
 #define TARGET_FMT_lu "%" PRIu64

Apparently this format was missed during commit a8170e5e97
("Rename target_phys_addr_t to hwaddr"), so complete it by
doing a bulk-rename with:

 $ sed -i -e s/TARGET_FMT_plx/HWADDR_FMT_plx/g $(git grep -l TARGET_FMT_plx)
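
For illustration, a standalone sketch (not part of the patch) contrasting
the two kinds of macro. The two definitions are copied from the listing
above, using the post-rename name; the hwaddr typedef is a local stand-in,
and format_padded() / format_plain() are hypothetical helpers.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

typedef uint64_t hwaddr;                 /* local stand-in for QEMU's type */

#define HWADDR_PRIx PRIx64               /* conversion specifier only */
#define HWADDR_FMT_plx "%016" PRIx64     /* complete, zero-padded format */

/* Uses the ready-made format: always 16 hex digits. */
int format_padded(char *buf, size_t len, hwaddr addr)
{
    assert(buf != NULL);
    return snprintf(buf, len, HWADDR_FMT_plx, addr);
}

/* Builds its own format around the bare specifier: no padding. */
int format_plain(char *buf, size_t len, hwaddr addr)
{
    assert(buf != NULL);
    return snprintf(buf, len, "0x%" HWADDR_PRIx, addr);
}
```

For addr = 0x80000000, format_padded() writes "0000000080000000" and
format_plain() writes "0x80000000". Neither depends on the width of
target_ulong, which is why the HWADDR_ prefix describes both better.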

Signed-off-by: Philippe Mathieu-Daudé 
---
 accel/tcg/cputlb.c  |  2 +-
 hw/arm/strongarm.c  | 24 
 hw/block/pflash_cfi01.c |  2 +-
 hw/char/digic-uart.c|  4 ++--
 hw/char/etraxfs_ser.c   |  4 ++--
 hw/core/loader.c|  8 
 hw/core/sysbus.c|  4 ++--
 hw/display/cirrus_vga.c |  4 ++--
 hw/display/g364fb.c |  4 ++--
 hw/display/vga.c|  8 
 hw/dma/etraxfs_dma.c| 14 +++---
 hw/dma/pl330.c  | 14 +++---
 hw/dma/xilinx_axidma.c  |  4 ++--
 hw/dma/xlnx_csu_dma.c   |  4 ++--
 hw/i2c/mpc_i2c.c|  4 ++--
 hw/i386/multiboot.c |  8 
 hw/i386/xen/xen-hvm.c   |  8 
 hw/i386/xen/xen-mapcache.c  | 16 
 hw/i386/xen/xen_platform.c  |  4 ++--
 hw/intc/arm_gicv3_dist.c|  8 
 hw/intc/arm_gicv3_its.c | 14 +++---
 hw/intc/arm_gicv3_redist.c  |  8 
 hw/intc/exynos4210_combiner.c   | 10 +-
 hw/misc/auxbus.c|  2 +-
 hw/misc/ivshmem.c   |  6 +++---
 hw/misc/macio/mac_dbdma.c   |  4 ++--
 hw/misc/mst_fpga.c  |  4 ++--
 hw/net/allwinner-sun8i-emac.c   |  4 ++--
 hw/net/allwinner_emac.c |  4 ++--
 hw/net/fsl_etsec/etsec.c|  4 ++--
 hw/net/fsl_etsec/rings.c|  4 ++--
 hw/net/pcnet.c  |  4 ++--
 hw/net/rocker/rocker.c  | 26 +-
 hw/net/rocker/rocker_desc.c |  2 +-
 hw/net/xilinx_axienet.c |  4 ++--
 hw/net/xilinx_ethlite.c |  6 +++---
 hw/pci-bridge/pci_expander_bridge.c |  2 +-
 hw/pci-host/bonito.c| 14 +++---
 hw/pci-host/ppce500.c   |  4 ++--
 hw/pci/pci_host.c   |  4 ++--
 hw/ppc/ppc4xx_sdram.c   |  2 +-
 hw/rtc/exynos4210_rtc.c |  4 ++--
 hw/sh4/sh7750.c |  4 ++--
 hw/ssi/xilinx_spi.c |  4 ++--
 hw/ssi/xilinx_spips.c   |  8 
 hw/timer/digic-timer.c  |  4 ++--
 hw/timer/etraxfs_timer.c|  2 +-
 hw/timer/exynos4210_mct.c   |  2 +-
 hw/timer/exynos4210_pwm.c   |  4 ++--
 hw/virtio/virtio-mmio.c |  4 ++--
 hw/xen/xen_pt.c |  4 ++--
 include/exec/hwaddr.h   |  2 +-
 monitor/misc.c  |  2 +-
 softmmu/memory.c| 18 +-
 softmmu/memory_mapping.c|  4 ++--
 softmmu/physmem.c   | 10 +-
 target/i386/monitor.c   |  6 +++---
 target/loongarch/tlb_helper.c   |  2 +-
 target/microblaze/op_helper.c   |  2 +-
 target/mips/tcg/sysemu/tlb_helper.c |  2 +-
 target/ppc/mmu-hash32.c | 14 +++---
 target/ppc/mmu-hash64.c | 12 ++--
 target/ppc/mmu_common.c | 26 +-
 target/ppc/mmu_helper.c |  4 ++--
 target/riscv/cpu_helper.c   | 10 +-
 target/riscv/monitor.c  |  2 +-
 target/sparc/ldst_helper.c  |  6 +++---
 target/sparc/mmu_helper.c   | 10 +-
 target/tricore/helper.c |  2 +-
 69 files changed, 227 insertions(+), 227 deletions(-)

diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index 4948729917..4e040a1cb9 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -1142,7 +1142,7 @@ void tlb_set_page_full(CPUState *cpu, int mmu_idx,
 &xlat, &sz, fu

Re: [PATCH v2 4/5] util/qht: use striped locks under TSAN

2023-01-10 Thread Alex Bennée



Emilio Cota  writes:

> Fixes this tsan crash, easy to reproduce with any large enough program:
>
> $ tests/unit/test-qht
> 1..2
> ThreadSanitizer: CHECK failed: sanitizer_deadlock_detector.h:67 
> "((n_all_locks_)) < 
> (((sizeof(all_locks_with_contexts_)/sizeof((all_locks_with_contexts_)[0]" 
> (0x40, 0x40) (tid=1821568)
> #0 __tsan::CheckUnwind() 
> ../../../../src/libsanitizer/tsan/tsan_rtl.cpp:353 (libtsan.so.2+0x90034)
> #1 __sanitizer::CheckFailed(char const*, int, char const*, unsigned long 
> long, unsigned long long) 
> ../../../../src/libsanitizer/sanitizer_common/sanitizer_termination.cpp:86 
> (libtsan.so.2+0xca555)
> #2 __sanitizer::DeadlockDetectorTLS<__sanitizer::TwoLevelBitVector<1ul, 
> __sanitizer::BasicBitVector > >::addLock(unsigned long, 
> unsigned long, unsigned int) 
> ../../../../src/libsanitizer/sanitizer_common/sanitizer_deadlock_detector.h:67
>  (libtsan.so.2+0xb3616)
> #3 __sanitizer::DeadlockDetectorTLS<__sanitizer::TwoLevelBitVector<1ul, 
> __sanitizer::BasicBitVector > >::addLock(unsigned long, 
> unsigned long, unsigned int) 
> ../../../../src/libsanitizer/sanitizer_common/sanitizer_deadlock_detector.h:59
>  (libtsan.so.2+0xb3616)
> #4 __sanitizer::DeadlockDetector<__sanitizer::TwoLevelBitVector<1ul, 
> __sanitizer::BasicBitVector > 
> >::onLockAfter(__sanitizer::DeadlockDetectorTLS<__sanitizer::TwoLevelBitVector<1ul,
>  __sanitizer::BasicBitVector > >*, unsigned long, unsigned 
> int) 
> ../../../../src/libsanitizer/sanitizer_common/sanitizer_deadlock_detector.h:216
>  (libtsan.so.2+0xb3616)
> #5 __sanitizer::DD::MutexAfterLock(__sanitizer::DDCallback*, 
> __sanitizer::DDMutex*, bool, bool) 
> ../../../../src/libsanitizer/sanitizer_common/sanitizer_deadlock_detector1.cpp:169
>  (libtsan.so.2+0xb3616)
> #6 __tsan::MutexPostLock(__tsan::ThreadState*, unsigned long, unsigned 
> long, unsigned int, int) 
> ../../../../src/libsanitizer/tsan/tsan_rtl_mutex.cpp:200 
> (libtsan.so.2+0xa3382)
> #7 __tsan_mutex_post_lock 
> ../../../../src/libsanitizer/tsan/tsan_interface_ann.cpp:384 
> (libtsan.so.2+0x76bc3)
> #8 qemu_spin_lock /home/cota/src/qemu/include/qemu/thread.h:259 
> (test-qht+0x44a97)
> #9 qht_map_lock_buckets ../util/qht.c:253 (test-qht+0x44a97)
> #10 do_qht_iter ../util/qht.c:809 (test-qht+0x45f33)
> #11 qht_iter ../util/qht.c:821 (test-qht+0x45f33)
> #12 iter_check ../tests/unit/test-qht.c:121 (test-qht+0xe473)
> #13 qht_do_test ../tests/unit/test-qht.c:202 (test-qht+0xe473)
> #14 qht_test ../tests/unit/test-qht.c:240 (test-qht+0xe7c1)
> #15 test_default ../tests/unit/test-qht.c:246 (test-qht+0xe828)
> #16   (libglib-2.0.so.0+0x7daed)
> #17   (libglib-2.0.so.0+0x7d80a)
> #18   (libglib-2.0.so.0+0x7d80a)
> #19 g_test_run_suite  (libglib-2.0.so.0+0x7dfe9)
> #20 g_test_run  (libglib-2.0.so.0+0x7e055)
> #21 main ../tests/unit/test-qht.c:259 (test-qht+0xd2c6)
> #22 __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58 
> (libc.so.6+0x29d8f)
> #23 __libc_start_main_impl ../csu/libc-start.c:392 (libc.so.6+0x29e3f)
> #24 _start  (test-qht+0xdb44)
>
> Signed-off-by: Emilio Cota 
> ---
>  util/qht.c | 101 +
>  1 file changed, 87 insertions(+), 14 deletions(-)
>
> diff --git a/util/qht.c b/util/qht.c
> index 15866299e6..70cc733d5d 100644
> --- a/util/qht.c
> +++ b/util/qht.c
> @@ -151,6 +151,22 @@ struct qht_bucket {
>  
>  QEMU_BUILD_BUG_ON(sizeof(struct qht_bucket) > QHT_BUCKET_ALIGN);
>  
> +/*
> + * Under TSAN, we use striped locks instead of one lock per bucket chain.
> + * This avoids crashing under TSAN, since TSAN aborts the program if more than
> + * 64 locks are held (this is a hardcoded limit in TSAN).
> + * When resizing a QHT we grab all the buckets' locks, which can easily
> + * go over TSAN's limit. By using striped locks, we avoid this problem.
> + *
> + * Note: this number must be a power of two for easy index computation.
> + */
> +#define QHT_TSAN_BUCKET_LOCKS_BITS 4
> +#define QHT_TSAN_BUCKET_LOCKS (1 << QHT_TSAN_BUCKET_LOCKS_BITS)
> +
> +struct qht_tsan_lock {
> +QemuSpin lock;
> +} QEMU_ALIGNED(QHT_BUCKET_ALIGN);
> +
>  /**
>   * struct qht_map - structure to track an array of buckets
>   * @rcu: used by RCU. Keep it as the top field in the struct to help valgrind
> @@ -160,6 +176,7 @@ QEMU_BUILD_BUG_ON(sizeof(struct qht_bucket) > QHT_BUCKET_ALIGN);
>   * @n_added_buckets: number of added (i.e. "non-head") buckets
>   * @n_added_buckets_threshold: threshold to trigger an upward resize once the
>   * number of added buckets surpasses it.
> + * @tsan_bucket_locks: Array of striped locks to be used only under TSAN.
>   *
>   * Buckets are tracked in what we call a "map", i.e. this structure.
>   */
> @@ -169,6 +186,9 @@ struct qht_map {
>  size_t n_buckets;
>  size_t n_added_buckets;
>  size_t n_added_buckets_threshold;
> +#ifdef CONFIG_TSAN
> +
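
For readers skimming the thread, the striping idea from the quoted comment can be sketched in isolation. This is a minimal illustration and not the code from the patch: demo_stripe_index(), demo_max_locks_held() and the DEMO_* constants are hypothetical names, and the sketch assumes a bucket index is mapped to a lock by masking its low bits, which is why the lock count has to be a power of two.

```c
#include <stddef.h>

/* Hypothetical constants mirroring the QHT_TSAN_BUCKET_LOCKS_* defines. */
#define DEMO_LOCKS_BITS 4
#define DEMO_LOCKS (1u << DEMO_LOCKS_BITS)

/*
 * Map a bucket index to one of DEMO_LOCKS striped locks. Masking only
 * works because DEMO_LOCKS is a power of two.
 */
static size_t demo_stripe_index(size_t bucket_idx)
{
    return bucket_idx & (DEMO_LOCKS - 1);
}

/* Upper bound on locks held when every bucket is locked (e.g. on resize). */
static size_t demo_max_locks_held(size_t n_buckets)
{
    return n_buckets < DEMO_LOCKS ? n_buckets : DEMO_LOCKS;
}
```

With 16 stripes, locking every bucket of a large table holds at most 16 locks, which stays below the 64-lock limit mentioned in the comment.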

Re: [PATCH v5 10/11] hw/riscv/boot.c: consolidate all kernel init in riscv_load_kernel()

2023-01-10 Thread Daniel Henrique Barboza





On 1/10/23 08:43, Daniel Henrique Barboza wrote:



On 1/8/23 00:33, Bin Meng wrote:

On Mon, Jan 2, 2023 at 7:55 PM Daniel Henrique Barboza
 wrote:

The microchip_icicle_kit, sifive_u, spike and virt boards are now doing
the same steps when '-kernel' is used:

- execute load_kernel()
- load init_rd()
- write kernel_cmdline

Let's fold everything inside riscv_load_kernel() to avoid code
repetition. To not change the behavior of boards that aren't calling
riscv_load_init(), add an 'load_initrd' flag to riscv_load_kernel() and

typo: should be riscv_load_initrd()


allow these boards to opt out from initrd loading.

Cc: Palmer Dabbelt 
Signed-off-by: Daniel Henrique Barboza 
---
  hw/riscv/boot.c    | 22 +++---
  hw/riscv/microchip_pfsoc.c | 12 ++--
  hw/riscv/opentitan.c   |  2 +-
  hw/riscv/sifive_e.c    |  3 ++-
  hw/riscv/sifive_u.c    | 12 ++--
  hw/riscv/spike.c   | 11 +--
  hw/riscv/virt.c    | 12 ++--
  include/hw/riscv/boot.h    |  1 +
  8 files changed, 30 insertions(+), 45 deletions(-)


Otherwise,
Reviewed-by: Bin Meng 


Thanks!

Alistair, let me know if you want me to send another version with the commit
message typo fixed. I might as well take the chance to rebase it on
riscv-to-apply.next.


While rebasing these patches on top of riscv-to-apply.next, both sifive_u
avocado tests I've introduced here started to fail:

tests/avocado/riscv_opensbi.py:RiscvOpenSBI.test_riscv32_sifive_u: INTERRUPTED:
Test interrupted by SIGTERM\nRunner error occurred: ... (5.07 s)
 (09/18) tests/avocado/riscv_opensbi.py:RiscvOpenSBI.test_riscv64_sifive_u: 
INTERRUPTED:
Test interrupted by SIGTERM\nRunner error occurred: ... (5.05 s)


I proposed a fix here:

https://lists.gnu.org/archive/html/qemu-devel/2023-01/msg02035.html

I can re-send this series after we get that problem figured out. Otherwise we're
going to add 2 avocado tests that are failing right from the start hehe.

Thanks,

Daniel





Daniel

[PATCH 2/2] target/riscv/cpu.c: do not skip misa logic in riscv_cpu_realize()

2023-01-10 Thread Daniel Henrique Barboza

All RISCV CPUs are setting cpu->cfg during their cpu_init() functions,
meaning that there's no reason to skip all the misa validation and setup
if misa_ext was set beforehand - especially since we're setting an
updated value in set_misa() in the end.

Put this code chunk into a new riscv_cpu_validate_set_extensions()
helper and always execute it regardless of what the board set in
env->misa_ext.

This puts more responsibility on each board to init its attributes and
extensions if it isn't using the defaults.
It'll also allow realize() to do its job looking only at the extensions
enabled per se, not corner cases that some CPUs might have, and we won't
have to change multiple code paths to fix or change how extensions work.

Signed-off-by: Daniel Henrique Barboza 
---
 target/riscv/cpu.c | 485 +++--
 1 file changed, 248 insertions(+), 237 deletions(-)

diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index b8c1edb7c2..33ed59a1b6 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -631,6 +631,250 @@ static void riscv_cpu_disas_set_info(CPUState *s, disassemble_info *info)
 }
 }
 
+/*
+ * Check consistency between chosen extensions while setting
+ * cpu->cfg accordingly, doing a set_misa() in the end.
+ */
+static void riscv_cpu_validate_set_extensions(RISCVCPU *cpu, Error **errp)
+{
+CPURISCVState *env = &cpu->env;
+uint32_t ext = 0;
+
+/* Do some ISA extension error checking */
+if (cpu->cfg.ext_g && !(cpu->cfg.ext_i && cpu->cfg.ext_m &&
+cpu->cfg.ext_a && cpu->cfg.ext_f &&
+cpu->cfg.ext_d &&
+cpu->cfg.ext_icsr && cpu->cfg.ext_ifencei)) {
+warn_report("Setting G will also set IMAFD_Zicsr_Zifencei");
+cpu->cfg.ext_i = true;
+cpu->cfg.ext_m = true;
+cpu->cfg.ext_a = true;
+cpu->cfg.ext_f = true;
+cpu->cfg.ext_d = true;
+cpu->cfg.ext_icsr = true;
+cpu->cfg.ext_ifencei = true;
+}
+
+if (cpu->cfg.ext_i && cpu->cfg.ext_e) {
+error_setg(errp,
+   "I and E extensions are incompatible");
+return;
+}
+
+if (!cpu->cfg.ext_i && !cpu->cfg.ext_e) {
+error_setg(errp,
+   "Either I or E extension must be set");
+return;
+}
+
+if (cpu->cfg.ext_s && !cpu->cfg.ext_u) {
+error_setg(errp,
+   "Setting S extension without U extension is illegal");
+return;
+}
+
+if (cpu->cfg.ext_h && !cpu->cfg.ext_i) {
+error_setg(errp,
+   "H depends on an I base integer ISA with 32 x registers");
+return;
+}
+
+if (cpu->cfg.ext_h && !cpu->cfg.ext_s) {
+error_setg(errp, "H extension implicitly requires S-mode");
+return;
+}
+
+if (cpu->cfg.ext_f && !cpu->cfg.ext_icsr) {
+error_setg(errp, "F extension requires Zicsr");
+return;
+}
+
+if ((cpu->cfg.ext_zawrs) && !cpu->cfg.ext_a) {
+error_setg(errp, "Zawrs extension requires A extension");
+return;
+}
+
+if ((cpu->cfg.ext_zfh || cpu->cfg.ext_zfhmin) && !cpu->cfg.ext_f) {
+error_setg(errp, "Zfh/Zfhmin extensions require F extension");
+return;
+}
+
+if (cpu->cfg.ext_d && !cpu->cfg.ext_f) {
+error_setg(errp, "D extension requires F extension");
+return;
+}
+
+if (cpu->cfg.ext_v && !cpu->cfg.ext_d) {
+error_setg(errp, "V extension requires D extension");
+return;
+}
+
+if ((cpu->cfg.ext_zve32f || cpu->cfg.ext_zve64f) && !cpu->cfg.ext_f) {
+error_setg(errp, "Zve32f/Zve64f extensions require F extension");
+return;
+}
+
+/* Set the ISA extensions, checks should have happened above */
+if (cpu->cfg.ext_zdinx || cpu->cfg.ext_zhinx ||
+cpu->cfg.ext_zhinxmin) {
+cpu->cfg.ext_zfinx = true;
+}
+
+if (cpu->cfg.ext_zfinx) {
+if (!cpu->cfg.ext_icsr) {
+error_setg(errp, "Zfinx extension requires Zicsr");
+return;
+}
+if (cpu->cfg.ext_f) {
+error_setg(errp,
+"Zfinx cannot be supported together with F extension");
+return;
+}
+}
+
+if (cpu->cfg.ext_c) {
+cpu->cfg.ext_zca = true;
+if (cpu->cfg.ext_f && env->misa_mxl_max == MXL_RV32) {
+cpu->cfg.ext_zcf = true;
+}
+if (cpu->cfg.ext_d) {
+cpu->cfg.ext_zcd = true;
+}
+}
+
+if (env->misa_mxl_max != MXL_RV32 && cpu->cfg.ext_zcf) {
+error_setg(errp, "Zcf extension is only relevant to RV32");
+return;
+}
+
+if (!cpu->cfg.ext_f && cpu->cfg.ext_zcf) {
+error_setg(errp, "Zcf extension requires F extension");
+return;
+}
+
+if (!cpu->cfg.ext_d && cpu->cfg.ext_zcd) {
+error_setg(errp, "Zcd extension requires D extension");
+return;
+}
+
+i
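
The dependency checks in the hunk above all follow one pattern: test a pair of flags, report the first violation and stop. A minimal sketch of that pattern, assuming a reduced flag set; struct demo_cfg and demo_validate() are hypothetical names and not part of QEMU, and returning a message string stands in for error_setg():

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical subset of extension flags; not the real RISCVCPUConfig. */
struct demo_cfg {
    bool ext_f, ext_d, ext_v, ext_icsr;
};

/*
 * Return NULL if the configuration is consistent, or a message
 * describing the first violated dependency. The checks are evaluated
 * in the same order as the corresponding ones in the patch.
 */
static const char *demo_validate(const struct demo_cfg *cfg)
{
    if (cfg->ext_f && !cfg->ext_icsr) {
        return "F extension requires Zicsr";
    }
    if (cfg->ext_d && !cfg->ext_f) {
        return "D extension requires F extension";
    }
    if (cfg->ext_v && !cfg->ext_d) {
        return "V extension requires D extension";
    }
    return NULL;
}
```

Because the function returns at the first failure, a configuration with D set but F cleared reports the D/F dependency and never reaches the V check.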

[PATCH 0/2] target/riscv/cpu: fix sifive_u 32/64bits boot in riscv-to-apply.next

2023-01-10 Thread Daniel Henrique Barboza

Hi,

I found this bug when testing my avocado changes in riscv-to-apply.next.
The sifive_u board, both 32 and 64 bits, stopped booting OpenSBI. The
guest hangs indefinitely.

Git bisect points to this patch as the one that broke things:

8c3f35d25e7e98655c609b6c1e9f103b9240f8f8 is the first bad commit
commit 8c3f35d25e7e98655c609b6c1e9f103b9240f8f8
Author: Weiwei Li 
Date:   Wed Dec 28 14:20:21 2022 +0800

target/riscv: add support for Zca extension

Modify the check for C extension to Zca (C implies Zca)
(https://github.com/alistair23/qemu/commit/8c3f35d25e7e98655c609b6c1e9f103b9240f8f8)


But this patch per se isn't doing anything wrong. The root of the
problem is that this patch makes assumptions based on the previous
patch:

commit a2b409aa6cadc1ed9715e1ab916ddd3dade0ba85
Author: Weiwei Li 
Date:   Wed Dec 28 14:20:20 2022 +0800

target/riscv: add cfg properties for Zc* extension
(https://github.com/alistair23/qemu/commit/a2b409aa6cadc1ed9715e1ab916ddd3dade0ba85)

Which added a lot of logic and assumptions that are being skipped by all
the SiFive boards because, during riscv_cpu_realize(), we have this
code:

/* If only MISA_EXT is unset for misa, then set it from properties */
if (env->misa_ext == 0) {
uint32_t ext = 0;
(...)
}

In short, we have a lot of code that is being skipped by all SiFive
CPUs because these CPUs are setting a non-zero value in set_misa() in
their respective cpu_init() functions.

It's possible to just hack in and fix the SiFive problem in isolation, but
I believe we can do better and allow all of riscv_cpu_realize() to be executed
for all CPUs, regardless of what they've done during their cpu_init().
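
To make the failure mode concrete, here is a reduced sketch of the gating, not the actual QEMU code. struct demo_cpu and the demo_* functions are hypothetical, and it assumes the "C implies Zca" derivation is among the logic that sits behind the misa_ext == 0 check:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, heavily reduced CPU state; field names echo the thread. */
struct demo_cpu {
    uint32_t misa_ext;  /* non-zero if cpu_init() already called set_misa() */
    bool ext_c;
    bool ext_zca;
};

/* Stand-in for logic that lives inside the guarded block. */
static void demo_derive(struct demo_cpu *cpu)
{
    if (cpu->ext_c) {
        cpu->ext_zca = true;    /* C implies Zca */
    }
}

/* Gated realize: derivation is skipped when misa_ext was preset. */
static void demo_realize_gated(struct demo_cpu *cpu)
{
    if (cpu->misa_ext == 0) {
        demo_derive(cpu);
    }
}

/* Ungated realize: derivation always runs. */
static void demo_realize_always(struct demo_cpu *cpu)
{
    demo_derive(cpu);
}
```

A CPU that presets misa_ext leaves demo_realize_gated() with ext_zca still false even though ext_c is set, while demo_realize_always() derives it.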


Daniel Henrique Barboza (2):
  target/riscv/cpu: set cpu->cfg in register_cpu_props()
  target/riscv/cpu.c: do not skip misa logic in riscv_cpu_realize()

 target/riscv/cpu.c | 525 +
 target/riscv/cpu.h |   4 +
 2 files changed, 292 insertions(+), 237 deletions(-)

-- 
2.39.0

[PATCH 1/2] target/riscv/cpu: set cpu->cfg in register_cpu_props()

2023-01-10 Thread Daniel Henrique Barboza

There is an informal contract between the cpu_init() functions and
riscv_cpu_realize(): if cpu->env.misa_ext is zero, assume that the
default settings were loaded via register_cpu_props() and do validations
to set env.misa_ext.  If it's not zero, skip this whole process and
assume that the board somehow did everything.

At this moment, all SiFive CPUs are setting a non-zero misa_ext during
their cpu_init() and skipping a good chunk of riscv_cpu_realize().
This causes problems when the code being skipped in riscv_cpu_realize()
contains fixes or assumptions that affect all CPUs, meaning that SiFive
CPUs are missing out.

To allow this code to not be skipped anymore, all the cpu->cfg.ext_* attributes
need to be set during cpu_init() time. At this moment this is being done in
register_cpu_props(). The SiFive boards are setting their own extensions during
cpu_init() though, meaning that they don't want all the defaults from
register_cpu_props().

Let's move the contract between *_cpu_init() and riscv_cpu_realize() to
register_cpu_props(). Inside this function we'll check if cpu->env.misa_ext
was set and, if that's the case, set all relevant cpu->cfg.ext_*
attributes, and only that. Leave the 'misa_ext' = 0 case as is today,
i.e. loading all the defaults from riscv_cpu_extensions[].

register_cpu_props() can then be called by all the cpu_init() functions,
including the SiFive ones. This will make all CPUs behave more in line
with what riscv_cpu_realize() expects.

Signed-off-by: Daniel Henrique Barboza 
---
 target/riscv/cpu.c | 40 
 target/riscv/cpu.h |  4 
 2 files changed, 44 insertions(+)

diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index ee3659cc7e..b8c1edb7c2 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -262,6 +262,7 @@ static void rv64_sifive_u_cpu_init(Object *obj)
 {
 CPURISCVState *env = &RISCV_CPU(obj)->env;
 set_misa(env, MXL_RV64, RVI | RVM | RVA | RVF | RVD | RVC | RVS | RVU);
+register_cpu_props(DEVICE(obj));
 set_priv_version(env, PRIV_VERSION_1_10_0);
 }
 
@@ -271,6 +272,7 @@ static void rv64_sifive_e_cpu_init(Object *obj)
 RISCVCPU *cpu = RISCV_CPU(obj);
 
 set_misa(env, MXL_RV64, RVI | RVM | RVA | RVC | RVU);
+register_cpu_props(DEVICE(obj));
 set_priv_version(env, PRIV_VERSION_1_10_0);
 cpu->cfg.mmu = false;
 }
@@ -305,6 +307,7 @@ static void rv32_sifive_u_cpu_init(Object *obj)
 {
 CPURISCVState *env = &RISCV_CPU(obj)->env;
 set_misa(env, MXL_RV32, RVI | RVM | RVA | RVF | RVD | RVC | RVS | RVU);
+register_cpu_props(DEVICE(obj));
 set_priv_version(env, PRIV_VERSION_1_10_0);
 }
 
@@ -314,6 +317,7 @@ static void rv32_sifive_e_cpu_init(Object *obj)
 RISCVCPU *cpu = RISCV_CPU(obj);
 
 set_misa(env, MXL_RV32, RVI | RVM | RVA | RVC | RVU);
+register_cpu_props(DEVICE(obj));
 set_priv_version(env, PRIV_VERSION_1_10_0);
 cpu->cfg.mmu = false;
 }
@@ -324,6 +328,7 @@ static void rv32_ibex_cpu_init(Object *obj)
 RISCVCPU *cpu = RISCV_CPU(obj);
 
 set_misa(env, MXL_RV32, RVI | RVM | RVC | RVU);
+register_cpu_props(DEVICE(obj));
 set_priv_version(env, PRIV_VERSION_1_11_0);
 cpu->cfg.mmu = false;
 cpu->cfg.epmp = true;
@@ -335,6 +340,7 @@ static void rv32_imafcu_nommu_cpu_init(Object *obj)
 RISCVCPU *cpu = RISCV_CPU(obj);
 
 set_misa(env, MXL_RV32, RVI | RVM | RVA | RVF | RVC | RVU);
+register_cpu_props(DEVICE(obj));
 set_priv_version(env, PRIV_VERSION_1_10_0);
 cpu->cfg.mmu = false;
 }
@@ -1139,10 +1145,44 @@ static Property riscv_cpu_extensions[] = {
 DEFINE_PROP_END_OF_LIST(),
 };
 
+/*
+ * Register CPU props based on env.misa_ext. If a non-zero
+ * value was set, register only the required cpu->cfg.ext_*
+ * properties and leave. env.misa_ext = 0 means that we want
+ * all the default properties to be registered.
+ */
 static void register_cpu_props(DeviceState *dev)
 {
+RISCVCPU *cpu = RISCV_CPU(OBJECT(dev));
+uint32_t misa_ext = cpu->env.misa_ext;
 Property *prop;
 
+/*
+ * If misa_ext is not zero, set cfg properties now to
+ * allow them to be read during riscv_cpu_realize()
+ * later on.
+ */
+if (cpu->env.misa_ext != 0) {
+cpu->cfg.ext_i = misa_ext & RVI;
+cpu->cfg.ext_e = misa_ext & RVE;
+cpu->cfg.ext_m = misa_ext & RVM;
+cpu->cfg.ext_a = misa_ext & RVA;
+cpu->cfg.ext_f = misa_ext & RVF;
+cpu->cfg.ext_d = misa_ext & RVD;
+cpu->cfg.ext_v = misa_ext & RVV;
+cpu->cfg.ext_c = misa_ext & RVC;
+cpu->cfg.ext_s = misa_ext & RVS;
+cpu->cfg.ext_u = misa_ext & RVU;
+cpu->cfg.ext_h = misa_ext & RVH;
+cpu->cfg.ext_j = misa_ext & RVJ;
+
+/*
+ * We don't want to set the default riscv_cpu_extensions
+ * in this case.
+ */
+return;
+}
+
 for (prop = riscv_cpu_extensions; prop && prop->name; prop++) {
 qdev_property_add_static(dev, prop);
 }
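
The misa-to-cfg mapping in the hunk above relies on two details: each single-letter extension occupies one bit, and assigning a masked integer to a bool collapses any non-zero value to true. A standalone sketch, assuming the bit layout "letter minus 'A'"; DEMO_RV(), struct demo_ext and demo_decode_misa() are hypothetical names used only for illustration:

```c
#include <stdbool.h>
#include <stdint.h>

/* One bit per letter, 'A' at bit 0; assumed to match the RVx constants. */
#define DEMO_RV(x) ((uint32_t)1 << ((x) - 'A'))

struct demo_ext {
    bool i, m, c;
};

static struct demo_ext demo_decode_misa(uint32_t misa_ext)
{
    struct demo_ext e;

    /* Assigning a masked value to bool yields true for any non-zero bit. */
    e.i = misa_ext & DEMO_RV('I');
    e.m = misa_ext & DEMO_RV('M');
    e.c = misa_ext & DEMO_RV('C');
    return e;
}
```

If the destination fields were a narrower integer type instead of bool, a high bit could be truncated away, so the sketch depends on the fields really being bool.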

Re: [PATCH v3 1/6] migration: Allow immutable device state to be migrated early (i.e., before RAM)

2023-01-10 Thread Peter Xu

On Tue, Jan 10, 2023 at 12:52:32PM +0100, David Hildenbrand wrote:
> The following seems to work,

That looks much better at least from the diffstat pov (compared to the
existing patch 1+5 and the framework changes), thanks.

> but makes analyze-migration.py angry:
> 
> $ ./scripts/analyze-migration.py -f STATEFILE
> Traceback (most recent call last):
>   File "/home/dhildenb/git/qemu/./scripts/analyze-migration.py", line 605, in 
> 
> dump.read(dump_memory = args.memory)
>   File "/home/dhildenb/git/qemu/./scripts/analyze-migration.py", line 539, in 
> read
> classdesc = self.section_classes[section_key]
> ^
> KeyError: (':00:03.0/virtio-mem-early', 0)
> 
> 
> We need the vmdesc to create info for the device.

Migration may ignore the save entry in the "devices" section if save_state()
is not provided:

if ((!se->ops || !se->ops->save_state) && !se->vmsd) {
continue;
}

Could you try providing a shim save_state() for the new virtio-mem save
entry?

/*
 * Shim function to make sure the save entry will be dumped into "devices"
 * section, to make analyze-migration.py happy.
 */
static void virtio_mem_save_state_early(QEMUFile *file, void *opaque)
{
}

Then:

static const SaveVMHandlers vmstate_virtio_mem_device_early_ops = {
.save_setup = virtio_mem_save_setup_early,
.save_state = virtio_mem_save_state_early,
.load_state = virtio_mem_load_state_early,
};

I'm not 100% sure it'll work yet, but maybe worth trying.
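
To spell out why the shim should help, here is the quoted condition as a standalone predicate. The types below are hypothetical, reduced stand-ins for SaveVMHandlers and the save entry, not the real QEMU definitions:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced stand-ins for SaveVMHandlers and the save entry. */
struct demo_ops {
    void (*save_state)(void *opaque);
};

struct demo_entry {
    const struct demo_ops *ops;
    const void *vmsd;
};

/* Mirrors the quoted condition: true means the entry is skipped. */
static bool demo_entry_skipped(const struct demo_entry *se)
{
    return (!se->ops || !se->ops->save_state) && !se->vmsd;
}

/* Empty shim, present only so the entry is not skipped. */
static void demo_save_state_shim(void *opaque)
{
    (void)opaque;
}
```

An entry with ops but no save_state callback and no vmsd is skipped; pointing save_state at the empty shim is enough to flip the predicate.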

Thanks,

-- 
Peter Xu

Re: [RFC] Reducing NEED_CPU_H usage

2023-01-10 Thread Richard Henderson


On 12/28/22 08:16, Alessandro Di Federico wrote:

## `target_ulong`

`target_ulong` is `uint32_t` in 32-bit targets and `uint64_t` in 64-bit
targets.

Problem: This is used in many many places to represent addresses in
code that could become target-independent.

Proposed solution: we can convert it to:

 typedef uint64_t target_address;


We have other typedefs that are better for this, e.g. vaddr.

However, at some point we do want to keep some target addresses in the proper size.  For 
instance within the softmmu tlb, where CPUTLBEntry is either 16 or 32 bytes, depending.


(On the other hand, if we drop support for 32-bit hosts, as we keep threatening to do, 
then CPUTLB is always 32 bytes, and we might as well use vaddr there too.  But not until 
32-bit hosts are gone.)




The problem with this is that, if arithmetic operations are performed
on it, we might get undesired results:

 // Was: char load_data(target_ulong address)
 char load_data(target_address address) {
   char *base_address = get_base_address();
   // On a 32-bits target this would overflow, it doesn't with
   // uint64_t
   target_address real_address = address + 1;
   return *(base_address + real_address);
 }


Doesn't, or shouldn't matter, because we should never do anything like this in generic 
code.  Note that


vaddr ptr = ...;
cpu_ldl_le_data(env, ptr + offset)

does not have the problem you describe, because any overflow is truncated within the load 
function.
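
As an illustration of that truncation, a minimal sketch assuming a 32-bit guest whose load path narrows the 64-bit address before using it. demo_guest32_effective_addr() is a hypothetical helper, not a QEMU function:

```c
#include <stdint.h>

/*
 * Hypothetical load helper for a 32-bit guest: the address arrives as a
 * 64-bit value and is truncated to the guest width before use.
 */
static uint32_t demo_guest32_effective_addr(uint64_t addr)
{
    return (uint32_t)addr;
}
```

Here 0xffffffff + 1 computed in 64 bits is 0x100000000, and the narrowing brings it back to 0, which is what 32-bit arithmetic would have produced in the first place.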




## `abi_ulong`

Similar to `target_ulong`, but with alignment info.


Pardon?  There's no alignment info in abi_ulong.

The difference is that 'target_ulong' is the size of the target register, and 'abi_ulong' 
is the 'unsigned long' in the target's C ABI.  Consider e.g. x32 (x86_64 with ilp32 abi), 
for which target_ulong is 64-bit but abi_ulong is 32-bit.


This only applies to user-only, and should not matter for this project.

There *is* an 'abi_ptr' type, which is shared between softmmu and user-only, which might 
be able to be replaced by 'vaddr'.  Or 'typedef vaddr abi_ptr' in softmmu mode.  I haven't 
done a survey on that to be certain.



## `TCGv`

`TCGv` is a macro for `TCGv_i32` for 32-bit targets and `TCGv_i64`
for 64-bit targets.


The idea is that this macro should only be visible to target-specific code, and the macro 
provides the swizzling/encoding to the concrete type functions.


Problem: it makes `tcg-op.h` 


This is fine.


and, more importantly, `tcg-op.c`


This one requires some work within tcg/ to handle two target address sizes simultaneously. 
 It should not be technically difficult to solve, but it does involve adding a few TCG 
opcodes and adjusting all tcg backends.




Solution: transform current functions using them into target-specific
wrappers that dispatch to target-agnostic functions that accept
`TCGv_dyn` instead of `TCGv`:

 typedef struct {
 union {
 TCGv_i32 i32;
 TCGv_i64 i64;
 };
 bool is_64;
 } TCGv_dyn;


This forgets that both TCGv_i32 and TCGv_i64 are represented by TCGTemp, which contains 
'TCGType type' to discriminate.  This is not exposed to target/, but it's there.


Anyway, there's no need for this.


## `TARGET_` macros

These are macros that provide target-specific information.

Problem: they need to be abandoned in translation units that need to
become target agnostic.

Solution: promote them to fields of a `struct`.
Current ideas:

 TARGET_TB_PCREL -> TranslationBlock.pc_rel


I'd been thinking a bit on the cpu, but a CF_* bit works well.
It gets initialized for each TB from CPUState.tcg_cflags.
TBD where we'd initialize the new bit for each cpu...



 TARGET_PAGE_BITS -> TranslationBlock.page_bits
 TARGET_PAGE_MASK -> TranslationBlock.page_mask


You need to look at how TARGET_PAGE_BITS_VARY works.  The memory subsystem needs rewriting 
if we were to support multiple page sizes.  What we can support now is one single global 
page size, selected at startup.




 TARGET_PAGE_ALIGN -> CPUArchState.page_align
   -> DisasContextBase.page_align


This remains a trivial macro based on the variable TARGET_PAGE_MASK.



 TARGET_LONG_BITS -> TCGContext.long_bits


Yes.

I've been considering how to generalize this to arbitrary address widths, in order to 
better support ARM top-byte-ignore and RISC-V J extension (pointer masking).  But in the 
short term I'm happy with this number being exactly 64 or 32.



 TARGET_PAGE_SIZE -> ???


Remains a trivial macro based on the variable TARGET_PAGE_MASK.
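
A sketch of what "trivial macro based on the variable TARGET_PAGE_MASK" can look like once the mask is a runtime value. The demo_* names are hypothetical and the arithmetic is a generic formulation, not copied from QEMU's headers:

```c
#include <stdint.h>

/* Hypothetical runtime page mask, set once at startup. */
static uint64_t demo_page_mask;

static void demo_set_page_bits(unsigned bits)
{
    demo_page_mask = ~((UINT64_C(1) << bits) - 1);
}

/* Page size derived from the mask: the two's complement of the mask. */
static uint64_t demo_page_size(void)
{
    return -demo_page_mask;
}

/* Round an address up to the next page boundary. */
static uint64_t demo_page_align(uint64_t addr)
{
    return (addr + demo_page_size() - 1) & demo_page_mask;
}
```

With 12 page bits the mask clears the low 12 bits, the derived size is 4096, and 4097 aligns up to 8192.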


 TCG_OVERSIZED_GUEST -> ???


Goes away if we drop support for 32-bit hosts, or restrict 32-bit hosts to 32-bit guests. 
I have no other good ideas.




 TARGET_FMT_lx -> ???


VADDR_PRIx, mostly.  May need resolving on a case-by-case basis.


 CPU_RESOLVING_TYPE -> ???


Would need to be part of the per-target shared library interface.


## `CPUState`

`CPUState` is a t

Re: [PATCH 1/1] hw/arm/sbsa-ref.c: Start APs powered off

2023-01-10 Thread Rebecca Cran


On 1/5/23 10:34, Peter Maydell wrote:


This board disables QEMU's own PSCI implementation and relies on
a guest EL3 firmware to provide PSCI. So how will that EL3
firmware implement the "power on" to bring up the secondaries?
QEMU has the APIs to allow implementation of a model of a
hardware power controller (target/arm/arm-powerctl.h) but
as far as I can see the sbsa-ref board doesn't yet implement
one, so if you start the CPUs in the powered-off state there's
no way for them ever to be powered on.


Sorry, I've been working on a machine where the power controller _was_ 
implemented so I missed that that's not present in sbsa-ref.


--
Rebecca Cran

Re: intermittent hang, s390x host, bios-tables-test test, TPM

2023-01-10 Thread Stefan Berger





On 1/10/23 14:27, Daniel P. Berrangé wrote:

On Tue, Jan 10, 2023 at 01:50:26PM -0500, Stefan Berger wrote:



On 1/6/23 10:16, Stefan Berger wrote:

This here seems to be the root cause. An unknown control channel
command was received from the TPM emulator backend by the control channel 
thread and we end up in g_assert_not_reached().

https://github.com/qemu/qemu/blob/master/tests/qtest/tpm-emu.c#L189



      ret = qio_channel_read(ioc, (char *)&cmd, sizeof(cmd), NULL);
      if (ret <= 0) {
      break;
      }

      cmd = be32_to_cpu(cmd);
      switch (cmd) {
   [...]
      default:
      g_debug("unimplemented %u", cmd);
      g_assert_not_reached();    <--
      }

I will run this test case in an endless loop on an x86_64 host and see what we 
get there ...


I could not recreate the issue running the test on a ppc64 and x86_64
host. There were >100k test runs on ppc64 and >40k on x86_64. Also
simulating the reception of an unsupported command did not lead to a
hang like shown here.


Assuming your ppc64 host is running a little-endian OS, and
we're only seeing the test failure on s390x, then it points towards
the problem being an endianness issue in the TPM code. Something
missing a byteswap somewhere along the way?


Yes, my ppc64 machine is also little endian. If the issue were a permanent
failure rather than an intermittent one, I would look for something like that.
I would think it's more some sort of initialization issue, like a value on the
stack that is occasionally set to an undesirable value, maybe even in a
dependency.

   Stefan




With regards,
Daniel

Re: intermittent hang, s390x host, bios-tables-test test, TPM

2023-01-10 Thread Daniel P . Berrangé

On Fri, Jan 06, 2023 at 10:16:36AM -0500, Stefan Berger wrote:
> 
> 
> On 1/6/23 07:10, Peter Maydell wrote:
> > I'm seeing an intermittent hang on the s390 CI runner in the
> > bios-tables-test test. It looks like we've deadlocked because:
> > 
> >   * the TPM device is waiting for data on its socket that never arrives,
> > and it's holding the iothread lock
> >   * QEMU is therefore not making forward progress;
> > in particular it is unable to handle qtest queries/responses
> >   * the test binary thread 1 is waiting to get a response to its
> > qtest command, which is not going to arrive
> >   * test binary thread 3 (tpm_emu_ctrl_thread) has hit an
> > assertion and is trying to kill QEMU via qtest_kill_qemu()
> >   * qtest_kill_qemu() is only a "SIGTERM and wait", so will wait
> > forever, because QEMU won't respond to the SIGTERM while it's
> > blocked waiting for the TPM device to release the iothread lock
> >   * because the ctrl-thread is waiting for QEMU to exit, it's never
> > going to send the data that would unblock the TPM device emulation
> > 
> [...]
> 
> > 
> > Thread 3 (Thread 0x3ff8dafe900 (LWP 2661316)):
> > #0  0x03ff8e9c6002 in __GI___wait4 (pid=,
> > stat_loc=stat_loc@entry=0x2aa0b42c9bc, options=,
> > usage=usage@entry=0x0) at ../sysdeps/unix/sysv/linux/wait4.c:27
> > #1  0x03ff8e9c5f72 in __GI___waitpid (pid=,
> > stat_loc=stat_loc@entry=0x2aa0b42c9bc, options=options@entry=0) at
> > waitpid.c:38
> > #2  0x02aa0952a516 in qtest_wait_qemu (s=0x2aa0b42c9b0) at
> > ../tests/qtest/libqtest.c:206
> > #3  0x02aa0952a58a in qtest_kill_qemu (s=0x2aa0b42c9b0) at
> > ../tests/qtest/libqtest.c:229
> > #4  0x03ff8f0c288e in g_hook_list_invoke () from
> > /lib/s390x-linux-gnu/libglib-2.0.so.0
> > #5  
> > #6  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
> > #7  0x03ff8e9240a2 in __GI_abort () at abort.c:79
> > #8  0x03ff8f0feda8 in g_assertion_message () from
> > /lib/s390x-linux-gnu/libglib-2.0.so.0
> > #9  0x03ff8f0fedfe in g_assertion_message_expr () from
> > /lib/s390x-linux-gnu/libglib-2.0.so.0
> > #10 0x02aa09522904 in tpm_emu_ctrl_thread (data=0x3fff5ffa160) at
> > ../tests/qtest/tpm-emu.c:189
> 
> This here seems to be the root cause. An unknown control channel command was 
> received from the TPM emulator backend by the control channel thread and we 
> end up in g_assert_not_reached().
> 
> https://github.com/qemu/qemu/blob/master/tests/qtest/tpm-emu.c#L189
> 
> 
> 
> ret = qio_channel_read(ioc, (char *)&cmd, sizeof(cmd), NULL);
> if (ret <= 0) {
> break;
> }
> 
> cmd = be32_to_cpu(cmd);
> switch (cmd) {
>  [...]
> default:
> g_debug("unimplemented %u", cmd);
> g_assert_not_reached();<--
> }
> 
> I will run this test case in an endless loop on an x86_64 host and see what 
> we get there ...

The QEMU stack trace shows:

#7  0x02aa1224a2ca in tpm_emulator_cancel_cmd (tb=)
at ../backends/tpm/tpm_emulator.c:500
#8  0x02aa121e68c4 in tpm_tis_mmio_write (opaque=0x2aa1529ec20,
addr=24, val=64, size=) at
../hw/tpm/tpm_tis_common.c:663


IOW, we're getting CMD_CANCEL_TPM_CMD, which is indeed not handled
by any 'case:' in the switch in qtest/tpm-emu.c
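
A reduced sketch of that decode-and-dispatch path, to show how a command without a case lands in the default branch regardless of host endianness. The command codes are placeholders for this sketch, not the real tpm_ioctl.h values, and the demo_* names are hypothetical:

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder command codes for this sketch; not the real values. */
enum demo_cmd {
    DEMO_CMD_INIT = 1,
    DEMO_CMD_SHUTDOWN = 2,
    DEMO_CMD_CANCEL = 3,   /* deliberately has no case below */
};

/* Decode a big-endian 32-bit value from wire bytes on any host. */
static uint32_t demo_be32_decode(const uint8_t b[4])
{
    return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
           ((uint32_t)b[2] << 8) | (uint32_t)b[3];
}

/* Return true if the command has a handler; false marks the default branch. */
static bool demo_cmd_handled(uint32_t cmd)
{
    switch (cmd) {
    case DEMO_CMD_INIT:
    case DEMO_CMD_SHUTDOWN:
        return true;
    default:
        return false;
    }
}
```

The byte-wise decode gives the same result on big- and little-endian hosts, so in this sketch the unhandled command reaches the default branch on either kind of host.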


With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

[PATCH 01/26] scripts/ci: update gitlab-runner playbook to use latest runner

2023-01-10 Thread Alex Bennée

We were using quite an old runner on our machines and running into
issues with stalling jobs. In the meantime GitLab now reliably provides
the latest packaged versions of the runner under a stable URL. This
update:

  - creates a per-arch subdir for builds
  - switches from binary tarballs to deb packages
  - re-uses the same binary for the secondary runner
  - updates distro check for second to 22.04

Note this script isn't fully idempotent as we end up accumulating
runners especially during testing. However we also want to be able to
run twice with different GitLab keys (e.g. project and personal) so I
think we just have to be mindful of that during testing.

Signed-off-by: Alex Bennée 

---
v2
  - only register aarch32 runner, move service start post both registers
  - tested on s390x
---
 scripts/ci/setup/gitlab-runner.yml | 56 +++---
 scripts/ci/setup/vars.yml.template |  2 --
 2 files changed, 13 insertions(+), 45 deletions(-)

diff --git a/scripts/ci/setup/gitlab-runner.yml 
b/scripts/ci/setup/gitlab-runner.yml
index 33128be85d..95d4199c03 100644
--- a/scripts/ci/setup/gitlab-runner.yml
+++ b/scripts/ci/setup/gitlab-runner.yml
@@ -50,60 +50,30 @@
 
 - name: Download the matching gitlab-runner
   get_url:
-dest: /usr/local/bin/gitlab-runner
-url: "https://s3.amazonaws.com/gitlab-runner-downloads/v{{ 
gitlab_runner_version  }}/binaries/gitlab-runner-{{ gitlab_runner_os }}-{{ 
gitlab_runner_arch }}"
-owner: gitlab-runner
-group: gitlab-runner
-mode: u=rwx,g=rwx,o=rx
-
-- name: Register the gitlab-runner
-  command: "/usr/local/bin/gitlab-runner register --non-interactive --url 
{{ gitlab_runner_server_url }} --registration-token {{ 
gitlab_runner_registration_token }} --executor shell --tag-list {{ 
ansible_facts[\"architecture\"] }},{{ ansible_facts[\"distribution\"]|lower 
}}_{{ ansible_facts[\"distribution_version\"] }} --description '{{ 
ansible_facts[\"distribution\"] }} {{ ansible_facts[\"distribution_version\"] 
}} {{ ansible_facts[\"architecture\"] }} ({{ ansible_facts[\"os_family\"] }})'"
-
-- name: Install the gitlab-runner service using its own functionality
-  command: /usr/local/bin/gitlab-runner install --user gitlab-runner 
--working-directory /home/gitlab-runner
-  register: gitlab_runner_install_service_result
-  failed_when: "gitlab_runner_install_service_result.rc != 0 and \"already 
exists\" not in gitlab_runner_install_service_result.stderr"
+dest: "/root/"
+url: 
"https://gitlab-runner-downloads.s3.amazonaws.com/latest/deb/gitlab-runner_{{ 
gitlab_runner_arch }}.deb"
 
-- name: Enable the gitlab-runner service
-  service:
-name: gitlab-runner
-state: started
-enabled: yes
+- name: Install gitlab-runner via package manager
+  apt: deb="/root/gitlab-runner_{{ gitlab_runner_arch }}.deb"
 
-- name: Download secondary gitlab-runner
-  get_url:
-dest: /usr/local/bin/gitlab-runner-arm
-url: "https://s3.amazonaws.com/gitlab-runner-downloads/v{{ 
gitlab_runner_version  }}/binaries/gitlab-runner-{{ gitlab_runner_os }}-arm"
-owner: gitlab-runner
-group: gitlab-runner
-mode: u=rwx,g=rwx,o=rx
-  when:
-- ansible_facts['distribution'] == 'Ubuntu'
-- ansible_facts['architecture'] == 'aarch64'
-- ansible_facts['distribution_version'] == '20.04'
+- name: Register the gitlab-runner
+  command: "/usr/bin/gitlab-runner register --non-interactive --url {{ 
gitlab_runner_server_url }} --registration-token {{ 
gitlab_runner_registration_token }} --executor shell --tag-list {{ 
ansible_facts[\"architecture\"] }},{{ ansible_facts[\"distribution\"]|lower 
}}_{{ ansible_facts[\"distribution_version\"] }} --description '{{ 
ansible_facts[\"distribution\"] }} {{ ansible_facts[\"distribution_version\"] 
}} {{ ansible_facts[\"architecture\"] }} ({{ ansible_facts[\"os_family\"] }})'"
 
+# The secondary runner will still run under the single gitlab-runner 
service
 - name: Register secondary gitlab-runner
-  command: "/usr/local/bin/gitlab-runner-arm register --non-interactive 
--url {{ gitlab_runner_server_url }} --registration-token {{ 
gitlab_runner_registration_token }} --executor shell --tag-list aarch32,{{ 
ansible_facts[\"distribution\"]|lower }}_{{ 
ansible_facts[\"distribution_version\"] }} --description '{{ 
ansible_facts[\"distribution\"] }} {{ ansible_facts[\"distribution_version\"] 
}} {{ ansible_facts[\"architecture\"] }} ({{ ansible_facts[\"os_family\"] }})'"
+  command: "/usr/bin/gitlab-runner register --non-interactive --url {{ 
gitlab_runner_server_url }} --registration-token {{ 
gitlab_runner_registration_token }} --executor shell --tag-list aarch32,{{ 
ansible_facts[\"distribution\"]|lower }}_{{ 
ansible_facts[\"distribution_version\"] }} --description '{{ 
ansible_facts[\"distribution\"] }} {{ ansible_facts[\"distribution_version\"] 
}} {{ ansib

[PATCH 03/26] gitlab: just use plain --cc=clang for custom runner build

2023-01-10 Thread Alex Bennée

I think this was because older Ubuntu releases didn't alias clang to
whatever the latest version was. They do now, so let's use that and not break.

Signed-off-by: Alex Bennée 
---
 .gitlab-ci.d/custom-runners/ubuntu-22.04-aarch64.yml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/.gitlab-ci.d/custom-runners/ubuntu-22.04-aarch64.yml 
b/.gitlab-ci.d/custom-runners/ubuntu-22.04-aarch64.yml
index abeb33eaff..725ca8ffea 100644
--- a/.gitlab-ci.d/custom-runners/ubuntu-22.04-aarch64.yml
+++ b/.gitlab-ci.d/custom-runners/ubuntu-22.04-aarch64.yml
@@ -81,7 +81,7 @@ ubuntu-22.04-aarch64-clang:
  script:
  - mkdir build
  - cd build
- - ../configure --disable-libssh --cc=clang-10 --cxx=clang++-10 
--enable-sanitizers
+ - ../configure --disable-libssh --cc=clang --cxx=clang++ --enable-sanitizers
|| { cat config.log meson-logs/meson-log.txt; exit 1; }
  - make --output-sync -j`nproc --ignore=40`
  - make --output-sync -j`nproc --ignore=40` check
-- 
2.34.1

Re: intermittent hang, s390x host, bios-tables-test test, TPM

2023-01-10 Thread Daniel P . Berrangé

On Fri, Jan 06, 2023 at 03:39:31PM +, Peter Maydell wrote:
> On Fri, 6 Jan 2023 at 15:16, Stefan Berger  wrote:
> >
> >
> >
> > On 1/6/23 07:10, Peter Maydell wrote:
> > > I'm seeing an intermittent hang on the s390 CI runner in the
> > > bios-tables-test test. It looks like we've deadlocked because:
> > >
> > >   * the TPM device is waiting for data on its socket that never arrives,
> > > and it's holding the iothread lock
> > >   * QEMU is therefore not making forward progress;
> > > in particular it is unable to handle qtest queries/responses
> > >   * the test binary thread 1 is waiting to get a response to its
> > > qtest command, which is not going to arrive
> > >   * test binary thread 3 (tpm_emu_ctrl_thread) has hit an
> > > assertion and is trying to kill QEMU via qtest_kill_qemu()
> > >   * qtest_kill_qemu() is only a "SIGTERM and wait", so will wait
> > > forever, because QEMU won't respond to the SIGTERM while it's
> > > blocked waiting for the TPM device to release the iothread lock
> > >   * because the ctrl-thread is waiting for QEMU to exit, it's never
> > > going to send the data that would unblock the TPM device emulation
> > >
> > [...]
> >
> > >
> > > Thread 3 (Thread 0x3ff8dafe900 (LWP 2661316)):
> > > #0  0x03ff8e9c6002 in __GI___wait4 (pid=,
> > > stat_loc=stat_loc@entry=0x2aa0b42c9bc, options=,
> > > usage=usage@entry=0x0) at ../sysdeps/unix/sysv/linux/wait4.c:27
> > > #1  0x03ff8e9c5f72 in __GI___waitpid (pid=,
> > > stat_loc=stat_loc@entry=0x2aa0b42c9bc, options=options@entry=0) at
> > > waitpid.c:38
> > > #2  0x02aa0952a516 in qtest_wait_qemu (s=0x2aa0b42c9b0) at
> > > ../tests/qtest/libqtest.c:206
> > > #3  0x02aa0952a58a in qtest_kill_qemu (s=0x2aa0b42c9b0) at
> > > ../tests/qtest/libqtest.c:229
> > > #4  0x03ff8f0c288e in g_hook_list_invoke () from
> > > /lib/s390x-linux-gnu/libglib-2.0.so.0
> > > #5  
> > > #6  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
> > > #7  0x03ff8e9240a2 in __GI_abort () at abort.c:79
> > > #8  0x03ff8f0feda8 in g_assertion_message () from
> > > /lib/s390x-linux-gnu/libglib-2.0.so.0
> > > #9  0x03ff8f0fedfe in g_assertion_message_expr () from
> > > /lib/s390x-linux-gnu/libglib-2.0.so.0
> > > #10 0x02aa09522904 in tpm_emu_ctrl_thread (data=0x3fff5ffa160) at
> > > ../tests/qtest/tpm-emu.c:189
> >
> > This here seems to be the root cause. An unknown control channel command
> > was received from the TPM emulator backend by the control channel thread
> > and we end up in g_assert_not_reached().
> 
> Yeah. It would be good if we didn't deadlock without printing
> the assertion, though...
> 
> I guess we could improve qtest_kill_qemu() so it doesn't wait
> indefinitely for QEMU to exit but instead sends a SIGKILL 20
> seconds after the SIGTERM. (Annoyingly, there is no convenient
> "waitpid but with a timeout" function...)

We don't need to touch that. Instead the tpm-emu.c file needs to
call qtest_add_abrt_handler(), passing a callback that will invoke
qio_channel_close() on its end of the socket. This will cause the
QEMU process to get EOF on the other end of the socket. It then
won't be stuck holding the iothread lock, and will be able to
respond to SIGTERM.
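
The underlying socket behaviour can be shown with a minimal sketch that uses
plain POSIX calls instead of the QEMU I/O channel and libqtest APIs named
above. The function name is hypothetical, and the close() call only stands in
for what an abort handler would do on the test side.

```c
#include <assert.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Create a connected socket pair, close one end, and read from the other.
 * Returns the result of read(): 0 means the reader saw EOF instead of
 * blocking, -1 means setup or the read itself failed.
 */
static ssize_t read_after_peer_close(void)
{
    int sv[2];
    char byte;
    ssize_t n;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0) {
        return -1;
    }

    /* Stand-in for the abort handler closing the emulator's end. */
    close(sv[0]);

    /* Stand-in for the QEMU side waiting for data on the socket. */
    n = read(sv[1], &byte, sizeof(byte));

    close(sv[1]);
    return n;
}
```

Because the read returns 0 immediately once the peer is closed, the reader
has a chance to unwind and release whatever lock it holds.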

With regards,
Daniel

Re: intermittent hang, s390x host, bios-tables-test test, TPM

2023-01-10 Thread Daniel P . Berrangé

On Tue, Jan 10, 2023 at 01:50:26PM -0500, Stefan Berger wrote:
> 
> 
> On 1/6/23 10:16, Stefan Berger wrote:
> > This here seems to be the root cause. An unknown control channel
> > command was received from the TPM emulator backend by the control channel
> > thread and we end up in g_assert_not_reached().
> > 
> > https://github.com/qemu/qemu/blob/master/tests/qtest/tpm-emu.c#L189
> > 
> > 
> > 
> >      ret = qio_channel_read(ioc, (char *)&cmd, sizeof(cmd), NULL);
> >      if (ret <= 0) {
> >      break;
> >      }
> > 
> >      cmd = be32_to_cpu(cmd);
> >      switch (cmd) {
> >   [...]
> >      default:
> >      g_debug("unimplemented %u", cmd);
> >      g_assert_not_reached();    <--
> >      }
> > 
> > I will run this test case in an endless loop on an x86_64 host and see what 
> > we get there ...
> 
> I could not recreate the issue running the test on ppc64 and x86_64
> hosts. There were >100k test runs on ppc64 and >40k on x86_64. Also,
> simulating the reception of an unsupported command did not lead to a
> hang like the one shown here.

Assuming your ppc64 host is running a little-endian OS, and
we're only seeing the test failure on s390x, then it points towards
the problem being an endianness issue in the TPM code. Something
missing a byteswap somewhere along the way?
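
As a reminder of how such a bug hides on one class of host, here is a small
self-contained sketch. The helper names are hypothetical and it does not
claim to locate the actual problem; it only shows that a value read without
conversion matches the big-endian wire value on big-endian hosts and differs
on little-endian ones.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Decode a 32-bit big-endian wire value; correct on any host. */
static uint32_t load_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Reinterpret the same bytes in host order, i.e. with the byteswap missing. */
static uint32_t load_host32(const unsigned char *p)
{
    uint32_t v;

    memcpy(&v, p, sizeof(v));
    return v;
}

/* Returns 1 on a big-endian host, 0 on a little-endian one. */
static int host_is_big_endian(void)
{
    const unsigned char one[4] = { 0, 0, 0, 1 };

    return load_host32(one) == 1;
}
```

For the wire bytes 00 00 00 09, load_be32() yields 9 everywhere, while
load_host32() yields 9 only where host_is_big_endian() returns 1. A missing
or doubled conversion would therefore behave differently on s390x than on
x86_64 or little-endian ppc64.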


With regards,
Daniel

[PATCH v6 31/51] hw/xen: Implement EVTCHNOP_bind_virq

2023-01-10 Thread David Woodhouse

From: David Woodhouse 

Add the array of virq ports to each vCPU so that we can deliver timers,
debug ports, etc. Global virqs are allocated against vCPU 0 initially,
but can be migrated to other vCPUs (when we implement that).

The kernel needs to know about VIRQ_TIMER in order to accelerate timers,
so tell it via KVM_XEN_VCPU_ATTR_TYPE_TIMER. Also save/restore the value
of the singleshot timer across migration, as the kernel will handle the
hypercalls automatically now.

Signed-off-by: David Woodhouse 
---
 hw/i386/kvm/xen_evtchn.c  | 85 
 hw/i386/kvm/xen_evtchn.h  |  2 +
 include/sysemu/kvm_xen.h  |  1 +
 target/i386/cpu.h |  4 ++
 target/i386/kvm/xen-emu.c | 91 +++
 target/i386/machine.c |  2 +
 6 files changed, 185 insertions(+)

diff --git a/hw/i386/kvm/xen_evtchn.c b/hw/i386/kvm/xen_evtchn.c
index 0e5b33e417..4942663ddf 100644
--- a/hw/i386/kvm/xen_evtchn.c
+++ b/hw/i386/kvm/xen_evtchn.c
@@ -237,6 +237,11 @@ static bool valid_port(evtchn_port_t port)
 }
 }
 
+static bool valid_vcpu(uint32_t vcpu)
+{
+return !!qemu_get_cpu(vcpu);
+}
+
 int xen_evtchn_status_op(struct evtchn_status *status)
 {
 XenEvtchnState *s = xen_evtchn_singleton;
@@ -487,6 +492,43 @@ static void free_port(XenEvtchnState *s, evtchn_port_t 
port)
 clear_port_pending(s, port);
 }
 
+static int allocate_port(XenEvtchnState *s, uint32_t vcpu, uint16_t type,
+ uint16_t val, evtchn_port_t *port)
+{
+evtchn_port_t p = 1;
+
+for (p = 1; valid_port(p); p++) {
+if (s->port_table[p].type == EVTCHNSTAT_closed) {
+s->port_table[p].vcpu = vcpu;
+s->port_table[p].type = type;
+s->port_table[p].type_val = val;
+
+*port = p;
+
+if (s->nr_ports < p + 1) {
+s->nr_ports = p + 1;
+}
+
+return 0;
+}
+}
+return -ENOSPC;
+}
+
+static bool virq_is_global(uint32_t virq)
+{
+switch (virq) {
+case VIRQ_TIMER:
+case VIRQ_DEBUG:
+case VIRQ_XENOPROF:
+case VIRQ_XENPMU:
+return false;
+
+default:
+return true;
+}
+}
+
 static int close_port(XenEvtchnState *s, evtchn_port_t port)
 {
 XenEvtchnPort *p = &s->port_table[port];
@@ -495,6 +537,11 @@ static int close_port(XenEvtchnState *s, evtchn_port_t 
port)
 case EVTCHNSTAT_closed:
 return -ENOENT;
 
+case EVTCHNSTAT_virq:
+kvm_xen_set_vcpu_virq(virq_is_global(p->type_val) ? 0 : p->vcpu,
+  p->type_val, 0);
+break;
+
 default:
 break;
 }
@@ -546,3 +593,41 @@ int xen_evtchn_unmask_op(struct evtchn_unmask *unmask)
 
 return ret;
 }
+
+int xen_evtchn_bind_virq_op(struct evtchn_bind_virq *virq)
+{
+XenEvtchnState *s = xen_evtchn_singleton;
+int ret;
+
+if (!s) {
+return -ENOTSUP;
+}
+
+if (virq->virq >= NR_VIRQS) {
+return -EINVAL;
+}
+
+/* Global VIRQ must be allocated on vCPU0 first */
+if (virq_is_global(virq->virq) && virq->vcpu != 0) {
+return -EINVAL;
+}
+
+if (!valid_vcpu(virq->vcpu)) {
+return -ENOENT;
+}
+
+qemu_mutex_lock(&s->port_lock);
+
+ret = allocate_port(s, virq->vcpu, EVTCHNSTAT_virq, virq->virq,
+&virq->port);
+if (!ret) {
+ret = kvm_xen_set_vcpu_virq(virq->vcpu, virq->virq, virq->port);
+if (ret) {
+free_port(s, virq->port);
+}
+}
+
+qemu_mutex_unlock(&s->port_lock);
+
+return ret;
+}
diff --git a/hw/i386/kvm/xen_evtchn.h b/hw/i386/kvm/xen_evtchn.h
index 69c6b0d743..0ea13dda3a 100644
--- a/hw/i386/kvm/xen_evtchn.h
+++ b/hw/i386/kvm/xen_evtchn.h
@@ -18,8 +18,10 @@ int xen_evtchn_set_callback_param(uint64_t param);
 struct evtchn_status;
 struct evtchn_close;
 struct evtchn_unmask;
+struct evtchn_bind_virq;
 int xen_evtchn_status_op(struct evtchn_status *status);
 int xen_evtchn_close_op(struct evtchn_close *close);
 int xen_evtchn_unmask_op(struct evtchn_unmask *unmask);
+int xen_evtchn_bind_virq_op(struct evtchn_bind_virq *virq);
 
 #endif /* QEMU_XEN_EVTCHN_H */
diff --git a/include/sysemu/kvm_xen.h b/include/sysemu/kvm_xen.h
index 2192ceea10..b2bcacd761 100644
--- a/include/sysemu/kvm_xen.h
+++ b/include/sysemu/kvm_xen.h
@@ -22,6 +22,7 @@
 uint32_t kvm_xen_get_caps(void);
 void *kvm_xen_get_vcpu_info_hva(uint32_t vcpu_id);
 void kvm_xen_inject_vcpu_callback_vector(uint32_t vcpu_id, int type);
+int kvm_xen_set_vcpu_virq(uint32_t vcpu_id, uint16_t virq, uint16_t port);
 
 #define kvm_xen_has_cap(cap) (!!(kvm_xen_get_caps() &   \
  KVM_XEN_HVM_CONFIG_ ## cap))
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index c9b12e7476..dba8732fc6 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -27,6 +27,8 @@
 #include "qapi/qapi-types-common.h"
 #include "qemu/cpu-float.h"
 
+#define XEN_NR_VIRQS 24
+
 /* The x86 has a strong

[PATCH v6 37/51] hw/xen: Implement EVTCHNOP_reset

2023-01-10 Thread David Woodhouse

From: David Woodhouse 

Signed-off-by: David Woodhouse 
---
 hw/i386/kvm/xen_evtchn.c  | 29 +
 hw/i386/kvm/xen_evtchn.h  |  3 +++
 target/i386/kvm/xen-emu.c | 17 +
 3 files changed, 49 insertions(+)

diff --git a/hw/i386/kvm/xen_evtchn.c b/hw/i386/kvm/xen_evtchn.c
index ad75cddc5e..6b6df39978 100644
--- a/hw/i386/kvm/xen_evtchn.c
+++ b/hw/i386/kvm/xen_evtchn.c
@@ -738,6 +738,35 @@ static int close_port(XenEvtchnState *s, evtchn_port_t 
port)
 return 0;
 }
 
+int xen_evtchn_soft_reset(void)
+{
+XenEvtchnState *s = xen_evtchn_singleton;
+int i;
+
+if (!s) {
+return -ENOTSUP;
+}
+
+qemu_mutex_lock(&s->port_lock);
+
+for (i = 0; i < s->nr_ports; i++) {
+close_port(s, i);
+}
+
+qemu_mutex_unlock(&s->port_lock);
+
+return 0;
+}
+
+int xen_evtchn_reset_op(struct evtchn_reset *reset)
+{
+if (reset->dom != DOMID_SELF && reset->dom != xen_domid) {
+return -ESRCH;
+}
+
+return xen_evtchn_soft_reset();
+}
+
 int xen_evtchn_close_op(struct evtchn_close *close)
 {
 XenEvtchnState *s = xen_evtchn_singleton;
diff --git a/hw/i386/kvm/xen_evtchn.h b/hw/i386/kvm/xen_evtchn.h
index 486b031c82..5d3e03553f 100644
--- a/hw/i386/kvm/xen_evtchn.h
+++ b/hw/i386/kvm/xen_evtchn.h
@@ -13,6 +13,7 @@
 #define QEMU_XEN_EVTCHN_H
 
 void xen_evtchn_create(void);
+int xen_evtchn_soft_reset(void);
 int xen_evtchn_set_callback_param(uint64_t param);
 
 struct evtchn_status;
@@ -24,6 +25,7 @@ struct evtchn_send;
 struct evtchn_alloc_unbound;
 struct evtchn_bind_interdomain;
 struct evtchn_bind_vcpu;
+struct evtchn_reset;
 int xen_evtchn_status_op(struct evtchn_status *status);
 int xen_evtchn_close_op(struct evtchn_close *close);
 int xen_evtchn_unmask_op(struct evtchn_unmask *unmask);
@@ -33,5 +35,6 @@ int xen_evtchn_send_op(struct evtchn_send *send);
 int xen_evtchn_alloc_unbound_op(struct evtchn_alloc_unbound *alloc);
 int xen_evtchn_bind_interdomain_op(struct evtchn_bind_interdomain 
*interdomain);
 int xen_evtchn_bind_vcpu_op(struct evtchn_bind_vcpu *vcpu);
+int xen_evtchn_reset_op(struct evtchn_reset *reset);
 
 #endif /* QEMU_XEN_EVTCHN_H */
diff --git a/target/i386/kvm/xen-emu.c b/target/i386/kvm/xen-emu.c
index afc6d28357..730284a067 100644
--- a/target/i386/kvm/xen-emu.c
+++ b/target/i386/kvm/xen-emu.c
@@ -950,6 +950,18 @@ static bool kvm_xen_hcall_evtchn_op(struct kvm_xen_exit 
*exit, X86CPU *cpu,
 err = xen_evtchn_bind_vcpu_op(&vcpu);
 break;
 }
+case EVTCHNOP_reset: {
+struct evtchn_reset reset;
+
+qemu_build_assert(sizeof(reset) == 2);
+if (kvm_copy_from_gva(cs, arg, &reset, sizeof(reset))) {
+err = -EFAULT;
+break;
+}
+
+err = xen_evtchn_reset_op(&reset);
+break;
+}
 default:
 return false;
 }
@@ -963,6 +975,11 @@ static int kvm_xen_soft_reset(void)
 CPUState *cpu;
 int err;
 
+err = xen_evtchn_soft_reset();
+if (err) {
+return err;
+}
+
 err = xen_evtchn_set_callback_param(0);
 if (err) {
 return err;
-- 
2.35.3

[RFC PATCH v1 05/15] hw/xen: Add foreignmem operations to allow redirection to internal emulation

2023-01-10 Thread David Woodhouse

From: David Woodhouse 

Signed-off-by: David Woodhouse 
Signed-off-by: Paul Durrant 
---
 hw/char/xen_console.c|  8 ++--
 hw/display/xenfb.c   | 20 +-
 hw/xen/xen-operations.c  | 63 
 include/hw/xen/xen_backend_ops.h | 26 +
 include/hw/xen/xen_common.h  | 13 ---
 softmmu/globals.c|  1 +
 6 files changed, 105 insertions(+), 26 deletions(-)

diff --git a/hw/char/xen_console.c b/hw/char/xen_console.c
index 19ad6c946a..e9cef3e1ef 100644
--- a/hw/char/xen_console.c
+++ b/hw/char/xen_console.c
@@ -237,9 +237,9 @@ static int con_initialise(struct XenLegacyDevice *xendev)
 
 if (!xendev->dev) {
 xen_pfn_t mfn = con->ring_ref;
-con->sring = xenforeignmemory_map(xen_fmem, con->xendev.dom,
-  PROT_READ | PROT_WRITE,
-  1, &mfn, NULL);
+con->sring = qemu_xen_foreignmem_map(con->xendev.dom, NULL,
+ PROT_READ | PROT_WRITE,
+ 1, &mfn, NULL);
 } else {
 con->sring = xen_be_map_grant_ref(xendev, con->ring_ref,
   PROT_READ | PROT_WRITE);
@@ -269,7 +269,7 @@ static void con_disconnect(struct XenLegacyDevice *xendev)
 
 if (con->sring) {
 if (!xendev->dev) {
-xenforeignmemory_unmap(xen_fmem, con->sring, 1);
+qemu_xen_foreignmem_unmap(con->sring, 1);
 } else {
 xen_be_unmap_grant_ref(xendev, con->sring, con->ring_ref);
 }
diff --git a/hw/display/xenfb.c b/hw/display/xenfb.c
index 260eb38a76..2c4016fcbd 100644
--- a/hw/display/xenfb.c
+++ b/hw/display/xenfb.c
@@ -98,8 +98,9 @@ static int common_bind(struct common *c)
 if (xenstore_read_fe_int(&c->xendev, "event-channel", 
&c->xendev.remote_port) == -1)
 return -1;
 
-c->page = xenforeignmemory_map(xen_fmem, c->xendev.dom,
-   PROT_READ | PROT_WRITE, 1, &mfn, NULL);
+c->page = qemu_xen_foreignmem_map(c->xendev.dom, NULL,
+  PROT_READ | PROT_WRITE, 1, &mfn,
+  NULL);
 if (c->page == NULL)
 return -1;
 
@@ -115,7 +116,7 @@ static void common_unbind(struct common *c)
 {
 xen_pv_unbind_evtchn(&c->xendev);
 if (c->page) {
-xenforeignmemory_unmap(xen_fmem, c->page, 1);
+qemu_xen_foreignmem_unmap(c->page, 1);
 c->page = NULL;
 }
 }
@@ -500,15 +501,16 @@ static int xenfb_map_fb(struct XenFB *xenfb)
 fbmfns = g_new0(xen_pfn_t, xenfb->fbpages);
 
 xenfb_copy_mfns(mode, n_fbdirs, pgmfns, pd);
-map = xenforeignmemory_map(xen_fmem, xenfb->c.xendev.dom,
-   PROT_READ, n_fbdirs, pgmfns, NULL);
+map = qemu_xen_foreignmem_map(xenfb->c.xendev.dom, NULL, PROT_READ,
+  n_fbdirs, pgmfns, NULL);
 if (map == NULL)
 goto out;
 xenfb_copy_mfns(mode, xenfb->fbpages, fbmfns, map);
-xenforeignmemory_unmap(xen_fmem, map, n_fbdirs);
+qemu_xen_foreignmem_unmap(map, n_fbdirs);
 
-xenfb->pixels = xenforeignmemory_map(xen_fmem, xenfb->c.xendev.dom,
-PROT_READ, xenfb->fbpages, fbmfns, NULL);
+xenfb->pixels = qemu_xen_foreignmem_map(xenfb->c.xendev.dom, NULL,
+PROT_READ, xenfb->fbpages,
+fbmfns, NULL);
 if (xenfb->pixels == NULL)
 goto out;
 
@@ -927,7 +929,7 @@ static void fb_disconnect(struct XenLegacyDevice *xendev)
  *   Replacing the framebuffer with anonymous shared memory
  *   instead.  This releases the guest pages and keeps qemu happy.
  */
-xenforeignmemory_unmap(xen_fmem, fb->pixels, fb->fbpages);
+qemu_xen_foreignmem_unmap(fb->pixels, fb->fbpages);
 fb->pixels = mmap(fb->pixels, fb->fbpages * XC_PAGE_SIZE,
   PROT_READ | PROT_WRITE, MAP_SHARED | MAP_ANON,
   -1, 0);
diff --git a/hw/xen/xen-operations.c b/hw/xen/xen-operations.c
index 73dabac8e5..4c6b305cc4 100644
--- a/hw/xen/xen-operations.c
+++ b/hw/xen/xen-operations.c
@@ -22,6 +22,7 @@
  */
 #undef XC_WANT_COMPAT_EVTCHN_API
 #undef XC_WANT_COMPAT_GNTTAB_API
+#undef XC_WANT_COMPAT_MAP_FOREIGN_API
 
 #include 
 
@@ -56,10 +57,13 @@ typedef xc_gnttab xengnttab_handle;
 #define xengnttab_map_domain_grant_refs(h, c, d, r, p) \
 xc_gnttab_map_domain_grant_refs(h, c, d, r, p)
 
+typedef xc_interface xenforeignmemory_handle;
+
 #else /* CONFIG_XEN_CTRL_INTERFACE_VERSION >= 40701 */
 
 #include 
 #include 
+#include 
 
 #endif
 
@@ -218,6 +222,64 @@ static struct gnttab_backend_ops libxengnttab_backend_ops 
= {
 .unmap = libxengnttab_backend_unmap,
 };
 
+#if CONFIG_XEN_CTRL_INTERFACE_VERSION < 40701
+
+static void *libxenforeignmem_backend_map(uint32_t dom, void *addr, int prot,
+

[PULL 03/29] accel: introduce accelerator blocker API

2023-01-10 Thread Paolo Bonzini

From: Emanuele Giuseppe Esposito 

This API allows the accelerators to prevent vcpus from issuing
new ioctls while executing a critical section marked with the
accel_ioctl_inhibit_begin/end functions.

Note that all functions submitting ioctls must mark where the
ioctl is being called with accel_{cpu_}ioctl_begin/end().

This API requires the caller to always hold the BQL.
API documentation is in sysemu/accel-blocker.h

Internally, it uses a QemuLockCnt together with a per-CPU QemuLockCnt
(to minimize cache line bouncing) to prevent new ioctls from
running once the critical section starts, and a QemuEvent to wait
until all running ioctls finish.
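
The inhibit-and-drain idea can be sketched with a plain pthread mutex and
condition variable. This is a simplified stand-in, not the implementation in
this patch: it has a single global counter instead of per-CPU QemuLockCnt
instances, it does not kick vCPUs, it has no BQL fast path, and every name in
it is hypothetical. It also assumes only one inhibitor at a time, which the
real API gets from requiring the BQL.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

struct ioctl_gate {
    pthread_mutex_t lock;
    pthread_cond_t changed;
    unsigned running;   /* ioctls currently in flight */
    bool inhibited;     /* true while a critical section is active */
};

static void gate_init(struct ioctl_gate *g)
{
    pthread_mutex_init(&g->lock, NULL);
    pthread_cond_init(&g->changed, NULL);
    g->running = 0;
    g->inhibited = false;
}

/* Called right before an ioctl: blocks while a critical section is active. */
static void gate_ioctl_begin(struct ioctl_gate *g)
{
    pthread_mutex_lock(&g->lock);
    while (g->inhibited) {
        pthread_cond_wait(&g->changed, &g->lock);
    }
    g->running++;
    pthread_mutex_unlock(&g->lock);
}

/* Called right after an ioctl: wakes an inhibitor waiting for the drain. */
static void gate_ioctl_end(struct ioctl_gate *g)
{
    pthread_mutex_lock(&g->lock);
    assert(g->running > 0);
    g->running--;
    pthread_cond_broadcast(&g->changed);
    pthread_mutex_unlock(&g->lock);
}

/* Enter the critical section: stop new ioctls, then drain running ones. */
static void gate_inhibit_begin(struct ioctl_gate *g)
{
    pthread_mutex_lock(&g->lock);
    g->inhibited = true;
    while (g->running > 0) {
        pthread_cond_wait(&g->changed, &g->lock);
    }
    pthread_mutex_unlock(&g->lock);
}

/* Leave the critical section and let blocked ioctl callers proceed. */
static void gate_inhibit_end(struct ioctl_gate *g)
{
    pthread_mutex_lock(&g->lock);
    g->inhibited = false;
    pthread_cond_broadcast(&g->changed);
    pthread_mutex_unlock(&g->lock);
}
```

A vCPU thread would bracket each ioctl with gate_ioctl_begin() and
gate_ioctl_end(), while the thread changing shared state would bracket its
work with gate_inhibit_begin() and gate_inhibit_end().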

Signed-off-by: Emanuele Giuseppe Esposito 
Reviewed-by: Philippe Mathieu-Daudé 
Message-Id: <2022154758.1372674-2-eespo...@redhat.com>
Signed-off-by: Paolo Bonzini 
---
 accel/accel-blocker.c  | 154 +
 accel/meson.build  |   2 +-
 hw/core/cpu-common.c   |   2 +
 include/hw/core/cpu.h  |   3 +
 include/sysemu/accel-blocker.h |  56 
 util/meson.build   |   2 +-
 6 files changed, 217 insertions(+), 2 deletions(-)
 create mode 100644 accel/accel-blocker.c
 create mode 100644 include/sysemu/accel-blocker.h

diff --git a/accel/accel-blocker.c b/accel/accel-blocker.c
new file mode 100644
index ..1e7f423462df
--- /dev/null
+++ b/accel/accel-blocker.c
@@ -0,0 +1,154 @@
+/*
+ * Lock to inhibit accelerator ioctls
+ *
+ * Copyright (c) 2022 Red Hat Inc.
+ *
+ * Author: Emanuele Giuseppe Esposito   
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to 
deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING 
FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/thread.h"
+#include "qemu/main-loop.h"
+#include "hw/core/cpu.h"
+#include "sysemu/accel-blocker.h"
+
+static QemuLockCnt accel_in_ioctl_lock;
+static QemuEvent accel_in_ioctl_event;
+
+void accel_blocker_init(void)
+{
+qemu_lockcnt_init(&accel_in_ioctl_lock);
+qemu_event_init(&accel_in_ioctl_event, false);
+}
+
+void accel_ioctl_begin(void)
+{
+if (likely(qemu_mutex_iothread_locked())) {
+return;
+}
+
+/* block if lock is taken in kvm_ioctl_inhibit_begin() */
+qemu_lockcnt_inc(&accel_in_ioctl_lock);
+}
+
+void accel_ioctl_end(void)
+{
+if (likely(qemu_mutex_iothread_locked())) {
+return;
+}
+
+qemu_lockcnt_dec(&accel_in_ioctl_lock);
+/* change event to SET. If event was BUSY, wake up all waiters */
+qemu_event_set(&accel_in_ioctl_event);
+}
+
+void accel_cpu_ioctl_begin(CPUState *cpu)
+{
+if (unlikely(qemu_mutex_iothread_locked())) {
+return;
+}
+
+/* block if lock is taken in kvm_ioctl_inhibit_begin() */
+qemu_lockcnt_inc(&cpu->in_ioctl_lock);
+}
+
+void accel_cpu_ioctl_end(CPUState *cpu)
+{
+if (unlikely(qemu_mutex_iothread_locked())) {
+return;
+}
+
+qemu_lockcnt_dec(&cpu->in_ioctl_lock);
+/* change event to SET. If event was BUSY, wake up all waiters */
+qemu_event_set(&accel_in_ioctl_event);
+}
+
+static bool accel_has_to_wait(void)
+{
+CPUState *cpu;
+bool needs_to_wait = false;
+
+CPU_FOREACH(cpu) {
+if (qemu_lockcnt_count(&cpu->in_ioctl_lock)) {
+/* exit the ioctl, if vcpu is running it */
+qemu_cpu_kick(cpu);
+needs_to_wait = true;
+}
+}
+
+return needs_to_wait || qemu_lockcnt_count(&accel_in_ioctl_lock);
+}
+
+void accel_ioctl_inhibit_begin(void)
+{
+CPUState *cpu;
+
+/*
+ * We allow to inhibit only when holding the BQL, so we can identify
+ * when an inhibitor wants to issue an ioctl easily.
+ */
+g_assert(qemu_mutex_iothread_locked());
+
+/* Block further invocations of the ioctls outside the BQL.  */
+CPU_FOREACH(cpu) {
+qemu_lockcnt_lock(&cpu->in_ioctl_lock);
+}
+qemu_lockcnt_lock(&accel_in_ioctl_lock);
+
+/* Keep waiting until there are running ioctls */
+while (true) {
+
+/* Reset event to FREE

Re: [PATCH qemu v3] x86: don't let decompressed kernel image clobber setup_data

2023-01-10 Thread Michael S. Tsirkin

On Tue, Jan 10, 2023 at 04:34:49PM +0100, Jason A. Donenfeld wrote:
> Hi Michael,
> 
> Could you queue up this patch and mark it as a fix for 7.2.1? It is a
> straight-up bug fix for a 7.2 regression that's now affected several
> users.

OK. In the future pls cc me if you want me to merge a patch. Thanks!

> - It has two Tested-by tags on the thread.
> - hpa, the maintainer of the kernel side of this, confirmed on one of
>   the various tributary threads that this approach is a correct one.
> - It doesn't introduce any new functionality.
> 
> For your convenience, you can grab this out of lore here:
> 
>   https://lore.kernel.org/lkml/20221230220725.618763-1-ja...@zx2c4.com/
> 
> Or if you want to yolo it:
> 
>   curl 
> https://lore.kernel.org/lkml/20221230220725.618763-1-ja...@zx2c4.com/raw | 
> git am -s
> 
> It's now sat silent on the mailing list for a while. So let's please get
> this committed and backported so that the bug reports stop coming in.
> 
> Thanks,
> Jason
> 
>

[PATCH 10/26] Update lcitool and fedora to 37

2023-01-10 Thread Alex Bennée

From: Marc-André Lureau 

Fedora 35 is EOL.

Update to upstream lcitool, which dropped f35 and added f37.

Signed-off-by: Marc-André Lureau 
Reviewed-by: Thomas Huth 
Message-Id: <20230110132700.833690-7-marcandre.lur...@redhat.com>
Signed-off-by: Alex Bennée 
---
 tests/docker/dockerfiles/fedora-win32-cross.docker | 4 ++--
 tests/docker/dockerfiles/fedora-win64-cross.docker | 4 ++--
 tests/docker/dockerfiles/fedora.docker | 4 ++--
 tests/lcitool/libvirt-ci   | 2 +-
 tests/lcitool/refresh  | 6 +++---
 5 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/tests/docker/dockerfiles/fedora-win32-cross.docker 
b/tests/docker/dockerfiles/fedora-win32-cross.docker
index 75383ba185..cc5d1ac4be 100644
--- a/tests/docker/dockerfiles/fedora-win32-cross.docker
+++ b/tests/docker/dockerfiles/fedora-win32-cross.docker
@@ -1,10 +1,10 @@
 # THIS FILE WAS AUTO-GENERATED
 #
-#  $ lcitool dockerfile --layers all --cross mingw32 fedora-35 qemu
+#  $ lcitool dockerfile --layers all --cross mingw32 fedora-37 qemu
 #
 # https://gitlab.com/libvirt/libvirt-ci
 
-FROM registry.fedoraproject.org/fedora:35
+FROM registry.fedoraproject.org/fedora:37
 
 RUN dnf install -y nosync && \
 echo -e '#!/bin/sh\n\
diff --git a/tests/docker/dockerfiles/fedora-win64-cross.docker 
b/tests/docker/dockerfiles/fedora-win64-cross.docker
index 98c03dc13b..cabbf4edfc 100644
--- a/tests/docker/dockerfiles/fedora-win64-cross.docker
+++ b/tests/docker/dockerfiles/fedora-win64-cross.docker
@@ -1,10 +1,10 @@
 # THIS FILE WAS AUTO-GENERATED
 #
-#  $ lcitool dockerfile --layers all --cross mingw64 fedora-35 qemu
+#  $ lcitool dockerfile --layers all --cross mingw64 fedora-37 qemu
 #
 # https://gitlab.com/libvirt/libvirt-ci
 
-FROM registry.fedoraproject.org/fedora:35
+FROM registry.fedoraproject.org/fedora:37
 
 RUN dnf install -y nosync && \
 echo -e '#!/bin/sh\n\
diff --git a/tests/docker/dockerfiles/fedora.docker 
b/tests/docker/dockerfiles/fedora.docker
index d200c7fc10..f44b005000 100644
--- a/tests/docker/dockerfiles/fedora.docker
+++ b/tests/docker/dockerfiles/fedora.docker
@@ -1,10 +1,10 @@
 # THIS FILE WAS AUTO-GENERATED
 #
-#  $ lcitool dockerfile --layers all fedora-35 qemu
+#  $ lcitool dockerfile --layers all fedora-37 qemu
 #
 # https://gitlab.com/libvirt/libvirt-ci
 
-FROM registry.fedoraproject.org/fedora:35
+FROM registry.fedoraproject.org/fedora:37
 
 RUN dnf install -y nosync && \
 echo -e '#!/bin/sh\n\
diff --git a/tests/lcitool/libvirt-ci b/tests/lcitool/libvirt-ci
index e3eb28cf2e..319a534c22 16
--- a/tests/lcitool/libvirt-ci
+++ b/tests/lcitool/libvirt-ci
@@ -1 +1 @@
-Subproject commit e3eb28cf2e17fbcf7fe7e19505ee432b8ec5bbb5
+Subproject commit 319a534c220f53fc8670254cac25d6f662c82112
diff --git a/tests/lcitool/refresh b/tests/lcitool/refresh
index fa966e4009..a5ea0efc3b 100755
--- a/tests/lcitool/refresh
+++ b/tests/lcitool/refresh
@@ -111,7 +111,7 @@ try:
 generate_dockerfile("centos8", "centos-stream-8")
 generate_dockerfile("debian-amd64", "debian-11",
 trailer="".join(debian11_extras))
-generate_dockerfile("fedora", "fedora-35")
+generate_dockerfile("fedora", "fedora-37")
 generate_dockerfile("opensuse-leap", "opensuse-leap-153")
 generate_dockerfile("ubuntu2004", "ubuntu-2004",
 trailer="".join(ubuntu2004_tsanhack))
@@ -161,12 +161,12 @@ try:
 trailer=cross_build("s390x-linux-gnu-",
 "s390x-softmmu,s390x-linux-user"))
 
-generate_dockerfile("fedora-win32-cross", "fedora-35",
+generate_dockerfile("fedora-win32-cross", "fedora-37",
 cross="mingw32",
 trailer=cross_build("i686-w64-mingw32-",
 "i386-softmmu"))
 
-generate_dockerfile("fedora-win64-cross", "fedora-35",
+generate_dockerfile("fedora-win64-cross", "fedora-37",
 cross="mingw64",
 trailer=cross_build("x86_64-w64-mingw32-",
 "x86_64-softmmu"))
-- 
2.34.1

Re: [PATCH v4 04/11] iotests: QemuStorageDaemon: add cmd() method like in QEMUMachine.

2023-01-10 Thread John Snow

On Tue, Jan 10, 2023 at 3:38 AM Vladimir Sementsov-Ogievskiy
 wrote:
>
> Add similar method for consistency.
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
>  tests/qemu-iotests/iotests.py | 4 
>  1 file changed, 4 insertions(+)
>
> diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
> index c69b10ac82..dd08cd8a2b 100644
> --- a/tests/qemu-iotests/iotests.py
> +++ b/tests/qemu-iotests/iotests.py
> @@ -462,6 +462,10 @@ def qmp(self, cmd: str, args: Optional[Dict[str, 
> object]] = None) \
>  assert self._qmp is not None
>  return self._qmp.cmd_raw(cmd, args)
>
> +def cmd(self, cmd: str, args: Optional[Dict[str, object]] = None) \
> +-> QMPMessage:
> +return self._qmp.cmd(cmd, **args)
> +

The typing of this is off -- try "make check-dev" in qemu.git/python to see:

iotests.py:467: error: Item "None" of "Optional[QEMUMonitorProtocol]"
has no attribute "cmd"  [union-attr]
iotests.py:467: error: Argument after ** must be a mapping, not
"Optional[Dict[str, object]]"  [arg-type]
iotests.py:467: error: Incompatible return value type (got
"Union[object, Any]", expected "Dict[str, Any]")  [return-value]
Found 3 errors in 1 file (checked 32 source files)

You need to assert that self._qmp is not None for the first; the
second seems to do with a potentially "None" argument for args, and
the third has to do with the difference between returning the entire
raw response and just the return value.
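
As a rough illustration of the first two fixes (a sketch only; `FakeMonitor` and `Daemon` are hypothetical stand-ins written for this note, not the real classes and not the code in the fixup branch), something along these lines narrows the Optional types before use:

```python
from typing import Any, Dict, Optional


class FakeMonitor:
    """Hypothetical stand-in for the QMP client; not the real class."""

    def cmd(self, name: str, **kwds: object) -> Dict[str, Any]:
        # Echo the request so the example has something to return.
        return {"execute": name, "arguments": kwds}


class Daemon:
    """Minimal sketch of a wrapper that holds an optional monitor."""

    def __init__(self, qmp: Optional[FakeMonitor] = None) -> None:
        self._qmp = qmp

    def cmd(self, cmd: str,
            args: Optional[Dict[str, object]] = None) -> Dict[str, Any]:
        # Fix 1: narrow Optional[...] before touching attributes.
        assert self._qmp is not None
        # Fix 2: "**" needs a mapping, so substitute {} when args is None.
        return self._qmp.cmd(cmd, **(args or {}))
```

The third error does not show up in this sketch because the stand-in already returns a dict; in the real code the return annotation has to match whatever the underlying call hands back.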

I started making a fixup branch, but I stopped around here.
https://gitlab.com/jsnow/qemu/-/commits/vlad-iotest-patches

>  def stop(self, kill_signal=15):
>  self._p.send_signal(kill_signal)
>  self._p.wait()
> --
> 2.34.1
>

Re: [PATCH] Makefile: allow 'make uninstall'

2023-01-10 Thread Christian Borntraeger


Am 10.01.23 um 16:13 schrieb Peter Maydell:

Meson supports an "uninstall", so we can easily allow it to work by
not suppressing the forwarding of it from Make to meson.

We originally suppressed this because Meson's 'uninstall' has a hole
in it: it will remove everything that is installed by a mechanism
meson knows about, but not things installed by "custom install
scripts", and there is no "custom uninstall script" mechanism.

For QEMU, though, the only thing that was being installed by a custom
install script was the LC_MESSAGES files handled by Meson's i18n
module, and that code was fixed in Meson commit 487d45c1e5bfff0fbdb4,
which is present in Meson 0.60.0 and later.  Since we already require
a Meson version newer than that, we're now safe to enable
'uninstall', as it will now correctly uninstall everything that was
installed.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/109
Signed-off-by: Peter Maydell 


Always missed that functionality. Thanks.

Re: [PULL 06/29] target/i386: Remove compilation errors when -Werror=maybe-uninitialized

2023-01-10 Thread Eric Auger

Hi Paolo,

On 1/10/23 17:02, Paolo Bonzini wrote:
> From: Eric Auger 
>
> To avoid compilation errors when -Werror=maybe-uninitialized is used,
> replace 'case 3' by 'default'.
>
> Otherwise we get:
>
> ../target/i386/ops_sse.h: In function ‘helper_vpermdq_ymm’:
> ../target/i386/ops_sse.h:2495:13: error: ‘r3’ may be used
> uninitialized in this function [-Werror=maybe-uninitialized]
>2495 | d->Q(3) = r3;
> | ^~~~
> ../target/i386/ops_sse.h:2494:13: error: ‘r2’ may be used
> uninitialized in this function [-Werror=maybe-uninitialized]
>2494 | d->Q(2) = r2;
> | ^~~~
> ../target/i386/ops_sse.h:2493:13: error: ‘r1’ may be used
> uninitialized in this function [-Werror=maybe-uninitialized]
>2493 | d->Q(1) = r1;
> | ^~~~
> ../target/i386/ops_sse.h:2492:13: error: ‘r0’ may be used
> uninitialized in this function [-Werror=maybe-uninitialized]
>2492 | d->Q(0) = r0;
> | ^~~~
> | ^~~~
>
> Signed-off-by: Eric Auger 
> Suggested-by: Stefan Weil 
> Fixes: 790684776861 ("target/i386: reimplement 0x0f 0x3a, add AVX")
> Message-Id: <20221221163652.1239362-1-eric.au...@redhat.com>
> Signed-off-by: Paolo Bonzini 

you pulled v1 but there were additional comments afterwards and last
iteration was:
https://lore.kernel.org/all/20221222140158.1260748-1-eric.au...@redhat.com/

Thanks

Eric
> ---
>  target/i386/ops_sse.h | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h
> index 3cbc36a59d1a..c442c8c10cdc 100644
> --- a/target/i386/ops_sse.h
> +++ b/target/i386/ops_sse.h
> @@ -2466,7 +2466,7 @@ void helper_vpermdq_ymm(Reg *d, Reg *v, Reg *s, 
> uint32_t order)
>  r0 = s->Q(0);
>  r1 = s->Q(1);
>  break;
> -case 3:
> +default:
>  r0 = s->Q(2);
>  r1 = s->Q(3);
>  break;
> @@ -2484,7 +2484,7 @@ void helper_vpermdq_ymm(Reg *d, Reg *v, Reg *s, 
> uint32_t order)
>  r2 = s->Q(0);
>  r3 = s->Q(1);
>  break;
> -case 3:
> +default:
>  r2 = s->Q(2);
>  r3 = s->Q(3);
>  break;
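
For readers wondering why swapping 'case 3' for 'default' does not change behaviour: if the selector is a 2-bit field, the last arm can safely be the catch-all, and the compiler can then see that every path assigns the variables. A Python analogue of one of the two switches follows. It is a sketch only: the `sel & 3` masking and the use of `v` for selectors 0 and 1 are assumptions, since only the `s` arms are visible in the hunks above, and `pick_pair` is a name invented here.

```python
from typing import Sequence, Tuple


def pick_pair(v: Sequence[int], s: Sequence[int], sel: int) -> Tuple[int, int]:
    """Select a pair of 64-bit lanes from v or s using a 2-bit selector."""
    sel &= 3  # assumed: only two bits of the selector are significant
    if sel == 0:
        r0, r1 = v[0], v[1]
    elif sel == 1:
        r0, r1 = v[2], v[3]
    elif sel == 2:
        r0, r1 = s[0], s[1]
    else:  # previously "sel == 3"; as a catch-all, r0/r1 are always bound
        r0, r1 = s[2], s[3]
    return r0, r1
```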

[PULL 02/29] i386: Emit correct error code for 64-bit IDT entry

2023-01-10 Thread Paolo Bonzini

From: Joe Richey 

When in 64-bit mode, IDT entries are 16 bytes, so `intno * 16` is used
for base/limit/offset calculations. However, even in 64-bit mode, the
exception error code still uses bits [3,16) for the invalid interrupt
index.

This means the error code should still be `intno * 8 + 2` even in 64-bit
mode.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1382
Signed-off-by: Joe Richey 
Signed-off-by: Paolo Bonzini 
---
 target/i386/tcg/seg_helper.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/target/i386/tcg/seg_helper.c b/target/i386/tcg/seg_helper.c
index 539189b4d184..03b58e94a2d4 100644
--- a/target/i386/tcg/seg_helper.c
+++ b/target/i386/tcg/seg_helper.c
@@ -882,7 +882,7 @@ static void do_interrupt64(CPUX86State *env, int intno, int 
is_int,
 
 dt = &env->idt;
 if (intno * 16 + 15 > dt->limit) {
-raise_exception_err(env, EXCP0D_GPF, intno * 16 + 2);
+raise_exception_err(env, EXCP0D_GPF, intno * 8 + 2);
 }
 ptr = dt->base + intno * 16;
 e1 = cpu_ldl_kernel(env, ptr);
@@ -895,18 +895,18 @@ static void do_interrupt64(CPUX86State *env, int intno, 
int is_int,
 case 15: /* 386 trap gate */
 break;
 default:
-raise_exception_err(env, EXCP0D_GPF, intno * 16 + 2);
+raise_exception_err(env, EXCP0D_GPF, intno * 8 + 2);
 break;
 }
 dpl = (e2 >> DESC_DPL_SHIFT) & 3;
 cpl = env->hflags & HF_CPL_MASK;
 /* check privilege if software int */
 if (is_int && dpl < cpl) {
-raise_exception_err(env, EXCP0D_GPF, intno * 16 + 2);
+raise_exception_err(env, EXCP0D_GPF, intno * 8 + 2);
 }
 /* check valid bit */
 if (!(e2 & DESC_P_MASK)) {
-raise_exception_err(env, EXCP0B_NOSEG, intno * 16 + 2);
+raise_exception_err(env, EXCP0B_NOSEG, intno * 8 + 2);
 }
 selector = e1 >> 16;
 offset = ((target_ulong)e3 << 32) | (e2 & 0xffff0000) | (e1 & 0x0000ffff);
-- 
2.38.1
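
The distinction drawn in the commit message above, sketched in Python for illustration. The function names are made up for this note; only the `* 16` and `* 8 + 2` arithmetic comes from the patch, and reading the `+ 2` as the IDT flag in bit 1 follows the usual x86 selector error code layout.

```python
IDT_FLAG = 0x2  # bit 1 of a selector error code: the index refers to the IDT


def idt_gate_offset(intno: int, long_mode: bool) -> int:
    """Byte offset of a gate descriptor within the IDT."""
    return intno * (16 if long_mode else 8)


def idt_error_code(intno: int) -> int:
    """Error code for a fault on IDT entry `intno`.

    The vector sits in bits [3,16) regardless of the descriptor size.
    """
    return (intno << 3) | IDT_FLAG


if __name__ == "__main__":
    vec = 0x80
    print(hex(idt_gate_offset(vec, long_mode=True)))  # 0x800
    print(hex(idt_error_code(vec)))                   # 0x402
```

So the descriptor lookup scales with the entry size, while the error code reported to the guest does not.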

[PATCH v6 48/51] i386/xen: handle HVMOP_get_param

2023-01-10 Thread David Woodhouse

From: Joao Martins 

HVMOP_get_param is used to fetch the xenstore PFN and port to be used
by the guest. These are preallocated by the toolstack, so the guest
just reads them and uses them straight away.

Signed-off-by: Joao Martins 
Signed-off-by: David Woodhouse 
---
 target/i386/kvm/xen-emu.c | 39 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 39 insertions(+)

diff --git a/target/i386/kvm/xen-emu.c b/target/i386/kvm/xen-emu.c
index 86decbe8d3..25508e6599 100644
--- a/target/i386/kvm/xen-emu.c
+++ b/target/i386/kvm/xen-emu.c
@@ -735,6 +735,42 @@ out:
 return true;
 }
 
+static bool handle_get_param(struct kvm_xen_exit *exit, X86CPU *cpu,
+ uint64_t arg)
+{
+CPUState *cs = CPU(cpu);
+struct xen_hvm_param hp;
+int err = 0;
+
+/* No need for 32/64 compat handling */
+qemu_build_assert(sizeof(hp) == 16);
+
+if (kvm_copy_from_gva(cs, arg, &hp, sizeof(hp))) {
+err = -EFAULT;
+goto out;
+}
+
+if (hp.domid != DOMID_SELF && hp.domid != xen_domid) {
+err = -ESRCH;
+goto out;
+}
+
+switch (hp.index) {
+case HVM_PARAM_STORE_PFN:
+hp.value = XEN_SPECIAL_PFN(XENSTORE);
+break;
+default:
+return false;
+}
+
+if (kvm_copy_to_gva(cs, arg, &hp, sizeof(hp))) {
+err = -EFAULT;
+}
+out:
+exit->u.hcall.result = err;
+return true;
+}
+
 static int kvm_xen_hcall_evtchn_upcall_vector(struct kvm_xen_exit *exit,
   X86CPU *cpu, uint64_t arg)
 {
@@ -779,6 +815,9 @@ static bool kvm_xen_hcall_hvm_op(struct kvm_xen_exit *exit, 
X86CPU *cpu,
 case HVMOP_set_param:
 return handle_set_param(exit, cpu, arg);
 
+case HVMOP_get_param:
+return handle_get_param(exit, cpu, arg);
+
 default:
 return false;
 }
-- 
2.35.3
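
The control flow of the new handler, restated as a small Python sketch for illustration. It is not QEMU code: `get_param` and the `params` dict are hypothetical, the guest memory copies (and their -EFAULT paths) are left out, and the numeric values of `DOMID_SELF` and `HVM_PARAM_STORE_PFN` are quoted from the Xen public headers from memory, so treat them as assumptions.

```python
import errno
from typing import Dict, Optional, Tuple

DOMID_SELF = 0x7FF0       # assumed value from Xen's public headers
HVM_PARAM_STORE_PFN = 1   # assumed value from Xen's public headers


def get_param(domid: int, index: int, own_domid: int,
              params: Dict[int, int]) -> Tuple[bool, int, Optional[int]]:
    """Return (handled, result, value) for an HVMOP_get_param-style request."""
    # Only the calling domain itself may be queried.
    if domid != DOMID_SELF and domid != own_domid:
        return True, -errno.ESRCH, None
    # Unknown indices are left unhandled, like the "default: return false" arm.
    if index not in params:
        return False, 0, None
    return True, 0, params[index]
```

For example, `get_param(DOMID_SELF, HVM_PARAM_STORE_PFN, 1, {HVM_PARAM_STORE_PFN: 0x1234})` is handled and yields the (arbitrary, made-up) PFN 0x1234, while an unknown index falls through as unhandled.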

[PATCH 24/26] translator: always pair plugin_gen_insn_{start, end} calls

2023-01-10 Thread Alex Bennée

From: Emilio Cota 

Related: #1381

Signed-off-by: Emilio Cota 
Reviewed-by: Philippe Mathieu-Daudé 
Message-Id: <20230108164731.61469-3-c...@braap.org>
Signed-off-by: Alex Bennée 
---
 accel/tcg/translator.c | 15 ++++++++++-----
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/accel/tcg/translator.c b/accel/tcg/translator.c
index 061519691f..ef5193c67e 100644
--- a/accel/tcg/translator.c
+++ b/accel/tcg/translator.c
@@ -100,19 +100,24 @@ void translator_loop(CPUState *cpu, TranslationBlock *tb, 
int max_insns,
 ops->translate_insn(db, cpu);
 }
 
-/* Stop translation if translate_insn so indicated.  */
-if (db->is_jmp != DISAS_NEXT) {
-break;
-}
-
 /*
  * We can't instrument after instructions that change control
  * flow although this only really affects post-load operations.
+ *
+ * Calling plugin_gen_insn_end() before we possibly stop translation
+ * is important. Even if this ends up as dead code, plugin generation
+ * needs to see a matching plugin_gen_insn_{start,end}() pair in order
+ * to accurately track instrumented helpers that might access memory.
  */
 if (plugin_enabled) {
 plugin_gen_insn_end();
 }
 
+/* Stop translation if translate_insn so indicated.  */
+if (db->is_jmp != DISAS_NEXT) {
+break;
+}
+
 /* Stop translation if the output buffer is full,
or we have executed all of the allowed instructions.  */
 if (tcg_op_buf_full() || db->num_insns >= db->max_insns) {
-- 
2.34.1
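
To make the reordering concrete, here is a toy model of the loop in Python. It is purely illustrative: `translate_block` and the event tuples are invented for this sketch and do not correspond to QEMU functions. The point is that the "end" hook fires before the early exit, so starts and ends always pair up.

```python
from typing import Iterable, List, Tuple


def translate_block(insns: Iterable[Tuple[str, bool]],
                    plugin_enabled: bool = True) -> List[Tuple[str, str]]:
    """Model a translation loop; each insn is (name, ends_block)."""
    events: List[Tuple[str, str]] = []
    for name, ends_block in insns:
        if plugin_enabled:
            events.append(("insn_start", name))
        events.append(("translate", name))
        # Emit the end marker before checking whether to stop.
        if plugin_enabled:
            events.append(("insn_end", name))
        if ends_block:
            break
    return events
```

With the old ordering (the stop check placed before the end marker), the instruction that ends the block would get an "insn_start" without a matching "insn_end".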

Re: Postcopy migration failed with "qemu-system-x86_64: postcopy_ram_listen_thread: loadvm failed: -5"

2023-01-10 Thread Dr. David Alan Gilbert

* Kei IKEDA (s2280...@st.go.tuat.ac.jp) wrote:
> Hi!
> 
> I am experimenting with post-copy by modifying qemu-6.0.0 in my research.
> I transfer a VM between two machines but it fails most of the time with
> these error messages.
> 
> ```
> qemu-system-x86_64: postcopy_ram_listen_thread: loadvm failed: -5
> 
> 
> qemu-system-x86_64: error while loading state for instance 0x0 of device
> 'kvm-tpr-opt'
> 
> qemu-system-x86_64: load of migration failed: Operation not permitted
> ```
> 
> I checked that it does not happen in vanilla qemu-6.0.0 post-copy migration
> so my modifications must be causing this error.
> 
> I want to fix this error but I don't know what these error messages mean.
> 
> If anyone knows what the situation is with this error, please let me know.

My guess from that error is that the vapic_post_load function in
hw/i386/kvmvapic.c is erroring - or something around that;  it tells
you it's the 'kvm-tpr-opt' device that's failing to load, so you need to
find out why.

Are your changes anything to do with apic?

Dave

> Thanks.
> 
> ---Experiment information---
> machine
>  HPE ProLiant DL360 Gen10
> OS
>  Linux 5.11.22 with Ubuntu 22.04.1 LTS
> Startup command
>  source side
>   ```
>   sudo /home/work/qemu-6.0.0/build/x86_64-softmmu/qemu-system-x86_64 -cpu
> host -smp 8 -m 16G -enable-kvm -drive
> if=virtio,file=/var/nfs/migrate/64G.qcow2,cache=none -monitor stdio -qmp
> tcp:localhost:4445,server,nowait -net nic -netdev
> bridge,helper=/usr/lib/qemu/qemu-bridge-helper,id=hn0 -device
> virtio-net-pci,netdev=hn0,id=br0,mac=00:16:3e:33:ad:7d -net
> user,smb=/var/nfs/migrate,hostfwd=tcp::5557-:22,hostfwd=tcp::8000-:11211
>  ```
>  destination side
>   ```
>   sudo /home/kei/work/qeme-6.0.0/build/x86_64-softmmu/qemu-system-x86_64
> -cpu host -smp 8 -m 16G -enable-kvm -drive
> if=virtio,file=/var/nfs/migrate/64G.qcow2,cache=none -monitor stdio
> -incoming tcp:0: -qmp tcp:0:4446,server,nowait -net nic -netdev
> bridge,helper=/usr/lib/qemu/qemu-bridge-helper,id=hn0 -device
> virtio-net-pci,netdev=hn0,id=br0,mac=00:16:3e:33:ad:7d -net
> user,smb=/var/nfs/migrate,hostfwd=tcp::5557-:22,hostfwd=tcp::8000-:11211
>   ```
> ---
-- 
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

[PATCH 13/26] semihosting: Write back semihosting data before completion callback

2023-01-10 Thread Alex Bennée

From: Keith Packard 

'lock_user' allocates a host buffer to shadow a target buffer,
'unlock_user' copies that host buffer back to the target and frees the
host memory. If the completion function uses the target buffer, it
must be called after unlock_user to ensure the data are present.

This caused the arm-compatible TARGET_SYS_READC to fail as the
completion function, common_semi_readc_cb, pulled data from the target
buffer, which would not yet have received the console data.

I decided to fix all instances of this pattern instead of just the
console_read function to make things consistent and potentially fix
bugs in other cases.

Signed-off-by: Keith Packard 
Reviewed-by: Richard Henderson 
Message-Id: <20221012014822.1242170-1-kei...@keithp.com>
Signed-off-by: Alex Bennée 
---
 semihosting/syscalls.c | 20 ++++++++++----------
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/semihosting/syscalls.c b/semihosting/syscalls.c
index 5893c760c5..ba28194b59 100644
--- a/semihosting/syscalls.c
+++ b/semihosting/syscalls.c
@@ -319,11 +319,11 @@ static void host_read(CPUState *cs, 
gdb_syscall_complete_cb complete,
 }
 ret = RETRY_ON_EINTR(read(gf->hostfd, ptr, len));
 if (ret == -1) {
-complete(cs, -1, errno);
 unlock_user(ptr, buf, 0);
+complete(cs, -1, errno);
 } else {
-complete(cs, ret, 0);
 unlock_user(ptr, buf, ret);
+complete(cs, ret, 0);
 }
 }
 
@@ -339,8 +339,8 @@ static void host_write(CPUState *cs, 
gdb_syscall_complete_cb complete,
 return;
 }
 ret = write(gf->hostfd, ptr, len);
-complete(cs, ret, ret == -1 ? errno : 0);
 unlock_user(ptr, buf, 0);
+complete(cs, ret, ret == -1 ? errno : 0);
 }
 
 static void host_lseek(CPUState *cs, gdb_syscall_complete_cb complete,
@@ -426,8 +426,8 @@ static void host_stat(CPUState *cs, gdb_syscall_complete_cb 
complete,
 ret = -1;
 }
 }
-complete(cs, ret, err);
 unlock_user(name, fname, 0);
+complete(cs, ret, err);
 }
 
 static void host_remove(CPUState *cs, gdb_syscall_complete_cb complete,
@@ -444,8 +444,8 @@ static void host_remove(CPUState *cs, 
gdb_syscall_complete_cb complete,
 }
 
 ret = remove(p);
-complete(cs, ret, ret ? errno : 0);
 unlock_user(p, fname, 0);
+complete(cs, ret, ret ? errno : 0);
 }
 
 static void host_rename(CPUState *cs, gdb_syscall_complete_cb complete,
@@ -469,9 +469,9 @@ static void host_rename(CPUState *cs, 
gdb_syscall_complete_cb complete,
 }
 
 ret = rename(ostr, nstr);
-complete(cs, ret, ret ? errno : 0);
 unlock_user(ostr, oname, 0);
 unlock_user(nstr, nname, 0);
+complete(cs, ret, ret ? errno : 0);
 }
 
 static void host_system(CPUState *cs, gdb_syscall_complete_cb complete,
@@ -488,8 +488,8 @@ static void host_system(CPUState *cs, 
gdb_syscall_complete_cb complete,
 }
 
 ret = system(p);
-complete(cs, ret, ret == -1 ? errno : 0);
 unlock_user(p, cmd, 0);
+complete(cs, ret, ret == -1 ? errno : 0);
 }
 
 static void host_gettimeofday(CPUState *cs, gdb_syscall_complete_cb complete,
@@ -554,8 +554,8 @@ static void staticfile_read(CPUState *cs, 
gdb_syscall_complete_cb complete,
 }
 memcpy(ptr, gf->staticfile.data + gf->staticfile.off, len);
 gf->staticfile.off += len;
-complete(cs, len, 0);
 unlock_user(ptr, buf, len);
+complete(cs, len, 0);
 }
 
 static void staticfile_lseek(CPUState *cs, gdb_syscall_complete_cb complete,
@@ -608,8 +608,8 @@ static void console_read(CPUState *cs, 
gdb_syscall_complete_cb complete,
 return;
 }
 ret = qemu_semihosting_console_read(cs, ptr, len);
-complete(cs, ret, 0);
 unlock_user(ptr, buf, ret);
+complete(cs, ret, 0);
 }
 
 static void console_write(CPUState *cs, gdb_syscall_complete_cb complete,
@@ -624,8 +624,8 @@ static void console_write(CPUState *cs, 
gdb_syscall_complete_cb complete,
 return;
 }
 ret = qemu_semihosting_console_write(ptr, len);
-complete(cs, ret ? ret : -1, ret ? 0 : EIO);
 unlock_user(ptr, buf, 0);
+complete(cs, ret ? ret : -1, ret ? 0 : EIO);
 }
 
 static void console_fstat(CPUState *cs, gdb_syscall_complete_cb complete,
-- 
2.34.1
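
The ordering problem is easy to reproduce in a toy model. The sketch below is Python, not QEMU code: `lock_user` and `unlock_user` here are simplified imitations that copy through a host-side shadow buffer, which is the behaviour the commit message describes, and their signatures are invented for this example.

```python
from typing import Callable


def lock_user(target: bytearray, addr: int, length: int) -> bytearray:
    """Return a host-side shadow copy of target[addr:addr+length]."""
    return bytearray(target[addr:addr + length])


def unlock_user(host: bytearray, target: bytearray,
                addr: int, length: int) -> None:
    """Copy the first `length` bytes of the shadow back to the target."""
    target[addr:addr + length] = host[:length]


def console_read(target: bytearray, addr: int, data: bytes,
                 complete: Callable[[int], None]) -> None:
    """Deliver `data` into target memory, then run the completion callback."""
    host = lock_user(target, addr, len(data))
    host[:len(data)] = data                     # "read" into the shadow
    unlock_user(host, target, addr, len(data))  # write back first...
    complete(len(data))                         # ...then notify


if __name__ == "__main__":
    mem = bytearray(8)
    seen = []
    console_read(mem, 2, b"A", lambda n: seen.append(bytes(mem[2:2 + n])))
    print(seen)
```

If `complete` were called before `unlock_user`, the callback would read the stale target bytes instead of the freshly read data, which is the failure mode described for TARGET_SYS_READC.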

Re: [PATCH] Makefile: allow 'make uninstall'

2023-01-10 Thread Thomas Huth


On 10/01/2023 16.13, Peter Maydell wrote:

Meson supports an "uninstall", so we can easily allow it to work by
not suppressing the forwarding of it from Make to meson.

We originally suppressed this because Meson's 'uninstall' has a hole
in it: it will remove everything that is installed by a mechanism
meson knows about, but not things installed by "custom install
scripts", and there is no "custom uninstall script" mechanism.

For QEMU, though, the only thing that was being installed by a custom
install script was the LC_MESSAGES files handled by Meson's i18n
module, and that code was fixed in Meson commit 487d45c1e5bfff0fbdb4,
which is present in Meson 0.60.0 and later.  Since we already require
a Meson version newer than that, we're now safe to enable
'uninstall', as it will now correctly uninstall everything that was
installed.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/109
Signed-off-by: Peter Maydell 
---
  Makefile | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)


Works for me!

Tested-by: Thomas Huth

Re: [PATCH] python: QEMUMachine: enable qmp accept timeout by default

2023-01-10 Thread John Snow

On Tue, Jan 10, 2023 at 12:06 PM John Snow  wrote:
>
>
>
> On Tue, Jan 10, 2023, 3:53 AM Vladimir Sementsov-Ogievskiy 
>  wrote:
>>
>> On 7/12/22 00:21, John Snow wrote:
>> > On Mon, Jul 11, 2022 at 5:16 PM John Snow  wrote:
>> >>
>> >> On Fri, Jun 24, 2022 at 3:53 PM Vladimir Sementsov-Ogievskiy
>> >>  wrote:
>> >>>
>> >>> I've spent much time trying to debug a hanging pipeline in gitlab. I
>> >>> started from the idea that I have a problem in the code in my series (which
>> >>> has some timeouts). Finally I found that the problem is that I've used
>> >>> QEMUMachine class directly to avoid qtest, and didn't add necessary
>> >>> arguments. Qemu fails and we wait for qmp accept endlessly. In gitlab
>> >>> it's just stopped by timeout (one hour) with no sign of what's going
>> >>> wrong.
>> >>>
>> >>> With timeout enabled, gitlab doesn't wait for an hour and prints all
>> >>> needed information.
>> >>>
>> >>> Signed-off-by: Vladimir Sementsov-Ogievskiy 
>> >>> ---
>> >>>
>> >>> Hi all!
>> >>>
>> >>> Just compare this
>> >>>https://gitlab.com/vsementsov/qemu/-/pipelines/572232557
>> >>> and this
>> >>>https://gitlab.com/vsementsov/qemu/-/pipelines/572526252
>> >>>
>> >>> and you'll see that the latter is much better.
>> >>>
>> >>>   python/qemu/machine/machine.py | 2 +-
>> >>>   1 file changed, 1 insertion(+), 1 deletion(-)
>> >>>
>> >>> diff --git a/python/qemu/machine/machine.py 
>> >>> b/python/qemu/machine/machine.py
>> >>> index 37191f433b..01a12f6f73 100644
>> >>> --- a/python/qemu/machine/machine.py
>> >>> +++ b/python/qemu/machine/machine.py
>> >>> @@ -131,7 +131,7 @@ def __init__(self,
>> >>>drain_console: bool = False,
>> >>>console_log: Optional[str] = None,
>> >>>log_dir: Optional[str] = None,
>> >>> - qmp_timer: Optional[float] = None):
>> >>> + qmp_timer: float = 30):
>> >>>   '''
>> >>>   Initialize a QEMUMachine
>> >>>
>> >>> --
>> >>> 2.25.1
>> >>>
>> >>
>> >> Oh, this is because machine.py uses the qmp_timer for *all* timeouts,
>> >> and not just the QMP commands themselves, and this relates to the work
>> >> Marc Andre is doing with regards to changing the launch mechanism to
>> >> handle the race condition when the QEMU launch fails, but the QMP
>> >> connection just sits waiting.
>> >>
>> >> I'm quite of the mind that it's really time to rewrite machine.py to
>> >> use the native asyncio interfaces I've been writing to help manage
>> >> this, but in the meantime I think this is probably a reasonable
>> >> concession and a more useful default.
>> >>
>> >> ...I think. Willing to take it for now and re-investigate when the
>> >> other fixes make it to the tree.
>> >>
>> >> Reviewed-by: John Snow 
>> >
>> > Oh, keep the type as Optional[float], though, so the timeout can be
>> > disabled again, and keeps the type consistent with the qtest
>> > derivative class. I've staged your patch with that change made, let me
>> > know if that's not OK. Modified patch is on my python branch:
>> >
>> > Thanks, merged.
>> >
>>
>> Hmm, it seems that got lost. I don't see it in either master or your python
>> branch.
>>
>> --
>> Best regards,
>> Vladimir
>
>
> :(
>
> I'll fix it. Thanks for resending the iotests series, too - the old version 
> was at the very top of my inbox :)

Re-edited and Re-staged:

https://gitlab.com/jsnow/qemu/-/commits/python
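
For anyone following along, the effect of the new default is roughly this. It is a standalone sketch using plain sockets, not the machine.py code, and `accept_with_timeout` is a name made up here. Keeping `Optional[float]` means `None` still restores the old wait-forever behaviour:

```python
import socket
from typing import Optional


def accept_with_timeout(server: socket.socket,
                        timeout: Optional[float] = 30) -> socket.socket:
    """Accept one connection, giving up after `timeout` seconds.

    timeout=None blocks indefinitely, matching the old default.
    """
    server.settimeout(timeout)
    conn, _addr = server.accept()
    return conn
```

If nothing ever connects (say, because the peer process died at startup), the call raises a timeout error after 30 seconds instead of hanging until the CI job is killed.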

--js
