Re: [PATCH v9 00/10] vhost-vdpa: add support for configure interrupt

2021-10-18 Thread Michael S. Tsirkin
On Thu, Sep 30, 2021 at 10:33:38AM +0800, Cindy Lu wrote:
> these patches add the support for configure interrupt
> 
> These codes are all tested in vp-vdpa (support configure interrupt)
> vdpa_sim (not support configure interrupt), virtio tap device
> 
> test in virtio-pci bus and virtio-mmio bus


I was inclined to let it slide but it hangs make check
so needs more work.
Meanwhile please go over how the patchset is structured,
and over description of each patch.
I sent some comments but same applied to everything.

Also, pls document the index == -1 hack in more detail.
how does it work and why it's helpful.

Thanks!

> Change in v2:
> Add support for virtio-mmio bus
> active the notifier while the backend support configure interrupt
> misc fixes from v1
> 
> Change in v3
> fix the coding style problems
> 
> Change in v4
> misc fixes from v3
> merge the set_config_notifier to set_guest_notifier
> when vdpa start, check the feature by VIRTIO_NET_F_STATUS
> 
> Change in v5
> misc fixes from v4
> split the code to introduce configure interrupt type and the callback function
> will init the configure interrupt in all virtio-pci and virtio-mmio bus, but will
> only active while using vhost-vdpa driver
> 
> Change in v6
> misc fixes from v5
> decouple virtqueue from interrupt setting and misc process
> fix the bug in virtio_net_handle_rx
> use -1 as the queue number to identify if the interrupt is configure interrupt
> 
> Change in v7
> misc fixes from v6
> decouple virtqueue from interrupt setting and misc process
> decouple virtqueue from vector use/release process
> decouple virtqueue from set notifier fd handler process
> move config_notifier and masked_config_notifier to VirtIODevice
> fix the bug in virtio_net_handle_rx, add more information
> add VIRTIO_CONFIG_IRQ_IDX as the queue number to identify if the interrupt is 
> configure interrupt
> 
> Change in v8
> misc fixes from v7
> decouple virtqueue from interrupt setting and misc process
> decouple virtqueue from vector use/release process
> decouple virtqueue from set notifier fd handler process
> move the vhost configure interrupt to vhost_net
> 
> Change in v9
> misc fixes from v8
> address the comments for v8
> 
> Cindy Lu (10):
>   virtio: introduce macro IRTIO_CONFIG_IRQ_IDX
>   virtio-pci: decouple notifier from interrupt process
>   virtio-pci: decouple the single vector from the interrupt process
>   vhost: add new call back function for config interrupt
>   vhost-vdpa: add support for config interrupt call back
>   virtio: add support for configure interrupt
>   virtio-net: add support for configure interrupt
>   vhost: add support for configure interrupt
>   virtio-mmio: add support for configure interrupt
>   virtio-pci: add support for configure interrupt
> 
>  hw/display/vhost-user-gpu.c   |   6 +
>  hw/net/vhost_net.c|  10 ++
>  hw/net/virtio-net.c   |  16 +-
>  hw/virtio/trace-events|   2 +
>  hw/virtio/vhost-user-fs.c |   9 +-
>  hw/virtio/vhost-vdpa.c|   7 +
>  hw/virtio/vhost-vsock-common.c|   6 +
>  hw/virtio/vhost.c |  76 +
>  hw/virtio/virtio-crypto.c |   6 +
>  hw/virtio/virtio-mmio.c   |  27 
>  hw/virtio/virtio-pci.c| 260 --
>  hw/virtio/virtio-pci.h|   4 +-
>  hw/virtio/virtio.c|  29 
>  include/hw/virtio/vhost-backend.h |   3 +
>  include/hw/virtio/vhost.h |   4 +
>  include/hw/virtio/virtio.h|   6 +
>  include/net/vhost_net.h   |   3 +
>  17 files changed, 386 insertions(+), 88 deletions(-)
> 
> -- 
> 2.21.3




Re: [PATCH v9 10/10] virtio-pci: add support for configure interrupt

2021-10-18 Thread Michael S. Tsirkin
On Thu, Sep 30, 2021 at 10:33:48AM +0800, Cindy Lu wrote:
> Add support for configure interrupt, The process is used kvm_irqfd_assign
> to set the gsi to kernel. When the configure notifier was signal by
> host, qemu will inject a msix interrupt to guest
> 
> Signed-off-by: Cindy Lu 
> ---
>  hw/virtio/virtio-pci.c | 88 +-
>  hw/virtio/virtio-pci.h |  4 +-
>  2 files changed, 72 insertions(+), 20 deletions(-)
> 
> diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
> index d0a2c2fb81..50179c2ba1 100644
> --- a/hw/virtio/virtio-pci.c
> +++ b/hw/virtio/virtio-pci.c
> @@ -728,7 +728,8 @@ static int virtio_pci_get_notifier(VirtIOPCIProxy *proxy, 
> int queue_no,
>  VirtQueue *vq;
>  
>  if (queue_no == VIRTIO_CONFIG_IRQ_IDX) {
> -return -1;
> +*n = virtio_config_get_guest_notifier(vdev);
> +*vector = vdev->config_vector;
>  } else {
>  if (!virtio_queue_get_num(vdev, queue_no)) {
>  return -1;

So here you are rewriting code you added previously ... not great.


> @@ -806,6 +807,10 @@ static int kvm_virtio_pci_vector_use(VirtIOPCIProxy 
> *proxy, int nvqs)
>  return ret;
>  }
>  
> +static int kvm_virtio_pci_vector_config_use(VirtIOPCIProxy *proxy)
> +{
> +return kvm_virtio_pci_vector_use_one(proxy, VIRTIO_CONFIG_IRQ_IDX);
> +}
>  
>  static void kvm_virtio_pci_vector_release_one(VirtIOPCIProxy *proxy,
>int queue_no)
> @@ -829,6 +834,7 @@ static void 
> kvm_virtio_pci_vector_release_one(VirtIOPCIProxy *proxy,
>  }
>  kvm_virtio_pci_vq_vector_release(proxy, vector);
>  }
> +
>  static void kvm_virtio_pci_vector_release(VirtIOPCIProxy *proxy, int nvqs)
>  {
>  int queue_no;
> @@ -842,6 +848,11 @@ static void kvm_virtio_pci_vector_release(VirtIOPCIProxy 
> *proxy, int nvqs)
>  }
>  }
>  
> +static void kvm_virtio_pci_vector_config_release(VirtIOPCIProxy *proxy)
> +{
> +kvm_virtio_pci_vector_release_one(proxy, VIRTIO_CONFIG_IRQ_IDX);
> +}
> +
>  static int virtio_pci_one_vector_unmask(VirtIOPCIProxy *proxy,
> unsigned int queue_no,
> unsigned int vector,
> @@ -923,9 +934,17 @@ static int virtio_pci_vector_unmask(PCIDevice *dev, 
> unsigned vector,
>  }
>  vq = virtio_vector_next_queue(vq);
>  }
> -
> +/* unmask config intr */
> +n = virtio_config_get_guest_notifier(vdev);
> +ret = virtio_pci_one_vector_unmask(proxy, VIRTIO_CONFIG_IRQ_IDX, vector,
> +   msg, n);
> +if (ret < 0) {
> +goto undo_config;
> +}
>  return 0;
> -
> +undo_config:
> +n = virtio_config_get_guest_notifier(vdev);
> +virtio_pci_one_vector_mask(proxy, VIRTIO_CONFIG_IRQ_IDX, vector, n);
>  undo:
>  vq = virtio_vector_first_queue(vdev, vector);
>  while (vq && unmasked >= 0) {
> @@ -959,6 +978,8 @@ static void virtio_pci_vector_mask(PCIDevice *dev, 
> unsigned vector)
>  }
>  vq = virtio_vector_next_queue(vq);
>  }
> +n = virtio_config_get_guest_notifier(vdev);
> +virtio_pci_one_vector_mask(proxy, VIRTIO_CONFIG_IRQ_IDX, vector, n);
>  }
>  
>  static void virtio_pci_vector_poll(PCIDevice *dev,
> @@ -971,19 +992,17 @@ static void virtio_pci_vector_poll(PCIDevice *dev,
>  int queue_no;
>  unsigned int vector;
>  EventNotifier *notifier;
> -VirtQueue *vq;
> -
> -for (queue_no = 0; queue_no < proxy->nvqs_with_notifiers; queue_no++) {
> -if (!virtio_queue_get_num(vdev, queue_no)) {
> +int ret;
> +for (queue_no = VIRTIO_CONFIG_IRQ_IDX;
> + queue_no < proxy->nvqs_with_notifiers; queue_no++) {

Oh, it turns out it's important that this value is -1,
otherwise the loop will just go crazy.
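
To illustrate the remark above, here is a minimal sketch (not QEMU code) of why a loop that starts at the config index only behaves when that index is exactly -1, i.e. the value immediately before queue 0. CONFIG_IRQ_IDX and poll_visits() are hypothetical stand-ins; the real loop lives in virtio_pci_vector_poll().

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for VIRTIO_CONFIG_IRQ_IDX; the series defines it as -1. */
#define CONFIG_IRQ_IDX (-1)

/*
 * Mimic the shape of the poll loop: start at first_idx and walk up to
 * nvqs - 1, recording each index visited. Returns the number of
 * iterations; at most cap indices are stored in visited[].
 */
static int poll_visits(int first_idx, int nvqs, int *visited, size_t cap)
{
    int count = 0;

    assert(visited != NULL || cap == 0);

    for (int queue_no = first_idx; queue_no < nvqs; queue_no++) {
        if ((size_t)count < cap) {
            visited[count] = queue_no;
        }
        count++;
    }
    return count;
}
```

With first_idx = CONFIG_IRQ_IDX and nvqs = 3 the loop visits -1, 0, 1, 2. With a sentinel such as 0xFFFF the body never runs, so both the config interrupt and every queue are skipped; with -2 the loop visits an index that is neither the config interrupt nor a queue. The dependency on the exact value is implicit in the loop bounds, which is why it deserves a comment.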



> +ret = virtio_pci_get_notifier(proxy, queue_no, &notifier, &vector);
> +if (ret < 0) {
>  break;
>  }
> -vector = virtio_queue_vector(vdev, queue_no);
>  if (vector < vector_start || vector >= vector_end ||
>  !msix_is_masked(dev, vector)) {
>  continue;
>  }
> -vq = virtio_get_queue(vdev, queue_no);
> -notifier = virtio_queue_get_guest_notifier(vq);
>  if (k->guest_notifier_pending) {
>  if (k->guest_notifier_pending(vdev, queue_no)) {
>  msix_set_pending(dev, vector);
> @@ -994,23 +1013,42 @@ static void virtio_pci_vector_poll(PCIDevice *dev,
>  }
>  }
>  
> +void virtio_pci_set_guest_notifier_fd_handler(VirtIODevice *vdev, VirtQueue 
> *vq,
> +  int n, bool assign,
> +  bool with_irqfd)
> +{
> +if (n == VIRTIO_CONFIG_IRQ_IDX) {
> +virtio_config_set_guest_notifier_fd_handler(vdev, assign, 
> with_irqfd);
> +} else {
> +virtio_queue_set_guest_notifier_fd_handler(vq, assign, with_irqfd);
> +}

Re: [PATCH v9 05/10] vhost-vdpa: add support for config interrupt call back

2021-10-18 Thread Michael S. Tsirkin
On Thu, Sep 30, 2021 at 10:33:43AM +0800, Cindy Lu wrote:
> Add new call back function in vhost-vdpa, this call back function will
> set the fb number to hardware.
> 
> Signed-off-by: Cindy Lu 

fb being what? you mean fd. said fd doing what exactly?
all this needs to be in the commit log pls.
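
For context, a hedged sketch of what such an fd typically does, assuming it is an eventfd like the existing vring call fds: the backend writes to it when the device configuration changes, and the VMM reads it to learn that a config interrupt should be injected. The function names below are hypothetical, and the VHOST_VDPA_SET_CONFIG_CALL ioctl itself is not shown because it needs a real vhost-vdpa device.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Create the fd that would be handed to the backend. Returns -1 on error. */
static int config_call_fd_new(void)
{
    return eventfd(0, EFD_NONBLOCK | EFD_CLOEXEC);
}

/* What the backend side does when the device config changes. */
static int config_call_signal(int fd)
{
    uint64_t one = 1;

    assert(fd >= 0);
    return write(fd, &one, sizeof(one)) == (ssize_t)sizeof(one) ? 0 : -1;
}

/*
 * What the VMM side does when the fd becomes readable: fetch and reset
 * the counter. Returns the number of signals since the last call, or 0
 * if nothing is pending (or the read failed).
 */
static uint64_t config_call_consume(int fd)
{
    uint64_t count = 0;

    assert(fd >= 0);
    if (read(fd, &count, sizeof(count)) != (ssize_t)sizeof(count)) {
        return 0;
    }
    return count;
}
```

Because an eventfd in its default (non-semaphore) mode accumulates writes, two signals followed by one consume yield a count of 2, so back-to-back config changes collapse into a single wakeup rather than being lost.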

> ---
>  hw/virtio/trace-events | 2 ++
>  hw/virtio/vhost-vdpa.c | 7 +++
>  2 files changed, 9 insertions(+)
> 
> diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events
> index 8ed19e9d0c..836e73d1f7 100644
> --- a/hw/virtio/trace-events
> +++ b/hw/virtio/trace-events
> @@ -52,6 +52,8 @@ vhost_vdpa_set_vring_call(void *dev, unsigned int index, 
> int fd) "dev: %p index:
>  vhost_vdpa_get_features(void *dev, uint64_t features) "dev: %p features: 
> 0x%"PRIx64
>  vhost_vdpa_set_owner(void *dev) "dev: %p"
>  vhost_vdpa_vq_get_addr(void *dev, void *vq, uint64_t desc_user_addr, 
> uint64_t avail_user_addr, uint64_t used_user_addr) "dev: %p vq: %p 
> desc_user_addr: 0x%"PRIx64" avail_user_addr: 0x%"PRIx64" used_user_addr: 
> 0x%"PRIx64
> +vhost_vdpa_set_config_call(void *dev, int fd)"dev: %p fd: %d"
> +
>  
>  # virtio.c
>  virtqueue_alloc_element(void *elem, size_t sz, unsigned in_num, unsigned 
> out_num) "elem %p size %zd in_num %u out_num %u"
> diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
> index 4fa414feea..73764afc61 100644
> --- a/hw/virtio/vhost-vdpa.c
> +++ b/hw/virtio/vhost-vdpa.c
> @@ -622,6 +622,12 @@ static int vhost_vdpa_set_vring_call(struct vhost_dev 
> *dev,
>  trace_vhost_vdpa_set_vring_call(dev, file->index, file->fd);
>  return vhost_vdpa_call(dev, VHOST_SET_VRING_CALL, file);
>  }
> +static int vhost_vdpa_set_config_call(struct vhost_dev *dev,
> +   int fd)
> +{
> +trace_vhost_vdpa_set_config_call(dev, fd);
> +return vhost_vdpa_call(dev, VHOST_VDPA_SET_CONFIG_CALL, &fd);
> +}
>  
>  static int vhost_vdpa_get_features(struct vhost_dev *dev,
>   uint64_t *features)
> @@ -688,4 +694,5 @@ const VhostOps vdpa_ops = {
>  .vhost_get_device_id = vhost_vdpa_get_device_id,
>  .vhost_vq_get_addr = vhost_vdpa_vq_get_addr,
>  .vhost_force_iommu = vhost_vdpa_force_iommu,
> +.vhost_set_config_call = vhost_vdpa_set_config_call,
>  };
> -- 
> 2.21.3




Re: [PATCH v9 01/10] virtio: introduce macro IRTIO_CONFIG_IRQ_IDX

2021-10-18 Thread Michael S. Tsirkin
On Thu, Sep 30, 2021 at 10:33:39AM +0800, Cindy Lu wrote:
> To support configure interrupt for vhost-vdpa
> introduce VIRTIO_CONFIG_IRQ_IDX -1 as config queue index, Then we can reuse
> the function guest_notifier_mask and guest_notifier_pending.
> Add the check of queue index, if the driver does not support configure
> interrupt, the function will just return
> 
> Signed-off-by: Cindy Lu 

typo in subject

Also the commit log and subject do not seem to match what patch is
doing. Description makes it look like a refactoring, but
it isn't. guest_notifier_mask don't exist.
And I'm not sure why it's safe to do nothing e.g. in
pending.
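
The guard pattern the patch adds to each device can be sketched as follows. This is a hedged illustration, not the patch itself: demo_dev, NUM_QUEUES and the demo_* functions are hypothetical. It shows why some check is needed once -1 can reach these callbacks (an unguarded lookup would index before the start of the per-queue state), but it does not settle whether reporting "not pending" for the config index is actually safe.

```c
#include <assert.h>
#include <stdbool.h>

#define CONFIG_IRQ_IDX (-1) /* stand-in for VIRTIO_CONFIG_IRQ_IDX */
#define NUM_QUEUES 2        /* hypothetical device with two virtqueues */

struct demo_dev {
    bool vq_pending[NUM_QUEUES]; /* per-queue pending state */
};

/* Per-queue lookup; only valid for real queue indices. */
static bool demo_virtqueue_pending(const struct demo_dev *d, int idx)
{
    assert(idx >= 0 && idx < NUM_QUEUES);
    return d->vq_pending[idx];
}

/* Device callback: short-circuits on the config index, as the patch does. */
static bool demo_guest_notifier_pending(const struct demo_dev *d, int idx)
{
    if (idx == CONFIG_IRQ_IDX) {
        /* No per-queue state exists for the config interrupt. */
        return false;
    }
    return demo_virtqueue_pending(d, idx);
}
```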




> ---
>  hw/display/vhost-user-gpu.c|  6 ++
>  hw/net/virtio-net.c| 10 +++---
>  hw/virtio/vhost-user-fs.c  |  9 +++--
>  hw/virtio/vhost-vsock-common.c |  6 ++
>  hw/virtio/virtio-crypto.c  |  6 ++
>  include/hw/virtio/virtio.h |  2 ++
>  6 files changed, 34 insertions(+), 5 deletions(-)
> 
> diff --git a/hw/display/vhost-user-gpu.c b/hw/display/vhost-user-gpu.c
> index 49df56cd14..73ad3d84c9 100644
> --- a/hw/display/vhost-user-gpu.c
> +++ b/hw/display/vhost-user-gpu.c
> @@ -485,6 +485,9 @@ vhost_user_gpu_guest_notifier_pending(VirtIODevice *vdev, 
> int idx)
>  {
>  VhostUserGPU *g = VHOST_USER_GPU(vdev);
>  
> +if (idx == VIRTIO_CONFIG_IRQ_IDX) {
> +return false;
> +}
>  return vhost_virtqueue_pending(&g->vhost->dev, idx);
>  }
>  
> @@ -493,6 +496,9 @@ vhost_user_gpu_guest_notifier_mask(VirtIODevice *vdev, 
> int idx, bool mask)
>  {
>  VhostUserGPU *g = VHOST_USER_GPU(vdev);
>  
> +if (idx == VIRTIO_CONFIG_IRQ_IDX) {
> +return;
> +}
>  vhost_virtqueue_mask(&g->vhost->dev, vdev, idx, mask);
>  }
>  
> diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> index 16d20cdee5..65b7cabcaf 100644
> --- a/hw/net/virtio-net.c
> +++ b/hw/net/virtio-net.c
> @@ -3152,7 +3152,10 @@ static bool 
> virtio_net_guest_notifier_pending(VirtIODevice *vdev, int idx)
>  VirtIONet *n = VIRTIO_NET(vdev);
>  NetClientState *nc = qemu_get_subqueue(n->nic, vq2q(idx));
>  assert(n->vhost_started);
> -return vhost_net_virtqueue_pending(get_vhost_net(nc->peer), idx);
> +if (idx != VIRTIO_CONFIG_IRQ_IDX) {
> +return vhost_net_virtqueue_pending(get_vhost_net(nc->peer), idx);
> +}
> +return false;
>  }
>  
>  static void virtio_net_guest_notifier_mask(VirtIODevice *vdev, int idx,
> @@ -3161,8 +3164,9 @@ static void virtio_net_guest_notifier_mask(VirtIODevice 
> *vdev, int idx,
>  VirtIONet *n = VIRTIO_NET(vdev);
>  NetClientState *nc = qemu_get_subqueue(n->nic, vq2q(idx));
>  assert(n->vhost_started);
> -vhost_net_virtqueue_mask(get_vhost_net(nc->peer),
> - vdev, idx, mask);
> +if (idx != VIRTIO_CONFIG_IRQ_IDX) {
> +vhost_net_virtqueue_mask(get_vhost_net(nc->peer), vdev, idx, mask);
> +}
>  }
>  
>  static void virtio_net_set_config_size(VirtIONet *n, uint64_t host_features)
> diff --git a/hw/virtio/vhost-user-fs.c b/hw/virtio/vhost-user-fs.c
> index c595957983..309c8efabf 100644
> --- a/hw/virtio/vhost-user-fs.c
> +++ b/hw/virtio/vhost-user-fs.c
> @@ -156,11 +156,13 @@ static void vuf_handle_output(VirtIODevice *vdev, 
> VirtQueue *vq)
>   */
>  }
>  
> -static void vuf_guest_notifier_mask(VirtIODevice *vdev, int idx,
> -bool mask)
> +static void vuf_guest_notifier_mask(VirtIODevice *vdev, int idx, bool mask)
>  {
>  VHostUserFS *fs = VHOST_USER_FS(vdev);
>  
> +if (idx == VIRTIO_CONFIG_IRQ_IDX) {
> +return;
> +}
>  vhost_virtqueue_mask(&fs->vhost_dev, vdev, idx, mask);
>  }
>  
> @@ -168,6 +170,9 @@ static bool vuf_guest_notifier_pending(VirtIODevice 
> *vdev, int idx)
>  {
>  VHostUserFS *fs = VHOST_USER_FS(vdev);
>  
> +if (idx == VIRTIO_CONFIG_IRQ_IDX) {
> +return false;
> +}
>  return vhost_virtqueue_pending(&fs->vhost_dev, idx);
>  }
>  
> diff --git a/hw/virtio/vhost-vsock-common.c b/hw/virtio/vhost-vsock-common.c
> index 4ad6e234ad..2112b44802 100644
> --- a/hw/virtio/vhost-vsock-common.c
> +++ b/hw/virtio/vhost-vsock-common.c
> @@ -101,6 +101,9 @@ static void 
> vhost_vsock_common_guest_notifier_mask(VirtIODevice *vdev, int idx,
>  {
>  VHostVSockCommon *vvc = VHOST_VSOCK_COMMON(vdev);
>  
> +if (idx == VIRTIO_CONFIG_IRQ_IDX) {
> +return;
> +}
>  vhost_virtqueue_mask(&vvc->vhost_dev, vdev, idx, mask);
>  }
>  
> @@ -109,6 +112,9 @@ static bool 
> vhost_vsock_common_guest_notifier_pending(VirtIODevice *vdev,
>  {
>  VHostVSockCommon *vvc = VHOST_VSOCK_COMMON(vdev);
>  
> +if (idx == VIRTIO_CONFIG_IRQ_IDX) {
> +return false;
> +}
>  return vhost_virtqueue_pending(&vvc->vhost_dev, idx);
>  }
>  
> diff --git a/hw/virtio/virtio-crypto.c b/hw/virtio/virtio-crypto.c
> index 54f9bbb789..1d5192f8b4 100644
> --- a/hw/virtio/virtio-crypto.c
> +++ b/hw/virtio/virtio-cryp

Re: [PATCH v9 04/10] vhost: add new call back function for config interrupt

2021-10-18 Thread Michael S. Tsirkin
On Thu, Sep 30, 2021 at 10:33:42AM +0800, Cindy Lu wrote:
> To support the config interrupt, we need to
> add a new call back function for config interrupt.
> 
> Signed-off-by: Cindy Lu 

Pls make commit log more informative.
Doing what? Called back when?


> ---
>  include/hw/virtio/vhost-backend.h | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/include/hw/virtio/vhost-backend.h 
> b/include/hw/virtio/vhost-backend.h
> index 8475c5a29d..e732d2e702 100644
> --- a/include/hw/virtio/vhost-backend.h
> +++ b/include/hw/virtio/vhost-backend.h
> @@ -126,6 +126,8 @@ typedef int (*vhost_get_device_id_op)(struct vhost_dev 
> *dev, uint32_t *dev_id);
>  
>  typedef bool (*vhost_force_iommu_op)(struct vhost_dev *dev);
>  
> +typedef int (*vhost_set_config_call_op)(struct vhost_dev *dev,
> +   int fd);
>  typedef struct VhostOps {
>  VhostBackendType backend_type;
>  vhost_backend_init vhost_backend_init;
> @@ -171,6 +173,7 @@ typedef struct VhostOps {
>  vhost_vq_get_addr_op  vhost_vq_get_addr;
>  vhost_get_device_id_op vhost_get_device_id;
>  vhost_force_iommu_op vhost_force_iommu;
> +vhost_set_config_call_op vhost_set_config_call;
>  } VhostOps;
>  
>  extern const VhostOps user_ops;
> -- 
> 2.21.3




Re: [PATCH v9 10/10] virtio-pci: add support for configure interrupt

2021-10-18 Thread Michael S. Tsirkin
On Thu, Sep 30, 2021 at 10:33:48AM +0800, Cindy Lu wrote:
> Add support for configure interrupt, The process is used kvm_irqfd_assign
> to set the gsi to kernel. When the configure notifier was signal by
> host, qemu will inject a msix interrupt to guest
> 
> Signed-off-by: Cindy Lu 

This one makes make check hang on my machine.

Just make, then:
QTEST_QEMU_STORAGE_DAEMON_BINARY=./build/storage-daemon/qemu-storage-daemon \
QTEST_QEMU_BINARY=build/x86_64-softmmu/qemu-system-x86_64 \
./build/tests/qtest/qos-test

and observe it hang.


> ---
>  hw/virtio/virtio-pci.c | 88 +-
>  hw/virtio/virtio-pci.h |  4 +-
>  2 files changed, 72 insertions(+), 20 deletions(-)
> 
> diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
> index d0a2c2fb81..50179c2ba1 100644
> --- a/hw/virtio/virtio-pci.c
> +++ b/hw/virtio/virtio-pci.c
> @@ -728,7 +728,8 @@ static int virtio_pci_get_notifier(VirtIOPCIProxy *proxy, 
> int queue_no,
>  VirtQueue *vq;
>  
>  if (queue_no == VIRTIO_CONFIG_IRQ_IDX) {
> -return -1;
> +*n = virtio_config_get_guest_notifier(vdev);
> +*vector = vdev->config_vector;
>  } else {
>  if (!virtio_queue_get_num(vdev, queue_no)) {
>  return -1;
> @@ -806,6 +807,10 @@ static int kvm_virtio_pci_vector_use(VirtIOPCIProxy 
> *proxy, int nvqs)
>  return ret;
>  }
>  
> +static int kvm_virtio_pci_vector_config_use(VirtIOPCIProxy *proxy)
> +{
> +return kvm_virtio_pci_vector_use_one(proxy, VIRTIO_CONFIG_IRQ_IDX);
> +}
>  
>  static void kvm_virtio_pci_vector_release_one(VirtIOPCIProxy *proxy,
>int queue_no)
> @@ -829,6 +834,7 @@ static void 
> kvm_virtio_pci_vector_release_one(VirtIOPCIProxy *proxy,
>  }
>  kvm_virtio_pci_vq_vector_release(proxy, vector);
>  }
> +
>  static void kvm_virtio_pci_vector_release(VirtIOPCIProxy *proxy, int nvqs)
>  {
>  int queue_no;
> @@ -842,6 +848,11 @@ static void kvm_virtio_pci_vector_release(VirtIOPCIProxy 
> *proxy, int nvqs)
>  }
>  }
>  
> +static void kvm_virtio_pci_vector_config_release(VirtIOPCIProxy *proxy)
> +{
> +kvm_virtio_pci_vector_release_one(proxy, VIRTIO_CONFIG_IRQ_IDX);
> +}
> +
>  static int virtio_pci_one_vector_unmask(VirtIOPCIProxy *proxy,
> unsigned int queue_no,
> unsigned int vector,
> @@ -923,9 +934,17 @@ static int virtio_pci_vector_unmask(PCIDevice *dev, 
> unsigned vector,
>  }
>  vq = virtio_vector_next_queue(vq);
>  }
> -
> +/* unmask config intr */
> +n = virtio_config_get_guest_notifier(vdev);
> +ret = virtio_pci_one_vector_unmask(proxy, VIRTIO_CONFIG_IRQ_IDX, vector,
> +   msg, n);
> +if (ret < 0) {
> +goto undo_config;
> +}
>  return 0;
> -
> +undo_config:
> +n = virtio_config_get_guest_notifier(vdev);
> +virtio_pci_one_vector_mask(proxy, VIRTIO_CONFIG_IRQ_IDX, vector, n);
>  undo:
>  vq = virtio_vector_first_queue(vdev, vector);
>  while (vq && unmasked >= 0) {
> @@ -959,6 +978,8 @@ static void virtio_pci_vector_mask(PCIDevice *dev, 
> unsigned vector)
>  }
>  vq = virtio_vector_next_queue(vq);
>  }
> +n = virtio_config_get_guest_notifier(vdev);
> +virtio_pci_one_vector_mask(proxy, VIRTIO_CONFIG_IRQ_IDX, vector, n);
>  }
>  
>  static void virtio_pci_vector_poll(PCIDevice *dev,
> @@ -971,19 +992,17 @@ static void virtio_pci_vector_poll(PCIDevice *dev,
>  int queue_no;
>  unsigned int vector;
>  EventNotifier *notifier;
> -VirtQueue *vq;
> -
> -for (queue_no = 0; queue_no < proxy->nvqs_with_notifiers; queue_no++) {
> -if (!virtio_queue_get_num(vdev, queue_no)) {
> +int ret;
> +for (queue_no = VIRTIO_CONFIG_IRQ_IDX;
> + queue_no < proxy->nvqs_with_notifiers; queue_no++) {
> +ret = virtio_pci_get_notifier(proxy, queue_no, &notifier, &vector);
> +if (ret < 0) {
>  break;
>  }
> -vector = virtio_queue_vector(vdev, queue_no);
>  if (vector < vector_start || vector >= vector_end ||
>  !msix_is_masked(dev, vector)) {
>  continue;
>  }
> -vq = virtio_get_queue(vdev, queue_no);
> -notifier = virtio_queue_get_guest_notifier(vq);
>  if (k->guest_notifier_pending) {
>  if (k->guest_notifier_pending(vdev, queue_no)) {
>  msix_set_pending(dev, vector);
> @@ -994,23 +1013,42 @@ static void virtio_pci_vector_poll(PCIDevice *dev,
>  }
>  }
>  
> +void virtio_pci_set_guest_notifier_fd_handler(VirtIODevice *vdev, VirtQueue 
> *vq,
> +  int n, bool assign,
> +  bool with_irqfd)
> +{
> +if (n == VIRTIO_CONFIG_IRQ_IDX) {
> +virtio_config_set_guest_notifier_fd_handler(vdev, assign, 
> with_irqfd);
> +} else {

Re: [PATCH v3 1/2] vhost-user: fix VirtQ notifier cleanup

2021-10-18 Thread Xueming(Steven) Li
On Tue, 2021-10-19 at 02:15 -0400, Michael S. Tsirkin wrote:
> On Fri, Oct 08, 2021 at 03:58:04PM +0800, Xueming Li wrote:
> > When vhost-user device cleanup and unmmap notifier address, VM cpu
> > thread that writing the notifier failed with accessing invalid address.
> > 
> > To avoid this concurrent issue, wait memory flatview update by draining
> > rcu callbacks, then unmap notifiers.
> > 
> > Fixes: 44866521bd6e ("vhost-user: support registering external host 
> > notifiers")
> > Cc: tiwei@intel.com
> > Cc: qemu-sta...@nongnu.org
> > Cc: Yuwei Zhang 
> > Signed-off-by: Xueming Li 
> > ---
> >  hw/virtio/vhost-user.c | 21 ++---
> >  1 file changed, 14 insertions(+), 7 deletions(-)
> > 
> > diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
> > index bf6e50223c..b2e948bdc7 100644
> > --- a/hw/virtio/vhost-user.c
> > +++ b/hw/virtio/vhost-user.c
> > @@ -1165,6 +1165,12 @@ static void vhost_user_host_notifier_remove(struct 
> > vhost_dev *dev,
> >  
> >  if (n->addr && n->set) {
> >  virtio_queue_set_host_notifier_mr(vdev, queue_idx, &n->mr, false);
> > +if (!qemu_in_vcpu_thread()) { /* Avoid vCPU dead lock. */
> > +/* Wait for VM threads accessing old flatview which contains 
> > notifier. */
> > +drain_call_rcu();
> > +}
> > +munmap(n->addr, qemu_real_host_page_size);
> > +n->addr = NULL;
> >  n->set = false;
> >  }
> >  }
> 
> 
> ../hw/virtio/vhost-user.c: In function ‘vhost_user_host_notifier_remove’:
> ../hw/virtio/vhost-user.c:1168:14: error: implicit declaration of function 
> ‘qemu_in_vcpu_thread’ [-Werror=implicit-function-declaration]
>  1168 | if (!qemu_in_vcpu_thread()) { /* Avoid vCPU dead lock. */
>   |  ^~~
> ../hw/virtio/vhost-user.c:1168:14: error: nested extern declaration of 
> ‘qemu_in_vcpu_thread’ [-Werror=nested-externs]
> cc1: all warnings being treated as errors
> ninja: build stopped: subcommand failed.
> make[1]: *** [Makefile:162: run-ninja] Error 1
> make[1]: Leaving directory '/scm/qemu/build'
> make: *** [GNUmakefile:11: all] Error 2
> 
> 
> Although the following patch fixes it, bisect is broken.

Yes, really an issue, v4 posted, thanks!

> 
> 
> > @@ -1502,12 +1508,7 @@ static int 
> > vhost_user_slave_handle_vring_host_notifier(struct vhost_dev *dev,
> >  
> >  n = &user->notifier[queue_idx];
> >  
> > -if (n->addr) {
> > -virtio_queue_set_host_notifier_mr(vdev, queue_idx, &n->mr, false);
> > -object_unparent(OBJECT(&n->mr));
> > -munmap(n->addr, page_size);
> > -n->addr = NULL;
> > -}
> > +vhost_user_host_notifier_remove(dev, queue_idx);
> >  
> >  if (area->u64 & VHOST_USER_VRING_NOFD_MASK) {
> >  return 0;
> > @@ -2485,11 +2486,17 @@ void vhost_user_cleanup(VhostUserState *user)
> >  for (i = 0; i < VIRTIO_QUEUE_MAX; i++) {
> >  if (user->notifier[i].addr) {
> >  object_unparent(OBJECT(&user->notifier[i].mr));
> > +}
> > +}
> > +memory_region_transaction_commit();
> > +/* Wait for VM threads accessing old flatview which contains notifier. 
> > */
> > +drain_call_rcu();
> > +for (i = 0; i < VIRTIO_QUEUE_MAX; i++) {
> > +if (user->notifier[i].addr) {
> >  munmap(user->notifier[i].addr, qemu_real_host_page_size);
> >  user->notifier[i].addr = NULL;
> >  }
> >  }
> > -memory_region_transaction_commit();
> >  user->chr = NULL;
> >  }
> >  
> > -- 
> > 2.33.0
> 



[PATCH v4 2/2] vhost-user: remove VirtQ notifier restore

2021-10-18 Thread Xueming Li
When a vhost-user vDPA client restarts, the VQ notifier resources become
invalid, so there is no need to keep the mmap; the vDPA client will set the
VQ notifier again after it reconnects.

Remove the VQ notifier restore and the related 'set' flag.

Fixes: 44866521bd6e ("vhost-user: support registering external host notifiers")
Cc: tiwei@intel.com
Cc: qemu-sta...@nongnu.org
Cc: Yuwei Zhang 
Signed-off-by: Xueming Li 
---
 hw/virtio/vhost-user.c | 19 +--
 include/hw/virtio/vhost-user.h |  1 -
 2 files changed, 1 insertion(+), 19 deletions(-)

diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index cfca1b9adc..cc33f4b042 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -1144,19 +1144,6 @@ static int vhost_user_set_vring_num(struct vhost_dev *dev,
 return vhost_set_vring(dev, VHOST_USER_SET_VRING_NUM, ring);
 }
 
-static void vhost_user_host_notifier_restore(struct vhost_dev *dev,
- int queue_idx)
-{
-struct vhost_user *u = dev->opaque;
-VhostUserHostNotifier *n = &u->user->notifier[queue_idx];
-VirtIODevice *vdev = dev->vdev;
-
-if (n->addr && !n->set) {
-virtio_queue_set_host_notifier_mr(vdev, queue_idx, &n->mr, true);
-n->set = true;
-}
-}
-
 static void vhost_user_host_notifier_remove(struct vhost_dev *dev,
 int queue_idx)
 {
@@ -1164,7 +1151,7 @@ static void vhost_user_host_notifier_remove(struct vhost_dev *dev,
 VhostUserHostNotifier *n = &u->user->notifier[queue_idx];
 VirtIODevice *vdev = dev->vdev;
 
-if (n->addr && n->set) {
+if (n->addr) {
 virtio_queue_set_host_notifier_mr(vdev, queue_idx, &n->mr, false);
 if (!qemu_in_vcpu_thread()) { /* Avoid vCPU dead lock. */
 /* Wait for VM threads accessing old flatview which contains notifier. */
@@ -1172,15 +1159,12 @@ static void vhost_user_host_notifier_remove(struct vhost_dev *dev,
 }
 munmap(n->addr, qemu_real_host_page_size);
 n->addr = NULL;
-n->set = false;
 }
 }
 
 static int vhost_user_set_vring_base(struct vhost_dev *dev,
  struct vhost_vring_state *ring)
 {
-vhost_user_host_notifier_restore(dev, ring->index);
-
 return vhost_set_vring(dev, VHOST_USER_SET_VRING_BASE, ring);
 }
 
@@ -1540,7 +1524,6 @@ static int vhost_user_slave_handle_vring_host_notifier(struct vhost_dev *dev,
 }
 
 n->addr = addr;
-n->set = true;
 
 return 0;
 }
diff --git a/include/hw/virtio/vhost-user.h b/include/hw/virtio/vhost-user.h
index a9abca3288..f6012b2078 100644
--- a/include/hw/virtio/vhost-user.h
+++ b/include/hw/virtio/vhost-user.h
@@ -14,7 +14,6 @@
 typedef struct VhostUserHostNotifier {
 MemoryRegion mr;
 void *addr;
-bool set;
 } VhostUserHostNotifier;
 
 typedef struct VhostUserState {
-- 
2.33.0




[PATCH v4 0/2] Improve vhost-user VQ notifier unmap

2021-10-18 Thread Xueming Li
When a vDPA application in client mode shuts down, an unmapped VQ notifier
might still be accessed by a vCPU thread under high tx traffic, which
crashes the VM in rare conditions. This series tries to fix it with better
RCU synchronization of the new flatview.
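
The ordering the series aims for (make the notifier unreachable, wait for accesses already in flight, only then unmap) can be sketched as below. This is a simplified, hypothetical model and not how QEMU does it: a plain reader counter stands in for the RCU grace period that the patches wait for with drain_call_rcu(), and demo_notifier and its functions are invented for illustration.

```c
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS */
#include <assert.h>
#include <sched.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

struct demo_notifier {
    _Atomic(uint8_t *) addr; /* NULL once unpublished */
    size_t len;
    atomic_int readers;      /* accesses currently in flight */
};

/* Map one page and publish it. Returns 0 on success, -1 on failure. */
static int demo_notifier_map(struct demo_notifier *n)
{
    long page = sysconf(_SC_PAGESIZE);
    void *p;

    if (page <= 0) {
        return -1;
    }
    p = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) {
        return -1;
    }
    n->len = (size_t)page;
    atomic_init(&n->readers, 0);
    atomic_init(&n->addr, (uint8_t *)p);
    return 0;
}

/* Reader side, e.g. a vCPU thread writing the notifier. -1 if removed. */
static int demo_notifier_kick(struct demo_notifier *n)
{
    uint8_t *p;
    int ret = -1;

    atomic_fetch_add(&n->readers, 1);
    p = atomic_load(&n->addr);
    if (p) {
        p[0] = 1;
        ret = 0;
    }
    atomic_fetch_sub(&n->readers, 1);
    return ret;
}

/* Remover side: unpublish, wait for in-flight readers, then unmap. */
static void demo_notifier_remove(struct demo_notifier *n)
{
    uint8_t *p = atomic_exchange(&n->addr, (uint8_t *)NULL);
    int rc;

    if (!p) {
        return; /* already removed */
    }
    while (atomic_load(&n->readers) != 0) {
        sched_yield(); /* stand-in for waiting out a grace period */
    }
    rc = munmap(p, n->len);
    assert(rc == 0);
    (void)rc;
}
```

Unmapping before the wait is the bug class described above: a reader that already loaded the address would write to unmapped memory. Note that a remover running on the reader's own thread must not wait for itself, which is the deadlock the v2 change ("no RCU draining on vCPU thread") avoids.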

v2: no RCU draining on vCPU thread
v3: minor fix on coding style and comments
https://lists.nongnu.org/archive/html/qemu-devel/2021-10/msg01764.html
v4: fix first patch compilation

Xueming Li (2):
  vhost-user: fix VirtQ notifier cleanup
  vhost-user: remove VirtQ notifier restore

 hw/virtio/vhost-user.c | 41 +-
 include/hw/virtio/vhost-user.h |  1 -
 2 files changed, 16 insertions(+), 26 deletions(-)

-- 
2.33.0




[PATCH v4 1/2] vhost-user: fix VirtQ notifier cleanup

2021-10-18 Thread Xueming Li
When a vhost-user device cleans up and unmaps the notifier address, a VM
vCPU thread that is writing to the notifier fails by accessing an invalid
address.

To avoid this race, wait for the memory flatview update by draining the
RCU callbacks, then unmap the notifiers.

Fixes: 44866521bd6e ("vhost-user: support registering external host notifiers")
Cc: tiwei@intel.com
Cc: qemu-sta...@nongnu.org
Cc: Yuwei Zhang 
Signed-off-by: Xueming Li 
---
 hw/virtio/vhost-user.c | 22 +++---
 1 file changed, 15 insertions(+), 7 deletions(-)

diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index bf6e50223c..cfca1b9adc 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -18,6 +18,7 @@
 #include "chardev/char-fe.h"
 #include "io/channel-socket.h"
 #include "sysemu/kvm.h"
+#include "sysemu/cpus.h"
 #include "qemu/error-report.h"
 #include "qemu/main-loop.h"
 #include "qemu/sockets.h"
@@ -1165,6 +1166,12 @@ static void vhost_user_host_notifier_remove(struct vhost_dev *dev,
 
 if (n->addr && n->set) {
 virtio_queue_set_host_notifier_mr(vdev, queue_idx, &n->mr, false);
+if (!qemu_in_vcpu_thread()) { /* Avoid vCPU dead lock. */
+/* Wait for VM threads accessing old flatview which contains notifier. */
+drain_call_rcu();
+}
+munmap(n->addr, qemu_real_host_page_size);
+n->addr = NULL;
 n->set = false;
 }
 }
@@ -1502,12 +1509,7 @@ static int vhost_user_slave_handle_vring_host_notifier(struct vhost_dev *dev,
 
 n = &user->notifier[queue_idx];
 
-if (n->addr) {
-virtio_queue_set_host_notifier_mr(vdev, queue_idx, &n->mr, false);
-object_unparent(OBJECT(&n->mr));
-munmap(n->addr, page_size);
-n->addr = NULL;
-}
+vhost_user_host_notifier_remove(dev, queue_idx);
 
 if (area->u64 & VHOST_USER_VRING_NOFD_MASK) {
 return 0;
@@ -2485,11 +2487,17 @@ void vhost_user_cleanup(VhostUserState *user)
 for (i = 0; i < VIRTIO_QUEUE_MAX; i++) {
 if (user->notifier[i].addr) {
 object_unparent(OBJECT(&user->notifier[i].mr));
+}
+}
+memory_region_transaction_commit();
+/* Wait for VM threads accessing old flatview which contains notifier. */
+drain_call_rcu();
+for (i = 0; i < VIRTIO_QUEUE_MAX; i++) {
+if (user->notifier[i].addr) {
 munmap(user->notifier[i].addr, qemu_real_host_page_size);
 user->notifier[i].addr = NULL;
 }
 }
-memory_region_transaction_commit();
 user->chr = NULL;
 }
 
-- 
2.33.0




[PATCH v4] tests: qtest: Add virtio-iommu test

2021-10-18 Thread Eric Auger
Add the framework to test the virtio-iommu-pci device
and tests exercising the attach/detach, map/unmap API.

Signed-off-by: Eric Auger 

---

This applies on top of Jean-Philippe's
[PATCH v4 00/11] virtio-iommu: Add ACPI support.
The branch can be found at:
https://github.com/eauger/qemu.git
branch qtest-virtio-iommu-v4

To run the tests:
make tests/qtest/qos-test
cd build
QTEST_QEMU_STORAGE_DAEMON_BINARY=./storage-daemon/qemu-storage-daemon \
QTEST_QEMU_BINARY=x86_64-softmmu/qemu-system-x86_64 tests/qtest/qos-test

v3 -> v4:
- removed the virtio-iommu-device graph pieces thus fix arm make check execution
- s/end-point/endpoint
- initialize struct virtio_iommu_req_attach
- add cpu_to_lexx
- test overlapping map requests
- comment fixes
- check send_map() returned value
- introduce read_tail_status()
- remove Thomas' A-b due to code changes

v2 -> v3:
- s/memread/qtest_memread
- s/char buffer[64]/struct virtio_iommu_req_tail buffer
- added Thomas' A-b

v1 -> v2:
- fix the license info (Thomas)
- use UINT64_MAX (Philippe)
---
 tests/qtest/libqos/meson.build|   1 +
 tests/qtest/libqos/virtio-iommu.c | 126 
 tests/qtest/libqos/virtio-iommu.h |  40 
 tests/qtest/meson.build   |   1 +
 tests/qtest/virtio-iommu-test.c   | 327 ++
 5 files changed, 495 insertions(+)
 create mode 100644 tests/qtest/libqos/virtio-iommu.c
 create mode 100644 tests/qtest/libqos/virtio-iommu.h
 create mode 100644 tests/qtest/virtio-iommu-test.c

diff --git a/tests/qtest/libqos/meson.build b/tests/qtest/libqos/meson.build
index 1f5c8f1053..ba90bbe2b8 100644
--- a/tests/qtest/libqos/meson.build
+++ b/tests/qtest/libqos/meson.build
@@ -40,6 +40,7 @@ libqos_srcs = files('../libqtest.c',
 'virtio-rng.c',
 'virtio-scsi.c',
 'virtio-serial.c',
+'virtio-iommu.c',
 
 # qgraph machines:
 'aarch64-xlnx-zcu102-machine.c',
diff --git a/tests/qtest/libqos/virtio-iommu.c b/tests/qtest/libqos/virtio-iommu.c
new file mode 100644
index 00..18cba4ca36
--- /dev/null
+++ b/tests/qtest/libqos/virtio-iommu.c
@@ -0,0 +1,126 @@
+/*
+ * libqos driver virtio-iommu-pci framework
+ *
+ * Copyright (c) 2021 Red Hat, Inc.
+ *
+ * Authors:
+ *  Eric Auger 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or (at your
+ * option) any later version.  See the COPYING file in the top-level directory.
+ *
+ */
+
+#include "qemu/osdep.h"
+#include "libqtest.h"
+#include "qemu/module.h"
+#include "qgraph.h"
+#include "virtio-iommu.h"
+#include "hw/virtio/virtio-iommu.h"
+
+static QGuestAllocator *alloc;
+
+/* virtio-iommu-device */
+static void *qvirtio_iommu_get_driver(QVirtioIOMMU *v_iommu,
+  const char *interface)
+{
+if (!g_strcmp0(interface, "virtio-iommu")) {
+return v_iommu;
+}
+if (!g_strcmp0(interface, "virtio")) {
+return v_iommu->vdev;
+}
+
+fprintf(stderr, "%s not present in virtio-iommu-device\n", interface);
+g_assert_not_reached();
+}
+
+static void virtio_iommu_cleanup(QVirtioIOMMU *interface)
+{
+qvirtqueue_cleanup(interface->vdev->bus, interface->vq, alloc);
+}
+
+static void virtio_iommu_setup(QVirtioIOMMU *interface)
+{
+QVirtioDevice *vdev = interface->vdev;
+uint64_t features;
+
+features = qvirtio_get_features(vdev);
+features &= ~(QVIRTIO_F_BAD_FEATURE |
+  (1ull << VIRTIO_RING_F_INDIRECT_DESC) |
+  (1ull << VIRTIO_RING_F_EVENT_IDX) |
+  (1ull << VIRTIO_IOMMU_F_BYPASS));
+qvirtio_set_features(vdev, features);
+interface->vq = qvirtqueue_setup(interface->vdev, alloc, 0);
+qvirtio_set_driver_ok(interface->vdev);
+}
+
+/* virtio-iommu-pci */
+static void *qvirtio_iommu_pci_get_driver(void *object, const char *interface)
+{
+QVirtioIOMMUPCI *v_iommu = object;
+if (!g_strcmp0(interface, "pci-device")) {
+return v_iommu->pci_vdev.pdev;
+}
+return qvirtio_iommu_get_driver(&v_iommu->iommu, interface);
+}
+
+static void qvirtio_iommu_pci_destructor(QOSGraphObject *obj)
+{
+QVirtioIOMMUPCI *iommu_pci = (QVirtioIOMMUPCI *) obj;
+QVirtioIOMMU *interface = &iommu_pci->iommu;
+QOSGraphObject *pci_vobj =  &iommu_pci->pci_vdev.obj;
+
+virtio_iommu_cleanup(interface);
+qvirtio_pci_destructor(pci_vobj);
+}
+
+static void qvirtio_iommu_pci_start_hw(QOSGraphObject *obj)
+{
+QVirtioIOMMUPCI *iommu_pci = (QVirtioIOMMUPCI *) obj;
+QVirtioIOMMU *interface = &iommu_pci->iommu;
+QOSGraphObject *pci_vobj =  &iommu_pci->pci_vdev.obj;
+
+qvirtio_pci_start_hw(pci_vobj);
+virtio_iommu_setup(interface);
+}
+
+
+static void *virtio_iommu_pci_create(void *pci_bus, QGuestAllocator *t_alloc,
+   void *addr)
+{
+QVirtioIOMMUPCI *virtio_rpci = g_new0(QVirtioIOMMUPCI, 1);
+QVirtioIOMMU *interface = &virtio_rpci->iommu;
+QOSGraphObject *obj = &virtio_rpci->pci_vdev.obj;
+
+virtio_pci_init(&v

Re: [PATCH v3 2/2] vhost-user: remove VirtQ notifier restore

2021-10-18 Thread Michael S. Tsirkin
On Fri, Oct 08, 2021 at 03:58:05PM +0800, Xueming Li wrote:
> When a vhost-user vdpa client restarts, VQ notifier resources become
> invalid, so there is no need to keep the mmap; the vdpa client will set
> the VQ notifier after reconnecting.
> 
> Remove VQ notifier restore and related flags.
> 
> Fixes: 44866521bd6e ("vhost-user: support registering external host notifiers")
> Cc: tiwei@intel.com
> Cc: qemu-sta...@nongnu.org
> Cc: Yuwei Zhang 
> Signed-off-by: Xueming Li 

Pls fix up bisect and repost.
Also, can you please clarify what "no need" means?
You include a Fixes tag, but is there a bug? What behaviour
are you trying to fix? A resource leak?
Or are you just simplifying code?
If the latter, then there is no need for a Fixes tag.




> ---
>  hw/virtio/vhost-user.c | 20 ++--
>  include/hw/virtio/vhost-user.h |  1 -
>  2 files changed, 2 insertions(+), 19 deletions(-)
> 
> diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
> index b2e948bdc7..d127aa478a 100644
> --- a/hw/virtio/vhost-user.c
> +++ b/hw/virtio/vhost-user.c
> @@ -22,6 +22,7 @@
>  #include "qemu/main-loop.h"
>  #include "qemu/sockets.h"
>  #include "sysemu/cryptodev.h"
> +#include "sysemu/cpus.h"
>  #include "migration/migration.h"
>  #include "migration/postcopy-ram.h"
>  #include "trace.h"
> @@ -1143,19 +1144,6 @@ static int vhost_user_set_vring_num(struct vhost_dev *dev,
>  return vhost_set_vring(dev, VHOST_USER_SET_VRING_NUM, ring);
>  }
>  
> -static void vhost_user_host_notifier_restore(struct vhost_dev *dev,
> - int queue_idx)
> -{
> -struct vhost_user *u = dev->opaque;
> -VhostUserHostNotifier *n = &u->user->notifier[queue_idx];
> -VirtIODevice *vdev = dev->vdev;
> -
> -if (n->addr && !n->set) {
> -virtio_queue_set_host_notifier_mr(vdev, queue_idx, &n->mr, true);
> -n->set = true;
> -}
> -}
> -
>  static void vhost_user_host_notifier_remove(struct vhost_dev *dev,
>  int queue_idx)
>  {
> @@ -1163,7 +1151,7 @@ static void vhost_user_host_notifier_remove(struct vhost_dev *dev,
>  VhostUserHostNotifier *n = &u->user->notifier[queue_idx];
>  VirtIODevice *vdev = dev->vdev;
>  
> -if (n->addr && n->set) {
> +if (n->addr) {
>  virtio_queue_set_host_notifier_mr(vdev, queue_idx, &n->mr, false);
>  if (!qemu_in_vcpu_thread()) { /* Avoid vCPU dead lock. */
>  /* Wait for VM threads accessing old flatview which contains notifier. */
> @@ -1171,15 +1159,12 @@ static void vhost_user_host_notifier_remove(struct vhost_dev *dev,
>  }
>  munmap(n->addr, qemu_real_host_page_size);
>  n->addr = NULL;
> -n->set = false;
>  }
>  }
>  
>  static int vhost_user_set_vring_base(struct vhost_dev *dev,
>   struct vhost_vring_state *ring)
>  {
> -vhost_user_host_notifier_restore(dev, ring->index);
> -
>  return vhost_set_vring(dev, VHOST_USER_SET_VRING_BASE, ring);
>  }
>  
> @@ -1539,7 +1524,6 @@ static int vhost_user_slave_handle_vring_host_notifier(struct vhost_dev *dev,
>  }
>  
>  n->addr = addr;
> -n->set = true;
>  
>  return 0;
>  }
> diff --git a/include/hw/virtio/vhost-user.h b/include/hw/virtio/vhost-user.h
> index a9abca3288..f6012b2078 100644
> --- a/include/hw/virtio/vhost-user.h
> +++ b/include/hw/virtio/vhost-user.h
> @@ -14,7 +14,6 @@
>  typedef struct VhostUserHostNotifier {
>  MemoryRegion mr;
>  void *addr;
> -bool set;
>  } VhostUserHostNotifier;
>  
>  typedef struct VhostUserState {
> -- 
> 2.33.0




Re: [PATCH] tests/vm/openbsd: Update to release 7.0

2021-10-18 Thread Thomas Huth

On 18/10/2021 22.53, Richard Henderson wrote:

There are two minor changes required in the script for the
network configuration of the newer release.

Signed-off-by: Richard Henderson 
---
  tests/vm/openbsd | 7 +++
  1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/tests/vm/openbsd b/tests/vm/openbsd
index c4c78a80f1..337fe7c303 100755
--- a/tests/vm/openbsd
+++ b/tests/vm/openbsd
@@ -22,8 +22,8 @@ class OpenBSDVM(basevm.BaseVM):
  name = "openbsd"
  arch = "x86_64"
  
-link = "https://cdn.openbsd.org/pub/OpenBSD/6.9/amd64/install69.iso"
-csum = "140d26548aec680e34bb5f82295414228e7f61e4f5e7951af066014fda2d6e43"
+link = "https://cdn.openbsd.org/pub/OpenBSD/7.0/amd64/install70.iso"
+csum = "1882f9a23c9800e5dba3dbd2cf0126f552605c915433ef4c5bb672610a4ca3a4"
  size = "20G"
  pkgs = [
  # tools
@@ -95,10 +95,9 @@ class OpenBSDVM(basevm.BaseVM):
  self.console_wait_send("Terminal type",   "xterm\n")
  self.console_wait_send("System hostname", "openbsd\n")
  self.console_wait_send("Which network interface", "vio0\n")
-self.console_wait_send("IPv4 address","dhcp\n")
+self.console_wait_send("IPv4 address","autoconf\n")
  self.console_wait_send("IPv6 address","none\n")
  self.console_wait_send("Which network interface", "done\n")
-self.console_wait_send("DNS domain name", "localnet\n")
  self.console_wait("Password for root account")
  self.console_send("%s\n" % self._config["root_pass"])
  self.console_wait("Password for root account")


Works for me!

Tested-by: Thomas Huth 





Re: [PATCH v4 3/3] bios-tables-test: Generate reference table for virt/DBG2

2021-10-18 Thread Eric Auger
Hi Richard,

On 10/18/21 11:00 PM, Richard Henderson wrote:
> On 10/7/21 12:29 AM, Eric Auger wrote:
>> diff --git a/tests/data/acpi/virt/DBG2 b/tests/data/acpi/virt/DBG2
>> index
>> e69de29bb2d1d6434b8b29ae775ad8c2e48c5391..86e6314f7b0235ef8ed3e0221e09f996c41f5e98
>> 100644
>> GIT binary patch
>> literal 87
>> zcmZ>9ayJTR0D|*Q{>~o33QiFL&I&-l2owUbL9`AKgJ=eA21Zr}H4uw|p@A7lh%qQJ
>> TFmQk+Il-a=3=Gcxz6J~c3~mVl
>>
>> literal 0
>> HcmV?d1
>>
>
> Something went wrong here:
>
> Applying: bios-tables-test: Generate reference table for virt/DBG2
> error: corrupt binary patch at line 75: --
>
> Can you please re-send?

Sure I will resend asap. Sorry for the inconvenience.

Eric
>
>
> r~
>




Re: [PATCH 0/6] RfC: try improve native hotplug for pcie root ports

2021-10-18 Thread Gerd Hoffmann
  Hi,

> > Yes.  Maybe ask rh qe to run the patch set through their hotplug test
> > suite (to avoid an acpi-hotplug style disaster where qe found various
> > issues after release)?
> 
> I'll poke around to see if they can help us... we'll need
> a backport for that though.

Easy, it's a clean cherry-pick for 6.1, scratch build is on the way.

> > > I would also like to see a shorter timeout, maybe 100ms, so
> > > that we are more responsive to guest changes in resending request.
> > 
> > I don't think it is a good idea to go for a shorter timeout given that
> > the 5 seconds are in the specs and we want to avoid a resent request being
> > interpreted as cancel.
> > It also wouldn't change anything at least for linux guests because linux
> > is waiting those 5 seconds (with power indicator in blinking state).
> > Only the reason for refusing 'device_del' changes from "5 secs not over
> > yet" to "guest is busy processing the hotplug request".
> 
> First 5 seconds yes. But the retries afterwards?

Hmm, maybe, but I'd tend to keep it simple and go for 5 secs no matter
what.  If the guest isn't responding (maybe because it is in the middle
of a reboot) it's unlikely that fast re-requests are fundamentally
changing things.

> > We could consider to tackle the 5sec timeout on the guest side, i.e.
> > have linux skip the 5sec wait in case the root port is virtual (should
> > be easy to figure by checking the pci id).
> > 
> > take care,
> >   Gerd
> 
> Yes ... do we want to control how long it blinks from hypervisor side?

Is there a good reason for that?
If not, then no.  Keep it simple.

When the guest powers off the slot, pcie_cap_slot_write_config() will
happily unplug the device without additional checks (no check whether
the 5 seconds are over, also no check whether there is a pending unplug
request in the first place).

So in theory the guest turning off slot power quickly should work just
fine and speed up the unplug process in the common case (guest is
up'n'running and responsive).  Down to 1-2 secs instead of 5-7.
Didn't actually test that though.

take care,
  Gerd




Re: [PATCH v6] Work around vhost-user-blk-test hang

2021-10-18 Thread Michael S. Tsirkin
On Mon, Oct 18, 2021 at 10:33:02PM +, Raphael Norwitz wrote:
> On Mon, Oct 18, 2021 at 05:50:41PM -0400, Michael S. Tsirkin wrote:
> > On Thu, Oct 14, 2021 at 04:32:23AM +, Raphael Norwitz wrote:
> > > The vhost-user-blk-test qtest has been hanging intermittently for a
> > > while. The root cause is not yet fully understood, but the hang is
> > > impacting enough users that it is important to merge a workaround for
> > > it.
> > > 
> > > The race which causes the hang occurs early on in vhost-user setup,
> > > where a vhost-user message is never received by the backend. Forcing
> > > QEMU to wait until the storage-daemon has had some time to initialize
> > > prevents the hang. Thus the existing storage-daemon pidfile option can
> > > be used to implement a workaround cleanly and effectively, since it
> > > creates a file only once the storage-daemon initialization is complete.
> > > 
> > > This change implements a workaround for the vhost-user-blk-test hang by
> > > making QEMU wait until the storage-daemon has written out a pidfile
> > > before attempting to connect and send messages over the vhost-user
> > > socket.
> > > 
> > > Some relevant mailing list discussions:
> > > 
> > > [1] 
> > > https://urldefense.proofpoint.com/v2/url?u=https-3A__lore.kernel.org_qemu-2Ddevel_CAFEAcA8kYpz9LiPNxnWJAPSjc-3Dnv532bEdyfynaBeMeohqBp3A-40mail.gmail.com_&d=DwIBAg&c=s883GpUCOChKOHiocYtGcg&r=In4gmR1pGzKB8G5p6LUrWqkSMec2L5EtXZow_FZNJZk&m=eDRDFhe3H61BSSpDvy3PKzwQIa2grX5hNMhigtjMCJ8&s=c6OKIl0NMsDqP0-ZNnVjHhDq2psXIVszz-uBKw_8pEo&e=
> > >  
> > > [2] 
> > > https://urldefense.proofpoint.com/v2/url?u=https-3A__lore.kernel.org_qemu-2Ddevel_YWaky-252FKVbS-252FKZjlV-40stefanha-2Dx1.localdomain_&d=DwIBAg&c=s883GpUCOChKOHiocYtGcg&r=In4gmR1pGzKB8G5p6LUrWqkSMec2L5EtXZow_FZNJZk&m=eDRDFhe3H61BSSpDvy3PKzwQIa2grX5hNMhigtjMCJ8&s=B4EM_0f7TXqsh18YEKOg-cFHabUjsVA5Ie1riDXaB7A&e=
> > >  
> > > 
> > > Signed-off-by: Raphael Norwitz 
> > > Reviewed-by: Eric Blake 
> > 
> > 
> > Um. Does not seem to make things better for me:
> > 
> > **
> > ERROR:../tests/qtest/vhost-user-blk-test.c:950:start_vhost_user_blk: 
> > assertion failed (retries < PIDFILE_RETRIES): (5 < 5)
> > ERROR qtest-x86_64/qos-test - Bail out! 
> > ERROR:../tests/qtest/vhost-user-blk-test.c:950:start_vhost_user_blk: 
> > assertion failed (retries < PIDFILE_RETRIES): (5 < 5)
> > 
> > At this point I just disabled the test in meson. No need to make
> > everyone suffer.
> 
> Makes sense. Do you still want to pursue the workaround?
> 
> If so, can you share some details on how you're running the test?
> 
> I've gone through 1000+ iterations using the script I posted here:
> https://lore.kernel.org/qemu-devel/20210827165253.GA14291@raphael-debian-dev/
> without hitting a failure.

Hmm my box was busy... now that it's idle I can't repro the failure.



> >
> > 
> > > ---
> > >  tests/qtest/vhost-user-blk-test.c | 29 -
> > >  1 file changed, 28 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/tests/qtest/vhost-user-blk-test.c b/tests/qtest/vhost-user-blk-test.c
> > > index 6f108a1b62..c6626a286b 100644
> > > --- a/tests/qtest/vhost-user-blk-test.c
> > > +++ b/tests/qtest/vhost-user-blk-test.c
> > > @@ -24,6 +24,7 @@
> > >  #define TEST_IMAGE_SIZE (64 * 1024 * 1024)
> > >  #define QVIRTIO_BLK_TIMEOUT_US  (30 * 1000 * 1000)
> > >  #define PCI_SLOT_HP 0x06
> > > +#define PIDFILE_RETRIES 5
> > >  
> > >  typedef struct {
> > >  pid_t pid;
> > 
> > 
> > Don't like the arbitrary retries counter.
> > 
> > Let's warn maybe, but on a busy machine we might not complete this
> > in time ...
> 
> So you would like it to warn and keep trying forever? Or would you
> rather set a much more lenient deadline? (1 min? 5 min?)

I'm not entirely sure ... Maybe 1 min is enough.
But we want to check that daemon is alive.
And maybe print something about still waiting for it to come up
every X seconds?
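
A rough sketch of what such a wait could look like, in plain POSIX C rather than the test's actual code. The names wait_for_pidfile() and daemon_alive(), the return codes, and the report and poll intervals are all made up for illustration; only the "about 1 min" deadline, the liveness check, and the periodic message come from the discussion above.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical tunables; the thread only suggests "maybe 1 min". */
#define PIDFILE_REPORT_SEC 5    /* interval between progress messages */
#define PIDFILE_POLL_MSEC  100  /* sleep between two checks */

/*
 * True while the child has not changed state. waitpid() reaps the child
 * once it has exited, so the caller must not wait on it again after this
 * returns false.
 */
static bool daemon_alive(pid_t pid)
{
    int status;

    return waitpid(pid, &status, WNOHANG) == 0;
}

/*
 * Poll until 'path' exists.
 * Returns 0 once the file is there, -1 if the daemon is gone (or is not
 * our child), -2 if 'deadline_sec' seconds elapsed first.
 */
static int wait_for_pidfile(const char *path, pid_t daemon_pid,
                            int deadline_sec)
{
    const struct timespec delay = { 0, PIDFILE_POLL_MSEC * 1000000L };
    time_t start = time(NULL);
    time_t last_report = start;

    assert(path != NULL);

    for (;;) {
        time_t now;

        if (access(path, F_OK) == 0) {
            return 0;
        }
        if (!daemon_alive(daemon_pid)) {
            return -1;
        }
        now = time(NULL);
        if (now - start >= deadline_sec) {
            return -2;
        }
        if (now - last_report >= PIDFILE_REPORT_SEC) {
            fprintf(stderr, "still waiting for %s (%ld s)\n",
                    path, (long)(now - start));
            last_report = now;
        }
        nanosleep(&delay, NULL);
    }
}
```

A caller would pass the pid returned by fork() for the storage daemon and treat -1 and -2 as distinct test failures, so a dead daemon is reported right away instead of after the full deadline.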

> > 
> > 
> > > @@ -885,7 +886,8 @@ static void start_vhost_user_blk(GString *cmd_line, 
> > > int vus_instances,
> > >   int num_queues)
> > >  {
> > >  const char *vhost_user_blk_bin = qtest_qemu_storage_daemon_binary();
> > > -int i;
> > > +int i, retries;
> > > +char *daemon_pidfile_path;
> > >  gchar *img_path;
> > >  GString *storage_daemon_command = g_string_new(NULL);
> > >  QemuStorageDaemonState *qsd;
> > > @@ -898,6 +900,8 @@ static void start_vhost_user_blk(GString *cmd_line, 
> > > int vus_instances,
> > >  " -object memory-backend-memfd,id=mem,size=256M,share=on "
> > >  " -M memory-backend=mem -m 256M ");
> > >  
> > > +daemon_pidfile_path = g_strdup_printf("/tmp/daemon-%d", getpid());
> > > +
> > 
> > Ugh. Predictable paths directly in /tmp are problematic .. mktemp?
> > 
> 
> Ack
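
One possible shape for the non-predictable path, as a sketch only: create a private directory with POSIX mkdtemp() and put the pidfile inside it. The helper name make_pidfile_path(), the "qtest-daemon-" prefix and the "daemon.pid" file name are assumptions for illustration; inside QEMU a GLib helper such as g_dir_make_tmp() would probably be the more natural choice.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Create a fresh directory under /tmp (mode 0700, random suffix) and
 * build "<dir>/daemon.pid" in 'path'. The directory name is left in
 * 'dir' so the caller can remove it afterwards.
 * Returns 0 on success, -1 on failure.
 */
static int make_pidfile_path(char *dir, size_t dir_len,
                             char *path, size_t path_len)
{
    int n;

    assert(dir != NULL && path != NULL);

    n = snprintf(dir, dir_len, "/tmp/qtest-daemon-XXXXXX");
    if (n < 0 || (size_t)n >= dir_len) {
        return -1;
    }
    /* mkdtemp() replaces the XXXXXX suffix and creates the directory. */
    if (mkdtemp(dir) == NULL) {
        return -1;
    }
    n = snprintf(path, path_len, "%s/daemon.pid", dir);
    if (n < 0 || (size_t)n >= path_len) {
        rmdir(dir);
        return -1;
    }
    return 0;
}
```

Because the directory is created atomically and is only accessible to the owner, another local user cannot pre-create or swap the pidfile, which is the problem with a /tmp/daemon-<pid> name.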
> 
> > >  for (i = 0; i < vus_instances; i++) {
> > >  int fd;
> > >  char *sock_path = create_listen_socket(&fd);
> > > @@ -914

Re: [PATCH v4 1/2] sev/i386: Introduce sev_add_kernel_loader_hashes for measured linux boot

2021-10-18 Thread Dov Murik



On 18/10/2021 21:02, Tom Lendacky wrote:
> On 9/30/21 12:49 AM, Dov Murik wrote:
> 
> ...
> 
>> +/*
>> + * Add the hashes of the linux kernel/initrd/cmdline to an encrypted
>> guest page
>> + * which is included in SEV's initial memory measurement.
>> + */
>> +bool sev_add_kernel_loader_hashes(SevKernelLoaderContext *ctx, Error
>> **errp)
>> +{
>> +    uint8_t *data;
>> +    SevHashTableDescriptor *area;
>> +    SevHashTable *ht;
>> +    uint8_t cmdline_hash[HASH_SIZE];
>> +    uint8_t initrd_hash[HASH_SIZE];
>> +    uint8_t kernel_hash[HASH_SIZE];
>> +    uint8_t *hashp;
>> +    size_t hash_len = HASH_SIZE;
>> +    int aligned_len;
>> +
>> +    if (!pc_system_ovmf_table_find(SEV_HASH_TABLE_RV_GUID, &data,
>> NULL)) {
>> +    error_setg(errp, "SEV: kernel specified but OVMF has no hash
>> table guid");
>> +    return false;
>> +    }
> 
> This breaks backwards compatibility with an older OVMF image. Any older
> OVMF image with SEV support that doesn't have the hash table GUID will
> now fail to boot using -kernel/-initrd/-append, where it used to be able
> to boot before.
> 


Thanks Tom for noticing this.

Just so we're on the same page: this patch is already merged.


We're dealing with a scenario of launching a guest with SEV enabled and
with -kernel.  The behaviours are:


A. With current QEMU:

A1. New AmdSev OVMF build: OVMF will verify the hashes and boot correctly.
A2. New Generic OvmfPkgX64 build: No verification but will boot correctly.

A3. Old AmdSev OVMF build: QEMU aborts the launch because there's no
hash table GUID.
A4. Old Generic OvmfPkgX64 build: QEMU aborts the launch because there's
no hash table GUID.


B. With older QEMU (before this patch was merged):

B1. New AmdSev OVMF build: OVMF will try to verify the hashes but they
are not populated; boot aborted.
B2. New Generic OvmfPkgX64 build: No verification but will boot correctly.

B3. Old AmdSev OVMF build: OVMF aborts the launch because -kernel is not
supported at all.
B4. Old Generic OvmfPkgX64 build: No verification but will boot correctly.


So the problem you are raising is scenario A4 (as opposed to previous
behaviour B4).



> Is that anything we need to be concerned about?
> 

Possible solutions:

1. Do nothing. For users that encounter this: tell them to upgrade OVMF.
2. Modify the code: remove the line: error_setg(errp, "SEV: kernel
specified but OVMF has no hash table guid")

I think that option 2 will not degrade security *if* the Guest Owner
verifies the measurement (which is mandatory anyway; otherwise the
untrusted host can replace OVMF with a "malicious" version that doesn't
verify the hashes). Skipping silently might make debugging a bit harder.
Maybe we can print a warning and return, and then the guest launch will
continue?
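
To make the "warn and continue" variant concrete, here is a stand-alone sketch of the control flow. It is not the QEMU function: the bool parameter stands in for the pc_system_ovmf_table_find() lookup, the SevHashesResult type is invented, and fprintf() stands in for whatever warning helper QEMU would use (probably warn_report()).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical result type, for illustration only. */
typedef enum {
    SEV_HASHES_ADDED,    /* table found, hashes would be written */
    SEV_HASHES_SKIPPED,  /* old firmware: warn, let the launch continue */
} SevHashesResult;

/*
 * 'has_hash_table' is a stand-in for the OVMF hash table GUID lookup.
 * A missing table is no longer an error; it is reported and skipped.
 */
static SevHashesResult add_kernel_loader_hashes(bool has_hash_table)
{
    if (!has_hash_table) {
        fprintf(stderr, "warning: SEV: kernel specified but OVMF has no "
                        "hash table guid; hashes not added\n");
        return SEV_HASHES_SKIPPED;
    }

    /* ... compute the kernel/initrd/cmdline hashes and fill the table ... */
    return SEV_HASHES_ADDED;
}
```

With this shape scenario A4 boots again as in B4, and the warning leaves a trace for anyone debugging why the measurement does not cover the kernel hashes.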

Other ideas?


-Dov




Re: [PATCH v3 1/2] vhost-user: fix VirtQ notifier cleanup

2021-10-18 Thread Michael S. Tsirkin
On Fri, Oct 08, 2021 at 03:58:04PM +0800, Xueming Li wrote:
> When a vhost-user device cleans up and unmaps the notifier address, a VM CPU
> thread that is writing to the notifier fails by accessing an invalid address.
> 
> To avoid this concurrency issue, wait for the memory flatview update by
> draining RCU callbacks, then unmap the notifiers.
> 
> Fixes: 44866521bd6e ("vhost-user: support registering external host notifiers")
> Cc: tiwei@intel.com
> Cc: qemu-sta...@nongnu.org
> Cc: Yuwei Zhang 
> Signed-off-by: Xueming Li 
> ---
>  hw/virtio/vhost-user.c | 21 ++---
>  1 file changed, 14 insertions(+), 7 deletions(-)
> 
> diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
> index bf6e50223c..b2e948bdc7 100644
> --- a/hw/virtio/vhost-user.c
> +++ b/hw/virtio/vhost-user.c
> @@ -1165,6 +1165,12 @@ static void vhost_user_host_notifier_remove(struct vhost_dev *dev,
>  
>  if (n->addr && n->set) {
>  virtio_queue_set_host_notifier_mr(vdev, queue_idx, &n->mr, false);
> +if (!qemu_in_vcpu_thread()) { /* Avoid vCPU dead lock. */
> +/* Wait for VM threads accessing old flatview which contains notifier. */
> +drain_call_rcu();
> +}
> +munmap(n->addr, qemu_real_host_page_size);
> +n->addr = NULL;
>  n->set = false;
>  }
>  }


../hw/virtio/vhost-user.c: In function ‘vhost_user_host_notifier_remove’:
../hw/virtio/vhost-user.c:1168:14: error: implicit declaration of function ‘qemu_in_vcpu_thread’ [-Werror=implicit-function-declaration]
 1168 | if (!qemu_in_vcpu_thread()) { /* Avoid vCPU dead lock. */
  |  ^~~
../hw/virtio/vhost-user.c:1168:14: error: nested extern declaration of ‘qemu_in_vcpu_thread’ [-Werror=nested-externs]
cc1: all warnings being treated as errors
ninja: build stopped: subcommand failed.
make[1]: *** [Makefile:162: run-ninja] Error 1
make[1]: Leaving directory '/scm/qemu/build'
make: *** [GNUmakefile:11: all] Error 2


Although the following patch fixes it, bisect is broken.


> @@ -1502,12 +1508,7 @@ static int vhost_user_slave_handle_vring_host_notifier(struct vhost_dev *dev,
>  
>  n = &user->notifier[queue_idx];
>  
> -if (n->addr) {
> -virtio_queue_set_host_notifier_mr(vdev, queue_idx, &n->mr, false);
> -object_unparent(OBJECT(&n->mr));
> -munmap(n->addr, page_size);
> -n->addr = NULL;
> -}
> +vhost_user_host_notifier_remove(dev, queue_idx);
>  
>  if (area->u64 & VHOST_USER_VRING_NOFD_MASK) {
>  return 0;
> @@ -2485,11 +2486,17 @@ void vhost_user_cleanup(VhostUserState *user)
>  for (i = 0; i < VIRTIO_QUEUE_MAX; i++) {
>  if (user->notifier[i].addr) {
>  object_unparent(OBJECT(&user->notifier[i].mr));
> +}
> +}
> +memory_region_transaction_commit();
> +/* Wait for VM threads accessing old flatview which contains notifier. */
> +drain_call_rcu();
> +for (i = 0; i < VIRTIO_QUEUE_MAX; i++) {
> +if (user->notifier[i].addr) {
>  munmap(user->notifier[i].addr, qemu_real_host_page_size);
>  user->notifier[i].addr = NULL;
>  }
>  }
> -memory_region_transaction_commit();
>  user->chr = NULL;
>  }
>  
> -- 
> 2.33.0




[PATCH v2 1/2] rcu: Introduce force_rcu notifier

2021-10-18 Thread Greg Kurz
The drain_call_rcu() function can be blocked as long as an RCU reader
stays in a read-side critical section. This is typically what happens
when a TCG vCPU is executing a busy loop. It can deadlock the QEMU
monitor as reported in https://gitlab.com/qemu-project/qemu/-/issues/650 .

This can be avoided by allowing drain_call_rcu() to enforce an RCU grace
period. Since each reader might need to do specific actions to end a
read-side critical section, do it with notifiers.

Prepare ground for this by adding a notifier list to the RCU reader
struct and use it in wait_for_readers() if drain_call_rcu() is in
progress. An API is added for readers to register their notifiers.

This is largely based on a draft from Paolo Bonzini.

Suggested-by: Paolo Bonzini 
Signed-off-by: Greg Kurz 
---
 include/qemu/rcu.h | 16 
 util/rcu.c | 22 +-
 2 files changed, 37 insertions(+), 1 deletion(-)

diff --git a/include/qemu/rcu.h b/include/qemu/rcu.h
index 515d327cf11c..d8c4fd8686b4 100644
--- a/include/qemu/rcu.h
+++ b/include/qemu/rcu.h
@@ -27,6 +27,7 @@
 #include "qemu/thread.h"
 #include "qemu/queue.h"
 #include "qemu/atomic.h"
+#include "qemu/notify.h"
 #include "qemu/sys_membarrier.h"
 
 #ifdef __cplusplus
@@ -66,6 +67,14 @@ struct rcu_reader_data {
 
 /* Data used for registry, protected by rcu_registry_lock */
 QLIST_ENTRY(rcu_reader_data) node;
+
+/*
+ * NotifierList used to force an RCU grace period.  Accessed under
+ * rcu_registry_lock.  Note that the notifier is called _outside_
+ * the thread!
+ */
+NotifierList force_rcu;
+void *force_rcu_data;
 };
 
 extern __thread struct rcu_reader_data rcu_reader;
@@ -180,6 +189,13 @@ G_DEFINE_AUTOPTR_CLEANUP_FUNC(RCUReadAuto, rcu_read_auto_unlock)
 #define RCU_READ_LOCK_GUARD() \
 g_autoptr(RCUReadAuto) _rcu_read_auto __attribute__((unused)) = rcu_read_auto_lock()
 
+/*
+ * Force-RCU notifiers tell readers that they should exit their
+ * read-side critical section.
+ */
+void rcu_add_force_rcu_notifier(Notifier *n, void *data);
+void rcu_remove_force_rcu_notifier(Notifier *n);
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/util/rcu.c b/util/rcu.c
index 13ac0f75cb2a..0c68f068e23d 100644
--- a/util/rcu.c
+++ b/util/rcu.c
@@ -46,6 +46,7 @@
 unsigned long rcu_gp_ctr = RCU_GP_LOCKED;
 
 QemuEvent rcu_gp_event;
+static int in_drain_call_rcu;
 static QemuMutex rcu_registry_lock;
 static QemuMutex rcu_sync_lock;
 
@@ -107,6 +108,8 @@ static void wait_for_readers(void)
  * get some extra futex wakeups.
  */
 qatomic_set(&index->waiting, false);
+} else if (qatomic_read(&in_drain_call_rcu)) {
+notifier_list_notify(&index->force_rcu, index->force_rcu_data);
 }
 }
 
@@ -293,7 +296,6 @@ void call_rcu1(struct rcu_head *node, void (*func)(struct rcu_head *node))
 qemu_event_set(&rcu_call_ready_event);
 }
 
-
 struct rcu_drain {
 struct rcu_head rcu;
 QemuEvent drain_complete_event;
@@ -339,8 +341,10 @@ void drain_call_rcu(void)
  * assumed.
  */
 
+qatomic_inc(&in_drain_call_rcu);
 call_rcu1(&rcu_drain.rcu, drain_rcu_callback);
 qemu_event_wait(&rcu_drain.drain_complete_event);
+qatomic_dec(&in_drain_call_rcu);
 
 if (locked) {
 qemu_mutex_lock_iothread();
@@ -363,6 +367,22 @@ void rcu_unregister_thread(void)
 qemu_mutex_unlock(&rcu_registry_lock);
 }
 
+void rcu_add_force_rcu_notifier(Notifier *n, void *data)
+{
+qemu_mutex_lock(&rcu_registry_lock);
+notifier_list_add(&rcu_reader.force_rcu, n);
+rcu_reader.force_rcu_data = data;
+qemu_mutex_unlock(&rcu_registry_lock);
+}
+
+void rcu_remove_force_rcu_notifier(Notifier *n)
+{
+qemu_mutex_lock(&rcu_registry_lock);
+rcu_reader.force_rcu_data = NULL;
+notifier_remove(n);
+qemu_mutex_unlock(&rcu_registry_lock);
+}
+
 static void rcu_init_complete(void)
 {
 QemuThread thread;
-- 
2.31.1




[PATCH v2 2/2] accel/tcg: Register a force_rcu notifier

2021-10-18 Thread Greg Kurz
A TCG vCPU doing a busy loop systematically hangs the QEMU monitor
if the user passes 'device_add' without argument. This is because
drain_call_rcu(), which is called from qmp_device_add(), cannot return
if readers don't exit read-side critical sections. That is typically
what busy-looping TCG vCPUs do, both in MTTCG and RR modes:

int cpu_exec(CPUState *cpu)
{
    [...]
    rcu_read_lock();
    [...]
    while (!cpu_handle_exception(cpu, &ret)) {
        // Busy loop keeps vCPU here
    }
    [...]
    rcu_read_unlock();

    return ret;
}

Have all vCPUs register a force_rcu notifier that will kick them out
of the loop using async_run_on_cpu(). The notifier is called with the
rcu_registry_lock mutex held; using async_run_on_cpu() ensures there
are no deadlocks.

The notifier implementation is shared by MTTCG and RR since both are
affected.

Suggested-by: Paolo Bonzini 
Fixes: 7bed89958bfb ("device_core: use drain_call_rcu in in qmp_device_add")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/650
Signed-off-by: Greg Kurz 
---
 accel/tcg/tcg-accel-ops-mttcg.c |  3 +++
 accel/tcg/tcg-accel-ops-rr.c|  3 +++
 accel/tcg/tcg-accel-ops.c   | 15 +++
 accel/tcg/tcg-accel-ops.h   |  2 ++
 4 files changed, 23 insertions(+)

diff --git a/accel/tcg/tcg-accel-ops-mttcg.c b/accel/tcg/tcg-accel-ops-mttcg.c
index 847d2079d21f..ea4a3217ce3f 100644
--- a/accel/tcg/tcg-accel-ops-mttcg.c
+++ b/accel/tcg/tcg-accel-ops-mttcg.c
@@ -44,11 +44,13 @@
 static void *mttcg_cpu_thread_fn(void *arg)
 {
 CPUState *cpu = arg;
+Notifier force_rcu = { .notify = tcg_cpus_force_rcu };
 
 assert(tcg_enabled());
 g_assert(!icount_enabled());
 
 rcu_register_thread();
+rcu_add_force_rcu_notifier(&force_rcu, cpu);
 tcg_register_thread();
 
 qemu_mutex_lock_iothread();
@@ -100,6 +102,7 @@ static void *mttcg_cpu_thread_fn(void *arg)
 
 tcg_cpus_destroy(cpu);
 qemu_mutex_unlock_iothread();
+rcu_remove_force_rcu_notifier(&force_rcu);
 rcu_unregister_thread();
 return NULL;
 }
diff --git a/accel/tcg/tcg-accel-ops-rr.c b/accel/tcg/tcg-accel-ops-rr.c
index a5fd26190e20..fc0c4905caf5 100644
--- a/accel/tcg/tcg-accel-ops-rr.c
+++ b/accel/tcg/tcg-accel-ops-rr.c
@@ -144,9 +144,11 @@ static void rr_deal_with_unplugged_cpus(void)
 static void *rr_cpu_thread_fn(void *arg)
 {
 CPUState *cpu = arg;
+Notifier force_rcu = { .notify = tcg_cpus_force_rcu };
 
 assert(tcg_enabled());
 rcu_register_thread();
+rcu_add_force_rcu_notifier(&force_rcu, cpu);
 tcg_register_thread();
 
 qemu_mutex_lock_iothread();
@@ -255,6 +257,7 @@ static void *rr_cpu_thread_fn(void *arg)
 rr_deal_with_unplugged_cpus();
 }
 
+rcu_remove_force_rcu_notifier(&force_rcu);
 rcu_unregister_thread();
 return NULL;
 }
diff --git a/accel/tcg/tcg-accel-ops.c b/accel/tcg/tcg-accel-ops.c
index 1a8e8390bd60..7f0d2b06044a 100644
--- a/accel/tcg/tcg-accel-ops.c
+++ b/accel/tcg/tcg-accel-ops.c
@@ -91,6 +91,21 @@ void tcg_handle_interrupt(CPUState *cpu, int mask)
 }
 }
 
+static void do_nothing(CPUState *cpu, run_on_cpu_data d)
+{
+}
+
+void tcg_cpus_force_rcu(Notifier *notify, void *data)
+{
+CPUState *cpu = data;
+
+/*
+ * Called with rcu_registry_lock held, using async_run_on_cpu() ensures
+ * that there are no deadlocks.
+ */
+async_run_on_cpu(cpu, do_nothing, RUN_ON_CPU_NULL);
+}
+
 static void tcg_accel_ops_init(AccelOpsClass *ops)
 {
 if (qemu_tcg_mttcg_enabled()) {
diff --git a/accel/tcg/tcg-accel-ops.h b/accel/tcg/tcg-accel-ops.h
index 6a5fcef88980..8742041c8aea 100644
--- a/accel/tcg/tcg-accel-ops.h
+++ b/accel/tcg/tcg-accel-ops.h
@@ -18,5 +18,7 @@ void tcg_cpus_destroy(CPUState *cpu);
 int tcg_cpus_exec(CPUState *cpu);
 void tcg_handle_interrupt(CPUState *cpu, int mask);
 void tcg_cpu_init_cflags(CPUState *cpu, bool parallel);
+/* Common force_rcu notifier for MTTCG and RR */
+void tcg_cpus_force_rcu(Notifier *notify, void *data);
 
 #endif /* TCG_CPUS_H */
-- 
2.31.1




[PATCH v2 0/2] accel/tcg: Fix monitor deadlock

2021-10-18 Thread Greg Kurz
Commit 7bed89958bfb ("device_core: use drain_call_rcu in in qmp_device_add")
introduced a regression in QEMU 6.0: passing device_add without argument
hangs the monitor. This was reported against qemu-system-mips64 with TCG,
but I could consistently reproduce it with other targets (x86 and ppc64).

See https://gitlab.com/qemu-project/qemu/-/issues/650 for details.

The problem is that an emulated busy-looping vCPU can stay forever in
its RCU read-side critical section and prevent drain_call_rcu() from
returning. This series fixes the issue by letting RCU kick vCPUs out of
the read-side critical section when drain_call_rcu() is in progress.
This is achieved through notifiers, as suggested by Paolo Bonzini.
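
For readers who want the idea without the QEMU context, here is a single-threaded toy model of the mechanism. It is only an illustration under simplifying assumptions: struct reader, kick_reader(), reader_step() and wait_for_reader() are invented names, there is one callback instead of a NotifierList, no locking, and the reader's progress is simulated by calling reader_step() from the waiting loop rather than by a separate thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef void (*force_fn)(void *opaque);

struct reader {
    bool in_critical_section;
    bool kicked;         /* set by the notifier, polled by the reader */
    force_fn force;      /* NULL if the reader registered no notifier */
    void *force_opaque;
};

/* Notifier: only asks the reader to leave, it does not unlock for it. */
static void kick_reader(void *opaque)
{
    struct reader *r = opaque;

    r->kicked = true;
}

/* One iteration of the reader's busy loop: leave only when kicked. */
static void reader_step(struct reader *r)
{
    if (r->in_critical_section && r->kicked) {
        r->in_critical_section = false;
        r->kicked = false;
    }
}

/*
 * Wait for the reader to leave its critical section. The notifier is
 * only called while a drain is in progress. Returns the number of polls
 * that were needed, or -1 if the reader is still inside after max_polls.
 */
static int wait_for_reader(struct reader *r, bool draining, int max_polls)
{
    for (int polls = 0; polls < max_polls; polls++) {
        if (!r->in_critical_section) {
            return polls;
        }
        if (draining && r->force != NULL) {
            r->force(r->force_opaque);
        }
        reader_step(r);  /* stands in for the reader thread running */
    }
    return r->in_critical_section ? -1 : max_polls;
}
```

The model shows the two properties the series relies on: without a drain in progress a busy reader is never disturbed, and during a drain the waiter does not end the critical section itself but asks the reader to do so in its own context.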

v2:
- moved notifier list to RCU reader data
- separate API for notifier registration
- CPUState passed as an opaque pointer

Greg Kurz (2):
  rcu: Introduce force_rcu notifier
  accel/tcg: Register a force_rcu notifier

 accel/tcg/tcg-accel-ops-mttcg.c |  3 +++
 accel/tcg/tcg-accel-ops-rr.c|  3 +++
 accel/tcg/tcg-accel-ops.c   | 15 +++
 accel/tcg/tcg-accel-ops.h   |  2 ++
 include/qemu/rcu.h  | 16 
 util/rcu.c  | 22 +-
 6 files changed, 60 insertions(+), 1 deletion(-)

-- 
2.31.1





Re: [PATCH v4 2/2] monitor: refactor set/expire_password and allow VNC display id

2021-10-18 Thread Markus Armbruster
Stefan Reiter  writes:

> On 10/14/21 9:14 AM, Markus Armbruster wrote:
>> Stefan Reiter  writes:
>> 
>>> On 10/12/21 11:24 AM, Markus Armbruster wrote:

[...]

 I'd split this patch into three parts: item 1., 2.+3., 4.-6., because
 each part can stand on its own.
>>>
>>> The reason why I didn't do that initially is more to do with the C side.
>>> I think splitting it up as you describe would make for some awkward diffs
>>> on the qmp_set_password/hmp_set_password side.
>>>
>>> I can try of course.
>> 
>> It's a suggestion, not a demand.  I'm a compulsory patch splitter.
>
> I'll just have a go and see what falls out. I do agree that this patch is a
> bit long on its own.

Thanks!

>>>   Though I also want to have it said that I'm not a fan
>>> of how the HMP functions have to expand so much to accommodate the QAPI
>>> structure in general.
>> 
>> Care to elaborate?
>
> Well, before this patch 'hmp_set_password' was for all intents and purposes a
> single function call to 'qmp_set_password'. Now it has a bunch of parameter
> parsing and even validation (e.g. enum parsing).

Yes, HMP requires us to do more work manually than QMP does.  Raising
HMP's level of automation to QMP's would be nice, but it would also be a
big project.

>  That's why I asked in the
> v3 patch (I think?) if there was a way to call the QAPI style parsing from
> there, since the parameters are all named the same and in a qdict already..
>
> Something like:
>
>void hmp_set_password(Monitor *mon, const QDict *qdict)
>{
>  ExpirePasswordOptions opts = qapi_magical_parse_fn(qdict);
>  qmp_set_password(&opts, &err);
>  [error handling]
>}

Same structure as qmp_marshal_set_password(), where the "magical parse"
part uses a visitor function generated from the QAPI schema for its
argument type.

HMP works differently.  There, we only have .args_type in
hmp-commands.hx.  Since this is much less expressive than the QAPI
schema, generic code can do much less work for us.  Which means we get
to write more code by hand.

By converting QMP from 'str' to enum, we lift parsing from the
qmp_set_password() to its callers.  qmp_marshal_set_password() does it
for free.  hmp_set_password() needs handwritten code.  It wouldn't with
a QAPI-schema-based HMP, but as I said, that's a big project.

> That being said, I don't mind the current form enough to make this a bigger
> discussion either, so if there isn't an easy way to do the above, let's just
> leave it like it is.

There is no easy way to do the above.
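
To make the "handwritten code" point concrete, a minimal sketch of the kind of string-to-enum parsing an HMP handler has to do by hand once the QMP argument is an enum. The enum, the lookup table and parse_connected_action() are hypothetical stand-ins, not the QAPI-generated types or helpers.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical enum standing in for a QAPI-generated type. */
typedef enum {
    CONNECTED_ACTION_KEEP,
    CONNECTED_ACTION_FAIL,
    CONNECTED_ACTION_DISCONNECT,
    CONNECTED_ACTION__MAX
} ConnectedAction;

/* Hypothetical name table, indexed by enum value. */
static const char *const connected_action_str[CONNECTED_ACTION__MAX] = {
    "keep", "fail", "disconnect"
};

/*
 * Parse a command-line word into the enum.
 * Returns 0 and stores the value on success, -1 for NULL or unknown input.
 */
static int parse_connected_action(const char *s, ConnectedAction *out)
{
    if (!s) {
        return -1;
    }
    for (int i = 0; i < CONNECTED_ACTION__MAX; i++) {
        if (strcmp(s, connected_action_str[i]) == 0) {
            *out = (ConnectedAction)i;
            return 0;
        }
    }
    return -1;
}
```

On the QMP side the generated marshaller does this step, so only the HMP handler needs something like the above before calling into qmp_set_password().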




Re: [PATCH 0/6] RfC: try improve native hotplug for pcie root ports

2021-10-18 Thread Michael S. Tsirkin
On Tue, Oct 19, 2021 at 07:21:44AM +0200, Gerd Hoffmann wrote:
> On Mon, Oct 18, 2021 at 11:36:45AM -0400, Michael S. Tsirkin wrote:
> > On Mon, Oct 11, 2021 at 02:04:58PM +0200, Gerd Hoffmann wrote:
> > > 
> > > 
> > > Gerd Hoffmann (6):
> > >   pci: implement power state
> > >   pcie: implement slow power control for pcie root ports
> > >   pcie: add power indicator blink check
> > >   pcie: factor out pcie_cap_slot_unplug()
> > >   pcie: fast unplug when slot power is off
> > >   pcie: expire pending delete
> > 
> > So what's left to do here?
> > I'm guessing more testing?
> 
> Yes.  Maybe ask rh qe to run the patch set through their hotplug test
> suite (to avoid an acpi-hotplug style disaster where qe found various
> issues after release)?

I'll poke around to see if they can help us... we'll need
a backport for that though.

> > I would also like to see a shorter timeout, maybe 100ms, so
> > that we are more responsive to guest changes in resending request.
> 
> I don't think it is a good idea to go for a shorter timeout given that
> the 5 seconds are in the specs and we want to avoid a resent request being
> interpreted as cancel.
> It also wouldn't change anything at least for linux guests because linux
> is waiting those 5 seconds (with power indicator in blinking state).
> Only the reason for refusing 'device_del' changes from "5 secs not over
> yet" to "guest is busy processing the hotplug request".

First 5 seconds yes. But the retries afterwards?

> 
> We could consider tackling the 5sec timeout on the guest side, i.e.
> have linux skip the 5sec wait in case the root port is virtual (should
> be easy to figure out by checking the pci id).
> 
> take care,
>   Gerd

Yes ... do we want to control how long it blinks from hypervisor side?
And if we do then what? Shorten retry period?

-- 
MST




Re: [PATCH 0/6] RfC: try improve native hotplug for pcie root ports

2021-10-18 Thread Gerd Hoffmann
On Mon, Oct 18, 2021 at 11:36:45AM -0400, Michael S. Tsirkin wrote:
> On Mon, Oct 11, 2021 at 02:04:58PM +0200, Gerd Hoffmann wrote:
> > 
> > 
> > Gerd Hoffmann (6):
> >   pci: implement power state
> >   pcie: implement slow power control for pcie root ports
> >   pcie: add power indicator blink check
> >   pcie: factor out pcie_cap_slot_unplug()
> >   pcie: fast unplug when slot power is off
> >   pcie: expire pending delete
> 
> So what's left to do here?
> I'm guessing more testing?

Yes.  Maybe ask rh qe to run the patch set through their hotplug test
suite (to avoid an acpi-hotplug style disaster where qe found various
issues after release)?

> I would also like to see a shorter timeout, maybe 100ms, so
> that we are more responsive to guest changes in resending request.

I don't think it is a good idea to go for a shorter timeout given that
the 5 seconds are in the specs and we want to avoid a resent request being
interpreted as cancel.

It also wouldn't change anything at least for linux guests because linux
is waiting those 5 seconds (with power indicator in blinking state).
Only the reason for refusing 'device_del' changes from "5 secs not over
yet" to "guest is busy processing the hotplug request".

We could consider tackling the 5sec timeout on the guest side, i.e.
have linux skip the 5sec wait in case the root port is virtual (should
be easy to figure out by checking the pci id).

take care,
  Gerd
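
As a rough illustration of the timing argument above, a small standalone model of when an unplug request may be (re)sent. SlotModel, unplug_decide() and the decision names are hypothetical; only the 5 second window comes from the discussion, and both refusal reasons are modelled side by side purely to show the difference.

```c
#include <stdbool.h>
#include <stdint.h>

#define ATTENTION_WAIT_MS 5000   /* the 5 second window mentioned above */

/* Hypothetical model of the slot state the hypervisor tracks. */
typedef struct {
    bool pending;          /* an unplug request was already sent to the guest */
    int64_t sent_ms;       /* timestamp of that request */
    bool guest_blinking;   /* guest has the power indicator blinking */
} SlotModel;

typedef enum {
    UNPLUG_SEND,           /* safe to send (or resend) the request */
    UNPLUG_REFUSE_WAIT,    /* "5 secs not over yet" */
    UNPLUG_REFUSE_BUSY     /* "guest is busy processing the hotplug request" */
} UnplugDecision;

static UnplugDecision unplug_decide(const SlotModel *s, int64_t now_ms)
{
    if (s->guest_blinking) {
        /* A resend now could be taken as a cancel, so refuse. */
        return UNPLUG_REFUSE_BUSY;
    }
    if (s->pending && now_ms - s->sent_ms < ATTENTION_WAIT_MS) {
        return UNPLUG_REFUSE_WAIT;
    }
    return UNPLUG_SEND;
}
```

In this model a shorter hypervisor-side timeout would not help a guest that is still blinking, which is the point made about linux guests.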




Re: [PATCH] Hexagon (target/hexagon) put writes to USR into temp until commit

2021-10-18 Thread Philippe Mathieu-Daudé
On 10/12/21 11:31, Taylor Simpson wrote:
> Change SET_USR_FIELD to write to hex_new_value[HEX_REG_USR] instead
> of hex_gpr[HEX_REG_USR].
> 
> Then, we need code to mark the instructions that can implicitly
> set USR
> - Macros added to hex_common.py
> - A_FPOP added in translate.c
> 
> Signed-off-by: Taylor Simpson 
> ---
>  target/hexagon/macros.h  | 2 +-
>  target/hexagon/attribs_def.h.inc | 1 +
>  target/hexagon/translate.c   | 9 -
>  target/hexagon/hex_common.py | 2 ++
>  4 files changed, 12 insertions(+), 2 deletions(-)

This is a stale v1. git-publish helps to avoid such workflow
mistakes ;) https://github.com/stefanha/git-publish



Re: [PATCH 1/2] Hexagon (target/hexagon) more tcg_constant_*

2021-10-18 Thread Philippe Mathieu-Daudé
On 10/12/21 11:31, Taylor Simpson wrote:
> Change additional tcg_const_tl to tcg_constant_tl
> 
> Note that gen_pred_cancel had slot_mask initialized with tcg_const_tl.
> However, it is not constant throughout, so we initialize it with
> tcg_temp_new and replace the first use with the constant value.
> 
> Inspired-by: Richard Henderson 
> Inspired-by: Philippe Mathieu-Daud 

UTF-8 copy/paste mojibake, otherwise:
Reviewed-by: Philippe Mathieu-Daudé 

> Signed-off-by: Taylor Simpson 
> ---
>  target/hexagon/gen_tcg.h|  9 +++--
>  target/hexagon/macros.h |  7 +++
>  target/hexagon/translate.c  |  3 +--
>  target/hexagon/gen_tcg_funcs.py | 11 ++-
>  4 files changed, 9 insertions(+), 21 deletions(-)



Re: [PATCH] block/file-posix: Fix return value translation for AIO discards.

2021-10-18 Thread Philippe Mathieu-Daudé
+Stefan

On 10/18/21 20:07, Ari Sundholm wrote:
> AIO discards regressed as a result of the following commit:
>   0dfc7af2 block/file-posix: Optimize for macOS
> 
> When trying to run blkdiscard within a Linux guest, the request would
> fail, with some errors in dmesg:
> 
>  [ snip ] 
> [4.010070] sd 2:0:0:0: [sda] tag#0 FAILED Result: hostbyte=DID_OK
> driverbyte=DRIVER_SENSE
> [4.011061] sd 2:0:0:0: [sda] tag#0 Sense Key : Aborted Command
> [current]
> [4.011061] sd 2:0:0:0: [sda] tag#0 Add. Sense: I/O process
> terminated
> [4.011061] sd 2:0:0:0: [sda] tag#0 CDB: Unmap/Read sub-channel 42
> 00 00 00 00 00 00 00 18 00
> [4.011061] blk_update_request: I/O error, dev sda, sector 0
>  [ snip ] 
> 
> This turns out to be a result of a flaw in changes to the error value
> translation logic in handle_aiocb_discard(). The default return value
> may be left untranslated in some configurations, and the wrong variable
> is used in one translation.
> 
> Fix both issues.

Worth including:

Cc: qemu-sta...@nongnu.org
Fixes: 0dfc7af2b28 ("block/file-posix: Optimize for macOS")

> Signed-off-by: Ari Sundholm 
> Signed-off-by: Emil Karlson 
> ---
>  block/file-posix.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/block/file-posix.c b/block/file-posix.c
> index 53be0bdc1b..6def2a4cba 100644
> --- a/block/file-posix.c
> +++ b/block/file-posix.c
> @@ -1807,7 +1807,7 @@ static int handle_aiocb_copy_range(void *opaque)
>  static int handle_aiocb_discard(void *opaque)
>  {
>  RawPosixAIOData *aiocb = opaque;
> -int ret = -EOPNOTSUPP;
> +int ret = -ENOTSUP;
>  BDRVRawState *s = aiocb->bs->opaque;
>  
>  if (!s->has_discard) {
> @@ -1829,7 +1829,7 @@ static int handle_aiocb_discard(void *opaque)
>  #ifdef CONFIG_FALLOCATE_PUNCH_HOLE
>  ret = do_fallocate(s->fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
> aiocb->aio_offset, aiocb->aio_nbytes);
> -ret = translate_err(-errno);
> +ret = translate_err(ret);
>  #elif defined(__APPLE__) && (__MACH__)
>  fpunchhole_t fpunchhole;
>  fpunchhole.fp_flags = 0;
> 
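
To spell out the second bug for readers, a minimal standalone sketch. It assumes the fallocate helper returns 0 on success or a negative errno value on failure, which is what the fix implies; translate_err() is modelled on the function named in the patch, while fake_punch_hole(), discard_buggy() and discard_fixed() are hypothetical stand-ins.

```c
#include <errno.h>

/* Fold the various "not supported" errors into -ENOTSUP. */
static int translate_err(int err)
{
    if (err == -ENODEV || err == -ENOSYS || err == -EOPNOTSUPP ||
        err == -ENOTTY) {
        err = -ENOTSUP;
    }
    return err;
}

/* Hypothetical helper: returns 0 on success, -errno on failure. */
static int fake_punch_hole(int fail_with)
{
    if (fail_with) {
        errno = fail_with;
        return -fail_with;
    }
    return 0;   /* success: errno is not touched and may be stale */
}

/* Pre-fix behaviour: ignores the return value and trusts errno. */
static int discard_buggy(int fail_with)
{
    int ret = fake_punch_hole(fail_with);

    (void)ret;
    return translate_err(-errno);
}

/* Post-fix behaviour: translates the helper's own return value. */
static int discard_fixed(int fail_with)
{
    int ret = fake_punch_hole(fail_with);

    return translate_err(ret);
}
```

With a stale errno left over from an earlier call, discard_buggy() reports a failure even though the hole punch succeeded, which would match the I/O errors seen in the guest.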




Re: [PATCH] rebuild-expected-aml.sh: allow partial target list

2021-10-18 Thread Ani Sinha



On Mon, 18 Oct 2021, Michael S. Tsirkin wrote:

> Only rebuild AML for configured targets.
>
> Signed-off-by: Michael S. Tsirkin 

Reviewed-by: Ani Sinha 

> ---
>  tests/data/acpi/rebuild-expected-aml.sh | 22 +-
>  1 file changed, 21 insertions(+), 1 deletion(-)
>
> diff --git a/tests/data/acpi/rebuild-expected-aml.sh 
> b/tests/data/acpi/rebuild-expected-aml.sh
> index fc78770544..dcf2e2f221 100755
> --- a/tests/data/acpi/rebuild-expected-aml.sh
> +++ b/tests/data/acpi/rebuild-expected-aml.sh
> @@ -12,7 +12,7 @@
>  # This work is licensed under the terms of the GNU GPLv2.
>  # See the COPYING.LIB file in the top-level directory.
>
> -qemu_bins="./qemu-system-x86_64 ./qemu-system-aarch64"
> +qemu_arches="x86_64 aarch64"
>
>  if [ ! -e "tests/qtest/bios-tables-test" ]; then
>  echo "Test: bios-tables-test is required! Run make check before this 
> script."
> @@ -20,6 +20,26 @@ if [ ! -e "tests/qtest/bios-tables-test" ]; then
>  exit 1;
>  fi
>
> +if grep TARGET_DIRS= config-host.mak; then
> +for arch in $qemu_arches; do
> +if  grep TARGET_DIRS= config-host.mak | grep "$arch"-softmmu;
> +then
> +qemu_bins="$qemu_bins ./qemu-system-$arch"
> +fi
> +done
> +else
> +echo "config-host.mak missing!"
> +echo "Run this script from the build directory."
> +exit 1;
> +fi
> +
> +if [ -z "$qemu_bins" ]; then
> +echo "Only the following architectures are currently supported: 
> $qemu_arches"
> +echo "None of these configured!"
> +echo "To fix, run configure --target-list=x86_64-softmmu,aarch64-softmmu"
> +exit 1;
> +fi
> +
>  for qemu in $qemu_bins; do
>  if [ ! -e $qemu ]; then
>  echo "Run 'make' to build the following QEMU executables: $qemu_bins"
> --
> MST
>
>



Re: [PATCH] block/file-posix: Fix return value translation for AIO discards.

2021-10-18 Thread Akihiko Odaki
Reviewed-by: Akihiko Odaki 

On Tue, Oct 19, 2021 at 3:08 AM Ari Sundholm  wrote:
>
> AIO discards regressed as a result of the following commit:
> 0dfc7af2 block/file-posix: Optimize for macOS
>
> When trying to run blkdiscard within a Linux guest, the request would
> fail, with some errors in dmesg:
>
>  [ snip ] 
> [4.010070] sd 2:0:0:0: [sda] tag#0 FAILED Result: hostbyte=DID_OK
> driverbyte=DRIVER_SENSE
> [4.011061] sd 2:0:0:0: [sda] tag#0 Sense Key : Aborted Command
> [current]
> [4.011061] sd 2:0:0:0: [sda] tag#0 Add. Sense: I/O process
> terminated
> [4.011061] sd 2:0:0:0: [sda] tag#0 CDB: Unmap/Read sub-channel 42
> 00 00 00 00 00 00 00 18 00
> [4.011061] blk_update_request: I/O error, dev sda, sector 0
>  [ snip ] 
>
> This turns out to be a result of a flaw in changes to the error value
> translation logic in handle_aiocb_discard(). The default return value
> may be left untranslated in some configurations, and the wrong variable
> is used in one translation.
>
> Fix both issues.
>
> Signed-off-by: Ari Sundholm 
> Signed-off-by: Emil Karlson 
> ---
>  block/file-posix.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/block/file-posix.c b/block/file-posix.c
> index 53be0bdc1b..6def2a4cba 100644
> --- a/block/file-posix.c
> +++ b/block/file-posix.c
> @@ -1807,7 +1807,7 @@ static int handle_aiocb_copy_range(void *opaque)
>  static int handle_aiocb_discard(void *opaque)
>  {
>  RawPosixAIOData *aiocb = opaque;
> -int ret = -EOPNOTSUPP;
> +int ret = -ENOTSUP;
>  BDRVRawState *s = aiocb->bs->opaque;
>
>  if (!s->has_discard) {
> @@ -1829,7 +1829,7 @@ static int handle_aiocb_discard(void *opaque)
>  #ifdef CONFIG_FALLOCATE_PUNCH_HOLE
>  ret = do_fallocate(s->fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
> aiocb->aio_offset, aiocb->aio_nbytes);
> -ret = translate_err(-errno);
> +ret = translate_err(ret);
>  #elif defined(__APPLE__) && (__MACH__)
>  fpunchhole_t fpunchhole;
>  fpunchhole.fp_flags = 0;
> --
> 2.31.1
>



Re: [PATCH v4 15/16] target/riscv: Use riscv_csrrw_debug for cpu_dump

2021-10-18 Thread Richard Henderson

On 10/18/21 5:01 PM, Richard Henderson wrote:

+result = riscv_csrrw_debug(env, dump_csrs[i].csrno, &val, 0, 0);
+assert(result == RISCV_EXCP_NONE);
+qemu_fprintf(f, " %-8s " TARGET_FMT_lx "\n",
+ dump_csrs[i].name, val);


Ho hum, this assert fires under testing.
I'll have another look tomorrow.

r~
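
One possible way to keep a table-driven dump from tripping over registers that are not readable in the current configuration is to skip failed reads instead of asserting. This is only a sketch under that assumption, not necessarily how the series resolves it; DumpCsr, CsrReadFn, dump_csrs() and demo_read() are hypothetical, and the CSR numbers are used for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical read hook: returns 0 on success, nonzero if the CSR is unavailable. */
typedef int (*CsrReadFn)(int csrno, uint64_t *val);

/* One row of the dump table. */
typedef struct {
    const char *name;
    int csrno;
} DumpCsr;

/* Print every readable CSR in the table; return how many were printed. */
static int dump_csrs(FILE *f, const DumpCsr *tab, size_t n, CsrReadFn read_csr)
{
    int printed = 0;

    for (size_t i = 0; i < n; i++) {
        uint64_t val = 0;

        if (read_csr(tab[i].csrno, &val) != 0) {
            continue;   /* not present here: skip rather than assert */
        }
        fprintf(f, " %-8s %016llx\n", tab[i].name, (unsigned long long)val);
        printed++;
    }
    return printed;
}

/* Demo reader: pretends CSR 0x600 (hstatus) is absent. */
static int demo_read(int csrno, uint64_t *val)
{
    if (csrno == 0x600) {
        return -1;
    }
    *val = (uint64_t)csrno;
    return 0;
}
```

The trade-off is that a genuinely broken read is silently dropped from the dump instead of being caught by the assert.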



Re: [PATCH v4 14/16] target/riscv: Align gprs and fprs in cpu_dump

2021-10-18 Thread LIU Zhiwei



On 2021/10/19 8:01 AM, Richard Henderson wrote:

Allocate 8 columns per register name.

Signed-off-by: Richard Henderson 

Reviewed-by: LIU Zhiwei

Zhiwei

---
  target/riscv/cpu.c | 7 ---
  1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index 4e1920d5f0..f352c2b74c 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -240,7 +240,7 @@ static void riscv_cpu_dump_state(CPUState *cs, FILE *f, int 
flags)
  qemu_fprintf(f, " %s %d\n", "V  =  ", 
riscv_cpu_virt_enabled(env));
  }
  #endif
-qemu_fprintf(f, " %s " TARGET_FMT_lx "\n", "pc  ", env->pc);
+qemu_fprintf(f, " %-8s " TARGET_FMT_lx "\n", "pc", env->pc);
  #ifndef CONFIG_USER_ONLY
  qemu_fprintf(f, " %s " TARGET_FMT_lx "\n", "mhartid ", env->mhartid);
  qemu_fprintf(f, " %s " TARGET_FMT_lx "\n", "mstatus ", 
(target_ulong)env->mstatus);
@@ -290,15 +290,16 @@ static void riscv_cpu_dump_state(CPUState *cs, FILE *f, 
int flags)
  #endif
  
  for (i = 0; i < 32; i++) {

-qemu_fprintf(f, " %s " TARGET_FMT_lx,
+qemu_fprintf(f, " %-8s " TARGET_FMT_lx,
   riscv_int_regnames[i], env->gpr[i]);
  if ((i & 3) == 3) {
  qemu_fprintf(f, "\n");
  }
  }
+
  if (flags & CPU_DUMP_FPU) {
  for (i = 0; i < 32; i++) {
-qemu_fprintf(f, " %s %016" PRIx64,
+qemu_fprintf(f, " %-8s %016" PRIx64,
   riscv_fpr_regnames[i], env->fpr[i]);
  if ((i & 3) == 3) {
  qemu_fprintf(f, "\n");




Re: [PATCH v4 09/16] target/riscv: Replace DisasContext.w with DisasContext.ol

2021-10-18 Thread Richard Henderson

On 10/18/21 7:24 PM, LIU Zhiwei wrote:

@@ -223,10 +238,15 @@ static TCGv dest_gpr(DisasContext *ctx, int reg_num)
  static void gen_set_gpr(DisasContext *ctx, int reg_num, TCGv t)
  {
  if (reg_num != 0) {
-    if (ctx->w) {
+    switch (get_ol(ctx)) {
+    case MXL_RV32:
  tcg_gen_ext32s_tl(cpu_gpr[reg_num], t);
-    } else {
+    break;
+    case MXL_RV64:
  tcg_gen_mov_tl(cpu_gpr[reg_num], t);
+    break;
+    default:
+    g_assert_not_reached();
  }
  }
  }


dest_gpr currently forces a call to gen_set_gpr.  However, in many MXL_RV64 cases
dest_gpr returns a global TCG variable, so the call to gen_set_gpr is not needed.


In the MXL_RV64 case to which you refer, tcg_gen_mov_tl will be presented with a nop move, 
and will emit no code.  Similarly for TARGET_RISCV32 and the use of tcg_gen_ext32s_tl, 
which will expand to tcg_gen_mov_i32.



Even for x0 the call to gen_set_gpr could be avoided, but I haven't found a good
way to exploit this.


You shouldn't special case x0 unless absolutely necessary.  We check for it here so that 
we do not have to scatter many checks across the code base.



r~
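
For illustration, a plain-C model of the semantics being discussed: the x0 check lives in exactly one place, and the write is sign-extended when the operation length is 32 bits. This models the register-file behaviour only, not TCG code generation; Mxl, olen_bits() and set_gpr() are hypothetical names, with the enum values chosen so that 16 << ol gives the width as in the patch.

```c
#include <stdint.h>

/* Operation length encoding: 1 means 32 bits, 2 means 64 bits. */
typedef enum { MXL_RV32 = 1, MXL_RV64 = 2 } Mxl;

/* Width in bits for a given encoding, mirroring "16 << get_ol(ctx)". */
static int olen_bits(Mxl ol)
{
    return 16 << ol;
}

/* Write a result to the register file of a 64-bit wide model. */
static void set_gpr(uint64_t gpr[32], int reg_num, Mxl ol, uint64_t val)
{
    if (reg_num == 0) {
        return;     /* x0 is hard-wired to zero; the only check needed */
    }
    if (ol == MXL_RV32) {
        /* 32-bit operation: keep the low word, sign-extended */
        gpr[reg_num] = (uint64_t)(int64_t)(int32_t)(uint32_t)val;
    } else {
        gpr[reg_num] = val;
    }
}
```

Callers never test for x0 themselves, which is the reason given above for keeping the check inside the setter.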



Re: [PATCH v1 2/2] target/riscv: Organise the CPU properties

2021-10-18 Thread Bin Meng
On Mon, Oct 18, 2021 at 12:32 PM Alistair Francis
 wrote:
>
> From: Alistair Francis 

Possible commit description:

Organise the CPU properties so that standard extensions come first
then followed by experimental extensions.

>
> Signed-off-by: Alistair Francis 
> ---
>  target/riscv/cpu.c | 17 ++---
>  1 file changed, 10 insertions(+), 7 deletions(-)
>

Reviewed-by: Bin Meng 



Re: [PATCH v1 1/2] target/riscv: Remove some unused macros

2021-10-18 Thread Bin Meng
On Mon, Oct 18, 2021 at 5:30 PM Philippe Mathieu-Daudé  wrote:
>
> On 10/18/21 06:32, Alistair Francis wrote:
> > From: Alistair Francis 
>
> Possible commit description:
>
>  Since commit 1a9540d1f1a ("target/riscv: Drop support for ISA
>  spec version 1.09.1") these definitions are unused, remove them.

I believe the commit tag should stay on a single line; otherwise it may
break scripts that extract it from the commit message.

>
> Reviewed-by: Philippe Mathieu-Daudé 

Reviewed-by: Bin Meng 

>
> > Signed-off-by: Alistair Francis 
> > ---
> >  target/riscv/cpu_bits.h | 8 
> >  1 file changed, 8 deletions(-)
>
> BTW I strongly suggest you use git-publish for your
> series / pull requests:
>
>   https://github.com/stefanha/git-publish
>

Regards,
Bin



Re: [PATCH v4 09/16] target/riscv: Replace DisasContext.w with DisasContext.ol

2021-10-18 Thread LIU Zhiwei



On 2021/10/19 8:01 AM, Richard Henderson wrote:

In preparation for RV128, consider more than just "w" for
operand size modification.  This will be used for the "d"
insns from RV128 as well.

Rename oper_len to get_olen to better match get_xlen.

Reviewed-by: Alistair Francis 
Signed-off-by: Richard Henderson 
---
  target/riscv/translate.c| 71 -
  target/riscv/insn_trans/trans_rvb.c.inc |  8 +--
  target/riscv/insn_trans/trans_rvi.c.inc | 18 +++
  target/riscv/insn_trans/trans_rvm.c.inc | 10 ++--
  4 files changed, 63 insertions(+), 44 deletions(-)

diff --git a/target/riscv/translate.c b/target/riscv/translate.c
index 3f1abbac5c..6ed925c003 100644
--- a/target/riscv/translate.c
+++ b/target/riscv/translate.c
@@ -67,7 +67,7 @@ typedef struct DisasContext {
 to any system register, which includes CSR_FRM, so we do not have
 to reset this known value.  */
  int frm;
-bool w;
+RISCVMXL ol;
  bool virt_enabled;
  bool ext_ifencei;
  bool hlsx;
@@ -104,12 +104,17 @@ static inline int __attribute__((unused)) 
get_xlen(DisasContext *ctx)
  return 16 << get_xl(ctx);
  }
  
-/* The word size for this operation. */

-static inline int oper_len(DisasContext *ctx)
-{
-return ctx->w ? 32 : TARGET_LONG_BITS;
-}
+/* The operation length, as opposed to the xlen. */
+#ifdef TARGET_RISCV32
+#define get_ol(ctx)MXL_RV32
+#else
+#define get_ol(ctx)((ctx)->ol)
+#endif
  
+static inline int get_olen(DisasContext *ctx)

+{
+return 16 << get_ol(ctx);
+}
  
  /*

   * RISC-V requires NaN-boxing of narrower width floating point values.
@@ -197,24 +202,34 @@ static TCGv get_gpr(DisasContext *ctx, int reg_num, 
DisasExtend ext)
  return ctx->zero;
  }
  
-switch (ctx->w ? ext : EXT_NONE) {

-case EXT_NONE:
-return cpu_gpr[reg_num];
-case EXT_SIGN:
-t = temp_new(ctx);
-tcg_gen_ext32s_tl(t, cpu_gpr[reg_num]);
-return t;
-case EXT_ZERO:
-t = temp_new(ctx);
-tcg_gen_ext32u_tl(t, cpu_gpr[reg_num]);
-return t;
+switch (get_ol(ctx)) {
+case MXL_RV32:
+switch (ext) {
+case EXT_NONE:
+break;
+case EXT_SIGN:
+t = temp_new(ctx);
+tcg_gen_ext32s_tl(t, cpu_gpr[reg_num]);
+return t;
+case EXT_ZERO:
+t = temp_new(ctx);
+tcg_gen_ext32u_tl(t, cpu_gpr[reg_num]);
+return t;
+default:
+g_assert_not_reached();
+}
+break;
+case MXL_RV64:
+break;
+default:
+g_assert_not_reached();
  }
-g_assert_not_reached();
+return cpu_gpr[reg_num];
  }
  
  static TCGv dest_gpr(DisasContext *ctx, int reg_num)

  {
-if (reg_num == 0 || ctx->w) {
+if (reg_num == 0 || get_olen(ctx) < TARGET_LONG_BITS) {
  return temp_new(ctx);
  }
  return cpu_gpr[reg_num];
@@ -223,10 +238,15 @@ static TCGv dest_gpr(DisasContext *ctx, int reg_num)
  static void gen_set_gpr(DisasContext *ctx, int reg_num, TCGv t)
  {
  if (reg_num != 0) {
-if (ctx->w) {
+switch (get_ol(ctx)) {
+case MXL_RV32:
  tcg_gen_ext32s_tl(cpu_gpr[reg_num], t);
-} else {
+break;
+case MXL_RV64:
  tcg_gen_mov_tl(cpu_gpr[reg_num], t);
+break;
+default:
+g_assert_not_reached();
  }
  }
  }


dest_gpr currently forces a call to gen_set_gpr.  However, in many
MXL_RV64 cases dest_gpr returns a global TCG variable, so the call to
gen_set_gpr is not needed.


Even for x0 the call to gen_set_gpr could be avoided, but I haven't found
a good way to exploit this.


Otherwise,

Reviewed-by: LIU Zhiwei

Zhiwei


@@ -387,7 +407,7 @@ static bool gen_shift_imm_fn(DisasContext *ctx, arg_shift 
*a, DisasExtend ext,
   void (*func)(TCGv, TCGv, target_long))
  {
  TCGv dest, src1;
-int max_len = oper_len(ctx);
+int max_len = get_olen(ctx);
  
  if (a->shamt >= max_len) {

  return false;
@@ -406,7 +426,7 @@ static bool gen_shift_imm_tl(DisasContext *ctx, arg_shift 
*a, DisasExtend ext,
   void (*func)(TCGv, TCGv, TCGv))
  {
  TCGv dest, src1, src2;
-int max_len = oper_len(ctx);
+int max_len = get_olen(ctx);
  
  if (a->shamt >= max_len) {

  return false;
@@ -430,7 +450,7 @@ static bool gen_shift(DisasContext *ctx, arg_r *a, 
DisasExtend ext,
  TCGv src2 = get_gpr(ctx, a->rs2, EXT_NONE);
  TCGv ext2 = tcg_temp_new();
  
-tcg_gen_andi_tl(ext2, src2, oper_len(ctx) - 1);

+tcg_gen_andi_tl(ext2, src2, get_olen(ctx) - 1);
  func(dest, src1, ext2);
  
  gen_set_gpr(ctx, a->rd, dest);

@@ -530,7 +550,6 @@ static void riscv_tr_init_disas_context(DisasContextBase 
*dcbase, CPUState *cs)
  ctx->vl_eq_vlmax = FIELD_EX32(tb_flags, TB_FLAGS, VL_EQ_VLMAX);
  ctx->xl = FIELD_EX32(tb_flags, TB_FLAGS, XL

Re: [PATCH v1 1/2] target/riscv: Remove some unused macros

2021-10-18 Thread Alistair Francis
On Mon, Oct 18, 2021 at 7:30 PM Philippe Mathieu-Daudé  wrote:
>
> On 10/18/21 06:32, Alistair Francis wrote:
> > From: Alistair Francis 
>
> Possible commit description:
>
>  Since commit 1a9540d1f1a ("target/riscv: Drop support for ISA
>  spec version 1.09.1") these definitions are unused, remove them.

Thanks, added.

>
> Reviewed-by: Philippe Mathieu-Daudé 
>
> > Signed-off-by: Alistair Francis 
> > ---
> >  target/riscv/cpu_bits.h | 8 
> >  1 file changed, 8 deletions(-)
>
> BTW I strongly suggest you use git-publish for your
> series / pull requests:
>
>   https://github.com/stefanha/git-publish

Cool! I'll check it out.

Alistair

>
> Regards,
>
> Phil.



[PATCH v4 16/16] target/riscv: Compute mstatus.sd on demand

2021-10-18 Thread Richard Henderson
The position of this read-only field is dependent on the
current cpu width.  Rather than having to compute that
difference in many places, compute it only on read.

Signed-off-by: Richard Henderson 
---
 target/riscv/cpu_helper.c |  3 +--
 target/riscv/csr.c| 37 ++---
 target/riscv/translate.c  |  5 ++---
 3 files changed, 25 insertions(+), 20 deletions(-)

diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
index 429afd1f48..0d1132f39d 100644
--- a/target/riscv/cpu_helper.c
+++ b/target/riscv/cpu_helper.c
@@ -185,10 +185,9 @@ bool riscv_cpu_fp_enabled(CPURISCVState *env)
 
 void riscv_cpu_swap_hypervisor_regs(CPURISCVState *env)
 {
-uint64_t sd = riscv_cpu_mxl(env) == MXL_RV32 ? MSTATUS32_SD : MSTATUS64_SD;
 uint64_t mstatus_mask = MSTATUS_MXR | MSTATUS_SUM | MSTATUS_FS |
 MSTATUS_SPP | MSTATUS_SPIE | MSTATUS_SIE |
-MSTATUS64_UXL | sd;
+MSTATUS64_UXL;
 bool current_virt = riscv_cpu_virt_enabled(env);
 
 g_assert(riscv_has_ext(env, RVH));
diff --git a/target/riscv/csr.c b/target/riscv/csr.c
index c4a479ddd2..69e4d65fcd 100644
--- a/target/riscv/csr.c
+++ b/target/riscv/csr.c
@@ -477,10 +477,28 @@ static RISCVException read_mhartid(CPURISCVState *env, 
int csrno,
 }
 
 /* Machine Trap Setup */
+
+/* We do not store SD explicitly, only compute it on demand. */
+static uint64_t add_status_sd(RISCVMXL xl, uint64_t status)
+{
+if ((status & MSTATUS_FS) == MSTATUS_FS ||
+(status & MSTATUS_XS) == MSTATUS_XS) {
+switch (xl) {
+case MXL_RV32:
+return status | MSTATUS32_SD;
+case MXL_RV64:
+return status | MSTATUS64_SD;
+default:
+g_assert_not_reached();
+}
+}
+return status;
+}
+
 static RISCVException read_mstatus(CPURISCVState *env, int csrno,
target_ulong *val)
 {
-*val = env->mstatus;
+*val = add_status_sd(riscv_cpu_mxl(env), env->mstatus);
 return RISCV_EXCP_NONE;
 }
 
@@ -498,7 +516,6 @@ static RISCVException write_mstatus(CPURISCVState *env, int 
csrno,
 {
 uint64_t mstatus = env->mstatus;
 uint64_t mask = 0;
-int dirty;
 
 /* flush tlb on mstatus fields that affect VM */
 if ((val ^ mstatus) & (MSTATUS_MXR | MSTATUS_MPP | MSTATUS_MPV |
@@ -520,12 +537,7 @@ static RISCVException write_mstatus(CPURISCVState *env, 
int csrno,
 
 mstatus = (mstatus & ~mask) | (val & mask);
 
-dirty = ((mstatus & MSTATUS_FS) == MSTATUS_FS) |
-((mstatus & MSTATUS_XS) == MSTATUS_XS);
-if (riscv_cpu_mxl(env) == MXL_RV32) {
-mstatus = set_field(mstatus, MSTATUS32_SD, dirty);
-} else {
-mstatus = set_field(mstatus, MSTATUS64_SD, dirty);
+if (riscv_cpu_mxl(env) == MXL_RV64) {
 /* SXL and UXL fields are for now read only */
 mstatus = set_field(mstatus, MSTATUS64_SXL, MXL_RV64);
 mstatus = set_field(mstatus, MSTATUS64_UXL, MXL_RV64);
@@ -798,13 +810,8 @@ static RISCVException read_sstatus(CPURISCVState *env, int 
csrno,
 {
 target_ulong mask = (sstatus_v1_10_mask);
 
-if (riscv_cpu_mxl(env) == MXL_RV32) {
-mask |= SSTATUS32_SD;
-} else {
-mask |= SSTATUS64_SD;
-}
-
-*val = env->mstatus & mask;
+/* TODO: Use SXL not MXL. */
+*val = add_status_sd(riscv_cpu_mxl(env), env->mstatus & mask);
 return RISCV_EXCP_NONE;
 }
 
diff --git a/target/riscv/translate.c b/target/riscv/translate.c
index de013fbf9b..35245aafa7 100644
--- a/target/riscv/translate.c
+++ b/target/riscv/translate.c
@@ -280,7 +280,6 @@ static void gen_jal(DisasContext *ctx, int rd, target_ulong 
imm)
 static void mark_fs_dirty(DisasContext *ctx)
 {
 TCGv tmp;
-target_ulong sd = get_xl(ctx) == MXL_RV32 ? MSTATUS32_SD : MSTATUS64_SD;
 
 if (ctx->mstatus_fs != MSTATUS_FS) {
 /* Remember the state change for the rest of the TB. */
@@ -288,7 +287,7 @@ static void mark_fs_dirty(DisasContext *ctx)
 
 tmp = tcg_temp_new();
 tcg_gen_ld_tl(tmp, cpu_env, offsetof(CPURISCVState, mstatus));
-tcg_gen_ori_tl(tmp, tmp, MSTATUS_FS | sd);
+tcg_gen_ori_tl(tmp, tmp, MSTATUS_FS);
 tcg_gen_st_tl(tmp, cpu_env, offsetof(CPURISCVState, mstatus));
 tcg_temp_free(tmp);
 }
@@ -299,7 +298,7 @@ static void mark_fs_dirty(DisasContext *ctx)
 
 tmp = tcg_temp_new();
 tcg_gen_ld_tl(tmp, cpu_env, offsetof(CPURISCVState, mstatus_hs));
-tcg_gen_ori_tl(tmp, tmp, MSTATUS_FS | sd);
+tcg_gen_ori_tl(tmp, tmp, MSTATUS_FS);
 tcg_gen_st_tl(tmp, cpu_env, offsetof(CPURISCVState, mstatus_hs));
 tcg_temp_free(tmp);
 }
-- 
2.25.1
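
For reference, a standalone sketch of the compute-on-read idea in this patch. The bit positions follow the RISC-V privileged spec (FS in bits 14:13, XS in bits 16:15, SD in the top bit of the register); with_sd() is a hypothetical name taking the width as a plain integer rather than the RISCVMXL enum.

```c
#include <stdint.h>

#define MSTATUS_FS    UINT64_C(0x0000000000006000)  /* bits 14:13 */
#define MSTATUS_XS    UINT64_C(0x0000000000018000)  /* bits 16:15 */
#define MSTATUS32_SD  UINT64_C(0x0000000080000000)  /* bit 31 */
#define MSTATUS64_SD  UINT64_C(0x8000000000000000)  /* bit 63 */

/*
 * Return status with SD added when FS or XS is in the Dirty state (both
 * field bits set). The stored value never carries SD itself.
 */
static uint64_t with_sd(int xlen, uint64_t status)
{
    int dirty = (status & MSTATUS_FS) == MSTATUS_FS ||
                (status & MSTATUS_XS) == MSTATUS_XS;

    if (!dirty) {
        return status;
    }
    return status | (xlen == 32 ? MSTATUS32_SD : MSTATUS64_SD);
}
```

Because SD is derived only at read time, code that marks FS dirty no longer needs to know the current register width.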




[PATCH v4 15/16] target/riscv: Use riscv_csrrw_debug for cpu_dump

2021-10-18 Thread Richard Henderson
Use the official debug read interface to the csrs,
rather than referencing the env slots directly.
Put the list of csrs to dump into a table.

Signed-off-by: Richard Henderson 
---
 target/riscv/cpu.c | 99 +-
 1 file changed, 55 insertions(+), 44 deletions(-)

diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index f352c2b74c..b81b880900 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -241,52 +241,63 @@ static void riscv_cpu_dump_state(CPUState *cs, FILE *f, 
int flags)
 }
 #endif
 qemu_fprintf(f, " %-8s " TARGET_FMT_lx "\n", "pc", env->pc);
+
 #ifndef CONFIG_USER_ONLY
-qemu_fprintf(f, " %s " TARGET_FMT_lx "\n", "mhartid ", env->mhartid);
-qemu_fprintf(f, " %s " TARGET_FMT_lx "\n", "mstatus ", 
(target_ulong)env->mstatus);
-if (riscv_cpu_mxl(env) == MXL_RV32) {
-qemu_fprintf(f, " %s " TARGET_FMT_lx "\n", "mstatush ",
- (target_ulong)(env->mstatus >> 32));
+{
+static const struct {
+const char *name;
+int csrno;
+int misa;
+bool rv32;
+} dump_csrs[] = {
+{ "mhartid",CSR_MHARTID },
+{ "mstatus",CSR_MSTATUS },
+{ "mstatush",   CSR_MSTATUSH, 0, true },
+{ "hstatus",CSR_HSTATUS,  RVH },
+{ "vsstatus",   CSR_VSSTATUS, RVH },
+{ "mip",CSR_MIP },
+{ "mie",CSR_MIE },
+{ "mideleg",CSR_MIDELEG },
+{ "hideleg",CSR_HIDELEG,  RVH },
+{ "medeleg",CSR_MEDELEG },
+{ "hedeleg",CSR_HEDELEG,  RVH },
+{ "mtvec",  CSR_MTVEC },
+{ "stvec",  CSR_STVEC },
+{ "vstvec", CSR_VSTVEC,   RVH },
+{ "mepc",   CSR_MEPC },
+{ "sepc",   CSR_SEPC },
+{ "vsepc",  CSR_VSEPC,RVH },
+{ "mcause", CSR_MCAUSE },
+{ "scause", CSR_SCAUSE },
+{ "vscause",CSR_VSCAUSE,  RVH },
+{ "mtval",  CSR_MTVAL },
+{ "stval",  CSR_STVAL },
+{ "htval",  CSR_HTVAL,RVH },
+{ "mtval2", CSR_MTVAL2,   RVH },
+{ "mscratch",   CSR_MSCRATCH },
+{ "sscratch",   CSR_SSCRATCH },
+{ "satp",   CSR_SATP},
+};
+
+bool rv32 = riscv_cpu_mxl(env) == MXL_RV32;
+
+for (int i = 0; i < ARRAY_SIZE(dump_csrs); ++i) {
+target_ulong val = 0;
+RISCVException result;
+
+if (dump_csrs[i].misa && !riscv_has_ext(env, dump_csrs[i].misa)) {
+continue;
+}
+if (dump_csrs[i].rv32 && !rv32) {
+continue;
+}
+
+result = riscv_csrrw_debug(env, dump_csrs[i].csrno, &val, 0, 0);
+assert(result == RISCV_EXCP_NONE);
+qemu_fprintf(f, " %-8s " TARGET_FMT_lx "\n",
+ dump_csrs[i].name, val);
+}
 }
-if (riscv_has_ext(env, RVH)) {
-qemu_fprintf(f, " %s " TARGET_FMT_lx "\n", "hstatus ", env->hstatus);
-qemu_fprintf(f, " %s " TARGET_FMT_lx "\n", "vsstatus ",
- (target_ulong)env->vsstatus);
-}
-qemu_fprintf(f, " %s " TARGET_FMT_lx "\n", "mip ", env->mip);
-qemu_fprintf(f, " %s " TARGET_FMT_lx "\n", "mie ", env->mie);
-qemu_fprintf(f, " %s " TARGET_FMT_lx "\n", "mideleg ", env->mideleg);
-if (riscv_has_ext(env, RVH)) {
-qemu_fprintf(f, " %s " TARGET_FMT_lx "\n", "hideleg ", env->hideleg);
-}
-qemu_fprintf(f, " %s " TARGET_FMT_lx "\n", "medeleg ", env->medeleg);
-if (riscv_has_ext(env, RVH)) {
-qemu_fprintf(f, " %s " TARGET_FMT_lx "\n", "hedeleg ", env->hedeleg);
-}
-qemu_fprintf(f, " %s " TARGET_FMT_lx "\n", "mtvec   ", env->mtvec);
-qemu_fprintf(f, " %s " TARGET_FMT_lx "\n", "stvec   ", env->stvec);
-if (riscv_has_ext(env, RVH)) {
-qemu_fprintf(f, " %s " TARGET_FMT_lx "\n", "vstvec  ", env->vstvec);
-}
-qemu_fprintf(f, " %s " TARGET_FMT_lx "\n", "mepc", env->mepc);
-qemu_fprintf(f, " %s " TARGET_FMT_lx "\n", "sepc", env->sepc);
-if (riscv_has_ext(env, RVH)) {
-qemu_fprintf(f, " %s " TARGET_FMT_lx "\n", "vsepc   ", env->vsepc);
-}
-qemu_fprintf(f, " %s " TARGET_FMT_lx "\n", "mcause  ", env->mcause);
-qemu_fprintf(f, " %s " TARGET_FMT_lx "\n", "scause  ", env->scause);
-if (riscv_has_ext(env, RVH)) {
-qemu_fprintf(f, " %s " TARGET_FMT_lx "\n", "vscause ", env->vscause);
-}
-qemu_fprintf(f, " %s " TARGET_FMT_lx "\n", "mtval   ", env->mtval);
-qemu_fprintf(f, " %s " TARGET_FMT_lx "\n", "stval   ", env->stval);
-if (riscv_has_ext(env, RVH)) {
-qemu_fprintf(f, " %s " TARGET_FMT_lx "\n", "htval ", env->htval);
-qemu_fprintf(f, " %s " TARGET_FMT_lx "\n", "mtval2 ", env->mtval2);
-}
-qemu_fprintf(f, " %s " TARGET_FMT_lx "\n",

[PATCH v4 12/16] target/riscv: Use gen_unary_per_ol for RVB

2021-10-18 Thread Richard Henderson
The count zeros instructions require a separate implementation
for RV32 when TARGET_LONG_BITS == 64.

Reviewed-by: LIU Zhiwei 
Reviewed-by: Alistair Francis 
Signed-off-by: Richard Henderson 
---
 target/riscv/translate.c| 16 
 target/riscv/insn_trans/trans_rvb.c.inc | 33 -
 2 files changed, 32 insertions(+), 17 deletions(-)

diff --git a/target/riscv/translate.c b/target/riscv/translate.c
index 5d54570cc9..ebcd1c8431 100644
--- a/target/riscv/translate.c
+++ b/target/riscv/translate.c
@@ -486,6 +486,22 @@ static bool gen_unary(DisasContext *ctx, arg_r2 *a, 
DisasExtend ext,
 return true;
 }
 
+static bool gen_unary_per_ol(DisasContext *ctx, arg_r2 *a, DisasExtend ext,
+ void (*f_tl)(TCGv, TCGv),
+ void (*f_32)(TCGv, TCGv))
+{
+int olen = get_olen(ctx);
+
+if (olen != TARGET_LONG_BITS) {
+if (olen == 32) {
+f_tl = f_32;
+} else {
+g_assert_not_reached();
+}
+}
+return gen_unary(ctx, a, ext, f_tl);
+}
+
 static uint32_t opcode_at(DisasContextBase *dcbase, target_ulong pc)
 {
 DisasContext *ctx = container_of(dcbase, DisasContext, base);
diff --git a/target/riscv/insn_trans/trans_rvb.c.inc 
b/target/riscv/insn_trans/trans_rvb.c.inc
index c62eea433a..0c2120428d 100644
--- a/target/riscv/insn_trans/trans_rvb.c.inc
+++ b/target/riscv/insn_trans/trans_rvb.c.inc
@@ -47,10 +47,18 @@ static void gen_clz(TCGv ret, TCGv arg1)
 tcg_gen_clzi_tl(ret, arg1, TARGET_LONG_BITS);
 }
 
+static void gen_clzw(TCGv ret, TCGv arg1)
+{
+TCGv t = tcg_temp_new();
+tcg_gen_shli_tl(t, arg1, 32);
+tcg_gen_clzi_tl(ret, t, 32);
+tcg_temp_free(t);
+}
+
 static bool trans_clz(DisasContext *ctx, arg_clz *a)
 {
 REQUIRE_ZBB(ctx);
-return gen_unary(ctx, a, EXT_ZERO, gen_clz);
+return gen_unary_per_ol(ctx, a, EXT_NONE, gen_clz, gen_clzw);
 }
 
 static void gen_ctz(TCGv ret, TCGv arg1)
@@ -58,10 +66,15 @@ static void gen_ctz(TCGv ret, TCGv arg1)
 tcg_gen_ctzi_tl(ret, arg1, TARGET_LONG_BITS);
 }
 
+static void gen_ctzw(TCGv ret, TCGv arg1)
+{
+tcg_gen_ctzi_tl(ret, arg1, 32);
+}
+
 static bool trans_ctz(DisasContext *ctx, arg_ctz *a)
 {
 REQUIRE_ZBB(ctx);
-return gen_unary(ctx, a, EXT_ZERO, gen_ctz);
+return gen_unary_per_ol(ctx, a, EXT_ZERO, gen_ctz, gen_ctzw);
 }
 
 static bool trans_cpop(DisasContext *ctx, arg_cpop *a)
@@ -314,14 +327,6 @@ static bool trans_zext_h_64(DisasContext *ctx, arg_zext_h_64 *a)
 return gen_unary(ctx, a, EXT_NONE, tcg_gen_ext16u_tl);
 }
 
-static void gen_clzw(TCGv ret, TCGv arg1)
-{
-TCGv t = tcg_temp_new();
-tcg_gen_shli_tl(t, arg1, 32);
-tcg_gen_clzi_tl(ret, t, 32);
-tcg_temp_free(t);
-}
-
 static bool trans_clzw(DisasContext *ctx, arg_clzw *a)
 {
 REQUIRE_64BIT(ctx);
@@ -329,17 +334,11 @@ static bool trans_clzw(DisasContext *ctx, arg_clzw *a)
 return gen_unary(ctx, a, EXT_NONE, gen_clzw);
 }
 
-static void gen_ctzw(TCGv ret, TCGv arg1)
-{
-tcg_gen_ori_tl(ret, arg1, (target_ulong)MAKE_64BIT_MASK(32, 32));
-tcg_gen_ctzi_tl(ret, ret, 64);
-}
-
 static bool trans_ctzw(DisasContext *ctx, arg_ctzw *a)
 {
 REQUIRE_64BIT(ctx);
 REQUIRE_ZBB(ctx);
-return gen_unary(ctx, a, EXT_NONE, gen_ctzw);
+return gen_unary(ctx, a, EXT_ZERO, gen_ctzw);
 }
 
 static bool trans_cpopw(DisasContext *ctx, arg_cpopw *a)
-- 
2.25.1
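
For readers following along, here is a minimal plain-C sketch of what the
32-bit count-zeros variants compute when the register is 64 bits wide. It is
not TCG code: clzi64/ctzi64 are hypothetical stand-ins for tcg_gen_clzi_tl and
tcg_gen_ctzi_tl (count, or return the default when the input is zero), and
model_clz, model_clzw, model_ctzw and pick_unary are names invented for
illustration only.

```c
#include <stdint.h>
#include <stdlib.h>

/* Stand-in for tcg_gen_clzi_tl: leading zeros of x, or dflt when x == 0. */
static uint64_t clzi64(uint64_t x, uint64_t dflt)
{
    uint64_t n = 0;

    if (x == 0) {
        return dflt;
    }
    while ((x >> 63) == 0) {
        x <<= 1;
        n++;
    }
    return n;
}

/* Stand-in for tcg_gen_ctzi_tl: trailing zeros of x, or dflt when x == 0. */
static uint64_t ctzi64(uint64_t x, uint64_t dflt)
{
    uint64_t n = 0;

    if (x == 0) {
        return dflt;
    }
    while ((x & 1) == 0) {
        x >>= 1;
        n++;
    }
    return n;
}

/* Full-width variant, as gen_clz does with TARGET_LONG_BITS == 64. */
uint64_t model_clz(uint64_t reg)
{
    return clzi64(reg, 64);
}

/* As gen_clzw: shift the low 32 bits to the top, which discards the
 * upper half, then count with 32 as the all-zero result. */
uint64_t model_clzw(uint64_t reg)
{
    return clzi64(reg << 32, 32);
}

/* As gen_ctzw with EXT_ZERO: the input is zero-extended first, so a
 * zero low half makes the whole value zero and yields 32. */
uint64_t model_ctzw(uint64_t reg)
{
    return ctzi64((uint32_t)reg, 32);
}

typedef uint64_t (*unary_fn)(uint64_t);

/* Selection in the spirit of gen_unary_per_ol: use the 32-bit variant
 * only when the operand length is narrower than the register. */
unary_fn pick_unary(int olen, int reg_bits, unary_fn f_tl, unary_fn f_32)
{
    if (olen == reg_bits) {
        return f_tl;
    }
    if (olen == 32) {
        return f_32;
    }
    abort(); /* other lengths are not handled in this sketch */
}
```

Passing 32 as the default is what lets the narrow variants report "all 32 bits
zero" without a separate compare.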




[PATCH v4 14/16] target/riscv: Align gprs and fprs in cpu_dump

2021-10-18 Thread Richard Henderson
Allocate 8 columns per register name.

Signed-off-by: Richard Henderson 
---
 target/riscv/cpu.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index 4e1920d5f0..f352c2b74c 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -240,7 +240,7 @@ static void riscv_cpu_dump_state(CPUState *cs, FILE *f, int flags)
 qemu_fprintf(f, " %s %d\n", "V  =  ", riscv_cpu_virt_enabled(env));
 }
 #endif
-qemu_fprintf(f, " %s " TARGET_FMT_lx "\n", "pc  ", env->pc);
+qemu_fprintf(f, " %-8s " TARGET_FMT_lx "\n", "pc", env->pc);
 #ifndef CONFIG_USER_ONLY
 qemu_fprintf(f, " %s " TARGET_FMT_lx "\n", "mhartid ", env->mhartid);
 qemu_fprintf(f, " %s " TARGET_FMT_lx "\n", "mstatus ", (target_ulong)env->mstatus);
@@ -290,15 +290,16 @@ static void riscv_cpu_dump_state(CPUState *cs, FILE *f, int flags)
 #endif
 
 for (i = 0; i < 32; i++) {
-qemu_fprintf(f, " %s " TARGET_FMT_lx,
+qemu_fprintf(f, " %-8s " TARGET_FMT_lx,
  riscv_int_regnames[i], env->gpr[i]);
 if ((i & 3) == 3) {
 qemu_fprintf(f, "\n");
 }
 }
+
 if (flags & CPU_DUMP_FPU) {
 for (i = 0; i < 32; i++) {
-qemu_fprintf(f, " %s %016" PRIx64,
+qemu_fprintf(f, " %-8s %016" PRIx64,
  riscv_fpr_regnames[i], env->fpr[i]);
 if ((i & 3) == 3) {
 qemu_fprintf(f, "\n");
-- 
2.25.1
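
As an aside for readers, a small standalone sketch of what "%-8s" buys here:
the name is left-justified and padded to 8 columns, so the value column lines
up regardless of the name length. format_reg is a hypothetical helper written
for this note, not a QEMU function, and the register names are only sample
inputs.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Format one register the way the patched dump does: a leading space,
 * the name padded to 8 columns, a space, then 16 hex digits.
 * Returns a pointer to a static buffer (overwritten on each call). */
const char *format_reg(const char *name, uint64_t val)
{
    static char buf[64];

    snprintf(buf, sizeof(buf), " %-8s %016" PRIx64, name, val);
    return buf;
}
```

With names of 8 characters or fewer, every formatted field has the same width
(1 + 8 + 1 + 16 characters), which is what keeps four registers per line
aligned.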




[PATCH v4 09/16] target/riscv: Replace DisasContext.w with DisasContext.ol

2021-10-18 Thread Richard Henderson
In preparation for RV128, consider more than just "w" for
operand size modification.  This will be used for the "d"
insns from RV128 as well.

Rename oper_len to get_olen to better match get_xlen.

Reviewed-by: Alistair Francis 
Signed-off-by: Richard Henderson 
---
 target/riscv/translate.c| 71 -
 target/riscv/insn_trans/trans_rvb.c.inc |  8 +--
 target/riscv/insn_trans/trans_rvi.c.inc | 18 +++
 target/riscv/insn_trans/trans_rvm.c.inc | 10 ++--
 4 files changed, 63 insertions(+), 44 deletions(-)

diff --git a/target/riscv/translate.c b/target/riscv/translate.c
index 3f1abbac5c..6ed925c003 100644
--- a/target/riscv/translate.c
+++ b/target/riscv/translate.c
@@ -67,7 +67,7 @@ typedef struct DisasContext {
to any system register, which includes CSR_FRM, so we do not have
to reset this known value.  */
 int frm;
-bool w;
+RISCVMXL ol;
 bool virt_enabled;
 bool ext_ifencei;
 bool hlsx;
@@ -104,12 +104,17 @@ static inline int __attribute__((unused)) get_xlen(DisasContext *ctx)
 return 16 << get_xl(ctx);
 }
 
-/* The word size for this operation. */
-static inline int oper_len(DisasContext *ctx)
-{
-return ctx->w ? 32 : TARGET_LONG_BITS;
-}
+/* The operation length, as opposed to the xlen. */
+#ifdef TARGET_RISCV32
+#define get_ol(ctx)    MXL_RV32
+#else
+#define get_ol(ctx)    ((ctx)->ol)
+#endif
 
+static inline int get_olen(DisasContext *ctx)
+{
+return 16 << get_ol(ctx);
+}
 
 /*
  * RISC-V requires NaN-boxing of narrower width floating point values.
@@ -197,24 +202,34 @@ static TCGv get_gpr(DisasContext *ctx, int reg_num, DisasExtend ext)
 return ctx->zero;
 }
 
-switch (ctx->w ? ext : EXT_NONE) {
-case EXT_NONE:
-return cpu_gpr[reg_num];
-case EXT_SIGN:
-t = temp_new(ctx);
-tcg_gen_ext32s_tl(t, cpu_gpr[reg_num]);
-return t;
-case EXT_ZERO:
-t = temp_new(ctx);
-tcg_gen_ext32u_tl(t, cpu_gpr[reg_num]);
-return t;
+switch (get_ol(ctx)) {
+case MXL_RV32:
+switch (ext) {
+case EXT_NONE:
+break;
+case EXT_SIGN:
+t = temp_new(ctx);
+tcg_gen_ext32s_tl(t, cpu_gpr[reg_num]);
+return t;
+case EXT_ZERO:
+t = temp_new(ctx);
+tcg_gen_ext32u_tl(t, cpu_gpr[reg_num]);
+return t;
+default:
+g_assert_not_reached();
+}
+break;
+case MXL_RV64:
+break;
+default:
+g_assert_not_reached();
 }
-g_assert_not_reached();
+return cpu_gpr[reg_num];
 }
 
 static TCGv dest_gpr(DisasContext *ctx, int reg_num)
 {
-if (reg_num == 0 || ctx->w) {
+if (reg_num == 0 || get_olen(ctx) < TARGET_LONG_BITS) {
 return temp_new(ctx);
 }
 return cpu_gpr[reg_num];
@@ -223,10 +238,15 @@ static TCGv dest_gpr(DisasContext *ctx, int reg_num)
 static void gen_set_gpr(DisasContext *ctx, int reg_num, TCGv t)
 {
 if (reg_num != 0) {
-if (ctx->w) {
+switch (get_ol(ctx)) {
+case MXL_RV32:
 tcg_gen_ext32s_tl(cpu_gpr[reg_num], t);
-} else {
+break;
+case MXL_RV64:
 tcg_gen_mov_tl(cpu_gpr[reg_num], t);
+break;
+default:
+g_assert_not_reached();
 }
 }
 }
@@ -387,7 +407,7 @@ static bool gen_shift_imm_fn(DisasContext *ctx, arg_shift *a, DisasExtend ext,
  void (*func)(TCGv, TCGv, target_long))
 {
 TCGv dest, src1;
-int max_len = oper_len(ctx);
+int max_len = get_olen(ctx);
 
 if (a->shamt >= max_len) {
 return false;
@@ -406,7 +426,7 @@ static bool gen_shift_imm_tl(DisasContext *ctx, arg_shift *a, DisasExtend ext,
  void (*func)(TCGv, TCGv, TCGv))
 {
 TCGv dest, src1, src2;
-int max_len = oper_len(ctx);
+int max_len = get_olen(ctx);
 
 if (a->shamt >= max_len) {
 return false;
@@ -430,7 +450,7 @@ static bool gen_shift(DisasContext *ctx, arg_r *a, DisasExtend ext,
 TCGv src2 = get_gpr(ctx, a->rs2, EXT_NONE);
 TCGv ext2 = tcg_temp_new();
 
-tcg_gen_andi_tl(ext2, src2, oper_len(ctx) - 1);
+tcg_gen_andi_tl(ext2, src2, get_olen(ctx) - 1);
 func(dest, src1, ext2);
 
 gen_set_gpr(ctx, a->rd, dest);
@@ -530,7 +550,6 @@ static void riscv_tr_init_disas_context(DisasContextBase *dcbase, CPUState *cs)
 ctx->vl_eq_vlmax = FIELD_EX32(tb_flags, TB_FLAGS, VL_EQ_VLMAX);
 ctx->xl = FIELD_EX32(tb_flags, TB_FLAGS, XL);
 ctx->cs = cs;
-ctx->w = false;
 ctx->ntemp = 0;
 memset(ctx->temp, 0, sizeof(ctx->temp));
 
@@ -554,9 +573,9 @@ static void riscv_tr_translate_insn(DisasContextBase *dcbase, CPUState *cpu)
 CPURISCVState *env = cpu->env_ptr;
 uint16_t opcode16 = translator_lduw(env, &ctx->base, ctx->base.pc_next);
 
+ctx->ol = ctx->xl;
 decode_opc(env, ctx, opcode16);
 ctx->base.pc_next = ctx->pc_succ_

[PATCH v4 13/16] target/riscv: Use gen_shift*_per_ol for RVB, RVI

2021-10-18 Thread Richard Henderson
Most shift instructions require a separate implementation
for RV32 when TARGET_LONG_BITS == 64.

Reviewed-by: LIU Zhiwei 
Reviewed-by: Alistair Francis 
Signed-off-by: Richard Henderson 
---
 target/riscv/translate.c| 31 +
 target/riscv/insn_trans/trans_rvb.c.inc | 92 ++---
 target/riscv/insn_trans/trans_rvi.c.inc | 26 +++
 3 files changed, 97 insertions(+), 52 deletions(-)

diff --git a/target/riscv/translate.c b/target/riscv/translate.c
index ebcd1c8431..de013fbf9b 100644
--- a/target/riscv/translate.c
+++ b/target/riscv/translate.c
@@ -438,6 +438,22 @@ static bool gen_shift_imm_fn(DisasContext *ctx, arg_shift *a, DisasExtend ext,
 return true;
 }
 
+static bool gen_shift_imm_fn_per_ol(DisasContext *ctx, arg_shift *a,
+DisasExtend ext,
+void (*f_tl)(TCGv, TCGv, target_long),
+void (*f_32)(TCGv, TCGv, target_long))
+{
+int olen = get_olen(ctx);
+if (olen != TARGET_LONG_BITS) {
+if (olen == 32) {
+f_tl = f_32;
+} else {
+g_assert_not_reached();
+}
+}
+return gen_shift_imm_fn(ctx, a, ext, f_tl);
+}
+
 static bool gen_shift_imm_tl(DisasContext *ctx, arg_shift *a, DisasExtend ext,
  void (*func)(TCGv, TCGv, TCGv))
 {
@@ -474,6 +490,21 @@ static bool gen_shift(DisasContext *ctx, arg_r *a, DisasExtend ext,
 return true;
 }
 
+static bool gen_shift_per_ol(DisasContext *ctx, arg_r *a, DisasExtend ext,
+ void (*f_tl)(TCGv, TCGv, TCGv),
+ void (*f_32)(TCGv, TCGv, TCGv))
+{
+int olen = get_olen(ctx);
+if (olen != TARGET_LONG_BITS) {
+if (olen == 32) {
+f_tl = f_32;
+} else {
+g_assert_not_reached();
+}
+}
+return gen_shift(ctx, a, ext, f_tl);
+}
+
 static bool gen_unary(DisasContext *ctx, arg_r2 *a, DisasExtend ext,
   void (*func)(TCGv, TCGv))
 {
diff --git a/target/riscv/insn_trans/trans_rvb.c.inc b/target/riscv/insn_trans/trans_rvb.c.inc
index 0c2120428d..cc39e6033b 100644
--- a/target/riscv/insn_trans/trans_rvb.c.inc
+++ b/target/riscv/insn_trans/trans_rvb.c.inc
@@ -227,22 +227,70 @@ static bool trans_bexti(DisasContext *ctx, arg_bexti *a)
 return gen_shift_imm_tl(ctx, a, EXT_NONE, gen_bext);
 }
 
+static void gen_rorw(TCGv ret, TCGv arg1, TCGv arg2)
+{
+TCGv_i32 t1 = tcg_temp_new_i32();
+TCGv_i32 t2 = tcg_temp_new_i32();
+
+/* truncate to 32-bits */
+tcg_gen_trunc_tl_i32(t1, arg1);
+tcg_gen_trunc_tl_i32(t2, arg2);
+
+tcg_gen_rotr_i32(t1, t1, t2);
+
+/* sign-extend 64-bits */
+tcg_gen_ext_i32_tl(ret, t1);
+
+tcg_temp_free_i32(t1);
+tcg_temp_free_i32(t2);
+}
+
 static bool trans_ror(DisasContext *ctx, arg_ror *a)
 {
 REQUIRE_ZBB(ctx);
-return gen_shift(ctx, a, EXT_NONE, tcg_gen_rotr_tl);
+return gen_shift_per_ol(ctx, a, EXT_NONE, tcg_gen_rotr_tl, gen_rorw);
+}
+
+static void gen_roriw(TCGv ret, TCGv arg1, target_long shamt)
+{
+TCGv_i32 t1 = tcg_temp_new_i32();
+
+tcg_gen_trunc_tl_i32(t1, arg1);
+tcg_gen_rotri_i32(t1, t1, shamt);
+tcg_gen_ext_i32_tl(ret, t1);
+
+tcg_temp_free_i32(t1);
 }
 
 static bool trans_rori(DisasContext *ctx, arg_rori *a)
 {
 REQUIRE_ZBB(ctx);
-return gen_shift_imm_fn(ctx, a, EXT_NONE, tcg_gen_rotri_tl);
+return gen_shift_imm_fn_per_ol(ctx, a, EXT_NONE,
+   tcg_gen_rotri_tl, gen_roriw);
+}
+
+static void gen_rolw(TCGv ret, TCGv arg1, TCGv arg2)
+{
+TCGv_i32 t1 = tcg_temp_new_i32();
+TCGv_i32 t2 = tcg_temp_new_i32();
+
+/* truncate to 32-bits */
+tcg_gen_trunc_tl_i32(t1, arg1);
+tcg_gen_trunc_tl_i32(t2, arg2);
+
+tcg_gen_rotl_i32(t1, t1, t2);
+
+/* sign-extend 64-bits */
+tcg_gen_ext_i32_tl(ret, t1);
+
+tcg_temp_free_i32(t1);
+tcg_temp_free_i32(t2);
 }
 
 static bool trans_rol(DisasContext *ctx, arg_rol *a)
 {
 REQUIRE_ZBB(ctx);
-return gen_shift(ctx, a, EXT_NONE, tcg_gen_rotl_tl);
+return gen_shift_per_ol(ctx, a, EXT_NONE, tcg_gen_rotl_tl, gen_rolw);
 }
 
 static void gen_rev8_32(TCGv ret, TCGv src1)
@@ -349,24 +397,6 @@ static bool trans_cpopw(DisasContext *ctx, arg_cpopw *a)
 return gen_unary(ctx, a, EXT_ZERO, tcg_gen_ctpop_tl);
 }
 
-static void gen_rorw(TCGv ret, TCGv arg1, TCGv arg2)
-{
-TCGv_i32 t1 = tcg_temp_new_i32();
-TCGv_i32 t2 = tcg_temp_new_i32();
-
-/* truncate to 32-bits */
-tcg_gen_trunc_tl_i32(t1, arg1);
-tcg_gen_trunc_tl_i32(t2, arg2);
-
-tcg_gen_rotr_i32(t1, t1, t2);
-
-/* sign-extend 64-bits */
-tcg_gen_ext_i32_tl(ret, t1);
-
-tcg_temp_free_i32(t1);
-tcg_temp_free_i32(t2);
-}
-
 static bool trans_rorw(DisasContext *ctx, arg_rorw *a)
 {
 REQUIRE_64BIT(ctx);
@@ -380,25 +410,7 @@ static bool trans_roriw(DisasContext *ctx, arg_roriw *a)
 REQUIRE_64BIT(ctx);
 R

[PATCH v4 06/16] target/riscv: Use REQUIRE_64BIT in amo_check64

2021-10-18 Thread Richard Henderson
Use the same REQUIRE_64BIT check that we use elsewhere,
rather than open-coding the use of is_32bit.

Reviewed-by: LIU Zhiwei 
Reviewed-by: Alistair Francis 
Signed-off-by: Richard Henderson 
---
 target/riscv/insn_trans/trans_rvv.c.inc | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/target/riscv/insn_trans/trans_rvv.c.inc b/target/riscv/insn_trans/trans_rvv.c.inc
index 081a5ca34d..d60279b295 100644
--- a/target/riscv/insn_trans/trans_rvv.c.inc
+++ b/target/riscv/insn_trans/trans_rvv.c.inc
@@ -743,7 +743,8 @@ static bool amo_check(DisasContext *s, arg_rwdvm* a)
 
 static bool amo_check64(DisasContext *s, arg_rwdvm* a)
 {
-return !is_32bit(s) && amo_check(s, a);
+REQUIRE_64BIT(s);
+return amo_check(s, a);
 }
 
 GEN_VEXT_TRANS(vamoswapw_v, 0, rwdvm, amo_op, amo_check)
-- 
2.25.1
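
A minimal sketch, for readers unfamiliar with the REQUIRE_* idiom, of why the
macro form is equivalent to the open-coded boolean test in a function that
returns bool. Everything here is hypothetical illustration in plain C:
REQUIRE_64BIT_SKETCH, inner_check and the two check64_* functions are not QEMU
names, and the enum values are assumed to follow the misa.mxl encoding.

```c
#include <stdbool.h>

/* Assumed to mirror the misa.mxl encoding. */
enum { MXL_RV32 = 1, MXL_RV64 = 2 };

/* Early-return guard: leaves the *enclosing* function with false. */
#define REQUIRE_64BIT_SKETCH(xl) do { \
    if ((xl) == MXL_RV32) {           \
        return false;                 \
    }                                 \
} while (0)

/* Placeholder for the real check (amo_check in the patch). */
static bool inner_check(int x)
{
    return x > 0;
}

/* Open-coded form, as before the patch. */
bool check64_open_coded(int xl, int x)
{
    return xl != MXL_RV32 && inner_check(x);
}

/* Macro form, as after the patch. */
bool check64_macro(int xl, int x)
{
    REQUIRE_64BIT_SKETCH(xl);
    return inner_check(x);
}
```

The do { ... } while (0) wrapper makes the macro behave as a single statement,
so it can be followed by a semicolon and used safely inside if/else bodies.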




[PATCH v4 08/16] target/riscv: Replace is_32bit with get_xl/get_xlen

2021-10-18 Thread Richard Henderson
In preparation for RV128, replace a simple predicate
with a more versatile test.

Reviewed-by: LIU Zhiwei 
Reviewed-by: Alistair Francis 
Signed-off-by: Richard Henderson 
---
 target/riscv/translate.c | 33 ++++++++++++++++++---------------
 1 file changed, 18 insertions(+), 15 deletions(-)

diff --git a/target/riscv/translate.c b/target/riscv/translate.c
index f7634c175a..3f1abbac5c 100644
--- a/target/riscv/translate.c
+++ b/target/riscv/translate.c
@@ -91,16 +91,19 @@ static inline bool has_ext(DisasContext *ctx, uint32_t ext)
 }
 
 #ifdef TARGET_RISCV32
-# define is_32bit(ctx)  true
+#define get_xl(ctx)    MXL_RV32
 #elif defined(CONFIG_USER_ONLY)
-# define is_32bit(ctx)  false
+#define get_xl(ctx)    MXL_RV64
 #else
-static inline bool is_32bit(DisasContext *ctx)
-{
-return ctx->xl == MXL_RV32;
-}
+#define get_xl(ctx)    ((ctx)->xl)
 #endif
 
+/* The word size for this machine mode. */
+static inline int __attribute__((unused)) get_xlen(DisasContext *ctx)
+{
+return 16 << get_xl(ctx);
+}
+
 /* The word size for this operation. */
 static inline int oper_len(DisasContext *ctx)
 {
@@ -257,7 +260,7 @@ static void gen_jal(DisasContext *ctx, int rd, target_ulong imm)
 static void mark_fs_dirty(DisasContext *ctx)
 {
 TCGv tmp;
-target_ulong sd = is_32bit(ctx) ? MSTATUS32_SD : MSTATUS64_SD;
+target_ulong sd = get_xl(ctx) == MXL_RV32 ? MSTATUS32_SD : MSTATUS64_SD;
 
 if (ctx->mstatus_fs != MSTATUS_FS) {
 /* Remember the state change for the rest of the TB. */
@@ -316,16 +319,16 @@ EX_SH(12)
 }  \
 } while (0)
 
-#define REQUIRE_32BIT(ctx) do { \
-if (!is_32bit(ctx)) {   \
-return false;   \
-}   \
+#define REQUIRE_32BIT(ctx) do {\
+if (get_xl(ctx) != MXL_RV32) { \
+return false;  \
+}  \
 } while (0)
 
-#define REQUIRE_64BIT(ctx) do { \
-if (is_32bit(ctx)) {\
-return false;   \
-}   \
+#define REQUIRE_64BIT(ctx) do {\
+if (get_xl(ctx) < MXL_RV64) {  \
+return false;  \
+}  \
 } while (0)
 
 static int ex_rvc_register(DisasContext *ctx, int reg)
-- 
2.25.1
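
To make the "more versatile test" concrete, a short standalone sketch of the
arithmetic behind get_xlen and of how the new REQUIRE_64BIT comparison treats
a wider mode. The enum values are assumed to follow the misa.mxl encoding
(1, 2, 3); xlen_from_xl and require_64bit_ok are hypothetical names used only
in this note.

```c
#include <stdbool.h>

/* Assumed to mirror the misa.mxl encoding. */
enum { MXL_RV32 = 1, MXL_RV64 = 2, MXL_RV128 = 3 };

/* Same arithmetic as get_xlen: 16 << 1 = 32, 16 << 2 = 64, 16 << 3 = 128. */
int xlen_from_xl(int xl)
{
    return 16 << xl;
}

/* Same comparison as the patched REQUIRE_64BIT: anything at least as
 * wide as RV64 passes, so a future RV128 mode is accepted as well. */
bool require_64bit_ok(int xl)
{
    return xl >= MXL_RV64;
}
```

A two-valued is_32bit predicate cannot express the third case; the ordered
comparison on the enumeration can.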




[PATCH v4 11/16] target/riscv: Adjust trans_rev8_32 for riscv64

2021-10-18 Thread Richard Henderson
When target_long is 64-bit, we still want a 32-bit bswap for rev8.
Since this opcode is specific to RV32, we need not conditionalize.

Acked-by: Alistair Francis 
Reviewed-by: LIU Zhiwei 
Signed-off-by: Richard Henderson 
---
 target/riscv/insn_trans/trans_rvb.c.inc | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/target/riscv/insn_trans/trans_rvb.c.inc b/target/riscv/insn_trans/trans_rvb.c.inc
index 66dd51de49..c62eea433a 100644
--- a/target/riscv/insn_trans/trans_rvb.c.inc
+++ b/target/riscv/insn_trans/trans_rvb.c.inc
@@ -232,11 +232,16 @@ static bool trans_rol(DisasContext *ctx, arg_rol *a)
 return gen_shift(ctx, a, EXT_NONE, tcg_gen_rotl_tl);
 }
 
+static void gen_rev8_32(TCGv ret, TCGv src1)
+{
+tcg_gen_bswap32_tl(ret, src1, TCG_BSWAP_OS);
+}
+
 static bool trans_rev8_32(DisasContext *ctx, arg_rev8_32 *a)
 {
 REQUIRE_32BIT(ctx);
 REQUIRE_ZBB(ctx);
-return gen_unary(ctx, a, EXT_NONE, tcg_gen_bswap_tl);
+return gen_unary(ctx, a, EXT_NONE, gen_rev8_32);
 }
 
 static bool trans_rev8_64(DisasContext *ctx, arg_rev8_64 *a)
-- 
2.25.1
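
A plain-C sketch of the intended result, for readers: swap the bytes of the
low 32 bits of a 64-bit register and sign-extend the 32-bit result, which is
how TCG_BSWAP_OS is understood here. model_rev8_32 is a hypothetical name for
this note, not part of the patch.

```c
#include <stdint.h>

/* Byte-swap the low 32 bits of reg and sign-extend the result to 64 bits.
 * The upper 32 bits of the input are ignored. */
uint64_t model_rev8_32(uint64_t reg)
{
    uint32_t x = (uint32_t)reg;
    uint64_t out;

    x = (x >> 24) | ((x >> 8) & 0x0000ff00u) |
        ((x << 8) & 0x00ff0000u) | (x << 24);

    out = x;
    if (x & 0x80000000u) {
        out |= 0xffffffff00000000ull; /* sign-extend bit 31 */
    }
    return out;
}
```

A full-width swap would instead move the low bytes into the upper half of the
register, which is not what the RV32 encoding of rev8 asks for.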




[PATCH v4 05/16] target/riscv: Add MXL/SXL/UXL to TB_FLAGS

2021-10-18 Thread Richard Henderson
Begin adding support for switching XLEN at runtime.  Extract the
effective XLEN from MISA and MSTATUS and store for use during translation.

Reviewed-by: LIU Zhiwei 
Reviewed-by: Alistair Francis 
Signed-off-by: Richard Henderson 
---
 target/riscv/cpu.h|  2 ++
 target/riscv/cpu.c|  8 
 target/riscv/cpu_helper.c | 33 +
 target/riscv/csr.c|  3 +++
 target/riscv/translate.c  |  2 +-
 5 files changed, 47 insertions(+), 1 deletion(-)

diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
index d0e82135a9..c24bc9a039 100644
--- a/target/riscv/cpu.h
+++ b/target/riscv/cpu.h
@@ -395,6 +395,8 @@ FIELD(TB_FLAGS, VILL, 8, 1)
 /* Is a Hypervisor instruction load/store allowed? */
 FIELD(TB_FLAGS, HLSX, 9, 1)
 FIELD(TB_FLAGS, MSTATUS_HS_FS, 10, 2)
+/* The combination of MXL/SXL/UXL that applies to the current cpu mode. */
+FIELD(TB_FLAGS, XL, 12, 2)
 
 #ifdef TARGET_RISCV32
 #define riscv_cpu_mxl(env)  ((void)(env), MXL_RV32)
diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index 1857670a69..4e1920d5f0 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -355,6 +355,14 @@ static void riscv_cpu_reset(DeviceState *dev)
 env->misa_mxl = env->misa_mxl_max;
 env->priv = PRV_M;
 env->mstatus &= ~(MSTATUS_MIE | MSTATUS_MPRV);
+if (env->misa_mxl > MXL_RV32) {
+/*
+ * The reset status of SXL/UXL is undefined, but mstatus is WARL
+ * and we must ensure that the value after init is valid for read.
+ */
+env->mstatus = set_field(env->mstatus, MSTATUS64_SXL, env->misa_mxl);
+env->mstatus = set_field(env->mstatus, MSTATUS64_UXL, env->misa_mxl);
+}
 env->mcause = 0;
 env->pc = env->resetvec;
 env->two_stage_lookup = false;
diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
index 403f54171d..429afd1f48 100644
--- a/target/riscv/cpu_helper.c
+++ b/target/riscv/cpu_helper.c
@@ -35,6 +35,37 @@ int riscv_cpu_mmu_index(CPURISCVState *env, bool ifetch)
 #endif
 }
 
+static RISCVMXL cpu_get_xl(CPURISCVState *env)
+{
+#if defined(TARGET_RISCV32)
+return MXL_RV32;
+#elif defined(CONFIG_USER_ONLY)
+return MXL_RV64;
+#else
+RISCVMXL xl = riscv_cpu_mxl(env);
+
+/*
+ * When emulating a 32-bit-only cpu, use RV32.
+ * When emulating a 64-bit cpu, and MXL has been reduced to RV32,
+ * MSTATUSH doesn't have UXL/SXL, therefore XLEN cannot be widened
+ * back to RV64 for lower privs.
+ */
+if (xl != MXL_RV32) {
+switch (env->priv) {
+case PRV_M:
+break;
+case PRV_U:
+xl = get_field(env->mstatus, MSTATUS64_UXL);
+break;
+default: /* PRV_S | PRV_H */
+xl = get_field(env->mstatus, MSTATUS64_SXL);
+break;
+}
+}
+return xl;
+#endif
+}
+
 void cpu_get_tb_cpu_state(CPURISCVState *env, target_ulong *pc,
   target_ulong *cs_base, uint32_t *pflags)
 {
@@ -78,6 +109,8 @@ void cpu_get_tb_cpu_state(CPURISCVState *env, target_ulong *pc,
 }
 #endif
 
+flags = FIELD_DP32(flags, TB_FLAGS, XL, cpu_get_xl(env));
+
 *pflags = flags;
 }
 
diff --git a/target/riscv/csr.c b/target/riscv/csr.c
index 9c0753bc8b..c4a479ddd2 100644
--- a/target/riscv/csr.c
+++ b/target/riscv/csr.c
@@ -526,6 +526,9 @@ static RISCVException write_mstatus(CPURISCVState *env, int csrno,
 mstatus = set_field(mstatus, MSTATUS32_SD, dirty);
 } else {
 mstatus = set_field(mstatus, MSTATUS64_SD, dirty);
+/* SXL and UXL fields are for now read only */
+mstatus = set_field(mstatus, MSTATUS64_SXL, MXL_RV64);
+mstatus = set_field(mstatus, MSTATUS64_UXL, MXL_RV64);
 }
 env->mstatus = mstatus;
 
diff --git a/target/riscv/translate.c b/target/riscv/translate.c
index 66857732e8..f7634c175a 100644
--- a/target/riscv/translate.c
+++ b/target/riscv/translate.c
@@ -514,7 +514,6 @@ static void riscv_tr_init_disas_context(DisasContextBase *dcbase, CPUState *cs)
 #else
 ctx->virt_enabled = false;
 #endif
-ctx->xl = env->misa_mxl;
 ctx->misa_ext = env->misa_ext;
 ctx->frm = -1;  /* unknown rounding mode */
 ctx->ext_ifencei = cpu->cfg.ext_ifencei;
@@ -526,6 +525,7 @@ static void riscv_tr_init_disas_context(DisasContextBase *dcbase, CPUState *cs)
 ctx->lmul = FIELD_EX32(tb_flags, TB_FLAGS, LMUL);
 ctx->mlen = 1 << (ctx->sew  + 3 - ctx->lmul);
 ctx->vl_eq_vlmax = FIELD_EX32(tb_flags, TB_FLAGS, VL_EQ_VLMAX);
+ctx->xl = FIELD_EX32(tb_flags, TB_FLAGS, XL);
 ctx->cs = cs;
 ctx->w = false;
 ctx->ntemp = 0;
-- 
2.25.1
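
The selection logic and the flag packing above can be sketched in standalone C
as follows. This is a simplified model under stated assumptions, not the QEMU
code: effective_xl takes the relevant fields as plain integers instead of
reading env, flags_set_xl/flags_get_xl stand in for FIELD_DP32/FIELD_EX32 on
the XL field (bit 12, length 2, as declared in the patch), and the privilege
and MXL constants are assumed to follow the architectural encodings.

```c
#include <stdint.h>

enum { MXL_RV32 = 1, MXL_RV64 = 2 };
enum { PRV_U = 0, PRV_S = 1, PRV_M = 3 };

/* Pick the XLEN encoding that applies in the current privilege mode.
 * Once MXL itself is RV32 there are no SXL/UXL fields to consult. */
int effective_xl(int mxl, int sxl, int uxl, int priv)
{
    if (mxl == MXL_RV32) {
        return MXL_RV32;
    }
    switch (priv) {
    case PRV_M:
        return mxl;
    case PRV_U:
        return uxl;
    default: /* supervisor-level modes */
        return sxl;
    }
}

#define XL_SHIFT  12
#define XL_LENGTH 2
#define XL_MASK   (((1u << XL_LENGTH) - 1) << XL_SHIFT)

/* Deposit xl into the XL field, leaving the other flag bits alone. */
uint32_t flags_set_xl(uint32_t flags, uint32_t xl)
{
    return (flags & ~XL_MASK) | ((xl << XL_SHIFT) & XL_MASK);
}

/* Extract the XL field again, as the translator does at TB start. */
uint32_t flags_get_xl(uint32_t flags)
{
    return (flags & XL_MASK) >> XL_SHIFT;
}
```

Carrying the value in the TB flags means a translation block compiled for one
XLEN is never reused after the effective XLEN changes.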




[PATCH v4 04/16] target/riscv: Replace riscv_cpu_is_32bit with riscv_cpu_mxl

2021-10-18 Thread Richard Henderson
Shortly, the set of supported XL will not be just 32 and 64,
and representing that properly using the enumeration will be
imperative.

Two places, booting and gdb, intentionally use misa_mxl_max
to emphasize the use of the reset value of misa.mxl, and not
the current cpu state.

Reviewed-by: LIU Zhiwei 
Reviewed-by: Alistair Francis 
Signed-off-by: Richard Henderson 
---
 target/riscv/cpu.h|  9 -
 hw/riscv/boot.c   |  2 +-
 semihosting/arm-compat-semi.c |  2 +-
 target/riscv/cpu.c| 24 ++--
 target/riscv/cpu_helper.c | 12 ++--
 target/riscv/csr.c| 24 
 target/riscv/gdbstub.c|  2 +-
 target/riscv/monitor.c|  4 ++--
 8 files changed, 45 insertions(+), 34 deletions(-)

diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
index e708fcc168..d0e82135a9 100644
--- a/target/riscv/cpu.h
+++ b/target/riscv/cpu.h
@@ -396,7 +396,14 @@ FIELD(TB_FLAGS, VILL, 8, 1)
 FIELD(TB_FLAGS, HLSX, 9, 1)
 FIELD(TB_FLAGS, MSTATUS_HS_FS, 10, 2)
 
-bool riscv_cpu_is_32bit(CPURISCVState *env);
+#ifdef TARGET_RISCV32
+#define riscv_cpu_mxl(env)  ((void)(env), MXL_RV32)
+#else
+static inline RISCVMXL riscv_cpu_mxl(CPURISCVState *env)
+{
+return env->misa_mxl;
+}
+#endif
 
 /*
  * A simplification for VLMAX
diff --git a/hw/riscv/boot.c b/hw/riscv/boot.c
index 993bf89064..d1ffc7b56c 100644
--- a/hw/riscv/boot.c
+++ b/hw/riscv/boot.c
@@ -35,7 +35,7 @@
 
 bool riscv_is_32bit(RISCVHartArrayState *harts)
 {
-return riscv_cpu_is_32bit(&harts->harts[0].env);
+return harts->harts[0].env.misa_mxl_max == MXL_RV32;
 }
 
 target_ulong riscv_calc_kernel_start_addr(RISCVHartArrayState *harts,
diff --git a/semihosting/arm-compat-semi.c b/semihosting/arm-compat-semi.c
index 01badea99c..37963becae 100644
--- a/semihosting/arm-compat-semi.c
+++ b/semihosting/arm-compat-semi.c
@@ -775,7 +775,7 @@ static inline bool is_64bit_semihosting(CPUArchState *env)
 #if defined(TARGET_ARM)
 return is_a64(env);
 #elif defined(TARGET_RISCV)
-return !riscv_cpu_is_32bit(env);
+return riscv_cpu_mxl(env) != MXL_RV32;
 #else
 #error un-handled architecture
 #endif
diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index fdf031a394..1857670a69 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -108,11 +108,6 @@ const char *riscv_cpu_get_trap_name(target_ulong cause, bool async)
 }
 }
 
-bool riscv_cpu_is_32bit(CPURISCVState *env)
-{
-return env->misa_mxl == MXL_RV32;
-}
-
 static void set_misa(CPURISCVState *env, RISCVMXL mxl, uint32_t ext)
 {
 env->misa_mxl_max = env->misa_mxl = mxl;
@@ -249,7 +244,7 @@ static void riscv_cpu_dump_state(CPUState *cs, FILE *f, int flags)
 #ifndef CONFIG_USER_ONLY
 qemu_fprintf(f, " %s " TARGET_FMT_lx "\n", "mhartid ", env->mhartid);
 qemu_fprintf(f, " %s " TARGET_FMT_lx "\n", "mstatus ", (target_ulong)env->mstatus);
-if (riscv_cpu_is_32bit(env)) {
+if (riscv_cpu_mxl(env) == MXL_RV32) {
 qemu_fprintf(f, " %s " TARGET_FMT_lx "\n", "mstatush ",
  (target_ulong)(env->mstatus >> 32));
 }
@@ -372,10 +367,16 @@ static void riscv_cpu_reset(DeviceState *dev)
 static void riscv_cpu_disas_set_info(CPUState *s, disassemble_info *info)
 {
 RISCVCPU *cpu = RISCV_CPU(s);
-if (riscv_cpu_is_32bit(&cpu->env)) {
+
+switch (riscv_cpu_mxl(&cpu->env)) {
+case MXL_RV32:
 info->print_insn = print_insn_riscv32;
-} else {
+break;
+case MXL_RV64:
 info->print_insn = print_insn_riscv64;
+break;
+default:
+g_assert_not_reached();
 }
 }
 
@@ -631,10 +632,13 @@ static gchar *riscv_gdb_arch_name(CPUState *cs)
 RISCVCPU *cpu = RISCV_CPU(cs);
 CPURISCVState *env = &cpu->env;
 
-if (riscv_cpu_is_32bit(env)) {
+switch (riscv_cpu_mxl(env)) {
+case MXL_RV32:
 return g_strdup("riscv:rv32");
-} else {
+case MXL_RV64:
 return g_strdup("riscv:rv64");
+default:
+g_assert_not_reached();
 }
 }
 
diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
index 14d1d3cb72..403f54171d 100644
--- a/target/riscv/cpu_helper.c
+++ b/target/riscv/cpu_helper.c
@@ -152,7 +152,7 @@ bool riscv_cpu_fp_enabled(CPURISCVState *env)
 
 void riscv_cpu_swap_hypervisor_regs(CPURISCVState *env)
 {
-uint64_t sd = riscv_cpu_is_32bit(env) ? MSTATUS32_SD : MSTATUS64_SD;
+uint64_t sd = riscv_cpu_mxl(env) == MXL_RV32 ? MSTATUS32_SD : MSTATUS64_SD;
 uint64_t mstatus_mask = MSTATUS_MXR | MSTATUS_SUM | MSTATUS_FS |
 MSTATUS_SPP | MSTATUS_SPIE | MSTATUS_SIE |
 MSTATUS64_UXL | sd;
@@ -447,7 +447,7 @@ static int get_physical_address(CPURISCVState *env, hwaddr *physical,
 
 if (first_stage == true) {
 if (use_background) {
-if (riscv_cpu_is_32bit(env)) {
+if (riscv_cpu_mxl(env) == MXL_RV32) {
 base = (hwaddr)get_field(env->vsatp, SATP32_PPN) <<

[PATCH v4 10/16] target/riscv: Use gen_arith_per_ol for RVM

2021-10-18 Thread Richard Henderson
The multiply high-part instructions require a separate
implementation for RV32 when TARGET_LONG_BITS == 64.

Reviewed-by: LIU Zhiwei 
Reviewed-by: Alistair Francis 
Signed-off-by: Richard Henderson 
---
 target/riscv/translate.c| 16 +++
 target/riscv/insn_trans/trans_rvm.c.inc | 26 ++---
 2 files changed, 39 insertions(+), 3 deletions(-)

diff --git a/target/riscv/translate.c b/target/riscv/translate.c
index 6ed925c003..5d54570cc9 100644
--- a/target/riscv/translate.c
+++ b/target/riscv/translate.c
@@ -403,6 +403,22 @@ static bool gen_arith(DisasContext *ctx, arg_r *a, DisasExtend ext,
 return true;
 }
 
+static bool gen_arith_per_ol(DisasContext *ctx, arg_r *a, DisasExtend ext,
+ void (*f_tl)(TCGv, TCGv, TCGv),
+ void (*f_32)(TCGv, TCGv, TCGv))
+{
+int olen = get_olen(ctx);
+
+if (olen != TARGET_LONG_BITS) {
+if (olen == 32) {
+f_tl = f_32;
+} else {
+g_assert_not_reached();
+}
+}
+return gen_arith(ctx, a, ext, f_tl);
+}
+
 static bool gen_shift_imm_fn(DisasContext *ctx, arg_shift *a, DisasExtend ext,
  void (*func)(TCGv, TCGv, target_long))
 {
diff --git a/target/riscv/insn_trans/trans_rvm.c.inc b/target/riscv/insn_trans/trans_rvm.c.inc
index 9a1fe3c799..2af0e5c139 100644
--- a/target/riscv/insn_trans/trans_rvm.c.inc
+++ b/target/riscv/insn_trans/trans_rvm.c.inc
@@ -33,10 +33,16 @@ static void gen_mulh(TCGv ret, TCGv s1, TCGv s2)
 tcg_temp_free(discard);
 }
 
+static void gen_mulh_w(TCGv ret, TCGv s1, TCGv s2)
+{
+tcg_gen_mul_tl(ret, s1, s2);
+tcg_gen_sari_tl(ret, ret, 32);
+}
+
 static bool trans_mulh(DisasContext *ctx, arg_mulh *a)
 {
 REQUIRE_EXT(ctx, RVM);
-return gen_arith(ctx, a, EXT_NONE, gen_mulh);
+return gen_arith_per_ol(ctx, a, EXT_SIGN, gen_mulh, gen_mulh_w);
 }
 
 static void gen_mulhsu(TCGv ret, TCGv arg1, TCGv arg2)
@@ -54,10 +60,23 @@ static void gen_mulhsu(TCGv ret, TCGv arg1, TCGv arg2)
 tcg_temp_free(rh);
 }
 
+static void gen_mulhsu_w(TCGv ret, TCGv arg1, TCGv arg2)
+{
+TCGv t1 = tcg_temp_new();
+TCGv t2 = tcg_temp_new();
+
+tcg_gen_ext32s_tl(t1, arg1);
+tcg_gen_ext32u_tl(t2, arg2);
+tcg_gen_mul_tl(ret, t1, t2);
+tcg_temp_free(t1);
+tcg_temp_free(t2);
+tcg_gen_sari_tl(ret, ret, 32);
+}
+
 static bool trans_mulhsu(DisasContext *ctx, arg_mulhsu *a)
 {
 REQUIRE_EXT(ctx, RVM);
-return gen_arith(ctx, a, EXT_NONE, gen_mulhsu);
+return gen_arith_per_ol(ctx, a, EXT_NONE, gen_mulhsu, gen_mulhsu_w);
 }
 
 static void gen_mulhu(TCGv ret, TCGv s1, TCGv s2)
@@ -71,7 +90,8 @@ static void gen_mulhu(TCGv ret, TCGv s1, TCGv s2)
 static bool trans_mulhu(DisasContext *ctx, arg_mulhu *a)
 {
 REQUIRE_EXT(ctx, RVM);
-return gen_arith(ctx, a, EXT_NONE, gen_mulhu);
+/* gen_mulh_w works for either sign as input. */
+return gen_arith_per_ol(ctx, a, EXT_ZERO, gen_mulhu, gen_mulh_w);
 }
 
 static void gen_div(TCGv ret, TCGv source1, TCGv source2)
-- 
2.25.1
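
For readers, a minimal plain-C model of why a single 64-bit multiply followed
by a shift is enough for the RV32 high-part multiplies: once both 32-bit
inputs are extended to 64 bits, the full product fits in 64 bits and its upper
half is the answer. The function names are hypothetical and only the low
32 bits of the result are modelled (the patch leaves the final sign extension
to gen_set_gpr).

```c
#include <stdint.h>

/* mulh on RV32: signed x signed, upper 32 bits of the 64-bit product. */
uint32_t model_mulh_w(int32_t a, int32_t b)
{
    int64_t p = (int64_t)a * b;

    return (uint32_t)((uint64_t)p >> 32);
}

/* mulhu on RV32: unsigned x unsigned, upper 32 bits. */
uint32_t model_mulhu_w(uint32_t a, uint32_t b)
{
    uint64_t p = (uint64_t)a * b;

    return (uint32_t)(p >> 32);
}

/* mulhsu on RV32: signed x unsigned, upper 32 bits.  The magnitude of the
 * product stays below 2^63, so the signed multiply cannot overflow. */
uint32_t model_mulhsu_w(int32_t a, uint32_t b)
{
    int64_t p = (int64_t)a * (int64_t)b;

    return (uint32_t)((uint64_t)p >> 32);
}
```

The signed and unsigned cases differ only in how the inputs are extended,
which matches the patch passing EXT_SIGN or EXT_ZERO and reusing gen_mulh_w
for both.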




[PATCH v4 01/16] target/riscv: Move cpu_get_tb_cpu_state out of line

2021-10-18 Thread Richard Henderson
Move the function to cpu_helper.c, as it is large and growing.

Reviewed-by: LIU Zhiwei 
Reviewed-by: Alistair Francis 
Signed-off-by: Richard Henderson 
---
 target/riscv/cpu.h| 47 ++-
 target/riscv/cpu_helper.c | 46 ++
 2 files changed, 48 insertions(+), 45 deletions(-)

diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
index 9e55b2f5b1..7084efc452 100644
--- a/target/riscv/cpu.h
+++ b/target/riscv/cpu.h
@@ -413,51 +413,8 @@ static inline uint32_t vext_get_vlmax(RISCVCPU *cpu, target_ulong vtype)
 return cpu->cfg.vlen >> (sew + 3 - lmul);
 }
 
-static inline void cpu_get_tb_cpu_state(CPURISCVState *env, target_ulong *pc,
-target_ulong *cs_base, uint32_t *pflags)
-{
-uint32_t flags = 0;
-
-*pc = env->pc;
-*cs_base = 0;
-
-if (riscv_has_ext(env, RVV)) {
-uint32_t vlmax = vext_get_vlmax(env_archcpu(env), env->vtype);
-bool vl_eq_vlmax = (env->vstart == 0) && (vlmax == env->vl);
-flags = FIELD_DP32(flags, TB_FLAGS, VILL,
-FIELD_EX64(env->vtype, VTYPE, VILL));
-flags = FIELD_DP32(flags, TB_FLAGS, SEW,
-FIELD_EX64(env->vtype, VTYPE, VSEW));
-flags = FIELD_DP32(flags, TB_FLAGS, LMUL,
-FIELD_EX64(env->vtype, VTYPE, VLMUL));
-flags = FIELD_DP32(flags, TB_FLAGS, VL_EQ_VLMAX, vl_eq_vlmax);
-} else {
-flags = FIELD_DP32(flags, TB_FLAGS, VILL, 1);
-}
-
-#ifdef CONFIG_USER_ONLY
-flags |= TB_FLAGS_MSTATUS_FS;
-#else
-flags |= cpu_mmu_index(env, 0);
-if (riscv_cpu_fp_enabled(env)) {
-flags |= env->mstatus & MSTATUS_FS;
-}
-
-if (riscv_has_ext(env, RVH)) {
-if (env->priv == PRV_M ||
-(env->priv == PRV_S && !riscv_cpu_virt_enabled(env)) ||
-(env->priv == PRV_U && !riscv_cpu_virt_enabled(env) &&
-get_field(env->hstatus, HSTATUS_HU))) {
-flags = FIELD_DP32(flags, TB_FLAGS, HLSX, 1);
-}
-
-flags = FIELD_DP32(flags, TB_FLAGS, MSTATUS_HS_FS,
-   get_field(env->mstatus_hs, MSTATUS_FS));
-}
-#endif
-
-*pflags = flags;
-}
+void cpu_get_tb_cpu_state(CPURISCVState *env, target_ulong *pc,
+  target_ulong *cs_base, uint32_t *pflags);
 
 RISCVException riscv_csrrw(CPURISCVState *env, int csrno,
target_ulong *ret_value,
diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
index d41d5cd27c..14d1d3cb72 100644
--- a/target/riscv/cpu_helper.c
+++ b/target/riscv/cpu_helper.c
@@ -35,6 +35,52 @@ int riscv_cpu_mmu_index(CPURISCVState *env, bool ifetch)
 #endif
 }
 
+void cpu_get_tb_cpu_state(CPURISCVState *env, target_ulong *pc,
+  target_ulong *cs_base, uint32_t *pflags)
+{
+uint32_t flags = 0;
+
+*pc = env->pc;
+*cs_base = 0;
+
+if (riscv_has_ext(env, RVV)) {
+uint32_t vlmax = vext_get_vlmax(env_archcpu(env), env->vtype);
+bool vl_eq_vlmax = (env->vstart == 0) && (vlmax == env->vl);
+flags = FIELD_DP32(flags, TB_FLAGS, VILL,
+FIELD_EX64(env->vtype, VTYPE, VILL));
+flags = FIELD_DP32(flags, TB_FLAGS, SEW,
+FIELD_EX64(env->vtype, VTYPE, VSEW));
+flags = FIELD_DP32(flags, TB_FLAGS, LMUL,
+FIELD_EX64(env->vtype, VTYPE, VLMUL));
+flags = FIELD_DP32(flags, TB_FLAGS, VL_EQ_VLMAX, vl_eq_vlmax);
+} else {
+flags = FIELD_DP32(flags, TB_FLAGS, VILL, 1);
+}
+
+#ifdef CONFIG_USER_ONLY
+flags |= TB_FLAGS_MSTATUS_FS;
+#else
+flags |= cpu_mmu_index(env, 0);
+if (riscv_cpu_fp_enabled(env)) {
+flags |= env->mstatus & MSTATUS_FS;
+}
+
+if (riscv_has_ext(env, RVH)) {
+if (env->priv == PRV_M ||
+(env->priv == PRV_S && !riscv_cpu_virt_enabled(env)) ||
+(env->priv == PRV_U && !riscv_cpu_virt_enabled(env) &&
+get_field(env->hstatus, HSTATUS_HU))) {
+flags = FIELD_DP32(flags, TB_FLAGS, HLSX, 1);
+}
+
+flags = FIELD_DP32(flags, TB_FLAGS, MSTATUS_HS_FS,
+   get_field(env->mstatus_hs, MSTATUS_FS));
+}
+#endif
+
+*pflags = flags;
+}
+
 #ifndef CONFIG_USER_ONLY
 static int riscv_cpu_local_irq_pending(CPURISCVState *env)
 {
-- 
2.25.1




[PATCH v4 03/16] target/riscv: Split misa.mxl and misa.ext

2021-10-18 Thread Richard Henderson
The hw representation of misa.mxl is at the high bits of the
misa csr.  Representing this in the same way inside QEMU
results in overly complex code trying to check that field.

Reviewed-by: LIU Zhiwei 
Reviewed-by: Alistair Francis 
Signed-off-by: Richard Henderson 
---
 target/riscv/cpu.h  | 15 +++
 linux-user/elfload.c|  2 +-
 linux-user/riscv/cpu_loop.c |  2 +-
 target/riscv/cpu.c  | 78 +
 target/riscv/csr.c  | 44 ++---
 target/riscv/gdbstub.c  |  8 ++--
 target/riscv/machine.c  | 10 +++--
 target/riscv/translate.c| 10 +++--
 8 files changed, 100 insertions(+), 69 deletions(-)

diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
index 7084efc452..e708fcc168 100644
--- a/target/riscv/cpu.h
+++ b/target/riscv/cpu.h
@@ -25,6 +25,7 @@
 #include "exec/cpu-defs.h"
 #include "fpu/softfloat-types.h"
 #include "qom/object.h"
+#include "cpu_bits.h"
 
 #define TCG_GUEST_DEFAULT_MO 0
 
@@ -51,9 +52,6 @@
 # define TYPE_RISCV_CPU_BASE            TYPE_RISCV_CPU_BASE64
 #endif
 
-#define RV32 ((target_ulong)1 << (TARGET_LONG_BITS - 2))
-#define RV64 ((target_ulong)2 << (TARGET_LONG_BITS - 2))
-
 #define RV(x) ((target_ulong)1 << (x - 'A'))
 
 #define RVI RV('I')
@@ -133,8 +131,12 @@ struct CPURISCVState {
 target_ulong priv_ver;
 target_ulong bext_ver;
 target_ulong vext_ver;
-target_ulong misa;
-target_ulong misa_mask;
+
+/* RISCVMXL, but uint32_t for vmstate migration */
+uint32_t misa_mxl;  /* current mxl */
+uint32_t misa_mxl_max;  /* max mxl for this cpu */
+uint32_t misa_ext;  /* current extensions */
+uint32_t misa_ext_mask; /* max ext for this cpu */
 
 uint32_t features;
 
@@ -313,7 +315,7 @@ struct RISCVCPU {
 
 static inline int riscv_has_ext(CPURISCVState *env, target_ulong ext)
 {
-return (env->misa & ext) != 0;
+return (env->misa_ext & ext) != 0;
 }
 
 static inline bool riscv_feature(CPURISCVState *env, int feature)
@@ -322,7 +324,6 @@ static inline bool riscv_feature(CPURISCVState *env, int 
feature)
 }
 
 #include "cpu_user.h"
-#include "cpu_bits.h"
 
 extern const char * const riscv_int_regnames[];
 extern const char * const riscv_fpr_regnames[];
diff --git a/linux-user/elfload.c b/linux-user/elfload.c
index 2404d482ba..214c1aa40d 100644
--- a/linux-user/elfload.c
+++ b/linux-user/elfload.c
@@ -1448,7 +1448,7 @@ static uint32_t get_elf_hwcap(void)
 uint32_t mask = MISA_BIT('I') | MISA_BIT('M') | MISA_BIT('A')
 | MISA_BIT('F') | MISA_BIT('D') | MISA_BIT('C');
 
-return cpu->env.misa & mask;
+return cpu->env.misa_ext & mask;
 #undef MISA_BIT
 }
 
diff --git a/linux-user/riscv/cpu_loop.c b/linux-user/riscv/cpu_loop.c
index 9859a366e4..e5bb6d908a 100644
--- a/linux-user/riscv/cpu_loop.c
+++ b/linux-user/riscv/cpu_loop.c
@@ -133,7 +133,7 @@ void target_cpu_copy_regs(CPUArchState *env, struct 
target_pt_regs *regs)
 env->gpr[xSP] = regs->sp;
 env->elf_flags = info->elf_flags;
 
-if ((env->misa & RVE) && !(env->elf_flags & EF_RISCV_RVE)) {
+if ((env->misa_ext & RVE) && !(env->elf_flags & EF_RISCV_RVE)) {
 error_report("Incompatible ELF: RVE cpu requires RVE ABI binary");
 exit(EXIT_FAILURE);
 }
diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index 1d69d1887e..fdf031a394 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -110,16 +110,13 @@ const char *riscv_cpu_get_trap_name(target_ulong cause, 
bool async)
 
 bool riscv_cpu_is_32bit(CPURISCVState *env)
 {
-if (env->misa & RV64) {
-return false;
-}
-
-return true;
+return env->misa_mxl == MXL_RV32;
 }
 
-static void set_misa(CPURISCVState *env, target_ulong misa)
+static void set_misa(CPURISCVState *env, RISCVMXL mxl, uint32_t ext)
 {
-env->misa_mask = env->misa = misa;
+env->misa_mxl_max = env->misa_mxl = mxl;
+env->misa_ext_mask = env->misa_ext = ext;
 }
 
 static void set_priv_version(CPURISCVState *env, int priv_ver)
@@ -148,9 +145,9 @@ static void riscv_any_cpu_init(Object *obj)
 {
 CPURISCVState *env = &RISCV_CPU(obj)->env;
 #if defined(TARGET_RISCV32)
-set_misa(env, RV32 | RVI | RVM | RVA | RVF | RVD | RVC | RVU);
+set_misa(env, MXL_RV32, RVI | RVM | RVA | RVF | RVD | RVC | RVU);
 #elif defined(TARGET_RISCV64)
-set_misa(env, RV64 | RVI | RVM | RVA | RVF | RVD | RVC | RVU);
+set_misa(env, MXL_RV64, RVI | RVM | RVA | RVF | RVD | RVC | RVU);
 #endif
 set_priv_version(env, PRIV_VERSION_1_11_0);
 }
@@ -160,20 +157,20 @@ static void rv64_base_cpu_init(Object *obj)
 {
 CPURISCVState *env = &RISCV_CPU(obj)->env;
 /* We set this in the realise function */
-set_misa(env, RV64);
+set_misa(env, MXL_RV64, 0);
 }
 
 static void rv64_sifive_u_cpu_init(Object *obj)
 {
 CPURISCVState *env = &RISCV_CPU(obj)->env;
-set_misa(env, RV64 | RVI | RVM | RVA | RVF | RVD | RVC | RVS | RVU);
+set_misa(env, MXL_RV64, RVI | RVM | RVA | RVF | RVD | RVC | RV

[PATCH v4 07/16] target/riscv: Properly check SEW in amo_op

2021-10-18 Thread Richard Henderson
We're currently assuming SEW <= 3, and the "else" from
the SEW == 3 must be less.  Use a switch and explicitly
bound both SEW and SEQ for all cases.
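
As an illustration of the pattern described above, the following standalone
sketch selects a handler from one of two tables based on SEW and checks the
table index in every case. It is not the QEMU code: the handlers op_w0,
op_w1 and op_d0 and the function select_amo_fn are made up, and the sketch
returns NULL where the patch asserts, purely so the out-of-range paths can
be exercised.

```c
#include <assert.h>
#include <stddef.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

typedef int (*amo_fn)(int);

/* Stand-in handlers; the real tables hold generated helper functions. */
static int op_w0(int x) { return x + 1; }
static int op_w1(int x) { return x + 2; }
static int op_d0(int x) { return x + 10; }

static amo_fn const fnsw[] = { op_w0, op_w1 };
static amo_fn const fnsd[] = { op_d0 };

/* Return the handler for (sew, seq), or NULL if either is out of range. */
static amo_fn select_amo_fn(unsigned sew, size_t seq)
{
    switch (sew) {
    case 0:
    case 1:
    case 2:
        /* Narrow element widths share the "word" table. */
        return seq < ARRAY_SIZE(fnsw) ? fnsw[seq] : NULL;
    case 3:
        /* 64-bit elements use the "doubleword" table. */
        return seq < ARRAY_SIZE(fnsd) ? fnsd[seq] : NULL;
    default:
        /* SEW values above 3 are not handled at all. */
        return NULL;
    }
}
```

The point of the switch is that every SEW value lands in exactly one arm,
so no case relies on an implicit "everything else must be smaller"
assumption.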

Reviewed-by: LIU Zhiwei 
Reviewed-by: Alistair Francis 
Signed-off-by: Richard Henderson 
---
 target/riscv/insn_trans/trans_rvv.c.inc | 26 +
 1 file changed, 14 insertions(+), 12 deletions(-)

diff --git a/target/riscv/insn_trans/trans_rvv.c.inc 
b/target/riscv/insn_trans/trans_rvv.c.inc
index d60279b295..d16446d3bb 100644
--- a/target/riscv/insn_trans/trans_rvv.c.inc
+++ b/target/riscv/insn_trans/trans_rvv.c.inc
@@ -704,18 +704,20 @@ static bool amo_op(DisasContext *s, arg_rwdvm *a, uint8_t 
seq)
         gen_helper_exit_atomic(cpu_env);
         s->base.is_jmp = DISAS_NORETURN;
         return true;
-    } else {
-        if (s->sew == 3) {
-            if (!is_32bit(s)) {
-                fn = fnsd[seq];
-            } else {
-                /* Check done in amo_check(). */
-                g_assert_not_reached();
-            }
-        } else {
-            assert(seq < ARRAY_SIZE(fnsw));
-            fn = fnsw[seq];
-        }
+    }
+
+    switch (s->sew) {
+    case 0 ... 2:
+        assert(seq < ARRAY_SIZE(fnsw));
+        fn = fnsw[seq];
+        break;
+    case 3:
+        /* XLEN check done in amo_check(). */
+        assert(seq < ARRAY_SIZE(fnsd));
+        fn = fnsd[seq];
+        break;
+    default:
+        g_assert_not_reached();
     }
 
     data = FIELD_DP32(data, VDATA, MLEN, s->mlen);
-- 
2.25.1




[PATCH v4 02/16] target/riscv: Create RISCVMXL enumeration

2021-10-18 Thread Richard Henderson
Move the MXL_RV* defines to enumerators.

Reviewed-by: LIU Zhiwei 
Reviewed-by: Alistair Francis 
Signed-off-by: Richard Henderson 
---
 target/riscv/cpu_bits.h | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/target/riscv/cpu_bits.h b/target/riscv/cpu_bits.h
index 999187a9ee..e248c6bf6d 100644
--- a/target/riscv/cpu_bits.h
+++ b/target/riscv/cpu_bits.h
@@ -364,9 +364,11 @@
 #define MISA32_MXL          0xC0000000
 #define MISA64_MXL          0xC000000000000000ULL
 
-#define MXL_RV32            1
-#define MXL_RV64            2
-#define MXL_RV128           3
+typedef enum {
+    MXL_RV32  = 1,
+    MXL_RV64  = 2,
+    MXL_RV128 = 3,
+} RISCVMXL;
 
 /* sstatus CSR bits */
 #define SSTATUS_UIE         0x00000001
-- 
2.25.1




[PATCH v4 00/16] target/riscv: Rationalize XLEN and operand length

2021-10-18 Thread Richard Henderson
This is a partial patch set attempting to set things in the
right direction for both the UXL and RV128 patch sets.


r~


Changes for v4:
  * Use riscv_csrrw_debug for cpu_dump.
This fixes the issue that Alistair pointed out wrt the
MSTATUS.SD bit not being correct in the dump; note that
gdbstub already uses riscv_csrrw_debug, and so did not
have a problem.
  * Align the registers in cpu_dump.

Changes for v3:
  * Fix CONFIG_ typo.
  * Fix ctzw typo.
  * Mark get_xlen unused (clang werror)
  * Compute MSTATUS_SD on demand.

Changes for v2:
  * Set mxl/sxl/uxl at reset.
  * Set sxl/uxl in write_mstatus.
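
For readers unfamiliar with the "compute MSTATUS_SD on demand" item, a
minimal sketch of the idea follows. It is an assumption-laden illustration,
not the code from the series: the function name mstatus_with_sd is
hypothetical, only the FS and XS fields are considered, and the field
positions (FS in bits 14:13, XS in bits 16:15, SD in bit XLEN-1) are taken
from the privileged specification.

```c
#include <assert.h>
#include <stdint.h>

#define MSTATUS_FS 0x00006000u /* bits 14:13, value 3 means dirty */
#define MSTATUS_XS 0x00018000u /* bits 16:15, value 3 means dirty */

/*
 * Return mstatus with the SD summary bit derived from FS and XS rather
 * than stored. Any stale SD bit in the input is discarded first.
 */
static uint64_t mstatus_with_sd(uint64_t mstatus, unsigned xlen)
{
    uint64_t sd;

    assert(xlen == 32 || xlen == 64);
    sd = (uint64_t)1 << (xlen - 1);

    mstatus &= ~sd;
    if ((mstatus & MSTATUS_FS) == MSTATUS_FS ||
        (mstatus & MSTATUS_XS) == MSTATUS_XS) {
        mstatus |= sd;
    }
    return mstatus;
}
```

Because SD sits at a different bit position for each XLEN, deriving it at
read time avoids having to move a stored bit whenever the effective XLEN
changes.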


Richard Henderson (16):
  target/riscv: Move cpu_get_tb_cpu_state out of line
  target/riscv: Create RISCVMXL enumeration
  target/riscv: Split misa.mxl and misa.ext
  target/riscv: Replace riscv_cpu_is_32bit with riscv_cpu_mxl
  target/riscv: Add MXL/SXL/UXL to TB_FLAGS
  target/riscv: Use REQUIRE_64BIT in amo_check64
  target/riscv: Properly check SEW in amo_op
  target/riscv: Replace is_32bit with get_xl/get_xlen
  target/riscv: Replace DisasContext.w with DisasContext.ol
  target/riscv: Use gen_arith_per_ol for RVM
  target/riscv: Adjust trans_rev8_32 for riscv64
  target/riscv: Use gen_unary_per_ol for RVB
  target/riscv: Use gen_shift*_per_ol for RVB, RVI
  target/riscv: Align gprs and fprs in cpu_dump
  target/riscv: Use riscv_csrrw_debug for cpu_dump
  target/riscv: Compute mstatus.sd on demand

 target/riscv/cpu.h  |  73 +++-
 target/riscv/cpu_bits.h |   8 +-
 hw/riscv/boot.c |   2 +-
 linux-user/elfload.c|   2 +-
 linux-user/riscv/cpu_loop.c |   2 +-
 semihosting/arm-compat-semi.c   |   2 +-
 target/riscv/cpu.c  | 212 ++--
 target/riscv/cpu_helper.c   |  92 +-
 target/riscv/csr.c  | 104 +++-
 target/riscv/gdbstub.c  |  10 +-
 target/riscv/machine.c  |  10 +-
 target/riscv/monitor.c  |   4 +-
 target/riscv/translate.c| 174 ++-
 target/riscv/insn_trans/trans_rvb.c.inc | 140 +---
 target/riscv/insn_trans/trans_rvi.c.inc |  44 ++---
 target/riscv/insn_trans/trans_rvm.c.inc |  36 +++-
 target/riscv/insn_trans/trans_rvv.c.inc |  29 ++--
 17 files changed, 590 insertions(+), 354 deletions(-)

-- 
2.25.1




Re: [PATCH v2 04/22] target/riscv: Improve fidelity of guest external interrupts

2021-10-18 Thread Alistair Francis
On Mon, Oct 18, 2021 at 10:55 PM Anup Patel  wrote:
>
> On Fri, Oct 15, 2021 at 11:54 AM Alistair Francis  
> wrote:
> >
> > On Thu, Sep 16, 2021 at 11:42 PM Anup Patel  wrote:
> > >
> > > On Wed, Sep 15, 2021 at 6:19 AM Alistair Francis  
> > > wrote:
> > > >
> > > > On Tue, Sep 14, 2021 at 2:33 AM Anup Patel  wrote:
> > > > >
> > > > > On Thu, Sep 9, 2021 at 12:14 PM Alistair Francis 
> > > > >  wrote:
> > > > > >
> > > > > > On Thu, Sep 2, 2021 at 9:26 PM Anup Patel  
> > > > > > wrote:
> > > > > > >
> > > > > > > The guest external interrupts for external interrupt controller 
> > > > > > > are
> > > > > > > not delivered to the guest running under hypervisor on time. This
> > > > > > > results in a guest having sluggish response to serial console 
> > > > > > > input
> > > > > > > and other I/O events. To improve timely delivery of guest external
> > > > > > > interrupts, we check and inject interrupt upon every sret 
> > > > > > > instruction.
> > > > > > >
> > > > > > > Signed-off-by: Anup Patel 
> > > > > > > ---
> > > > > > >  target/riscv/op_helper.c | 9 +
> > > > > > >  1 file changed, 9 insertions(+)
> > > > > > >
> > > > > > > diff --git a/target/riscv/op_helper.c b/target/riscv/op_helper.c
> > > > > > > index ee7c24efe7..4c995c239e 100644
> > > > > > > --- a/target/riscv/op_helper.c
> > > > > > > +++ b/target/riscv/op_helper.c
> > > > > > > @@ -129,6 +129,15 @@ target_ulong helper_sret(CPURISCVState *env, 
> > > > > > > target_ulong cpu_pc_deb)
> > > > > > >
> > > > > > >  riscv_cpu_set_mode(env, prev_priv);
> > > > > > >
> > > > > > > +/*
> > > > > > > + * QEMU does not promptly deliver guest external interrupts
> > > > > > > + * to a guest running on a hypervisor which in-turn is 
> > > > > > > running
> > > > > > > + * on QEMU. We make dummy call to riscv_cpu_update_mip() upon
> > > > > > > + * every sret instruction so that QEMU pickup guest external
> > > > > > > + * interrupts sooner.
> > > > > > > + */
> > > > > > > + riscv_cpu_update_mip(env_archcpu(env), 0, 0);
> > > > > >
> > > > > > This doesn't seem right. I don't understand why we need this?
> > > > > >
> > > > > > riscv_cpu_update_mip() is called when an interrupt is delivered to 
> > > > > > the
> > > > > > CPU, if we are missing interrupts then that is a bug somewhere else.
> > > > >
> > > > > I have finally figured out the cause of guest external interrupts 
> > > > > being
> > > > > missed by Guest/VM.
> > > > >
> > > > > The riscv_cpu_set_irq() which handles guest external interrupt lines
> > > > > of each CPU is called asynchronously. This function in-turn calls
> > > > > riscv_cpu_update_mip() but the CPU might be in host mode (V=0)
> > > > > or in Guest/VM mode (V=1). If the CPU is in host mode (V=0) when
> > > >
> > > > The IRQ being raised should just directly call riscv_cpu_update_mip()
> > > > so the IRQ should happen straight away.
> > >
> > > That's not true for guest external interrupts. The target Guest/VM might
> > > not be running on the receiving HART.
> > >
> > > Let's say the IMSIC injects guest external IRQ1, which is meant for a
> > > Guest/VM, into HARTx, so riscv_cpu_set_irq() calls
> > > riscv_cpu_update_mip(). If HARTx is in HS-mode, or is running some
> > > other Guest/VM, then the cpu_interrupt() request queued by
> > > riscv_cpu_update_mip() will not result in any interrupt being
> > > injected. This further means that QEMU has to check and inject guest
> > > external interrupts to the target Guest/VM when HARTx switches from
> > > HS-mode to VS-mode. By calling riscv_cpu_update_mip() upon the SRET
> > > instruction we ensure that any guest external interrupt that was
> > > missed is injected into VS-mode.
> >
> > Ah ok.
> >
> > So the problem is that an interrupt can occur for a guest, while that
> > guest isn't executing.
>
> Yes, that's right.
>
> >
> > So for example a CPU is executing with V=0. `riscv_cpu_update_mip()`
> > is called, which triggers a hard interrupt. That in turn calls
> > `riscv_cpu_exec_interrupt()` and `riscv_cpu_local_irq_pending()`.
>
> In this case, the hard interrupt is actually a guest external interrupt
> which is tracked via hgeip CSR. The hgeip CSR is updated immediately
> but `riscv_cpu_local_irq_pending()` might be called while V=0 hence
> no interrupt.
>
> >
> > This results in the guest Hypervisor receiving the interrupt, which it
> > then doesn't act on? Or is MIP set but `riscv_cpu_local_irq_pending()`
> > returns false due to the enable checks?
>
> Here, hgeip CSR will be set and it will be reflected in mip.VSEIP
> bit only when hypervisor schedules/runs the Guest (i.e. V=1 and
> hstatus.VGEIN pointing to the appropriate bit of hgeip csr).
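
The relationship described above can be modelled with a small standalone
sketch. This is an illustration only, not QEMU code: vseip_pending is a
hypothetical helper, the V=1 condition is not modelled, and the sketch
assumes the hypervisor extension layout in which hstatus.VGEIN is a 6-bit
field at bits 17:12 that selects one bit of hgeip, with VGEIN == 0 meaning
that no guest external interrupt line is selected.

```c
#include <assert.h>
#include <stdint.h>

#define HSTATUS_VGEIN_SHIFT 12
#define HSTATUS_VGEIN_MASK  0x3Fu

/*
 * Return 1 if a VS-level external interrupt is visible, given the pending
 * guest external interrupt lines (hgeip), the hstatus value that selects
 * one of those lines, and the software-injected hvip.VSEIP bit.
 */
static int vseip_pending(uint64_t hgeip, uint64_t hstatus, int hvip_vseip)
{
    unsigned vgein = (hstatus >> HSTATUS_VGEIN_SHIFT) & HSTATUS_VGEIN_MASK;
    int from_hgeip = vgein ? (int)((hgeip >> vgein) & 1) : 0;

    return hvip_vseip || from_hgeip;
}
```

In this model a line that is pending for guest 3 stays invisible while
VGEIN selects guest 2 or nothing, and becomes visible as soon as the
hypervisor points VGEIN at 3, which is why the pending state has to be
re-evaluated when the hypervisor switches guests.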
>
> >
> > That either seems like a guest bug or that we need some functionality
> > in `riscv_cpu_swap_hypervisor_regs()` to trigger an interrupt on
> > context swap.
>
> This certainly is not a bug with Guest or Hypervisor but rather an
> issue of i

Re: [PATCH v6] Work around vhost-user-blk-test hang

2021-10-18 Thread Raphael Norwitz
On Mon, Oct 18, 2021 at 05:50:41PM -0400, Michael S. Tsirkin wrote:
> On Thu, Oct 14, 2021 at 04:32:23AM +, Raphael Norwitz wrote:
> > The vhost-user-blk-test qtest has been hanging intermittently for a
> > while. The root cause is not yet fully understood, but the hang is
> > impacting enough users that it is important to merge a workaround for
> > it.
> > 
> > The race which causes the hang occurs early on in vhost-user setup,
> > where a vhost-user message is never received by the backend. Forcing
> > QEMU to wait until the storage-daemon has had some time to initialize
> > prevents the hang. Thus the existing storage-daemon pidfile option can
> > be used to implement a workaround cleanly and effectively, since it
> > creates a file only once the storage-daemon initialization is complete.
> > 
> > This change implements a workaround for the vhost-user-blk-test hang by
> > making QEMU wait until the storage-daemon has written out a pidfile
> > before attempting to connect and send messages over the vhost-user
> > socket.
> > 
> > > Some relevant mailing list discussions:
> > 
> > [1] 
> > https://urldefense.proofpoint.com/v2/url?u=https-3A__lore.kernel.org_qemu-2Ddevel_CAFEAcA8kYpz9LiPNxnWJAPSjc-3Dnv532bEdyfynaBeMeohqBp3A-40mail.gmail.com_&d=DwIBAg&c=s883GpUCOChKOHiocYtGcg&r=In4gmR1pGzKB8G5p6LUrWqkSMec2L5EtXZow_FZNJZk&m=eDRDFhe3H61BSSpDvy3PKzwQIa2grX5hNMhigtjMCJ8&s=c6OKIl0NMsDqP0-ZNnVjHhDq2psXIVszz-uBKw_8pEo&e=
> >  
> > [2] 
> > https://urldefense.proofpoint.com/v2/url?u=https-3A__lore.kernel.org_qemu-2Ddevel_YWaky-252FKVbS-252FKZjlV-40stefanha-2Dx1.localdomain_&d=DwIBAg&c=s883GpUCOChKOHiocYtGcg&r=In4gmR1pGzKB8G5p6LUrWqkSMec2L5EtXZow_FZNJZk&m=eDRDFhe3H61BSSpDvy3PKzwQIa2grX5hNMhigtjMCJ8&s=B4EM_0f7TXqsh18YEKOg-cFHabUjsVA5Ie1riDXaB7A&e=
> >  
> > 
> > Signed-off-by: Raphael Norwitz 
> > Reviewed-by: Eric Blake 
> 
> 
> Um. Does not seem to make things better for me:
> 
> **
> ERROR:../tests/qtest/vhost-user-blk-test.c:950:start_vhost_user_blk: 
> assertion failed (retries < PIDFILE_RETRIES): (5 < 5)
> ERROR qtest-x86_64/qos-test - Bail out! 
> ERROR:../tests/qtest/vhost-user-blk-test.c:950:start_vhost_user_blk: 
> assertion failed (retries < PIDFILE_RETRIES): (5 < 5)
> 
> At this point I just disabled the test in meson. No need to make
> everyone suffer.

Makes sense. Do you still want to pursue the workaround?

If so, can you share some details on how you're running the test?

I've gone through 1000+ iterations using the script I posted here:
https://lore.kernel.org/qemu-devel/20210827165253.GA14291@raphael-debian-dev/
without hitting a failure.

>
> 
> > ---
> >  tests/qtest/vhost-user-blk-test.c | 29 -
> >  1 file changed, 28 insertions(+), 1 deletion(-)
> > 
> > diff --git a/tests/qtest/vhost-user-blk-test.c 
> > b/tests/qtest/vhost-user-blk-test.c
> > index 6f108a1b62..c6626a286b 100644
> > --- a/tests/qtest/vhost-user-blk-test.c
> > +++ b/tests/qtest/vhost-user-blk-test.c
> > @@ -24,6 +24,7 @@
> >  #define TEST_IMAGE_SIZE (64 * 1024 * 1024)
> >  #define QVIRTIO_BLK_TIMEOUT_US  (30 * 1000 * 1000)
> >  #define PCI_SLOT_HP 0x06
> > +#define PIDFILE_RETRIES 5
> >  
> >  typedef struct {
> >  pid_t pid;
> 
> 
> Don't like the arbitrary retries counter.
> 
> Let's warn maybe, but on a busy machine we might not complete this
> in time ...

So you would like it to warn and keep trying forever? Or would you
rather set a much more lenient deadline? (1 min? 5 min?)
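
For the sake of discussion, a deadline-based wait could look roughly like
the sketch below. It is plain POSIX C rather than the glib-based test code,
and the names now_ms and wait_for_file as well as the 10 ms polling
interval are arbitrary choices made for this illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdbool.h>
#include <time.h>
#include <unistd.h>

/* Monotonic clock in milliseconds, unaffected by wall-clock changes. */
static long long now_ms(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (long long)ts.tv_sec * 1000 + ts.tv_nsec / 1000000;
}

/*
 * Poll until path exists or timeout_ms has elapsed.
 * Returns true if the file showed up before the deadline.
 */
static bool wait_for_file(const char *path, long long timeout_ms)
{
    const struct timespec nap = { 0, 10 * 1000 * 1000 }; /* 10 ms */
    long long deadline;

    assert(timeout_ms >= 0);
    deadline = now_ms() + timeout_ms;

    while (access(path, F_OK) != 0) {
        if (now_ms() >= deadline) {
            return false;
        }
        nanosleep(&nap, NULL);
    }
    return true;
}
```

Compared with a fixed retry count, the timeout here is expressed in wall
time, so a generous limit such as one minute costs nothing on a fast
machine and still bounds the wait on a loaded one.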

> 
> 
> > @@ -885,7 +886,8 @@ static void start_vhost_user_blk(GString *cmd_line, int 
> > vus_instances,
> >   int num_queues)
> >  {
> >  const char *vhost_user_blk_bin = qtest_qemu_storage_daemon_binary();
> > -int i;
> > +int i, retries;
> > +char *daemon_pidfile_path;
> >  gchar *img_path;
> >  GString *storage_daemon_command = g_string_new(NULL);
> >  QemuStorageDaemonState *qsd;
> > @@ -898,6 +900,8 @@ static void start_vhost_user_blk(GString *cmd_line, int 
> > vus_instances,
> >  " -object memory-backend-memfd,id=mem,size=256M,share=on "
> >  " -M memory-backend=mem -m 256M ");
> >  
> > +daemon_pidfile_path = g_strdup_printf("/tmp/daemon-%d", getpid());
> > +
> 
> Ugh. Predictable paths directly in /tmp are problematic .. mktemp?
> 

Ack

> >  for (i = 0; i < vus_instances; i++) {
> >  int fd;
> >  char *sock_path = create_listen_socket(&fd);
> > @@ -914,6 +918,9 @@ static void start_vhost_user_blk(GString *cmd_line, int 
> > vus_instances,
> > i + 1, sock_path);
> >  }
> >  
> > +g_string_append_printf(storage_daemon_command, "--pidfile %s ",
> > +   daemon_pidfile_path);
> > +
> >  g_test_message("starting vhost-user backend: %s",
> > storage_daemon_command->str);
> >  pid_t pid = fork();
> > @@ -930,7 +937,27 @@ static void start_vhost_user_blk(GString *cmd_line, 
> > int vus_ins

Re: TCP/IP connections sometimes stop retransmitting packets (in nested virtualization case)

2021-10-18 Thread Maxim Levitsky
On Mon, 2021-10-18 at 16:49 -0400, Michael S. Tsirkin wrote:
> On Mon, Oct 18, 2021 at 11:05:23AM -0700, Eric Dumazet wrote:
> > 
> > On 10/17/21 3:50 AM, Maxim Levitsky wrote:
> > > Hi!
> > >  
> > > This is a follow up mail to my mail about NFS client deadlock I was 
> > > trying to debug last week:
> > > https://lore.kernel.org/all/e10b46b04fe4427fa50901dda71fb5f5a26af33e.ca...@redhat.com/T/#u
> > >  
> > > I strongly believe now that this is not related to NFS, but rather to 
> > > some issue in networking stack and maybe
> > > to somewhat non standard .config I was using for the kernels which has 
> > > many advanced networking options disabled
> > > (to cut on compile time).
> > > This is why I chose to start a new thread about it.
> > >  
> > > Regarding the custom .config file, in particular I disabled 
> > > CONFIG_NET_SCHED and CONFIG_TCP_CONG_ADVANCED. 
> > > Both host and the fedora32 VM run the same kernel with those options 
> > > disabled.
> > > 
> > > 
> > > My setup is a VM (fedora32) which runs Win10 HyperV VM inside, nested, 
> > > which in turn runs a fedora32 VM
> > > (but I was able to reproduce it with ordinary HyperV disabled VM running 
> > > in the same fedora 32 VM)
> > >  
> > > The host is running a NFS server, and the fedora32 VM runs a NFS client 
> > > which is used to read/write to a qcow2 file
> > > which contains the disk of the nested Win10 VM. The L3 VM which windows 
> > > VM optionally
> > > runs, is contained in the same qcow2 file.
> > > 
> > > 
> > > I managed to capture (using wireshark) packets around the failure in both 
> > > L0 and L1.
> > > The trace shows fair number of lost packets, a bit more than I would 
> > > expect from communication that is running on the same host, 
> > > but they are retransmitted and don't cause any issues until the moment of 
> > > failure.
> > > 
> > > 
> > > The failure happens when one packet which is sent from host to the guest,
> > > is not received by the guest (as evident by the L1 trace, and by the 
> > > following SACKS from the guest which exclude this packet), 
> > > and then the host (on which the NFS server runs) never attempts to 
> > > re-transmit it.
> > > 
> > > 
> > > The host keeps on sending further TCP packets with replies to previous 
> > > RPC calls it received from the fedora32 VM,
> > > with an increasing sequence number, as evident from both traces, and the 
> > > fedora32 VM keeps on SACK'ing those received packets, 
> > > patiently waiting for the retransmission.
> > >  
> > > After around 12 minutes (!), the host RSTs the connection.
> > > 
> > > It is worth mentioning that while all of this is happening, the fedora32 
> > > VM can become hung if one attempts to access the files 
> > > on the NFS share because effectively all NFS communication is blocked on 
> > > TCP level.
> > > 
> > > I attached an extract from the two traces  (in L0 and L1) around the 
> > > failure up to the RST packet.
> > > 
> > > In this trace the second packet with TCP sequence number 1736557331 
> > > (first one was empty without data) is not received by the guest
> > > and then never retransmitted by the host.
> > > 
> > > Also worth noting that to ease on storage I captured only 512 bytes of 
> > > each packet, but wireshark
> > > notes how many bytes were in the actual packet.
> > >  
> > > Best regards,
> > >   Maxim Levitsky
> > 
> > TCP has special logic not attempting a retransmit if it senses the prior
> > packet has not been consumed yet.
> > 
> > Usually, the consume part is done from NIC drivers at TX completion time,
> > when NIC signals packet has been sent to the wire.
> > 
> > It seems one skb is essentially leaked somewhere, and leaked (not freed)
> 
> Thanks Eric!
> 
> Maxim since the packets that leak are transmitted on the host,
> the question then is what kind of device do you use on the host
> to talk to the guest? tap?
> 
> 
Yes, tap with bridge, similar to how libvirt does 'bridge' networking for vms.
I use my own set of scripts to run qemu directly.

Usually vhost is used in both L0 and L1, and it 'seems' to help to reproduce it,
but I did reproduce this with vhost disabled on both L0 and L1.

The capture was done on the bridge interface on L0, and on a virtual network 
card in L1.

It does seem that I am unable to make it fail again (maybe luck?) with 
CONFIG_NET_SCHED (and its suboptions)
and CONFIG_TCP_CONG_ADVANCED set back to defaults (everything 'm')

Also just to avoid going on the wrong path, note that I did once reproduce this 
on e1000e virtual nic,
thus virtio is likely not to blame here.


Thanks,
Best regards,
Maxim Levitsky




Re: [PATCH v6] Work around vhost-user-blk-test hang

2021-10-18 Thread Michael S. Tsirkin
On Thu, Oct 14, 2021 at 04:32:23AM +, Raphael Norwitz wrote:
> The vhost-user-blk-test qtest has been hanging intermittently for a
> while. The root cause is not yet fully understood, but the hang is
> impacting enough users that it is important to merge a workaround for
> it.
> 
> The race which causes the hang occurs early on in vhost-user setup,
> where a vhost-user message is never received by the backend. Forcing
> QEMU to wait until the storage-daemon has had some time to initialize
> prevents the hang. Thus the existing storage-daemon pidfile option can
> be used to implement a workaround cleanly and effectively, since it
> creates a file only once the storage-daemon initialization is complete.
> 
> This change implements a workaround for the vhost-user-blk-test hang by
> making QEMU wait until the storage-daemon has written out a pidfile
> before attempting to connect and send messages over the vhost-user
> socket.
> 
> Some relevant mailing list discussions:
> 
> [1] 
> https://lore.kernel.org/qemu-devel/CAFEAcA8kYpz9LiPNxnWJAPSjc=nv532bedyfynabemeohqb...@mail.gmail.com/
> [2] 
> https://lore.kernel.org/qemu-devel/YWaky%2FKVbS%2FKZjlV@stefanha-x1.localdomain/
> 
> Signed-off-by: Raphael Norwitz 
> Reviewed-by: Eric Blake 


Um. Does not seem to make things better for me:

**
ERROR:../tests/qtest/vhost-user-blk-test.c:950:start_vhost_user_blk: assertion 
failed (retries < PIDFILE_RETRIES): (5 < 5)
ERROR qtest-x86_64/qos-test - Bail out! 
ERROR:../tests/qtest/vhost-user-blk-test.c:950:start_vhost_user_blk: assertion 
failed (retries < PIDFILE_RETRIES): (5 < 5)

At this point I just disabled the test in meson. No need to make
everyone suffer.


> ---
>  tests/qtest/vhost-user-blk-test.c | 29 -
>  1 file changed, 28 insertions(+), 1 deletion(-)
> 
> diff --git a/tests/qtest/vhost-user-blk-test.c 
> b/tests/qtest/vhost-user-blk-test.c
> index 6f108a1b62..c6626a286b 100644
> --- a/tests/qtest/vhost-user-blk-test.c
> +++ b/tests/qtest/vhost-user-blk-test.c
> @@ -24,6 +24,7 @@
>  #define TEST_IMAGE_SIZE (64 * 1024 * 1024)
>  #define QVIRTIO_BLK_TIMEOUT_US  (30 * 1000 * 1000)
>  #define PCI_SLOT_HP 0x06
> +#define PIDFILE_RETRIES 5
>  
>  typedef struct {
>  pid_t pid;


Don't like the arbitrary retries counter.

Let's warn maybe, but on a busy machine we might not complete this
in time ...


> @@ -885,7 +886,8 @@ static void start_vhost_user_blk(GString *cmd_line, int 
> vus_instances,
>   int num_queues)
>  {
>  const char *vhost_user_blk_bin = qtest_qemu_storage_daemon_binary();
> -int i;
> +int i, retries;
> +char *daemon_pidfile_path;
>  gchar *img_path;
>  GString *storage_daemon_command = g_string_new(NULL);
>  QemuStorageDaemonState *qsd;
> @@ -898,6 +900,8 @@ static void start_vhost_user_blk(GString *cmd_line, int 
> vus_instances,
>  " -object memory-backend-memfd,id=mem,size=256M,share=on "
>  " -M memory-backend=mem -m 256M ");
>  
> +daemon_pidfile_path = g_strdup_printf("/tmp/daemon-%d", getpid());
> +

Ugh. Predictable paths directly in /tmp are problematic .. mktemp?
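
One way to avoid a predictable name, sketched here in plain POSIX C, is to
create a private directory with mkdtemp() and let the daemon write its
pidfile inside it. The helper names make_pidfile_path and
remove_pidfile_path, the directory template and the "daemon.pid" file name
are all made up for this illustration; the test itself would more likely
use the glib equivalents.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Create a private (mode 0700) directory under /tmp and store
 * "<dir>/daemon.pid" in out. The pidfile itself is not created, so a
 * later existence check still means "the daemon wrote it".
 * Returns 0 on success, -1 on failure.
 */
static int make_pidfile_path(char *out, size_t out_len)
{
    char dir[] = "/tmp/vhost-user-blk-test-XXXXXX";
    int n;

    assert(out != NULL);
    if (mkdtemp(dir) == NULL) {
        return -1;
    }
    n = snprintf(out, out_len, "%s/daemon.pid", dir);
    return (n > 0 && (size_t)n < out_len) ? 0 : -1;
}

/* Remove the pidfile (if present) and then its directory. */
static void remove_pidfile_path(char *path)
{
    char *slash;

    unlink(path);
    slash = strrchr(path, '/');
    if (slash != NULL) {
        *slash = '\0'; /* truncate to the directory name */
        rmdir(path);
    }
}
```

Because the directory name is unpredictable and only accessible to the
creating user, another local user cannot pre-create the pidfile path.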

>  for (i = 0; i < vus_instances; i++) {
>  int fd;
>  char *sock_path = create_listen_socket(&fd);
> @@ -914,6 +918,9 @@ static void start_vhost_user_blk(GString *cmd_line, int 
> vus_instances,
> i + 1, sock_path);
>  }
>  
> +g_string_append_printf(storage_daemon_command, "--pidfile %s ",
> +   daemon_pidfile_path);
> +
>  g_test_message("starting vhost-user backend: %s",
> storage_daemon_command->str);
>  pid_t pid = fork();
> @@ -930,7 +937,27 @@ static void start_vhost_user_blk(GString *cmd_line, int 
> vus_instances,
>  execlp("/bin/sh", "sh", "-c", storage_daemon_command->str, NULL);
>  exit(1);
>  }
> +
> +/*
> + * FIXME: The loop here ensures the storage-daemon has come up properly
> + *before allowing the test to proceed. This is a workaround for
> + *a race which used to cause the vhost-user-blk-test to hang. It
> + *should be deleted once the root cause is fully understood and
> + *fixed.
> + */
> +retries = 0;
> +while (access(daemon_pidfile_path, F_OK) != 0) {
> +g_assert_cmpint(retries, <, PIDFILE_RETRIES);
> +
> +retries++;
> +g_usleep(1000);
> +}
> +
>  g_string_free(storage_daemon_command, true);
> +if (access(daemon_pidfile_path, F_OK) == 0) {
> +unlink(daemon_pidfile_path);
> +}
> +g_free(daemon_pidfile_path);
>  
>  qsd = g_new(QemuStorageDaemonState, 1);
>  qsd->pid = pid;
> -- 
> 2.20.1




Re: [PATCH v2 04/15] tests: acpi: q35: test for x2APIC entries in SRAT

2021-10-18 Thread Michael S. Tsirkin
On Thu, Sep 02, 2021 at 07:35:40AM -0400, Igor Mammedov wrote:
> Set -smp 1,maxcpus=288 to test for ACPI code that
> deals with CPUs with large APIC ID (>255).
> 
> PS:
> Test requires KVM and in-kernel irqchip support,
> so skip test if KVM is not available.
> 
> Signed-off-by: Igor Mammedov 
> ---
> v3:
>   - add dedicated test instead of abusing 'numamem' one
>   - add 'kvm' prefix to the test name
>   ("Michael S. Tsirkin" )
> v2:
>   - switch to qtest_has_accel() API
> 
> CC: th...@redhat.com
> CC: lviv...@redhat.com
> ---
>  tests/qtest/bios-tables-test.c | 17 +
>  1 file changed, 17 insertions(+)
> 
> diff --git a/tests/qtest/bios-tables-test.c b/tests/qtest/bios-tables-test.c
> index 51d3a4e239..1f6779da87 100644
> --- a/tests/qtest/bios-tables-test.c
> +++ b/tests/qtest/bios-tables-test.c
> @@ -1033,6 +1033,19 @@ static void test_acpi_q35_tcg_numamem(void)
>  free_test_data(&data);
>  }
>  
> +static void test_acpi_q35_kvm_xapic(void)
> +{
> +test_data data;
> +
> +memset(&data, 0, sizeof(data));
> +data.machine = MACHINE_Q35;
> +data.variant = ".xapic";
> +test_acpi_one(" -object memory-backend-ram,id=ram0,size=128M"
> +  " -numa node -numa node,memdev=ram0"
> +  " -machine kernel-irqchip=on -smp 1,maxcpus=288", &data);
> +free_test_data(&data);
> +}
> +
>  static void test_acpi_q35_tcg_nosmm(void)
>  {
>  test_data data;


This causes an annoying message each time I run it:

qemu-system-x86_64: -accel kvm: warning: Number of hotpluggable cpus requested 
(288) exceeds the recommended cpus supported by KVM (240)

what gives?


> @@ -1506,6 +1519,7 @@ static void test_acpi_oem_fields_virt(void)
>  int main(int argc, char *argv[])
>  {
>  const char *arch = qtest_get_arch();
> +const bool has_kvm = qtest_has_accel("kvm");
>  int ret;
>  
>  g_test_init(&argc, &argv, NULL);
> @@ -1561,6 +1575,9 @@ int main(int argc, char *argv[])
>  if (strcmp(arch, "x86_64") == 0) {
>  qtest_add_func("acpi/microvm/pcie", test_acpi_microvm_pcie_tcg);
>  }
> +if (has_kvm) {
> +qtest_add_func("acpi/q35/kvm/xapic", test_acpi_q35_kvm_xapic);
> +}
>  } else if (strcmp(arch, "aarch64") == 0) {
>  qtest_add_func("acpi/virt", test_acpi_virt_tcg);
>  qtest_add_func("acpi/virt/numamem", test_acpi_virt_tcg_numamem);
> -- 
> 2.27.0




[PATCH] rebuild-expected-aml.sh: allow partial target list

2021-10-18 Thread Michael S. Tsirkin
Only rebuild AML for configured targets.

Signed-off-by: Michael S. Tsirkin 
---
 tests/data/acpi/rebuild-expected-aml.sh | 22 +-
 1 file changed, 21 insertions(+), 1 deletion(-)

diff --git a/tests/data/acpi/rebuild-expected-aml.sh 
b/tests/data/acpi/rebuild-expected-aml.sh
index fc78770544..dcf2e2f221 100755
--- a/tests/data/acpi/rebuild-expected-aml.sh
+++ b/tests/data/acpi/rebuild-expected-aml.sh
@@ -12,7 +12,7 @@
 # This work is licensed under the terms of the GNU GPLv2.
 # See the COPYING.LIB file in the top-level directory.
 
-qemu_bins="./qemu-system-x86_64 ./qemu-system-aarch64"
+qemu_arches="x86_64 aarch64"
 
 if [ ! -e "tests/qtest/bios-tables-test" ]; then
 echo "Test: bios-tables-test is required! Run make check before this 
script."
@@ -20,6 +20,26 @@ if [ ! -e "tests/qtest/bios-tables-test" ]; then
 exit 1;
 fi
 
+if grep TARGET_DIRS= config-host.mak; then
+for arch in $qemu_arches; do
+if  grep TARGET_DIRS= config-host.mak | grep "$arch"-softmmu;
+then
+qemu_bins="$qemu_bins ./qemu-system-$arch"
+fi
+done
+else
+echo "config-host.mak missing!"
+echo "Run this script from the build directory."
+exit 1;
+fi
+
+if [ -z "$qemu_bins" ]; then
+echo "Only the following architectures are currently supported: 
$qemu_arches"
+echo "None of these configured!"
+echo "To fix, run configure --target-list=x86_64-softmmu,aarch64-softmmu"
+exit 1;
+fi
+
 for qemu in $qemu_bins; do
 if [ ! -e $qemu ]; then
 echo "Run 'make' to build the following QEMU executables: $qemu_bins"
-- 
MST




Re: [PULL v2 00/23] Pull bsd user 20211018 patches

2021-10-18 Thread Richard Henderson

On 10/18/21 12:00 PM, Warner Losh wrote:

The following changes since commit c148a0572130ff485cd2249fbdd1a3260d5e10a4:

   Merge remote-tracking branch 'remotes/rth/tags/pull-tcg-20211016' into 
staging (2021-10-16 11:16:28 -0700)

are available in the Git repository at:

   g...@gitlab.com:bsdimp/qemu.git  tags/pull-bsd-user-20211018-pull-request

for you to fetch changes up to 5abfac277d25feb5f12332422c03ea1cb21c6aa1:

   bsd-user/signal: Create a dummy signal queueing function (2021-10-18 
12:51:39 -0600)


Applied, thanks.

r~



Re: [PATCH 2/2] Hexagon (target/hexagon) put writes to USR into temp until commit

2021-10-18 Thread Richard Henderson

On 10/12/21 2:31 AM, Taylor Simpson wrote:

Change SET_USR_FIELD to write to hex_new_value[HEX_REG_USR] instead
of hex_gpr[HEX_REG_USR].

Then, we need code to mark the instructions that can implicitly
set USR
- Macros added to hex_common.py
- A_FPOP added in translate.c

Test case added in tests/tcg/hexagon/overflow.c
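
A toy model of the idea described above, for readers unfamiliar with the scheme: writes to USR are staged in a "new value" slot and only become visible when the packet commits. This is a plain C sketch under stated assumptions, not the TCG code in target/hexagon; ToyState, TOY_REG_USR and the three helpers are hypothetical names.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical register numbering for the sketch only. */
enum { TOY_REG_USR = 0, TOY_NUM_REGS = 1 };

typedef struct {
    uint32_t gpr[TOY_NUM_REGS];        /* architecturally visible values */
    uint32_t new_value[TOY_NUM_REGS];  /* values staged by the current packet */
} ToyState;

/* At packet start the staged copy mirrors the visible registers. */
static void toy_packet_begin(ToyState *s)
{
    memcpy(s->new_value, s->gpr, sizeof(s->gpr));
}

/* Update a USR field in the staged copy only; gpr[] is left untouched. */
static void toy_set_usr_field(ToyState *s, uint32_t mask, uint32_t val)
{
    s->new_value[TOY_REG_USR] =
        (s->new_value[TOY_REG_USR] & ~mask) | (val & mask);
}

/* At packet commit the staged values become visible. */
static void toy_packet_commit(ToyState *s)
{
    memcpy(s->gpr, s->new_value, sizeof(s->gpr));
}
```

In this model an instruction that sets a USR bit inside a packet leaves gpr[TOY_REG_USR] unchanged until toy_packet_commit() runs, which is the ordering the patch description asks for.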

Signed-off-by: Taylor Simpson
---


Reviewed-by: Richard Henderson 

r~



Re: [PATCH 1/2] Hexagon (target/hexagon) more tcg_constant_*

2021-10-18 Thread Richard Henderson

On 10/12/21 2:31 AM, Taylor Simpson wrote:

Change additional tcg_const_tl to tcg_constant_tl

Note that gen_pred_cancel had slot_mask initialized with tcg_const_tl.
However, it is not constant throughout, so we initialize it with
tcg_temp_new and replace the first use with the constant value.

Inspired-by: Richard Henderson
Inspired-by: Philippe Mathieu-Daudé
Signed-off-by: Taylor Simpson
---
  target/hexagon/gen_tcg.h|  9 +++--
  target/hexagon/macros.h |  7 +++
  target/hexagon/translate.c  |  3 +--
  target/hexagon/gen_tcg_funcs.py | 11 ++-
  4 files changed, 9 insertions(+), 21 deletions(-)


Reviewed-by: Richard Henderson 

r~



Re: [PATCH v4 3/3] bios-tables-test: Generate reference table for virt/DBG2

2021-10-18 Thread Richard Henderson

On 10/7/21 12:29 AM, Eric Auger wrote:

diff --git a/tests/data/acpi/virt/DBG2 b/tests/data/acpi/virt/DBG2
index 
e69de29bb2d1d6434b8b29ae775ad8c2e48c5391..86e6314f7b0235ef8ed3e0221e09f996c41f5e98
 100644
GIT binary patch
literal 87
zcmZ>9ayJTR0D|*Q{>~o33QiFL&I&-l2owUbL9`AKgJ=eA21Zr}H4uw|p@A7lh%qQJ
TFmQk+Il-a=3=Gcxz6J~c3~mVl

literal 0
HcmV?d1



Something went wrong here:

Applying: bios-tables-test: Generate reference table for virt/DBG2
error: corrupt binary patch at line 75: --

Can you please re-send?


r~



Re: [PATCH v2] hw/elf_ops.h: switch to ssize_t for elf loader return type

2021-10-18 Thread Richard Henderson

On 10/14/21 12:43 PM, Luc Michel wrote:

Until now, int was used as the return type for all the ELF
loader related functions. The returned value is the sum of all loaded
program headers "MemSize" fields.

Because of the overflow check in elf_ops.h, trying to load an ELF bigger
than INT_MAX will fail. Switch to ssize_t to remove this limitation.

Reviewed-by: Philippe Mathieu-Daudé
Signed-off-by: Luc Michel
---
v2:
   - Turn load_elf ret local variable to ssize_t [Stefano]
   - Add Phil's R-B
---
  include/hw/elf_ops.h | 27 ++--
  include/hw/loader.h  | 60 ++--
  hw/core/loader.c | 60 +++-
  3 files changed, 75 insertions(+), 72 deletions(-)


I'm going to queue this to target-arm.next, as there doesn't seem to be another more 
obvious tree.



r~



[PATCH] tests/vm/openbsd: Update to release 7.0

2021-10-18 Thread Richard Henderson
There are two minor changes required in the script for the
network configuration of the newer release.

Signed-off-by: Richard Henderson 
---
 tests/vm/openbsd | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/tests/vm/openbsd b/tests/vm/openbsd
index c4c78a80f1..337fe7c303 100755
--- a/tests/vm/openbsd
+++ b/tests/vm/openbsd
@@ -22,8 +22,8 @@ class OpenBSDVM(basevm.BaseVM):
 name = "openbsd"
 arch = "x86_64"
 
-link = "https://cdn.openbsd.org/pub/OpenBSD/6.9/amd64/install69.iso";
-csum = "140d26548aec680e34bb5f82295414228e7f61e4f5e7951af066014fda2d6e43"
+link = "https://cdn.openbsd.org/pub/OpenBSD/7.0/amd64/install70.iso";
+csum = "1882f9a23c9800e5dba3dbd2cf0126f552605c915433ef4c5bb672610a4ca3a4"
 size = "20G"
 pkgs = [
 # tools
@@ -95,10 +95,9 @@ class OpenBSDVM(basevm.BaseVM):
 self.console_wait_send("Terminal type",   "xterm\n")
 self.console_wait_send("System hostname", "openbsd\n")
 self.console_wait_send("Which network interface", "vio0\n")
-self.console_wait_send("IPv4 address","dhcp\n")
+self.console_wait_send("IPv4 address","autoconf\n")
 self.console_wait_send("IPv6 address","none\n")
 self.console_wait_send("Which network interface", "done\n")
-self.console_wait_send("DNS domain name", "localnet\n")
 self.console_wait("Password for root account")
 self.console_send("%s\n" % self._config["root_pass"])
 self.console_wait("Password for root account")
-- 
2.25.1




Re: TCP/IP connections sometimes stop retransmitting packets (in nested virtualization case)

2021-10-18 Thread Michael S. Tsirkin
On Mon, Oct 18, 2021 at 11:05:23AM -0700, Eric Dumazet wrote:
> 
> 
> On 10/17/21 3:50 AM, Maxim Levitsky wrote:
> > Hi!
> >  
> > This is a follow up mail to my mail about NFS client deadlock I was trying 
> > to debug last week:
> > https://lore.kernel.org/all/e10b46b04fe4427fa50901dda71fb5f5a26af33e.ca...@redhat.com/T/#u
> >  
> > I strongly believe now that this is not related to NFS, but rather to some 
> > issue in networking stack and maybe
> > to somewhat non standard .config I was using for the kernels which has many 
> > advanced networking options disabled
> > (to cut on compile time).
> > This is why I choose to start a new thread about it.
> >  
> > Regarding the custom .config file, in particular I disabled 
> > CONFIG_NET_SCHED and CONFIG_TCP_CONG_ADVANCED. 
> > Both host and the fedora32 VM run the same kernel with those options 
> > disabled.
> > 
> > 
> > My setup is a VM (fedora32) which runs Win10 HyperV VM inside, nested, 
> > which in turn runs a fedora32 VM
> > (but I was able to reproduce it with ordinary HyperV disabled VM running in 
> > the same fedora 32 VM)
> >  
> > The host is running a NFS server, and the fedora32 VM runs a NFS client 
> > which is used to read/write to a qcow2 file
> > which contains the disk of the nested Win10 VM. The L3 VM which windows VM 
> > optionally
> > runs, is contained in the same qcow2 file.
> > 
> > 
> > I managed to capture (using wireshark) packets around the failure in both 
> > L0 and L1.
> > The trace shows fair number of lost packets, a bit more than I would expect 
> > from communication that is running on the same host, 
> > but they are retransmitted and don't cause any issues until the moment of 
> > failure.
> > 
> > 
> > The failure happens when one packet which is sent from host to the guest,
> > is not received by the guest (as evident by the L1 trace, and by the 
> > following SACKS from the guest which exclude this packet), 
> > and then the host (on which the NFS server runs) never attempts to 
> > re-transmit it.
> > 
> > 
> > The host keeps on sending further TCP packets with replies to previous RPC 
> > calls it received from the fedora32 VM,
> > with an increasing sequence number, as evident from both traces, and the 
> > fedora32 VM keeps on SACK'ing those received packets, 
> > patiently waiting for the retransmission.
> >  
> > After around 12 minutes (!), the host RSTs the connection.
> > 
> > It is worth mentioning that while all of this is happening, the fedora32 VM 
> > can become hung if one attempts to access the files 
> > on the NFS share because effectively all NFS communication is blocked on 
> > TCP level.
> > 
> > I attached an extract from the two traces  (in L0 and L1) around the 
> > failure up to the RST packet.
> > 
> > In this trace the second packet with TCP sequence number 1736557331 (first 
> > one was empty without data) is not received by the guest
> > and then never retransmitted by the host.
> > 
> > Also worth noting that to ease on storage I captured only 512 bytes of each 
> > packet, but wireshark
> > notes how many bytes were in the actual packet.
> >  
> > Best regards,
> > Maxim Levitsky
> 
> TCP has special logic not attempting a retransmit if it senses the prior
> packet has not been consumed yet.
> 
> Usually, the consume part is done from NIC drivers at TX completion time,
> when NIC signals packet has been sent to the wire.
> 
> It seems one skb is essentially leaked somewhere, and leaked (not freed)

Thanks Eric!

Maxim since the packets that leak are transmitted on the host,
the question then is what kind of device do you use on the host
to talk to the guest? tap?


-- 
MST




Re: [PATCH v2 05/15] tests: acpi: update expected tables blobs

2021-10-18 Thread Michael S. Tsirkin
On Thu, Sep 02, 2021 at 07:35:41AM -0400, Igor Mammedov wrote:
> Update adds CPU entries to MADT/SRAT/FACP and DSDT to cover 288 CPUs.
> Notable changes are that CPUs with APIC ID 255 and higher
> use 'Processor Local x2APIC Affinity' structure in SRAT and
> "Device" element in DSDT.
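
The rule described above can be sketched as a small selection function. The subtable type and length values are the ones visible in the table dumps quoted in this message; the function and enum names are hypothetical and this is not the QEMU table-building code.

```c
#include <stdint.h>

/* Subtable types and lengths as they appear in the quoted dumps. */
enum {
    SRAT_TYPE_LOCAL_APIC   = 0x00,  /* Processor Local APIC/SAPIC Affinity */
    SRAT_TYPE_LOCAL_X2APIC = 0x02,  /* Processor Local x2APIC Affinity */
    MADT_TYPE_LOCAL_APIC   = 0x00,  /* Processor Local APIC */
    MADT_TYPE_LOCAL_X2APIC = 0x09   /* Processor Local x2APIC */
};

/* APIC IDs below 255 fit the 8-bit legacy entries; 255 and up need x2APIC. */
static int needs_x2apic_entry(uint32_t apic_id)
{
    return apic_id >= 255;
}

static uint8_t srat_subtable_type(uint32_t apic_id)
{
    return needs_x2apic_entry(apic_id) ? SRAT_TYPE_LOCAL_X2APIC
                                       : SRAT_TYPE_LOCAL_APIC;
}

static uint8_t srat_subtable_len(uint32_t apic_id)
{
    return needs_x2apic_entry(apic_id) ? 0x18 : 0x10;
}

static uint8_t madt_subtable_type(uint32_t apic_id)
{
    return needs_x2apic_entry(apic_id) ? MADT_TYPE_LOCAL_X2APIC
                                       : MADT_TYPE_LOCAL_APIC;
}

static uint8_t madt_subtable_len(uint32_t apic_id)
{
    return needs_x2apic_entry(apic_id) ? 0x10 : 0x08;
}
```

With 288 CPUs this gives legacy entries up to APIC ID 0xFE and x2APIC entries from 0xFF through 0x11F, matching the boundaries shown in the SRAT and APIC excerpts.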
> 
> FACP:
> - Use APIC Cluster Model (V4) : 0
> + Use APIC Cluster Model (V4) : 1
> 
> SRAT:
> ...
> +[1010h 4112   1]Subtable Type : 00 [Processor Local 
> APIC/SAPIC Affinity]
> +[1011h 4113   1]   Length : 10
> +
> +[1012h 4114   1]  Proximity Domain Low(8) : 00
> +[1013h 4115   1]  Apic ID : FE
> +[1014h 4116   4]Flags (decoded below) : 0001
> + Enabled : 1
> +[1018h 4120   1]  Local Sapic EID : 00
> +[1019h 4121   3]Proximity Domain High(24) : 00
> +[101Ch 4124   4] Clock Domain : 
> +
> +[1020h 4128   1]Subtable Type : 02 [Processor Local x2APIC 
> Affinity]
> +[1021h 4129   1]   Length : 18
> +
> +[1022h 4130   2]Reserved1 : 
> +[1024h 4132   4] Proximity Domain : 0001
> +[1028h 4136   4]  Apic ID : 00FF
> +[102Ch 4140   4]Flags (decoded below) : 0001
> + Enabled : 1
> +[1030h 4144   4] Clock Domain : 
> +[1034h 4148   4]Reserved2 : 
> 
> ...
> 
> +[1320h 4896   1]Subtable Type : 02 [Processor Local x2APIC 
> Affinity]
> +[1321h 4897   1]   Length : 18
> +
> +[1322h 4898   2]Reserved1 : 
> +[1324h 4900   4] Proximity Domain : 0001
> +[1328h 4904   4]  Apic ID : 011F
> +[132Ch 4908   4]Flags (decoded below) : 0001
> + Enabled : 1
> +[1330h 4912   4] Clock Domain : 
> +[1334h 4916   4]Reserved2 : 
> 
> DSDT:
> 
> ...
> +Processor (C0FE, 0xFE, 0x, 0x00)
> +{
> ...
> +}
> +
> +Device (C0FF)
> +{
> +Name (_HID, "ACPI0007" /* Processor Device */)  // _HID: 
> Hardware ID
> +Name (_UID, 0xFF)  // _UID: Unique ID
> ...
> +}
> 
> +Device (C11F)
> +{
> +Name (_HID, "ACPI0007" /* Processor Device */)  // _HID: 
> Hardware ID
> +Name (_UID, 0x011F)  // _UID: Unique ID
> ...
> +}
> 
> APIC:
> +[034h 0052   1]Subtable Type : 00 [Processor Local APIC]
> +[035h 0053   1]   Length : 08
> +[036h 0054   1] Processor ID : 01
> +[037h 0055   1]Local Apic ID : 01
> +[038h 0056   4]Flags (decoded below) : 
> +   Processor Enabled : 0
> 
> ...
> 
> +[81Ch 2076   1]Subtable Type : 00 [Processor Local APIC]
> +[81Dh 2077   1]   Length : 08
> +[81Eh 2078   1] Processor ID : FE
> +[81Fh 2079   1]Local Apic ID : FE
> +[820h 2080   4]Flags (decoded below) : 
> +   Processor Enabled : 0
> +
> +[824h 2084   1]Subtable Type : 09 [Processor Local x2APIC]
> +[825h 2085   1]   Length : 10
> +[826h 2086   2] Reserved : 
> +[828h 2088   4]  Processor x2Apic ID : 00FF
> +[82Ch 2092   4]Flags (decoded below) : 
> +   Processor Enabled : 0
> +[830h 2096   4]Processor UID : 00FF
> 
> ...
> 
> +[A24h 2596   1]Subtable Type : 09 [Processor Local x2APIC]
> +[A25h 2597   1]   Length : 10
> +[A26h 2598   2] Reserved : 
> +[A28h 2600   4]  Processor x2Apic ID : 011F
> +[A2Ch 2604   4]Flags (decoded below) : 
> +   Processor Enabled : 0
> +[A30h 2608   4]Processor UID : 011F
> +
> +[A34h 2612   1]Subtable Type : 01 [I/O APIC]
> +[A35h 2613   1]   Length : 0C
> +[A36h 2614   1]  I/O Apic ID : 00
> +[A37h 2615   1] Reserved : 00
> +[A38h 2616   4]  Address : FEC0
> +[A3Ch 2620   4]Interrupt : 
> +
> +[A40h 2624   1]Subtable Type : 02 [Interrupt Source Override]
> +[A41h 2625   1]   Length : 0A
> +[A42h 2626   1]  Bus : 00
> +[A43h 2627   1]   Source : 00
> +[A44h 2628   4]Interrupt : 0002
> +[A48h 2632   2]Flags (decoded below) : 
>  Polarity : 0
>   

Re: [PATCH] hw/arm/sbsa-ref: Fixed cpu type error message typo.

2021-10-18 Thread Richard Henderson

On 10/7/21 11:36 PM, Shuuichirou Ishii wrote:

Signed-off-by: Shuuichirou Ishii 
---
  hw/arm/sbsa-ref.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/arm/sbsa-ref.c b/hw/arm/sbsa-ref.c
index 509c5f09b4..358714bd3e 100644
--- a/hw/arm/sbsa-ref.c
+++ b/hw/arm/sbsa-ref.c
@@ -670,7 +670,7 @@ static void sbsa_ref_init(MachineState *machine)
  int n, sbsa_max_cpus;
  
  if (!cpu_type_valid(machine->cpu_type)) {

-error_report("mach-virt: CPU type %s not supported", machine->cpu_type);
+error_report("sbsa-ref: CPU type %s not supported", machine->cpu_type);
  exit(1);
  }


Queued to target-arm.next.

r~




Re: [PATCH v4 07/12] virtiofsd: Let lo_inode_open() return a TempFd

2021-10-18 Thread Vivek Goyal
On Thu, Sep 16, 2021 at 10:40:40AM +0200, Hanna Reitz wrote:
> Strictly speaking, this is not necessary, because lo_inode_open() will
> always return a new FD owned by the caller, so TempFd.owned will always
> be true.
> 
> The auto-cleanup is nice, though.  Also, we get a more unified interface
> where you always get a TempFd when you need an FD for an lo_inode
> (regardless of whether it is an O_PATH FD or a non-O_PATH FD).
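
For readers following the owned/borrowed distinction, here is a minimal standalone sketch of the TempFd semantics as they appear in the quoted hunks. It is not the virtiofsd code: the GCC/Clang cleanup attribute stands in for g_auto(TempFd), the TEMP_FD_INIT definition and the dup() fallback in temp_fd_steal() are assumptions, and demo_borrow() is a hypothetical helper.

```c
#include <stdbool.h>
#include <unistd.h>

typedef struct TempFd {
    int fd;
    bool owned;   /* true if this object is responsible for closing fd */
} TempFd;

/* Assumed definition: an empty, non-owning TempFd. */
#define TEMP_FD_INIT ((TempFd){ .fd = -1, .owned = false })

/* Close the fd only if this TempFd owns it, then reset the object. */
static void temp_fd_clear(TempFd *t)
{
    if (t->owned && t->fd >= 0) {
        close(t->fd);
    }
    *t = TEMP_FD_INIT;
}

/* Hand out an fd that the caller must close; *t stops owning it. */
static int temp_fd_steal(TempFd *t)
{
    if (t->owned) {
        t->owned = false;
        return t->fd;
    }
    return dup(t->fd);   /* assumption: a borrowed fd is duplicated */
}

/* Create a borrowing copy: same fd number, never closed by *to. */
static void temp_fd_copy(const TempFd *from, TempFd *to)
{
    *to = (TempFd){ .fd = from->fd, .owned = false };
}

/* Stand-in for g_auto(TempFd): run temp_fd_clear() at scope exit. */
#define AUTO_TEMP_FD __attribute__((cleanup(temp_fd_clear))) TempFd

/* Own raw_fd, lend it to a second TempFd, return 0 if both see the same fd. */
static int demo_borrow(int raw_fd)
{
    AUTO_TEMP_FD owner = { .fd = raw_fd, .owned = true };
    AUTO_TEMP_FD borrowed = TEMP_FD_INIT;

    temp_fd_copy(&owner, &borrowed);
    return borrowed.fd == owner.fd ? 0 : -1;
    /* Scope exit: borrowed is cleared without closing, owner closes raw_fd. */
}
```

In this model the fd is closed exactly once, by the owning object, no matter how many borrowing copies exist.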
> 
> Signed-off-by: Hanna Reitz 
> ---
>  tools/virtiofsd/passthrough_ll.c | 156 +++
>  1 file changed, 75 insertions(+), 81 deletions(-)
> 
> diff --git a/tools/virtiofsd/passthrough_ll.c b/tools/virtiofsd/passthrough_ll.c
> index 3bf20b8659..d257eda129 100644
> --- a/tools/virtiofsd/passthrough_ll.c
> +++ b/tools/virtiofsd/passthrough_ll.c
> @@ -293,10 +293,8 @@ static void temp_fd_clear(TempFd *temp_fd)
>  /**
>   * Return an owned fd from *temp_fd that will not be closed when
>   * *temp_fd goes out of scope.
> - *
> - * (TODO: Remove __attribute__ once this is used.)
>   */
> -static __attribute__((unused)) int temp_fd_steal(TempFd *temp_fd)
> +static int temp_fd_steal(TempFd *temp_fd)
>  {
>  if (temp_fd->owned) {
>  temp_fd->owned = false;
> @@ -309,10 +307,8 @@ static __attribute__((unused)) int temp_fd_steal(TempFd 
> *temp_fd)
>  /**
>   * Create a borrowing copy of an existing TempFd.  Note that *to is
>   * only valid as long as *from is valid.
> - *
> - * (TODO: Remove __attribute__ once this is used.)
>   */
> -static __attribute__((unused)) void temp_fd_copy(const TempFd *from, TempFd 
> *to)
> +static void temp_fd_copy(const TempFd *from, TempFd *to)
>  {
>  *to = (TempFd) {
>  .fd = from->fd,
> @@ -689,9 +685,12 @@ static int lo_fd(fuse_req_t req, fuse_ino_t ino, TempFd 
> *tfd)
>   * when a malicious client opens special files such as block device nodes.
>   * Symlink inodes are also rejected since symlinks must already have been
>   * traversed on the client side.
> + *
> + * The fd is returned in tfd->fd.  The return value is 0 on success and 
> -errno
> + * otherwise.
>   */
>  static int lo_inode_open(struct lo_data *lo, struct lo_inode *inode,
> - int open_flags)
> + int open_flags, TempFd *tfd)
>  {
>  g_autofree char *fd_str = g_strdup_printf("%d", inode->fd);
>  int fd;
> @@ -710,7 +709,13 @@ static int lo_inode_open(struct lo_data *lo, struct 
> lo_inode *inode,
>  if (fd < 0) {
>  return -errno;
>  }
> -return fd;
> +
> +*tfd = (TempFd) {
> +.fd = fd,
> +.owned = true,
> +};
> +
> +return 0;
>  }
>  
>  static void lo_init(void *userdata, struct fuse_conn_info *conn)
> @@ -854,7 +859,8 @@ static int lo_fi_fd(fuse_req_t req, struct fuse_file_info 
> *fi)
>  static void lo_setattr(fuse_req_t req, fuse_ino_t ino, struct stat *attr,
> int valid, struct fuse_file_info *fi)
>  {
> -g_auto(TempFd) path_fd = TEMP_FD_INIT;
> +g_auto(TempFd) path_fd = TEMP_FD_INIT; /* at least an O_PATH fd */

What does "at least an O_PATH fd" mean?

> +g_auto(TempFd) rw_fd = TEMP_FD_INIT; /* O_RDWR fd */
>  int saverr;
>  char procname[64];
>  struct lo_data *lo = lo_data(req);
> @@ -868,7 +874,15 @@ static void lo_setattr(fuse_req_t req, fuse_ino_t ino, 
> struct stat *attr,
>  return;
>  }
>  
> -res = lo_inode_fd(inode, &path_fd);
> +if (!fi && (valid & FUSE_SET_ATTR_SIZE)) {
> +/* We need an O_RDWR FD for ftruncate() */
> +res = lo_inode_open(lo, inode, O_RDWR, &rw_fd);
> +if (res >= 0) {
> +temp_fd_copy(&rw_fd, &path_fd);

I am lost here. If lo_inode_open() failed, why are we calling this
temp_fd_copy()? path_fd is not even a valid fd yet.

It still beats me why rw_fd is opened here instead of down in the
FUSE_SET_ATTR_SIZE block. I think we had this discussion and you
had some reasons to move it up.

Vivek

> +}
> +} else {
> +res = lo_inode_fd(inode, &path_fd);
> +}
>  if (res < 0) {
>  saverr = -res;
>  goto out_err;
> @@ -916,18 +930,12 @@ static void lo_setattr(fuse_req_t req, fuse_ino_t ino, 
> struct stat *attr,
>  if (fi) {
>  truncfd = fd;
>  } else {
> -truncfd = lo_inode_open(lo, inode, O_RDWR);
> -if (truncfd < 0) {
> -saverr = -truncfd;
> -goto out_err;
> -}
> +assert(rw_fd.fd >= 0);
> +truncfd = rw_fd.fd;
>  }
>  
>  saverr = drop_security_capability(lo, truncfd);
>  if (saverr) {
> -if (!fi) {
> -close(truncfd);
> -}
>  goto out_err;
>  }
>  
> @@ -935,9 +943,6 @@ static void lo_setattr(fuse_req_t req, fuse_ino_t ino, 
> struct stat *attr,
>  res = drop_effective_cap("FSETID", &cap_fsetid_dropped);
>  if (res != 0) {
>  saverr = res;
> 

[PULL v2 18/23] bsd-user/target_os_elf: If ELF_HWCAP2 is defined, publish it

2021-10-18 Thread Warner Losh
Some architectures publish AT_HWCAP2 as well as AT_HWCAP. Those
architectures will define ELF_HWCAP2 in their target_arch_elf.h files
for the value for this process. If it is defined, then publish it.
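
A minimal sketch of the conditional publishing described above, outside of QEMU. The value 25 for FREEBSD_AT_HWCAP comes from the header quoted elsewhere in this series; the value 26 for FREEBSD_AT_HWCAP2, the placeholder capability bits, struct aux_ent and fill_hwcap_entries() are all assumptions made for the example.

```c
#include <stddef.h>

#define FREEBSD_AT_HWCAP  25   /* as in the quoted target_os_elf.h */
#define FREEBSD_AT_HWCAP2 26   /* assumption: FreeBSD's AT_HWCAP2 number */

/* Placeholder values; a real target_arch_elf.h would compute these. */
#define ELF_HWCAP  0x1UL
#define ELF_HWCAP2 0x2UL       /* remove to model an arch without HWCAP2 */

struct aux_ent {
    unsigned long type;
    unsigned long val;
};

/* Write the HWCAP entries into out[] and return how many were written. */
static size_t fill_hwcap_entries(struct aux_ent *out)
{
    size_t n = 0;

    out[n++] = (struct aux_ent){ FREEBSD_AT_HWCAP, ELF_HWCAP };
#ifdef ELF_HWCAP2
    /* Only architectures that define ELF_HWCAP2 publish the second entry. */
    out[n++] = (struct aux_ent){ FREEBSD_AT_HWCAP2, ELF_HWCAP2 };
#endif
    return n;
}
```

An architecture that does not define ELF_HWCAP2 compiles the same function down to a single entry, so no per-architecture stub is needed.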

Signed-off-by: Warner Losh 
Reviewed-by: Richard Henderson 
Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Kyle Evans 
---
 bsd-user/freebsd/target_os_elf.h | 4 
 1 file changed, 4 insertions(+)

diff --git a/bsd-user/freebsd/target_os_elf.h b/bsd-user/freebsd/target_os_elf.h
index adcffd1ddb..e5ac8e8e50 100644
--- a/bsd-user/freebsd/target_os_elf.h
+++ b/bsd-user/freebsd/target_os_elf.h
@@ -112,6 +112,10 @@ static abi_ulong target_create_elf_tables(abi_ulong p, int 
argc, int envc,
 NEW_AUX_ENT(AT_ENTRY, load_bias + exec->e_entry);
 features = ELF_HWCAP;
 NEW_AUX_ENT(FREEBSD_AT_HWCAP, features);
+#ifdef ELF_HWCAP2
+features = ELF_HWCAP2;
+NEW_AUX_ENT(FREEBSD_AT_HWCAP2, features);
+#endif
 NEW_AUX_ENT(AT_UID, (abi_ulong)getuid());
 NEW_AUX_ENT(AT_EUID, (abi_ulong)geteuid());
 NEW_AUX_ENT(AT_GID, (abi_ulong)getgid());
-- 
2.32.0




[PULL v2 20/23] bsd-user: Add stop_all_tasks

2021-10-18 Thread Warner Losh
Similar to the same function in linux-user: this stops all the current tasks.

Signed-off-by: Stacey Son 
Signed-off-by: Warner Losh 
Reviewed-by: Richard Henderson 
Reviewed-by: Kyle Evans 
---
 bsd-user/main.c | 9 +
 bsd-user/qemu.h | 1 +
 2 files changed, 10 insertions(+)

diff --git a/bsd-user/main.c b/bsd-user/main.c
index ee84554854..cb5ea40236 100644
--- a/bsd-user/main.c
+++ b/bsd-user/main.c
@@ -195,6 +195,15 @@ static void usage(void)
 
 __thread CPUState *thread_cpu;
 
+void stop_all_tasks(void)
+{
+/*
+ * We trust when using NPTL (pthreads) start_exclusive() handles thread
+ * stopping correctly.
+ */
+start_exclusive();
+}
+
 bool qemu_cpu_is_self(CPUState *cpu)
 {
 return thread_cpu == cpu;
diff --git a/bsd-user/qemu.h b/bsd-user/qemu.h
index c1170f14d9..cdb85140f4 100644
--- a/bsd-user/qemu.h
+++ b/bsd-user/qemu.h
@@ -103,6 +103,7 @@ typedef struct TaskState {
 } __attribute__((aligned(16))) TaskState;
 
 void init_task_state(TaskState *ts);
+void stop_all_tasks(void);
 extern const char *qemu_uname_release;
 
 /*
-- 
2.32.0




[PULL v2 23/23] bsd-user/signal: Create a dummy signal queueing function

2021-10-18 Thread Warner Losh
Create dummy signal queueing function so we can start to integrate other
architectures (at the cost of signals remaining broken) to tame the
dependency graph a bit and to bring in signals in a more controlled
fashion.  Log unimplemented events to it in the meantime.

Signed-off-by: Warner Losh 
Reviewed-by: Richard Henderson 
Reviewed-by: Kyle Evans 
---
 bsd-user/qemu.h   |  2 +-
 bsd-user/signal.c | 11 ++-
 2 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/bsd-user/qemu.h b/bsd-user/qemu.h
index ba15b1b56f..1b3b974afe 100644
--- a/bsd-user/qemu.h
+++ b/bsd-user/qemu.h
@@ -17,7 +17,6 @@
 #ifndef QEMU_H
 #define QEMU_H
 
-
 #include "qemu/osdep.h"
 #include "cpu.h"
 #include "qemu/units.h"
@@ -209,6 +208,7 @@ void process_pending_signals(CPUArchState *cpu_env);
 void signal_init(void);
 long do_sigreturn(CPUArchState *env);
 long do_rt_sigreturn(CPUArchState *env);
+void queue_signal(CPUArchState *env, int sig, target_siginfo_t *info);
 abi_long do_sigaltstack(abi_ulong uss_addr, abi_ulong uoss_addr, abi_ulong sp);
 
 /* mmap.c */
diff --git a/bsd-user/signal.c b/bsd-user/signal.c
index ad6d935569..0c1093deb1 100644
--- a/bsd-user/signal.c
+++ b/bsd-user/signal.c
@@ -16,10 +16,19 @@
  *  You should have received a copy of the GNU General Public License
  *  along with this program; if not, see <http://www.gnu.org/licenses/>.
  */
-#include "qemu/osdep.h"
 
+#include "qemu/osdep.h"
 #include "qemu.h"
 
+/*
+ * Queue a signal so that it will be sent to the virtual CPU as soon as
+ * possible.
+ */
+void queue_signal(CPUArchState *env, int sig, target_siginfo_t *info)
+{
+qemu_log_mask(LOG_UNIMP, "No signal queueing, dropping signal %d\n", sig);
+}
+
 void signal_init(void)
 {
 }
-- 
2.32.0




[PULL v2 19/23] bsd-user: Remove used from TaskState

2021-10-18 Thread Warner Losh
The 'used' field in TaskState is write only. Remove it from TaskState.

Signed-off-by: Warner Losh 
Reviewed-by: Richard Henderson 
Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Kyle Evans 
---
 bsd-user/main.c | 1 -
 bsd-user/qemu.h | 1 -
 2 files changed, 2 deletions(-)

diff --git a/bsd-user/main.c b/bsd-user/main.c
index 48643eeabc..ee84554854 100644
--- a/bsd-user/main.c
+++ b/bsd-user/main.c
@@ -210,7 +210,6 @@ void init_task_state(TaskState *ts)
 {
 int i;
 
-ts->used = 1;
 ts->first_free = ts->sigqueue_table;
 for (i = 0; i < MAX_SIGQUEUE_SIZE - 1; i++) {
 ts->sigqueue_table[i].next = &ts->sigqueue_table[i + 1];
diff --git a/bsd-user/qemu.h b/bsd-user/qemu.h
index 3b8475394c..c1170f14d9 100644
--- a/bsd-user/qemu.h
+++ b/bsd-user/qemu.h
@@ -92,7 +92,6 @@ typedef struct TaskState {
 
 struct TaskState *next;
 struct bsd_binprm *bprm;
-int used; /* non zero if used */
 struct image_info *info;
 
 struct emulated_sigtable sigtab[TARGET_NSIG];
-- 
2.32.0




[PULL v2 22/23] bsd-user: Rename sigqueue to qemu_sigqueue

2021-10-18 Thread Warner Losh
To avoid a name clash with FreeBSD's sigqueue data structure in
signalvar.h, rename sigqueue to qemu_sigqueue. This structure
is currently defined, but unused.
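
For context, init_task_state() (quoted in patch 19/23 of this series) chains the table entries into a free list. The sketch below shows how such a list could be consumed once signal queueing is implemented. It is an assumption about future use, not code from this series: toy_task, toy_alloc() and toy_free() are hypothetical, the table is shrunk to 8 entries, and an int stands in for target_siginfo_t.

```c
#include <stddef.h>

#define TOY_SIGQUEUE_SIZE 8    /* MAX_SIGQUEUE_SIZE is 1024 in qemu.h */

struct qemu_sigqueue {
    struct qemu_sigqueue *next;
    int info;                  /* stand-in for target_siginfo_t */
};

struct toy_task {
    struct qemu_sigqueue table[TOY_SIGQUEUE_SIZE];
    struct qemu_sigqueue *first_free;
};

/* Link every table entry into a singly linked free list. */
static void toy_init(struct toy_task *ts)
{
    ts->first_free = ts->table;
    for (int i = 0; i < TOY_SIGQUEUE_SIZE - 1; i++) {
        ts->table[i].next = &ts->table[i + 1];
    }
    ts->table[TOY_SIGQUEUE_SIZE - 1].next = NULL;
}

/* Pop an entry from the free list; NULL means the table is exhausted. */
static struct qemu_sigqueue *toy_alloc(struct toy_task *ts)
{
    struct qemu_sigqueue *q = ts->first_free;

    if (q) {
        ts->first_free = q->next;
    }
    return q;
}

/* Push an entry back onto the free list. */
static void toy_free(struct toy_task *ts, struct qemu_sigqueue *q)
{
    q->next = ts->first_free;
    ts->first_free = q;
}
```

Allocation and release are both constant time and need no heap allocation, which is presumably why a fixed table with a free list was chosen for something that may run in signal context.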

Signed-off-by: Warner Losh 
Reviewed-by: Richard Henderson 
Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Kyle Evans 
---
 bsd-user/qemu.h | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/bsd-user/qemu.h b/bsd-user/qemu.h
index e65e41d53d..ba15b1b56f 100644
--- a/bsd-user/qemu.h
+++ b/bsd-user/qemu.h
@@ -73,15 +73,15 @@ struct image_info {
 
 #define MAX_SIGQUEUE_SIZE 1024
 
-struct sigqueue {
-struct sigqueue *next;
+struct qemu_sigqueue {
+struct qemu_sigqueue *next;
+target_siginfo_t info;
 };
 
 struct emulated_sigtable {
 int pending; /* true if signal is pending */
-struct sigqueue *first;
-/* in order to always have memory for the first signal, we put it here */
-struct sigqueue info;
+struct qemu_sigqueue *first;
+struct qemu_sigqueue info;  /* Put first signal info here */
 };
 
 /*
@@ -95,8 +95,8 @@ typedef struct TaskState {
 struct image_info *info;
 
 struct emulated_sigtable sigtab[TARGET_NSIG];
-struct sigqueue sigqueue_table[MAX_SIGQUEUE_SIZE]; /* siginfo queue */
-struct sigqueue *first_free; /* first free siginfo queue entry */
+struct qemu_sigqueue sigqueue_table[MAX_SIGQUEUE_SIZE]; /* siginfo queue */
+struct qemu_sigqueue *first_free; /* first free siginfo queue entry */
 int signal_pending; /* non zero if a signal may be pending */
 
 uint8_t stack[];
-- 
2.32.0




[PULL v2 17/23] bsd-user/target_os_elf.h: Remove fallback ELF_HWCAP and reorder

2021-10-18 Thread Warner Losh
All architectures have a ELF_HWCAP, so remove the fallback ifdef.
Place ELF_HWCAP in the same order as on native FreeBSD.

Signed-off-by: Warner Losh 
Reviewed-by: Richard Henderson 
Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Kyle Evans 
---
 bsd-user/freebsd/target_os_elf.h | 8 ++--
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/bsd-user/freebsd/target_os_elf.h b/bsd-user/freebsd/target_os_elf.h
index 2d03a883aa..adcffd1ddb 100644
--- a/bsd-user/freebsd/target_os_elf.h
+++ b/bsd-user/freebsd/target_os_elf.h
@@ -38,10 +38,6 @@
 #define ELF_PLATFORM (NULL)
 #endif
 
-#ifndef ELF_HWCAP
-#define ELF_HWCAP 0
-#endif
-
 /* XXX Look at the other conflicting AT_* values. */
 #define FREEBSD_AT_NCPUS 19
 #define FREEBSD_AT_HWCAP 25
@@ -114,12 +110,12 @@ static abi_ulong target_create_elf_tables(abi_ulong p, 
int argc, int envc,
 NEW_AUX_ENT(AT_FLAGS, (abi_ulong)0);
 NEW_AUX_ENT(FREEBSD_AT_NCPUS, (abi_ulong)bsd_get_ncpu());
 NEW_AUX_ENT(AT_ENTRY, load_bias + exec->e_entry);
+features = ELF_HWCAP;
+NEW_AUX_ENT(FREEBSD_AT_HWCAP, features);
 NEW_AUX_ENT(AT_UID, (abi_ulong)getuid());
 NEW_AUX_ENT(AT_EUID, (abi_ulong)geteuid());
 NEW_AUX_ENT(AT_GID, (abi_ulong)getgid());
 NEW_AUX_ENT(AT_EGID, (abi_ulong)getegid());
-features = ELF_HWCAP;
-NEW_AUX_ENT(FREEBSD_AT_HWCAP, features);
 target_auxents = sp; /* Note where the aux entries are in the target */
 #ifdef ARCH_DLINFO
 /*
-- 
2.32.0




[PULL v2 13/23] bsd-user: TARGET_CPU_RESET define is unused, remove it

2021-10-18 Thread Warner Losh
Signed-off-by: Warner Losh 
Reviewed-by: Richard Henderson 
Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Kyle Evans 
---
 bsd-user/i386/target_arch_cpu.h   | 2 --
 bsd-user/x86_64/target_arch_cpu.h | 2 --
 2 files changed, 4 deletions(-)

diff --git a/bsd-user/i386/target_arch_cpu.h b/bsd-user/i386/target_arch_cpu.h
index 978e8066af..b28602adbb 100644
--- a/bsd-user/i386/target_arch_cpu.h
+++ b/bsd-user/i386/target_arch_cpu.h
@@ -23,8 +23,6 @@
 
 #define TARGET_DEFAULT_CPU_MODEL "qemu32"
 
-#define TARGET_CPU_RESET(cpu)
-
 static inline void target_cpu_init(CPUX86State *env,
 struct target_pt_regs *regs)
 {
diff --git a/bsd-user/x86_64/target_arch_cpu.h b/bsd-user/x86_64/target_arch_cpu.h
index 5f5ee602f9..5172b230f0 100644
--- a/bsd-user/x86_64/target_arch_cpu.h
+++ b/bsd-user/x86_64/target_arch_cpu.h
@@ -23,8 +23,6 @@
 
 #define TARGET_DEFAULT_CPU_MODEL "qemu64"
 
-#define TARGET_CPU_RESET(cpu)
-
 static inline void target_cpu_init(CPUX86State *env,
 struct target_pt_regs *regs)
 {
-- 
2.32.0




[PULL v2 16/23] bsd-user: move TARGET_MC_GET_CLEAR_RET to target_os_signal.h

2021-10-18 Thread Warner Losh
Move TARGET_MC_GET_CLEAR_RET to freebsd/target_os_signal.h since it's
architecture agnostic on FreeBSD.

Signed-off-by: Warner Losh 
Reviewed-by: Richard Henderson 
Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Kyle Evans 
---
 bsd-user/freebsd/target_os_signal.h  | 3 +++
 bsd-user/i386/target_arch_signal.h   | 2 --
 bsd-user/x86_64/target_arch_signal.h | 2 --
 3 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/bsd-user/freebsd/target_os_signal.h b/bsd-user/freebsd/target_os_signal.h
index 3ed454e086..1a4c5faf19 100644
--- a/bsd-user/freebsd/target_os_signal.h
+++ b/bsd-user/freebsd/target_os_signal.h
@@ -1,6 +1,9 @@
 #ifndef _TARGET_OS_SIGNAL_H_
 #define _TARGET_OS_SIGNAL_H_
 
+/* FreeBSD's sys/ucontext.h defines this */
+#define TARGET_MC_GET_CLEAR_RET 0x0001
+
 #include "target_os_siginfo.h"
 #include "target_arch_signal.h"
 
diff --git a/bsd-user/i386/target_arch_signal.h b/bsd-user/i386/target_arch_signal.h
index 9812c6b034..a90750d602 100644
--- a/bsd-user/i386/target_arch_signal.h
+++ b/bsd-user/i386/target_arch_signal.h
@@ -27,8 +27,6 @@
 #define TARGET_MINSIGSTKSZ  (512 * 4)   /* min sig stack size */
 #define TARGET_SIGSTKSZ (MINSIGSTKSZ + 32768)   /* recommended size */
 
-#define TARGET_MC_GET_CLEAR_RET 0x0001
-
 struct target_sigcontext {
 /* to be added */
 };
diff --git a/bsd-user/x86_64/target_arch_signal.h b/bsd-user/x86_64/target_arch_signal.h
index 4c1ff0e5ba..4bb753b08b 100644
--- a/bsd-user/x86_64/target_arch_signal.h
+++ b/bsd-user/x86_64/target_arch_signal.h
@@ -27,8 +27,6 @@
 #define TARGET_MINSIGSTKSZ  (512 * 4)   /* min sig stack size */
 #define TARGET_SIGSTKSZ (MINSIGSTKSZ + 32768)   /* recommended size */
 
-#define TARGET_MC_GET_CLEAR_RET 0x0001
-
 struct target_sigcontext {
 /* to be added */
 };
-- 
2.32.0




[PULL v2 21/23] bsd-user/sysarch: Move to using do_freebsd_arch_sysarch interface

2021-10-18 Thread Warner Losh
do_freebsd_arch_sysarch() exists in $ARCH/target_arch_sysarch.h for x86.
Call it from do_freebsd_sysarch() and remove the mostly duplicate
version in syscall.c. Future changes will move it to os-sys.c and
support other architectures.

Signed-off-by: Warner Losh 
Reviewed-by: Richard Henderson 
Reviewed-by: Kyle Evans 
---
 bsd-user/freebsd/meson.build |  3 +++
 bsd-user/freebsd/os-sys.c| 27 +++
 bsd-user/meson.build |  3 +++
 bsd-user/qemu.h  |  3 +++
 bsd-user/syscall.c   | 50 
 5 files changed, 36 insertions(+), 50 deletions(-)
 create mode 100644 bsd-user/freebsd/meson.build
 create mode 100644 bsd-user/freebsd/os-sys.c

diff --git a/bsd-user/freebsd/meson.build b/bsd-user/freebsd/meson.build
new file mode 100644
index 00..4b69cca7b9
--- /dev/null
+++ b/bsd-user/freebsd/meson.build
@@ -0,0 +1,3 @@
+bsd_user_ss.add(files(
+  'os-sys.c',
+))
diff --git a/bsd-user/freebsd/os-sys.c b/bsd-user/freebsd/os-sys.c
new file mode 100644
index 00..309e27b9d6
--- /dev/null
+++ b/bsd-user/freebsd/os-sys.c
@@ -0,0 +1,27 @@
+/*
+ *  FreeBSD sysctl() and sysarch() system call emulation
+ *
+ *  Copyright (c) 2013-15 Stacey D. Son
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "qemu.h"
+#include "target_arch_sysarch.h"
+
+/* sysarch() is architecture dependent. */
+abi_long do_freebsd_sysarch(void *cpu_env, abi_long arg1, abi_long arg2)
+{
+return do_freebsd_arch_sysarch(cpu_env, arg1, arg2);
+}
diff --git a/bsd-user/meson.build b/bsd-user/meson.build
index 5378f56f71..87885d91ed 100644
--- a/bsd-user/meson.build
+++ b/bsd-user/meson.build
@@ -12,3 +12,6 @@ bsd_user_ss.add(files(
   'syscall.c',
   'uaccess.c',
 ))
+
+# Pull in the OS-specific build glue, if any
+subdir(targetos)
diff --git a/bsd-user/qemu.h b/bsd-user/qemu.h
index cdb85140f4..e65e41d53d 100644
--- a/bsd-user/qemu.h
+++ b/bsd-user/qemu.h
@@ -239,6 +239,9 @@ extern unsigned long target_sgrowsiz;
 abi_long get_errno(abi_long ret);
 bool is_error(abi_long ret);
 
+/* os-sys.c */
+abi_long do_freebsd_sysarch(void *cpu_env, abi_long arg1, abi_long arg2);
+
 /* user access */
 
 #define VERIFY_READ  PAGE_READ
diff --git a/bsd-user/syscall.c b/bsd-user/syscall.c
index 2fd2ba8330..d3322760f4 100644
--- a/bsd-user/syscall.c
+++ b/bsd-user/syscall.c
@@ -88,56 +88,6 @@ static abi_long do_obreak(abi_ulong new_brk)
 return 0;
 }
 
-#if defined(TARGET_I386)
-static abi_long do_freebsd_sysarch(CPUX86State *env, int op, abi_ulong parms)
-{
-abi_long ret = 0;
-abi_ulong val;
-int idx;
-
-switch (op) {
-#ifdef TARGET_ABI32
-case TARGET_FREEBSD_I386_SET_GSBASE:
-case TARGET_FREEBSD_I386_SET_FSBASE:
-if (op == TARGET_FREEBSD_I386_SET_GSBASE)
-#else
-case TARGET_FREEBSD_AMD64_SET_GSBASE:
-case TARGET_FREEBSD_AMD64_SET_FSBASE:
-if (op == TARGET_FREEBSD_AMD64_SET_GSBASE)
-#endif
-idx = R_GS;
-else
-idx = R_FS;
-if (get_user(val, parms, abi_ulong))
-return -TARGET_EFAULT;
-cpu_x86_load_seg(env, idx, 0);
-env->segs[idx].base = val;
-break;
-#ifdef TARGET_ABI32
-case TARGET_FREEBSD_I386_GET_GSBASE:
-case TARGET_FREEBSD_I386_GET_FSBASE:
-if (op == TARGET_FREEBSD_I386_GET_GSBASE)
-#else
-case TARGET_FREEBSD_AMD64_GET_GSBASE:
-case TARGET_FREEBSD_AMD64_GET_FSBASE:
-if (op == TARGET_FREEBSD_AMD64_GET_GSBASE)
-#endif
-idx = R_GS;
-else
-idx = R_FS;
-val = env->segs[idx].base;
-if (put_user(val, parms, abi_ulong))
-return -TARGET_EFAULT;
-break;
-/* XXX handle the others... */
-default:
-ret = -TARGET_EINVAL;
-break;
-}
-return ret;
-}
-#endif
-
 #ifdef __FreeBSD__
 /*
  * XXX this uses the undocumented oidfmt interface to find the kind of
-- 
2.32.0




[PULL v2 12/23] bsd-user/strace.list: Remove support for FreeBSD versions older than 12.0

2021-10-18 Thread Warner Losh
Signed-off-by: Warner Losh 
Reviewed-by: Richard Henderson 
Reviewed-by: Kyle Evans 
---
 bsd-user/freebsd/strace.list | 11 ---
 1 file changed, 11 deletions(-)

diff --git a/bsd-user/freebsd/strace.list b/bsd-user/freebsd/strace.list
index b01b5f36e8..275d2dbe27 100644
--- a/bsd-user/freebsd/strace.list
+++ b/bsd-user/freebsd/strace.list
@@ -33,10 +33,6 @@
 { TARGET_FREEBSD_NR___syscall, "__syscall", NULL, NULL, NULL },
 { TARGET_FREEBSD_NR___sysctl, "__sysctl", NULL, print_sysctl, NULL },
 { TARGET_FREEBSD_NR__umtx_op, "_umtx_op", "%s(%#x, %d, %d, %#x, %#x)", NULL, 
NULL },
-#if defined(__FreeBSD_version) && __FreeBSD_version < 100
-{ TARGET_FREEBSD_NR__umtx_lock, "__umtx_lock", NULL, NULL, NULL },
-{ TARGET_FREEBSD_NR__umtx_unlock, "__umtx_unlock", NULL, NULL, NULL },
-#endif
 { TARGET_FREEBSD_NR_accept, "accept", "%s(%d,%#x,%#x)", NULL, NULL },
 { TARGET_FREEBSD_NR_accept4, "accept4", "%s(%d,%d,%#x,%#x)", NULL, NULL },
 { TARGET_FREEBSD_NR_access, "access", "%s(\"%s\",%#o)", NULL, NULL },
@@ -49,10 +45,6 @@
 { TARGET_FREEBSD_NR_cap_fcntls_get, "cap_fcntls_get", NULL, NULL, NULL },
 { TARGET_FREEBSD_NR_cap_fcntls_limit, "cap_fcntls_limit", NULL, NULL, NULL },
 { TARGET_FREEBSD_NR_cap_getmode, "cap_getmode", NULL, NULL, NULL },
-#if defined(__FreeBSD_version) && __FreeBSD_version < 100
-{ TARGET_FREEBSD_NR_cap_getrights, "cap_getrights", NULL, NULL, NULL },
-{ TARGET_FREEBSD_NR_cap_new, "cap_new", NULL, NULL, NULL },
-#endif
 { TARGET_FREEBSD_NR_cap_ioctls_get, "cap_ioctls_get", NULL, NULL, NULL },
 { TARGET_FREEBSD_NR_cap_ioctls_limit, "cap_ioctls_limit", NULL, NULL, NULL },
 { TARGET_FREEBSD_NR_cap_rights_limit, "cap_rights_limit", NULL, NULL, NULL },
@@ -146,9 +138,6 @@
 { TARGET_FREEBSD_NR_freebsd11_kevent, "freebsd11_kevent", NULL, NULL, NULL },
 { TARGET_FREEBSD_NR_kevent, "kevent", NULL, NULL, NULL },
 { TARGET_FREEBSD_NR_kill, "kill", NULL, NULL, NULL },
-#if defined(__FreeBSD_version) && __FreeBSD_version < 100
-{ TARGET_FREEBSD_NR_killpg, "killpg", NULL, NULL, NULL },
-#endif
 { TARGET_FREEBSD_NR_kqueue, "kqueue", NULL, NULL, NULL },
 { TARGET_FREEBSD_NR_ktrace, "ktrace", NULL, NULL, NULL },
 { TARGET_FREEBSD_NR_lchown, "lchown", NULL, NULL, NULL },
-- 
2.32.0




[PULL v2 14/23] bsd-user: export get_errno and is_error from syscall.c

2021-10-18 Thread Warner Losh
Make get_errno and is_error global so files other than syscall.c can use
them.

Signed-off-by: Warner Losh 
Reviewed-by: Richard Henderson 
Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Kyle Evans 
---
 bsd-user/qemu.h|  4 
 bsd-user/syscall.c | 10 +-
 2 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/bsd-user/qemu.h b/bsd-user/qemu.h
index 522d6c4031..3b8475394c 100644
--- a/bsd-user/qemu.h
+++ b/bsd-user/qemu.h
@@ -235,6 +235,10 @@ extern unsigned long target_dflssiz;
 extern unsigned long target_maxssiz;
 extern unsigned long target_sgrowsiz;
 
+/* syscall.c */
+abi_long get_errno(abi_long ret);
+bool is_error(abi_long ret);
+
 /* user access */
 
 #define VERIFY_READ  PAGE_READ
diff --git a/bsd-user/syscall.c b/bsd-user/syscall.c
index 372836d44d..2fd2ba8330 100644
--- a/bsd-user/syscall.c
+++ b/bsd-user/syscall.c
@@ -33,18 +33,18 @@
 static abi_ulong target_brk;
 static abi_ulong target_original_brk;
 
-static inline abi_long get_errno(abi_long ret)
+abi_long get_errno(abi_long ret)
 {
-if (ret == -1)
+if (ret == -1) {
 /* XXX need to translate host -> target errnos here */
 return -(errno);
-else
-return ret;
+}
+return ret;
 }
 
 #define target_to_host_bitmask(x, tbl) (x)
 
-static inline int is_error(abi_long ret)
+bool is_error(abi_long ret)
 {
 return (abi_ulong)ret >= (abi_ulong)(-4096);
 }
-- 
2.32.0
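
For readers outside syscall.c, here is a minimal standalone sketch of how the two exported helpers can be combined. The abi_long/abi_ulong typedefs are stand-ins for QEMU's real definitions, and sketch_close() is a hypothetical caller, not part of the patch; like the patch, it does no host-to-target errno translation.

```c
#include <errno.h>
#include <stdbool.h>
#include <unistd.h>

/* Stand-ins for QEMU's ABI types (assumption: host long is wide enough). */
typedef long abi_long;
typedef unsigned long abi_ulong;

/* Same shape as the patch: turn a -1 host result into a negative errno. */
abi_long get_errno(abi_long ret)
{
    if (ret == -1) {
        return -(errno);
    }
    return ret;
}

/* Values in the top 4096 of the unsigned range are treated as errors. */
bool is_error(abi_long ret)
{
    return (abi_ulong)ret >= (abi_ulong)(-4096);
}

/* Hypothetical caller in another file: wrap a host call, report the errno. */
abi_long sketch_close(int fd)
{
    return get_errno(close(fd));
}
```

With this shape, a caller can test is_error(sketch_close(fd)) without knowing which errno was set.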




[PULL v2 15/23] bsd-user/errno_defs.h: Add internal error numbers

2021-10-18 Thread Warner Losh
From: Stacey Son 

To emulate signals and interrupted system calls, we need to have the
same mechanisms we have in the kernel, including these errno values.

Signed-off-by: Stacey Son 
Signed-off-by: Warner Losh 
Reviewed-by: Richard Henderson 
Reviewed-by: Kyle Evans 
---
 bsd-user/errno_defs.h | 13 ++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/bsd-user/errno_defs.h b/bsd-user/errno_defs.h
index 1efa502a12..832671354f 100644
--- a/bsd-user/errno_defs.h
+++ b/bsd-user/errno_defs.h
@@ -1,6 +1,3 @@
-/*  $OpenBSD: errno.h,v 1.20 2007/09/03 14:37:52 millert Exp $  */
-/*  $NetBSD: errno.h,v 1.10 1996/01/20 01:33:53 jtc Exp $   */
-
 /*
  * Copyright (c) 1982, 1986, 1989, 1993
  *  The Regents of the University of California.  All rights reserved.
@@ -37,6 +34,9 @@
  *  @(#)errno.h 8.5 (Berkeley) 1/21/94
  */
 
+#ifndef _ERRNO_DEFS_H_
+#define _ERRNO_DEFS_H_
+
 #define TARGET_EPERM1   /* Operation not permitted */
 #define TARGET_ENOENT   2   /* No such file or directory */
 #define TARGET_ESRCH3   /* No such process */
@@ -147,3 +147,10 @@
 #define TARGET_EIDRM89  /* Identifier removed */
 #define TARGET_ENOMSG   90  /* No message of desired type 
*/
 #define TARGET_ELAST90  /* Must be equal largest errno 
*/
+
+/* Internal errors: */
+#define TARGET_EJUSTRETURN  254 /* Just return without 
modifing regs */
+#define TARGET_ERESTART 255 /* Restart syscall */
+#define TARGET_ERESTARTSYS  TARGET_ERESTART /* Linux compat */
+
+#endif /* !  _ERRNO_DEFS_H_ */
-- 
2.32.0
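
As a rough illustration of why these internal values matter, the sketch below shows one way a syscall return path could treat them. struct sketch_regs, sketch_finish_syscall() and the insn_len parameter are hypothetical and only model the idea; the real per-architecture code may differ.

```c
#define TARGET_EJUSTRETURN  254 /* Just return without modifying regs */
#define TARGET_ERESTART     255 /* Restart syscall */

/* Hypothetical, heavily simplified guest register state. */
struct sketch_regs {
    unsigned long pc;   /* guest program counter */
    long ret;           /* guest syscall return register */
};

/*
 * Apply a syscall result to the guest registers.
 *  - ERESTART: rewind the pc so the syscall instruction runs again.
 *  - EJUSTRETURN: leave every register untouched.
 *  - anything else: store the result in the return register.
 */
void sketch_finish_syscall(struct sketch_regs *regs, long result,
                           unsigned insn_len)
{
    if (result == -TARGET_ERESTART) {
        regs->pc -= insn_len;
    } else if (result != -TARGET_EJUSTRETURN) {
        regs->ret = result;
    }
}
```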




[PULL v2 10/23] meson: *-user: only descend into *-user when configured

2021-10-18 Thread Warner Losh
To increase flexibility, only descend into *-user when that is
configured. This allows *-user to selectively include directories based
on the host OS which may not exist on all hosts. Adopt Paolo's
suggestion of checking the configuration in the directories that know
about the configuration.

Message-Id: <20210926220103.1721355-2-f4...@amsat.org>
Message-Id: <20210926220103.1721355-3-f4...@amsat.org>
Signed-off-by: Philippe Mathieu-Daudé 
Signed-off-by: Warner Losh 
Acked-by: Paolo Bonzini 
Reviewed-by: Kyle Evans 
---
 bsd-user/meson.build   |  4 
 linux-user/meson.build |  4 
 meson.build| 12 
 3 files changed, 16 insertions(+), 4 deletions(-)

diff --git a/bsd-user/meson.build b/bsd-user/meson.build
index 0369549340..5378f56f71 100644
--- a/bsd-user/meson.build
+++ b/bsd-user/meson.build
@@ -1,3 +1,7 @@
+if not have_bsd_user
+   subdir_done()
+endif
+
 bsd_user_ss.add(files(
   'bsdload.c',
   'elfload.c',
diff --git a/linux-user/meson.build b/linux-user/meson.build
index 9549f81682..bf62c13e37 100644
--- a/linux-user/meson.build
+++ b/linux-user/meson.build
@@ -1,3 +1,7 @@
+if not have_linux_user
+   subdir_done()
+endif
+
 linux_user_ss.add(files(
   'elfload.c',
   'exit.c',
diff --git a/meson.build b/meson.build
index 6b7487b725..5e7946776d 100644
--- a/meson.build
+++ b/meson.build
@@ -40,12 +40,15 @@ config_host_data = configuration_data()
 genh = []
 
 target_dirs = config_host['TARGET_DIRS'].split()
-have_user = false
+have_linux_user = false
+have_bsd_user = false
 have_system = false
 foreach target : target_dirs
-  have_user = have_user or target.endswith('-user')
+  have_linux_user = have_linux_user or target.endswith('linux-user')
+  have_bsd_user = have_bsd_user or target.endswith('bsd-user')
   have_system = have_system or target.endswith('-softmmu')
 endforeach
+have_user = have_linux_user or have_bsd_user
 have_tools = 'CONFIG_TOOLS' in config_host
 have_block = have_system or have_tools
 
@@ -2595,10 +2598,11 @@ subdir('bsd-user')
 subdir('linux-user')
 subdir('ebpf')
 
-bsd_user_ss.add(files('gdbstub.c'))
+common_ss.add(libbpf)
+
 specific_ss.add_all(when: 'CONFIG_BSD_USER', if_true: bsd_user_ss)
 
-linux_user_ss.add(files('gdbstub.c', 'thunk.c'))
+linux_user_ss.add(files('thunk.c'))
 specific_ss.add_all(when: 'CONFIG_LINUX_USER', if_true: linux_user_ss)
 
 # needed for fuzzing binaries
-- 
2.32.0




[PULL v2 09/23] bsd-user/mmap.c: assert that target_mprotect cannot fail

2021-10-18 Thread Warner Losh
Similar to the equivalent linux-user change 86abac06c14. All error
conditions that target_mprotect checks are also checked by target_mmap.
EACCES cannot happen because we are just removing PROT_WRITE.  ENOMEM
should not happen because we are modifying a whole VMA (and we have
bigger problems anyway if it happens).

Fixes a Coverity false positive, where Coverity complains about
target_mprotect's return value being passed to tb_invalidate_phys_range.

Signed-off-by: Mikaël Urankar 
Signed-off-by: Warner Losh 
Reviewed-by: Richard Henderson 
Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Kyle Evans 
---
 bsd-user/mmap.c | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/bsd-user/mmap.c b/bsd-user/mmap.c
index 5b6ed5eed1..13cb32dba1 100644
--- a/bsd-user/mmap.c
+++ b/bsd-user/mmap.c
@@ -604,10 +604,7 @@ abi_long target_mmap(abi_ulong start, abi_ulong len, int 
prot,
 }
 if (!(prot & PROT_WRITE)) {
 ret = target_mprotect(start, len, prot);
-if (ret != 0) {
-start = ret;
-goto the_end;
-}
+assert(ret == 0);
 }
 goto the_end;
 }
-- 
2.32.0




[PULL v2 06/23] bsd-user/mmap.c: Convert to qemu_log logging for mmap debugging

2021-10-18 Thread Warner Losh
Convert DEBUG_MMAP to qemu_log CPU_LOG_PAGE.

Signed-off-by: Warner Losh 
Reviewed-by: Richard Henderson 
Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Kyle Evans 
---
 bsd-user/mmap.c | 53 +
 1 file changed, 23 insertions(+), 30 deletions(-)

diff --git a/bsd-user/mmap.c b/bsd-user/mmap.c
index 301108ed25..face98573f 100644
--- a/bsd-user/mmap.c
+++ b/bsd-user/mmap.c
@@ -21,8 +21,6 @@
 #include "qemu.h"
 #include "qemu-common.h"
 
-//#define DEBUG_MMAP
-
 static pthread_mutex_t mmap_mutex = PTHREAD_MUTEX_INITIALIZER;
 static __thread int mmap_lock_count;
 
@@ -67,14 +65,11 @@ int target_mprotect(abi_ulong start, abi_ulong len, int 
prot)
 abi_ulong end, host_start, host_end, addr;
 int prot1, ret;
 
-#ifdef DEBUG_MMAP
-printf("mprotect: start=0x" TARGET_ABI_FMT_lx
-   "len=0x" TARGET_ABI_FMT_lx " prot=%c%c%c\n", start, len,
-   prot & PROT_READ ? 'r' : '-',
-   prot & PROT_WRITE ? 'w' : '-',
-   prot & PROT_EXEC ? 'x' : '-');
-#endif
-
+qemu_log_mask(CPU_LOG_PAGE, "mprotect: start=0x" TARGET_ABI_FMT_lx
+  " len=0x" TARGET_ABI_FMT_lx " prot=%c%c%c\n", start, len,
+  prot & PROT_READ ? 'r' : '-',
+  prot & PROT_WRITE ? 'w' : '-',
+  prot & PROT_EXEC ? 'x' : '-');
 if ((start & ~TARGET_PAGE_MASK) != 0)
 return -EINVAL;
 len = TARGET_PAGE_ALIGN(len);
@@ -391,45 +386,43 @@ abi_long target_mmap(abi_ulong start, abi_ulong len, int 
prot,
 abi_ulong ret, end, real_start, real_end, retaddr, host_offset, host_len;
 
 mmap_lock();
-#ifdef DEBUG_MMAP
-{
-printf("mmap: start=0x" TARGET_ABI_FMT_lx
-   " len=0x" TARGET_ABI_FMT_lx " prot=%c%c%c flags=",
-   start, len,
-   prot & PROT_READ ? 'r' : '-',
-   prot & PROT_WRITE ? 'w' : '-',
-   prot & PROT_EXEC ? 'x' : '-');
+if (qemu_loglevel_mask(CPU_LOG_PAGE)) {
+qemu_log("mmap: start=0x" TARGET_ABI_FMT_lx
+ " len=0x" TARGET_ABI_FMT_lx " prot=%c%c%c flags=",
+ start, len,
+ prot & PROT_READ ? 'r' : '-',
+ prot & PROT_WRITE ? 'w' : '-',
+ prot & PROT_EXEC ? 'x' : '-');
 if (flags & MAP_ALIGNMENT_MASK) {
-printf("MAP_ALIGNED(%u) ", (flags & MAP_ALIGNMENT_MASK)
->> MAP_ALIGNMENT_SHIFT);
+qemu_log("MAP_ALIGNED(%u) ",
+ (flags & MAP_ALIGNMENT_MASK) >> MAP_ALIGNMENT_SHIFT);
 }
 if (flags & MAP_GUARD) {
-printf("MAP_GUARD ");
+qemu_log("MAP_GUARD ");
 }
 if (flags & MAP_FIXED) {
-printf("MAP_FIXED ");
+qemu_log("MAP_FIXED ");
 }
 if (flags & MAP_ANON) {
-printf("MAP_ANON ");
+qemu_log("MAP_ANON ");
 }
 if (flags & MAP_EXCL) {
-printf("MAP_EXCL ");
+qemu_log("MAP_EXCL ");
 }
 if (flags & MAP_PRIVATE) {
-printf("MAP_PRIVATE ");
+qemu_log("MAP_PRIVATE ");
 }
 if (flags & MAP_SHARED) {
-printf("MAP_SHARED ");
+qemu_log("MAP_SHARED ");
 }
 if (flags & MAP_NOCORE) {
-printf("MAP_NOCORE ");
+qemu_log("MAP_NOCORE ");
 }
 if (flags & MAP_STACK) {
-printf("MAP_STACK ");
+qemu_log("MAP_STACK ");
 }
-printf("fd=%d offset=0x%llx\n", fd, offset);
+qemu_log("fd=%d offset=0x%lx\n", fd, offset);
 }
-#endif
 
 if ((flags & MAP_ANON) && fd != -1) {
 errno = EINVAL;
-- 
2.32.0
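
The pattern in this patch (check the log mask once, then emit the flag names) can be modelled outside QEMU roughly as below. This is only a sketch: SKETCH_LOG_PAGE, the SK_MAP_* values and sketch_log_mmap() are made-up names, and the output goes to a caller-supplied buffer instead of through qemu_log().

```c
#include <stddef.h>
#include <stdio.h>

#define SKETCH_LOG_PAGE 0x1     /* hypothetical log mask bit */

/* Hypothetical flag values, chosen only for this example. */
enum {
    SK_MAP_SHARED  = 0x0001,
    SK_MAP_PRIVATE = 0x0002,
    SK_MAP_FIXED   = 0x0010,
    SK_MAP_ANON    = 0x1000
};

/*
 * Format the set flags into 'out' when the page-log bit is enabled.
 * Returns 1 if something was logged, 0 if logging is masked off.
 */
int sketch_log_mmap(int loglevel, int flags, int fd, char *out, size_t size)
{
    static const struct { int bit; const char *name; } names[] = {
        { SK_MAP_FIXED,   "MAP_FIXED" },
        { SK_MAP_ANON,    "MAP_ANON" },
        { SK_MAP_PRIVATE, "MAP_PRIVATE" },
        { SK_MAP_SHARED,  "MAP_SHARED" },
    };
    size_t used = 0;
    size_t i;

    if (size == 0) {
        return 0;
    }
    out[0] = '\0';
    if (!(loglevel & SKETCH_LOG_PAGE)) {
        return 0;               /* masked off: skip all formatting work */
    }
    for (i = 0; i < sizeof(names) / sizeof(names[0]); i++) {
        if ((flags & names[i].bit) && used < size) {
            used += snprintf(out + used, size - used, "%s ", names[i].name);
        }
    }
    if (used < size) {
        snprintf(out + used, size - used, "fd=%d\n", fd);
    }
    return 1;
}
```

Checking the mask before formatting keeps the cost near zero when page logging is disabled, which is the same reason the patch wraps the multi-call case in a single mask test.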




[PULL v2 07/23] bsd-user/mmap.c: Don't mmap fd == -1 independently from MAP_ANON flag

2021-10-18 Thread Warner Losh
Replace checks for !(flags & MAP_ANONYMOUS) with checks for fd != -1.
MAP_STACK and MAP_GUARD both require fd == -1 and don't require mapping
the fd either. Add analysis from Guy Yur detailing the different cases
for MAP_GUARD and MAP_STACK.

Signed-off-by: Guy Yur 
[ partially merged before, finishing the job and documenting origin]
Signed-off-by: Warner Losh 
Reviewed-by: Richard Henderson 
Reviewed-by: Kyle Evans 
---
 bsd-user/mmap.c | 30 +-
 1 file changed, 25 insertions(+), 5 deletions(-)

diff --git a/bsd-user/mmap.c b/bsd-user/mmap.c
index face98573f..4ecd949a10 100644
--- a/bsd-user/mmap.c
+++ b/bsd-user/mmap.c
@@ -127,7 +127,27 @@ error:
 return ret;
 }
 
-/* map an incomplete host page */
+/*
+ * map an incomplete host page
+ *
+ * mmap_frag can be called with a valid fd, if flags doesn't contain one of
+ * MAP_ANON, MAP_STACK, MAP_GUARD. If we need to map a page in those cases, we
+ * pass fd == -1. However, if flags contains MAP_GUARD then MAP_ANON cannot be
+ * added.
+ *
+ * * If fd is valid (not -1) we want to map the pages with MAP_ANON.
+ * * If flags contains MAP_GUARD we don't want to add MAP_ANON because it
+ *   will be rejected.  See kern_mmap's enforcing of constraints for MAP_GUARD
+ *   in sys/vm/vm_mmap.c.
+ * * If flags contains MAP_ANON it doesn't matter if we add it or not.
+ * * If flags contains MAP_STACK, mmap adds MAP_ANON when called so doesn't
+ *   matter if we add it or not either. See enforcing of constraints for
+ *   MAP_STACK in kern_mmap.
+ *
+ * Don't add MAP_ANON for the flags that use fd == -1 without specifying the
+ * flags directly, with the assumption that future flags that require fd == -1
+ * will also not require MAP_ANON.
+ */
 static int mmap_frag(abi_ulong real_start,
  abi_ulong start, abi_ulong end,
  int prot, int flags, int fd, abi_ulong offset)
@@ -147,9 +167,9 @@ static int mmap_frag(abi_ulong real_start,
 }
 
 if (prot1 == 0) {
-/* no page was there, so we allocate one */
+/* no page was there, so we allocate one. See also above. */
 void *p = mmap(host_start, qemu_host_page_size, prot,
-   flags | MAP_ANON, -1, 0);
+   flags | ((fd != -1) ? MAP_ANON : 0), -1, 0);
 if (p == MAP_FAILED)
 return -1;
 prot1 = prot;
@@ -157,7 +177,7 @@ static int mmap_frag(abi_ulong real_start,
 prot1 &= PAGE_BITS;
 
 prot_new = prot | prot1;
-if (!(flags & MAP_ANON)) {
+if (fd != -1) {
 /* msync() won't work here, so we return an error if write is
possible while it is a shared mapping */
 if ((flags & TARGET_BSD_MAP_FLAGMASK) == MAP_SHARED &&
@@ -565,7 +585,7 @@ abi_long target_mmap(abi_ulong start, abi_ulong len, int 
prot,
  * worst case: we cannot map the file because the offset is not
  * aligned, so we read it
  */
-if (!(flags & MAP_ANON) &&
+if (fd != -1 &&
 (offset & ~qemu_host_page_mask) != (start & ~qemu_host_page_mask)) 
{
 /*
  * msync() won't work here, so we return an error if write is
-- 
2.32.0
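
The flag rule described in the mmap_frag comment can be reduced to a tiny helper, sketched below. SKETCH_MAP_ANON, SKETCH_MAP_GUARD and SKETCH_MAP_PRIVATE are stand-in constants for this example only, and sketch_frag_flags() is a hypothetical name; the patch itself inlines the expression.

```c
/* Stand-in flag values for illustration only. */
#define SKETCH_MAP_PRIVATE 0x0002
#define SKETCH_MAP_ANON    0x1000
#define SKETCH_MAP_GUARD   0x2000

/*
 * Flags to use when allocating the backing host page for a fragment.
 * Only a mapping that came with a real fd gets MAP_ANON added; callers
 * that passed fd == -1 (MAP_ANON, MAP_STACK, MAP_GUARD) keep their flags
 * unchanged, so MAP_GUARD is never combined with MAP_ANON.
 */
int sketch_frag_flags(int flags, int fd)
{
    return flags | ((fd != -1) ? SKETCH_MAP_ANON : 0);
}
```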




[PULL v2 11/23] bsd-user/target_os-user.h: Remove support for FreeBSD older than 12.0

2021-10-18 Thread Warner Losh
Signed-off-by: Warner Losh 
Reviewed-by: Richard Henderson 
Reviewed-by: Kyle Evans 
---
 bsd-user/freebsd/target_os_user.h | 100 +-
 1 file changed, 1 insertion(+), 99 deletions(-)

diff --git a/bsd-user/freebsd/target_os_user.h 
b/bsd-user/freebsd/target_os_user.h
index 95b1fa9f99..19892c5071 100644
--- a/bsd-user/freebsd/target_os_user.h
+++ b/bsd-user/freebsd/target_os_user.h
@@ -61,15 +61,7 @@ struct target_sockaddr_storage {
 /*
  * from sys/user.h
  */
-#if defined(__FreeBSD_version) && __FreeBSD_version >= 1200031
 #define TARGET_KI_NSPARE_INT2
-#elif defined(__FreeBSD_version) && __FreeBSD_version >= 110
-#define TARGET_KI_NSPARE_INT4
-#elif defined(__FreeBSD_version) && __FreeBSD_version >= 100
-#define TARGET_KI_NSPARE_INT7
-#else
-#define TARGET_KI_NSPARE_INT9
-#endif /* ! __FreeBSD_version >= 100 */
 #define TARGET_KI_NSPARE_LONG   12
 #define TARGET_KI_NSPARE_PTR6
 
@@ -116,11 +108,7 @@ struct target_kinfo_proc {
 int32_t ki_tsid;/* Terminal session ID */
 int16_t ki_jobc;/* job control counter */
 int16_t ki_spare_short1;/* unused (just here for alignment) */
-#if defined(__FreeBSD_version) && __FreeBSD_version >= 1200031
 int32_t ki_tdev__freebsd11; /* controlling tty dev */
-#else
-int32_t ki_tdev;/* controlling tty dev */
-#endif
 target_sigset_t ki_siglist; /* Signals arrived but not delivered */
 target_sigset_t ki_sigmask; /* Current signal mask */
 target_sigset_t ki_sigignore;   /* Signals being ignored */
@@ -164,45 +152,24 @@ struct target_kinfo_proc {
 int8_t  ki_nice;/* Process "nice" value */
 charki_lock;/* Process lock (prevent swap) count */
 charki_rqindex; /* Run queue index */
-#if defined(__FreeBSD_version) && __FreeBSD_version >= 110
 u_char  ki_oncpu_old;   /* Which cpu we are on (legacy) */
 u_char  ki_lastcpu_old; /* Last cpu we were on (legacy) */
-#else
-u_char  ki_oncpu;   /* Which cpu we are on */
-u_char  ki_lastcpu; /* Last cpu we were on */
-#endif /* ! __FreeBSD_version >= 110 */
-#if defined(__FreeBSD_version) && __FreeBSD_version >= 90
 charki_tdname[TARGET_TDNAMLEN + 1];  /* thread name */
-#else
-charki_ocomm[TARGET_TDNAMLEN + 1];   /* thread name */
-#endif /* ! __FreeBSD_version >= 90 */
 charki_wmesg[TARGET_WMESGLEN + 1];   /* wchan message */
 charki_login[TARGET_LOGNAMELEN + 1]; /* setlogin name */
 charki_lockname[TARGET_LOCKNAMELEN + 1]; /* lock name */
 charki_comm[TARGET_COMMLEN + 1]; /* command name */
 charki_emul[TARGET_KI_EMULNAMELEN + 1];  /* emulation name */
-#if defined(__FreeBSD_version) && __FreeBSD_version >= 90
 charki_loginclass[TARGET_LOGINCLASSLEN + 1]; /* login class */
-#endif /* ! __FreeBSD_version >= 90 */
 
-#if defined(__FreeBSD_version) && __FreeBSD_version >= 90
 charki_sparestrings[50];/* spare string space */
-#else
-charki_sparestrings[68];/* spare string space */
-#endif /* ! __FreeBSD_version >= 90 */
 int32_t ki_spareints[TARGET_KI_NSPARE_INT]; /* spare room for growth */
-#if defined(__FreeBSD_version) && __FreeBSD_version >= 1200031
- uint64_t ki_tdev;  /* controlling tty dev */
-#endif
-#if defined(__FreeBSD_version) && __FreeBSD_version >= 110
+uint64_t ki_tdev;  /* controlling tty dev */
 int32_t ki_oncpu;   /* Which cpu we are on */
 int32_t ki_lastcpu; /* Last cpu we were on */
 int32_t ki_tracer;  /* Pid of tracing process */
-#endif /* __FreeBSD_version >= 110 */
-#if defined(__FreeBSD_version) && __FreeBSD_version >= 90
 int32_t ki_flag2;   /* P2_* flags */
 int32_t ki_fibnum;  /* Default FIB number */
-#endif /* ! __FreeBSD_version >= 90 */
 uint32_tki_cr_flags;/* Credential flags */
 int32_t ki_jid; /* Process jail ID */
 int32_t ki_numthreads;  /* XXXKSE number of threads in total */
@@ -234,18 +201,8 @@ struct target_kinfo_file {
 int32_t  kf_flags;  /* Flags. */
 int32_t  kf_pad0;  /* Round to 64 bit alignment. */
 int64_t  kf_offset;  /* Seek location. */
-#if defined(__FreeBSD_version) && __FreeBSD_version < 1200031
-int32_t  kf_vnode_type;  /* Vnode type. */
-int32_t  kf_sock_domain;  /* Socket domain. */
-int32_t  kf_sock_type;  /* Socket type. */
-int32_t  kf_sock_protocol; /* Socket protocol. */
-struct target_sockaddr_storage kf_sa_local; /* Socket address. */
-struct target_sockaddr_storage kf_sa_peer; /* Peer address. */
-#endif
-#if defined(__FreeBSD_version) && __FreeBSD_version >= 90
 union {
 struct {
-#if defined(__FreeBSD_vers

[PULL v2 05/23] bsd-user/mmap.c: mmap prefer MAP_ANON for BSD

2021-10-18 Thread Warner Losh
MAP_ANON and MAP_ANONYMOUS are identical. Prefer MAP_ANON for BSD since
the file is now a confusing mix of the two.

Signed-off-by: Warner Losh 
Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Richard Henderson 
Reviewed-by: Kyle Evans 
---
 bsd-user/mmap.c | 11 +--
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/bsd-user/mmap.c b/bsd-user/mmap.c
index f0be3b12cf..301108ed25 100644
--- a/bsd-user/mmap.c
+++ b/bsd-user/mmap.c
@@ -285,7 +285,7 @@ static abi_ulong mmap_find_vma_aligned(abi_ulong start, 
abi_ulong size,
 addr = start;
 wrapped = repeat = 0;
 prev = 0;
-flags = MAP_ANONYMOUS | MAP_PRIVATE;
+flags = MAP_ANON | MAP_PRIVATE;
 if (alignment != 0) {
 flags |= MAP_ALIGNED(alignment);
 }
@@ -409,7 +409,7 @@ abi_long target_mmap(abi_ulong start, abi_ulong len, int 
prot,
 if (flags & MAP_FIXED) {
 printf("MAP_FIXED ");
 }
-if (flags & MAP_ANONYMOUS) {
+if (flags & MAP_ANON) {
 printf("MAP_ANON ");
 }
 if (flags & MAP_EXCL) {
@@ -431,7 +431,7 @@ abi_long target_mmap(abi_ulong start, abi_ulong len, int 
prot,
 }
 #endif
 
-if ((flags & MAP_ANONYMOUS) && fd != -1) {
+if ((flags & MAP_ANON) && fd != -1) {
 errno = EINVAL;
 goto fail;
 }
@@ -533,7 +533,7 @@ abi_long target_mmap(abi_ulong start, abi_ulong len, int 
prot,
  * qemu_real_host_page_size
  */
 p = mmap(g2h_untagged(start), host_len, prot,
- flags | MAP_FIXED | ((fd != -1) ? MAP_ANONYMOUS : 0), -1, 0);
+ flags | MAP_FIXED | ((fd != -1) ? MAP_ANON : 0), -1, 0);
 if (p == MAP_FAILED)
 goto fail;
 /* update start so that it points to the file position at 'offset' */
@@ -696,8 +696,7 @@ static void mmap_reserve(abi_ulong start, abi_ulong size)
 }
 if (real_start != real_end) {
 mmap(g2h_untagged(real_start), real_end - real_start, PROT_NONE,
- MAP_FIXED | MAP_ANONYMOUS | MAP_PRIVATE,
- -1, 0);
+ MAP_FIXED | MAP_ANON | MAP_PRIVATE, -1, 0);
 }
 }
 
-- 
2.32.0




[PULL v2 02/23] bsd-user/mmap.c: check pread's return value to fix warnings with _FORTIFY_SOURCE

2021-10-18 Thread Warner Losh
From: Mikaël Urankar 

Similar to the equivalent linux-user commit fb7e378cf9c, which added
checking to pread's return value. Update to current qemu standards with
{} around the if statement.

Signed-off-by: Mikaël Urankar 
Signed-off-by: Warner Losh 
Reviewed-by: Richard Henderson 
Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Kyle Evans 
---
 bsd-user/mmap.c | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/bsd-user/mmap.c b/bsd-user/mmap.c
index fc3c1480f5..4f4fa3ab46 100644
--- a/bsd-user/mmap.c
+++ b/bsd-user/mmap.c
@@ -174,7 +174,9 @@ static int mmap_frag(abi_ulong real_start,
 mprotect(host_start, qemu_host_page_size, prot1 | PROT_WRITE);
 
 /* read the corresponding file data */
-pread(fd, g2h_untagged(start), end - start, offset);
+if (pread(fd, g2h_untagged(start), end - start, offset) == -1) {
+return -1;
+}
 
 /* put final protection */
 if (prot_new != (prot1 | PROT_WRITE))
@@ -593,7 +595,9 @@ abi_long target_mmap(abi_ulong start, abi_ulong len, int 
prot,
   -1, 0);
 if (retaddr == -1)
 goto fail;
-pread(fd, g2h_untagged(start), len, offset);
+if (pread(fd, g2h_untagged(start), len, offset) == -1) {
+goto fail;
+}
 if (!(prot & PROT_WRITE)) {
 ret = target_mprotect(start, len, prot);
 if (ret != 0) {
-- 
2.32.0
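
In isolation, the checked-pread pattern looks roughly like this. sketch_read_fragment() is a hypothetical wrapper written for this example; the patch performs the check inline in mmap_frag() and target_mmap(). Like the patch, it treats only -1 as failure and does not handle short reads.

```c
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Read 'len' bytes at 'offset' from 'fd' into 'dst'.
 * Returns 0 on success and -1 if pread() reports an error, so the
 * caller can bail out instead of silently using stale memory.
 */
int sketch_read_fragment(int fd, void *dst, size_t len, off_t offset)
{
    if (pread(fd, dst, len, offset) == -1) {
        return -1;
    }
    return 0;
}
```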




[PULL v2 08/23] bsd-user/mmap.c: Implement MAP_EXCL, required by jemalloc in head

2021-10-18 Thread Warner Losh
From: Kyle Evans 

jemalloc requires a working MAP_EXCL. Ensure that no page is double
mapped when specified. In addition, use guest_range_valid_untagged to
test for valid ranges of pages rather than an incomplete inlined version
of the test that might be wrong.

Signed-off-by: Kyle Evans 
Signed-off-by: Warner Losh 
Reviewed-by: Richard Henderson 
---
 bsd-user/mmap.c | 10 +++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/bsd-user/mmap.c b/bsd-user/mmap.c
index 4ecd949a10..5b6ed5eed1 100644
--- a/bsd-user/mmap.c
+++ b/bsd-user/mmap.c
@@ -574,12 +574,10 @@ abi_long target_mmap(abi_ulong start, abi_ulong len, int 
prot,
  * It can fail only on 64-bit host with 32-bit target.
  * On any other target/host host mmap() handles this error correctly.
  */
-#if TARGET_ABI_BITS == 32 && HOST_LONG_BITS == 64
-if ((unsigned long)start + len - 1 > (abi_ulong) -1) {
+if (!guest_range_valid_untagged(start, len)) {
 errno = EINVAL;
 goto fail;
 }
-#endif
 
 /*
  * worst case: we cannot map the file because the offset is not
@@ -614,6 +612,12 @@ abi_long target_mmap(abi_ulong start, abi_ulong len, int 
prot,
 goto the_end;
 }
 
+/* Reject the mapping if any page within the range is mapped */
+if ((flags & MAP_EXCL) && page_check_range(start, len, 0) < 0) {
+errno = EINVAL;
+goto fail;
+}
+
 /* handle the start of the mapping */
 if (start > real_start) {
 if (real_end == real_start + qemu_host_page_size) {
-- 
2.32.0
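
The intent stated in the new comment ("reject the mapping if any page within the range is mapped") can be sketched with a toy page table. Everything here is hypothetical: the bool array stands in for QEMU's page tracking, SKETCH_MAP_EXCL is a stand-in flag value, and the helpers are not the page_check_range() API used in the patch.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u
#define SKETCH_MAP_EXCL  0x4000     /* stand-in flag value */

/* True if any page overlapping [start, start + len) is already mapped. */
bool sketch_range_in_use(const bool *mapped, size_t npages,
                         size_t start, size_t len)
{
    size_t page, last;

    if (len == 0) {
        return false;
    }
    last = (start + len - 1) / SKETCH_PAGE_SIZE;
    for (page = start / SKETCH_PAGE_SIZE; page <= last; page++) {
        /* Pages outside the toy table are treated as unavailable. */
        if (page >= npages || mapped[page]) {
            return true;
        }
    }
    return false;
}

/* Returns -EINVAL when MAP_EXCL is requested over an occupied range. */
int sketch_check_excl(const bool *mapped, size_t npages, int flags,
                      size_t start, size_t len)
{
    if ((flags & SKETCH_MAP_EXCL) &&
        sketch_range_in_use(mapped, npages, start, len)) {
        return -EINVAL;
    }
    return 0;
}
```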




[PULL v2 01/23] bsd-user/mmap.c: Always zero MAP_ANONYMOUS memory in mmap_frag()

2021-10-18 Thread Warner Losh
From: Mikaël Urankar 

Similar to the equivalent linux-user commit e6deac9cf99

When mapping MAP_ANONYMOUS memory fragments, the memory still needs to
be zeroed, or it will cause issues.

Signed-off-by: Mikaël Urankar 
Signed-off-by: Warner Losh 
Reviewed-by: Richard Henderson 
Reviewed-by: Kyle Evans 
---
 bsd-user/mmap.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/bsd-user/mmap.c b/bsd-user/mmap.c
index b40ab9045f..fc3c1480f5 100644
--- a/bsd-user/mmap.c
+++ b/bsd-user/mmap.c
@@ -180,10 +180,12 @@ static int mmap_frag(abi_ulong real_start,
 if (prot_new != (prot1 | PROT_WRITE))
 mprotect(host_start, qemu_host_page_size, prot_new);
 } else {
-/* just update the protection */
 if (prot_new != prot1) {
 mprotect(host_start, qemu_host_page_size, prot_new);
 }
+if (prot_new & PROT_WRITE) {
+memset(g2h_untagged(start), 0, end - start);
+}
 }
 return 0;
 }
-- 
2.32.0




[PULL v2 00/23] Pull bsd user 20211018 patches

2021-10-18 Thread Warner Losh
The following changes since commit c148a0572130ff485cd2249fbdd1a3260d5e10a4:

  Merge remote-tracking branch 'remotes/rth/tags/pull-tcg-20211016' into 
staging (2021-10-16 11:16:28 -0700)

are available in the Git repository at:

  g...@gitlab.com:bsdimp/qemu.git tags/pull-bsd-user-20211018-pull-request

for you to fetch changes up to 5abfac277d25feb5f12332422c03ea1cb21c6aa1:

  bsd-user/signal: Create a dummy signal queueing function (2021-10-18 12:51:39 
-0600)


bsd-user pull request: merge dependencies for next architectures

Merge the dependencies for arm, aarch64, and riscv64 architectures. This joins
together two patch series:

[PATCH v2 00/15] bsd-user: misc cleanup for aarch64 import

Prepare for aarch64 support (the next architecture to be upstreamed). As the
aarch64 emulation is more complete, it relies on a number of different items.
In some cases, I've pulled in the full support from bsd-user fork. In other
cases I've created a simple stub (as is the case for signals, which have
independent changes pending, so I wanted to be as minimal as possible).  Since
all pre-12.2 support was purged from the bsd-user fork, go ahead and remove it
here. FreeBSD 11.x goes out of support at the end of the month. Remove what
little multi-version support is in upstream.

and

[PATCH v3 0/9] bsd-user mmap fixes
This series synchronizes mmap.c with the bsd-user fork. This is a mix of old bug
fixes pulled in from linux-user, as well as some newer fixes to address bugs
found in check-tcg and recent FreeBSD developments. There are also a couple of
style commits. Updated to migrate debugging to qemu_log.

as well as a couple of minor rebase tweaks. In addition, the next two
architectures I plan on upstreaming (arm and riscv64) also have their prereqs
satisfied with this request.

v2: Remove accidental module regression in patch 7 and try again.



Kyle Evans (1):
  bsd-user/mmap.c: Implement MAP_EXCL, required by jemalloc in head

Mikaël Urankar (2):
  bsd-user/mmap.c: Always zero MAP_ANONYMOUS memory in mmap_frag()
  bsd-user/mmap.c: check pread's return value to fix warnings with
_FORTIFY_SOURCE

Stacey Son (1):
  bsd-user/errno_defs.h: Add internal error numbers

Warner Losh (19):
  bsd-user/mmap.c: MAP_ symbols are defined, so no need for ifdefs
  bsd-user/mmap.c: mmap return ENOMEM on overflow
  bsd-user/mmap.c: mmap prefer MAP_ANON for BSD
  bsd-user/mmap.c: Convert to qemu_log logging for mmap debugging
  bsd-user/mmap.c: Don't mmap fd == -1 independently from MAP_ANON flag
  bsd-user/mmap.c: assert that target_mprotect cannot fail
  meson: *-user: only descend into *-user when configured
  bsd-user/target_os-user.h: Remove support for FreeBSD older than 12.0
  bsd-user/strace.list: Remove support for FreeBSD versions older than
12.0
  bsd-user: TARGET_RESET define is unused, remove it
  bsd-user: export get_errno and is_error from syscall.c
  bsd-user: move TARGET_MC_GET_CLEAR_RET to target_os_signal.h
  bsd-user/target_os_elf.h: Remove fallback ELF_HWCAP and reorder
  bsd-user/target_os_elf: If ELF_HWCAP2 is defined, publish it
  bsd-user: Remove used from TaskState
  bsd-user: Add stop_all_tasks
  bsd-user/sysarch: Move to using do_freebsd_arch_sysarch interface
  bsd-user: Rename sigqueue to qemu_sigqueue
  bsd-user/signal: Create a dummy signal queueing function

 bsd-user/errno_defs.h|  13 ++-
 bsd-user/freebsd/meson.build |   3 +
 bsd-user/freebsd/os-sys.c|  27 +
 bsd-user/freebsd/strace.list |  11 --
 bsd-user/freebsd/target_os_elf.h |  12 +--
 bsd-user/freebsd/target_os_signal.h  |   3 +
 bsd-user/freebsd/target_os_user.h| 100 +--
 bsd-user/i386/target_arch_cpu.h  |   2 -
 bsd-user/i386/target_arch_signal.h   |   2 -
 bsd-user/main.c  |  10 +-
 bsd-user/meson.build |   7 ++
 bsd-user/mmap.c  | 144 +++
 bsd-user/qemu.h  |  25 +++--
 bsd-user/signal.c|  11 +-
 bsd-user/syscall.c   |  60 +--
 bsd-user/x86_64/target_arch_cpu.h|   2 -
 bsd-user/x86_64/target_arch_signal.h |   2 -
 linux-user/meson.build   |   4 +
 meson.build  |  12 ++-
 19 files changed, 187 insertions(+), 263 deletions(-)
 create mode 100644 bsd-user/freebsd/meson.build
 create mode 100644 bsd-user/freebsd/os-sys.c

-- 
2.32.0




[PULL v2 04/23] bsd-user/mmap.c: mmap return ENOMEM on overflow

2021-10-18 Thread Warner Losh
mmap should return ENOMEM on len overflow rather than EINVAL. Return
EINVAL when len == 0 and ENOMEM when the length rounded up to a page is 0.
Found by make check-tcg.

Signed-off-by: Warner Losh 
Reviewed-by: Richard Henderson 
Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Kyle Evans 
---
 bsd-user/mmap.c | 9 -
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/bsd-user/mmap.c b/bsd-user/mmap.c
index 6f33aec58b..f0be3b12cf 100644
--- a/bsd-user/mmap.c
+++ b/bsd-user/mmap.c
@@ -455,11 +455,18 @@ abi_long target_mmap(abi_ulong start, abi_ulong len, int 
prot,
 goto fail;
 }
 
-len = TARGET_PAGE_ALIGN(len);
 if (len == 0) {
 errno = EINVAL;
 goto fail;
 }
+
+/* Check for overflows */
+len = TARGET_PAGE_ALIGN(len);
+if (len == 0) {
+errno = ENOMEM;
+goto fail;
+}
+
 real_start = start & qemu_host_page_mask;
 host_offset = offset & qemu_host_page_mask;
 
-- 
2.32.0
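
The two checks can be exercised on their own with a 32-bit length, as in the sketch below. It assumes a 4096-byte target page; SKETCH_PAGE_ALIGN and sketch_check_len() are hypothetical names that only model TARGET_PAGE_ALIGN and the ordering introduced by the patch.

```c
#include <errno.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u
#define SKETCH_PAGE_ALIGN(x) \
    (((x) + SKETCH_PAGE_SIZE - 1) & ~(SKETCH_PAGE_SIZE - 1))

/*
 * Validate an mmap length the way the patch orders the checks:
 *  - a zero length is EINVAL,
 *  - a length that wraps to zero once rounded up to a page is ENOMEM.
 * On success, returns 0 and stores the page-aligned length.
 */
int sketch_check_len(uint32_t len, uint32_t *aligned)
{
    if (len == 0) {
        return EINVAL;
    }

    /* Check for overflows: rounding up may wrap a 32-bit value to 0. */
    len = SKETCH_PAGE_ALIGN(len);
    if (len == 0) {
        return ENOMEM;
    }

    *aligned = len;
    return 0;
}
```

For example, a length of 0xFFFFF001 rounds up past 2^32 and wraps to 0, so it is reported as ENOMEM rather than EINVAL.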




[PULL v2 03/23] bsd-user/mmap.c: MAP_ symbols are defined, so no need for ifdefs

2021-10-18 Thread Warner Losh
All these MAP_ symbols are always defined on supported FreeBSD versions
(12.2 and newer), so remove the #ifdefs since they aren't needed.

Signed-off-by: Warner Losh 
Reviewed-by: Philippe Mathieu-Daudé 
Acked-by: Richard Henderson 
Reviewed-by: Kyle Evans 
---
 bsd-user/mmap.c | 14 --
 1 file changed, 14 deletions(-)

diff --git a/bsd-user/mmap.c b/bsd-user/mmap.c
index 4f4fa3ab46..6f33aec58b 100644
--- a/bsd-user/mmap.c
+++ b/bsd-user/mmap.c
@@ -286,13 +286,9 @@ static abi_ulong mmap_find_vma_aligned(abi_ulong start, abi_ulong size,
 wrapped = repeat = 0;
 prev = 0;
 flags = MAP_ANONYMOUS | MAP_PRIVATE;
-#ifdef MAP_ALIGNED
 if (alignment != 0) {
 flags |= MAP_ALIGNED(alignment);
 }
-#else
-/* XXX TODO */
-#endif
 
 for (;; prev = ptr) {
 /*
@@ -407,22 +403,18 @@ abi_long target_mmap(abi_ulong start, abi_ulong len, int prot,
 printf("MAP_ALIGNED(%u) ", (flags & MAP_ALIGNMENT_MASK)
 >> MAP_ALIGNMENT_SHIFT);
 }
-#if MAP_GUARD
 if (flags & MAP_GUARD) {
 printf("MAP_GUARD ");
 }
-#endif
 if (flags & MAP_FIXED) {
 printf("MAP_FIXED ");
 }
 if (flags & MAP_ANONYMOUS) {
 printf("MAP_ANON ");
 }
-#ifdef MAP_EXCL
 if (flags & MAP_EXCL) {
 printf("MAP_EXCL ");
 }
-#endif
 if (flags & MAP_PRIVATE) {
 printf("MAP_PRIVATE ");
 }
@@ -432,11 +424,9 @@ abi_long target_mmap(abi_ulong start, abi_ulong len, int prot,
 if (flags & MAP_NOCORE) {
 printf("MAP_NOCORE ");
 }
-#ifdef MAP_STACK
 if (flags & MAP_STACK) {
 printf("MAP_STACK ");
 }
-#endif
 printf("fd=%d offset=0x%llx\n", fd, offset);
 }
 #endif
@@ -445,7 +435,6 @@ abi_long target_mmap(abi_ulong start, abi_ulong len, int prot,
 errno = EINVAL;
 goto fail;
 }
-#ifdef MAP_STACK
 if (flags & MAP_STACK) {
 if ((fd != -1) || ((prot & (PROT_READ | PROT_WRITE)) !=
 (PROT_READ | PROT_WRITE))) {
@@ -453,8 +442,6 @@ abi_long target_mmap(abi_ulong start, abi_ulong len, int prot,
 goto fail;
 }
 }
-#endif /* MAP_STACK */
-#ifdef MAP_GUARD
 if ((flags & MAP_GUARD) && (prot != PROT_NONE || fd != -1 ||
 offset != 0 || (flags & (MAP_SHARED | MAP_PRIVATE |
 /* MAP_PREFAULT | */ /* MAP_PREFAULT not in mman.h */
@@ -462,7 +449,6 @@ abi_long target_mmap(abi_ulong start, abi_ulong len, int prot,
 errno = EINVAL;
 goto fail;
 }
-#endif
 
 if (offset & ~TARGET_PAGE_MASK) {
 errno = EINVAL;
-- 
2.32.0




Re: [PATCH v7 16/21] target/loongarch: Add disassembler

2021-10-18 Thread Philippe Mathieu-Daudé
On 10/18/21 20:33, Richard Henderson wrote:
> On 10/18/21 11:18 AM, WANG Xuerui wrote:
>> On 10/19/21 01:29, Richard Henderson wrote:
>>> On 10/18/21 8:38 AM, WANG Xuerui wrote:

 For now any implementation would suffice, and I already saw one or
 two bugs in the output during my TCG host work, but it surely would
 be nice to switch to generated decoder in the future. The
 loongarch-opcodes tables could be extended to support peculiarities
 as exhibited in the v1.00 ISA manual and binutils implementation,
 via additional attributes, and I'm open to such contributions.
>>>
>>> Perhaps it would be easiest to re-use the decodetree description?
>>> See e.g. target/openrisc/disas.c.
>>>
>> Indeed; I didn't think of disassemblers in target/ instead of
>> disas/. That would be the most elegant way forward!
> 
> 
> The one quirk will be that so far using decodetree for disas is limited
> to the target, whereas you'll want this for host as well.  It shouldn't
> be a big deal, just a small matter of the correct build rules.

Oh, good to know. OTOH I expect very few developers to look at
host disas.



Re: [PATCH v2 0/2] roms/edk2: Avoid cloning unused cmocka submodule

2021-10-18 Thread Richard Henderson

On 10/18/21 11:53 AM, Philippe Mathieu-Daudé wrote:

If you don't have anything in your queue I can send a pullreq
tomorrow, otherwise thanks for taking care of it.


I do have a few other patches already in the queue.
I'll take care of it.


r~



Re: [PATCH v2 0/2] roms/edk2: Avoid cloning unused cmocka submodule

2021-10-18 Thread Philippe Mathieu-Daudé
On 10/18/21 20:10, Richard Henderson wrote:
> On 10/18/21 3:58 AM, Philippe Mathieu-Daudé wrote:
>> cmocka website SSL certificate expired, making CI pipelines
>> fail [*]. However EDK2 images built to test QEMU don't need
>> cmocka, nor oniguruma. Avoid cloning them.
>>
>> Note: scripts/make-release is neither covered in MAINTAINERS
>>    nor in our CI.
>>
>> [*] https://gitlab.com/rth7680/qemu/-/jobs/1685387520
>> fatal: unable to access
>> 'https://git.cryptomilk.org/projects/cmocka.git/': server certificate
>> verification failed. CAfile: /etc/ssl/certs/ca-certificates.crt
>> CRLfile: none
>> fatal: clone of 'https://git.cryptomilk.org/projects/cmocka.git' into
>> submodule path 'UnitTestFrameworkPkg/Library/CmockaLib/cmocka' failed
>>
>> Since v1:
>> - Fixed typo (thuth)
>> - Added missing '--' shell separator
>>
>> Philippe Mathieu-Daudé (2):
>>    roms/edk2: Only init brotli submodule to build BaseTools
>>    roms/edk2: Only initialize required submodules
> 
> Thanks a lot.
> 
> Tested-by: Richard Henderson 
> 
> I'll queue this to target-arm.next.

If you don't have anything in your queue I can send a pullreq
tomorrow, otherwise thanks for taking care of it.




Re: [PULL 07/23] bsd-user/mmap.c: Don't mmap fd == -1 independently from MAP_ANON flag

2021-10-18 Thread Richard Henderson

On 10/18/21 11:47 AM, Warner Losh wrote:



On Mon, Oct 18, 2021 at 12:45 PM Richard Henderson wrote:


On 10/18/21 9:04 AM, Warner Losh wrote:
 > diff --git a/roms/seabios-hppa b/roms/seabios-hppa
 > index b12acac4be..73b740f771 16
 > --- a/roms/seabios-hppa
 > +++ b/roms/seabios-hppa
 > @@ -1 +1 @@
 > -Subproject commit b12acac4be27b6d5d9fbe48c4be1286dcc245fbb
 > +Subproject commit 73b740f77190643b2ada5ee97a9a108c6ef2a37b

You rebased with a stale submodule.
You'll need to fix that and recreate the pull request.


Doh! I tried really hard *NOT* to do that.

So is this PULL v2 then?


Yep.

r~



Re: [PULL 07/23] bsd-user/mmap.c: Don't mmap fd == -1 independently from MAP_ANON flag

2021-10-18 Thread Warner Losh
On Mon, Oct 18, 2021 at 12:45 PM Richard Henderson <
richard.hender...@linaro.org> wrote:

> On 10/18/21 9:04 AM, Warner Losh wrote:
> > diff --git a/roms/seabios-hppa b/roms/seabios-hppa
> > index b12acac4be..73b740f771 16
> > --- a/roms/seabios-hppa
> > +++ b/roms/seabios-hppa
> > @@ -1 +1 @@
> > -Subproject commit b12acac4be27b6d5d9fbe48c4be1286dcc245fbb
> > +Subproject commit 73b740f77190643b2ada5ee97a9a108c6ef2a37b
>
> You rebased with a stale submodule.
> You'll need to fix that and recreate the pull request.
>

Doh! I tried really hard *NOT* to do that.

So is this PULL v2 then?

Warner


Re: [PULL 07/23] bsd-user/mmap.c: Don't mmap fd == -1 independently from MAP_ANON flag

2021-10-18 Thread Richard Henderson

On 10/18/21 9:04 AM, Warner Losh wrote:

diff --git a/roms/seabios-hppa b/roms/seabios-hppa
index b12acac4be..73b740f771 16
--- a/roms/seabios-hppa
+++ b/roms/seabios-hppa
@@ -1 +1 @@
-Subproject commit b12acac4be27b6d5d9fbe48c4be1286dcc245fbb
+Subproject commit 73b740f77190643b2ada5ee97a9a108c6ef2a37b


You rebased with a stale submodule.
You'll need to fix that and recreate the pull request.



r~



Re: [PULL 00/17] MIPS patches for 2021-10-18

2021-10-18 Thread Richard Henderson

On 10/17/21 3:52 PM, Philippe Mathieu-Daudé wrote:

The following changes since commit c148a0572130ff485cd2249fbdd1a3260d5e10a4:

   Merge remote-tracking branch 'remotes/rth/tags/pull-tcg-20211016' into staging (2021-10-16 11:16:28 -0700)

are available in the Git repository at:

   https://github.com/philmd/qemu.git tags/mips-20211018

for you to fetch changes up to 2792cf20ca7eed0e354a0ed731422411faca4908:

   via-ide: Avoid using isa_get_irq() (2021-10-18 00:41:36 +0200)


MIPS patches queue

Hardware emulation:
- Generate FDT blob for Boston machine (Jiaxun)
- VIA chipset cleanups (Zoltan)

TCG:
- Use tcg_constant() in Compact branch and MSA opcodes
- Restrict nanoMIPS DSP MULT[U] opcode accumulator to Rel6
- Fix DEXTRV_S.H DSP opcode
- Remove unused TCG temporary for some DSP opcodes



BALATON Zoltan (4):
   via-ide: Set user_creatable to false
   vt82c686: Move common code to via_isa_realize
   vt82c686: Add a method to VIA_ISA to raise ISA interrupts
   via-ide: Avoid using isa_get_irq()

Jiaxun Yang (3):
   hw/mips/boston: Massage memory map information
   hw/mips/boston: Allow loading elf kernel and dtb
   hw/mips/boston: Add FDT generator

Philippe Mathieu-Daudé (10):
   target/mips: Check nanoMIPS DSP MULT[U] accumulator with Release 6
   target/mips: Remove unused register from MSA 2R/2RF instruction format
   target/mips: Use tcg_constant_i32() in gen_msa_elm_df()
   target/mips: Use tcg_constant_i32() in gen_msa_2rf()
   target/mips: Use tcg_constant_i32() in gen_msa_2r()
   target/mips: Use tcg_constant_i32() in gen_msa_3rf()
   target/mips: Use explicit extract32() calls in gen_msa_i5()
   target/mips: Use tcg_constant_tl() in gen_compute_compact_branch()
   target/mips: Fix DEXTRV_S.H DSP opcode
   target/mips: Remove unused TCG temporary in gen_mipsdsp_accinsn()

  include/hw/isa/vt82c686.h|   4 +
  hw/ide/via.c |   7 +-
  hw/isa/vt82c686.c|  75 +++--
  hw/mips/boston.c | 371 +--
  target/mips/tcg/msa_translate.c  |  51 ++--
  target/mips/tcg/translate.c  |  11 +-
  target/mips/tcg/nanomips_translate.c.inc |   6 +
  7 files changed, 415 insertions(+), 110 deletions(-)


Applied, thanks.

r~



