[PULL 0/3] hw/nvme updates

2021-09-23 Thread Klaus Jensen
From: Klaus Jensen 

Hi Peter,

The following changes since commit 2c3e83f92d93fbab071b8a96b8ab769b01902475:

  Merge remote-tracking branch 
'remotes/alistair23/tags/pull-riscv-to-apply-20210921' into staging (2021-09-21 
10:57:48 -0700)

are available in the Git repository at:

  git://git.infradead.org/qemu-nvme.git tags/nvme-next-pull-request

for you to fetch changes up to c53a9a91021c2f57de9ab18393d0048bd0fe90c2:

  hw/nvme: Return error for fused operations (2021-09-24 08:43:58 +0200)


hw/nvme updates



Klaus Jensen (1):
  hw/nvme: fix validation of ASQ and ACQ

Naveen Nagar (1):
  hw/nvme: fix verification of select field in namespace attachment

Pankaj Raghav (1):
  hw/nvme: Return error for fused operations

 hw/nvme/ctrl.c   | 31 ---
 hw/nvme/trace-events |  2 --
 include/block/nvme.h |  5 +
 3 files changed, 25 insertions(+), 13 deletions(-)

-- 
2.33.0




[PULL 2/3] hw/nvme: fix verification of select field in namespace attachment

2021-09-23 Thread Klaus Jensen
From: Naveen Nagar 

Add a check for reserved values in the select field of the namespace
attachment command.

CC: Minwoo Im 
Signed-off-by: Naveen Nagar 
Signed-off-by: Klaus Jensen 
---
 hw/nvme/ctrl.c   | 15 ---
 include/block/nvme.h |  5 +
 2 files changed, 17 insertions(+), 3 deletions(-)

diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c
index ff784851137e..dc0e7b00308e 100644
--- a/hw/nvme/ctrl.c
+++ b/hw/nvme/ctrl.c
@@ -5191,7 +5191,7 @@ static uint16_t nvme_ns_attachment(NvmeCtrl *n, 
NvmeRequest *req)
 uint16_t list[NVME_CONTROLLER_LIST_SIZE] = {};
 uint32_t nsid = le32_to_cpu(req->cmd.nsid);
 uint32_t dw10 = le32_to_cpu(req->cmd.cdw10);
-bool attach = !(dw10 & 0xf);
+uint8_t sel = dw10 & 0xf;
 uint16_t *nr_ids = &list[0];
 uint16_t *ids = &list[1];
 uint16_t ret;
@@ -5224,7 +5224,8 @@ static uint16_t nvme_ns_attachment(NvmeCtrl *n, 
NvmeRequest *req)
 return NVME_NS_CTRL_LIST_INVALID | NVME_DNR;
 }
 
-if (attach) {
+switch (sel) {
+case NVME_NS_ATTACHMENT_ATTACH:
 if (nvme_ns(ctrl, nsid)) {
 return NVME_NS_ALREADY_ATTACHED | NVME_DNR;
 }
@@ -5235,7 +5236,10 @@ static uint16_t nvme_ns_attachment(NvmeCtrl *n, 
NvmeRequest *req)
 
 nvme_attach_ns(ctrl, ns);
 nvme_select_iocs_ns(ctrl, ns);
-} else {
+
+break;
+
+case NVME_NS_ATTACHMENT_DETACH:
 if (!nvme_ns(ctrl, nsid)) {
 return NVME_NS_NOT_ATTACHED | NVME_DNR;
 }
@@ -5244,6 +5248,11 @@ static uint16_t nvme_ns_attachment(NvmeCtrl *n, 
NvmeRequest *req)
 ns->attached--;
 
 nvme_update_dmrsl(ctrl);
+
+break;
+
+default:
+return NVME_INVALID_FIELD | NVME_DNR;
 }
 
 /*
diff --git a/include/block/nvme.h b/include/block/nvme.h
index 77aae0117494..e3bd47bf76ab 100644
--- a/include/block/nvme.h
+++ b/include/block/nvme.h
@@ -1154,6 +1154,11 @@ enum NvmeIdCtrlCmic {
 NVME_CMIC_MULTI_CTRL= 1 << 1,
 };
 
+enum NvmeNsAttachmentOperation {
+NVME_NS_ATTACHMENT_ATTACH = 0x0,
+NVME_NS_ATTACHMENT_DETACH = 0x1,
+};
+
 #define NVME_CTRL_SQES_MIN(sqes) ((sqes) & 0xf)
 #define NVME_CTRL_SQES_MAX(sqes) (((sqes) >> 4) & 0xf)
 #define NVME_CTRL_CQES_MIN(cqes) ((cqes) & 0xf)
-- 
2.33.0




[PULL 3/3] hw/nvme: Return error for fused operations

2021-09-23 Thread Klaus Jensen
From: Pankaj Raghav 

Currently, FUSED operations are not supported by QEMU. As per the 1.4 SPEC,
the controller should abort a command that requests a fused operation with
an INVALID FIELD error code when fused operations are not supported.

Changes from v1:
Added the FUSE flag check to the admin command processing as well, since
FUSED operations are described in the general SQE section of the SPEC.

Signed-off-by: Pankaj Raghav 
Reviewed-by: Keith Busch 
Signed-off-by: Klaus Jensen 
---
 hw/nvme/ctrl.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c
index dc0e7b00308e..2f247a9275ca 100644
--- a/hw/nvme/ctrl.c
+++ b/hw/nvme/ctrl.c
@@ -3893,6 +3893,10 @@ static uint16_t nvme_io_cmd(NvmeCtrl *n, NvmeRequest 
*req)
 return ns->status;
 }
 
+if (NVME_CMD_FLAGS_FUSE(req->cmd.flags)) {
+return NVME_INVALID_FIELD;
+}
+
 req->ns = ns;
 
 switch (req->cmd.opcode) {
@@ -5475,6 +5479,10 @@ static uint16_t nvme_admin_cmd(NvmeCtrl *n, NvmeRequest 
*req)
 return NVME_INVALID_FIELD | NVME_DNR;
 }
 
+if (NVME_CMD_FLAGS_FUSE(req->cmd.flags)) {
+return NVME_INVALID_FIELD;
+}
+
 switch (req->cmd.opcode) {
 case NVME_ADM_CMD_DELETE_SQ:
 return nvme_del_sq(n, req);
-- 
2.33.0




[PULL 1/3] hw/nvme: fix validation of ASQ and ACQ

2021-09-23 Thread Klaus Jensen
From: Klaus Jensen 

Address 0x0 is a valid address. Fix the admin submission and completion
queue address validation to not error out on this.

Signed-off-by: Klaus Jensen 
Reviewed-by: Keith Busch 
---
 hw/nvme/ctrl.c   | 8 
 hw/nvme/trace-events | 2 --
 2 files changed, 10 deletions(-)

diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c
index 6baf9e0420d5..ff784851137e 100644
--- a/hw/nvme/ctrl.c
+++ b/hw/nvme/ctrl.c
@@ -5623,14 +5623,6 @@ static int nvme_start_ctrl(NvmeCtrl *n)
 trace_pci_nvme_err_startfail_sq();
 return -1;
 }
-if (unlikely(!asq)) {
-trace_pci_nvme_err_startfail_nbarasq();
-return -1;
-}
-if (unlikely(!acq)) {
-trace_pci_nvme_err_startfail_nbaracq();
-return -1;
-}
 if (unlikely(asq & (page_size - 1))) {
 trace_pci_nvme_err_startfail_asq_misaligned(asq);
 return -1;
diff --git a/hw/nvme/trace-events b/hw/nvme/trace-events
index 430eeb395b24..ff6cafd520df 100644
--- a/hw/nvme/trace-events
+++ b/hw/nvme/trace-events
@@ -159,8 +159,6 @@ pci_nvme_err_invalid_setfeat(uint32_t dw10) "invalid set 
features, dw10=0x%"PRIx
 pci_nvme_err_invalid_log_page(uint16_t cid, uint16_t lid) "cid %"PRIu16" lid 
0x%"PRIx16""
 pci_nvme_err_startfail_cq(void) "nvme_start_ctrl failed because there are 
non-admin completion queues"
 pci_nvme_err_startfail_sq(void) "nvme_start_ctrl failed because there are 
non-admin submission queues"
-pci_nvme_err_startfail_nbarasq(void) "nvme_start_ctrl failed because the admin 
submission queue address is null"
-pci_nvme_err_startfail_nbaracq(void) "nvme_start_ctrl failed because the admin 
completion queue address is null"
 pci_nvme_err_startfail_asq_misaligned(uint64_t addr) "nvme_start_ctrl failed 
because the admin submission queue address is misaligned: 0x%"PRIx64""
 pci_nvme_err_startfail_acq_misaligned(uint64_t addr) "nvme_start_ctrl failed 
because the admin completion queue address is misaligned: 0x%"PRIx64""
 pci_nvme_err_startfail_page_too_small(uint8_t log2ps, uint8_t maxlog2ps) 
"nvme_start_ctrl failed because the page size is too small: log2size=%u, min=%u"
-- 
2.33.0




Re: [PATCH] hw/nvme: reattach subsystem namespaces on hotplug

2021-09-23 Thread Hannes Reinecke

On 9/23/21 10:09 PM, Klaus Jensen wrote:

On Sep  9 13:37, Hannes Reinecke wrote:

On 9/9/21 12:47 PM, Klaus Jensen wrote:

On Sep  9 11:43, Hannes Reinecke wrote:

With commit 5ffbaeed16 ("hw/nvme: fix controller hot unplugging")
namespaces get moved from the controller to the subsystem if one
is specified.
That keeps the namespaces alive after a controller hot-unplug, but
after a controller hotplug we have to reconnect the namespaces
from the subsystem to the controller.

Fixes: 5ffbaeed16 ("hw/nvme: fix controller hot unplugging")
Cc: Klaus Jensen 
Signed-off-by: Hannes Reinecke 
---
  hw/nvme/subsys.c | 8 +++-
  1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/hw/nvme/subsys.c b/hw/nvme/subsys.c
index 93c35950d6..a9404f2b5e 100644
--- a/hw/nvme/subsys.c
+++ b/hw/nvme/subsys.c
@@ -14,7 +14,7 @@
  int nvme_subsys_register_ctrl(NvmeCtrl *n, Error **errp)
  {
  NvmeSubsystem *subsys = n->subsys;
-int cntlid;
+int cntlid, nsid;
  
  for (cntlid = 0; cntlid < ARRAY_SIZE(subsys->ctrls); cntlid++) {

  if (!subsys->ctrls[cntlid]) {
@@ -29,12 +29,18 @@ int nvme_subsys_register_ctrl(NvmeCtrl *n, Error **errp)
  
  subsys->ctrls[cntlid] = n;
  
+for (nsid = 0; nsid < ARRAY_SIZE(subsys->namespaces); nsid++) {

+if (subsys->namespaces[nsid]) {
+nvme_attach_ns(n, subsys->namespaces[nsid]);
+}


Thanks Hannes! I like it, keeping things simple.

But we should only attach namespaces that have the shared property or
have ns->attached == 0. Non-shared namespaces may already be attached to
another controller in the subsystem.
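
(For illustration only: a minimal sketch of that check, on top of the loop
from the patch; ns->attached is the counter used elsewhere in the patch,
while params.shared is an assumed field name for the shared property.)

    for (nsid = 0; nsid < ARRAY_SIZE(subsys->namespaces); nsid++) {
        NvmeNamespace *ns = subsys->namespaces[nsid];

        if (!ns) {
            continue;
        }

        /* skip private namespaces another controller already holds */
        if (!ns->params.shared && ns->attached) {
            continue;
        }

        nvme_attach_ns(n, ns);
    }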



Well ... I tried to avoid that subject, but as you brought it up:
There is a slightly tricky issue in fabrics, in that the 'controller' is
independent of the 'connection'.
The 'shared' bit in the CMIC setting indicates that the subsystem may
have more than one _controller_. It doesn't talk about how many
_connections_ a controller may support; that then is the realm of
dynamic or static controllers, which we don't talk about :-).
Suffice it to say, Linux only implements the dynamic controller model,
so every connection will be to a different controller.
But you have been warned :-)

However, the 'CMIC' setting is independent of the 'NMIC' setting (ie the
'shared' bit in the namespace).
Which leads to the interesting question of how exactly one should handle
non-shared namespaces in subsystems for which there are multiple
controllers. Especially as the NSID space is per subsystem, so each
controller will be able to figure out if there are blanked-out namespaces.
So to make this a sane setup I would propose to set the 'shared' option
automatically whenever the controller has the 'subsys' option set.
And actually, I would ditch the 'shared' option completely, and make it
dependent on the 'subsys' setting for the controller.
Much like we treat the 'CMIC' setting nowadays.
That avoids a lot of confusion, and also makes the implementation _way_
easier.
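
(For illustration, a sketch of what that proposal would amount to in code;
this is not actual QEMU behavior, and params.shared is an assumed field
name. The shared property would simply be derived from the subsystem link.)

    /* sketch only: derive 'shared' from the controller's 'subsys' link */
    static void nvme_ns_derive_shared(NvmeCtrl *n, NvmeNamespace *ns)
    {
        ns->params.shared = !!n->subsys;
    }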



I see your point. Unfortunately, there is no easy way to ditch shared=
now. But I think it'd be good enough to attach shared namespaces
automatically, but not those with "shared=off".

I've already ditched the shared parameter in my RFC refactor series, so
the namespaces are always shared there.



Okay.


However...

The spec says that "The attach and detach operations are persistent
across all reset events.". This means that we should track those events
in the subsystem and only reattach namespaces that were attached at the
time of the "reset" event. Currently, we don't have anything mapping
that state. But the device already has to take some liberties with
regard to stuff that is considered persistent by the spec (SMART log
etc.) since we do not have any way to store stuff persistently across
qemu invocations, so I think the above is an acceptable compromise.


Careful. 'attach' and 'detach' are MI (management interface) operations.
Whether and how many namespaces are visible to any given controller is
actually independent of that; and, in fact, controllers might not even
implement 'attach' or 'detach'.
But I do agree that we don't handle the 'reset' state properly.



The Namespace Attachment command has nothing to do with MI? And the QEMU
controller does support attaching and detaching namespaces.



No, you got me wrong. Whether or not a namespace is attached is
independent of any 'Namespace Attachment' command.
Case in point: the subsystem will be starting up with namespaces already
attached, even though no one issued any namespace attachment command.



A potential (as good as it gets) fix would be to keep a map/list of
"persistently" attached controllers on the namespaces and re-attach
according to that when we see that controller joining the subsystem
again. CNTLID would be the obvious choice for the key here, but the
problem is that we can't really use it since we assign it sequentially
from the subsystem, which now looks like a pretty bad choice. CNTLID
should have been a required property o
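
(A rough sketch of the tracking idea described above, keyed by CNTLID as
suggested and ignoring the caveat about sequential assignment; the
attached_ctrls[] array and the helper are hypothetical, not existing QEMU
code.)

    /*
     * Hypothetical: each namespace remembers which controller slots (by
     * cntlid) it was attached to, and a controller that re-registers with
     * the subsystem re-attaches only those namespaces.
     */
    static void nvme_subsys_reattach_sketch(NvmeSubsystem *subsys, NvmeCtrl *n)
    {
        int nsid;

        for (nsid = 0; nsid < ARRAY_SIZE(subsys->namespaces); nsid++) {
            NvmeNamespace *ns = subsys->namespaces[nsid];

            if (ns && ns->attached_ctrls[n->cntlid]) {
                nvme_attach_ns(n, ns);
            }
        }
    }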

Re: [PATCH] block: introduce max_hw_iov for use in scsi-generic

2021-09-23 Thread Paolo Bonzini
Yes, the question is whether it still exists...

Paolo

On Thu., 23 Sept. 2021, 16:48, Christian Borntraeger 
wrote:

>
>
> Am 23.09.21 um 15:04 schrieb Paolo Bonzini:
> > Linux limits the size of iovecs to 1024 (UIO_MAXIOV in the kernel
> > sources, IOV_MAX in POSIX).  Because of this, on some host adapters
> > requests with many iovecs are rejected with -EINVAL by the
> > io_submit() or readv()/writev() system calls.
> >
> > In fact, the same limit applies to SG_IO as well.  To fix both the
> > EINVAL and the possible performance issues from using fewer iovecs
> > than allowed by Linux (some HBAs have max_segments as low as 128),
> > introduce a separate entry in BlockLimits to hold the max_segments
> > value from sysfs.  This new limit is used only for SG_IO and clamped
> > to bs->bl.max_iov anyway, just like max_hw_transfer is clamped to
> > bs->bl.max_transfer.
> >
> > Reported-by: Halil Pasic 
> > Cc: Hanna Reitz 
> > Cc: Kevin Wolf 
> > Cc: qemu-block@nongnu.org
> > Fixes: 18473467d5 ("file-posix: try BLKSECTGET on block devices too, do
> not round to power of 2", 2021-06-25)
>
> This sneaked in shortly before the 6.1 release (between rc0 and rc1 I
> think).
> Shouldn't that go to stable in case this still exists?
>
>
> > Signed-off-by: Paolo Bonzini 
> > ---
> >   block/block-backend.c  | 6 ++
> >   block/file-posix.c | 2 +-
> >   block/io.c | 1 +
> >   hw/scsi/scsi-generic.c | 2 +-
> >   include/block/block_int.h  | 7 +++
> >   include/sysemu/block-backend.h | 1 +
> >   6 files changed, 17 insertions(+), 2 deletions(-)
> >
> > diff --git a/block/block-backend.c b/block/block-backend.c
> > index 6140d133e2..ba2b5ebb10 100644
> > --- a/block/block-backend.c
> > +++ b/block/block-backend.c
> > @@ -1986,6 +1986,12 @@ uint32_t blk_get_max_transfer(BlockBackend *blk)
> >   return ROUND_DOWN(max, blk_get_request_alignment(blk));
> >   }
> >
> > +int blk_get_max_hw_iov(BlockBackend *blk)
> > +{
> > +return MIN_NON_ZERO(blk->root->bs->bl.max_hw_iov,
> > +blk->root->bs->bl.max_iov);
> > +}
> > +
> >   int blk_get_max_iov(BlockBackend *blk)
> >   {
> >   return blk->root->bs->bl.max_iov;
> > diff --git a/block/file-posix.c b/block/file-posix.c
> > index cb9bffe047..1567edb3d5 100644
> > --- a/block/file-posix.c
> > +++ b/block/file-posix.c
> > @@ -1273,7 +1273,7 @@ static void raw_refresh_limits(BlockDriverState
> *bs, Error **errp)
> >
> >   ret = hdev_get_max_segments(s->fd, &st);
> >   if (ret > 0) {
> > -bs->bl.max_iov = ret;
> > +bs->bl.max_hw_iov = ret;
> >   }
> >   }
> >   }
> > diff --git a/block/io.c b/block/io.c
> > index a19942718b..f38e7f81d8 100644
> > --- a/block/io.c
> > +++ b/block/io.c
> > @@ -136,6 +136,7 @@ static void bdrv_merge_limits(BlockLimits *dst,
> const BlockLimits *src)
> >   dst->min_mem_alignment = MAX(dst->min_mem_alignment,
> >src->min_mem_alignment);
> >   dst->max_iov = MIN_NON_ZERO(dst->max_iov, src->max_iov);
> > +dst->max_hw_iov = MIN_NON_ZERO(dst->max_hw_iov, src->max_hw_iov);
> >   }
> >
> >   typedef struct BdrvRefreshLimitsState {
> > diff --git a/hw/scsi/scsi-generic.c b/hw/scsi/scsi-generic.c
> > index 665baf900e..0306ccc7b1 100644
> > --- a/hw/scsi/scsi-generic.c
> > +++ b/hw/scsi/scsi-generic.c
> > @@ -180,7 +180,7 @@ static int scsi_handle_inquiry_reply(SCSIGenericReq
> *r, SCSIDevice *s, int len)
> >   page = r->req.cmd.buf[2];
> >   if (page == 0xb0) {
> >   uint64_t max_transfer =
> blk_get_max_hw_transfer(s->conf.blk);
> > -uint32_t max_iov = blk_get_max_iov(s->conf.blk);
> > +uint32_t max_iov = blk_get_max_hw_iov(s->conf.blk);
> >
> >   assert(max_transfer);
> >   max_transfer = MIN_NON_ZERO(max_transfer, max_iov *
> qemu_real_host_page_size)
> > diff --git a/include/block/block_int.h b/include/block/block_int.h
> > index f1a54db0f8..c31cbd034a 100644
> > --- a/include/block/block_int.h
> > +++ b/include/block/block_int.h
> > @@ -702,6 +702,13 @@ typedef struct BlockLimits {
> >*/
> >   uint64_t max_hw_transfer;
> >
> > +/* Maximal number of scatter/gather elements allowed by the
> hardware.
> > + * Applies whenever transfers to the device bypass the kernel I/O
> > + * scheduler, for example with SG_IO.  If larger than max_iov
> > + * or if zero, blk_get_max_hw_iov will fall back to max_iov.
> > + */
> > +int max_hw_iov;
> > +
> >   /* memory alignment, in bytes so that no bounce buffer is needed */
> >   size_t min_mem_alignment;
> >
> > diff --git a/include/sysemu/block-backend.h
> b/include/sysemu/block-backend.h
> > index 29d4fdbf63..82bae55161 100644
> > --- a/include/sysemu/block-backend.h
> > +++ b/include/sysemu/block-backend.h
> > @@ -211,6 +211,7 @@ uint32_t blk_get_request_alignment(BlockBackend
> *blk);
> >   uint32_t blk_get_max_tran

Re: [PULL 00/12] jobs: mirror: Handle errors after READY cancel

2021-09-23 Thread Vladimir Sementsov-Ogievskiy

22.09.2021 22:19, Vladimir Sementsov-Ogievskiy wrote:

22.09.2021 19:05, Richard Henderson wrote:

On 9/21/21 3:20 AM, Vladimir Sementsov-Ogievskiy wrote:

The following changes since commit 326ff8dd09556fc2e257196c49f35009700794ac:

   Merge remote-tracking branch 'remotes/jasowang/tags/net-pull-request' into 
staging (2021-09-20 16:17:05 +0100)

are available in the Git repository at:

   https://src.openvz.org/scm/~vsementsov/qemu.git  tags/pull-jobs-2021-09-21

for you to fetch changes up to c9489c04319cac75c76af8fc27c254f46e10214c:

   iotests: Add mirror-ready-cancel-error test (2021-09-21 11:56:11 +0300)


mirror: Handle errors after READY cancel


Hanna Reitz (12):
   job: Context changes in job_completed_txn_abort()
   mirror: Keep s->synced on error
   mirror: Drop s->synced
   job: Force-cancel jobs in a failed transaction
   job: @force parameter for job_cancel_sync()
   jobs: Give Job.force_cancel more meaning
   job: Add job_cancel_requested()
   mirror: Use job_is_cancelled()
   mirror: Check job_is_cancelled() earlier
   mirror: Stop active mirroring after force-cancel
   mirror: Do not clear .cancelled
   iotests: Add mirror-ready-cancel-error test


This fails testing with errors like so:

Running test test-replication
test-replication: ../job.c:186: job_state_transition: Assertion 
`JobSTT[s0][s1]' failed.
ERROR test-replication - too few tests run (expected 13, got 8)
make: *** [Makefile.mtest:816: run-test-100] Error 1
Cleaning up project directory and file based variables
ERROR: Job failed: exit code 1

https://gitlab.com/qemu-project/qemu/-/pipelines/375324015/failures




Interesting :(

I've reproduced it by starting test-replication in several parallel loops (it
doesn't reproduce for me if I just start one loop). So, that's some racy bug..

Hmm, and it seems it doesn't reproduce as easily on master. I'll try to bisect the
series tomorrow.



(gdb) bt
#0  0x7f034a3d09d5 in raise () from /lib64/libc.so.6
#1  0x7f034a3b9954 in abort () from /lib64/libc.so.6
#2  0x7f034a3b9789 in __assert_fail_base.cold () from /lib64/libc.so.6
#3  0x7f034a3c9026 in __assert_fail () from /lib64/libc.so.6
#4  0x55d3b503d670 in job_state_transition (job=0x55d3b5e67020, 
s1=JOB_STATUS_CONCLUDED) at ../job.c:186
#5  0x55d3b503e7c2 in job_conclude (job=0x55d3b5e67020) at ../job.c:652
#6  0x55d3b503eaa1 in job_finalize_single (job=0x55d3b5e67020) at 
../job.c:722
#7  0x55d3b503ecd1 in job_completed_txn_abort (job=0x55d3b5e67020) at 
../job.c:801
#8  0x55d3b503f2ea in job_cancel (job=0x55d3b5e67020, force=false) at 
../job.c:973
#9  0x55d3b503f360 in job_cancel_err (job=0x55d3b5e67020, 
errp=0x7fffcc997a80) at ../job.c:992
#10 0x55d3b503f576 in job_finish_sync (job=0x55d3b5e67020, finish=0x55d3b503f33f 
, errp=0x0) at ../job.c:1054
#11 0x55d3b503f3d0 in job_cancel_sync (job=0x55d3b5e67020, force=false) at 
../job.c:1008
#12 0x55d3b4ff14a3 in replication_close (bs=0x55d3b5e6ef80) at 
../block/replication.c:152
#13 0x55d3b50277fc in bdrv_close (bs=0x55d3b5e6ef80) at ../block.c:4677
#14 0x55d3b50286cf in bdrv_delete (bs=0x55d3b5e6ef80) at ../block.c:5100
#15 0x55d3b502ae3a in bdrv_unref (bs=0x55d3b5e6ef80) at ../block.c:6495
#16 0x55d3b5023a38 in bdrv_root_unref_child (child=0x55d3b5e4c690) at 
../block.c:3010
#17 0x55d3b5047998 in blk_remove_bs (blk=0x55d3b5e73b40) at 
../block/block-backend.c:845
#18 0x55d3b5046e38 in blk_delete (blk=0x55d3b5e73b40) at 
../block/block-backend.c:461
#19 0x55d3b50470dc in blk_unref (blk=0x55d3b5e73b40) at 
../block/block-backend.c:516
#20 0x55d3b4fdb20a in teardown_secondary () at 
../tests/unit/test-replication.c:367
#21 0x55d3b4fdb632 in test_secondary_continuous_replication () at 
../tests/unit/test-replication.c:504
#22 0x7f034b26979e in g_test_run_suite_internal () from 
/lib64/libglib-2.0.so.0
#23 0x7f034b26959b in g_test_run_suite_internal () from 
/lib64/libglib-2.0.so.0
#24 0x7f034b26959b in g_test_run_suite_internal () from 
/lib64/libglib-2.0.so.0
#25 0x7f034b269c8a in g_test_run_suite () from /lib64/libglib-2.0.so.0
#26 0x7f034b269ca5 in g_test_run () from /lib64/libglib-2.0.so.0
#27 0x55d3b4fdb9c0 in main (argc=1, argv=0x7fffcc998138) at 
../tests/unit/test-replication.c:613
(gdb) fr 4
#4  0x55d3b503d670 in job_state_transition (job=0x55d3b5e67020, 
s1=JOB_STATUS_CONCLUDED) at ../job.c:186
186 assert(JobSTT[s0][s1]);
(gdb) list
181 JobStatus s0 = job->status;
182 assert(s1 >= 0 && s1 < JOB_STATUS__MAX);
183 trace_job_state_transition(job, job->ret,
184    JobSTT[s0][s1] ? "allowed" : 
"disallowed",
185    JobStatus_str(s0), JobStatus_str(s1));
186 assert(JobSTT[s0][s1]);
187 

Re: [PATCH v6 07/11] block: use int64_t instead of int in driver write_zeroes handlers

2021-09-23 Thread Vladimir Sementsov-Ogievskiy

23.09.2021 23:33, Eric Blake wrote:

On Fri, Sep 03, 2021 at 01:28:03PM +0300, Vladimir Sementsov-Ogievskiy wrote:

We are generally moving to int64_t for both offset and bytes parameters
on all io paths.

Main motivation is realization of 64-bit write_zeroes operation for
fast zeroing large disk chunks, up to the whole disk.

We chose signed type, to be consistent with off_t (which is signed) and
with possibility for signed return type (where negative value means
error).

So, convert driver write_zeroes handlers bytes parameter to int64_t.

The only caller of all updated function is bdrv_co_do_pwrite_zeroes().

bdrv_co_do_pwrite_zeroes() itself is of course OK with widening of
callee parameter type. Also, bdrv_co_do_pwrite_zeroes()'s
max_write_zeroes is limited to INT_MAX. So, updated functions all are
safe, they will not get "bytes" larger than before.

Still, let's look through all updated functions, and add assertions to
the ones which are actually unprepared to values larger than INT_MAX.
For these drivers also set explicit max_pwrite_zeroes limit.


[snip]


At this point all block drivers are prepared to support 64bit
write-zero requests, or have explicitly set max_pwrite_zeroes.


The long commit message is essential, but the analysis looks sane.



Signed-off-by: Vladimir Sementsov-Ogievskiy 
---



+++ b/block/iscsi.c



@@ -1250,11 +1250,21 @@ coroutine_fn iscsi_co_pwrite_zeroes(BlockDriverState 
*bs, int64_t offset,
  iscsi_co_init_iscsitask(iscsilun, &iTask);
  retry:
  if (use_16_for_ws) {
+/*
+ * iscsi_writesame16_task num_blocks argument is uint32_t. We rely here
+ * on our max_pwrite_zeroes limit.
+ */
+assert(nb_blocks < UINT32_MAX);
  iTask.task = iscsi_writesame16_task(iscsilun->iscsi, iscsilun->lun, 
lba,
  iscsilun->zeroblock, 
iscsilun->block_size,
  nb_blocks, 0, !!(flags & 
BDRV_REQ_MAY_UNMAP),
  0, 0, iscsi_co_generic_cb, 
&iTask);


Should this be <= instead of < ?


  } else {
+/*
+ * iscsi_writesame10_task num_blocks argument is uint16_t. We rely here
+ * on our max_pwrite_zeroes limit.
+ */
+assert(nb_blocks < UINT16_MAX);
  iTask.task = iscsi_writesame10_task(iscsilun->iscsi, iscsilun->lun, 
lba,
  iscsilun->zeroblock, 
iscsilun->block_size,
  nb_blocks, 0, !!(flags & 
BDRV_REQ_MAY_UNMAP),


here too.  The 16-bit limit is where we're most likely to run into
someone actually trying to zeroize that much at once.


+++ b/block/nbd.c
@@ -1407,15 +1407,17 @@ static int nbd_client_co_pwritev(BlockDriverState *bs, 
int64_t offset,
  }
  
  static int nbd_client_co_pwrite_zeroes(BlockDriverState *bs, int64_t offset,

-   int bytes, BdrvRequestFlags flags)
+   int64_t bytes, BdrvRequestFlags flags)
  {
  BDRVNBDState *s = (BDRVNBDState *)bs->opaque;
  NBDRequest request = {
  .type = NBD_CMD_WRITE_ZEROES,
  .from = offset,
-.len = bytes,
+.len = bytes,  /* .len is uint32_t actually */
  };
  
+assert(bytes < UINT32_MAX); /* rely on max_pwrite_zeroes */


And again.  Here, you happen to get by with < because we clamped
bl.max_pwrite_zeroes at BDRV_REQUEST_MAX_BYTES, which is INT_MAX
rounded down.  But I had to check; whereas using <= would be less
worrisome, even if we never get a request that large.

If you agree with my analysis, I can make that change while preparing
my pull request.


I agree, <= should be the right thing, thanks!
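
(A tiny standalone illustration of the boundary in question, separate from
the QEMU code: UINT32_MAX itself still fits in a uint32_t .len field, so <=
is the assertion that keeps holding once the limit is raised towards 4G-1.)

    #include <assert.h>
    #include <inttypes.h>
    #include <stdint.h>
    #include <stdio.h>

    int main(void)
    {
        int64_t bytes = UINT32_MAX;     /* largest value a uint32_t .len can carry */
        uint32_t len = (uint32_t)bytes; /* assigns without truncation */

        assert(bytes <= UINT32_MAX);    /* holds; 'bytes < UINT32_MAX' would abort */
        printf(".len = 0x%" PRIx32 "\n", len);
        return 0;
    }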



Reviewed-by: Eric Blake 




--
Best regards,
Vladimir



Re: [PATCH v6 10/11] block: use int64_t instead of int in driver discard handlers

2021-09-23 Thread Eric Blake
On Fri, Sep 03, 2021 at 01:28:06PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> We are generally moving to int64_t for both offset and bytes parameters
> on all io paths.
> 
> Main motivation is realization of 64-bit write_zeroes operation for
> fast zeroing large disk chunks, up to the whole disk.
> 
> We chose signed type, to be consistent with off_t (which is signed) and
> with possibility for signed return type (where negative value means
> error).
> 
> So, convert driver discard handlers bytes parameter to int64_t.
> 
> The only caller of all updated function is bdrv_co_pdiscard in
> block/io.c. It is already prepared to work with 64bit requests, but
> pass at most max(bs->bl.max_pdiscard, INT_MAX) to the driver.
> 
> Let's look at all updated functions:

> 
> Great! Now all drivers are prepared to handle 64bit discard requests,
> or else have explicit max_pdiscard limits.

Very similar analysis to the write-zeroes.

> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---

> +++ b/block/nbd.c
> @@ -1457,15 +1457,17 @@ static int nbd_client_co_flush(BlockDriverState *bs)
>  }
>  
>  static int nbd_client_co_pdiscard(BlockDriverState *bs, int64_t offset,
> -  int bytes)
> +  int64_t bytes)
>  {
>  BDRVNBDState *s = (BDRVNBDState *)bs->opaque;
>  NBDRequest request = {
>  .type = NBD_CMD_TRIM,
>  .from = offset,
> -.len = bytes,
> +.len = bytes, /* len is uint32_t */
>  };
>  
> +assert(bytes <= UINT32_MAX); /* rely on max_pdiscard */

Aha - you used <= here, compared to < in 7/11 on write zeroes.
Consistency says we want <= in both places.

Reviewed-by: Eric Blake 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org




Re: [PATCH v6 07/11] block: use int64_t instead of int in driver write_zeroes handlers

2021-09-23 Thread Eric Blake
On Thu, Sep 23, 2021 at 03:33:45PM -0500, Eric Blake wrote:
> > +++ b/block/nbd.c
> > @@ -1407,15 +1407,17 @@ static int nbd_client_co_pwritev(BlockDriverState 
> > *bs, int64_t offset,
> >  }
> >  
> >  static int nbd_client_co_pwrite_zeroes(BlockDriverState *bs, int64_t 
> > offset,
> > -   int bytes, BdrvRequestFlags flags)
> > +   int64_t bytes, BdrvRequestFlags 
> > flags)
> >  {
> >  BDRVNBDState *s = (BDRVNBDState *)bs->opaque;
> >  NBDRequest request = {
> >  .type = NBD_CMD_WRITE_ZEROES,
> >  .from = offset,
> > -.len = bytes,
> > +.len = bytes,  /* .len is uint32_t actually */
> >  };
> >  
> > +assert(bytes < UINT32_MAX); /* rely on max_pwrite_zeroes */
> 
> And again.  Here, you happen to get by with < because we clamped
> bl.max_pwrite_zeroes at BDRV_REQUEST_MAX_BYTES, which is INT_MAX
> rounded down.  But I had to check; whereas using <= would be less
> worrisome, even if we never get a request that large.

Whoops, I was reading a local patch of mine.  Upstream has merely:

uint32_t max = MIN_NON_ZERO(NBD_MAX_BUFFER_SIZE, s->info.max_block);

bs->bl.max_pdiscard = QEMU_ALIGN_DOWN(INT_MAX, min);
bs->bl.max_pwrite_zeroes = max;

which is an even smaller limit than BDRV_REQUEST_MAX_BYTES (and
obviously one we're trying to raise).  But the point remains that
using <= rather than < will make it easier to review the code where we
raise the limits (either up to the 4G-1 limit of the current protocol,
or with protocol extensions to finally get to 64-bit requests).

> 
> If you agree with my analysis, I can make that change while preparing
> my pull request.
> 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org




Re: [PATCH v6 07/11] block: use int64_t instead of int in driver write_zeroes handlers

2021-09-23 Thread Eric Blake
On Fri, Sep 03, 2021 at 01:28:03PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> We are generally moving to int64_t for both offset and bytes parameters
> on all io paths.
> 
> Main motivation is realization of 64-bit write_zeroes operation for
> fast zeroing large disk chunks, up to the whole disk.
> 
> We chose signed type, to be consistent with off_t (which is signed) and
> with possibility for signed return type (where negative value means
> error).
> 
> So, convert driver write_zeroes handlers bytes parameter to int64_t.
> 
> The only caller of all updated function is bdrv_co_do_pwrite_zeroes().
> 
> bdrv_co_do_pwrite_zeroes() itself is of course OK with widening of
> callee parameter type. Also, bdrv_co_do_pwrite_zeroes()'s
> max_write_zeroes is limited to INT_MAX. So, updated functions all are
> safe, they will not get "bytes" larger than before.
> 
> Still, let's look through all updated functions, and add assertions to
> the ones which are actually unprepared to values larger than INT_MAX.
> For these drivers also set explicit max_pwrite_zeroes limit.
> 
[snip]
> 
> At this point all block drivers are prepared to support 64bit
> write-zero requests, or have explicitly set max_pwrite_zeroes.

The long commit message is essential, but the analysis looks sane.

> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---

> +++ b/block/iscsi.c

> @@ -1250,11 +1250,21 @@ coroutine_fn iscsi_co_pwrite_zeroes(BlockDriverState 
> *bs, int64_t offset,
>  iscsi_co_init_iscsitask(iscsilun, &iTask);
>  retry:
>  if (use_16_for_ws) {
> +/*
> + * iscsi_writesame16_task num_blocks argument is uint32_t. We rely 
> here
> + * on our max_pwrite_zeroes limit.
> + */
> +assert(nb_blocks < UINT32_MAX);
>  iTask.task = iscsi_writesame16_task(iscsilun->iscsi, iscsilun->lun, 
> lba,
>  iscsilun->zeroblock, 
> iscsilun->block_size,
>  nb_blocks, 0, !!(flags & 
> BDRV_REQ_MAY_UNMAP),
>  0, 0, iscsi_co_generic_cb, 
> &iTask);

Should this be <= instead of < ?

>  } else {
> +/*
> + * iscsi_writesame10_task num_blocks argument is uint16_t. We rely 
> here
> + * on our max_pwrite_zeroes limit.
> + */
> +assert(nb_blocks < UINT16_MAX);
>  iTask.task = iscsi_writesame10_task(iscsilun->iscsi, iscsilun->lun, 
> lba,
>  iscsilun->zeroblock, 
> iscsilun->block_size,
>  nb_blocks, 0, !!(flags & 
> BDRV_REQ_MAY_UNMAP),

here too.  The 16-bit limit is where we're most likely to run into
someone actually trying to zeroize that much at once.

> +++ b/block/nbd.c
> @@ -1407,15 +1407,17 @@ static int nbd_client_co_pwritev(BlockDriverState 
> *bs, int64_t offset,
>  }
>  
>  static int nbd_client_co_pwrite_zeroes(BlockDriverState *bs, int64_t offset,
> -   int bytes, BdrvRequestFlags flags)
> +   int64_t bytes, BdrvRequestFlags flags)
>  {
>  BDRVNBDState *s = (BDRVNBDState *)bs->opaque;
>  NBDRequest request = {
>  .type = NBD_CMD_WRITE_ZEROES,
>  .from = offset,
> -.len = bytes,
> +.len = bytes,  /* .len is uint32_t actually */
>  };
>  
> +assert(bytes < UINT32_MAX); /* rely on max_pwrite_zeroes */

And again.  Here, you happen to get by with < because we clamped
bl.max_pwrite_zeroes at BDRV_REQUEST_MAX_BYTES, which is INT_MAX
rounded down.  But I had to check; whereas using <= would be less
worrisome, even if we never get a request that large.

If you agree with my analysis, I can make that change while preparing
my pull request.

Reviewed-by: Eric Blake 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org




Re: [RFC PATCH v2] hw/nvme:Adding Support for namespace management

2021-09-23 Thread Klaus Jensen
On Aug 19 18:39, Naveen Nagar wrote:
> From: Naveen 
> 
> This patch supports namespace management: create and delete operations.
> 
> Since v1:
> - Modified and moved nvme_ns_identify_common in ns.c file 
> - Added check for CSI field in NS management
> - Indentation fix in namespace create
> 
> This patch has been tested with the following command; the size of the image
> file for unallocated namespaces is taken as 0GB. ns_create will look into
> the list of unallocated namespaces, initialize one of them, and return its
> nsid. A new mandatory field called tnvmcap has been added, and we ensure
> that the total capacity of the namespaces created does not exceed tnvmcap.
> 
> -device nvme-subsys,id=subsys0,tnvmcap=8
> -device nvme,serial=foo,id=nvme0,subsys=subsys0
> -device nvme,serial=bar,id=nvme1,subsys=subsys0
> -drive id=ns1,file=ns1.img,if=none
> -device nvme-ns,drive=ns1,bus=nvme0,nsid=1,zoned=false,shared=true
> -drive id=ns2,file=ns2.img,if=none
> -device nvme-ns,drive=ns2,bus=nvme0,nsid=2,zoned=false,shared=true
> -drive id=ns3,file=ns3.img,if=none
> -device nvme-ns,drive=ns3,bus=nvme0,nsid=3,zoned=false,shared=true
> -drive id=ns4,file=ns4.img,if=none
> -device nvme-ns,drive=ns4,bus=nvme0,nsid=4,zoned=false,shared=true
> 
> Please review and suggest if any changes are required.
> 
> Signed-off-by: Naveen Nagar 
> Reviewed-by: Klaus Jensen 
>   

So, I think this is a fine approach.

However, I think we should let it simmer until we know if my -object
refactoring will be accepted as a way forward. In that case, I'd like to
only add it there and likely as a new namespace "type" (i.e.
x-nvme-ns-unallocated) that will be replaced with a dynamically created
object depending on CSI.




Re: [PATCH v2 2/6] iotests: add warning for rogue 'qemu' packages

2021-09-23 Thread Vladimir Sementsov-Ogievskiy

23.09.2021 21:44, John Snow wrote:



On Thu, Sep 23, 2021 at 2:32 PM Vladimir Sementsov-Ogievskiy <vsement...@virtuozzo.com> wrote:

23.09.2021 21:07, John Snow wrote:
 > Add a warning for when 'iotests' runs against a qemu namespace that
 > isn't the one in the source tree. This might occur if you have
 > (accidentally) installed the Python namespace package to your local
 > packages.
 >
 > (I'm not going to say that this is because I bit myself with this,
 > but you can fill in the blanks.)
 >
 > In the future, we will pivot to always preferring a specific installed
 > instance of qemu python packages managed directly by iotests. For now
 > simply warn if there is an ambiguity over which instance that iotests
 > might use.
 >
 > Example: If a user has navigated to ~/src/qemu/python and executed
 > `pip install .`, you will see output like this when running `./check`:
 >
 > WARNING: 'qemu' python packages will be imported from outside the source 
tree ('/home/jsnow/src/qemu/python')
 >           Importing instead from 
'/home/jsnow/.local/lib/python3.9/site-packages/qemu'
 >
 > Signed-off-by: John Snow <js...@redhat.com>
 > ---
 >   tests/qemu-iotests/testenv.py | 24 
 >   1 file changed, 24 insertions(+)
 >
 > diff --git a/tests/qemu-iotests/testenv.py 
b/tests/qemu-iotests/testenv.py
 > index 99a57a69f3a..1c0f6358538 100644
 > --- a/tests/qemu-iotests/testenv.py
 > +++ b/tests/qemu-iotests/testenv.py
 > @@ -16,6 +16,8 @@
 >   # along with this program.  If not, see >.
 >   #
 >
 > +import importlib.util
 > +import logging
 >   import os
 >   import sys
 >   import tempfile
 > @@ -112,6 +114,27 @@ def init_directories(self) -> None:
 >           # Path where qemu goodies live in this source tree.
 >           qemu_srctree_path = Path(__file__, '../../../python').resolve()
 >
 > +        # Warn if we happen to be able to find qemu namespace packages
 > +        # (using qemu.qmp as a bellwether) from an unexpected location.
 > +        # i.e. the package is already installed in the user's 
environment.
 > +        try:
 > +            qemu_spec = importlib.util.find_spec('qemu.qmp')
 > +        except ModuleNotFoundError:
 > +            qemu_spec = None
 > +
 > +        if qemu_spec and qemu_spec.origin:
 > +            spec_path = Path(qemu_spec.origin)
 > +            try:
 > +                _ = spec_path.relative_to(qemu_srctree_path)

It took some time and looking at specification trying to understand what's 
going on here :)

Could we just use:

if not Path(qemu_spec.origin).is_relative_to(qemu_srctree_path):
     ... logging ...


Nope, that's 3.9+ only. (I made the same mistake.)


Oh :(

OK




 > +            except ValueError:

 > +                self._logger.warning(
 > +                    "WARNING: 'qemu' python packages will be imported 
from"
 > +                    " outside the source tree ('%s')",
 > +                    qemu_srctree_path)


why not use f-strings ? :)


 > +                self._logger.warning(
 > +                    "         Importing instead from '%s'",
 > +                    spec_path.parents[1])
 > +

Also, I'd move this new chunk of code to a separate function (may be even 
out of class, as the only usage of self is self._logger, which you introduce 
with this patch. Still a method would be OK too). And then, just call it from 
__init__(). Just to keep init_directories() simpler. And with this new code we 
don't init any directories to pass to further test execution, it's just a check 
for runtime environment.


I wanted to keep the wiggly python import logic all in one place so that it was 
harder to accidentally forget a piece of it if/when we adjust it.


Hmm right.. I didn't look from that point of view.

So, we actually check that the library we are using now is the same one we pass
to the called tests.

So, it's the right place for it. And it's about the fact that we are still
hacking around importing libraries :) Hope for a bright future.


I can create a standalone function for it, but I'd need to stash that 
qemu_srctree_path variable somewhere if we want to call that runtime check from 
somewhere else, because I don't want to compute it twice. Is it still worth 
doing in your opinion if I just create a method/function and pass it the 
qemu_srctree_path variable straight from init_directories()?


My first impression was that init_directories() is not the right place. But now
I see that we want to check exactly this qemu_srctree_path, which we are going
to pass to the tests.

So, I'm OK as is.

Still, maybe adding a helper function like

def warn_if_module_loads_not_from(module_name, path)

worth d

Re: [PATCH v2] hw/nvme: Return error for fused operations

2021-09-23 Thread Klaus Jensen
On Sep 15 17:43, Pankaj Raghav wrote:
> Currently, FUSED operations are not supported by QEMU. As per the 1.4 SPEC,
> controller should abort the command that requested a fused operation with 
> an INVALID FIELD error code if they are not supported.
> 
> Changes from v1:
> Added FUSE flag check also to the admin cmd processing as the FUSED 
> operations are mentioned in the general SQE section in the SPEC. 
> 
> Signed-off-by: Pankaj Raghav 
> ---
>  hw/nvme/ctrl.c | 8 
>  1 file changed, 8 insertions(+)
> 
> diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c
> index dc0e7b0030..2f247a9275 100644
> --- a/hw/nvme/ctrl.c
> +++ b/hw/nvme/ctrl.c
> @@ -3893,6 +3893,10 @@ static uint16_t nvme_io_cmd(NvmeCtrl *n, NvmeRequest 
> *req)
>  return ns->status;
>  }
>  
> +if (NVME_CMD_FLAGS_FUSE(req->cmd.flags)) {
> +return NVME_INVALID_FIELD;
> +}
> +
>  req->ns = ns;
>  
>  switch (req->cmd.opcode) {
> @@ -5475,6 +5479,10 @@ static uint16_t nvme_admin_cmd(NvmeCtrl *n, 
> NvmeRequest *req)
>  return NVME_INVALID_FIELD | NVME_DNR;
>  }
>  
> +if (NVME_CMD_FLAGS_FUSE(req->cmd.flags)) {
> +return NVME_INVALID_FIELD;
> +}
> +
>  switch (req->cmd.opcode) {
>  case NVME_ADM_CMD_DELETE_SQ:
>  return nvme_del_sq(n, req);
> -- 
> 2.25.1

Applied to nvme-next. Thanks!




Re: [PATCH] hw/nvme: reattach subsystem namespaces on hotplug

2021-09-23 Thread Klaus Jensen
On Sep  9 13:37, Hannes Reinecke wrote:
> On 9/9/21 12:47 PM, Klaus Jensen wrote:
> > On Sep  9 11:43, Hannes Reinecke wrote:
> >> With commit 5ffbaeed16 ("hw/nvme: fix controller hot unplugging")
> >> namespaces get moved from the controller to the subsystem if one
> >> is specified.
> >> That keeps the namespaces alive after a controller hot-unplug, but
> >> after a controller hotplug we have to reconnect the namespaces
> >> from the subsystem to the controller.
> >>
> >> Fixes: 5ffbaeed16 ("hw/nvme: fix controller hot unplugging")
> >> Cc: Klaus Jensen 
> >> Signed-off-by: Hannes Reinecke 
> >> ---
> >>  hw/nvme/subsys.c | 8 +++-
> >>  1 file changed, 7 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/hw/nvme/subsys.c b/hw/nvme/subsys.c
> >> index 93c35950d6..a9404f2b5e 100644
> >> --- a/hw/nvme/subsys.c
> >> +++ b/hw/nvme/subsys.c
> >> @@ -14,7 +14,7 @@
> >>  int nvme_subsys_register_ctrl(NvmeCtrl *n, Error **errp)
> >>  {
> >>  NvmeSubsystem *subsys = n->subsys;
> >> -int cntlid;
> >> +int cntlid, nsid;
> >>  
> >>  for (cntlid = 0; cntlid < ARRAY_SIZE(subsys->ctrls); cntlid++) {
> >>  if (!subsys->ctrls[cntlid]) {
> >> @@ -29,12 +29,18 @@ int nvme_subsys_register_ctrl(NvmeCtrl *n, Error 
> >> **errp)
> >>  
> >>  subsys->ctrls[cntlid] = n;
> >>  
> >> +for (nsid = 0; nsid < ARRAY_SIZE(subsys->namespaces); nsid++) {
> >> +if (subsys->namespaces[nsid]) {
> >> +nvme_attach_ns(n, subsys->namespaces[nsid]);
> >> +}
> > 
> > Thanks Hannes! I like it, keeping things simple.
> > 
> > But we should only attach namespaces that have the shared property or
> > have ns->attached == 0. Non-shared namespaces may already be attached to
> > another controller in the subsystem.
> > 
> 
> Well ... I tried to avoid that subject, but as you brought it up:
> There is a slightly tricky issue in fabrics, in that the 'controller' is
> independent from the 'connection'.
> The 'shared' bit in the CMIC setting indicates that the subsystem may
> have more than one _controller_. It doesn't talk about how many
> _connections_ a controller may support; that then is the realm of
> dynamic or static controllers, which we don't talk about :-).
> Sufficient to say, linux only implements the dynamic controller model,
> so every connection will be to a different controller.
> But you have been warned :-)
> 
> However, the 'CMIC' setting is independent on the 'NMIC' setting (ie the
> 'shared' bit in the namespace).
> Which leads to the interesting question on how exactly should one handle
> non-shared namespaces in subsystems for which there are multiple
> controllers. Especially as the NSID space is per subsystem, so each
> controller will be able to figure out if there are blanked-out namespaces.
> So to make this a sane setup I would propose to set the 'shared' option
> automatically whenever the controller has the 'subsys' option set.
> And actually, I would ditch the 'shared' option completely, and make it
> dependent on the 'subsys' setting for the controller.
> Much like we treat the 'CMIC' setting nowadays.
> That avoids lots of confusions, and also make the implementation _way_
> easier.
> 

I see your point. Unfortunately, there is no easy way to ditch shared=
now. But I think it'd be good enough to attach to shared automatically,
but not to "shared=off".

I've already ditched the shared parameter on my RFC refactor series and
always having the namespaces shared.

> > However...
> > 
> > The spec says that "The attach and detach operations are persistent
> > across all reset events.". This means that we should track those events
> > in the subsystem and only reattach namespaces that were attached at the
> > time of the "reset" event. Currently, we don't have anything mapping
> > that state. But the device already has to take some liberties with
> > regard to stuff that is considered persistent by the spec (SMART log
> > etc.) since we do not have any way to store stuff persistently across
> > qemu invocations, so I think the above is an acceptable compromise.
> > 
> Careful. 'attach' and 'detach' are MI (management interface) operations.
> If and how many namespaces are visible to any given controllers is
> actually independent on that; and, in fact, controllers might not even
> implement 'attach' or 'detach'.
> But I do agree that we don't handle the 'reset' state properly.
> 

The Namespace Attachment command has nothing to do with MI? And the QEMU
controller does support attaching and detaching namespaces.

> > A potential (as good as it gets) fix would be to keep a map/list of
> > "persistently" attached controllers on the namespaces and re-attach
> > according to that when we see that controller joining the subsystem
> > again. CNTLID would be the obvious choice for the key here, but problem
> > is that we cant really use it since we assign it sequentially from the
> > subsystem, which now looks like a pretty bad choice. CNTLID should have
> > been a require

Re: [PATCH v6 06/11] block: make BlockLimits::max_pwrite_zeroes 64bit

2021-09-23 Thread Eric Blake
On Fri, Sep 03, 2021 at 01:28:02PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> We are going to support 64 bit write-zeroes requests. Now update the
> limit variable. It's absolutely safe. The variable is set in some
> drivers, and used in bdrv_co_do_pwrite_zeroes().
> 
> Update also max_write_zeroes variable in bdrv_co_do_pwrite_zeroes(), so
> that bdrv_co_do_pwrite_zeroes() is now prepared to 64bit requests. The
> remaining logic including num, offset and bytes variables is already
> supporting 64bit requests.
> 
> So the only thing that prevents 64 bit requests is limiting
> max_write_zeroes variable to INT_MAX in bdrv_co_do_pwrite_zeroes().
> We'll drop this limitation after updating all block drivers.
> 
> Ah, we also have bdrv_check_request32() in bdrv_co_pwritev_part(). It
> will be modified to do bdrv_check_request() for write-zeroes path.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
>  include/block/block_int.h | 9 +
>  block/io.c| 2 +-
>  2 files changed, 6 insertions(+), 5 deletions(-)

Reviewed-by: Eric Blake 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org




Re: [PATCH v2 2/6] iotests: add warning for rogue 'qemu' packages

2021-09-23 Thread John Snow
On Thu, Sep 23, 2021 at 2:32 PM Vladimir Sementsov-Ogievskiy <
vsement...@virtuozzo.com> wrote:

> 23.09.2021 21:07, John Snow wrote:
> > Add a warning for when 'iotests' runs against a qemu namespace that
> > isn't the one in the source tree. This might occur if you have
> > (accidentally) installed the Python namespace package to your local
> > packages.
> >
> > (I'm not going to say that this is because I bit myself with this,
> > but you can fill in the blanks.)
> >
> > In the future, we will pivot to always preferring a specific installed
> > instance of qemu python packages managed directly by iotests. For now
> > simply warn if there is an ambiguity over which instance that iotests
> > might use.
> >
> > Example: If a user has navigated to ~/src/qemu/python and executed
> > `pip install .`, you will see output like this when running `./check`:
> >
> > WARNING: 'qemu' python packages will be imported from outside the source
> tree ('/home/jsnow/src/qemu/python')
> >   Importing instead from
> '/home/jsnow/.local/lib/python3.9/site-packages/qemu'
> >
> > Signed-off-by: John Snow 
> > ---
> >   tests/qemu-iotests/testenv.py | 24 
> >   1 file changed, 24 insertions(+)
> >
> > diff --git a/tests/qemu-iotests/testenv.py
> b/tests/qemu-iotests/testenv.py
> > index 99a57a69f3a..1c0f6358538 100644
> > --- a/tests/qemu-iotests/testenv.py
> > +++ b/tests/qemu-iotests/testenv.py
> > @@ -16,6 +16,8 @@
> >   # along with this program.  If not, see  >.
> >   #
> >
> > +import importlib.util
> > +import logging
> >   import os
> >   import sys
> >   import tempfile
> > @@ -112,6 +114,27 @@ def init_directories(self) -> None:
> >   # Path where qemu goodies live in this source tree.
> >   qemu_srctree_path = Path(__file__, '../../../python').resolve()
> >
> > +# Warn if we happen to be able to find qemu namespace packages
> > +# (using qemu.qmp as a bellwether) from an unexpected location.
> > +# i.e. the package is already installed in the user's
> environment.
> > +try:
> > +qemu_spec = importlib.util.find_spec('qemu.qmp')
> > +except ModuleNotFoundError:
> > +qemu_spec = None
> > +
> > +if qemu_spec and qemu_spec.origin:
> > +spec_path = Path(qemu_spec.origin)
> > +try:
> > +_ = spec_path.relative_to(qemu_srctree_path)
>
> It took some time and looking at specification trying to understand what's
> going on here :)
>
> Could we just use:
>
> if not Path(qemu_spec.origin).is_relative_to(qemu_srctree_path):
> ... logging ...
>
>
Nope, that's 3.9+ only. (I made the same mistake.)


> > +except ValueError:
>
> +self._logger.warning(
> > +"WARNING: 'qemu' python packages will be imported
> from"
> > +" outside the source tree ('%s')",
> > +qemu_srctree_path)
> > +self._logger.warning(
> > +" Importing instead from '%s'",
> > +spec_path.parents[1])
> > +
>
> Also, I'd move this new chunk of code to a separate function (may be even
> out of class, as the only usage of self is self._logger, which you
> introduce with this patch. Still a method would be OK too). And then, just
> call it from __init__(). Just to keep init_directories() simpler. And with
> this new code we don't init any directories to pass to further test
> execution, it's just a check for runtime environment.
>
>
I wanted to keep the wiggly python import logic all in one place so that it
was harder to accidentally forget a piece of it if/when we adjust it. I can
create a standalone function for it, but I'd need to stash that
qemu_srctree_path variable somewhere if we want to call that runtime check
from somewhere else, because I don't want to compute it twice. Is it still
worth doing in your opinion if I just create a method/function and pass it
the qemu_srctree_path variable straight from init_directories()?

Not adding _logger is valid, though. I almost removed it myself. I'll
squash that in.


> >   self.pythonpath = os.pathsep.join(filter(None, (
> >   self.source_iotests,
> >   str(qemu_srctree_path),
> > @@ -230,6 +253,7 @@ def __init__(self, imgfmt: str, imgproto: str,
> aiomode: str,
> >
> >   self.build_root = os.path.join(self.build_iotests, '..', '..')
> >
> > +self._logger = logging.getLogger('qemu.iotests')
> >   self.init_directories()
> >   self.init_binaries()
> >
> >
>
>
> --
> Best regards,
> Vladimir
>
>


Re: [PATCH v2 2/6] iotests: add warning for rogue 'qemu' packages

2021-09-23 Thread Vladimir Sementsov-Ogievskiy

23.09.2021 21:07, John Snow wrote:

Add a warning for when 'iotests' runs against a qemu namespace that
isn't the one in the source tree. This might occur if you have
(accidentally) installed the Python namespace package to your local
packages.

(I'm not going to say that this is because I bit myself with this,
but you can fill in the blanks.)

In the future, we will pivot to always preferring a specific installed
instance of qemu python packages managed directly by iotests. For now
simply warn if there is an ambiguity over which instance that iotests
might use.

Example: If a user has navigated to ~/src/qemu/python and executed
`pip install .`, you will see output like this when running `./check`:

WARNING: 'qemu' python packages will be imported from outside the source tree 
('/home/jsnow/src/qemu/python')
  Importing instead from 
'/home/jsnow/.local/lib/python3.9/site-packages/qemu'

Signed-off-by: John Snow 
---
  tests/qemu-iotests/testenv.py | 24 
  1 file changed, 24 insertions(+)

diff --git a/tests/qemu-iotests/testenv.py b/tests/qemu-iotests/testenv.py
index 99a57a69f3a..1c0f6358538 100644
--- a/tests/qemu-iotests/testenv.py
+++ b/tests/qemu-iotests/testenv.py
@@ -16,6 +16,8 @@
  # along with this program.  If not, see .
  #
  
+import importlib.util

+import logging
  import os
  import sys
  import tempfile
@@ -112,6 +114,27 @@ def init_directories(self) -> None:
  # Path where qemu goodies live in this source tree.
  qemu_srctree_path = Path(__file__, '../../../python').resolve()
  
+# Warn if we happen to be able to find qemu namespace packages

+# (using qemu.qmp as a bellwether) from an unexpected location.
+# i.e. the package is already installed in the user's environment.
+try:
+qemu_spec = importlib.util.find_spec('qemu.qmp')
+except ModuleNotFoundError:
+qemu_spec = None
+
+if qemu_spec and qemu_spec.origin:
+spec_path = Path(qemu_spec.origin)
+try:
+_ = spec_path.relative_to(qemu_srctree_path)


It took some time and looking at specification trying to understand what's 
going on here :)

Could we just use:

if not Path(qemu_spec.origin).is_relative_to(qemu_srctree_path):
   ... logging ...



+except ValueError:
+self._logger.warning(
+"WARNING: 'qemu' python packages will be imported from"
+" outside the source tree ('%s')",
+qemu_srctree_path)
+self._logger.warning(
+" Importing instead from '%s'",
+spec_path.parents[1])
+


Also, I'd move this new chunk of code to a separate function (may be even out 
of class, as the only usage of self is self._logger, which you introduce with 
this patch. Still a method would be OK too). And then, just call it from 
__init__(). Just to keep init_directories() simpler. And with this new code we 
don't init any directories to pass to further test execution, it's just a check 
for runtime environment.


  self.pythonpath = os.pathsep.join(filter(None, (
  self.source_iotests,
  str(qemu_srctree_path),
@@ -230,6 +253,7 @@ def __init__(self, imgfmt: str, imgproto: str, aiomode: str,
  
  self.build_root = os.path.join(self.build_iotests, '..', '..')
  
+self._logger = logging.getLogger('qemu.iotests')

  self.init_directories()
  self.init_binaries()
  




--
Best regards,
Vladimir



[PATCH v2 5/6] iotests/migrate-bitmaps-test: delint

2021-09-23 Thread John Snow
Mostly uninteresting stuff. Move the test injections under a function
named main() so that the variables used during that process aren't in
the global scope.

Signed-off-by: John Snow 
Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Hanna Reitz 
Reviewed-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Kevin Wolf 
---
 tests/qemu-iotests/tests/migrate-bitmaps-test | 50 +++
 1 file changed, 28 insertions(+), 22 deletions(-)

diff --git a/tests/qemu-iotests/tests/migrate-bitmaps-test 
b/tests/qemu-iotests/tests/migrate-bitmaps-test
index dc431c35b35..c23df3d75c7 100755
--- a/tests/qemu-iotests/tests/migrate-bitmaps-test
+++ b/tests/qemu-iotests/tests/migrate-bitmaps-test
@@ -19,10 +19,11 @@
 # along with this program.  If not, see .
 #
 
-import os
 import itertools
 import operator
+import os
 import re
+
 import iotests
 from iotests import qemu_img, qemu_img_create, Timeout
 
@@ -224,25 +225,6 @@ def inject_test_case(klass, suffix, method, *args, 
**kwargs):
 setattr(klass, 'test_' + method + suffix, lambda self: mc(self))
 
 
-for cmb in list(itertools.product((True, False), repeat=5)):
-name = ('_' if cmb[0] else '_not_') + 'persistent_'
-name += ('_' if cmb[1] else '_not_') + 'migbitmap_'
-name += '_online' if cmb[2] else '_offline'
-name += '_shared' if cmb[3] else '_nonshared'
-if cmb[4]:
-name += '__pre_shutdown'
-
-inject_test_case(TestDirtyBitmapMigration, name, 'do_test_migration',
- *list(cmb))
-
-for cmb in list(itertools.product((True, False), repeat=2)):
-name = ('_' if cmb[0] else '_not_') + 'persistent_'
-name += ('_' if cmb[1] else '_not_') + 'migbitmap'
-
-inject_test_case(TestDirtyBitmapMigration, name,
- 'do_test_migration_resume_source', *list(cmb))
-
-
 class TestDirtyBitmapBackingMigration(iotests.QMPTestCase):
 def setUp(self):
 qemu_img_create('-f', iotests.imgfmt, base_a, size)
@@ -304,6 +286,30 @@ class TestDirtyBitmapBackingMigration(iotests.QMPTestCase):
 self.assert_qmp(result, 'return', {})
 
 
+def main() -> None:
+for cmb in list(itertools.product((True, False), repeat=5)):
+name = ('_' if cmb[0] else '_not_') + 'persistent_'
+name += ('_' if cmb[1] else '_not_') + 'migbitmap_'
+name += '_online' if cmb[2] else '_offline'
+name += '_shared' if cmb[3] else '_nonshared'
+if cmb[4]:
+name += '__pre_shutdown'
+
+inject_test_case(TestDirtyBitmapMigration, name, 'do_test_migration',
+ *list(cmb))
+
+for cmb in list(itertools.product((True, False), repeat=2)):
+name = ('_' if cmb[0] else '_not_') + 'persistent_'
+name += ('_' if cmb[1] else '_not_') + 'migbitmap'
+
+inject_test_case(TestDirtyBitmapMigration, name,
+ 'do_test_migration_resume_source', *list(cmb))
+
+iotests.main(
+supported_fmts=['qcow2'],
+supported_protocols=['file']
+)
+
+
 if __name__ == '__main__':
-iotests.main(supported_fmts=['qcow2'],
- supported_protocols=['file'])
+main()
-- 
2.31.1




[PATCH v2 6/6] iotests: Update for pylint 2.11.1

2021-09-23 Thread John Snow
1. Ignore the new f-strings warning; we're not interested in doing a
   full conversion at this time.

2. Just mute the unbalanced-tuple-unpacking warning; it's not a real
   error in this case, and muting the dozens of callsites individually
   is just not worth it.

3. Add encodings to read_text().
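
For point 3, a minimal sketch of the change (generic pathlib usage, not the
exact testrunner.py code):

    from pathlib import Path

    p = Path('297.notrun')    # hypothetical status file
    if p.exists():
        # Without an explicit encoding, read_text() uses the locale encoding;
        # passing encoding='utf-8' makes the result deterministic and keeps
        # newer pylint, which warns about implicit encodings, quiet.
        reason = p.read_text(encoding='utf-8').strip()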

Signed-off-by: John Snow 
Reviewed-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Kevin Wolf 
---
 tests/qemu-iotests/pylintrc  | 6 +-
 tests/qemu-iotests/testrunner.py | 7 ---
 2 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/tests/qemu-iotests/pylintrc b/tests/qemu-iotests/pylintrc
index f2c0b522ac0..8cb4e1d6a6d 100644
--- a/tests/qemu-iotests/pylintrc
+++ b/tests/qemu-iotests/pylintrc
@@ -19,13 +19,17 @@ disable=invalid-name,
 too-many-public-methods,
 # pylint warns about Optional[] etc. as unsubscriptable in 3.9
 unsubscriptable-object,
+# pylint's static analysis causes false positives for file_path();
+# If we really care to make it statically knowable, we'll use mypy.
+unbalanced-tuple-unpacking,
 # Sometimes we need to disable a newly introduced pylint warning.
 # Doing so should not produce a warning in older versions of pylint.
 bad-option-value,
 # These are temporary, and should be removed:
 missing-docstring,
 too-many-return-statements,
-too-many-statements
+too-many-statements,
+consider-using-f-string,
 
 [FORMAT]
 
diff --git a/tests/qemu-iotests/testrunner.py b/tests/qemu-iotests/testrunner.py
index 4a6ec421ed6..a56b6da3968 100644
--- a/tests/qemu-iotests/testrunner.py
+++ b/tests/qemu-iotests/testrunner.py
@@ -266,12 +266,13 @@ def do_run_test(self, test: str) -> TestResult:
   diff=file_diff(str(f_reference), str(f_bad)))
 
 if f_notrun.exists():
-return TestResult(status='not run',
-  description=f_notrun.read_text().strip())
+return TestResult(
+status='not run',
+description=f_notrun.read_text(encoding='utf-8').strip())
 
 casenotrun = ''
 if f_casenotrun.exists():
-casenotrun = f_casenotrun.read_text()
+casenotrun = f_casenotrun.read_text(encoding='utf-8')
 
 diff = file_diff(str(f_reference), str(f_bad))
 if diff:
-- 
2.31.1




[PATCH v2 4/6] iotests/mirror-top-perms: Adjust imports

2021-09-23 Thread John Snow
We need to import subpackages from the qemu namespace package; importing
the namespace package alone doesn't bring the subpackages with it --
unless someone else (like iotests.py) imports them too.

Adjust the imports.
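
A short sketch of the distinction, assuming the 'qemu' namespace package
layout used here:

    # 'qemu' is a namespace package: importing it does not import its
    # subpackages, so qemu.machine and qemu.qmp are only reachable if some
    # other module has already imported them.
    import qemu
    # hasattr(qemu, 'machine') is False at this point, unless e.g.
    # iotests.py has already imported qemu.machine as a side effect.

    # Importing the subpackages explicitly always works:
    from qemu import qmp
    from qemu.machine import machine

    # Now machine.AbnormalShutdown and qmp.QMPConnectError can be referenced
    # without relying on import side effects elsewhere.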

Signed-off-by: John Snow 
Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Hanna Reitz 
Reviewed-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Kevin Wolf 
---
 tests/qemu-iotests/tests/mirror-top-perms | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/tests/qemu-iotests/tests/mirror-top-perms 
b/tests/qemu-iotests/tests/mirror-top-perms
index 73138a0ef91..3d475aa3a54 100755
--- a/tests/qemu-iotests/tests/mirror-top-perms
+++ b/tests/qemu-iotests/tests/mirror-top-perms
@@ -21,7 +21,8 @@
 
 import os
 
-import qemu
+from qemu import qmp
+from qemu.machine import machine
 
 import iotests
 from iotests import qemu_img
@@ -46,7 +47,7 @@ class TestMirrorTopPerms(iotests.QMPTestCase):
 def tearDown(self):
 try:
 self.vm.shutdown()
-except qemu.machine.machine.AbnormalShutdown:
+except machine.AbnormalShutdown:
 pass
 
 if self.vm_b is not None:
@@ -101,7 +102,7 @@ class TestMirrorTopPerms(iotests.QMPTestCase):
 self.vm_b.launch()
 print('ERROR: VM B launched successfully, this should not have '
   'happened')
-except qemu.qmp.QMPConnectError:
+except qmp.QMPConnectError:
 assert 'Is another process using the image' in self.vm_b.get_log()
 
 result = self.vm.qmp('block-job-cancel',
-- 
2.31.1




[PATCH v2 1/6] iotests: add 'qemu' package location to PYTHONPATH in testenv

2021-09-23 Thread John Snow
We can drop the sys.path hacking in various places by doing
this. Additionally, by doing it in one place right up top, we can print
interesting warnings in case the environment does not look correct. (See
next commit.)

If we ever decide to change how the environment is crafted, all of the
"help me find my python packages" goop is in one place, right in one
function.
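
As a rough sketch, the assembled path ends up looking like this (the
placeholder paths stand in for the real values computed in testenv.py):

    import os
    from pathlib import Path

    source_iotests = '/home/user/src/qemu/tests/qemu-iotests'   # placeholder
    qemu_srctree_path = Path(__file__, '../../../python').resolve()

    # filter(None, ...) drops empty entries, e.g. an unset PYTHONPATH.
    pythonpath = os.pathsep.join(filter(None, (
        source_iotests,
        str(qemu_srctree_path),
        os.getenv('PYTHONPATH'),
    )))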

Signed-off-by: John Snow 
Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Kevin Wolf 
---
 tests/qemu-iotests/235|  2 --
 tests/qemu-iotests/297|  6 --
 tests/qemu-iotests/300|  7 +++
 tests/qemu-iotests/iotests.py |  2 --
 tests/qemu-iotests/testenv.py | 15 +--
 tests/qemu-iotests/tests/mirror-top-perms |  7 +++
 6 files changed, 15 insertions(+), 24 deletions(-)

diff --git a/tests/qemu-iotests/235 b/tests/qemu-iotests/235
index 8aed45f9a76..4de920c3801 100755
--- a/tests/qemu-iotests/235
+++ b/tests/qemu-iotests/235
@@ -24,8 +24,6 @@ import os
 import iotests
 from iotests import qemu_img_create, qemu_io, file_path, log
 
-sys.path.append(os.path.join(os.path.dirname(__file__), '..', '..', 'python'))
-
 from qemu.machine import QEMUMachine
 
 iotests.script_initialize(supported_fmts=['qcow2'])
diff --git a/tests/qemu-iotests/297 b/tests/qemu-iotests/297
index b04cba53667..467b712280e 100755
--- a/tests/qemu-iotests/297
+++ b/tests/qemu-iotests/297
@@ -68,12 +68,6 @@ def run_linters():
 # Todo notes are fine, but fixme's or xxx's should probably just be
 # fixed (in tests, at least)
 env = os.environ.copy()
-qemu_module_path = os.path.join(os.path.dirname(__file__),
-'..', '..', 'python')
-try:
-env['PYTHONPATH'] += os.pathsep + qemu_module_path
-except KeyError:
-env['PYTHONPATH'] = qemu_module_path
 subprocess.run(('pylint-3', '--score=n', '--notes=FIXME,XXX', *files),
env=env, check=False)
 
diff --git a/tests/qemu-iotests/300 b/tests/qemu-iotests/300
index fe94de84edd..10f9f2a8da6 100755
--- a/tests/qemu-iotests/300
+++ b/tests/qemu-iotests/300
@@ -24,12 +24,11 @@ import random
 import re
 from typing import Dict, List, Optional
 
-import iotests
-
-# Import qemu after iotests.py has amended sys.path
-# pylint: disable=wrong-import-order
 from qemu.machine import machine
 
+import iotests
+
+
 BlockBitmapMapping = List[Dict[str, object]]
 
 mig_sock = os.path.join(iotests.sock_dir, 'mig_sock')
diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
index ce06cf56304..b06ad76e0c5 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -36,8 +36,6 @@
 
 from contextlib import contextmanager
 
-# pylint: disable=import-error, wrong-import-position
-sys.path.append(os.path.join(os.path.dirname(__file__), '..', '..', 'python'))
 from qemu.machine import qtest
 from qemu.qmp import QMPMessage
 
diff --git a/tests/qemu-iotests/testenv.py b/tests/qemu-iotests/testenv.py
index 70da0d60c80..99a57a69f3a 100644
--- a/tests/qemu-iotests/testenv.py
+++ b/tests/qemu-iotests/testenv.py
@@ -108,12 +108,15 @@ def init_directories(self) -> None:
  SAMPLE_IMG_DIR
  OUTPUT_DIR
 """
-self.pythonpath = os.getenv('PYTHONPATH')
-if self.pythonpath:
-self.pythonpath = self.source_iotests + os.pathsep + \
-self.pythonpath
-else:
-self.pythonpath = self.source_iotests
+
+# Path where qemu goodies live in this source tree.
+qemu_srctree_path = Path(__file__, '../../../python').resolve()
+
+self.pythonpath = os.pathsep.join(filter(None, (
+self.source_iotests,
+str(qemu_srctree_path),
+os.getenv('PYTHONPATH'),
+)))
 
 self.test_dir = os.getenv('TEST_DIR',
   os.path.join(os.getcwd(), 'scratch'))
diff --git a/tests/qemu-iotests/tests/mirror-top-perms 
b/tests/qemu-iotests/tests/mirror-top-perms
index 2fc8dd66e0a..73138a0ef91 100755
--- a/tests/qemu-iotests/tests/mirror-top-perms
+++ b/tests/qemu-iotests/tests/mirror-top-perms
@@ -20,13 +20,12 @@
 #
 
 import os
+
+import qemu
+
 import iotests
 from iotests import qemu_img
 
-# Import qemu after iotests.py has amended sys.path
-# pylint: disable=wrong-import-order
-import qemu
-
 
 image_size = 1 * 1024 * 1024
 source = os.path.join(iotests.test_dir, 'source.img')
-- 
2.31.1




[PATCH v2 3/6] iotests/linters: check mypy files all at once

2021-09-23 Thread John Snow
We can circumvent the '__main__' redefinition problem by passing
--scripts-are-modules. Take mypy out of the loop per-filename and check
everything in one go: it's quite a bit faster.
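
A sketch of the single-invocation form (the file list and flag set are
trimmed placeholders, not the full 297 configuration):

    import subprocess
    import sys

    files = ['297', 'iotests.py', 'testenv.py']   # placeholder list
    p = subprocess.run(('mypy', '--namespace-packages', '--scripts-are-modules',
                        *files),
                       stdout=subprocess.PIPE, stderr=subprocess.STDOUT,
                       universal_newlines=True, check=False)
    if p.returncode != 0:
        print(p.stdout)
        sys.exit(1)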

Signed-off-by: John Snow 
Reviewed-by: Hanna Reitz 
Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Kevin Wolf 
---
 tests/qemu-iotests/297 | 44 +++---
 1 file changed, 20 insertions(+), 24 deletions(-)

diff --git a/tests/qemu-iotests/297 b/tests/qemu-iotests/297
index 467b712280e..91ec34d9521 100755
--- a/tests/qemu-iotests/297
+++ b/tests/qemu-iotests/297
@@ -74,32 +74,28 @@ def run_linters():
 print('=== mypy ===')
 sys.stdout.flush()
 
-# We have to call mypy separately for each file.  Otherwise, it
-# will interpret all given files as belonging together (i.e., they
-# may not both define the same classes, etc.; most notably, they
-# must not both define the __main__ module).
 env['MYPYPATH'] = env['PYTHONPATH']
-for filename in files:
-p = subprocess.run(('mypy',
-'--warn-unused-configs',
-'--disallow-subclassing-any',
-'--disallow-any-generics',
-'--disallow-incomplete-defs',
-'--disallow-untyped-decorators',
-'--no-implicit-optional',
-'--warn-redundant-casts',
-'--warn-unused-ignores',
-'--no-implicit-reexport',
-'--namespace-packages',
-filename),
-   env=env,
-   check=False,
-   stdout=subprocess.PIPE,
-   stderr=subprocess.STDOUT,
-   universal_newlines=True)
+p = subprocess.run(('mypy',
+'--warn-unused-configs',
+'--disallow-subclassing-any',
+'--disallow-any-generics',
+'--disallow-incomplete-defs',
+'--disallow-untyped-decorators',
+'--no-implicit-optional',
+'--warn-redundant-casts',
+'--warn-unused-ignores',
+'--no-implicit-reexport',
+'--namespace-packages',
+'--scripts-are-modules',
+*files),
+   env=env,
+   check=False,
+   stdout=subprocess.PIPE,
+   stderr=subprocess.STDOUT,
+   universal_newlines=True)
 
-if p.returncode != 0:
-print(p.stdout)
+if p.returncode != 0:
+print(p.stdout)
 
 
 for linter in ('pylint-3', 'mypy'):
-- 
2.31.1




[PATCH v2 0/6] iotests: update environment and linting configuration

2021-09-23 Thread John Snow
GitLab: https://gitlab.com/jsnow/qemu/-/commits/python-package-iotest-pt1
CI: https://gitlab.com/jsnow/qemu/-/pipelines/376236687

This series partially supersedes:
  '[PATCH v3 00/16] python/iotests: Run iotest linters during Python CI'

Howdy, this is good stuff we want even if we aren't yet in agreement
about the best way to run iotest 297 from CI.

- Update linting config to tolerate pylint 2.11.1
- Eliminate sys.path hacking in individual test files
- Make mypy execution in test 297 faster

The rest of the actual "run at CI time" stuff can get handled separately
and later pending some discussion on the other series.

V2:

001/6:[0011] [FC] 'iotests: add 'qemu' package location to PYTHONPATH in 
testenv'
002/6:[0025] [FC] 'iotests: add warning for rogue 'qemu' packages'

- Squashed in a small optimization from Vladimir to 001, kept R-Bs.
- Fixed the package detection logic to not panic if it can't find
  'qemu' at all (kwolf)
- Updated commit messages for the first two patches.

--js

John Snow (6):
  iotests: add 'qemu' package location to PYTHONPATH in testenv
  iotests: add warning for rogue 'qemu' packages
  iotests/linters: check mypy files all at once
  iotests/mirror-top-perms: Adjust imports
  iotests/migrate-bitmaps-test: delint
  iotests: Update for pylint 2.11.1

 tests/qemu-iotests/235|  2 -
 tests/qemu-iotests/297| 50 ---
 tests/qemu-iotests/300|  7 ++-
 tests/qemu-iotests/iotests.py |  2 -
 tests/qemu-iotests/pylintrc   |  6 ++-
 tests/qemu-iotests/testenv.py | 39 ---
 tests/qemu-iotests/testrunner.py  |  7 +--
 tests/qemu-iotests/tests/migrate-bitmaps-test | 50 +++
 tests/qemu-iotests/tests/mirror-top-perms | 12 ++---
 9 files changed, 99 insertions(+), 76 deletions(-)

-- 
2.31.1





[PATCH v2 2/6] iotests: add warning for rogue 'qemu' packages

2021-09-23 Thread John Snow
Add a warning for when 'iotests' runs against a qemu namespace that
isn't the one in the source tree. This might occur if you have
(accidentally) installed the Python namespace package to your local
packages.

(I'm not going to say that this is because I bit myself with this,
but you can fill in the blanks.)

In the future, we will pivot to always preferring a specific installed
instance of the qemu python packages managed directly by iotests. For now,
simply warn if there is an ambiguity over which instance iotests
might use.

Example: If a user has navigated to ~/src/qemu/python and executed
`pip install .`, you will see output like this when running `./check`:

WARNING: 'qemu' python packages will be imported from outside the source tree 
('/home/jsnow/src/qemu/python')
 Importing instead from 
'/home/jsnow/.local/lib/python3.9/site-packages/qemu'
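
To see for yourself which copy your environment would pick up (a quick check,
independent of iotests):

    import importlib.util

    try:
        spec = importlib.util.find_spec('qemu.qmp')
    except ModuleNotFoundError:
        spec = None
    # spec.origin is the file that would actually be imported, e.g.
    # ~/.local/lib/python3.9/site-packages/qemu/qmp/__init__.py
    print(spec.origin if spec else 'qemu.qmp is not importable')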

Signed-off-by: John Snow 
---
 tests/qemu-iotests/testenv.py | 24 
 1 file changed, 24 insertions(+)

diff --git a/tests/qemu-iotests/testenv.py b/tests/qemu-iotests/testenv.py
index 99a57a69f3a..1c0f6358538 100644
--- a/tests/qemu-iotests/testenv.py
+++ b/tests/qemu-iotests/testenv.py
@@ -16,6 +16,8 @@
 # along with this program.  If not, see .
 #
 
+import importlib.util
+import logging
 import os
 import sys
 import tempfile
@@ -112,6 +114,27 @@ def init_directories(self) -> None:
 # Path where qemu goodies live in this source tree.
 qemu_srctree_path = Path(__file__, '../../../python').resolve()
 
+# Warn if we happen to be able to find qemu namespace packages
+# (using qemu.qmp as a bellwether) from an unexpected location.
+# i.e. the package is already installed in the user's environment.
+try:
+qemu_spec = importlib.util.find_spec('qemu.qmp')
+except ModuleNotFoundError:
+qemu_spec = None
+
+if qemu_spec and qemu_spec.origin:
+spec_path = Path(qemu_spec.origin)
+try:
+_ = spec_path.relative_to(qemu_srctree_path)
+except ValueError:
+self._logger.warning(
+"WARNING: 'qemu' python packages will be imported from"
+" outside the source tree ('%s')",
+qemu_srctree_path)
+self._logger.warning(
+" Importing instead from '%s'",
+spec_path.parents[1])
+
 self.pythonpath = os.pathsep.join(filter(None, (
 self.source_iotests,
 str(qemu_srctree_path),
@@ -230,6 +253,7 @@ def __init__(self, imgfmt: str, imgproto: str, aiomode: str,
 
 self.build_root = os.path.join(self.build_iotests, '..', '..')
 
+self._logger = logging.getLogger('qemu.iotests')
 self.init_directories()
 self.init_binaries()
 
-- 
2.31.1




Re: [PATCH] block: introduce max_hw_iov for use in scsi-generic

2021-09-23 Thread Halil Pasic
On Thu, 23 Sep 2021 16:28:11 +0200
Halil Pasic  wrote:

> Can't we use some of the established constants instead of hard coding a
> qemu specific IOV_MAX?
> 
> POSIX.1 seems to guarantee the availability of IOV_MAX in <limits.h>
> according to: https://man7.org/linux/man-pages/man2/readv.2.html
> and <sys/uio.h> may have UIO_MAXIOV defined.

Never mind, the 
#define IOV_MAX 1024
in osdep.h is conditional and I guess we already use IOV_MAX from limits.h
when CONFIG_IOVEC is defined, i.e. when we don't emulate the interface.

Sorry for the noise.

Regards,
Halil



Re: [PATCH 5/6] iotests/migrate-bitmaps-test: delint

2021-09-23 Thread Vladimir Sementsov-Ogievskiy

23.09.2021 03:16, John Snow wrote:

Mostly uninteresting stuff. Move the test injections under a function
named main() so that the variables used during that process aren't in
the global scope.

Signed-off-by: John Snow
Reviewed-by: Philippe Mathieu-Daudé
Reviewed-by: Hanna Reitz


Reviewed-by: Vladimir Sementsov-Ogievskiy 

--
Best regards,
Vladimir



Re: [PATCH 6/6] iotests: Update for pylint 2.11.1

2021-09-23 Thread Vladimir Sementsov-Ogievskiy

23.09.2021 03:16, John Snow wrote:

1. Ignore the new f-strings warning, we're not interested in doing a
full conversion at this time.

2. Just mute the unbalanced-tuple-unpacking warning, it's not a real
error in this case and muting the dozens of callsites is just not
worth it.

3. Add encodings to read_text().

Signed-off-by: John Snow



Reviewed-by: Vladimir Sementsov-Ogievskiy 

--
Best regards,
Vladimir



Re: [PATCH 1/6] iotests: add 'qemu' package location to PYTHONPATH in testenv

2021-09-23 Thread John Snow
On Thu, Sep 23, 2021 at 11:20 AM Vladimir Sementsov-Ogievskiy <
vsement...@virtuozzo.com> wrote:

> 23.09.2021 03:16, John Snow wrote:
> > We can drop the sys.path hacking in various places by doing
> > this. Additionally, by doing it in one place right up top, we can print
> > interesting warnings in case the environment does not look correct.
> >
> > If we ever decide to change how the environment is crafted, all of the
> > "help me find my python packages" goop is all in one place, right in one
> > function.
> >
> > Signed-off-by: John Snow 
>
> Hurrah!!
>
> Reviewed-by: Vladimir Sementsov-Ogievskiy 
>
> > ---
> >   tests/qemu-iotests/235|  2 --
> >   tests/qemu-iotests/297|  6 --
> >   tests/qemu-iotests/300|  7 +++
> >   tests/qemu-iotests/iotests.py |  2 --
> >   tests/qemu-iotests/testenv.py | 14 +-
> >   tests/qemu-iotests/tests/mirror-top-perms |  7 +++
> >   6 files changed, 15 insertions(+), 23 deletions(-)
> >
> > diff --git a/tests/qemu-iotests/235 b/tests/qemu-iotests/235
> > index 8aed45f9a76..4de920c3801 100755
> > --- a/tests/qemu-iotests/235
> > +++ b/tests/qemu-iotests/235
> > @@ -24,8 +24,6 @@ import os
> >   import iotests
> >   from iotests import qemu_img_create, qemu_io, file_path, log
> >
> > -sys.path.append(os.path.join(os.path.dirname(__file__), '..', '..',
> 'python'))
> > -
> >   from qemu.machine import QEMUMachine
> >
> >   iotests.script_initialize(supported_fmts=['qcow2'])
> > diff --git a/tests/qemu-iotests/297 b/tests/qemu-iotests/297
> > index b04cba53667..467b712280e 100755
> > --- a/tests/qemu-iotests/297
> > +++ b/tests/qemu-iotests/297
> > @@ -68,12 +68,6 @@ def run_linters():
> >   # Todo notes are fine, but fixme's or xxx's should probably just be
> >   # fixed (in tests, at least)
> >   env = os.environ.copy()
> > -qemu_module_path = os.path.join(os.path.dirname(__file__),
> > -'..', '..', 'python')
> > -try:
> > -env['PYTHONPATH'] += os.pathsep + qemu_module_path
> > -except KeyError:
> > -env['PYTHONPATH'] = qemu_module_path
> >   subprocess.run(('pylint-3', '--score=n', '--notes=FIXME,XXX',
> *files),
> >  env=env, check=False)
> >
> > diff --git a/tests/qemu-iotests/300 b/tests/qemu-iotests/300
> > index fe94de84edd..10f9f2a8da6 100755
> > --- a/tests/qemu-iotests/300
> > +++ b/tests/qemu-iotests/300
> > @@ -24,12 +24,11 @@ import random
> >   import re
> >   from typing import Dict, List, Optional
> >
> > -import iotests
> > -
> > -# Import qemu after iotests.py has amended sys.path
> > -# pylint: disable=wrong-import-order
> >   from qemu.machine import machine
> >
> > +import iotests
> > +
> > +
> >   BlockBitmapMapping = List[Dict[str, object]]
> >
> >   mig_sock = os.path.join(iotests.sock_dir, 'mig_sock')
> > diff --git a/tests/qemu-iotests/iotests.py
> b/tests/qemu-iotests/iotests.py
> > index ce06cf56304..b06ad76e0c5 100644
> > --- a/tests/qemu-iotests/iotests.py
> > +++ b/tests/qemu-iotests/iotests.py
> > @@ -36,8 +36,6 @@
> >
> >   from contextlib import contextmanager
> >
> > -# pylint: disable=import-error, wrong-import-position
> > -sys.path.append(os.path.join(os.path.dirname(__file__), '..', '..',
> 'python'))
> >   from qemu.machine import qtest
> >   from qemu.qmp import QMPMessage
> >
> > diff --git a/tests/qemu-iotests/testenv.py
> b/tests/qemu-iotests/testenv.py
> > index 70da0d60c80..88104dace90 100644
> > --- a/tests/qemu-iotests/testenv.py
> > +++ b/tests/qemu-iotests/testenv.py
> > @@ -108,12 +108,16 @@ def init_directories(self) -> None:
> >SAMPLE_IMG_DIR
> >OUTPUT_DIR
> >   """
> > +
> > +# Path where qemu goodies live in this source tree.
> > +qemu_srctree_path = Path(__file__, '../../../python').resolve()
> > +
> >   self.pythonpath = os.getenv('PYTHONPATH')
> > -if self.pythonpath:
> > -self.pythonpath = self.source_iotests + os.pathsep + \
> > -self.pythonpath
> > -else:
> > -self.pythonpath = self.source_iotests
> > +self.pythonpath = os.pathsep.join(filter(None, (
> > +self.source_iotests,
> > +str(qemu_srctree_path),
> > +self.pythonpath,
> > +)))
>
> That was simple:)
>
>
Only because a very dedicated engineer from Virtuozzo spent so much time
writing a new iotest launcher ;)


> Hmm, after this you have
>
> self.pythonpath = ...
> self.pythonpath = something other
>
>
> If you have to resend, you may just use os.getenv('PYTHONPATH') inside
> filter(); it seems to be even nicer.
>
>
Ah, yeah. I'll just fold that in. Thanks!


> >
> >   self.test_dir = os.getenv('TEST_DIR',
> > os.path.join(os.getcwd(), 'scratch'))
> > diff --git a/tests/qemu-iotests/tests/mirror-top-perms
> b/tests/qemu-iotests/tests/mirror-to

Re: [PATCH 2/6] iotests: add warning for rogue 'qemu' packages

2021-09-23 Thread John Snow
On Thu, Sep 23, 2021 at 7:09 AM Daniel P. Berrangé 
wrote:

> On Wed, Sep 22, 2021 at 08:16:21PM -0400, John Snow wrote:
> > Add a warning for when 'iotests' runs against a qemu namespace that
> > isn't the one in the source tree. This might occur if you have
> > (accidentally) installed the Python namespace package to your local
> > packages.
>
> IIUC, it is/was a valid use case to run the iotests on arbitrary
> QEMU outside the source tree ie a distro packaged binary set.
>
> Are we not allowing the tests/qemu-iotests/* content to be
> run outside the context of the main source tree for this
> purpose ?
>
> eg  consider if Fedora/RHEL builds put tests/qemu-iotests/* into
> a 'qemu-iotests' RPM, which was installed and used with a distro
> package python-qemu ?
>
>
(1) "isn't the one in the source tree" is imprecise language here. The real
key is that it must be the QEMU namespace located at "
testenv.py/../../../python/qemu". This usually means the source tree, but
it'd work in any identically structured filesystem hierarchy.

(2) The preceding patch might help illustrate this. At present the iotests
expect to be able to find the 'qemu' python packages by navigating
directories starting from their own location (and not CWD). What patch 1
does is to centralize the logic that has to go "searching" for the python
packages into the init_directories method in testenv.py so that each
individual test doesn't have to repeat it.

i.e. before patch 1, iotests.py does this:
sys.path.append(os.path.join(os.path.dirname(__file__), '..', '..',
'python'))

so we'll always crawl to ../../python from wherever iotests.py is.

After patch 1, we're going to crawl to the same location, but starting from
testenv.py instead. testenv.py and iotests.py are in the same directory, so
I expect whatever worked before (and in whatever environment) will continue
working after. I can't say with full certainty in what circumstances we
support running iotests, but I don't think I have introduced any new
limitation with this.
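
For example (hypothetical checkout location), the crawl resolves like this:

    from pathlib import Path

    # if testenv.py lives at /home/user/src/qemu/tests/qemu-iotests/testenv.py:
    p = Path('/home/user/src/qemu/tests/qemu-iotests/testenv.py',
             '../../../python').resolve()
    print(p)   # /home/user/src/qemu/python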


> > (I'm not going to say that this is because I bit myself with this,
> > but. You can fill in the blanks.)
> > Signed-off-by: John Snow 
> > ---
> >  tests/qemu-iotests/testenv.py | 21 -
> >  1 file changed, 20 insertions(+), 1 deletion(-)
> >
> > diff --git a/tests/qemu-iotests/testenv.py
> b/tests/qemu-iotests/testenv.py
> > index 88104dace90..8a43b193af5 100644
> > --- a/tests/qemu-iotests/testenv.py
> > +++ b/tests/qemu-iotests/testenv.py
> > @@ -16,6 +16,8 @@
> >  # along with this program.  If not, see .
> >  #
> >
> > +import importlib.util
> > +import logging
> >  import os
> >  import sys
> >  import tempfile
> > @@ -25,7 +27,7 @@
> >  import random
> >  import subprocess
> >  import glob
> > -from typing import List, Dict, Any, Optional, ContextManager
> > +from typing import List, Dict, Any, Optional, ContextManager, cast
> >
> >  DEF_GDB_OPTIONS = 'localhost:12345'
> >
> > @@ -112,6 +114,22 @@ def init_directories(self) -> None:
> >  # Path where qemu goodies live in this source tree.
> >  qemu_srctree_path = Path(__file__, '../../../python').resolve()
> >
> > +# warn if we happen to be able to find 'qemu' packages from an
> > +# unexpected location (i.e. the package is already installed in
> > +# the user's environment)
> > +qemu_spec = importlib.util.find_spec('qemu.qmp')
> > +if qemu_spec:
> > +spec_path = Path(cast(str, qemu_spec.origin))
> > +try:
> > +_ = spec_path.relative_to(qemu_srctree_path)
> > +except ValueError:
> > +self._logger.warning(
> > +"WARNING: 'qemu' package will be imported from
> outside "
> > +"the source tree!")
> > +self._logger.warning(
> > +"Importing from: '%s'",
> > +spec_path.parents[1])
>
> It feels to me like the scenario we're blocking here is actually
> the scenario we want to aim for.
>
>
It isn't blocking it, it's just a warning. At present, iotests expects that
it's using the version of the python packages that are in the source tree /
tarball / whatever. Since I converted those packages to be installable to
your environment, I introduced an ambiguity about which version iotests is
actually using: the version you installed, or the version in the source
tree? This is merely a warning to let you know that iotests is
technically doing something new. ("Hey, you're running some other python
code than what iotests has done for the past ten years. Are you cool with
that?")

You're right, though: my actual end-goal (after a lot of pending cleanups I
have in-flight, I am getting to it ...!) is to eventually pivot iotests to
always using qemu as an installed package and wean it off of using
directory-crawling to find its dependencies. We are close to doing that.
When we do transition to that, the warni

Re: [PATCH 3/6] iotests/linters: check mypy files all at once

2021-09-23 Thread Vladimir Sementsov-Ogievskiy

23.09.2021 03:16, John Snow wrote:

We can circumvent the '__main__' redefinition problem by passing
--scripts-are-modules. Take mypy out of the loop per-filename and check
everything in one go: it's quite a bit faster.

Signed-off-by: John Snow
Reviewed-by: Hanna Reitz



Reviewed-by: Vladimir Sementsov-Ogievskiy 

--
Best regards,
Vladimir



Re: [PATCH 4/6] iotests/mirror-top-perms: Adjust imports

2021-09-23 Thread Vladimir Sementsov-Ogievskiy

23.09.2021 03:16, John Snow wrote:

We need to import things from the qemu namespace; importing the
namespace alone doesn't bring the submodules with it -- unless someone
else (like iotests.py) imports them too.

Adjust the imports.

Signed-off-by: John Snow
Reviewed-by: Philippe Mathieu-Daudé
Reviewed-by: Hanna Reitz



Reviewed-by: Vladimir Sementsov-Ogievskiy 

--
Best regards,
Vladimir



Re: [PATCH 1/6] iotests: add 'qemu' package location to PYTHONPATH in testenv

2021-09-23 Thread Vladimir Sementsov-Ogievskiy

23.09.2021 03:16, John Snow wrote:

We can drop the sys.path hacking in various places by doing
this. Additionally, by doing it in one place right up top, we can print
interesting warnings in case the environment does not look correct.

If we ever decide to change how the environment is crafted, all of the
"help me find my python packages" goop is all in one place, right in one
function.

Signed-off-by: John Snow 


Hurrah!!

Reviewed-by: Vladimir Sementsov-Ogievskiy 


---
  tests/qemu-iotests/235|  2 --
  tests/qemu-iotests/297|  6 --
  tests/qemu-iotests/300|  7 +++
  tests/qemu-iotests/iotests.py |  2 --
  tests/qemu-iotests/testenv.py | 14 +-
  tests/qemu-iotests/tests/mirror-top-perms |  7 +++
  6 files changed, 15 insertions(+), 23 deletions(-)

diff --git a/tests/qemu-iotests/235 b/tests/qemu-iotests/235
index 8aed45f9a76..4de920c3801 100755
--- a/tests/qemu-iotests/235
+++ b/tests/qemu-iotests/235
@@ -24,8 +24,6 @@ import os
  import iotests
  from iotests import qemu_img_create, qemu_io, file_path, log
  
-sys.path.append(os.path.join(os.path.dirname(__file__), '..', '..', 'python'))

-
  from qemu.machine import QEMUMachine
  
  iotests.script_initialize(supported_fmts=['qcow2'])

diff --git a/tests/qemu-iotests/297 b/tests/qemu-iotests/297
index b04cba53667..467b712280e 100755
--- a/tests/qemu-iotests/297
+++ b/tests/qemu-iotests/297
@@ -68,12 +68,6 @@ def run_linters():
  # Todo notes are fine, but fixme's or xxx's should probably just be
  # fixed (in tests, at least)
  env = os.environ.copy()
-qemu_module_path = os.path.join(os.path.dirname(__file__),
-'..', '..', 'python')
-try:
-env['PYTHONPATH'] += os.pathsep + qemu_module_path
-except KeyError:
-env['PYTHONPATH'] = qemu_module_path
  subprocess.run(('pylint-3', '--score=n', '--notes=FIXME,XXX', *files),
 env=env, check=False)
  
diff --git a/tests/qemu-iotests/300 b/tests/qemu-iotests/300

index fe94de84edd..10f9f2a8da6 100755
--- a/tests/qemu-iotests/300
+++ b/tests/qemu-iotests/300
@@ -24,12 +24,11 @@ import random
  import re
  from typing import Dict, List, Optional
  
-import iotests

-
-# Import qemu after iotests.py has amended sys.path
-# pylint: disable=wrong-import-order
  from qemu.machine import machine
  
+import iotests

+
+
  BlockBitmapMapping = List[Dict[str, object]]
  
  mig_sock = os.path.join(iotests.sock_dir, 'mig_sock')

diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
index ce06cf56304..b06ad76e0c5 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -36,8 +36,6 @@
  
  from contextlib import contextmanager
  
-# pylint: disable=import-error, wrong-import-position

-sys.path.append(os.path.join(os.path.dirname(__file__), '..', '..', 'python'))
  from qemu.machine import qtest
  from qemu.qmp import QMPMessage
  
diff --git a/tests/qemu-iotests/testenv.py b/tests/qemu-iotests/testenv.py

index 70da0d60c80..88104dace90 100644
--- a/tests/qemu-iotests/testenv.py
+++ b/tests/qemu-iotests/testenv.py
@@ -108,12 +108,16 @@ def init_directories(self) -> None:
   SAMPLE_IMG_DIR
   OUTPUT_DIR
  """
+
+# Path where qemu goodies live in this source tree.
+qemu_srctree_path = Path(__file__, '../../../python').resolve()
+
  self.pythonpath = os.getenv('PYTHONPATH')
-if self.pythonpath:
-self.pythonpath = self.source_iotests + os.pathsep + \
-self.pythonpath
-else:
-self.pythonpath = self.source_iotests
+self.pythonpath = os.pathsep.join(filter(None, (
+self.source_iotests,
+str(qemu_srctree_path),
+self.pythonpath,
+)))


That was simple:)

Hmm, after this you have

self.pythonpath = ...
self.pythonpath = something other


If you have to resend, you may just use os.getenv('PYTHONPATH') inside filter(); it
seems to be even nicer.

  
  self.test_dir = os.getenv('TEST_DIR',

os.path.join(os.getcwd(), 'scratch'))
diff --git a/tests/qemu-iotests/tests/mirror-top-perms 
b/tests/qemu-iotests/tests/mirror-top-perms
index 2fc8dd66e0a..73138a0ef91 100755
--- a/tests/qemu-iotests/tests/mirror-top-perms
+++ b/tests/qemu-iotests/tests/mirror-top-perms
@@ -20,13 +20,12 @@
  #
  
  import os

+
+import qemu
+
  import iotests
  from iotests import qemu_img
  
-# Import qemu after iotests.py has amended sys.path

-# pylint: disable=wrong-import-order
-import qemu
-
  
  image_size = 1 * 1024 * 1024

  source = os.path.join(iotests.test_dir, 'source.img')




--
Best regards,
Vladimir



Re: [PATCH 2/6] iotests: add warning for rogue 'qemu' packages

2021-09-23 Thread John Snow
On Thu, Sep 23, 2021 at 7:17 AM Kevin Wolf  wrote:

> On 23.09.2021 at 12:57, Kevin Wolf wrote:
> > On 23.09.2021 at 02:16, John Snow wrote:
> > > Add a warning for when 'iotests' runs against a qemu namespace that
> > > isn't the one in the source tree. This might occur if you have
> > > (accidentally) installed the Python namespace package to your local
> > > packages.
> > >
> > > (I'm not going to say that this is because I bit myself with this,
> > > but. You can fill in the blanks.)
> > >
> > > Signed-off-by: John Snow 
> > > ---
> > >  tests/qemu-iotests/testenv.py | 21 -
> > >  1 file changed, 20 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/tests/qemu-iotests/testenv.py
> b/tests/qemu-iotests/testenv.py
> > > index 88104dace90..8a43b193af5 100644
> > > --- a/tests/qemu-iotests/testenv.py
> > > +++ b/tests/qemu-iotests/testenv.py
> > > @@ -16,6 +16,8 @@
> > >  # along with this program.  If not, see  >.
> > >  #
> > >
> > > +import importlib.util
> > > +import logging
> > >  import os
> > >  import sys
> > >  import tempfile
> > > @@ -25,7 +27,7 @@
> > >  import random
> > >  import subprocess
> > >  import glob
> > > -from typing import List, Dict, Any, Optional, ContextManager
> > > +from typing import List, Dict, Any, Optional, ContextManager, cast
> > >
> > >  DEF_GDB_OPTIONS = 'localhost:12345'
> > >
> > > @@ -112,6 +114,22 @@ def init_directories(self) -> None:
> > >  # Path where qemu goodies live in this source tree.
> > >  qemu_srctree_path = Path(__file__,
> '../../../python').resolve()
> > >
> > > +# warn if we happen to be able to find 'qemu' packages from an
> > > +# unexpected location (i.e. the package is already installed
> in
> > > +# the user's environment)
> > > +qemu_spec = importlib.util.find_spec('qemu.qmp')
> > > +if qemu_spec:
> > > +spec_path = Path(cast(str, qemu_spec.origin))
> >
> > You're casting Optional[str] to str here. The source type is not obvious
> > from the local code, so unless you spend some time actively figuring it
> > out, this could be casting anything to str. But even knowing this, just
> > casting Optional away looks fishy anyway.
> >
> > Looking up the ModuleSpec docs, it seems okay to assume that it's never
> > None in our case, but wouldn't it be much cleaner to just 'assert
> > qemu_spec.origin' first and then use it without the cast?
> >
>

OK. I suppose I was thinking: "It's always going to be a string, and if it
isn't, something else will explode below, surely, so the assertion is
redundant", but I don't want code that makes people wonder, so principle of
least surprise it is.


> > > +try:
> > > +_ = spec_path.relative_to(qemu_srctree_path)
> > > +except ValueError:
> > > +self._logger.warning(
> > > +"WARNING: 'qemu' package will be imported from
> outside "
> > > +"the source tree!")
> > > +self._logger.warning(
> > > +"Importing from: '%s'",
> > > +spec_path.parents[1])
> > > +
> > >  self.pythonpath = os.getenv('PYTHONPATH')
> > >  self.pythonpath = os.pathsep.join(filter(None, (
> > >  self.source_iotests,
> > > @@ -231,6 +249,7 @@ def __init__(self, imgfmt: str, imgproto: str,
> aiomode: str,
> > >
> > >  self.build_root = os.path.join(self.build_iotests, '..', '..')
> > >
> > > +self._logger = logging.getLogger('qemu.iotests')
> > >  self.init_directories()
> > >  self.init_binaries()
> >
> > Looks good otherwise.
>
> Well, actually there is a major problem: We'll pass the right PYTHONPATH
> for the test scripts that we're calling, but this script itself doesn't
> use it yet. So what I get is:
>
> Traceback (most recent call last):
>   File "/home/kwolf/source/qemu/tests/qemu-iotests/build/check", line 120,
> in 
> env = TestEnv(imgfmt=args.imgfmt, imgproto=args.imgproto,
>   File "/home/kwolf/source/qemu/tests/qemu-iotests/testenv.py", line 253,
> in __init__
> self.init_directories()
>   File "/home/kwolf/source/qemu/tests/qemu-iotests/testenv.py", line 120,
> in init_directories
> qemu_spec = importlib.util.find_spec('qemu.qmp')
>   File "/usr/lib64/python3.9/importlib/util.py", line 94, in find_spec
> parent = __import__(parent_name, fromlist=['__path__'])
> ModuleNotFoundError: No module named 'qemu'
>
> Kevin
>
>
Tch. So, it turns out with namespace packages that once you remove the
subpackages (pip uninstall qemu), it leaves the empty namespace folder
behind. This is also the reason I directly use 'qemu.qmp' as a bellwether
here, to make sure we're prodding a package and not just an empty namespace
with nothing in it.

I'll fix these, thanks.
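
Roughly, the reason probing 'qemu.qmp' (rather than plain 'qemu') works as a
bellwether, sketched under the assumption of a leftover empty directory:

    import importlib.util

    # An empty leftover 'qemu/' directory on sys.path is still a valid
    # namespace package, so probing the bare name can succeed even though
    # nothing useful is importable from it (its spec has no origin file).
    ns = importlib.util.find_spec('qemu')

    # A concrete subpackage is only found if real code is installed, and its
    # origin then points at the file that would actually be imported.
    try:
        qmp = importlib.util.find_spec('qemu.qmp')
    except ModuleNotFoundError:   # no 'qemu' anywhere at all (Kevin's case)
        qmp = None
    print(ns and ns.origin, qmp and qmp.origin)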


Re: [PATCH] block: introduce max_hw_iov for use in scsi-generic

2021-09-23 Thread Christian Borntraeger




On 23.09.21 at 15:04, Paolo Bonzini wrote:

Linux limits the size of iovecs to 1024 (UIO_MAXIOV in the kernel
sources, IOV_MAX in POSIX).  Because of this, on some host adapters
requests with many iovecs are rejected with -EINVAL by the
io_submit() or readv()/writev() system calls.

In fact, the same limit applies to SG_IO as well.  To fix both the
EINVAL and the possible performance issues from using fewer iovecs
than allowed by Linux (some HBAs have max_segments as low as 128),
introduce a separate entry in BlockLimits to hold the max_segments
value from sysfs.  This new limit is used only for SG_IO and clamped
to bs->bl.max_iov anyway, just like max_hw_transfer is clamped to
bs->bl.max_transfer.

Reported-by: Halil Pasic 
Cc: Hanna Reitz 
Cc: Kevin Wolf 
Cc: qemu-block@nongnu.org
Fixes: 18473467d5 ("file-posix: try BLKSECTGET on block devices too, do not round to 
power of 2", 2021-06-25)


This sneaked in shortly before the 6.1 release (between rc0 and rc1, I think).
Shouldn't that go to stable in case this still exists?



Signed-off-by: Paolo Bonzini 
---
  block/block-backend.c  | 6 ++
  block/file-posix.c | 2 +-
  block/io.c | 1 +
  hw/scsi/scsi-generic.c | 2 +-
  include/block/block_int.h  | 7 +++
  include/sysemu/block-backend.h | 1 +
  6 files changed, 17 insertions(+), 2 deletions(-)

diff --git a/block/block-backend.c b/block/block-backend.c
index 6140d133e2..ba2b5ebb10 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -1986,6 +1986,12 @@ uint32_t blk_get_max_transfer(BlockBackend *blk)
  return ROUND_DOWN(max, blk_get_request_alignment(blk));
  }
  
+int blk_get_max_hw_iov(BlockBackend *blk)

+{
+return MIN_NON_ZERO(blk->root->bs->bl.max_hw_iov,
+blk->root->bs->bl.max_iov);
+}
+
  int blk_get_max_iov(BlockBackend *blk)
  {
  return blk->root->bs->bl.max_iov;
diff --git a/block/file-posix.c b/block/file-posix.c
index cb9bffe047..1567edb3d5 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -1273,7 +1273,7 @@ static void raw_refresh_limits(BlockDriverState *bs, 
Error **errp)
  
  ret = hdev_get_max_segments(s->fd, &st);

  if (ret > 0) {
-bs->bl.max_iov = ret;
+bs->bl.max_hw_iov = ret;
  }
  }
  }
diff --git a/block/io.c b/block/io.c
index a19942718b..f38e7f81d8 100644
--- a/block/io.c
+++ b/block/io.c
@@ -136,6 +136,7 @@ static void bdrv_merge_limits(BlockLimits *dst, const 
BlockLimits *src)
  dst->min_mem_alignment = MAX(dst->min_mem_alignment,
   src->min_mem_alignment);
  dst->max_iov = MIN_NON_ZERO(dst->max_iov, src->max_iov);
+dst->max_hw_iov = MIN_NON_ZERO(dst->max_hw_iov, src->max_hw_iov);
  }
  
  typedef struct BdrvRefreshLimitsState {

diff --git a/hw/scsi/scsi-generic.c b/hw/scsi/scsi-generic.c
index 665baf900e..0306ccc7b1 100644
--- a/hw/scsi/scsi-generic.c
+++ b/hw/scsi/scsi-generic.c
@@ -180,7 +180,7 @@ static int scsi_handle_inquiry_reply(SCSIGenericReq *r, 
SCSIDevice *s, int len)
  page = r->req.cmd.buf[2];
  if (page == 0xb0) {
  uint64_t max_transfer = blk_get_max_hw_transfer(s->conf.blk);
-uint32_t max_iov = blk_get_max_iov(s->conf.blk);
+uint32_t max_iov = blk_get_max_hw_iov(s->conf.blk);
  
  assert(max_transfer);

  max_transfer = MIN_NON_ZERO(max_transfer, max_iov * 
qemu_real_host_page_size)
diff --git a/include/block/block_int.h b/include/block/block_int.h
index f1a54db0f8..c31cbd034a 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -702,6 +702,13 @@ typedef struct BlockLimits {
   */
  uint64_t max_hw_transfer;
  
+/* Maximal number of scatter/gather elements allowed by the hardware.

+ * Applies whenever transfers to the device bypass the kernel I/O
+ * scheduler, for example with SG_IO.  If larger than max_iov
+ * or if zero, blk_get_max_hw_iov will fall back to max_iov.
+ */
+int max_hw_iov;
+
  /* memory alignment, in bytes so that no bounce buffer is needed */
  size_t min_mem_alignment;
  
diff --git a/include/sysemu/block-backend.h b/include/sysemu/block-backend.h

index 29d4fdbf63..82bae55161 100644
--- a/include/sysemu/block-backend.h
+++ b/include/sysemu/block-backend.h
@@ -211,6 +211,7 @@ uint32_t blk_get_request_alignment(BlockBackend *blk);
  uint32_t blk_get_max_transfer(BlockBackend *blk);
  uint64_t blk_get_max_hw_transfer(BlockBackend *blk);
  int blk_get_max_iov(BlockBackend *blk);
+int blk_get_max_hw_iov(BlockBackend *blk);
  void blk_set_guest_block_size(BlockBackend *blk, int align);
  void *blk_try_blockalign(BlockBackend *blk, size_t size);
  void *blk_blockalign(BlockBackend *blk, size_t size);





[PATCH 3/3] linux-aio: add `dev_max_batch` parameter to laio_io_unplug()

2021-09-23 Thread Stefano Garzarella
Between the submission of a request and the unplug, other devices
with larger limits may have queued new requests without flushing
the batch.

Using the new `dev_max_batch` parameter, laio_io_unplug() can check
whether the batch exceeds the device limit and, if so, flush the
current batch.

Signed-off-by: Stefano Garzarella 
---
 include/block/raw-aio.h | 3 ++-
 block/file-posix.c  | 2 +-
 block/linux-aio.c   | 8 +---
 3 files changed, 8 insertions(+), 5 deletions(-)

diff --git a/include/block/raw-aio.h b/include/block/raw-aio.h
index ebd042fa27..21fc10c4c9 100644
--- a/include/block/raw-aio.h
+++ b/include/block/raw-aio.h
@@ -56,7 +56,8 @@ int coroutine_fn laio_co_submit(BlockDriverState *bs, 
LinuxAioState *s, int fd,
 void laio_detach_aio_context(LinuxAioState *s, AioContext *old_context);
 void laio_attach_aio_context(LinuxAioState *s, AioContext *new_context);
 void laio_io_plug(BlockDriverState *bs, LinuxAioState *s);
-void laio_io_unplug(BlockDriverState *bs, LinuxAioState *s);
+void laio_io_unplug(BlockDriverState *bs, LinuxAioState *s,
+uint64_t dev_max_batch);
 #endif
 /* io_uring.c - Linux io_uring implementation */
 #ifdef CONFIG_LINUX_IO_URING
diff --git a/block/file-posix.c b/block/file-posix.c
index 614c725235..f31a75a715 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -2125,7 +2125,7 @@ static void raw_aio_unplug(BlockDriverState *bs)
 #ifdef CONFIG_LINUX_AIO
 if (s->use_linux_aio) {
 LinuxAioState *aio = aio_get_linux_aio(bdrv_get_aio_context(bs));
-laio_io_unplug(bs, aio);
+laio_io_unplug(bs, aio, s->aio_max_batch);
 }
 #endif
 #ifdef CONFIG_LINUX_IO_URING
diff --git a/block/linux-aio.c b/block/linux-aio.c
index 88b44fee72..f53ae72e21 100644
--- a/block/linux-aio.c
+++ b/block/linux-aio.c
@@ -356,11 +356,13 @@ void laio_io_plug(BlockDriverState *bs, LinuxAioState *s)
 s->io_q.plugged++;
 }
 
-void laio_io_unplug(BlockDriverState *bs, LinuxAioState *s)
+void laio_io_unplug(BlockDriverState *bs, LinuxAioState *s,
+uint64_t dev_max_batch)
 {
 assert(s->io_q.plugged);
-if (--s->io_q.plugged == 0 &&
-!s->io_q.blocked && !QSIMPLEQ_EMPTY(&s->io_q.pending)) {
+if (s->io_q.in_queue >= laio_max_batch(s, dev_max_batch) ||
+(--s->io_q.plugged == 0 &&
+ !s->io_q.blocked && !QSIMPLEQ_EMPTY(&s->io_q.pending))) {
 ioq_submit(s);
 }
 }
-- 
2.31.1




[PATCH 1/3] file-posix: add `aio-max-batch` option

2021-09-23 Thread Stefano Garzarella
Commit d7ddd0a161 ("linux-aio: limit the batch size using
`aio-max-batch` parameter") added a way to limit the batch size
of the Linux AIO backend for the entire AIO context.

The same AIO context can be shared by multiple devices, so
latency-sensitive devices may want to limit the batch size even
more to avoid increasing latency.

For this reason we add the `aio-max-batch` option to the file
backend, which will be used by the next commits to limit the size of
batches including requests generated by this device.

Suggested-by: Kevin Wolf 
Signed-off-by: Stefano Garzarella 
---
 qapi/block-core.json | 5 +
 block/file-posix.c   | 9 +
 2 files changed, 14 insertions(+)

diff --git a/qapi/block-core.json b/qapi/block-core.json
index c8ce1d9d5d..1a8ed325bc 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -2851,6 +2851,10 @@
 #  for this device (default: none, forward the commands via SG_IO;
 #  since 2.11)
 # @aio: AIO backend (default: threads) (since: 2.8)
+# @aio-max-batch: maximum number of requests in an AIO backend batch that
+# contains requests from this block device. 0 means that the
+# AIO backend will handle it automatically.
+# (default: 0, since 6.2)
 # @locking: whether to enable file locking. If set to 'auto', only enable
 #   when Open File Descriptor (OFD) locking API is available
 #   (default: auto, since 2.10)
@@ -2879,6 +2883,7 @@
 '*pr-manager': 'str',
 '*locking': 'OnOffAuto',
 '*aio': 'BlockdevAioOptions',
+'*aio-max-batch': 'int',
 '*drop-cache': {'type': 'bool',
 'if': 'CONFIG_LINUX'},
 '*x-check-cache-dropped': 'bool' },
diff --git a/block/file-posix.c b/block/file-posix.c
index d81e15efa4..da50e94034 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -150,6 +150,8 @@ typedef struct BDRVRawState {
 uint64_t locked_perm;
 uint64_t locked_shared_perm;
 
+uint64_t aio_max_batch;
+
 int perm_change_fd;
 int perm_change_flags;
 BDRVReopenState *reopen_state;
@@ -530,6 +532,11 @@ static QemuOptsList raw_runtime_opts = {
 .type = QEMU_OPT_STRING,
 .help = "host AIO implementation (threads, native, io_uring)",
 },
+{
+.name = "aio-max-batch",
+.type = QEMU_OPT_NUMBER,
+.help = "AIO max batch size (0 = auto handled by AIO backend, 
default: 0)",
+},
 {
 .name = "locking",
 .type = QEMU_OPT_STRING,
@@ -609,6 +616,8 @@ static int raw_open_common(BlockDriverState *bs, QDict 
*options,
 s->use_linux_io_uring = (aio == BLOCKDEV_AIO_OPTIONS_IO_URING);
 #endif
 
+s->aio_max_batch = qemu_opt_get_number(opts, "aio-max-batch", 0);
+
 locking = qapi_enum_parse(&OnOffAuto_lookup,
   qemu_opt_get(opts, "locking"),
   ON_OFF_AUTO_AUTO, &local_err);
-- 
2.31.1




[PATCH 2/3] linux-aio: add `dev_max_batch` parameter to laio_co_submit()

2021-09-23 Thread Stefano Garzarella
This new parameter can be used by block devices to limit the
Linux AIO batch size further than the limit set by the AIO context.

The file-posix backend supports this by passing its previously added
`aio-max-batch` option.

Add a helper function to calculate the maximum batch size.
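
To illustrate the clamping, a rough Python model of the helper (constants are
placeholders; the real helper is the C hunk below):

    MAX_EVENTS = 1024         # placeholder for the backend's event queue depth
    DEFAULT_MAX_BATCH = 32    # placeholder default when the context sets none

    def min_non_zero(a, b):
        # like QEMU's MIN_NON_ZERO(): a value of 0 means "no limit"
        return b if a == 0 else (a if b == 0 else min(a, b))

    def laio_max_batch(ctx_max_batch, dev_max_batch, in_flight):
        max_batch = ctx_max_batch or DEFAULT_MAX_BATCH
        # the device may ask for a smaller batch than the AIO context allows
        max_batch = min_non_zero(dev_max_batch, max_batch)
        # never batch more requests than there are free completion slots
        return min_non_zero(MAX_EVENTS - in_flight, max_batch)

    assert laio_max_batch(0, 8, in_flight=0) == 8      # device limit wins
    assert laio_max_batch(16, 0, in_flight=0) == 16    # context limit wins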

Signed-off-by: Stefano Garzarella 
---
 include/block/raw-aio.h |  3 ++-
 block/file-posix.c  |  3 ++-
 block/linux-aio.c   | 30 ++
 3 files changed, 26 insertions(+), 10 deletions(-)

diff --git a/include/block/raw-aio.h b/include/block/raw-aio.h
index 251b10d273..ebd042fa27 100644
--- a/include/block/raw-aio.h
+++ b/include/block/raw-aio.h
@@ -51,7 +51,8 @@ typedef struct LinuxAioState LinuxAioState;
 LinuxAioState *laio_init(Error **errp);
 void laio_cleanup(LinuxAioState *s);
 int coroutine_fn laio_co_submit(BlockDriverState *bs, LinuxAioState *s, int fd,
-uint64_t offset, QEMUIOVector *qiov, int type);
+uint64_t offset, QEMUIOVector *qiov, int type,
+uint64_t dev_max_batch);
 void laio_detach_aio_context(LinuxAioState *s, AioContext *old_context);
 void laio_attach_aio_context(LinuxAioState *s, AioContext *new_context);
 void laio_io_plug(BlockDriverState *bs, LinuxAioState *s);
diff --git a/block/file-posix.c b/block/file-posix.c
index da50e94034..614c725235 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -2066,7 +2066,8 @@ static int coroutine_fn raw_co_prw(BlockDriverState *bs, 
uint64_t offset,
 } else if (s->use_linux_aio) {
 LinuxAioState *aio = aio_get_linux_aio(bdrv_get_aio_context(bs));
 assert(qiov->size == bytes);
-return laio_co_submit(bs, aio, s->fd, offset, qiov, type);
+return laio_co_submit(bs, aio, s->fd, offset, qiov, type,
+  s->aio_max_batch);
 #endif
 }
 
diff --git a/block/linux-aio.c b/block/linux-aio.c
index 0dab507b71..88b44fee72 100644
--- a/block/linux-aio.c
+++ b/block/linux-aio.c
@@ -334,6 +334,23 @@ static void ioq_submit(LinuxAioState *s)
 }
 }
 
+static uint64_t laio_max_batch(LinuxAioState *s, uint64_t dev_max_batch)
+{
+uint64_t max_batch = s->aio_context->aio_max_batch ?: DEFAULT_MAX_BATCH;
+
+/*
+ * AIO context can be shared between multiple block devices, so
+ * `dev_max_batch` allows reducing the batch size for latency-sensitive
+ * devices.
+ */
+max_batch = MIN_NON_ZERO(dev_max_batch, max_batch);
+
+/* limit the batch with the number of available events */
+max_batch = MIN_NON_ZERO(MAX_EVENTS - s->io_q.in_flight, max_batch);
+
+return max_batch;
+}
+
 void laio_io_plug(BlockDriverState *bs, LinuxAioState *s)
 {
 s->io_q.plugged++;
@@ -349,15 +366,11 @@ void laio_io_unplug(BlockDriverState *bs, LinuxAioState 
*s)
 }
 
 static int laio_do_submit(int fd, struct qemu_laiocb *laiocb, off_t offset,
-  int type)
+  int type, uint64_t dev_max_batch)
 {
 LinuxAioState *s = laiocb->ctx;
 struct iocb *iocbs = &laiocb->iocb;
 QEMUIOVector *qiov = laiocb->qiov;
-int64_t max_batch = s->aio_context->aio_max_batch ?: DEFAULT_MAX_BATCH;
-
-/* limit the batch with the number of available events */
-max_batch = MIN_NON_ZERO(MAX_EVENTS - s->io_q.in_flight, max_batch);
 
 switch (type) {
 case QEMU_AIO_WRITE:
@@ -378,7 +391,7 @@ static int laio_do_submit(int fd, struct qemu_laiocb 
*laiocb, off_t offset,
 s->io_q.in_queue++;
 if (!s->io_q.blocked &&
 (!s->io_q.plugged ||
- s->io_q.in_queue >= max_batch)) {
+ s->io_q.in_queue >= laio_max_batch(s, dev_max_batch))) {
 ioq_submit(s);
 }
 
@@ -386,7 +399,8 @@ static int laio_do_submit(int fd, struct qemu_laiocb 
*laiocb, off_t offset,
 }
 
 int coroutine_fn laio_co_submit(BlockDriverState *bs, LinuxAioState *s, int fd,
-uint64_t offset, QEMUIOVector *qiov, int type)
+uint64_t offset, QEMUIOVector *qiov, int type,
+uint64_t dev_max_batch)
 {
 int ret;
 struct qemu_laiocb laiocb = {
@@ -398,7 +412,7 @@ int coroutine_fn laio_co_submit(BlockDriverState *bs, 
LinuxAioState *s, int fd,
 .qiov   = qiov,
 };
 
-ret = laio_do_submit(fd, &laiocb, offset, type);
+ret = laio_do_submit(fd, &laiocb, offset, type, dev_max_batch);
 if (ret < 0) {
 return ret;
 }
-- 
2.31.1




[PATCH 0/3] linux-aio: allow block devices to limit aio-max-batch

2021-09-23 Thread Stefano Garzarella
Commit d7ddd0a161 ("linux-aio: limit the batch size using
`aio-max-batch` parameter") added a way to limit the batch size
of the Linux AIO backend for the entire AIO context.

The same AIO context can be shared by multiple devices, so
latency-sensitive devices may want to limit the batch size even
more to avoid increasing latency.

This series adds the `aio-max-batch` option to the file backend
and uses it in laio_co_submit() and laio_io_unplug() to limit the
Linux AIO batch size further than the limit set by the AIO context.

Stefano Garzarella (3):
  file-posix: add `aio-max-batch` option
  linux-aio: add `dev_max_batch` parameter to laio_co_submit()
  linux-aio: add `dev_max_batch` parameter to laio_io_unplug()

 qapi/block-core.json|  5 +
 include/block/raw-aio.h |  6 --
 block/file-posix.c  | 14 --
 block/linux-aio.c   | 38 +++---
 4 files changed, 48 insertions(+), 15 deletions(-)

-- 
2.31.1




Re: [PATCH] block: introduce max_hw_iov for use in scsi-generic

2021-09-23 Thread Halil Pasic
On Thu, 23 Sep 2021 09:04:36 -0400
Paolo Bonzini  wrote:

> Linux limits the size of iovecs to 1024 (UIO_MAXIOV in the kernel
> sources, IOV_MAX in POSIX).  Because of this, on some host adapters
> requests with many iovecs are rejected with -EINVAL by the
> io_submit() or readv()/writev() system calls.
> 
> In fact, the same limit applies to SG_IO as well.  To fix both the
> EINVAL and the possible performance issues from using fewer iovecs
> than allowed by Linux (some HBAs have max_segments as low as 128),
> introduce a separate entry in BlockLimits to hold the max_segments
> value from sysfs.  This new limit is used only for SG_IO and clamped
> to bs->bl.max_iov anyway, just like max_hw_transfer is clamped to
> bs->bl.max_transfer.

Doesn't this patch render bs->bl.max_iov a constant?

$ git grep -p -e 'bl\(.\|->\)max_iov'
block/block-backend.c=int blk_get_max_iov(BlockBackend *blk)
block/block-backend.c:return blk->root->bs->bl.max_iov;
block/file-posix.c=static void raw_refresh_limits(BlockDriverState *bs, Error 
**errp)
block/file-posix.c:bs->bl.max_iov = ret;
block/io.c=void bdrv_refresh_limits(BlockDriverState *bs, Transaction *tran, 
Error **errp)
block/io.c:bs->bl.max_iov = IOV_MAX;
block/mirror.c=static int coroutine_fn mirror_run(Job *job, Error **errp)
block/mirror.c:s->max_iov = MIN(bs->bl.max_iov, target_bs->bl.max_iov);

Can't we use some of the established constants instead of hard coding a
qemu specific IOV_MAX?

POSIX.1 seems to guarantee the availability of IOV_MAX in <limits.h>
according to: https://man7.org/linux/man-pages/man2/readv.2.html
and <sys/uio.h> may have UIO_MAXIOV defined.

> 
> Reported-by: Halil Pasic 
> Cc: Hanna Reitz 
> Cc: Kevin Wolf 
> Cc: qemu-block@nongnu.org
> Fixes: 18473467d5 ("file-posix: try BLKSECTGET on block devices too, do not 
> round to power of 2", 2021-06-25)
> Signed-off-by: Paolo Bonzini 
> ---
>  block/block-backend.c  | 6 ++
>  block/file-posix.c | 2 +-
>  block/io.c | 1 +
>  hw/scsi/scsi-generic.c | 2 +-
>  include/block/block_int.h  | 7 +++
>  include/sysemu/block-backend.h | 1 +
>  6 files changed, 17 insertions(+), 2 deletions(-)
> 
> diff --git a/block/block-backend.c b/block/block-backend.c
> index 6140d133e2..ba2b5ebb10 100644
> --- a/block/block-backend.c
> +++ b/block/block-backend.c
> @@ -1986,6 +1986,12 @@ uint32_t blk_get_max_transfer(BlockBackend *blk)
>  return ROUND_DOWN(max, blk_get_request_alignment(blk));
>  }
>  
> +int blk_get_max_hw_iov(BlockBackend *blk)
> +{
> +return MIN_NON_ZERO(blk->root->bs->bl.max_hw_iov,
> +blk->root->bs->bl.max_iov);
> +}
> +
>  int blk_get_max_iov(BlockBackend *blk)
>  {
>  return blk->root->bs->bl.max_iov;
> diff --git a/block/file-posix.c b/block/file-posix.c
> index cb9bffe047..1567edb3d5 100644
> --- a/block/file-posix.c
> +++ b/block/file-posix.c
> @@ -1273,7 +1273,7 @@ static void raw_refresh_limits(BlockDriverState *bs, 
> Error **errp)
>  
>  ret = hdev_get_max_segments(s->fd, &st);
>  if (ret > 0) {
> -bs->bl.max_iov = ret;
> +bs->bl.max_hw_iov = ret;
>  }
>  }
>  }
> diff --git a/block/io.c b/block/io.c
> index a19942718b..f38e7f81d8 100644
> --- a/block/io.c
> +++ b/block/io.c
> @@ -136,6 +136,7 @@ static void bdrv_merge_limits(BlockLimits *dst, const 
> BlockLimits *src)
>  dst->min_mem_alignment = MAX(dst->min_mem_alignment,
>   src->min_mem_alignment);
>  dst->max_iov = MIN_NON_ZERO(dst->max_iov, src->max_iov);
> +dst->max_hw_iov = MIN_NON_ZERO(dst->max_hw_iov, src->max_hw_iov);
>  }
>  
>  typedef struct BdrvRefreshLimitsState {
> diff --git a/hw/scsi/scsi-generic.c b/hw/scsi/scsi-generic.c
> index 665baf900e..0306ccc7b1 100644
> --- a/hw/scsi/scsi-generic.c
> +++ b/hw/scsi/scsi-generic.c
> @@ -180,7 +180,7 @@ static int scsi_handle_inquiry_reply(SCSIGenericReq *r, 
> SCSIDevice *s, int len)
>  page = r->req.cmd.buf[2];
>  if (page == 0xb0) {
>  uint64_t max_transfer = blk_get_max_hw_transfer(s->conf.blk);
> -uint32_t max_iov = blk_get_max_iov(s->conf.blk);
> +uint32_t max_iov = blk_get_max_hw_iov(s->conf.blk);
>  
>  assert(max_transfer);
>  max_transfer = MIN_NON_ZERO(max_transfer, max_iov * 
> qemu_real_host_page_size)
> diff --git a/include/block/block_int.h b/include/block/block_int.h
> index f1a54db0f8..c31cbd034a 100644
> --- a/include/block/block_int.h
> +++ b/include/block/block_int.h
> @@ -702,6 +702,13 @@ typedef struct BlockLimits {
>   */
>  uint64_t max_hw_transfer;
>  
> +/* Maximal number of scatter/gather elements allowed by the hardware.
> + * Applies whenever transfers to the device bypass the kernel I/O
> + * scheduler, for example with SG_IO.  If larger than max_iov
> + * or if zero, blk_get_max_hw_iov will fall back to max_iov.
> + */
> +int max_hw_iov;
> +
>   

[PATCH] block: introduce max_hw_iov for use in scsi-generic

2021-09-23 Thread Paolo Bonzini
Linux limits the size of iovecs to 1024 (UIO_MAXIOV in the kernel
sources, IOV_MAX in POSIX).  Because of this, on some host adapters
requests with many iovecs are rejected with -EINVAL by the
io_submit() or readv()/writev() system calls.

In fact, the same limit applies to SG_IO as well.  To fix both the
EINVAL and the possible performance issues from using fewer iovecs
than allowed by Linux (some HBAs have max_segments as low as 128),
introduce a separate entry in BlockLimits to hold the max_segments
value from sysfs.  This new limit is used only for SG_IO and clamped
to bs->bl.max_iov anyway, just like max_hw_transfer is clamped to
bs->bl.max_transfer.
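
Modelled in Python for illustration (the C diff below is the actual change),
the new getter clamps the hardware limit to the generic one:

    def blk_get_max_hw_iov(max_hw_iov, max_iov):
        # MIN_NON_ZERO semantics: 0 means "no hardware limit recorded",
        # so fall back to the generic max_iov (IOV_MAX, i.e. 1024 on Linux).
        if max_hw_iov == 0:
            return max_iov
        if max_iov == 0:
            return max_hw_iov
        return min(max_hw_iov, max_iov)

    # an HBA advertising max_segments=128 clamps SG_IO requests to 128 segments,
    assert blk_get_max_hw_iov(128, 1024) == 128
    # while a device with no recorded hardware limit keeps using max_iov:
    assert blk_get_max_hw_iov(0, 1024) == 1024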

Reported-by: Halil Pasic 
Cc: Hanna Reitz 
Cc: Kevin Wolf 
Cc: qemu-block@nongnu.org
Fixes: 18473467d5 ("file-posix: try BLKSECTGET on block devices too, do not 
round to power of 2", 2021-06-25)
Signed-off-by: Paolo Bonzini 
---
 block/block-backend.c  | 6 ++
 block/file-posix.c | 2 +-
 block/io.c | 1 +
 hw/scsi/scsi-generic.c | 2 +-
 include/block/block_int.h  | 7 +++
 include/sysemu/block-backend.h | 1 +
 6 files changed, 17 insertions(+), 2 deletions(-)

diff --git a/block/block-backend.c b/block/block-backend.c
index 6140d133e2..ba2b5ebb10 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -1986,6 +1986,12 @@ uint32_t blk_get_max_transfer(BlockBackend *blk)
 return ROUND_DOWN(max, blk_get_request_alignment(blk));
 }
 
+int blk_get_max_hw_iov(BlockBackend *blk)
+{
+return MIN_NON_ZERO(blk->root->bs->bl.max_hw_iov,
+blk->root->bs->bl.max_iov);
+}
+
 int blk_get_max_iov(BlockBackend *blk)
 {
 return blk->root->bs->bl.max_iov;
diff --git a/block/file-posix.c b/block/file-posix.c
index cb9bffe047..1567edb3d5 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -1273,7 +1273,7 @@ static void raw_refresh_limits(BlockDriverState *bs, 
Error **errp)
 
 ret = hdev_get_max_segments(s->fd, &st);
 if (ret > 0) {
-bs->bl.max_iov = ret;
+bs->bl.max_hw_iov = ret;
 }
 }
 }
diff --git a/block/io.c b/block/io.c
index a19942718b..f38e7f81d8 100644
--- a/block/io.c
+++ b/block/io.c
@@ -136,6 +136,7 @@ static void bdrv_merge_limits(BlockLimits *dst, const 
BlockLimits *src)
 dst->min_mem_alignment = MAX(dst->min_mem_alignment,
  src->min_mem_alignment);
 dst->max_iov = MIN_NON_ZERO(dst->max_iov, src->max_iov);
+dst->max_hw_iov = MIN_NON_ZERO(dst->max_hw_iov, src->max_hw_iov);
 }
 
 typedef struct BdrvRefreshLimitsState {
diff --git a/hw/scsi/scsi-generic.c b/hw/scsi/scsi-generic.c
index 665baf900e..0306ccc7b1 100644
--- a/hw/scsi/scsi-generic.c
+++ b/hw/scsi/scsi-generic.c
@@ -180,7 +180,7 @@ static int scsi_handle_inquiry_reply(SCSIGenericReq *r, 
SCSIDevice *s, int len)
 page = r->req.cmd.buf[2];
 if (page == 0xb0) {
 uint64_t max_transfer = blk_get_max_hw_transfer(s->conf.blk);
-uint32_t max_iov = blk_get_max_iov(s->conf.blk);
+uint32_t max_iov = blk_get_max_hw_iov(s->conf.blk);
 
 assert(max_transfer);
 max_transfer = MIN_NON_ZERO(max_transfer, max_iov * 
qemu_real_host_page_size)
diff --git a/include/block/block_int.h b/include/block/block_int.h
index f1a54db0f8..c31cbd034a 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -702,6 +702,13 @@ typedef struct BlockLimits {
  */
 uint64_t max_hw_transfer;
 
+/* Maximal number of scatter/gather elements allowed by the hardware.
+ * Applies whenever transfers to the device bypass the kernel I/O
+ * scheduler, for example with SG_IO.  If larger than max_iov
+ * or if zero, blk_get_max_hw_iov will fall back to max_iov.
+ */
+int max_hw_iov;
+
 /* memory alignment, in bytes so that no bounce buffer is needed */
 size_t min_mem_alignment;
 
diff --git a/include/sysemu/block-backend.h b/include/sysemu/block-backend.h
index 29d4fdbf63..82bae55161 100644
--- a/include/sysemu/block-backend.h
+++ b/include/sysemu/block-backend.h
@@ -211,6 +211,7 @@ uint32_t blk_get_request_alignment(BlockBackend *blk);
 uint32_t blk_get_max_transfer(BlockBackend *blk);
 uint64_t blk_get_max_hw_transfer(BlockBackend *blk);
 int blk_get_max_iov(BlockBackend *blk);
+int blk_get_max_hw_iov(BlockBackend *blk);
 void blk_set_guest_block_size(BlockBackend *blk, int align);
 void *blk_try_blockalign(BlockBackend *blk, size_t size);
 void *blk_blockalign(BlockBackend *blk, size_t size);
-- 
2.27.0
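
For readers skimming the patch, here is a minimal standalone sketch of the
fallback behaviour described in the new BlockLimits comment. The MIN_NON_ZERO
macro below is a simplified stand-in for the one QEMU defines in its headers
(zero means "no limit"), and the sample values are made up for illustration.

/*
 * Minimal sketch of the max_hw_iov fallback described above.
 * MIN_NON_ZERO is a simplified stand-in for QEMU's macro: a value of
 * 0 means "no limit" and therefore must not win the comparison.
 */
#include <stdio.h>

#define MIN_NON_ZERO(a, b) \
    ((a) == 0 ? (b) : ((b) == 0 ? (a) : ((a) < (b) ? (a) : (b))))

int main(void)
{
    unsigned max_iov = 1024;       /* IOV_MAX / UIO_MAXIOV on Linux */
    unsigned hw_iov_dasd = 65535;  /* e.g. a DASD's queue/max_segments */
    unsigned hw_iov_hba = 128;     /* e.g. an HBA with few segments */

    /* SG_IO path: clamp the hardware value to what the kernel accepts */
    printf("%u\n", MIN_NON_ZERO(hw_iov_dasd, max_iov)); /* prints 1024 */
    printf("%u\n", MIN_NON_ZERO(hw_iov_hba, max_iov));  /* prints 128 */
    /* no hardware limit reported at all: fall back to max_iov */
    printf("%u\n", MIN_NON_ZERO(0u, max_iov));          /* prints 1024 */
    return 0;
}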




Re: [PULL 18/28] file-posix: try BLKSECTGET on block devices too, do not round to power of 2

2021-09-23 Thread Paolo Bonzini

On 23/09/21 14:13, Halil Pasic wrote:

On Thu, 23 Sep 2021 12:57:38 +0200
Paolo Bonzini  wrote:


On 22/09/21 21:51, Halil Pasic wrote:

We have figured out what is going on here. The problem seems to be
specific to linux aio, which seems to limit the size of the iovec to
1024 (UIO_MAXIOV).


Hi Halil,

I'll send a patch shortly to fix this issue.  Sorry about the delay as I
was busy with KVM Forum and all that. :(


Hi Paolo!

With some guidance I could cook up a patch myself. But if you prefer to
do it yourself, I will be glad to test the fix (please put me on cc).
Thanks!


NP, the patch already existed but I was busy and didn't send it out.

Paolo




Re: [PATCH 0/6] iotests: update environment and linting configuration

2021-09-23 Thread Kevin Wolf
Am 23.09.2021 um 02:16 hat John Snow geschrieben:
> GitLab: https://gitlab.com/jsnow/qemu/-/commits/python-package-iotest-pt1
> CI: https://gitlab.com/jsnow/qemu/-/pipelines/375630185
> 
> This series partially supersedes:
>   [PATCH v3 00/16] python/iotests: Run iotest linters during Python CI'
> 
> Howdy, this is good stuff we want even if we aren't yet in agreement
> about the best way to run iotest 297 from CI.
> 
> - Update linting config to tolerate pylint 2.11.1
> - Eliminate sys.path hacking in individual test files
> - make mypy execution in test 297 faster
> 
> The rest of the actual "run at CI time" stuff can get handled separately
> and later pending some discussion on the other series.

Patch 2 seems to need some more work.

The rest is: Reviewed-by: Kevin Wolf 




Re: [PULL 18/28] file-posix: try BLKSECTGET on block devices too, do not round to power of 2

2021-09-23 Thread Halil Pasic
On Thu, 23 Sep 2021 12:57:38 +0200
Paolo Bonzini  wrote:

> On 22/09/21 21:51, Halil Pasic wrote:
> > We have figured out what is going on here. The problem seems to be
> > specific to linux aio, which seems to limit the size of the iovec to
> > 1024 (UIO_MAXIOV).  
> 
> Hi Halil,
> 
> I'll send a patch shortly to fix this issue.  Sorry about the delay as I 
> was busy with KVM Forum and all that. :(

Hi Paolo!

With some guidance I could cook up a patch myself. But if you prefer to
do it yourself, I will be glad to test the fix (please put me on cc).
Thanks!

Regards,
Halil




Re: [PATCH 8/8] qapi: add blockdev-replace command

2021-09-23 Thread Vladimir Sementsov-Ogievskiy

23.09.2021 13:09, Markus Armbruster wrote:

Vladimir Sementsov-Ogievskiy  writes:


Thanks a lot for reviewing!

20.09.2021 09:44, Markus Armbruster wrote:

Vladimir Sementsov-Ogievskiy  writes:


Add command that can add and remove filters.

Key points of functionality:

What the command does is simply replace some BdrvChild.bs by some other
nodes. The tricky thing is selecting there BdrvChild objects.
To be able to select any kind of BdrvChild we use a generic parent_id,
which may be a node-name, or qdev id or block export id. In future we
may support block jobs.

Any kind of ambiguity leads to error. If we have both device named
device0 and block export named device0 and they both point to same BDS,
user can't replace root child of one of these parents. So, to be able
to do replacements, user should avoid duplicating names in different
parent namespaces.

So, command allows to replace any single child in the graph.

On the other hand we want to realize a kind of bdrv_replace_node(),
which works well when we want to replace all parents of some node. For
this kind of task @parents-mode argument implemented.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
   qapi/block-core.json | 78 +
   block.c  | 82 
   2 files changed, 160 insertions(+)

diff --git a/qapi/block-core.json b/qapi/block-core.json
index 675d8265eb..8059b96341 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -5433,3 +5433,81 @@
   { 'command': 'blockdev-snapshot-delete-internal-sync',
 'data': { 'device': 'str', '*id': 'str', '*name': 'str'},
 'returns': 'SnapshotInfo' }
+
+##
+# @BlockdevReplaceParentsMode:
+#
+# Alternative (to directly set @parent) way to chose parents in
+# @blockdev-replace
+#
+# @exactly-one: Exactly one parent should match a condition, otherwise
+#   @blockdev-replace fails.
+#
+# @all: All matching parents are taken into account. If replacing lead
+#   to loops in block graph, @blockdev-replace fails.
+#
+# @auto: Same as @all, but automatically skip replacing parents if it
+#leads to loops in block graph.
+#
+# Since: 6.2
+##
+{ 'enum': 'BlockdevReplaceParentsMode',
+  'data': ['exactly-one', 'all', 'auto'] }
+
+##
+# @BlockdevReplace:
+#
+# Declaration of one replacement.


Replacement of what?  A node in the block graph?


A specific child node in one or in several edges


Spell that out in the doc comment, please.




+#
+# @parent: id of parent. It may be qdev or block export or simple
+#  node-name.


It may also be a QOM path, because find_device_state() interprets
arguments starting with '/' as QOM paths.

When is a node name "simple"?

Suggest: It may be a qdev ID, a QOM path, a block export ID, or a node
name.


OK



The trouble is of course that we're merging three separate name spaces.


Yes. Alternatively we can also add an enum of node types [bds, device,
export[, job]] and select graph nodes more explicitly (by a pair of
id/path/name and type)

But if we have to use these things in one context, it seems good to enforce
that users use different names for them. And in this way, we can avoid strict typing.


Is there precedent in QMP for merging ID name spaces, or for selecting
a name space?


Hmm, I haven't seen either.




Aside: a single name space for IDs would be so much saner, but we
screwed that up long ago.


Throwing some of the multiple name spaces together some of the time
feels like another mistake.




 If id is ambiguous (for example node-name of
+#  some BDS equals to block export name), blockdev-replace
+#  fails.


Is there a way out of this situation, or is replacement simply
impossible then?


In my view, it's simply impossible. If someone wants to use this new interface,
they should take care to use different names for different things.


Reminds me of device_del, which simply could not delete a device without
an ID.  It made many users go "oh" (or possibly a more colorful version
thereof), until Daniel fixed it in commit 6287d827d4 "monitor: allow
device_del to accept QOM paths" for v2.5.




 If not specified, blockdev-replace goes through
+#  @parents-mode scenario, see below. Note, that @parent and
+#  @parents-mode can't be specified simultaneously.


What if neither is specified?  Hmm, @parents-mode has a default, so
that's what we get.


+#  If @parent is specified, only one edge is selected. If
+#  several edges match the condition, blockdev-replace fails.
+#
+# @edge: name of the child. If omitted, any child name matches.
+#
+# @child: node-name of the child. If omitted, any child matches.
+# Must be present if @parent is not specified.


Is @child useful when @parent is present?


You may specify @child and @parent to replace the child in a specific edge. Or
@parent and @edge. Or all three fields, just to be strict.



What's the diffe

Re: [PATCH 2/6] iotests: add warning for rogue 'qemu' packages

2021-09-23 Thread Kevin Wolf
Am 23.09.2021 um 12:57 hat Kevin Wolf geschrieben:
> Am 23.09.2021 um 02:16 hat John Snow geschrieben:
> > Add a warning for when 'iotests' runs against a qemu namespace that
> > isn't the one in the source tree. This might occur if you have
> > (accidentally) installed the Python namespace package to your local
> > packages.
> > 
> > (I'm not going to say that this is because I bit myself with this,
> > but. You can fill in the blanks.)
> > 
> > Signed-off-by: John Snow 
> > ---
> >  tests/qemu-iotests/testenv.py | 21 -
> >  1 file changed, 20 insertions(+), 1 deletion(-)
> > 
> > diff --git a/tests/qemu-iotests/testenv.py b/tests/qemu-iotests/testenv.py
> > index 88104dace90..8a43b193af5 100644
> > --- a/tests/qemu-iotests/testenv.py
> > +++ b/tests/qemu-iotests/testenv.py
> > @@ -16,6 +16,8 @@
> >  # along with this program.  If not, see .
> >  #
> >  
> > +import importlib.util
> > +import logging
> >  import os
> >  import sys
> >  import tempfile
> > @@ -25,7 +27,7 @@
> >  import random
> >  import subprocess
> >  import glob
> > -from typing import List, Dict, Any, Optional, ContextManager
> > +from typing import List, Dict, Any, Optional, ContextManager, cast
> >  
> >  DEF_GDB_OPTIONS = 'localhost:12345'
> >  
> > @@ -112,6 +114,22 @@ def init_directories(self) -> None:
> >  # Path where qemu goodies live in this source tree.
> >  qemu_srctree_path = Path(__file__, '../../../python').resolve()
> >  
> > +# warn if we happen to be able to find 'qemu' packages from an
> > +# unexpected location (i.e. the package is already installed in
> > +# the user's environment)
> > +qemu_spec = importlib.util.find_spec('qemu.qmp')
> > +if qemu_spec:
> > +spec_path = Path(cast(str, qemu_spec.origin))
> 
> You're casting Optional[str] to str here. The source type is not obvious
> from the local code, so unless you spend some time actively figuring it
> out, this could be casting anything to str. But even knowing this, just
> casting Optional away looks fishy anyway.
> 
> Looking up the ModuleSpec docs, it seems okay to assume that it's never
> None in our case, but wouldn't it be much cleaner to just 'assert
> qemu_spec.origin' first and then use it without the cast?
> 
> > +try:
> > +_ = spec_path.relative_to(qemu_srctree_path)
> > +except ValueError:
> > +self._logger.warning(
> > +"WARNING: 'qemu' package will be imported from outside 
> > "
> > +"the source tree!")
> > +self._logger.warning(
> > +"Importing from: '%s'",
> > +spec_path.parents[1])
> > +
> >  self.pythonpath = os.getenv('PYTHONPATH')
> >  self.pythonpath = os.pathsep.join(filter(None, (
> >  self.source_iotests,
> > @@ -231,6 +249,7 @@ def __init__(self, imgfmt: str, imgproto: str, aiomode: 
> > str,
> >  
> >  self.build_root = os.path.join(self.build_iotests, '..', '..')
> >  
> > +self._logger = logging.getLogger('qemu.iotests')
> >  self.init_directories()
> >  self.init_binaries()
> 
> Looks good otherwise.

Well, actually there is a major problem: We'll pass the right PYTHONPATH
for the test scripts that we're calling, but this script itself doesn't
use it yet. So what I get is:

Traceback (most recent call last):
  File "/home/kwolf/source/qemu/tests/qemu-iotests/build/check", line 120, in <module>
    env = TestEnv(imgfmt=args.imgfmt, imgproto=args.imgproto,
  File "/home/kwolf/source/qemu/tests/qemu-iotests/testenv.py", line 253, in __init__
    self.init_directories()
  File "/home/kwolf/source/qemu/tests/qemu-iotests/testenv.py", line 120, in init_directories
    qemu_spec = importlib.util.find_spec('qemu.qmp')
  File "/usr/lib64/python3.9/importlib/util.py", line 94, in find_spec
    parent = __import__(parent_name, fromlist=['__path__'])
ModuleNotFoundError: No module named 'qemu'

Kevin




Re: [PATCH 2/6] iotests: add warning for rogue 'qemu' packages

2021-09-23 Thread Daniel P . Berrangé
On Wed, Sep 22, 2021 at 08:16:21PM -0400, John Snow wrote:
> Add a warning for when 'iotests' runs against a qemu namespace that
> isn't the one in the source tree. This might occur if you have
> (accidentally) installed the Python namespace package to your local
> packages.

IIUC, it is/was a valid use case to run the iotests on an arbitrary
QEMU outside the source tree, i.e. a distro-packaged binary set.

Are we not allowing the tests/qemu-iotests/* content to be
run outside the context of the main source tree for this
purpose?

E.g. consider if Fedora/RHEL builds put tests/qemu-iotests/* into
a 'qemu-iotests' RPM, which was installed and used with a distro-packaged
python-qemu?

> (I'm not going to say that this is because I bit myself with this,
> but. You can fill in the blanks.)
> Signed-off-by: John Snow 
> ---
>  tests/qemu-iotests/testenv.py | 21 -
>  1 file changed, 20 insertions(+), 1 deletion(-)
> 
> diff --git a/tests/qemu-iotests/testenv.py b/tests/qemu-iotests/testenv.py
> index 88104dace90..8a43b193af5 100644
> --- a/tests/qemu-iotests/testenv.py
> +++ b/tests/qemu-iotests/testenv.py
> @@ -16,6 +16,8 @@
>  # along with this program.  If not, see .
>  #
>  
> +import importlib.util
> +import logging
>  import os
>  import sys
>  import tempfile
> @@ -25,7 +27,7 @@
>  import random
>  import subprocess
>  import glob
> -from typing import List, Dict, Any, Optional, ContextManager
> +from typing import List, Dict, Any, Optional, ContextManager, cast
>  
>  DEF_GDB_OPTIONS = 'localhost:12345'
>  
> @@ -112,6 +114,22 @@ def init_directories(self) -> None:
>  # Path where qemu goodies live in this source tree.
>  qemu_srctree_path = Path(__file__, '../../../python').resolve()
>  
> +# warn if we happen to be able to find 'qemu' packages from an
> +# unexpected location (i.e. the package is already installed in
> +# the user's environment)
> +qemu_spec = importlib.util.find_spec('qemu.qmp')
> +if qemu_spec:
> +spec_path = Path(cast(str, qemu_spec.origin))
> +try:
> +_ = spec_path.relative_to(qemu_srctree_path)
> +except ValueError:
> +self._logger.warning(
> +"WARNING: 'qemu' package will be imported from outside "
> +"the source tree!")
> +self._logger.warning(
> +"Importing from: '%s'",
> +spec_path.parents[1])

It feels to me like the scenario we're blocking here is actually
the scenario we want to aim for.

Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: [PULL 18/28] file-posix: try BLKSECTGET on block devices too, do not round to power of 2

2021-09-23 Thread Paolo Bonzini

On 22/09/21 21:51, Halil Pasic wrote:

We have figured out what is going on here. The problem seems to be
specific to linux aio, which seems to limit the size of the iovec to
1024 (UIO_MAXIOV).


Hi Halil,

I'll send a patch shortly to fix this issue.  Sorry about the delay as I 
was busy with KVM Forum and all that. :(


Paolo



Re: [PATCH 2/6] iotests: add warning for rogue 'qemu' packages

2021-09-23 Thread Kevin Wolf
Am 23.09.2021 um 02:16 hat John Snow geschrieben:
> Add a warning for when 'iotests' runs against a qemu namespace that
> isn't the one in the source tree. This might occur if you have
> (accidentally) installed the Python namespace package to your local
> packages.
> 
> (I'm not going to say that this is because I bit myself with this,
> but. You can fill in the blanks.)
> 
> Signed-off-by: John Snow 
> ---
>  tests/qemu-iotests/testenv.py | 21 -
>  1 file changed, 20 insertions(+), 1 deletion(-)
> 
> diff --git a/tests/qemu-iotests/testenv.py b/tests/qemu-iotests/testenv.py
> index 88104dace90..8a43b193af5 100644
> --- a/tests/qemu-iotests/testenv.py
> +++ b/tests/qemu-iotests/testenv.py
> @@ -16,6 +16,8 @@
>  # along with this program.  If not, see .
>  #
>  
> +import importlib.util
> +import logging
>  import os
>  import sys
>  import tempfile
> @@ -25,7 +27,7 @@
>  import random
>  import subprocess
>  import glob
> -from typing import List, Dict, Any, Optional, ContextManager
> +from typing import List, Dict, Any, Optional, ContextManager, cast
>  
>  DEF_GDB_OPTIONS = 'localhost:12345'
>  
> @@ -112,6 +114,22 @@ def init_directories(self) -> None:
>  # Path where qemu goodies live in this source tree.
>  qemu_srctree_path = Path(__file__, '../../../python').resolve()
>  
> +# warn if we happen to be able to find 'qemu' packages from an
> +# unexpected location (i.e. the package is already installed in
> +# the user's environment)
> +qemu_spec = importlib.util.find_spec('qemu.qmp')
> +if qemu_spec:
> +spec_path = Path(cast(str, qemu_spec.origin))

You're casting Optional[str] to str here. The source type is not obvious
from the local code, so unless you spend some time actively figuring it
out, this could be casting anything to str. But even knowing this, just
casting Optional away looks fishy anyway.

Looking up the ModuleSpec docs, it seems okay to assume that it's never
None in our case, but wouldn't it be much cleaner to just 'assert
qemu_spec.origin' first and then use it without the cast?

> +try:
> +_ = spec_path.relative_to(qemu_srctree_path)
> +except ValueError:
> +self._logger.warning(
> +"WARNING: 'qemu' package will be imported from outside "
> +"the source tree!")
> +self._logger.warning(
> +"Importing from: '%s'",
> +spec_path.parents[1])
> +
>  self.pythonpath = os.getenv('PYTHONPATH')
>  self.pythonpath = os.pathsep.join(filter(None, (
>  self.source_iotests,
> @@ -231,6 +249,7 @@ def __init__(self, imgfmt: str, imgproto: str, aiomode: 
> str,
>  
>  self.build_root = os.path.join(self.build_iotests, '..', '..')
>  
> +self._logger = logging.getLogger('qemu.iotests')
>  self.init_directories()
>  self.init_binaries()

Looks good otherwise.

Kevin




Re: [PATCH 1/6] iotests: add 'qemu' package location to PYTHONPATH in testenv

2021-09-23 Thread Philippe Mathieu-Daudé

On 9/23/21 02:16, John Snow wrote:

We can drop the sys.path hacking in various places by doing
this. Additionally, by doing it in one place right up top, we can print
interesting warnings in case the environment does not look correct.

If we ever decide to change how the environment is crafted, all of the
"help me find my python packages" goop is all in one place, right in one
function.

Signed-off-by: John Snow 
---
  tests/qemu-iotests/235|  2 --
  tests/qemu-iotests/297|  6 --
  tests/qemu-iotests/300|  7 +++
  tests/qemu-iotests/iotests.py |  2 --
  tests/qemu-iotests/testenv.py | 14 +-
  tests/qemu-iotests/tests/mirror-top-perms |  7 +++
  6 files changed, 15 insertions(+), 23 deletions(-)


Reviewed-by: Philippe Mathieu-Daudé 




Re: [PATCH 3/6] iotests/linters: check mypy files all at once

2021-09-23 Thread Philippe Mathieu-Daudé

On 9/23/21 02:16, John Snow wrote:

We can circumvent the '__main__' redefinition problem by passing
--scripts-are-modules. Take mypy out of the loop per-filename and check
everything in one go: it's quite a bit faster.

Signed-off-by: John Snow 
Reviewed-by: Hanna Reitz 
---
  tests/qemu-iotests/297 | 44 +++---
  1 file changed, 20 insertions(+), 24 deletions(-)


Reviewed-by: Philippe Mathieu-Daudé 




Re: [PATCH 8/8] qapi: add blockdev-replace command

2021-09-23 Thread Markus Armbruster
Vladimir Sementsov-Ogievskiy  writes:

> Thanks a lot for reviewing!
>
> 20.09.2021 09:44, Markus Armbruster wrote:
>> Vladimir Sementsov-Ogievskiy  writes:
>> 
>>> Add command that can add and remove filters.
>>>
>>> Key points of functionality:
>>>
>>> What the command does is simply replace some BdrvChild.bs by some other
>>> nodes. The tricky thing is selecting there BdrvChild objects.
>>> To be able to select any kind of BdrvChild we use a generic parent_id,
>>> which may be a node-name, or qdev id or block export id. In future we
>>> may support block jobs.
>>>
>>> Any kind of ambiguity leads to error. If we have both device named
>>> device0 and block export named device0 and they both point to same BDS,
>>> user can't replace root child of one of these parents. So, to be able
>>> to do replacements, user should avoid duplicating names in different
>>> parent namespaces.
>>>
>>> So, command allows to replace any single child in the graph.
>>>
>>> On the other hand we want to realize a kind of bdrv_replace_node(),
>>> which works well when we want to replace all parents of some node. For
>>> this kind of task @parents-mode argument implemented.
>>>
>>> Signed-off-by: Vladimir Sementsov-Ogievskiy 
>>> ---
>>>   qapi/block-core.json | 78 +
>>>   block.c  | 82 
>>>   2 files changed, 160 insertions(+)
>>>
>>> diff --git a/qapi/block-core.json b/qapi/block-core.json
>>> index 675d8265eb..8059b96341 100644
>>> --- a/qapi/block-core.json
>>> +++ b/qapi/block-core.json
>>> @@ -5433,3 +5433,81 @@
>>>   { 'command': 'blockdev-snapshot-delete-internal-sync',
>>> 'data': { 'device': 'str', '*id': 'str', '*name': 'str'},
>>> 'returns': 'SnapshotInfo' }
>>> +
>>> +##
>>> +# @BlockdevReplaceParentsMode:
>>> +#
>>> +# Alternative (to directly set @parent) way to chose parents in
>>> +# @blockdev-replace
>>> +#
>>> +# @exactly-one: Exactly one parent should match a condition, otherwise
>>> +#   @blockdev-replace fails.
>>> +#
>>> +# @all: All matching parents are taken into account. If replacing lead
>>> +#   to loops in block graph, @blockdev-replace fails.
>>> +#
>>> +# @auto: Same as @all, but automatically skip replacing parents if it
>>> +#leads to loops in block graph.
>>> +#
>>> +# Since: 6.2
>>> +##
>>> +{ 'enum': 'BlockdevReplaceParentsMode',
>>> +  'data': ['exactly-one', 'all', 'auto'] }
>>> +
>>> +##
>>> +# @BlockdevReplace:
>>> +#
>>> +# Declaration of one replacement.
>> 
>> Replacement of what?  A node in the block graph?
>
> A specific child node in one or in several edges

Spell that out in the doc comment, please.

>> 
>>> +#
>>> +# @parent: id of parent. It may be qdev or block export or simple
>>> +#  node-name.
>> 
>> It may also be a QOM path, because find_device_state() interprets
>> arguments starting with '/' as QOM paths.
>> 
>> When is a node name "simple"?
>> 
>> Suggest: It may be a qdev ID, a QOM path, a block export ID, or a node
>> name.
>
> OK
>
>> 
>> The trouble is of course that we're merging three separate name spaces.
>
> Yes. Alternatively we can also add an enum of node types [bds, device,
> export[, job]] and select graph nodes more explicitly (by a pair of
> id/path/name and type)
>
> But if we have to use these things in one context, it seems good to enforce
> that users use different names for them. And in this way, we can avoid
> strict typing.

Is there precedent in QMP for merging ID name spaces, or for selecting
a name space?

>> Aside: a single name space for IDs would be so much saner, but we
>> screwed that up long ago.

Throwing some of the multiple name spaces together some of the time
feels like another mistake.

>> 
>>> If id is ambiguous (for example node-name of
>>> +#  some BDS equals to block export name), blockdev-replace
>>> +#  fails.
>> 
>> Is there a way out of this situation, or is replacement simply
>> impossible then?
>
> In my view, it's simply impossible. If someone wants to use this new
> interface, they should take care to use different names for different things.

Reminds me of device_del, which simply could not delete a device without
an ID.  It made many users go "oh" (or possibly a more colorful version
thereof), until Daniel fixed it in commit 6287d827d4 "monitor: allow
device_del to accept QOM paths" for v2.5.

>> 
>>> If not specified, blockdev-replace goes through
>>> +#  @parents-mode scenario, see below. Note, that @parent and
>>> +#  @parents-mode can't be specified simultaneously.
>> 
>> What if neither is specified?  Hmm, @parents-mode has a default, so
>> that's what we get.
>> 
>>> +#  If @parent is specified, only one edge is selected. If
>>> +#  several edges match the condition, blockdev-replace fails.
>>> +#
>>> +# @edge: name of the child. If omitted, any child name matches.
>>> +#
>>> +# @child: node-name of the

Recent qemu patch results in aio failures with host DASD disks resulting in guest I/O errors

2021-09-23 Thread Christian Borntraeger

Am 22.09.21 um 21:51 schrieb Halil Pasic:

On Mon, 6 Sep 2021 16:24:20 +0200
Halil Pasic  wrote:


On Fri, 25 Jun 2021 16:18:12 +0200
Paolo Bonzini  wrote:


bs->sg is only true for character devices, but block devices can also
be used with scsi-block and scsi-generic.  Unfortunately BLKSECTGET
returns bytes in an int for /dev/sgN devices, and sectors in a short
for block devices, so account for that in the code.

The maximum transfer also need not be a power of 2 (for example I have
seen disks with 1280 KiB maximum transfer) so there's no need to pass
the result through pow2floor.

Signed-off-by: Paolo Bonzini 
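
As an aside for readers following the thread, here is a rough sketch of the
BLKSECTGET asymmetry the quoted commit message describes. probe_max_transfer_bytes()
is a hypothetical helper for illustration only, not QEMU's actual code; the real
logic lives in block/file-posix.c.

/*
 * Hypothetical helper illustrating the BLKSECTGET asymmetry: for
 * /dev/sgN the ioctl fills an int with a *byte* count, while for plain
 * block devices it fills an unsigned short with 512-byte *sectors*.
 */
#include <stdint.h>
#include <stdbool.h>
#include <sys/ioctl.h>
#include <linux/fs.h>      /* BLKSECTGET */

static int64_t probe_max_transfer_bytes(int fd, bool is_sg)
{
    if (is_sg) {
        int max_bytes = 0;
        if (ioctl(fd, BLKSECTGET, &max_bytes) == 0) {
            return max_bytes;                   /* already in bytes */
        }
    } else {
        unsigned short max_sectors = 0;
        if (ioctl(fd, BLKSECTGET, &max_sectors) == 0) {
            return (int64_t)max_sectors * 512;  /* sectors -> bytes */
        }
    }
    return 0;   /* unknown: report "no limit" */
}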


We have found that this patch leads to in-guest I/O errors when a DASD
is used as the source device, i.e. in libvirt domain XML terms something like:

 
   
   
   
   
   
 

I don't think this patch is at fault: it looks good to me. But the regression
correlates 100% with it; furthermore, the problem seems to be related to the
value of bl.max_iov, which now comes from sysfs.

We are still investigating what is actually wrong. Just wanted to give
everybody a heads-up that this does seem to cause a nasty regression on
s390x, even if the code itself is perfect.



We have figured out what is going on here. The problem seems to be
specific to linux aio, which seems to limit the size of the iovec to
1024 (UIO_MAXIOV).

Because of this, requests get rejected with -EINVAL by the io_submit()
syscall. Here is a real-world example:

(gdb) p *laiocb
$5 = {co = 0x3ff700072c0, ctx = 0x3fe60002650, iocb = {data = 0x0, aio_rw_flags = 0, key = 0,
  aio_lio_opcode = 8, aio_reqprio = 0, aio_fildes = 38, u = {c = {buf = 0x3ff70055bc0,
  nbytes = 1274, offset = 19977953280, __pad3 = 0, flags = 1, resfd = 43}, v = {
  vec = 0x3ff70055bc0, nr = 0, offset = 19977953280}, poll = {__pad1 = 1023,
  events = 1879399360}, saddr = {addr = 0x3ff70055bc0, len = 0}}}, ret = -22,
  nbytes = 20586496, qiov = 0x3ff700382f8, is_read = false, next = {sqe_next = 0x0}}

On the host kernel side, the -EINVAL comes from this line:
https://elixir.bootlin.com/linux/v5.15-rc2/source/lib/iov_iter.c#L1863
in iovec_from_user() roughly via: __do_sys_io_submit() -> io_submit_one() ->
aio_write() -> aio_setup_rw() -> __import_iovec().

Based on the offline discussion with the DASD maintainers, and on the
in-tree Linux documentation (Documentation/block/queue-sysfs.rst
and Documentation/block/biodoc.rst), I believe that the DASD driver is
not wrong when advertising the value 65535 for queue/max_segments.
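
For context, the 65535 above is simply what the kernel exports in sysfs. The
snippet below is a simplified, hypothetical stand-in for what QEMU's
hdev_get_max_segments() does; the real code derives the sysfs path from the
device's major:minor numbers and has more careful error handling.

/*
 * Simplified stand-in for hdev_get_max_segments(): read the
 * queue/max_segments value a block driver advertises via sysfs.
 * The path construction here is simplified for illustration.
 */
#include <stdio.h>

static long read_max_segments(const char *disk /* e.g. "dasda" */)
{
    char path[256];
    long val = -1;
    FILE *f;

    snprintf(path, sizeof(path), "/sys/block/%s/queue/max_segments", disk);
    f = fopen(path, "r");
    if (!f) {
        return -1;
    }
    if (fscanf(f, "%ld", &val) != 1) {
        val = -1;
    }
    fclose(f);
    return val;   /* 65535 in the DASD case discussed here */
}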

I believe QEMU jumps to the wrong conclusion in blk_get_max_iov() or
in virtio_blk_submit_multireq(); I can't really tell, because I'm not
sure what the semantics of blk_get_max_iov() are. But if I had to, I would
bet that blk_get_max_iov() returns the wrong value when linux aio is
used. I'm also not sure what the exact relationship between max_segments
and max_iov is.

One idea for how to fix this would be to cap max_iov to UIO_MAXIOV
(unconditionally, or only when linux aio is used). But I have to admit
I don't have full clarity here. I couldn't even find documentation that
states this limitation of linux aio (I didn't look for it too hard,
though), so I cannot even be sure this is a QEMU problem.

That is why I decided to write this up first and start a discussion,
with the most relevant people added to the thread, on who is playing foul.
I intend to try fixing the problem myself once my understanding of it
reaches a sufficient level.

Thanks in advance for your contribution!


Changing subject to reflect what we see.
For reference, the QEMU patch is:

https://git.qemu.org/?p=qemu.git;a=commitdiff;h=18473467d55a20d643b6c9b3a52de42f705b4d35;hp=24b36e9813ec15da7db62e3b3621730710c5f020
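
To make the capping idea mentioned above concrete, here is a minimal sketch of
it. cap_max_segments() and QEMU_AIO_MAX_IOV are hypothetical names, and this
only illustrates Halil's suggestion; it is not the fix that was eventually
merged (that is Paolo's max_hw_iov patch earlier in this digest).

/*
 * Minimal sketch of the capping idea: never advertise more segments
 * than io_submit() will accept.  Illustration of the suggestion only,
 * not the merged fix (which introduces a separate max_hw_iov limit).
 */
#define QEMU_AIO_MAX_IOV 1024   /* UIO_MAXIOV in the kernel, IOV_MAX in POSIX */

static int cap_max_segments(int max_segments_from_sysfs)
{
    if (max_segments_from_sysfs <= 0 ||
        max_segments_from_sysfs > QEMU_AIO_MAX_IOV) {
        return QEMU_AIO_MAX_IOV;
    }
    return max_segments_from_sysfs;   /* e.g. 128 for some HBAs */
}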