Re: [Qemu-block] [PATCH 0/7] blockjobs: preliminary refactoring work, Pt 1

2016-10-25 Thread Jeff Cody
On Fri, Oct 14, 2016 at 02:32:55PM -0400, John Snow wrote:
> 
> 
> On 10/13/2016 06:56 PM, John Snow wrote:
> >This is a follow-up to patches 1-6 of:
> >[PATCH v2 00/11] blockjobs: Fix transactional race condition
> >
> >That series started trying to refactor blockjobs with the goal of
> >internalizing BlockJob state as a side effect of having gone through
> >the effort of figuring out which commands were "safe" to call on
> >a Job that has no coroutine object.
> >
> >I've split out the less contentious bits so I can move forward with my
> >original work of focusing on the transactional race condition in a
> >different series.
> >
> >Functionally the biggest difference here is the presence of "internal"
> >block jobs, which do not emit QMP events or show up in block query
> >requests. This is done for the sake of replication jobs, which should
> >not be interfering with the public jobs namespace.
> >
> 
> I have v2 ready to send out correcting Kevin's comments in patch #01, but
> I'd like to have the Replication maintainers at Fujitsu take a look at how I
> have modified replication and at least 'ACK' the change.
> 
> As a recap: I am creating "internal" block jobs that have no ID and
> therefore do not collide with the user-specified jobs namespace. This way
> users cannot query, cancel, pause, or otherwise accidentally interfere with
> the replication job lifetime.
> 
> It also means that management layers such as libvirt will not be aware of
> the presence of such "internal" jobs.
> 
> Relevant patches are 1-3. Please let me know if you have questions.
> 
> Thanks,
> --John Snow
>

Looks good to me, once you address Kevin's comments in patch 1.

> 
> >
> >
> >For convenience, this branch is available at:
> >https://github.com/jnsnow/qemu.git branch job-refactor-pt1
> >https://github.com/jnsnow/qemu/tree/job-refactor-pt1
> >
> >This version is tagged job-refactor-pt1-v1:
> >https://github.com/jnsnow/qemu/releases/tag/job-refactor-pt1-v1
> >
> >John Snow (7):
> >  blockjobs: hide internal jobs from management API
> >  blockjobs: Allow creating internal jobs
> >  Replication/Blockjobs: Create replication jobs as internal
> >  blockjob: centralize QMP event emissions
> >  Blockjobs: Internalize user_pause logic
> >  blockjobs: split interface into public/private, Part 1
> >  blockjobs: fix documentation
> >
> > block/backup.c   |   5 +-
> > block/commit.c   |  10 +-
> > block/mirror.c   |  28 +++--
> > block/replication.c  |  14 +--
> > block/stream.c   |   9 +-
> > block/trace-events   |   5 +-
> > blockdev.c   |  74 +
> > blockjob.c   | 109 ++
> > include/block/block.h|   3 +-
> > include/block/block_int.h|  26 ++---
> > include/block/blockjob.h | 257 
> > +++
> > include/block/blockjob_int.h | 232 ++
> > qemu-img.c   |   5 +-
> > tests/test-blockjob-txn.c|   5 +-
> > tests/test-blockjob.c|   4 +-
> > 15 files changed, 443 insertions(+), 343 deletions(-)
> > create mode 100644 include/block/blockjob_int.h
> >



Re: [Qemu-block] [PATCH 7/7] blockjobs: fix documentation

2016-10-25 Thread Jeff Cody
On Thu, Oct 13, 2016 at 06:57:02PM -0400, John Snow wrote:
> (Trivial)
> 
> Fix wrong function names in documentation.
> 
> Signed-off-by: John Snow 
> ---
>  include/block/blockjob_int.h | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/include/block/blockjob_int.h b/include/block/blockjob_int.h
> index 8eced19..10ebb38 100644
> --- a/include/block/blockjob_int.h
> +++ b/include/block/blockjob_int.h
> @@ -191,8 +191,8 @@ void coroutine_fn block_job_pause_point(BlockJob *job);
>  void block_job_enter(BlockJob *job);
>  
>  /**
> - * block_job_ready:
> - * @job: The job which is now ready to complete.
> + * block_job_event_ready:
> + * @job: The job which is now ready to be completed.
>   *
>   * Send a BLOCK_JOB_READY event for the specified job.
>   */
> -- 
> 2.7.4
> 

Reviewed-by: Jeff Cody 



Re: [Qemu-block] [PATCH 5/7] Blockjobs: Internalize user_pause logic

2016-10-25 Thread Jeff Cody
On Thu, Oct 13, 2016 at 06:57:00PM -0400, John Snow wrote:
> BlockJobs will begin hiding their state in preparation for some
> refactorings anyway, so let's internalize the user_pause mechanism
> instead of leaving it to callers to correctly manage.
> 
> Signed-off-by: John Snow 
> ---
>  blockdev.c   | 12 +---
>  blockjob.c   | 22 --
>  include/block/blockjob.h | 26 ++
>  3 files changed, 51 insertions(+), 9 deletions(-)
> 
> diff --git a/blockdev.c b/blockdev.c
> index 22a1280..1661d08 100644
> --- a/blockdev.c
> +++ b/blockdev.c
> @@ -3579,7 +3579,7 @@ void qmp_block_job_cancel(const char *device,
>  force = false;
>  }
>  
> -if (job->user_paused && !force) {
> +if (block_job_user_paused(job) && !force) {
>  error_setg(errp, "The block job for device '%s' is currently paused",
> device);
>  goto out;
> @@ -3596,13 +3596,12 @@ void qmp_block_job_pause(const char *device, Error 
> **errp)
>  AioContext *aio_context;
>  BlockJob *job = find_block_job(device, _context, errp);
>  
> -if (!job || job->user_paused) {
> +if (!job || block_job_user_paused(job)) {
>  return;
>  }
>  
> -job->user_paused = true;
>  trace_qmp_block_job_pause(job);
> -block_job_pause(job);
> +block_job_user_pause(job);
>  aio_context_release(aio_context);
>  }
>  
> @@ -3611,14 +3610,13 @@ void qmp_block_job_resume(const char *device, Error 
> **errp)
>  AioContext *aio_context;
>  BlockJob *job = find_block_job(device, _context, errp);
>  
> -if (!job || !job->user_paused) {
> +if (!job || !block_job_user_paused(job)) {
>  return;
>  }
>  
> -job->user_paused = false;
>  trace_qmp_block_job_resume(job);
>  block_job_iostatus_reset(job);
> -block_job_resume(job);
> +block_job_user_resume(job);
>  aio_context_release(aio_context);
>  }
>  
> diff --git a/blockjob.c b/blockjob.c
> index e32cb78..d118a1f 100644
> --- a/blockjob.c
> +++ b/blockjob.c
> @@ -362,11 +362,22 @@ void block_job_pause(BlockJob *job)
>  job->pause_count++;
>  }
>  
> +void block_job_user_pause(BlockJob *job)
> +{
> +job->user_paused = true;
> +block_job_pause(job);
> +}
> +
>  static bool block_job_should_pause(BlockJob *job)
>  {
>  return job->pause_count > 0;
>  }
>  
> +bool block_job_user_paused(BlockJob *job)
> +{
> +return job ? job->user_paused : 0;
> +}
> +
>  void coroutine_fn block_job_pause_point(BlockJob *job)
>  {
>  if (!block_job_should_pause(job)) {
> @@ -403,6 +414,14 @@ void block_job_resume(BlockJob *job)
>  block_job_enter(job);
>  }
>  
> +void block_job_user_resume(BlockJob *job)
> +{
> +if (job && job->user_paused && job->pause_count > 0) {
> +job->user_paused = false;
> +block_job_resume(job);
> +}
> +}
> +
>  void block_job_enter(BlockJob *job)
>  {
>  if (job->co && !job->busy) {
> @@ -626,8 +645,7 @@ BlockErrorAction block_job_error_action(BlockJob *job, 
> BlockdevOnError on_err,
>  }
>  if (action == BLOCK_ERROR_ACTION_STOP) {
>  /* make the pause user visible, which will be resumed from QMP. */
> -job->user_paused = true;
> -block_job_pause(job);
> +block_job_user_pause(job);
>  block_job_iostatus_set_err(job, error);
>  }
>  return action;
> diff --git a/include/block/blockjob.h b/include/block/blockjob.h
> index 928f0b8..5b61140 100644
> --- a/include/block/blockjob.h
> +++ b/include/block/blockjob.h
> @@ -358,6 +358,23 @@ void coroutine_fn block_job_pause_point(BlockJob *job);
>  void block_job_pause(BlockJob *job);
>  
>  /**
> + * block_job_user_pause:
> + * @job: The job to be paused.
> + *
> + * Asynchronously pause the specified job.
> + * Do not allow a resume until a matching call to block_job_user_resume.
> + */
> +void block_job_user_pause(BlockJob *job);
> +
> +/**
> + * block_job_paused:
> + * @job: The job to query.
> + *
> + * Returns true if the job is user-paused.
> + */
> +bool block_job_user_paused(BlockJob *job);
> +
> +/**
>   * block_job_resume:
>   * @job: The job to be resumed.
>   *
> @@ -366,6 +383,15 @@ void block_job_pause(BlockJob *job);
>  void block_job_resume(BlockJob *job);
>  
>  /**
> + * block_job_user_resume:
> + * @job: The job to be resumed.
> + *
> + * Resume the specified job.
> + * Must be paired with a preceding block_job_user_pause.
> + */
> +void block_job_user_resume(BlockJob *job);
> +
> +/**
>   * block_job_enter:
>   * @job: The job to enter.
>   *
> -- 
> 2.7.4
> 

Reviewed-by: Jeff Cody 



Re: [Qemu-block] [PATCH 4/7] blockjob: centralize QMP event emissions

2016-10-25 Thread Jeff Cody
On Thu, Oct 13, 2016 at 06:56:59PM -0400, John Snow wrote:
> There's no reason to leave this to blockdev; we can do it in blockjobs
> directly and get rid of an extra callback for most users.
> 
> All non-internal events, even those created outside of QMP, will
> consistently emit events.
> 
> Signed-off-by: John Snow 
> ---
>  block/commit.c|  8 
>  block/mirror.c|  6 ++
>  block/stream.c|  7 +++
>  block/trace-events|  5 ++---
>  blockdev.c| 42 --
>  blockjob.c| 23 +++
>  include/block/block_int.h | 17 -
>  include/block/blockjob.h  | 17 -
>  8 files changed, 42 insertions(+), 83 deletions(-)
> 
> diff --git a/block/commit.c b/block/commit.c
> index f29e341..475a375 100644
> --- a/block/commit.c
> +++ b/block/commit.c
> @@ -209,8 +209,8 @@ static const BlockJobDriver commit_job_driver = {
>  
>  void commit_start(const char *job_id, BlockDriverState *bs,
>BlockDriverState *base, BlockDriverState *top, int64_t 
> speed,
> -  BlockdevOnError on_error, BlockCompletionFunc *cb,
> -  void *opaque, const char *backing_file_str, Error **errp)
> +  BlockdevOnError on_error, const char *backing_file_str,
> +  Error **errp)
>  {
>  CommitBlockJob *s;
>  BlockReopenQueue *reopen_queue = NULL;
> @@ -233,7 +233,7 @@ void commit_start(const char *job_id, BlockDriverState 
> *bs,
>  }
>  
>  s = block_job_create(job_id, _job_driver, bs, speed,
> - BLOCK_JOB_DEFAULT, cb, opaque, errp);
> + BLOCK_JOB_DEFAULT, NULL, NULL, errp);
>  if (!s) {
>  return;
>  }
> @@ -276,7 +276,7 @@ void commit_start(const char *job_id, BlockDriverState 
> *bs,
>  s->on_error = on_error;
>  s->common.co = qemu_coroutine_create(commit_run, s);
>  
> -trace_commit_start(bs, base, top, s, s->common.co, opaque);
> +trace_commit_start(bs, base, top, s, s->common.co);
>  qemu_coroutine_enter(s->common.co);
>  }
>  
> diff --git a/block/mirror.c b/block/mirror.c
> index 15d2d10..4374fb4 100644
> --- a/block/mirror.c
> +++ b/block/mirror.c
> @@ -979,9 +979,7 @@ void mirror_start(const char *job_id, BlockDriverState 
> *bs,
>MirrorSyncMode mode, BlockMirrorBackingMode backing_mode,
>BlockdevOnError on_source_error,
>BlockdevOnError on_target_error,
> -  bool unmap,
> -  BlockCompletionFunc *cb,
> -  void *opaque, Error **errp)
> +  bool unmap, Error **errp)
>  {
>  bool is_none_mode;
>  BlockDriverState *base;
> @@ -994,7 +992,7 @@ void mirror_start(const char *job_id, BlockDriverState 
> *bs,
>  base = mode == MIRROR_SYNC_MODE_TOP ? backing_bs(bs) : NULL;
>  mirror_start_job(job_id, bs, BLOCK_JOB_DEFAULT, target, replaces,
>   speed, granularity, buf_size, backing_mode,
> - on_source_error, on_target_error, unmap, cb, opaque, 
> errp,
> + on_source_error, on_target_error, unmap, NULL, NULL, 
> errp,
>   _job_driver, is_none_mode, base, false);
>  }
>  
> diff --git a/block/stream.c b/block/stream.c
> index eeb6f52..7d6877d 100644
> --- a/block/stream.c
> +++ b/block/stream.c
> @@ -216,13 +216,12 @@ static const BlockJobDriver stream_job_driver = {
>  
>  void stream_start(const char *job_id, BlockDriverState *bs,
>BlockDriverState *base, const char *backing_file_str,
> -  int64_t speed, BlockdevOnError on_error,
> -  BlockCompletionFunc *cb, void *opaque, Error **errp)
> +  int64_t speed, BlockdevOnError on_error, Error **errp)
>  {
>  StreamBlockJob *s;
>  
>  s = block_job_create(job_id, _job_driver, bs, speed,
> - BLOCK_JOB_DEFAULT, cb, opaque, errp);
> + BLOCK_JOB_DEFAULT, NULL, NULL, errp);
>  if (!s) {
>  return;
>  }
> @@ -232,6 +231,6 @@ void stream_start(const char *job_id, BlockDriverState 
> *bs,
>  
>  s->on_error = on_error;
>  s->common.co = qemu_coroutine_create(stream_run, s);
> -trace_stream_start(bs, base, s, s->common.co, opaque);
> +trace_stream_start(bs, base, s, s->common.co);
>  qemu_coroutine_enter(s->common.co);
>  }
> diff --git a/block/trace-events b/block/trace-events
> index 05fa13c..c12f91b 100644
> --- a/block/trace-events
> +++ b/block/trace-events
> @@ -20,11 +20,11 @@ bdrv_co_do_copy_on_readv(void *bs, int64_t offset, 
> unsigned int bytes, int64_t c
>  
>  # block/stream.c
>  stream_one_iteration(void *s, int64_t sector_num, int nb_sectors, int 
> is_allocated) "s %p sector_num %"PRId64" nb_sectors %d is_allocated %d"
> -stream_start(void *bs, void *base, void *s, 

Re: [Qemu-block] [PATCH 3/7] Replication/Blockjobs: Create replication jobs as internal

2016-10-25 Thread Jeff Cody
On Thu, Oct 13, 2016 at 06:56:58PM -0400, John Snow wrote:
> Bubble up the internal interface to commit and backup jobs, then switch
> replication tasks over to using this methodology.
> 
> Signed-off-by: John Snow 
> ---
>  block/backup.c|  3 ++-
>  block/mirror.c| 21 ++---
>  block/replication.c   | 14 +++---
>  blockdev.c| 11 +++
>  include/block/block_int.h |  9 +++--
>  qemu-img.c|  5 +++--
>  6 files changed, 36 insertions(+), 27 deletions(-)
> 
> diff --git a/block/backup.c b/block/backup.c
> index 5acb5c4..6a60ca8 100644
> --- a/block/backup.c
> +++ b/block/backup.c
> @@ -527,6 +527,7 @@ void backup_start(const char *job_id, BlockDriverState 
> *bs,
>bool compress,
>BlockdevOnError on_source_error,
>BlockdevOnError on_target_error,
> +  int creation_flags,
>BlockCompletionFunc *cb, void *opaque,
>BlockJobTxn *txn, Error **errp)
>  {
> @@ -596,7 +597,7 @@ void backup_start(const char *job_id, BlockDriverState 
> *bs,
>  }
>  
>  job = block_job_create(job_id, _job_driver, bs, speed,
> -   BLOCK_JOB_DEFAULT, cb, opaque, errp);
> +   creation_flags, cb, opaque, errp);
>  if (!job) {
>  goto error;
>  }
> diff --git a/block/mirror.c b/block/mirror.c
> index 74c03ae..15d2d10 100644
> --- a/block/mirror.c
> +++ b/block/mirror.c
> @@ -906,9 +906,9 @@ static const BlockJobDriver commit_active_job_driver = {
>  };
>  
>  static void mirror_start_job(const char *job_id, BlockDriverState *bs,
> - BlockDriverState *target, const char *replaces,
> - int64_t speed, uint32_t granularity,
> - int64_t buf_size,
> + int creation_flags, BlockDriverState *target,
> + const char *replaces, int64_t speed,
> + uint32_t granularity, int64_t buf_size,
>   BlockMirrorBackingMode backing_mode,
>   BlockdevOnError on_source_error,
>   BlockdevOnError on_target_error,
> @@ -936,8 +936,8 @@ static void mirror_start_job(const char *job_id, 
> BlockDriverState *bs,
>  buf_size = DEFAULT_MIRROR_BUF_SIZE;
>  }
>  
> -s = block_job_create(job_id, driver, bs, speed,
> - BLOCK_JOB_DEFAULT, cb, opaque, errp);
> +s = block_job_create(job_id, driver, bs, speed, creation_flags,
> + cb, opaque, errp);
>  if (!s) {
>  return;
>  }
> @@ -992,17 +992,16 @@ void mirror_start(const char *job_id, BlockDriverState 
> *bs,
>  }
>  is_none_mode = mode == MIRROR_SYNC_MODE_NONE;
>  base = mode == MIRROR_SYNC_MODE_TOP ? backing_bs(bs) : NULL;
> -mirror_start_job(job_id, bs, target, replaces,
> +mirror_start_job(job_id, bs, BLOCK_JOB_DEFAULT, target, replaces,
>   speed, granularity, buf_size, backing_mode,
>   on_source_error, on_target_error, unmap, cb, opaque, 
> errp,
>   _job_driver, is_none_mode, base, false);
>  }
>  
>  void commit_active_start(const char *job_id, BlockDriverState *bs,
> - BlockDriverState *base, int64_t speed,
> - BlockdevOnError on_error,
> - BlockCompletionFunc *cb,
> - void *opaque, Error **errp,
> + BlockDriverState *base, int creation_flags,
> + int64_t speed, BlockdevOnError on_error,
> + BlockCompletionFunc *cb, void *opaque, Error **errp,
>   bool auto_complete)
>  {
>  int64_t length, base_length;
> @@ -1041,7 +1040,7 @@ void commit_active_start(const char *job_id, 
> BlockDriverState *bs,
>  }
>  }
>  
> -mirror_start_job(job_id, bs, base, NULL, speed, 0, 0,
> +mirror_start_job(job_id, bs, creation_flags, base, NULL, speed, 0, 0,
>   MIRROR_LEAVE_BACKING_CHAIN,
>   on_error, on_error, false, cb, opaque, _err,
>   _active_job_driver, false, base, auto_complete);
> diff --git a/block/replication.c b/block/replication.c
> index 3bd1cf1..d4f4a7b 100644
> --- a/block/replication.c
> +++ b/block/replication.c
> @@ -496,10 +496,11 @@ static void replication_start(ReplicationState *rs, 
> ReplicationMode mode,
>  bdrv_op_block_all(top_bs, s->blocker);
>  bdrv_op_unblock(top_bs, BLOCK_OP_TYPE_DATAPLANE, s->blocker);
>  
> -backup_start("replication-backup", s->secondary_disk->bs,
> - s->hidden_disk->bs, 0, MIRROR_SYNC_MODE_NONE, NULL, 
> false,
> +backup_start(NULL, s->secondary_disk->bs, 

Re: [Qemu-block] [PATCH 2/7] blockjobs: Allow creating internal jobs

2016-10-25 Thread Jeff Cody
On Thu, Oct 13, 2016 at 06:56:57PM -0400, John Snow wrote:
> Add the ability to create jobs without an ID.
> 
> Signed-off-by: John Snow 
> ---
>  block/backup.c|  2 +-
>  block/commit.c|  2 +-
>  block/mirror.c|  3 ++-
>  block/stream.c|  2 +-
>  blockjob.c| 25 -
>  include/block/blockjob.h  |  7 ++-
>  tests/test-blockjob-txn.c |  3 ++-
>  tests/test-blockjob.c |  2 +-
>  8 files changed, 30 insertions(+), 16 deletions(-)
> 
> diff --git a/block/backup.c b/block/backup.c
> index 582bd0f..5acb5c4 100644
> --- a/block/backup.c
> +++ b/block/backup.c
> @@ -596,7 +596,7 @@ void backup_start(const char *job_id, BlockDriverState 
> *bs,
>  }
>  
>  job = block_job_create(job_id, _job_driver, bs, speed,
> -   cb, opaque, errp);
> +   BLOCK_JOB_DEFAULT, cb, opaque, errp);
>  if (!job) {
>  goto error;
>  }
> diff --git a/block/commit.c b/block/commit.c
> index 9f67a8b..f29e341 100644
> --- a/block/commit.c
> +++ b/block/commit.c
> @@ -233,7 +233,7 @@ void commit_start(const char *job_id, BlockDriverState 
> *bs,
>  }
>  
>  s = block_job_create(job_id, _job_driver, bs, speed,
> - cb, opaque, errp);
> + BLOCK_JOB_DEFAULT, cb, opaque, errp);
>  if (!s) {
>  return;
>  }
> diff --git a/block/mirror.c b/block/mirror.c
> index f9d1fec..74c03ae 100644
> --- a/block/mirror.c
> +++ b/block/mirror.c
> @@ -936,7 +936,8 @@ static void mirror_start_job(const char *job_id, 
> BlockDriverState *bs,
>  buf_size = DEFAULT_MIRROR_BUF_SIZE;
>  }
>  
> -s = block_job_create(job_id, driver, bs, speed, cb, opaque, errp);
> +s = block_job_create(job_id, driver, bs, speed,
> + BLOCK_JOB_DEFAULT, cb, opaque, errp);
>  if (!s) {
>  return;
>  }
> diff --git a/block/stream.c b/block/stream.c
> index 3187481..eeb6f52 100644
> --- a/block/stream.c
> +++ b/block/stream.c
> @@ -222,7 +222,7 @@ void stream_start(const char *job_id, BlockDriverState 
> *bs,
>  StreamBlockJob *s;
>  
>  s = block_job_create(job_id, _job_driver, bs, speed,
> - cb, opaque, errp);
> + BLOCK_JOB_DEFAULT, cb, opaque, errp);
>  if (!s) {
>  return;
>  }
> diff --git a/blockjob.c b/blockjob.c
> index e78ad94..017905a 100644
> --- a/blockjob.c
> +++ b/blockjob.c
> @@ -118,7 +118,7 @@ static void block_job_detach_aio_context(void *opaque)
>  }
>  
>  void *block_job_create(const char *job_id, const BlockJobDriver *driver,
> -   BlockDriverState *bs, int64_t speed,
> +   BlockDriverState *bs, int64_t speed, int flags,
> BlockCompletionFunc *cb, void *opaque, Error **errp)
>  {
>  BlockBackend *blk;
> @@ -130,7 +130,7 @@ void *block_job_create(const char *job_id, const 
> BlockJobDriver *driver,
>  return NULL;
>  }
>  
> -if (job_id == NULL) {
> +if (job_id == NULL && !(flags & BLOCK_JOB_INTERNAL)) {
>  job_id = bdrv_get_device_name(bs);
>  if (!*job_id) {
>  error_setg(errp, "An explicit job ID is required for this node");
> @@ -138,14 +138,21 @@ void *block_job_create(const char *job_id, const 
> BlockJobDriver *driver,
>  }
>  }
>  
> -if (!id_wellformed(job_id)) {
> -error_setg(errp, "Invalid job ID '%s'", job_id);
> -return NULL;
> -}
> +if (job_id) {
> +if (flags & BLOCK_JOB_INTERNAL) {
> +error_setg(errp, "Cannot specify job ID for internal block job");
> +return NULL;
> +}
>  
> -if (block_job_get(job_id)) {
> -error_setg(errp, "Job ID '%s' already in use", job_id);
> -return NULL;
> +if (!id_wellformed(job_id)) {
> +error_setg(errp, "Invalid job ID '%s'", job_id);
> +return NULL;
> +}
> +
> +if (block_job_get(job_id)) {
> +error_setg(errp, "Job ID '%s' already in use", job_id);
> +return NULL;
> +}
>  }
>  
>  blk = blk_new();
> diff --git a/include/block/blockjob.h b/include/block/blockjob.h
> index 6ecfa2e..fdb31e0 100644
> --- a/include/block/blockjob.h
> +++ b/include/block/blockjob.h
> @@ -200,6 +200,11 @@ struct BlockJob {
>  QLIST_ENTRY(BlockJob) txn_list;
>  };
>  
> +typedef enum BlockJobCreateFlags {
> +BLOCK_JOB_DEFAULT = 0x00,
> +BLOCK_JOB_INTERNAL = 0x01,
> +} BlockJobCreateFlags;
> +
>  /**
>   * block_job_next:
>   * @job: A block job, or %NULL.
> @@ -242,7 +247,7 @@ BlockJob *block_job_get(const char *id);
>   * called from a wrapper that is specific to the job type.
>   */
>  void *block_job_create(const char *job_id, const BlockJobDriver *driver,
> -   BlockDriverState *bs, int64_t speed,
> +   BlockDriverState 

Re: [Qemu-block] [PATCH RFC 0/7] COLO block replication supports shared disk case

2016-10-25 Thread Changlong Xie

I did't review p5/p6, I think you can merge p5/p6 into a single one.
Also don't forget update qapi/block-core.json with p3.

Thanks
-Xie

On 10/20/2016 09:57 PM, zhanghailiang wrote:

COLO block replication doesn't support the shared disk case,
Here we try to implement it.

Just as the scenario of non-shared disk block replication,
we are going to implement block replication from many basic
blocks that are already in QEMU.
The architecture is:

  virtio-blk ||   
.--
  /  ||   | 
Secondary
 /   ||   
'--
/|| 
virtio-blk
   / || 
 |
   | ||   
replication(5)
   |NBD  >   NBD   (2)  
 |
   |  client ||server ---> hidden disk <-- 
active disk(4)
   | ^   ||  |
   |  replication(1) ||  |
   | |   ||  |
   |   +-'   ||  |
  (3)  |drive-backup sync=none   ||  |
. |   +-+   ||  |
Primary | | |   ||   backing|
' | |   ||  |
   V |   |
+---+|
|   shared disk | <--+
+---+
1) Primary writes will read original data and forward it to Secondary
QEMU.
2) The hidden-disk will buffers the original content that is modified
by the primary VM. It should also be an empty disk, and
the driver supports bdrv_make_empty() and backing file.
3) Primary write requests will be written to Shared disk.
4) Secondary write requests will be buffered in the active disk and it
will overwrite the existing sector content in the buffe

For more details, please refer to patch 1.

The complete codes can be found from the link:
https://github.com/coloft/qemu/tree/colo-v5.1-developing-COLO-frame-v21-with-shared-disk

Test steps:
1. Secondary:
# x86_64-softmmu/qemu-system-x86_64 -boot c -m 2048 -smp 2 -qmp stdio -vnc :9 
-name secondary -enable-kvm -cpu qemu64,+kvmclock -device piix3-usb-uhci -drive 
if=none,driver=qcow2,file.filename=/mnt/ramfs/hidden_disk.img,id=hidden_disk0,backing.driver=raw,backing.file.filename=/work/kvm/suse11_sp3_64
  -drive 
if=virtio,id=active-disk0,driver=replication,mode=secondary,file.driver=qcow2,top-id=active-disk0,file.file.filename=/mnt/ramfs/active_disk.img,file.backing=hidden_disk0,shared-disk=on
 -incoming tcp:0:

Issue qmp commands:
{'execute':'qmp_capabilities'}
{'execute': 'nbd-server-start', 'arguments': {'addr': {'type': 'inet', 'data': 
{'host': '0', 'port': '9998'} } } }
{'execute': 'nbd-server-add', 'arguments': {'device': 'hidden_disk0', 
'writable': true } }

2.Primary:
# x86_64-softmmu/qemu-system-x86_64 -enable-kvm -m 2048 -smp 2 -qmp stdio -vnc 
:9 -name primary -cpu qemu64,+kvmclock -device piix3-usb-uhci -drive 
if=virtio,id=primary_disk0,file.filename=/work/kvm/suse11_sp3_64,driver=raw -S

Issue qmp commands:
{'execute':'qmp_capabilities'}
{'execute': 'human-monitor-command', 'arguments': {'command-line': 'drive_add 
-n buddy 
driver=replication,mode=primary,file.driver=nbd,file.host=9.42.3.17,file.port=9998,file.export=hidden_disk0,shared-disk-id=primary_disk0,shared-disk=on,node-name=rep'}}
{'execute': 'migrate-set-capabilities', 'arguments': {'capabilities': [ 
{'capability': 'x-colo', 'state': true } ] } }
{'execute': 'migrate', 'arguments': {'uri': 'tcp:9.42.3.17:' } }

3. Failover
Secondary side:
Issue qmp commands:
{ 'execute': 'nbd-server-stop' }
{ "execute": "x-colo-lost-heartbeat" }

Please review and any commits are welcomed.

Cc: Juan Quintela 
Cc: Amit Shah 
Cc: Dr. David Alan Gilbert (git) 

zhanghailiang (7):
   docs/block-replication: Add description for shared-disk case
   block-backend: Introduce blk_root() helper
   replication: add shared-disk and shared-disk-id options
   replication: Split out backup_do_checkpoint() from
 secondary_do_checkpoint()
   replication: fix code logic with the new shared_disk option
   replication: Implement block replication for shared disk case
   nbd/replication: implement .bdrv_get_info() for nbd and replication
 driver

  block/block-backend.c  |   5 ++
  block/nbd.c|  12 

Re: [Qemu-block] [PATCH RFC 3/7] replication: add shared-disk and shared-disk-id options

2016-10-25 Thread Changlong Xie

On 10/20/2016 09:57 PM, zhanghailiang wrote:

We use these two options to identify which disk is
shared

Signed-off-by: zhanghailiang 
Signed-off-by: Wen Congyang 
Signed-off-by: Zhang Chen 
---
  block/replication.c | 33 +
  1 file changed, 33 insertions(+)

diff --git a/block/replication.c b/block/replication.c
index 3bd1cf1..2a2fdb2 100644
--- a/block/replication.c
+++ b/block/replication.c
@@ -25,9 +25,12 @@
  typedef struct BDRVReplicationState {
  ReplicationMode mode;
  int replication_state;
+bool is_shared_disk;
+char *shared_disk_id;
  BdrvChild *active_disk;
  BdrvChild *hidden_disk;
  BdrvChild *secondary_disk;
+BdrvChild *primary_disk;
  char *top_id;
  ReplicationState *rs;
  Error *blocker;
@@ -53,6 +56,9 @@ static void replication_stop(ReplicationState *rs, bool 
failover,

  #define REPLICATION_MODE"mode"
  #define REPLICATION_TOP_ID  "top-id"
+#define REPLICATION_SHARED_DISK "shared-disk"
+#define REPLICATION_SHARED_DISK_ID "shared-disk-id"
+
  static QemuOptsList replication_runtime_opts = {
  .name = "replication",
  .head = QTAILQ_HEAD_INITIALIZER(replication_runtime_opts.head),
@@ -65,6 +71,14 @@ static QemuOptsList replication_runtime_opts = {
  .name = REPLICATION_TOP_ID,
  .type = QEMU_OPT_STRING,
  },
+{
+.name = REPLICATION_SHARED_DISK_ID,
+.type = QEMU_OPT_STRING,
+},
+{
+.name = REPLICATION_SHARED_DISK,
+.type = QEMU_OPT_BOOL,
+},
  { /* end of list */ }
  },
  };
@@ -85,6 +99,8 @@ static int replication_open(BlockDriverState *bs, QDict 
*options,
  QemuOpts *opts = NULL;
  const char *mode;
  const char *top_id;
+const char *shared_disk_id;
+BlockBackend *blk;

  ret = -EINVAL;
  opts = qemu_opts_create(_runtime_opts, NULL, 0, _abort);
@@ -114,6 +130,22 @@ static int replication_open(BlockDriverState *bs, QDict 
*options,
 "The option mode's value should be primary or secondary");
  goto fail;
  }


Now we have four runtime options 
"mode"/"top-id"/"shared-disk"/"shared-disk-id". But the current checking 
logic is too weak, i think you need enhance it to avoid opts misusage.



+s->is_shared_disk = qemu_opt_get_bool(opts, REPLICATION_SHARED_DISK,
+false);


Missing one space.


+if (s->is_shared_disk && (s->mode == REPLICATION_MODE_PRIMARY)) {
+shared_disk_id = qemu_opt_get(opts, REPLICATION_SHARED_DISK_ID);
+if (!shared_disk_id) {
+error_setg(_err, "Missing shared disk blk");
+goto fail;
+}
+s->shared_disk_id = g_strdup(shared_disk_id);
+blk = blk_by_name(s->shared_disk_id);
+if (!blk) {
+error_setg(_err, "There is no %s block", s->shared_disk_id);
+goto fail;
+}
+s->primary_disk = blk_root(blk);
+}

  s->rs = replication_new(bs, _ops);

@@ -130,6 +162,7 @@ static void replication_close(BlockDriverState *bs)
  {
  BDRVReplicationState *s = bs->opaque;

+g_free(s->shared_disk_id);
  if (s->replication_state == BLOCK_REPLICATION_RUNNING) {
  replication_stop(s->rs, false, NULL);
  }







Re: [Qemu-block] [PATCH RFC 4/7] replication: Split out backup_do_checkpoint() from secondary_do_checkpoint()

2016-10-25 Thread Changlong Xie

On 10/20/2016 09:57 PM, zhanghailiang wrote:

The helper backup_do_checkpoint() will be used for primary related
codes. Here we split it out from secondary_do_checkpoint().

Besides, it is unnecessary to call backup_do_checkpoint() in
replication starting and normally stop replication path.


This patch is unnecessary. We *really* need clean 
backup_job->done_bitmap in replication_start/stop path.



We only need call it while do real checkpointing.

Signed-off-by: zhanghailiang 
---
  block/replication.c | 36 +++-
  1 file changed, 19 insertions(+), 17 deletions(-)

diff --git a/block/replication.c b/block/replication.c
index 2a2fdb2..d687ffc 100644
--- a/block/replication.c
+++ b/block/replication.c
@@ -320,20 +320,8 @@ static bool 
replication_recurse_is_first_non_filter(BlockDriverState *bs,

  static void secondary_do_checkpoint(BDRVReplicationState *s, Error **errp)
  {
-Error *local_err = NULL;
  int ret;

-if (!s->secondary_disk->bs->job) {
-error_setg(errp, "Backup job was cancelled unexpectedly");
-return;
-}
-
-backup_do_checkpoint(s->secondary_disk->bs->job, _err);
-if (local_err) {
-error_propagate(errp, local_err);
-return;
-}
-
  ret = s->active_disk->bs->drv->bdrv_make_empty(s->active_disk->bs);
  if (ret < 0) {
  error_setg(errp, "Cannot make active disk empty");
@@ -539,6 +527,8 @@ static void replication_start(ReplicationState *rs, 
ReplicationMode mode,
  aio_context_release(aio_context);
  return;
  }
+
+secondary_do_checkpoint(s, errp);
  break;
  default:
  aio_context_release(aio_context);
@@ -547,10 +537,6 @@ static void replication_start(ReplicationState *rs, 
ReplicationMode mode,

  s->replication_state = BLOCK_REPLICATION_RUNNING;

-if (s->mode == REPLICATION_MODE_SECONDARY) {
-secondary_do_checkpoint(s, errp);
-}
-
  s->error = 0;
  aio_context_release(aio_context);
  }
@@ -560,13 +546,29 @@ static void replication_do_checkpoint(ReplicationState 
*rs, Error **errp)
  BlockDriverState *bs = rs->opaque;
  BDRVReplicationState *s;
  AioContext *aio_context;
+Error *local_err = NULL;

  aio_context = bdrv_get_aio_context(bs);
  aio_context_acquire(aio_context);
  s = bs->opaque;

-if (s->mode == REPLICATION_MODE_SECONDARY) {
+switch (s->mode) {
+case REPLICATION_MODE_PRIMARY:
+break;
+case REPLICATION_MODE_SECONDARY:
+if (!s->secondary_disk->bs->job) {
+error_setg(errp, "Backup job was cancelled unexpectedly");
+break;
+}
+backup_do_checkpoint(s->secondary_disk->bs->job, _err);
+if (local_err) {
+error_propagate(errp, local_err);
+break;
+}
  secondary_do_checkpoint(s, errp);
+break;
+default:
+abort();
  }
  aio_context_release(aio_context);
  }







Re: [Qemu-block] [PATCH v5 08/13] qapi: Allow blockdev-add for NBD

2016-10-25 Thread Eric Blake
On 10/25/2016 08:11 AM, Max Reitz wrote:
> Signed-off-by: Max Reitz 
> ---
>  qapi/block-core.json | 25 ++---
>  1 file changed, 22 insertions(+), 3 deletions(-)
> 
> diff --git a/qapi/block-core.json b/qapi/block-core.json
> index 97b1205..4b4a74c 100644
> --- a/qapi/block-core.json
> +++ b/qapi/block-core.json
> @@ -1703,14 +1703,15 @@
>  #
>  # @host_device, @host_cdrom: Since 2.1
>  # @gluster: Since 2.7
> +# @nbd: Since 2.8

'replication' was also added in 2.8; we should mention it while touching
this.

>  #
>  # Since: 2.0
>  ##
>  { 'enum': 'BlockdevDriver',
>'data': [ 'archipelago', 'blkdebug', 'blkverify', 'bochs', 'cloop',
>  'dmg', 'file', 'ftp', 'ftps', 'gluster', 'host_cdrom',
> -'host_device', 'http', 'https', 'luks', 'null-aio', 'null-co',
> -'parallels', 'qcow', 'qcow2', 'qed', 'quorum', 'raw',
> +'host_device', 'http', 'https', 'luks', 'nbd', 'null-aio',
> +'null-co', 'parallels', 'qcow', 'qcow2', 'qed', 'quorum', 'raw',
>   'replication', 'tftp', 'vdi', 'vhdx', 'vmdk', 'vpc', 'vvfat' ] }

Can we fix the TAB damage while at it?

Reviewed-by: Eric Blake 

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


[Qemu-block] [PATCH v6] raw_bsd: add offset and size options

2016-10-25 Thread Tomáš Golembiovský
Added two new options 'offset' and 'size'. This makes it possible to use
only part of the file as a device. This can be used e.g. to limit the
access only to single partition in a disk image or use a disk inside a
tar archive (like OVA).

When 'size' is specified we do our best to honour it.

Signed-off-by: Tomáš Golembiovský 
---
 block/raw_bsd.c  | 176 ++-
 qapi/block-core.json |  16 -
 2 files changed, 188 insertions(+), 4 deletions(-)

diff --git a/block/raw_bsd.c b/block/raw_bsd.c
index 588d408..9eb187a 100644
--- a/block/raw_bsd.c
+++ b/block/raw_bsd.c
@@ -31,6 +31,30 @@
 #include "qapi/error.h"
 #include "qemu/option.h"
 
+typedef struct BDRVRawState {
+uint64_t offset;
+uint64_t size;
+bool has_size;
+} BDRVRawState;
+
+static QemuOptsList raw_runtime_opts = {
+.name = "raw",
+.head = QTAILQ_HEAD_INITIALIZER(raw_runtime_opts.head),
+.desc = {
+{
+.name = "offset",
+.type = QEMU_OPT_SIZE,
+.help = "offset in the disk where the image starts",
+},
+{
+.name = "size",
+.type = QEMU_OPT_SIZE,
+.help = "virtual disk size",
+},
+{ /* end of list */ }
+},
+};
+
 static QemuOptsList raw_create_opts = {
 .name = "raw-create-opts",
 .head = QTAILQ_HEAD_INITIALIZER(raw_create_opts.head),
@@ -44,16 +68,108 @@ static QemuOptsList raw_create_opts = {
 }
 };
 
+static int raw_read_options(QDict *options, BlockDriverState *bs,
+BDRVRawState *s, Error **errp)
+{
+Error *local_err = NULL;
+QemuOpts *opts = NULL;
+int64_t real_size = 0;
+int ret;
+
+real_size = bdrv_getlength(bs->file->bs);
+if (real_size < 0) {
+error_setg_errno(errp, -real_size, "Could not get image size");
+return real_size;
+}
+
+opts = qemu_opts_create(_runtime_opts, NULL, 0, _abort);
+qemu_opts_absorb_qdict(opts, options, _err);
+if (local_err) {
+error_propagate(errp, local_err);
+ret = -EINVAL;
+goto end;
+}
+
+s->offset = qemu_opt_get_size(opts, "offset", 0);
+if (qemu_opt_find(opts, "size") != NULL) {
+s->size = qemu_opt_get_size(opts, "size", 0);
+s->has_size = true;
+} else {
+s->has_size = false;
+s->size = real_size - s->offset;
+}
+
+/* Check size and offset */
+if (real_size < s->offset || (real_size - s->offset) < s->size) {
+error_setg(errp, "The sum of offset (%" PRIu64 ") and size "
+"(%" PRIu64 ") has to be smaller or equal to the "
+" actual size of the containing file (%" PRId64 ")",
+s->offset, s->size, real_size);
+ret = -EINVAL;
+goto end;
+}
+
+/* Make sure size is multiple of BDRV_SECTOR_SIZE to prevent rounding
+ * up and leaking out of the specified area. */
+if (!QEMU_IS_ALIGNED(s->size, BDRV_SECTOR_SIZE)) {
+error_setg(errp, "Specified size is not multiple of %llu",
+BDRV_SECTOR_SIZE);
+ret = -EINVAL;
+goto end;
+}
+
+ret = 0;
+
+end:
+
+qemu_opts_del(opts);
+
+return ret;
+}
+
 static int raw_reopen_prepare(BDRVReopenState *reopen_state,
   BlockReopenQueue *queue, Error **errp)
 {
-return 0;
+assert(reopen_state != NULL);
+assert(reopen_state->bs != NULL);
+
+reopen_state->opaque = g_new0(BDRVRawState, 1);
+
+return raw_read_options(
+reopen_state->options,
+reopen_state->bs,
+reopen_state->opaque,
+errp);
+}
+
+static void raw_reopen_commit(BDRVReopenState *state)
+{
+BDRVRawState *new_s = state->opaque;
+BDRVRawState *s = state->bs->opaque;
+
+memcpy(s, new_s, sizeof(BDRVRawState));
+
+g_free(state->opaque);
+state->opaque = NULL;
+}
+
+static void raw_reopen_abort(BDRVReopenState *state)
+{
+g_free(state->opaque);
+state->opaque = NULL;
 }
 
 static int coroutine_fn raw_co_preadv(BlockDriverState *bs, uint64_t offset,
   uint64_t bytes, QEMUIOVector *qiov,
   int flags)
 {
+BDRVRawState *s = bs->opaque;
+
+if (offset > UINT64_MAX - s->offset) {
+return -EINVAL;
+}
+offset += s->offset;
+
 BLKDBG_EVENT(bs->file, BLKDBG_READ_AIO);
 return bdrv_co_preadv(bs->file, offset, bytes, qiov, flags);
 }
@@ -62,11 +178,23 @@ static int coroutine_fn raw_co_pwritev(BlockDriverState 
*bs, uint64_t offset,
uint64_t bytes, QEMUIOVector *qiov,
int flags)
 {
+BDRVRawState *s = bs->opaque;
 void *buf = NULL;
 BlockDriver *drv;
 QEMUIOVector local_qiov;
 int ret;
 
+if (s->has_size && (offset > s->size || bytes > (s->size - offset))) {
+/* There's not enough space for the data. Don't write anything and just
+ * fail 

Re: [Qemu-block] [PATCH v5] raw_bsd: add offset and size options

2016-10-25 Thread Tomáš Golembiovský
I should test my code more before submitting it to ML. I have found two
bugs in the patch.


On Sun, 23 Oct 2016 16:54:37 +0200
Tomáš Golembiovský  wrote:

> +static int raw_read_options(QDict *options, BlockDriverState *bs,
> +BDRVRawState *s, Error **errp)
> +{
> +Error *local_err = NULL;
> +QemuOpts *opts = NULL;
> +int64_t real_size = 0;
> +int ret;
> +
> +real_size = bdrv_getlength(bs->file->bs);
> +if (real_size < 0) {
> +error_setg_errno(errp, -real_size, "Could not get image size");
> +return real_size;
> +}
> +
> +opts = qemu_opts_create(_runtime_opts, NULL, 0, _abort);
> +qemu_opts_absorb_qdict(opts, options, _err);
> +if (local_err) {
> +error_propagate(errp, local_err);
> +ret = -EINVAL;
> +goto end;
> +}
> +
> +s->offset = qemu_opt_get_size(opts, "offset", 0);
> +if (qemu_opt_find(opts, "size") != NULL) {
> +s->size = qemu_opt_get_size(opts, "size", 0);
> +s->has_size = true;
> +} else {
> +s->has_size = false;
> +s->size = real_size;

This has to be:

s->size = real_size - s->offset;

.. to account for the offset. Otherwise the following check will fail.

> +}
> +
> +/* Check size and offset */
> +if (real_size < s->offset || (real_size - s->offset) < s->size) {
> +error_setg(errp, "The sum of offset (%" PRIu64 ") and size "
> +"(%" PRIu64 ") has to be smaller or equal to the "
> +" actual size of the containing file (%" PRId64 ")",
> +s->offset, s->size, real_size);
> +ret = -EINVAL;
> +goto end;
> +}
> +
> +/* Make sure size is multiple of BDRV_SECTOR_SIZE to prevent rounding
> + * up and leaking out of the specified area. */
> +if (QEMU_IS_ALIGNED(s->size, BDRV_SECTOR_SIZE)) {

The condition has to be negated. Silly mistake made while rewriting the
condition to use QEMU_IS_ALIGNED.

> +error_setg(errp, "Specified size is not multiple of %llu",
> +BDRV_SECTOR_SIZE);
> +ret = -EINVAL;
> +goto end;
> +}
> +
> +ret = 0;
> +
> +end:
> +
> +qemu_opts_del(opts);
> +
> +return ret;
> +}
> +

-- 
Tomáš Golembiovský 



Re: [Qemu-block] [PATCH v2 2/2] qapi: allow blockdev-add for NFS

2016-10-25 Thread Eric Blake
On 10/24/2016 02:27 PM, Ashijeet Acharya wrote:
> Introduce new object 'BlockdevOptionsNFS' in qapi/block-core.json to
> support blockdev-add for NFS network protocol driver. Also make a new
> struct NFSServer to support tcp connection.
> 
> Signed-off-by: Ashijeet Acharya 
> ---
>  qapi/block-core.json | 56 
> 
>  1 file changed, 52 insertions(+), 4 deletions(-)
> 
> diff --git a/qapi/block-core.json b/qapi/block-core.json
> index 9d797b8..3ab028d 100644
> --- a/qapi/block-core.json
> +++ b/qapi/block-core.json
> @@ -1714,9 +1714,9 @@
>  { 'enum': 'BlockdevDriver',
>'data': [ 'archipelago', 'blkdebug', 'blkverify', 'bochs', 'cloop',
>  'dmg', 'file', 'ftp', 'ftps', 'gluster', 'host_cdrom',
> -'host_device', 'http', 'https', 'luks', 'null-aio', 'null-co',
> -'parallels', 'qcow', 'qcow2', 'qed', 'quorum', 'raw',
> - 'replication', 'tftp', 'vdi', 'vhdx', 'vmdk', 'vpc', 'vvfat' ] }
> +'host_device', 'http', 'https', 'luks', 'nfs', 'null-aio',
> +'null-co', 'parallels', 'qcow', 'qcow2', 'qed', 'quorum', 'raw',
> +'replication', 'tftp', 'vdi', 'vhdx', 'vmdk', 'vpc', 'vvfat' ] }

Missing a comment that 'nfs' is since 2.8.

>  ##
> +# @NFSServer
> +#
> +# Captures the address of the socket
> +#
> +# @type:transport type used for NFS (only TCP supported)
> +#
> +# @host:host part of the address
> +#
> +# Since 2.8
> +##
> +{ 'struct': 'NFSServer',
> +  'data': { 'type': 'str',

Please make this an enum, instead of an open-coded string. It's okay if
the enum only has one value 'tcp' for now; but using an enum will make
it introspectable if we later add a second transport, unlike what we get
with an open-coded string.

Must 'type' be mandatory if it must always be 'tcp'?

> +'host': 'str' } }
> +
> +##
> +# @BlockdevOptionsNfs
> +#
> +# Driver specific block device option for NFS
> +#
> +# @server:host address
> +#
> +# @path:  path of the image on the host
> +#
> +# @uid:   #optional UID value to use when talking to the server
> +#
> +# @gid:   #optional GID value to use when talking to the server

Do we want to allow string names in addition to numeric uid/gid values?
I'm not sure if NFS has name-based id mapping, but it's food for thought
on whether we need to use an alternate type here (alternate between
integer id and string name), or leave this as is.

> +#
> +# @tcp-syncnt:#optional number of SYNs during the session establishment

Would tcp-syn-count be any more legible?  What is the default when omitted?

> +#
> +# @readahead: #optional set the readahead size in bytes

What's the default when omitted?

> +#
> +# @pagecache: #optional set the pagecache size in bytes

Default?

> +#
> +# @debug: #optional set the NFS debug level (max 2)

Presumably default 0?

> +#
> +# Since 2.8
> +##
> +{ 'struct': 'BlockdevOptionsNfs',
> +  'data': { 'server': 'NFSServer',
> +'path': 'str',
> +'*uid': 'int',
> +'*gid': 'int',
> +'*tcp-syncnt': 'int',
> +'*readahead': 'int',
> +'*pagecache': 'int',
> +'*debug': 'int' } }
> +

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-block] [PATCH v3 5/5] qapi: allow blockdev-add for ssh

2016-10-25 Thread Eric Blake
On 10/17/2016 12:32 PM, Ashijeet Acharya wrote:
> Introduce new object 'BlockdevOptionsSsh' in qapi/block-core.json to
> support blockdev-add for SSH network protocol driver. Use only 'struct
> InetSocketAddress' since SSH only supports connection over TCP.
> 
> Signed-off-by: Ashijeet Acharya 
> ---
>  qapi/block-core.json | 26 --
>  1 file changed, 24 insertions(+), 2 deletions(-)
> 

> +++ b/qapi/block-core.json
> @@ -1716,7 +1716,8 @@
>  'dmg', 'file', 'ftp', 'ftps', 'gluster', 'host_cdrom',
>  'host_device', 'http', 'https', 'luks', 'null-aio', 'null-co',
>  'parallels', 'qcow', 'qcow2', 'qed', 'quorum', 'raw',
> - 'replication', 'tftp', 'vdi', 'vhdx', 'vmdk', 'vpc', 'vvfat' ] }
> +'replication', 'ssh', 'tftp', 'vdi', 'vhdx', 'vmdk', 'vpc',
> +'vvfat' ] }

Please update the comment just before the enum that mentions 'ssh' as an
addition in 2.8.


> +##
> +# @BlockdevOptionsSsh
> +#
> +# @server:  host address
> +#
> +# @path:path to the image on the host
> +#
> +# @user:#optional user as which to connect, defaults to 
> current
> +#   local user name
> +#
> +# @host_key_check   #optional defines how and what to check the host
> +#   key against, defaults to "yes"

Please s/host_key_check/host-key-check/ - new interfaces should favor
dash, not underscore. (The C code will be the same, though.)

> +#
> +# Since 2.8
> +##
> +{ 'struct': 'BlockdevOptionsSsh',
> +  'data': { 'server': 'InetSocketAddress',
> +'path': 'str',
> +'*user': 'str',
> +'*host_key_check': 'str' } }

Is host-key-check truly a free-form string, or is it only a finite set
of valid possibilities, where 'yes' is the default string?  Would it be
better to express it as an enum instead of a raw string?

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-block] [PATCH v4 5/5] qapi: allow blockdev-add for ssh

2016-10-25 Thread Eric Blake
On 10/25/2016 08:04 AM, Ashijeet Acharya wrote:
> Introduce new object 'BlockdevOptionsSsh' in qapi/block-core.json to
> support blockdev-add for SSH network protocol driver. Use only 'struct
> InetSocketAddress' since SSH only supports connection over TCP.
> 
> Signed-off-by: Ashijeet Acharya 
> Reviewed-by: Kevin Wolf 
> ---
>  qapi/block-core.json | 26 --
>  1 file changed, 24 insertions(+), 2 deletions(-)

Sorry for not noticing this when I finally replied to v4;


> +##
> +# @BlockdevOptionsSsh
> +#
> +# @server:  host address
> +#
> +# @path:path to the image on the host
> +#
> +# @user:#optional user as which to connect, defaults to 
> current
> +#   local user name
> +#
> +# @host_key_check   #optional defines how and what to check the host
> +#   key against, defaults to "yes"

I still have reservations about this parameter. I think we have time to
fix it as followups during soft freeze if Kevin would rather get your
initial patches in now, if that's what it takes to meet soft freeze
deadlines, but I do not want to bake it into the actual 2.8 release
without addressing those concerns.

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-block] [Qemu-devel] [PATCH 2/4] block/curl: Fix return value from curl_read_cb

2016-10-25 Thread Eric Blake
On 10/24/2016 09:54 PM, Max Reitz wrote:
> While commit 38bbc0a580f9f10570b1d1b5d3e92f0e6feb2970 is correct in that
> the callback is supposed to return the number of bytes handled; what it
> does not mention is that libcurl will throw an error if the callback did
> not "handle" all of the data passed to it.
> 
> Therefore, if the callback receives some data that it cannot handle
> (either because the receive buffer has not been set up yet or because it
> would not fit into the receive buffer) and we have to ignore it, we
> still have to report that the data has been handled.
> 
> Obviously, this should not happen normally. But it does happen at least
> for FTP connections where some data (that we do not expect) may be
> generated when the connection is established.

Just to make sure, we aren't losing data by reporting this value, but
merely letting curl know that our callback has "dealt" with the data, so
that we don't error out, in order to get a second chance at the same
data later on?

Reviewed-by: Eric Blake 

But given that it undoes 38bbc0a, I'd rather that it gets reviewed by
Matthew and/or tested by Richard.

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-block] [Qemu-devel] [PATCH 1/4] block/curl: Use BDRV_SECTOR_SIZE

2016-10-25 Thread Eric Blake
On 10/24/2016 09:54 PM, Max Reitz wrote:
> Currently, curl defines its own constant SECTOR_SIZE. There is no
> advantage over using the global BDRV_SECTOR_SIZE, so drop it.
> 
> Cc: qemu-sta...@nongnu.org
> Signed-off-by: Max Reitz 
> ---
>  block/curl.c | 11 +--
>  1 file changed, 5 insertions(+), 6 deletions(-)

Reviewed-by: Eric Blake 

(Can curl technically be used to access single-byte granularity?  I
suppose that would be a larger patch, though, to implement the right
callbacks).

> +DPRINTF("CURL (AIO): Reading %llu at %zd (%s)\n",
> +(acb->nb_sectors * BDRV_SECTOR_SIZE), start, state->range);

Eww. start is seriously limited to size_t rather than off_t?  Doesn't
that cripple us to a maximum of 4G files on 32-bit hosts?  But unrelated
to this patch.

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-block] [Qemu-devel] [PATCH real v5 0/4] fdc: Use separate qdev device for drives

2016-10-25 Thread John Snow

fdc.use.separate.qdev.device.for.drives.EN.V5.REPACK.PC.LINUX-kw0lf-kr3w

 _____  .__   _   __  
|  |   _  _\   _  \ |  |_/ \ |  | \_  \_  _  __
|  |/ /\ \/ \/ /  /_\  \|  |\   __\  |  |/ /\_  __ \_(__  < \/ \/ /
|<  \ /\  \_/   \  |_|  ||<  |  | \/   \ /
|__|_ \  \/\_/  \_  //__||__|_ \ |__| /__  /\/\_/
 \/   \/  \/ \/

  _
 \_  \   |__\   _  \ ___.__.
  __   _(__  < /\|  /  /_\  <   |  |  __
 /_/  /   \   |  \   |  \  \_/   \___  | /_/
 /__  /___|  /\__|  |\_  / |
\/ \/\__|  \/\/


On 10/25/2016 05:14 AM, Kevin Wolf wrote:

We have been complaining for a long time about how the floppy controller and
floppy drives are combined in a single qdev device and how this makes the
device awkward to work with because it behaves different from all other block
devices.

The latest reason to complain was when I noticed that using qdev device names
in QMP commands (e.g. for media change) doesn't really work when only the
controller is a qdev device, but the drives aren't.

So I decided to have a go at it, and this is the result.

It doesn't actually change any of the inner workings of the floppy controller,
but it wires things up differently on the qdev layer so that a floppy
controller now exposes a bus on which the floppy drives sit. This results in a
structure that is similar to IDE where the actual drive state is still in the
controller and the qdev device basically just contains the qdev properties -
not pretty, but quite workable.

The commit message of patch 3 explains how to use it. In short, there is a
'-device floppy' now and it does what you would expect if you ever used ide-cd.

The other problem is old command lines, especially those using things like
'-global isa-fdc,driveA=...'. In order to keep them working, we need to forward
the property to an internally created floppy drive device. This is a bit like
usb-storage, which we know is ugly, but works well enough in practice. The good
news here is that in contrast to usb-storage, the floppy controller only does
the forwarding for legacy configurations; as soon as you start using '-device
floppy', it doesn't happen any more.

So as you may have expected, this conversion doesn't result in a perfect
device, but I think it's definitely an improvement over the old state. I hope
you like it despite the warts. :-)

v5:
- Apply _filter_qemu to stderr, too [John]
- Rename the bus to floppy-bus [Frederic]
- Use FLOPPY_BUS() instead of DO_UPDATE() [Frederic]

v4:
- John says that his grep is broken and hangs at 100% CPU with my attempt to
  extract the floppy controller from info qtree. Use a simpler sed command
  instead (which, unlike the grep command, doesn't handle arbitrary indentation
  level of the next item, but we know what comes next, so just hardcoding 10
  spaces works, too).

v3:
- Fixed omissons in the conversion sysbus-fdc and the Sun one. Nice, combining
  floppy controllers and weird platforms in a single series. [John]

v2:
- Added patch 4 (qemu-iotests case for floppy config on the command line)
- Patch 2: Create a floppy device only if a BlockBackend exists instead of
  always creating two of them
- Patch 2: Initialise drive->fdctrl even if no drive is attached, it is
  accessed anyway during migration
- Patch 3: Keep 'type' qdev property and FDrive->drive in sync
- Patch 3: Removed if with condition that is always true

Kevin Wolf (4):
  fdc: Add a floppy qbus
  fdc: Add a floppy drive qdev
  fdc: Move qdev properties to FloppyDrive
  qemu-iotests: Test creating floppy drives

 hw/block/fdc.c |  271 --
 tests/qemu-iotests/172 |  246 ++
 tests/qemu-iotests/172.out | 1170 
 tests/qemu-iotests/group   |1 +
 vl.c   |1 +
 5 files changed, 1641 insertions(+), 48 deletions(-)
 create mode 100755 tests/qemu-iotests/172
 create mode 100644 tests/qemu-iotests/172.out



Thanks, applied to my IDE tree:

https://github.com/jnsnow/qemu/commits/ide
https://github.com/jnsnow/qemu.git

--js

(Yes, the IDE one instead of the FDC one. I'm just going to batch submit 
a few disparate patches I have kicking around.)




Re: [Qemu-block] [Qemu-devel] [PATCH v7 00/16] nbd: efficient write zeroes

2016-10-25 Thread Eric Blake
ping - I'd really like this in 2.8, since it missed 2.7, but soft freeze
is again approaching

On 10/14/2016 01:33 PM, Eric Blake wrote:
> Also available as a tag at:
> git fetch git://repo.or.cz/qemu/ericb.git nbd-zero-v7
> 
> v5 was here, but missed 2.7 freeze:
> https://lists.gnu.org/archive/html/qemu-devel/2016-07/msg04053.html
> 
> Since then, I've rebased the series, and the bulk of the changes
> were to use consistent NBDFoo CamelCase naming, as well as to
> improve the commit messages for questions raised on v5.
> 
> v6 was here, with no human review yet:
> https://lists.gnu.org/archive/html/qemu-devel/2016-10/msg03048.html
> 
> Since then, I addressed the buildbot complaints.
> 
> 001/16:[] [--] 'nbd: Add qemu-nbd -D for human-readable description'
> 002/16:[] [--] 'nbd: Treat flags vs. command type as separate fields'
> 003/16:[] [--] 'nbd: Rename NBDRequest to NBDRequestData'
> 004/16:[] [--] 'nbd: Rename NbdClientSession to NBDClientSession'
> 005/16:[] [--] 'nbd: Rename struct nbd_request and nbd_reply'
> 006/16:[] [--] 'nbd: Share common reply-sending code in server'
> 007/16:[] [--] 'nbd: Send message along with server NBD_REP_ERR errors'
> 008/16:[] [--] 'nbd: Share common option-sending code in client'
> 009/16:[] [--] 'nbd: Let server know when client gives up negotiation'
> 010/16:[] [--] 'nbd: Let client skip portions of server reply'
> 011/16:[] [--] 'nbd: Less allocation during NBD_OPT_LIST'
> 012/16:[] [--] 'nbd: Support shorter handshake'
> 013/16:[down] 'nbd: Refactor conversion to errno to silence checkpatch'
> 014/16:[0012] [FC] 'nbd: Improve server handling of shutdown requests'
> 015/16:[] [--] 'nbd: Implement NBD_CMD_WRITE_ZEROES on server'
> 016/16:[] [--] 'nbd: Implement NBD_CMD_WRITE_ZEROES on client'
> 
> Eric Blake (16):
>   nbd: Add qemu-nbd -D for human-readable description
>   nbd: Treat flags vs. command type as separate fields
>   nbd: Rename NBDRequest to NBDRequestData
>   nbd: Rename NbdClientSession to NBDClientSession
>   nbd: Rename struct nbd_request and nbd_reply
>   nbd: Share common reply-sending code in server
>   nbd: Send message along with server NBD_REP_ERR errors
>   nbd: Share common option-sending code in client
>   nbd: Let server know when client gives up negotiation
>   nbd: Let client skip portions of server reply
>   nbd: Less allocation during NBD_OPT_LIST
>   nbd: Support shorter handshake
>   nbd: Refactor conversion to errno to silence checkpatch
>   nbd: Improve server handling of shutdown requests
>   nbd: Implement NBD_CMD_WRITE_ZEROES on server
>   nbd: Implement NBD_CMD_WRITE_ZEROES on client
> 
>  block/nbd-client.h  |  10 +-
>  include/block/nbd.h |  73 ++--
>  nbd/nbd-internal.h  |  12 +-
>  block/nbd-client.c  |  96 ++
>  block/nbd.c |   8 +-
>  nbd/client.c| 510 
> 
>  nbd/server.c| 296 --
>  qemu-nbd.c  |  12 +-
>  qemu-nbd.texi   |   5 +-
>  9 files changed, 638 insertions(+), 384 deletions(-)
> 

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-block] [PATCH v5 06/13] block/nbd: Accept SocketAddress

2016-10-25 Thread Kevin Wolf
Am 25.10.2016 um 15:11 hat Max Reitz geschrieben:
> Add a new option "server" to the NBD block driver which accepts a
> SocketAddress.
> 
> "path", "host" and "port" are still supported as legacy options and are
> mapped to their corresponding SocketAddress representation.
> 
> Signed-off-by: Max Reitz 

Reviewed-by: Kevin Wolf 



Re: [Qemu-block] [PATCH v8 00/36] block: Image locking series

2016-10-25 Thread Kevin Wolf
Am 25.10.2016 um 15:30 hat Max Reitz geschrieben:
> On 25.10.2016 10:24, Kevin Wolf wrote:
> > Am 24.10.2016 um 20:03 hat Max Reitz geschrieben:
> >> On 24.10.2016 12:11, Kevin Wolf wrote:
> >>
> >> [...]
> >>
> >>> Now, the big question is how to translate this into file locking. This
> >>> could become a little tricky. I had a few thoughts involving another
> >>> lock on byte 2, but none of them actually worked out so far, because
> >>> what we want is essentially a lock that can be shared by readers, that
> >>> can also be shared by writers, but not by readers and writers at the
> >>> same time.
> >>
> >> You can also share it between readers and writers, as long as everyone
> >> can cope with volatile data.
> > 
> > Sorry, that was ambiguous. I meant a file-level lock rather than the
> > high-level one. If we had a lock that can be shared by one or the other,
> > but not both, then two locks would be enough to build what we really
> > want.
> > 
> >> I agree that it's very similar to the proposed op blocker style, but I
> >> can't really come up with a meaningful translation either.
> >>
> >> Maybe something like this (?): All readers who do not want the file to
> >> be modified grab a shared lock on byte 1. All writers who can deal with
> >> volatile data grab a shared lock on byte 2. Exclusive writers grab an
> >> exclusive lock on byte 1 and 2. Readers who can cope with volatile data
> >> get no lock at all.
> >>
> >> When opening, the first and second group would always have to test
> >> whether there is a lock on the other byte, respectively. E.g. sharing
> >> writers would first grab an exclusive lock on byte 1, then the shared
> >> lock on byte 2 and then release the exclusive lock again.
> >>
> >> Would that work?
> > 
> > I'm afraid it wouldn't. If you start the sharing writer first and then
> > the writer-blocking reader, the writer doesn't hold a lock on byte 1 any
> > more,
> 
> But it holds a lock on byte 2.
> 
> >   so the reader can start even though someone is writing to the
> > image.
> 
> It can't because it would try to grab an exclusive lock on byte 2 before
> grabbing the shared lock on byte 1.

Apparently I failed to understand the most important part of the
proposal. :-)

So we have two locks. Both are only held for a longer time in shared
mode. Exclusive mode is only used for testing whether the lock is being
held and is immediately given up again.

The meaning of holding a shared lock is:

byte 1: I can't allow other processes to write to the image
byte 2: I am writing to the image

The four cases that we have involve:

* shared writer: Take shared lock on byte 2. Test whether byte 1 is
  locked using an exclusive lock, and fail if so.

* exclusive writer: Take shared lock on byte 2. Test whether byte 1 is
  locked using an exclusive lock, and fail if so. Then take shared lock
  on byte 1. I suppose this is racy, but we can probably tolerate that.

* reader that can tolerate writers: Don't do anything

* reader that can't tolerate writers: Take shared lock on byte 1. Test
  whether byte 2 is locked, and fail if so.

Seems to work if I got that right.

Kevin


pgpEjSuH0QATP.pgp
Description: PGP signature


Re: [Qemu-block] [PATCH v11 01/19] block: Add bdrv_drain_all_{begin, end}()

2016-10-25 Thread Paolo Bonzini


On 25/10/2016 16:48, Alberto Garcia wrote:
> And how about the rest of the things that are going on in
> bdrv_drain_all_begin()?
> 
> bdrv_parent_drained_begin(bs);

No BlockBackend yet, and BlockDriverStates have been quiesced already,
so that's okay.

> bdrv_io_unplugged_begin(bs);

No I/O yet, so that's okay.

> bdrv_drain_recurse(bs);

Children have been created before, so they're already quiescent.

> aio_disable_external(aio_context);

This is also a hack for what should be in BlockBackend---which means
that we're safe because there's no BlockBackend yet.

Paolo



Re: [Qemu-block] [PATCH v11 01/19] block: Add bdrv_drain_all_{begin, end}()

2016-10-25 Thread Kevin Wolf
Am 25.10.2016 um 16:41 hat Paolo Bonzini geschrieben:
> 
> 
> On 25/10/2016 16:38, Kevin Wolf wrote:
> > Am 25.10.2016 um 15:53 hat Paolo Bonzini geschrieben:
> >>
> >>
> >> On 25/10/2016 15:39, Alberto Garcia wrote:
> >>> On Mon 24 Oct 2016 12:53:41 PM CEST, Paolo Bonzini wrote:
> >>>
> > My first thoughts were about how to let an unpause succeed without a
> > previous pause for these objects, but actually I think this isn't
> > what we should do. We rather want to actually do the pause instead
> > because even new BDSes and block jobs should probably start in a
> > quiesced state when created inside a drain_all section.
> 
>  Yes, I agree with this.  It shouldn't be too hard to implement it.  It
>  would require a BlockDriverState to look at the global "inside
>  bdrv_drain_all_begin" state, and ask its BlockBackend to disable
>  itself upon bdrv_replace_child.
> >>>
> >>> Why in bdrv_replace_child()? bdrv_drain_all_end() enables all BDSs, but
> >>> if you add one with "blockdev-add" it's not going to be disabled using
> >>> this method.
> >>
> >> You only need to disable it when blk_insert_bs is called.  In fact...
> > 
> > This assumes that the block driver doesn't issue internal background I/O
> > by itself. Probably true for everything that we have today, but it would
> > probably be cleaner to quiesce it directly in bdrv_open_common().
> 
> So
> 
>   bs->quiesce_counter = all_quiesce_counter;
> 
> ?  That would work.

Yes, that's what I had in mind.

> Should bdrv_close assert bs->quiesce_counter==0
> (which implies all_quiesce_counter == 0), since it usually has to do I/O?

Hm... Not sure about that. We're still using bdrv_drain_all_begin/end as
a function to isolate the BDSes, so some I/O from the caller of
drain_all is expected, and that could involve deleting a BDS.

But once we fully implemented what you proposed, probably yes.

Kevin



Re: [Qemu-block] [PATCH v11 01/19] block: Add bdrv_drain_all_{begin, end}()

2016-10-25 Thread Kevin Wolf
Am 25.10.2016 um 15:53 hat Paolo Bonzini geschrieben:
> 
> 
> On 25/10/2016 15:39, Alberto Garcia wrote:
> > On Mon 24 Oct 2016 12:53:41 PM CEST, Paolo Bonzini wrote:
> > 
> >>> My first thoughts were about how to let an unpause succeed without a
> >>> previous pause for these objects, but actually I think this isn't
> >>> what we should do. We rather want to actually do the pause instead
> >>> because even new BDSes and block jobs should probably start in a
> >>> quiesced state when created inside a drain_all section.
> >>
> >> Yes, I agree with this.  It shouldn't be too hard to implement it.  It
> >> would require a BlockDriverState to look at the global "inside
> >> bdrv_drain_all_begin" state, and ask its BlockBackend to disable
> >> itself upon bdrv_replace_child.
> > 
> > Why in bdrv_replace_child()? bdrv_drain_all_end() enables all BDSs, but
> > if you add one with "blockdev-add" it's not going to be disabled using
> > this method.
> 
> You only need to disable it when blk_insert_bs is called.  In fact...

This assumes that the block driver doesn't issue internal background I/O
by itself. Probably true for everything that we have today, but it would
probably be cleaner to quiesce it directly in bdrv_open_common().

> > In addition to that block jobs need the same, don't they? Something like
> > "job->pause_count = all_quiesce_counter" in the initialization.
> 
> ... we discussed a couple weeks ago that block jobs should register
> BlkDeviceOps that pause the job in the drained_begin callback.  So when
> the block job calls blk_insert_bs, the drained_begin callback would
> start the job as paused.

Yes, should, but doing this kind of infrastructure work isn't something
for this series.

> > I think we'd also need to add block_job_pause_point() at the beginning
> > of each one of their coroutines, in order to make sure that they really
> > start paused.
> 
> It depends on the job, for example streaming starts with
> block_job_sleep_ns.  Commit also does, except for some blk_getlength and
> blk_truncate calls that could be moved to the caller.

Kevin



Re: [Qemu-block] [PATCH v11 01/19] block: Add bdrv_drain_all_{begin, end}()

2016-10-25 Thread Paolo Bonzini


On 25/10/2016 16:38, Kevin Wolf wrote:
> Am 25.10.2016 um 15:53 hat Paolo Bonzini geschrieben:
>>
>>
>> On 25/10/2016 15:39, Alberto Garcia wrote:
>>> On Mon 24 Oct 2016 12:53:41 PM CEST, Paolo Bonzini wrote:
>>>
> My first thoughts were about how to let an unpause succeed without a
> previous pause for these objects, but actually I think this isn't
> what we should do. We rather want to actually do the pause instead
> because even new BDSes and block jobs should probably start in a
> quiesced state when created inside a drain_all section.

 Yes, I agree with this.  It shouldn't be too hard to implement it.  It
 would require a BlockDriverState to look at the global "inside
 bdrv_drain_all_begin" state, and ask its BlockBackend to disable
 itself upon bdrv_replace_child.
>>>
>>> Why in bdrv_replace_child()? bdrv_drain_all_end() enables all BDSs, but
>>> if you add one with "blockdev-add" it's not going to be disabled using
>>> this method.
>>
>> You only need to disable it when blk_insert_bs is called.  In fact...
> 
> This assumes that the block driver doesn't issue internal background I/O
> by itself. Probably true for everything that we have today, but it would
> probably be cleaner to quiesce it directly in bdrv_open_common().

So

bs->quiesce_counter = all_quiesce_counter;

?  That would work.  Should bdrv_close assert bs->quiesce_counter==0
(which implies all_quiesce_counter == 0), since it usually has to do I/O?

>>> In addition to that block jobs need the same, don't they? Something like
>>> "job->pause_count = all_quiesce_counter" in the initialization.
>>
>> ... we discussed a couple weeks ago that block jobs should register
>> BlkDeviceOps that pause the job in the drained_begin callback.  So when
>> the block job calls blk_insert_bs, the drained_begin callback would
>> start the job as paused.
> 
> Yes, should, but doing this kind of infrastructure work isn't something
> for this series.

I agree.  I was just explaining why it wouldn't be necessary to
initialize job->pause_count.

Paolo

>>> I think we'd also need to add block_job_pause_point() at the beginning
>>> of each one of their coroutines, in order to make sure that they really
>>> start paused.
>>
>> It depends on the job, for example streaming starts with
>> block_job_sleep_ns.  Commit also does, except for some blk_getlength and
>> blk_truncate calls that could be moved to the caller.
> 
> Kevin
> 



Re: [Qemu-block] [PATCH 4/4] block: Cater to iscsi with non-power-of-2 discard

2016-10-25 Thread Paolo Bonzini


On 25/10/2016 16:35, Eric Blake wrote:
> So your argument is that we should always pass down every unaligned
> less-than-optimum discard request all the way to the hardware, rather
> than dropping it higher in the stack, even though discard requests are
> already advisory, in order to leave the hardware as the ultimate
> decision on whether to ignore the unaligned request?

Yes, I agree with Peter as to this.

Paolo



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-block] [Qemu-devel] [PATCH v5 13/13] iotests: Add test for NBD's blockdev-add interface

2016-10-25 Thread Markus Armbruster
Max Reitz  writes:

> Signed-off-by: Max Reitz 
> ---
>  tests/qemu-iotests/147 | 196 
> +
>  tests/qemu-iotests/147.out |   5 ++
>  tests/qemu-iotests/group   |   1 +
>  3 files changed, 202 insertions(+)
>  create mode 100755 tests/qemu-iotests/147
>  create mode 100644 tests/qemu-iotests/147.out
>
> diff --git a/tests/qemu-iotests/147 b/tests/qemu-iotests/147
> new file mode 100755
> index 000..32d2a03
> --- /dev/null
> +++ b/tests/qemu-iotests/147
> @@ -0,0 +1,196 @@
[...]
> +if __name__ == '__main__':
> +# Need to support image creation
> +iotests.main(supported_fmts=['vpc', 'parallels', 'qcow', 'vdi', 'qcow2',
> + 'vmdk', 'raw', 'vhdx', 'qed'])
> +

I just fed the series to git-am to make sure it still applies after me
rewriting qapi-next a bit, and git-am is unhappy with you adding a
trailing blank line.

[...]



Re: [Qemu-block] [PATCH 4/4] block: Cater to iscsi with non-power-of-2 discard

2016-10-25 Thread Eric Blake
On 10/25/2016 09:20 AM, Peter Lieven wrote:
>> The firmware is probably technically buggy for advertising too large of
>> a minimum granularity, if it can piece together smaller requests into a
>> larger discard.  If discards need to happen at a smaller granularity,
>> the firmware (or kernel quirk system) should fix the advertisement to
>> the actual granularity that it will honor.  I don't see a reason to
>> change qemu's current behavior.
>>
> 
> The issue is that the optimal unmap granularity is only a hint.
> There is no evidence what happens with unaligned requests or requests
> that are smaller. They could still lead to a discard operation in the
> storage device. It just says if you can send me discards of that size
> thats optimal for me. Thats not said that smaller or unaligned requests
> have no effect.

So your argument is that we should always pass down every unaligned
less-than-optimum discard request all the way to the hardware, rather
than dropping it higher in the stack, even though discard requests are
already advisory, in order to leave the hardware as the ultimate
decision on whether to ignore the unaligned request?

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-block] [PATCH v2 2/2] qapi: allow blockdev-add for NFS

2016-10-25 Thread Kevin Wolf
Am 24.10.2016 um 21:27 hat Ashijeet Acharya geschrieben:
> Introduce new object 'BlockdevOptionsNFS' in qapi/block-core.json to
> support blockdev-add for NFS network protocol driver. Also make a new
> struct NFSServer to support tcp connection.
> 
> Signed-off-by: Ashijeet Acharya 
> ---
>  qapi/block-core.json | 56 
> 
>  1 file changed, 52 insertions(+), 4 deletions(-)
> 
> diff --git a/qapi/block-core.json b/qapi/block-core.json
> index 9d797b8..3ab028d 100644
> --- a/qapi/block-core.json
> +++ b/qapi/block-core.json
> @@ -1714,9 +1714,9 @@
>  { 'enum': 'BlockdevDriver',
>'data': [ 'archipelago', 'blkdebug', 'blkverify', 'bochs', 'cloop',
>  'dmg', 'file', 'ftp', 'ftps', 'gluster', 'host_cdrom',
> -'host_device', 'http', 'https', 'luks', 'null-aio', 'null-co',
> -'parallels', 'qcow', 'qcow2', 'qed', 'quorum', 'raw',
> - 'replication', 'tftp', 'vdi', 'vhdx', 'vmdk', 'vpc', 'vvfat' ] }
> +'host_device', 'http', 'https', 'luks', 'nfs', 'null-aio',
> +'null-co', 'parallels', 'qcow', 'qcow2', 'qed', 'quorum', 'raw',
> +'replication', 'tftp', 'vdi', 'vhdx', 'vmdk', 'vpc', 'vvfat' ] }
>  
>  ##
>  # @BlockdevOptionsFile
> @@ -2212,6 +2212,54 @@
>  '*top-id': 'str' } }
>  
>  ##
> +# @NFSServer
> +#
> +# Captures the address of the socket
> +#
> +# @type:transport type used for NFS (only TCP supported)
> +#
> +# @host:host part of the address
> +#
> +# Since 2.8
> +##
> +{ 'struct': 'NFSServer',
> +  'data': { 'type': 'str',
> +'host': 'str' } }
> +
> +##
> +# @BlockdevOptionsNfs
> +#
> +# Driver specific block device option for NFS
> +#
> +# @server:host address
> +#
> +# @path:  path of the image on the host
> +#
> +# @uid:   #optional UID value to use when talking to the server
> +#
> +# @gid:   #optional GID value to use when talking to the server
> +#
> +# @tcp-syncnt:#optional number of SYNs during the session establishment
> +#
> +# @readahead: #optional set the readahead size in bytes
> +#
> +# @pagecache: #optional set the pagecache size in bytes
> +#
> +# @debug: #optional set the NFS debug level (max 2)
> +#
> +# Since 2.8
> +##
> +{ 'struct': 'BlockdevOptionsNfs',
> +  'data': { 'server': 'NFSServer',

Patch 1 doesn't handle "server.type" and "server.host", so does this
actually work?

> +'path': 'str',
> +'*uid': 'int',
> +'*gid': 'int',
> +'*tcp-syncnt': 'int',
> +'*readahead': 'int',
> +'*pagecache': 'int',
> +'*debug': 'int' } }
> +
> +##
>  # @BlockdevOptionsCurl
>  #
>  # Driver specific block device options for the curl backend.

Kevin



Re: [Qemu-block] [PATCH 4/4] block: Cater to iscsi with non-power-of-2 discard

2016-10-25 Thread Peter Lieven

Am 25.10.2016 um 15:59 schrieb Eric Blake:

On 10/25/2016 07:42 AM, Peter Lieven wrote:

But hey, that firmware is seriously weird. :)

Yes, so you would not change the new implementation?

Even if the discard is e.g. 1MB it could theretically be that internally
the device has a finer granularity. Its an optimal discard alignment
not the minimum required discard size. I think thats a difference.
It does not say I can't handle smaller discards.

The firmware is probably technically buggy for advertising too large of
a minimum granularity, if it can piece together smaller requests into a
larger discard.  If discards need to happen at a smaller granularity,
the firmware (or kernel quirk system) should fix the advertisement to
the actual granularity that it will honor.  I don't see a reason to
change qemu's current behavior.



The issue is that the optimal unmap granularity is only a hint.
There is no evidence what happens with unaligned requests or requests
that are smaller. They could still lead to a discard operation in the
storage device. It just says if you can send me discards of that size
thats optimal for me. Thats not said that smaller or unaligned requests
have no effect.

Peter




Re: [Qemu-block] [PATCH v2 1/2] block/nfs: Introduce runtime_opts in NFS

2016-10-25 Thread Kevin Wolf
Peter, there is a question for you hidden somewhere below.

Am 24.10.2016 um 21:27 hat Ashijeet Acharya geschrieben:
> Make NFS block driver use various fine grained runtime_opts.
> Set .bdrv_parse_filename() to nfs_parse_filename() and introduce two
> new functions nfs_parse_filename() and nfs_parse_uri() to help parsing
> the URI.
> 
> Signed-off-by: Ashijeet Acharya 
> ---
>  block/nfs.c | 348 
> +++-
>  1 file changed, 253 insertions(+), 95 deletions(-)
> 
> diff --git a/block/nfs.c b/block/nfs.c
> index 8602a44..666ccf2 100644
> --- a/block/nfs.c
> +++ b/block/nfs.c
> @@ -35,8 +35,12 @@
>  #include "qemu/uri.h"
>  #include "qemu/cutils.h"
>  #include "sysemu/sysemu.h"
> +#include "qapi/qmp/qdict.h"
> +#include "qapi/qmp/qint.h"
> +#include "qapi/qmp/qstring.h"
>  #include 
>  
> +
>  #define QEMU_NFS_MAX_READAHEAD_SIZE 1048576
>  #define QEMU_NFS_MAX_PAGECACHE_SIZE (8388608 / NFS_BLKSIZE)
>  #define QEMU_NFS_MAX_DEBUG_LEVEL 2
> @@ -61,6 +65,118 @@ typedef struct NFSRPC {
>  NFSClient *client;
>  } NFSRPC;
>  
> +static int nfs_parse_uri(const char *filename, QDict *options, Error **errp)
> +{
> +URI *uri = NULL;
> +QueryParams *qp = NULL;
> +int ret = 0, i;
> +
> +uri = uri_parse(filename);
> +if (!uri) {
> +error_setg(errp, "Invalid URI specified");
> +ret = -EINVAL;
> +goto out;
> +}
> +if (strcmp(uri->scheme, "nfs") != 0) {
> +error_setg(errp, "URI scheme must be 'nfs'");
> +ret = -EINVAL;
> +goto out;
> +}
> +
> +if (!uri->server) {
> +error_setg(errp, "missing hostname in URI");
> +ret = -EINVAL;
> +goto out;
> +}
> +
> +if (!uri->path) {
> +error_setg(errp, "missing file path in URI");
> +ret = -EINVAL;
> +goto out;
> +}
> +
> +qp = query_params_parse(uri->query);
> +if (!qp) {
> +error_setg(errp, "could not parse query parameters");
> +ret = -EINVAL;
> +goto out;
> +}
> +
> +qdict_put(options, "host", qstring_from_str(uri->server));
> +
> +qdict_put(options, "path", qstring_from_str(uri->path));

Just style, but why the empty line between both qdict_put() calls?

> +
> +for (i = 0; i < qp->n; i++) {
> +if (!qp->p[i].value) {
> +error_setg(errp, "Value for NFS parameter expected: %s",
> +   qp->p[i].name);
> +goto out;

You need to set ret = -EINVAL here.

> +}
> +if (parse_uint_full(qp->p[i].value, NULL, 0)) {
> +error_setg(errp, "Illegal value for NFS parameter: %s",
> +   qp->p[i].name);
> +goto out;

And here.

> +}
> +if (!strcmp(qp->p[i].name, "uid")) {
> +qdict_put(options, "uid",
> +  qstring_from_str(qp->p[i].value));
> +} else if (!strcmp(qp->p[i].name, "gid")) {
> +qdict_put(options, "gid",
> +  qstring_from_str(qp->p[i].value));
> +} else if (!strcmp(qp->p[i].name, "tcp-syncnt")) {
> +qdict_put(options, "tcp-syncnt",
> +  qstring_from_str(qp->p[i].value));
> +} else if (!strcmp(qp->p[i].name, "readahead")) {
> +qdict_put(options, "readahead",
> +  qstring_from_str(qp->p[i].value));
> +} else if (!strcmp(qp->p[i].name, "pagecache")) {
> +qdict_put(options, "pagecache",
> +  qstring_from_str(qp->p[i].value));
> +} else if (!strcmp(qp->p[i].name, "debug")) {
> +qdict_put(options, "debug",
> +  qstring_from_str(qp->p[i].value));
> +} else {
> +error_setg(errp, "Unknown NFS parameter name: %s",
> +   qp->p[i].name);
> +goto out;

And here.

If you prefer, you can set ret = -EINVAL at the top of the function and
do a ret = 0 only right before out:

> +}
> +}
> +out:
> +if (qp) {
> +query_params_free(qp);
> +}
> +if (uri) {
> +uri_free(uri);
> +}
> +return ret;
> +}
> +
> +static void nfs_parse_filename(const char *filename, QDict *options,
> +   Error **errp)
> +{
> +int ret = 0;
> +
> +if (qdict_haskey(options, "host") ||
> +qdict_haskey(options, "path") ||
> +qdict_haskey(options, "uid") ||
> +qdict_haskey(options, "gid") ||
> +qdict_haskey(options, "tcp-syncnt") ||
> +qdict_haskey(options, "readahead") ||
> +qdict_haskey(options, "pagecache") ||
> +qdict_haskey(options, "debug")) {
> +error_setg(errp, "host/path/uid/gid/tcp-syncnt/readahead/pagecache"
> + "/debug and a filename may not be used at the same"
> + " time");
> +return;
> +}
> +
> +ret = nfs_parse_uri(filename, options, errp);
> +if 

Re: [Qemu-block] [PATCH v11 01/19] block: Add bdrv_drain_all_{begin, end}()

2016-10-25 Thread Paolo Bonzini


On 25/10/2016 15:39, Alberto Garcia wrote:
> On Mon 24 Oct 2016 12:53:41 PM CEST, Paolo Bonzini wrote:
> 
>>> My first thoughts were about how to let an unpause succeed without a
>>> previous pause for these objects, but actually I think this isn't
>>> what we should do. We rather want to actually do the pause instead
>>> because even new BDSes and block jobs should probably start in a
>>> quiesced state when created inside a drain_all section.
>>
>> Yes, I agree with this.  It shouldn't be too hard to implement it.  It
>> would require a BlockDriverState to look at the global "inside
>> bdrv_drain_all_begin" state, and ask its BlockBackend to disable
>> itself upon bdrv_replace_child.
> 
> Why in bdrv_replace_child()? bdrv_drain_all_end() enables all BDSs, but
> if you add one with "blockdev-add" it's not going to be disabled using
> this method.

You only need to disable it when blk_insert_bs is called.  In fact...

> In addition to that block jobs need the same, don't they? Something like
> "job->pause_count = all_quiesce_counter" in the initialization.

... we discussed a couple weeks ago that block jobs should register
BlkDeviceOps that pause the job in the drained_begin callback.  So when
the block job calls blk_insert_bs, the drained_begin callback would
start the job as paused.

> I think we'd also need to add block_job_pause_point() at the beginning
> of each one of their coroutines, in order to make sure that they really
> start paused.

It depends on the job, for example streaming starts with
block_job_sleep_ns.  Commit also does, except for some blk_getlength and
blk_truncate calls that could be moved to the caller.

Paolo



Re: [Qemu-block] [PATCH 4/4] block: Cater to iscsi with non-power-of-2 discard

2016-10-25 Thread Eric Blake
On 10/25/2016 07:42 AM, Peter Lieven wrote:
>>
>> But hey, that firmware is seriously weird. :)
> 
> Yes, so you would not change the new implementation?
> 
> Even if the discard is e.g. 1MB it could theretically be that internally
> the device has a finer granularity. Its an optimal discard alignment
> not the minimum required discard size. I think thats a difference.
> It does not say I can't handle smaller discards.

The firmware is probably technically buggy for advertising too large of
a minimum granularity, if it can piece together smaller requests into a
larger discard.  If discards need to happen at a smaller granularity,
the firmware (or kernel quirk system) should fix the advertisement to
the actual granularity that it will honor.  I don't see a reason to
change qemu's current behavior.

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-block] [PATCH v11 01/19] block: Add bdrv_drain_all_{begin, end}()

2016-10-25 Thread Alberto Garcia
On Mon 24 Oct 2016 12:53:41 PM CEST, Paolo Bonzini wrote:

>> My first thoughts were about how to let an unpause succeed without a
>> previous pause for these objects, but actually I think this isn't
>> what we should do. We rather want to actually do the pause instead
>> because even new BDSes and block jobs should probably start in a
>> quiesced state when created inside a drain_all section.
>
> Yes, I agree with this.  It shouldn't be too hard to implement it.  It
> would require a BlockDriverState to look at the global "inside
> bdrv_drain_all_begin" state, and ask its BlockBackend to disable
> itself upon bdrv_replace_child.

Why in bdrv_replace_child()? bdrv_drain_all_end() enables all BDSs, but
if you add one with "blockdev-add" it's not going to be disabled using
this method.

In addition to that block jobs need the same, don't they? Something like
"job->pause_count = all_quiesce_counter" in the initialization.

I think we'd also need to add block_job_pause_point() at the beginning
of each one of their coroutines, in order to make sure that they really
start paused.

Berto



Re: [Qemu-block] [Qemu-devel] Assertion failure on qcow2 disk with cluster_size != 64k

2016-10-25 Thread Eric Blake
On 10/25/2016 03:39 AM, Kevin Wolf wrote:
>> It appears loop devices (with or without dm-crypt/LUKS) report a
>> 255-sector maximum per request via the BLKSECTGET ioctl, which qemu
>> rounds down to 64k in raw_refresh_limits().

Cool - I can make a loop device, so now I should have enough info to
reproduce locally.

I wonder if we should continue rounding down to a power of 2 (128
sectors), or if we should try to utilize the full 255 sector limit.  But
that's an independent patch and bigger audit, more along the lines of
what we did for the weird scsi device with 15M limits.

>> However this maximum
>> appears to be just a hint: bdrv_driver_pwritev() succeeds even with a
>> 385024-byte buffer of zeroes.
> 
> I suppose what happens is that we get short writes, but the raw-posix
> driver actually has the loop to deal with this, so eventually we return
> with the whole thing written.
> 
> Considering the presence of this loop, maybe we shouldn't set
> bs->bl.max_transfer at all for raw-posix. Hm, except that for Linux AIO
> we might actually need it.

bdrv_aligned_preadv() fragments before calling into
bdrv_driver_preadv(), but our write-zeroes fallback code is calling
directly into bdrv_driver_preadv() rather than going through
bdrv_aligned_preadv(); and I don't think we want to change that.  In
other words, it sounds like bdrv_co_do_pwrite_zeroes() has to do the
same fragmentation that bdrv_aligned_preadv() would be doing.  Okay, I
think I know where to go with the patch, and hope to have something
posted later today.

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-block] [PATCH v8 05/36] raw-posix: Add image locking support

2016-10-25 Thread Max Reitz
On 25.10.2016 08:31, Fam Zheng wrote:
> On Sat, 10/22 01:40, Max Reitz wrote:
>> On 30.09.2016 14:09, Fam Zheng wrote:

[...]

>>> +static int
>>> +raw_reopen_upgrade(BDRVReopenState *state,
>>> +   RawReopenOperation op,
>>> +   ImageLockMode old_lock,
>>> +   ImageLockMode new_lock,
>>> +   Error **errp)
>>> +{
>>> +BDRVRawReopenState *raw_s = state->opaque;
>>> +BDRVRawState *s = state->bs->opaque;
>>> +int ret = 0, ret2;
>>> +
>>> +assert(old_lock == IMAGE_LOCK_MODE_SHARED);
>>> +assert(new_lock == IMAGE_LOCK_MODE_EXCLUSIVE);
>>> +switch (op) {
>>> +case RAW_REOPEN_PREPARE:
>>> +ret = raw_lock_fd(raw_s->lock_fd, IMAGE_LOCK_MODE_SHARED);
> 
> [1]
> 
>>> +if (ret) {
>>> +error_setg_errno(errp, -ret, "Failed to lock new fd (shared)");
>>> +break;
>>> +}
>>> +ret = raw_lock_fd(s->lock_fd, IMAGE_LOCK_MODE_NOLOCK);
> 
> [2]
> 
>>> +if (ret) {
>>> +error_setg_errno(errp, -ret, "Failed to unlock old fd");
>>> +goto restore;
>>> +}
>>> +ret = raw_lock_fd(raw_s->lock_fd, IMAGE_LOCK_MODE_EXCLUSIVE);
> 
> [3]
> 
>>> +if (ret) {
>>> +error_setg_errno(errp, -ret, "Failed to lock new fd 
>>> (exclusive)");
>>> +goto restore;
>>> +}
>>> +break;
>>> +case RAW_REOPEN_COMMIT:
>>> +break;
>>> +case RAW_REOPEN_ABORT:
>>> +raw_lock_fd(raw_s->lock_fd, IMAGE_LOCK_MODE_SHARED);
>>> +ret = raw_lock_fd(s->lock_fd, IMAGE_LOCK_MODE_SHARED);
> 
> [4]
> 
>>> +if (ret) {
>>> +error_report("Failed to restore lock on old fd");
>>
>> If we get here, s->lock_fd is still locked exclusively. The following is
>> a very personal opinion, but anyway: I think it would be be better for
>> it to be unlocked. If it's locked too strictly, this can really break
>> something; but if it's just not locked (while it should be locked in
>> shared mode), everything's going to be fine unless the user makes a
>> mistake. I think the latter is less bad.
> 
> There are four lock states when we land on this "abort" branch:
> 
>   a) Lock "prepare" was not run, some other error happened earlier, so the 
> lock
>  aren't changed compared to before the transaction starts: raw_s->lock_fd 
> is
>  unlocked, s->lock_fd is shared locked. In this case raw_lock_fd [4] 
> cannot
>  fail, and even if it does, s->lock_fd is in the correct state.
> 
>   b) The raw_lock_fd [1] failed. This is similar to 1), s->lock_fd is intact, 
> so
>  we are good.
> 
>   c) The raw_lock_fd [2] failed. Again, similar to above.
> 
>   d) The raw_lock_fd [3] failed. Here raw_s->lock_fd is shared locked, and
>  s->lock_fd is unlocked. The correct "abort" sequence is downgrade
>  raw_s->lock_fd and "shared lock" s->lock_fd again. If the "abort" 
> sequence
>  failed, s->lock_fd is unlocked.

OK, you're right, I somehow forgot about the cases where our prepare
sequence was either not run at all or failed. But I was thinking about
the case where our preparation was successful, but some later
preparation in the reopen transaction failed. Then, s->lock_fd should be
locked exclusively, no?

[...]

>>> +static int raw_reopen_handle_lock(BDRVReopenState *state,
>>> +  RawReopenOperation op,
>>> +  Error **errp)
>>> +{
>>> +BDRVRawReopenState *raw_s = state->opaque;
>>
>> Please choose another name, it's hard not to confuse this with the
>> BDRVRawState all the time. (e.g. raw_rs or just rs would be enough.)
> 
> Sorry I can't change the name in this patch, it will cause more inconsistency
> because raw_reopen_* already use the name broadly. If you really insist it's a
> bad name, I can append or prepend a patch in the next version.

Well, I don't insist, but it really confused me a couple of times, even
after I had realized my mistake once.

Max



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-block] [Qemu-devel] [PATCH v14 04/21] qapi: rename QmpInputVisitor to QObjectInputVisitor

2016-10-25 Thread Markus Armbruster
"Daniel P. Berrange"  writes:

> The QmpInputVisitor has no direct dependency on QMP. It is
> valid to use it anywhere that one has a QObject. Rename it
> to better reflect its functionality as a generic QObject
> to QAPI converter.
>
> Reviewed-by: Kevin Wolf 
> Reviewed-by: Eric Blake 
> Signed-off-by: Daniel P. Berrange 
> ---
>  docs/qapi-code-gen.txt |   2 +-
>  ...qmp-input-visitor.h => qobject-input-visitor.h} |  10 +-
>  include/qapi/visitor.h |   6 +-
>  monitor.c  |   2 +-
>  qapi/Makefile.objs |   2 +-
>  ...qmp-input-visitor.c => qobject-input-visitor.c} | 171 
> +++--
>  qmp.c  |   4 +-
>  qom/qom-qobject.c  |   4 +-
>  scripts/qapi-commands.py   |   4 +-
>  target-s390x/cpu_models.c  |   4 +-
>  tests/.gitignore   |   4 +-
>  tests/Makefile.include |  12 +-
>  tests/check-qnull.c|   4 +-
>  tests/test-qmp-commands.c  |   4 +-
>  ...-input-strict.c => test-qobject-input-strict.c} |   6 +-
>  ...nput-visitor.c => test-qobject-input-visitor.c} |   6 +-
>  tests/test-string-input-visitor.c  |   2 +-
>  tests/test-visitor-serialization.c |   4 +-
>  util/qemu-sockets.c|   2 +-
>  19 files changed, 128 insertions(+), 125 deletions(-)
>  rename include/qapi/{qmp-input-visitor.h => qobject-input-visitor.h} (63%)
>  rename qapi/{qmp-input-visitor.c => qobject-input-visitor.c} (56%)
>  rename tests/{test-qmp-input-strict.c => test-qobject-input-strict.c} (98%)
>  rename tests/{test-qmp-input-visitor.c => test-qobject-input-visitor.c} (99%)

Here, git detects the rename, barely.  I'll split this patch anyway, for
symmetry.



Re: [Qemu-block] [PATCH v8 02/36] qapi: Add ImageLockMode

2016-10-25 Thread Fam Zheng
On Tue, 10/25 15:20, Max Reitz wrote:
> On 25.10.2016 07:36, Fam Zheng wrote:
> > On Fri, 10/21 22:45, Max Reitz wrote:
> >> On 30.09.2016 14:09, Fam Zheng wrote:
> >>> Signed-off-by: Fam Zheng 
> >>> ---
> >>>  qapi/block-core.json | 18 ++
> >>>  1 file changed, 18 insertions(+)
> >>>
> >>> diff --git a/qapi/block-core.json b/qapi/block-core.json
> >>> index 92193ab..22e8d04 100644
> >>> --- a/qapi/block-core.json
> >>> +++ b/qapi/block-core.json
> >>> @@ -2754,3 +2754,21 @@
> >>>'data' : { 'parent': 'str',
> >>>   '*child': 'str',
> >>>   '*node': 'str' } }
> >>> +
> >>> +##
> >>> +# @ImageLockMode:
> >>> +#
> >>> +# @auto: defer to the block driver to use the least strict mode, based on
> >>> +#the nature of format and read-only flag, and the supported 
> >>> locking
> >>> +#operations of the protocol.
> >>
> >> I have some difficulty understanding this description. I'd intuitively
> >> assume no locking to be the "least strict mode"; however, since it
> >> should be always possible not to lock an image, this would mean that
> >> auto=nolock. Which is hopefully isn't.
> >>
> >> If it's not easy to come up with a thorough explanation, perhaps it
> >> would be best to give some examples which help to understand the concept
> >> behind "auto" intuitively.
> > 
> > It could have beeen more specific, it's my bad being too terse here. Maybe
> > something like this:
> > 
> > @auto: defer to the block layer to use an appropriate lock mode, based 
> > on
> >the driver used and read-only option: for read-only images, 
> > shared
> >lock mode, or otherwise exclusive lock mode, will be attempted; 
> > if
> >the driver doesn't support this mode (or sharing is particularly
> >desired by its design), nolock will be used.
> > 
> > ?
> 
> Sounds good to me, if that's how it's supposed to be.
> 
> Do we actually want to use nolock "if sharing is particularly desired by
> its design"? I mean, I think one of the drivers that would apply to is
> NBD, but is the fact that multiple parties can freely access (and write
> to!) an NBD disk at the same time really what we want or just a design
> limitation?

The guest can always corrupt data themselves so I think my wording is inaccurate
here. Actually the more reasonable case of this is for example when server side
locking is applied automatically and transparently:

https://lists.gnu.org/archive/html/qemu-devel/2016-04/msg03576.html

(I don't know how the mentioned feature above ended up at RBD side, but that
makes an interesting consideration point).

Fam






Re: [Qemu-block] [PATCH v8 00/36] block: Image locking series

2016-10-25 Thread Max Reitz
On 25.10.2016 10:24, Kevin Wolf wrote:
> Am 24.10.2016 um 20:03 hat Max Reitz geschrieben:
>> On 24.10.2016 12:11, Kevin Wolf wrote:
>>
>> [...]
>>
>>> Now, the big question is how to translate this into file locking. This
>>> could become a little tricky. I had a few thoughts involving another
>>> lock on byte 2, but none of them actually worked out so far, because
>>> what we want is essentially a lock that can be shared by readers, that
>>> can also be shared by writers, but not by readers and writers at the
>>> same time.
>>
>> You can also share it between readers and writers, as long as everyone
>> can cope with volatile data.
> 
> Sorry, that was ambiguous. I meant a file-level lock rather than the
> high-level one. If we had a lock that can be shared by one or the other,
> but not both, then two locks would be enough to build what we really
> want.
> 
>> I agree that it's very similar to the proposed op blocker style, but I
>> can't really come up with a meaningful translation either.
>>
>> Maybe something like this (?): All readers who do not want the file to
>> be modified grab a shared lock on byte 1. All writers who can deal with
>> volatile data grab a shared lock on byte 2. Exclusive writers grab an
>> exclusive lock on byte 1 and 2. Readers who can cope with volatile data
>> get no lock at all.
>>
>> When opening, the first and second group would always have to test
>> whether there is a lock on the other byte, respectively. E.g. sharing
>> writers would first grab an exclusive lock on byte 1, then the shared
>> lock on byte 2 and then release the exclusive lock again.
>>
>> Would that work?
> 
> I'm afraid it wouldn't. If you start the sharing writer first and then
> the writer-blocking reader, the writer doesn't hold a lock on byte 1 any
> more,

But it holds a lock on byte 2.

>   so the reader can start even though someone is writing to the
> image.

It can't because it would try to grab an exclusive lock on byte 2 before
grabbing the shared lock on byte 1.

Max

>On the other hand, the writer can't keep an exclusive lock
> because it would block other users that can share the image.
> 
> Kevin
> 




signature.asc
Description: OpenPGP digital signature


[Qemu-block] [PATCH v5 12/13] iotests: Add assert_json_filename_equal() method

2016-10-25 Thread Max Reitz
Since the order of keys in JSON filenames is not necessarily fixed, they
should not be compared to fixed strings. This method takes a Python dict
as a reference, parses a given JSON filename and compares both.

Signed-off-by: Max Reitz 
---
 tests/qemu-iotests/iotests.py | 20 
 1 file changed, 20 insertions(+)

diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
index c589deb..1f30cfc 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -222,6 +222,19 @@ class QMPTestCase(unittest.TestCase):
 self.fail('invalid index "%s" in path "%s" in "%s"' % 
(idx, path, str(d)))
 return d
 
+def flatten_qmp_object(self, obj, output=None, basestr=''):
+if output is None:
+output = dict()
+if isinstance(obj, list):
+for i in range(len(obj)):
+self.flatten_qmp_object(obj[i], output, basestr + str(i) + '.')
+elif isinstance(obj, dict):
+for key in obj:
+self.flatten_qmp_object(obj[key], output, basestr + key + '.')
+else:
+output[basestr[:-1]] = obj # Strip trailing '.'
+return output
+
 def assert_qmp_absent(self, d, path):
 try:
 result = self.dictpath(d, path)
@@ -252,6 +265,13 @@ class QMPTestCase(unittest.TestCase):
 self.assertTrue(False, "Cannot find %s %s in result:\n%s" % \
 (node_name, file_name, result))
 
+def assert_json_filename_equal(self, json_filename, reference):
+'''Asserts that the given filename is a json: filename and that its
+   content is equal to the given reference object'''
+self.assertEqual(json_filename[:5], 'json:')
+
self.assertEqual(self.flatten_qmp_object(json.loads(json_filename[5:])),
+ self.flatten_qmp_object(reference))
+
 def cancel_and_wait(self, drive='drive0', force=False, resume=False):
 '''Cancel a block job and wait for it to finish, returning the event'''
 result = self.vm.qmp('block-job-cancel', device=drive, force=force)
-- 
2.10.1




[Qemu-block] [PATCH v5 10/13] iotests.py: Allow concurrent qemu instances

2016-10-25 Thread Max Reitz
By adding an optional suffix to the files used for communication with a
VM, we can launch multiple VM instances concurrently.

Reviewed-by: Kevin Wolf 
Signed-off-by: Max Reitz 
---
 tests/qemu-iotests/iotests.py | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
index 5a2678f..c589deb 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -140,8 +140,10 @@ def log(msg, filters=[]):
 class VM(qtest.QEMUQtestMachine):
 '''A QEMU VM'''
 
-def __init__(self):
-super(VM, self).__init__(qemu_prog, qemu_opts, test_dir=test_dir,
+def __init__(self, path_suffix=''):
+name = "qemu%s-%d" % (path_suffix, os.getpid())
+super(VM, self).__init__(qemu_prog, qemu_opts, name=name,
+ test_dir=test_dir,
  socket_scm_helper=socket_scm_helper)
 if debug:
 self._debug = True
-- 
2.10.1




[Qemu-block] [PATCH v5 09/13] iotests.py: Add qemu_nbd function

2016-10-25 Thread Max Reitz
Signed-off-by: Max Reitz 
---
 tests/qemu-iotests/iotests.py | 8 
 1 file changed, 8 insertions(+)

diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
index 3329bc1..5a2678f 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -39,6 +39,10 @@ qemu_io_args = [os.environ.get('QEMU_IO_PROG', 'qemu-io')]
 if os.environ.get('QEMU_IO_OPTIONS'):
 qemu_io_args += os.environ['QEMU_IO_OPTIONS'].strip().split(' ')
 
+qemu_nbd_args = [os.environ.get('QEMU_NBD_PROG', 'qemu-nbd')]
+if os.environ.get('QEMU_NBD_OPTIONS'):
+qemu_nbd_args += os.environ['QEMU_NBD_OPTIONS'].strip().split(' ')
+
 qemu_prog = os.environ.get('QEMU_PROG', 'qemu')
 qemu_opts = os.environ.get('QEMU_OPTIONS', '').strip().split(' ')
 
@@ -87,6 +91,10 @@ def qemu_io(*args):
 sys.stderr.write('qemu-io received signal %i: %s\n' % (-exitcode, ' 
'.join(args)))
 return subp.communicate()[0]
 
+def qemu_nbd(*args):
+'''Run qemu-nbd in daemon mode and return the parent's exit code'''
+return subprocess.call(qemu_nbd_args + ['--fork'] + list(args))
+
 def compare_images(img1, img2, fmt1=imgfmt, fmt2=imgfmt):
 '''Return True if two image files are identical'''
 return qemu_img('compare', '-f', fmt1,
-- 
2.10.1




Re: [Qemu-block] [PATCH v8 02/36] qapi: Add ImageLockMode

2016-10-25 Thread Max Reitz
On 25.10.2016 07:36, Fam Zheng wrote:
> On Fri, 10/21 22:45, Max Reitz wrote:
>> On 30.09.2016 14:09, Fam Zheng wrote:
>>> Signed-off-by: Fam Zheng 
>>> ---
>>>  qapi/block-core.json | 18 ++
>>>  1 file changed, 18 insertions(+)
>>>
>>> diff --git a/qapi/block-core.json b/qapi/block-core.json
>>> index 92193ab..22e8d04 100644
>>> --- a/qapi/block-core.json
>>> +++ b/qapi/block-core.json
>>> @@ -2754,3 +2754,21 @@
>>>'data' : { 'parent': 'str',
>>>   '*child': 'str',
>>>   '*node': 'str' } }
>>> +
>>> +##
>>> +# @ImageLockMode:
>>> +#
>>> +# @auto: defer to the block driver to use the least strict mode, based on
>>> +#the nature of format and read-only flag, and the supported locking
>>> +#operations of the protocol.
>>
>> I have some difficulty understanding this description. I'd intuitively
>> assume no locking to be the "least strict mode"; however, since it
>> should be always possible not to lock an image, this would mean that
>> auto=nolock. Which is hopefully isn't.
>>
>> If it's not easy to come up with a thorough explanation, perhaps it
>> would be best to give some examples which help to understand the concept
>> behind "auto" intuitively.
> 
> It could have beeen more specific, it's my bad being too terse here. Maybe
> something like this:
> 
> @auto: defer to the block layer to use an appropriate lock mode, based on
>the driver used and read-only option: for read-only images, shared
>lock mode, or otherwise exclusive lock mode, will be attempted; if
>the driver doesn't support this mode (or sharing is particularly
>desired by its design), nolock will be used.
> 
> ?

Sounds good to me, if that's how it's supposed to be.

Do we actually want to use nolock "if sharing is particularly desired by
its design"? I mean, I think one of the drivers that would apply to is
NBD, but is the fact that multiple parties can freely access (and write
to!) an NBD disk at the same time really what we want or just a design
limitation?

Max



signature.asc
Description: OpenPGP digital signature


[Qemu-block] [PATCH v4 3/5] block/ssh: Add InetSocketAddress and accept it

2016-10-25 Thread Ashijeet Acharya
Add InetSocketAddress compatibility to SSH driver.

Add a new option "server" to the SSH block driver which then accepts
a InetSocketAddress.

"host" and "port" are supported as legacy options and are mapped to
their InetSocketAddress representation.

Signed-off-by: Ashijeet Acharya 
---
 block/ssh.c | 98 +++--
 1 file changed, 82 insertions(+), 16 deletions(-)

diff --git a/block/ssh.c b/block/ssh.c
index 75cb7bc..d814006 100644
--- a/block/ssh.c
+++ b/block/ssh.c
@@ -30,10 +30,14 @@
 #include "block/block_int.h"
 #include "qapi/error.h"
 #include "qemu/error-report.h"
+#include "qemu/cutils.h"
 #include "qemu/sockets.h"
 #include "qemu/uri.h"
+#include "qapi-visit.h"
 #include "qapi/qmp/qint.h"
 #include "qapi/qmp/qstring.h"
+#include "qapi/qobject-input-visitor.h"
+#include "qapi/qobject-output-visitor.h"
 
 /* DEBUG_SSH=1 enables the DPRINTF (debugging printf) statements in
  * this block driver code.
@@ -74,8 +78,9 @@ typedef struct BDRVSSHState {
  */
 LIBSSH2_SFTP_ATTRIBUTES attrs;
 
+InetSocketAddress *inet;
+
 /* Used to warn if 'flush' is not supported. */
-char *hostport;
 bool unsafe_flush_warning;
 } BDRVSSHState;
 
@@ -89,7 +94,6 @@ static void ssh_state_init(BDRVSSHState *s)
 
 static void ssh_state_free(BDRVSSHState *s)
 {
-g_free(s->hostport);
 if (s->sftp_handle) {
 libssh2_sftp_close(s->sftp_handle);
 }
@@ -263,7 +267,8 @@ static bool ssh_has_filename_options_conflict(QDict 
*options, Error **errp)
 !strcmp(qe->key, "port") ||
 !strcmp(qe->key, "path") ||
 !strcmp(qe->key, "user") ||
-!strcmp(qe->key, "host_key_check"))
+!strcmp(qe->key, "host_key_check") ||
+strstart(qe->key, "server.", NULL))
 {
 error_setg(errp, "Option '%s' cannot be used with a file name",
qe->key);
@@ -555,14 +560,69 @@ static QemuOptsList ssh_runtime_opts = {
 },
 };
 
+static bool ssh_process_legacy_socket_options(QDict *output_opts,
+  QemuOpts *legacy_opts,
+  Error **errp)
+{
+const char *host = qemu_opt_get(legacy_opts, "host");
+const char *port = qemu_opt_get(legacy_opts, "port");
+
+if (!host && port) {
+error_setg(errp, "port may not be used without host");
+return false;
+}
+
+if (host) {
+qdict_put(output_opts, "server.host", qstring_from_str(host));
+qdict_put(output_opts, "server.port",
+  qstring_from_str(port ?: stringify(22)));
+}
+
+return true;
+}
+
+static InetSocketAddress *ssh_config(BDRVSSHState *s, QDict *options,
+ Error **errp)
+{
+InetSocketAddress *inet = NULL;
+QDict *addr = NULL;
+QObject *crumpled_addr = NULL;
+Visitor *iv = NULL;
+Error *local_error = NULL;
+
+qdict_extract_subqdict(options, , "server.");
+if (!qdict_size(addr)) {
+error_setg(errp, "SSH server address missing");
+goto out;
+}
+
+crumpled_addr = qdict_crumple(addr, true, errp);
+if (!crumpled_addr) {
+goto out;
+}
+
+iv = qobject_input_visitor_new(crumpled_addr, true);
+visit_type_InetSocketAddress(iv, NULL, , _error);
+if (local_error) {
+error_propagate(errp, local_error);
+goto out;
+}
+
+out:
+QDECREF(addr);
+qobject_decref(crumpled_addr);
+visit_free(iv);
+return inet;
+}
+
 static int connect_to_ssh(BDRVSSHState *s, QDict *options,
   int ssh_flags, int creat_mode, Error **errp)
 {
 int r, ret;
 QemuOpts *opts = NULL;
 Error *local_err = NULL;
-const char *host, *user, *path, *host_key_check;
-int port;
+const char *user, *path, *host_key_check;
+long port = 0;
 
 opts = qemu_opts_create(_runtime_opts, NULL, 0, _abort);
 qemu_opts_absorb_qdict(opts, options, _err);
@@ -572,15 +632,11 @@ static int connect_to_ssh(BDRVSSHState *s, QDict *options,
 goto err;
 }
 
-host = qemu_opt_get(opts, "host");
-if (!host) {
+if (!ssh_process_legacy_socket_options(options, opts, errp)) {
 ret = -EINVAL;
-error_setg(errp, "No hostname was specified");
 goto err;
 }
 
-port = qemu_opt_get_number(opts, "port", 22);
-
 path = qemu_opt_get(opts, "path");
 if (!path) {
 ret = -EINVAL;
@@ -603,12 +659,21 @@ static int connect_to_ssh(BDRVSSHState *s, QDict *options,
 host_key_check = "yes";
 }
 
-/* Construct the host:port name for inet_connect. */
-g_free(s->hostport);
-s->hostport = g_strdup_printf("%s:%d", host, port);
+/* Pop the config into our state object, Exit if invalid */
+s->inet = ssh_config(s, options, errp);
+if (!s->inet) {
+ret = -EINVAL;
+goto err;
+}
+
+if 

Re: [Qemu-block] [PATCH v8 03/36] block: Introduce image file locking

2016-10-25 Thread Max Reitz
On 25.10.2016 07:48, Fam Zheng wrote:
> On Fri, 10/21 23:04, Max Reitz wrote:

[...]

>>> +int bdrv_set_lock_mode(BlockDriverState *bs, ImageLockMode mode)
>>> +{
>>> +int ret;
>>> +
>>> +if (bs->cur_lock == mode) {
>>> +return 0;
>>> +} else if (!bs->drv) {
>>> +return -ENOMEDIUM;
>>> +} else if (!bs->drv->bdrv_lockf) {
>>> +if (bs->file) {
>>> +return bdrv_set_lock_mode(bs->file->bs, mode);
>>> +}
>>> +return 0;
>>> +}
>>> +ret = bs->drv->bdrv_lockf(bs, mode);
>>> +if (ret == -ENOTSUP) {
>>> +/* Handle it the same way as !bs->drv->bdrv_lockf */
>>> +ret = 0;
>>
>> Yes, well, why do you handle both as success? Wouldn't returning
>> -ENOTSUP make more sense?
>>
>> I guess the caller can find out itself by checking whether bs->cur_lock
>> has changed, but...
> 
> I can't think of a reason for any caller to do something different for 
> -ENOTSUP
> from success, hence the check here.

OK, that's fine, then.

Max



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-block] [PATCH v5 00/13] qapi: Allow blockdev-add for NBD

2016-10-25 Thread Max Reitz
On 25.10.2016 15:11, Max Reitz wrote:

[...]

> ***This series depends on qdict_crumple() as it has been merged by
>Markus to his qapi-next branch (i.e. without @recursive).***

Actually, it depends on the whole branch.

Max



signature.asc
Description: OpenPGP digital signature


[Qemu-block] [PATCH v5 08/13] qapi: Allow blockdev-add for NBD

2016-10-25 Thread Max Reitz
Signed-off-by: Max Reitz 
---
 qapi/block-core.json | 25 ++---
 1 file changed, 22 insertions(+), 3 deletions(-)

diff --git a/qapi/block-core.json b/qapi/block-core.json
index 97b1205..4b4a74c 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -1703,14 +1703,15 @@
 #
 # @host_device, @host_cdrom: Since 2.1
 # @gluster: Since 2.7
+# @nbd: Since 2.8
 #
 # Since: 2.0
 ##
 { 'enum': 'BlockdevDriver',
   'data': [ 'archipelago', 'blkdebug', 'blkverify', 'bochs', 'cloop',
 'dmg', 'file', 'ftp', 'ftps', 'gluster', 'host_cdrom',
-'host_device', 'http', 'https', 'luks', 'null-aio', 'null-co',
-'parallels', 'qcow', 'qcow2', 'qed', 'quorum', 'raw',
+'host_device', 'http', 'https', 'luks', 'nbd', 'null-aio',
+'null-co', 'parallels', 'qcow', 'qcow2', 'qed', 'quorum', 'raw',
'replication', 'tftp', 'vdi', 'vhdx', 'vmdk', 'vpc', 'vvfat' ] }
 
 ##
@@ -2220,6 +2221,24 @@
   'data': { 'filename': 'str' } }
 
 ##
+# @BlockdevOptionsNbd
+#
+# Driver specific block device options for NBD.
+#
+# @server:  NBD server address
+#
+# @export:  #optional export name
+#
+# @tls-creds:   #optional TLS credentials ID
+#
+# Since: 2.8
+##
+{ 'struct': 'BlockdevOptionsNbd',
+  'data': { 'server': 'SocketAddress',
+'*export': 'str',
+'*tls-creds': 'str' } }
+
+##
 # @BlockdevOptions
 #
 # Options for creating a block device.  Many options are available for all
@@ -2264,7 +2283,7 @@
   'https':  'BlockdevOptionsCurl',
 # TODO iscsi: Wait for structured options
   'luks':   'BlockdevOptionsLUKS',
-# TODO nbd: Should take InetSocketAddress for 'host'?
+  'nbd':'BlockdevOptionsNbd',
 # TODO nfs: Wait for structured options
   'null-aio':   'BlockdevOptionsNull',
   'null-co':'BlockdevOptionsNull',
-- 
2.10.1




[Qemu-block] [PATCH v5 07/13] block/nbd: Use SocketAddress options

2016-10-25 Thread Max Reitz
Drop the use of legacy options in favor of the SocketAddress
representation, even for internal use (i.e. for storing the result of
the filename parsing).

Signed-off-by: Max Reitz 
---
 block/nbd.c | 26 --
 1 file changed, 16 insertions(+), 10 deletions(-)

diff --git a/block/nbd.c b/block/nbd.c
index a778692..8ef1438 100644
--- a/block/nbd.c
+++ b/block/nbd.c
@@ -94,9 +94,13 @@ static int nbd_parse_uri(const char *filename, QDict 
*options)
 ret = -EINVAL;
 goto out;
 }
-qdict_put(options, "path", qstring_from_str(qp->p[0].value));
+qdict_put(options, "server.type", qstring_from_str("unix"));
+qdict_put(options, "server.data.path",
+  qstring_from_str(qp->p[0].value));
 } else {
 QString *host;
+char *port_str;
+
 /* nbd[+tcp]://host[:port]/export */
 if (!uri->server) {
 ret = -EINVAL;
@@ -111,12 +115,12 @@ static int nbd_parse_uri(const char *filename, QDict 
*options)
 host = qstring_from_str(uri->server);
 }
 
-qdict_put(options, "host", host);
-if (uri->port) {
-char* port_str = g_strdup_printf("%d", uri->port);
-qdict_put(options, "port", qstring_from_str(port_str));
-g_free(port_str);
-}
+qdict_put(options, "server.type", qstring_from_str("inet"));
+qdict_put(options, "server.data.host", host);
+
+port_str = g_strdup_printf("%d", uri->port ?: NBD_DEFAULT_PORT);
+qdict_put(options, "server.data.port", qstring_from_str(port_str));
+g_free(port_str);
 }
 
 out:
@@ -192,7 +196,8 @@ static void nbd_parse_filename(const char *filename, QDict 
*options,
 
 /* are we a UNIX or TCP socket? */
 if (strstart(host_spec, "unix:", )) {
-qdict_put(options, "path", qstring_from_str(unixpath));
+qdict_put(options, "server.type", qstring_from_str("unix"));
+qdict_put(options, "server.data.path", qstring_from_str(unixpath));
 } else {
 InetSocketAddress *addr = NULL;
 
@@ -201,8 +206,9 @@ static void nbd_parse_filename(const char *filename, QDict 
*options,
 goto out;
 }
 
-qdict_put(options, "host", qstring_from_str(addr->host));
-qdict_put(options, "port", qstring_from_str(addr->port));
+qdict_put(options, "server.type", qstring_from_str("inet"));
+qdict_put(options, "server.data.host", qstring_from_str(addr->host));
+qdict_put(options, "server.data.port", qstring_from_str(addr->port));
 qapi_free_InetSocketAddress(addr);
 }
 
-- 
2.10.1




[Qemu-block] [PATCH v5 06/13] block/nbd: Accept SocketAddress

2016-10-25 Thread Max Reitz
Add a new option "server" to the NBD block driver which accepts a
SocketAddress.

"path", "host" and "port" are still supported as legacy options and are
mapped to their corresponding SocketAddress representation.

Signed-off-by: Max Reitz 
---
 block/nbd.c   | 175 +++---
 tests/qemu-iotests/051.out|   4 +-
 tests/qemu-iotests/051.pc.out |   4 +-
 3 files changed, 117 insertions(+), 66 deletions(-)

diff --git a/block/nbd.c b/block/nbd.c
index cdab20f..a778692 100644
--- a/block/nbd.c
+++ b/block/nbd.c
@@ -32,6 +32,9 @@
 #include "qemu/uri.h"
 #include "block/block_int.h"
 #include "qemu/module.h"
+#include "qapi-visit.h"
+#include "qapi/qobject-input-visitor.h"
+#include "qapi/qobject-output-visitor.h"
 #include "qapi/qmp/qdict.h"
 #include "qapi/qmp/qjson.h"
 #include "qapi/qmp/qint.h"
@@ -44,7 +47,8 @@ typedef struct BDRVNBDState {
 NbdClientSession client;
 
 /* For nbd_refresh_filename() */
-char *path, *host, *port, *export, *tlscredsid;
+SocketAddress *saddr;
+char *export, *tlscredsid;
 } BDRVNBDState;
 
 static int nbd_parse_uri(const char *filename, QDict *options)
@@ -131,7 +135,8 @@ static bool nbd_has_filename_options_conflict(QDict 
*options, Error **errp)
 if (!strcmp(e->key, "host") ||
 !strcmp(e->key, "port") ||
 !strcmp(e->key, "path") ||
-!strcmp(e->key, "export"))
+!strcmp(e->key, "export") ||
+strstart(e->key, "server.", NULL))
 {
 error_setg(errp, "Option '%s' cannot be used with a file name",
e->key);
@@ -205,50 +210,81 @@ out:
 g_free(file);
 }
 
-static SocketAddress *nbd_config(BDRVNBDState *s, QemuOpts *opts, Error **errp)
+static bool nbd_process_legacy_socket_options(QDict *output_options,
+  QemuOpts *legacy_opts,
+  Error **errp)
 {
-SocketAddress *saddr;
+const char *path = qemu_opt_get(legacy_opts, "path");
+const char *host = qemu_opt_get(legacy_opts, "host");
+const char *port = qemu_opt_get(legacy_opts, "port");
+const QDictEntry *e;
 
-s->path = g_strdup(qemu_opt_get(opts, "path"));
-s->host = g_strdup(qemu_opt_get(opts, "host"));
-s->port = g_strdup(qemu_opt_get(opts, "port"));
+if (!path && !host && !port) {
+return true;
+}
 
-if (!s->path == !s->host) {
-if (s->path) {
-error_setg(errp, "path and host may not be used at the same time");
-} else {
-error_setg(errp, "one of path and host must be specified");
+for (e = qdict_first(output_options); e; e = qdict_next(output_options, e))
+{
+if (strstart(e->key, "server.", NULL)) {
+error_setg(errp, "Cannot use 'server' and path/host/port at the "
+   "same time");
+return false;
 }
-return NULL;
 }
-if (s->port && !s->host) {
-error_setg(errp, "port may not be used without host");
-return NULL;
+
+if (path && host) {
+error_setg(errp, "path and host may not be used at the same time");
+return false;
+} else if (path) {
+if (port) {
+error_setg(errp, "port may not be used without host");
+return false;
+}
+
+qdict_put(output_options, "server.type", qstring_from_str("unix"));
+qdict_put(output_options, "server.data.path", qstring_from_str(path));
+} else if (host) {
+qdict_put(output_options, "server.type", qstring_from_str("inet"));
+qdict_put(output_options, "server.data.host", qstring_from_str(host));
+qdict_put(output_options, "server.data.port",
+  qstring_from_str(port ?: stringify(NBD_DEFAULT_PORT)));
 }
 
-saddr = g_new0(SocketAddress, 1);
+return true;
+}
 
-if (s->path) {
-UnixSocketAddress *q_unix;
-saddr->type = SOCKET_ADDRESS_KIND_UNIX;
-q_unix = saddr->u.q_unix.data = g_new0(UnixSocketAddress, 1);
-q_unix->path = g_strdup(s->path);
-} else {
-InetSocketAddress *inet;
-
-saddr->type = SOCKET_ADDRESS_KIND_INET;
-inet = saddr->u.inet.data = g_new0(InetSocketAddress, 1);
-inet->host = g_strdup(s->host);
-inet->port = g_strdup(s->port);
-if (!inet->port) {
-inet->port = g_strdup_printf("%d", NBD_DEFAULT_PORT);
-}
+static SocketAddress *nbd_config(BDRVNBDState *s, QDict *options, Error **errp)
+{
+SocketAddress *saddr = NULL;
+QDict *addr = NULL;
+QObject *crumpled_addr = NULL;
+Visitor *iv = NULL;
+Error *local_err = NULL;
+
+qdict_extract_subqdict(options, , "server.");
+if (!qdict_size(addr)) {
+error_setg(errp, "NBD server address missing");
+goto done;
 }
 
-s->client.is_unix = saddr->type == SOCKET_ADDRESS_KIND_UNIX;
+crumpled_addr 

[Qemu-block] [PATCH v5 13/13] iotests: Add test for NBD's blockdev-add interface

2016-10-25 Thread Max Reitz
Signed-off-by: Max Reitz 
---
 tests/qemu-iotests/147 | 196 +
 tests/qemu-iotests/147.out |   5 ++
 tests/qemu-iotests/group   |   1 +
 3 files changed, 202 insertions(+)
 create mode 100755 tests/qemu-iotests/147
 create mode 100644 tests/qemu-iotests/147.out

diff --git a/tests/qemu-iotests/147 b/tests/qemu-iotests/147
new file mode 100755
index 000..32d2a03
--- /dev/null
+++ b/tests/qemu-iotests/147
@@ -0,0 +1,196 @@
+#!/usr/bin/env python
+#
+# Test case for NBD's blockdev-add interface
+#
+# Copyright (C) 2016 Red Hat, Inc.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see .
+#
+
+import os
+import socket
+import stat
+import time
+import iotests
+from iotests import cachemode, imgfmt, qemu_img, qemu_nbd
+
+NBD_PORT = 10811
+
+test_img = os.path.join(iotests.test_dir, 'test.img')
+unix_socket = os.path.join(iotests.test_dir, 'nbd.socket')
+
+class NBDBlockdevAddBase(iotests.QMPTestCase):
+def blockdev_add_options(self, address, export=None):
+options = { 'node-name': 'nbd-blockdev',
+'driver': 'raw',
+'file': {
+'driver': 'nbd',
+'server': address
+} }
+if export is not None:
+options['file']['export'] = export
+return options
+
+def client_test(self, filename, address, export=None):
+bao = self.blockdev_add_options(address, export)
+result = self.vm.qmp('blockdev-add', **bao)
+self.assert_qmp(result, 'return', {})
+
+result = self.vm.qmp('query-named-block-nodes')
+for node in result['return']:
+if node['node-name'] == 'nbd-blockdev':
+if isinstance(filename, str):
+self.assert_qmp(node, 'image/filename', filename)
+else:
+self.assert_json_filename_equal(node['image']['filename'],
+filename)
+break
+
+result = self.vm.qmp('x-blockdev-del', node_name='nbd-blockdev')
+self.assert_qmp(result, 'return', {})
+
+
+class QemuNBD(NBDBlockdevAddBase):
+def setUp(self):
+qemu_img('create', '-f', iotests.imgfmt, test_img, '64k')
+self.vm = iotests.VM()
+self.vm.launch()
+
+def tearDown(self):
+self.vm.shutdown()
+os.remove(test_img)
+try:
+os.remove(unix_socket)
+except OSError:
+pass
+
+def _server_up(self, *args):
+self.assertEqual(qemu_nbd('-f', imgfmt, test_img, *args), 0)
+
+def test_inet(self):
+self._server_up('-p', str(NBD_PORT))
+address = { 'type': 'inet',
+'data': {
+'host': 'localhost',
+'port': str(NBD_PORT)
+} }
+self.client_test('nbd://localhost:%i' % NBD_PORT, address)
+
+def test_unix(self):
+self._server_up('-k', unix_socket)
+address = { 'type': 'unix',
+'data': { 'path': unix_socket } }
+self.client_test('nbd+unix://?socket=' + unix_socket, address)
+
+
+class BuiltinNBD(NBDBlockdevAddBase):
+def setUp(self):
+qemu_img('create', '-f', iotests.imgfmt, test_img, '64k')
+self.vm = iotests.VM()
+self.vm.launch()
+self.server = iotests.VM('.server')
+self.server.add_drive_raw('if=none,id=nbd-export,' +
+  'file=%s,' % test_img +
+  'format=%s,' % imgfmt +
+  'cache=%s' % cachemode)
+self.server.launch()
+
+def tearDown(self):
+self.vm.shutdown()
+self.server.shutdown()
+os.remove(test_img)
+try:
+os.remove(unix_socket)
+except OSError:
+pass
+
+def _server_up(self, address):
+result = self.server.qmp('nbd-server-start', addr=address)
+self.assert_qmp(result, 'return', {})
+
+result = self.server.qmp('nbd-server-add', device='nbd-export')
+self.assert_qmp(result, 'return', {})
+
+def _server_down(self):
+result = self.server.qmp('nbd-server-stop')
+self.assert_qmp(result, 'return', {})
+
+def test_inet(self):
+address = { 'type': 'inet',
+ 

[Qemu-block] [PATCH v5 02/13] block/nbd: Reject port parameter without host

2016-10-25 Thread Max Reitz
Currently, a port that is passed along with a UNIX socket path is
silently ignored. That is not exactly ideal, it should be an error
instead.

Reviewed-by: Eric Blake 
Reviewed-by: Kevin Wolf 
Signed-off-by: Max Reitz 
---
 block/nbd.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/block/nbd.c b/block/nbd.c
index ce7c14f..eaca33c 100644
--- a/block/nbd.c
+++ b/block/nbd.c
@@ -197,6 +197,7 @@ static SocketAddress *nbd_config(BDRVNBDState *s, QemuOpts 
*opts, Error **errp)
 
 s->path = g_strdup(qemu_opt_get(opts, "path"));
 s->host = g_strdup(qemu_opt_get(opts, "host"));
+s->port = g_strdup(qemu_opt_get(opts, "port"));
 
 if (!s->path == !s->host) {
 if (s->path) {
@@ -206,6 +207,10 @@ static SocketAddress *nbd_config(BDRVNBDState *s, QemuOpts 
*opts, Error **errp)
 }
 return NULL;
 }
+if (s->port && !s->host) {
+error_setg(errp, "port may not be used without host");
+return NULL;
+}
 
 saddr = g_new0(SocketAddress, 1);
 
@@ -217,8 +222,6 @@ static SocketAddress *nbd_config(BDRVNBDState *s, QemuOpts 
*opts, Error **errp)
 } else {
 InetSocketAddress *inet;
 
-s->port = g_strdup(qemu_opt_get(opts, "port"));
-
 saddr->type = SOCKET_ADDRESS_KIND_INET;
 inet = saddr->u.inet.data = g_new0(InetSocketAddress, 1);
 inet->host = g_strdup(s->host);
-- 
2.10.1




[Qemu-block] [PATCH v5 11/13] socket_scm_helper: Accept fd directly

2016-10-25 Thread Max Reitz
This gives us more freedom about the fd that is passed to qemu, allowing
us to e.g. pass sockets.

Reviewed-by: Kevin Wolf 
Signed-off-by: Max Reitz 
---
 tests/qemu-iotests/socket_scm_helper.c | 29 ++---
 1 file changed, 18 insertions(+), 11 deletions(-)

diff --git a/tests/qemu-iotests/socket_scm_helper.c 
b/tests/qemu-iotests/socket_scm_helper.c
index 80cadf4..eb76d31 100644
--- a/tests/qemu-iotests/socket_scm_helper.c
+++ b/tests/qemu-iotests/socket_scm_helper.c
@@ -60,7 +60,7 @@ static int send_fd(int fd, int fd_to_send)
 }
 
 /* Convert string to fd number. */
-static int get_fd_num(const char *fd_str)
+static int get_fd_num(const char *fd_str, bool silent)
 {
 int sock;
 char *err;
@@ -68,12 +68,16 @@ static int get_fd_num(const char *fd_str)
 errno = 0;
 sock = strtol(fd_str, , 10);
 if (errno) {
-fprintf(stderr, "Failed in strtol for socket fd, reason: %s\n",
-strerror(errno));
+if (!silent) {
+fprintf(stderr, "Failed in strtol for socket fd, reason: %s\n",
+strerror(errno));
+}
 return -1;
 }
 if (!*fd_str || *err || sock < 0) {
-fprintf(stderr, "bad numerical value for socket fd '%s'\n", fd_str);
+if (!silent) {
+fprintf(stderr, "bad numerical value for socket fd '%s'\n", 
fd_str);
+}
 return -1;
 }
 
@@ -104,18 +108,21 @@ int main(int argc, char **argv, char **envp)
 }
 
 
-sock = get_fd_num(argv[1]);
+sock = get_fd_num(argv[1], false);
 if (sock < 0) {
 return EXIT_FAILURE;
 }
 
-/* Now only open a file in readonly mode for test purpose. If more precise
-   control is needed, use python script in file operation, which is
-   supposed to fork and exec this program. */
-fd = open(argv[2], O_RDONLY);
+fd = get_fd_num(argv[2], true);
 if (fd < 0) {
-fprintf(stderr, "Failed to open file '%s'\n", argv[2]);
-return EXIT_FAILURE;
+/* Now only open a file in readonly mode for test purpose. If more
+   precise control is needed, use python script in file operation, 
which
+   is supposed to fork and exec this program. */
+fd = open(argv[2], O_RDONLY);
+if (fd < 0) {
+fprintf(stderr, "Failed to open file '%s'\n", argv[2]);
+return EXIT_FAILURE;
+}
 }
 
 ret = send_fd(sock, fd);
-- 
2.10.1




[Qemu-block] [PATCH v5 00/13] qapi: Allow blockdev-add for NBD

2016-10-25 Thread Max Reitz
This series adds blockdev-add support for NBD clients.

Patches 1, 2, 3, and 4 are minor patches with no functional relation to
this series, other than the fact that later patches will touch the code
they touch, too.

Patch 5 prepares the code for the addition of a new option prefix, which
is "server.".

Patch 6 makes the NBD client accept a SocketAddress under the "server"
option (or rather, a flattened SocketAddress QDict with its keys
prefixed by "server."). The old options "host", "port", and "path" are
supported as legacy options and translated to the respective
SocketAddress representation.

Patch 7 drops usage of "host", "port", and "path" outside of
nbd_has_filename_options_conflict(),
nbd_process_legacy_socket_options(), and nbd_refresh_filename(), making
those options nothing but legacy.

Patch 8, the goal of this series, is again not very complicated.

Patches 9, 10, 11, and 12 are required for the iotest added in patch 13.
It will invoke qemu-nbd, so patch 9 is required. Besides qemu-nbd, it
will launch an NBD server VM concurrently to the client VM, which is why
patch 10 is required. And finally, it will test whether we can add an
NBD BDS by passing it a file descriptor, which patch 11 is needed for
(so we use the socket_scm_helper to pass sockets to qemu).

During rebase of v5 I noticed that the order of keys in has changed when
querying a json: filename. Instead of just adjusting the reference
string, I decided to write a function for actually parsing such
filenames and comparing them against reference dicts. This is
implemented in patch 12.

Patch 13 then adds the iotest for NBD's blockdev-add interface.


***This series depends on qdict_crumple() as it has been merged by
   Markus to his qapi-next branch (i.e. without @recursive).***


v5:
- Patch 6:
  - "server" instead of "address"
  - Dropped comparison of key against "server" (or "address",
previously), because all options are flattened anyway and it looks
strange to assume otherwise here
  - Use strstart() instead of strncmp()
  - Reject use of the new server option and the legacy path/host/port
options at the same time
  - qdict_crumple() no longer has a @recursive parameter
  - The former QMP visitors are now called QObject visitors
- Patch 7: Only rebase conflicts because of %s/address/server/g
- Patch 8: Same
- Patch 12: As explained above, comparing json: filenames against
reference strings is not extremely clever. This new patch
adds a function to compare such filenames against reference
Python dicts (which is thus order-independent).
- Patch 13:
  - %s/address/server/g
  - Rebase conflict because of the blockdev-add parameter change
  - Use the new assert_json_filename_equal() function from patch 12
  - Delete the Unix socket file (if there is one) in tearDown()


git-backport-diff against v4:

Key:
[] : patches are identical
[] : number of functional differences between upstream/downstream patch
[down] : patch is downstream-only
The flags [FC] indicate (F)unctional and (C)ontextual differences, respectively

001/13:[] [--] 'block/nbd: Drop trailing "." in error messages'
002/13:[] [--] 'block/nbd: Reject port parameter without host'
003/13:[] [--] 'block/nbd: Default port in nbd_refresh_filename()'
004/13:[] [--] 'block/nbd: Use qdict_put()'
005/13:[] [--] 'block/nbd: Add nbd_has_filename_options_conflict()'
006/13:[0093] [FC] 'block/nbd: Accept SocketAddress'
007/13:[0020] [FC] 'block/nbd: Use SocketAddress options'
008/13:[0004] [FC] 'qapi: Allow blockdev-add for NBD'
009/13:[] [--] 'iotests.py: Add qemu_nbd function'
010/13:[] [--] 'iotests.py: Allow concurrent qemu instances'
011/13:[] [--] 'socket_scm_helper: Accept fd directly'
012/13:[down] 'iotests: Add assert_json_filename_equal() method'
013/13:[0081] [FC] 'iotests: Add test for NBD's blockdev-add interface'


Max Reitz (13):
  block/nbd: Drop trailing "." in error messages
  block/nbd: Reject port parameter without host
  block/nbd: Default port in nbd_refresh_filename()
  block/nbd: Use qdict_put()
  block/nbd: Add nbd_has_filename_options_conflict()
  block/nbd: Accept SocketAddress
  block/nbd: Use SocketAddress options
  qapi: Allow blockdev-add for NBD
  iotests.py: Add qemu_nbd function
  iotests.py: Allow concurrent qemu instances
  socket_scm_helper: Accept fd directly
  iotests: Add assert_json_filename_equal() method
  iotests: Add test for NBD's blockdev-add interface

 block/nbd.c| 234 +
 qapi/block-core.json   |  25 +++-
 tests/qemu-iotests/051.out |   4 +-
 tests/qemu-iotests/051.pc.out  |   4 +-
 tests/qemu-iotests/147 | 196 +++
 tests/qemu-iotests/147.out |   5 +
 tests/qemu-iotests/group   |   1 +
 tests/qemu-iotests/iotests.py  |  34 -
 tests/qemu-iotests/socket_scm_helper.c |  29 ++--
 9 

[Qemu-block] [PATCH v5 05/13] block/nbd: Add nbd_has_filename_options_conflict()

2016-10-25 Thread Max Reitz
Right now, we have four possible options that conflict with specifying
an NBD filename, and a future patch will add another one ("address").
This future option is a nested QDict that is flattened at this point,
requiring us to test each option whether its key has an "address."
prefix. Therefore, we will then need to iterate through all options
(including the "export" option which was not covered so far).

Adding this iteration logic now will simplify adding the new option
later. A nice side effect is that the user will not receive a long list
of five options which are not supposed to be specified with a filename,
but we can actually print the problematic option.

Reviewed-by: Kevin Wolf 
Signed-off-by: Max Reitz 
---
 block/nbd.c | 26 --
 1 file changed, 20 insertions(+), 6 deletions(-)

diff --git a/block/nbd.c b/block/nbd.c
index c539fb5..cdab20f 100644
--- a/block/nbd.c
+++ b/block/nbd.c
@@ -123,6 +123,25 @@ out:
 return ret;
 }
 
+static bool nbd_has_filename_options_conflict(QDict *options, Error **errp)
+{
+const QDictEntry *e;
+
+for (e = qdict_first(options); e; e = qdict_next(options, e)) {
+if (!strcmp(e->key, "host") ||
+!strcmp(e->key, "port") ||
+!strcmp(e->key, "path") ||
+!strcmp(e->key, "export"))
+{
+error_setg(errp, "Option '%s' cannot be used with a file name",
+   e->key);
+return true;
+}
+}
+
+return false;
+}
+
 static void nbd_parse_filename(const char *filename, QDict *options,
Error **errp)
 {
@@ -131,12 +150,7 @@ static void nbd_parse_filename(const char *filename, QDict 
*options,
 const char *host_spec;
 const char *unixpath;
 
-if (qdict_haskey(options, "host")
-|| qdict_haskey(options, "port")
-|| qdict_haskey(options, "path"))
-{
-error_setg(errp, "host/port/path and a file name may not be specified "
- "at the same time");
+if (nbd_has_filename_options_conflict(options, errp)) {
 return;
 }
 
-- 
2.10.1




[Qemu-block] [PATCH v5 01/13] block/nbd: Drop trailing "." in error messages

2016-10-25 Thread Max Reitz
Reviewed-by: Eric Blake 
Reviewed-by: Kevin Wolf 
Signed-off-by: Max Reitz 
---
 block/nbd.c   | 4 ++--
 tests/qemu-iotests/051.out| 4 ++--
 tests/qemu-iotests/051.pc.out | 4 ++--
 3 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/block/nbd.c b/block/nbd.c
index 6bc06d6..ce7c14f 100644
--- a/block/nbd.c
+++ b/block/nbd.c
@@ -200,9 +200,9 @@ static SocketAddress *nbd_config(BDRVNBDState *s, QemuOpts 
*opts, Error **errp)
 
 if (!s->path == !s->host) {
 if (s->path) {
-error_setg(errp, "path and host may not be used at the same 
time.");
+error_setg(errp, "path and host may not be used at the same time");
 } else {
-error_setg(errp, "one of path and host must be specified.");
+error_setg(errp, "one of path and host must be specified");
 }
 return NULL;
 }
diff --git a/tests/qemu-iotests/051.out b/tests/qemu-iotests/051.out
index 408d613..9e584fe 100644
--- a/tests/qemu-iotests/051.out
+++ b/tests/qemu-iotests/051.out
@@ -222,7 +222,7 @@ Testing: -drive driver=file
 QEMU_PROG: -drive driver=file: The 'file' block driver requires a file name
 
 Testing: -drive driver=nbd
-QEMU_PROG: -drive driver=nbd: one of path and host must be specified.
+QEMU_PROG: -drive driver=nbd: one of path and host must be specified
 
 Testing: -drive driver=raw
 QEMU_PROG: -drive driver=raw: Can't use 'raw' as a block driver for the 
protocol level
@@ -231,7 +231,7 @@ Testing: -drive file.driver=file
 QEMU_PROG: -drive file.driver=file: The 'file' block driver requires a file 
name
 
 Testing: -drive file.driver=nbd
-QEMU_PROG: -drive file.driver=nbd: one of path and host must be specified.
+QEMU_PROG: -drive file.driver=nbd: one of path and host must be specified
 
 Testing: -drive file.driver=raw
 QEMU_PROG: -drive file.driver=raw: Can't use 'raw' as a block driver for the 
protocol level
diff --git a/tests/qemu-iotests/051.pc.out b/tests/qemu-iotests/051.pc.out
index ec6d222..6395a30 100644
--- a/tests/qemu-iotests/051.pc.out
+++ b/tests/qemu-iotests/051.pc.out
@@ -316,7 +316,7 @@ Testing: -drive driver=file
 QEMU_PROG: -drive driver=file: The 'file' block driver requires a file name
 
 Testing: -drive driver=nbd
-QEMU_PROG: -drive driver=nbd: one of path and host must be specified.
+QEMU_PROG: -drive driver=nbd: one of path and host must be specified
 
 Testing: -drive driver=raw
 QEMU_PROG: -drive driver=raw: Can't use 'raw' as a block driver for the 
protocol level
@@ -325,7 +325,7 @@ Testing: -drive file.driver=file
 QEMU_PROG: -drive file.driver=file: The 'file' block driver requires a file 
name
 
 Testing: -drive file.driver=nbd
-QEMU_PROG: -drive file.driver=nbd: one of path and host must be specified.
+QEMU_PROG: -drive file.driver=nbd: one of path and host must be specified
 
 Testing: -drive file.driver=raw
 QEMU_PROG: -drive file.driver=raw: Can't use 'raw' as a block driver for the 
protocol level
-- 
2.10.1




[Qemu-block] [PATCH v5 03/13] block/nbd: Default port in nbd_refresh_filename()

2016-10-25 Thread Max Reitz
Instead of not emitting the port in nbd_refresh_filename(), just set it
to the default if the user did not specify it. This makes the logic a
bit simpler.

Reviewed-by: Eric Blake 
Reviewed-by: Kevin Wolf 
Signed-off-by: Max Reitz 
---
 block/nbd.c | 19 ++-
 1 file changed, 6 insertions(+), 13 deletions(-)

diff --git a/block/nbd.c b/block/nbd.c
index eaca33c..c77a969 100644
--- a/block/nbd.c
+++ b/block/nbd.c
@@ -444,6 +444,7 @@ static void nbd_refresh_filename(BlockDriverState *bs, 
QDict *options)
 {
 BDRVNBDState *s = bs->opaque;
 QDict *opts = qdict_new();
+const char *port = s->port ?: stringify(NBD_DEFAULT_PORT);
 
 qdict_put_obj(opts, "driver", QOBJECT(qstring_from_str("nbd")));
 
@@ -453,27 +454,19 @@ static void nbd_refresh_filename(BlockDriverState *bs, 
QDict *options)
 } else if (s->path && !s->export) {
 snprintf(bs->exact_filename, sizeof(bs->exact_filename),
  "nbd+unix://?socket=%s", s->path);
-} else if (!s->path && s->export && s->port) {
+} else if (!s->path && s->export) {
 snprintf(bs->exact_filename, sizeof(bs->exact_filename),
- "nbd://%s:%s/%s", s->host, s->port, s->export);
-} else if (!s->path && s->export && !s->port) {
+ "nbd://%s:%s/%s", s->host, port, s->export);
+} else if (!s->path && !s->export) {
 snprintf(bs->exact_filename, sizeof(bs->exact_filename),
- "nbd://%s/%s", s->host, s->export);
-} else if (!s->path && !s->export && s->port) {
-snprintf(bs->exact_filename, sizeof(bs->exact_filename),
- "nbd://%s:%s", s->host, s->port);
-} else if (!s->path && !s->export && !s->port) {
-snprintf(bs->exact_filename, sizeof(bs->exact_filename),
- "nbd://%s", s->host);
+ "nbd://%s:%s", s->host, port);
 }
 
 if (s->path) {
 qdict_put_obj(opts, "path", QOBJECT(qstring_from_str(s->path)));
-} else if (s->port) {
-qdict_put_obj(opts, "host", QOBJECT(qstring_from_str(s->host)));
-qdict_put_obj(opts, "port", QOBJECT(qstring_from_str(s->port)));
 } else {
 qdict_put_obj(opts, "host", QOBJECT(qstring_from_str(s->host)));
+qdict_put_obj(opts, "port", QOBJECT(qstring_from_str(port)));
 }
 if (s->export) {
 qdict_put_obj(opts, "export", QOBJECT(qstring_from_str(s->export)));
-- 
2.10.1




[Qemu-block] [PATCH v5 04/13] block/nbd: Use qdict_put()

2016-10-25 Thread Max Reitz
Instead of inlining this nice macro (i.e. resorting to
qdict_put_obj(..., QOBJECT(...))), use it.

Reviewed-by: Eric Blake 
Reviewed-by: Kevin Wolf 
Signed-off-by: Max Reitz 
---
 block/nbd.c | 13 ++---
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/block/nbd.c b/block/nbd.c
index c77a969..c539fb5 100644
--- a/block/nbd.c
+++ b/block/nbd.c
@@ -446,7 +446,7 @@ static void nbd_refresh_filename(BlockDriverState *bs, 
QDict *options)
 QDict *opts = qdict_new();
 const char *port = s->port ?: stringify(NBD_DEFAULT_PORT);
 
-qdict_put_obj(opts, "driver", QOBJECT(qstring_from_str("nbd")));
+qdict_put(opts, "driver", qstring_from_str("nbd"));
 
 if (s->path && s->export) {
 snprintf(bs->exact_filename, sizeof(bs->exact_filename),
@@ -463,17 +463,16 @@ static void nbd_refresh_filename(BlockDriverState *bs, 
QDict *options)
 }
 
 if (s->path) {
-qdict_put_obj(opts, "path", QOBJECT(qstring_from_str(s->path)));
+qdict_put(opts, "path", qstring_from_str(s->path));
 } else {
-qdict_put_obj(opts, "host", QOBJECT(qstring_from_str(s->host)));
-qdict_put_obj(opts, "port", QOBJECT(qstring_from_str(port)));
+qdict_put(opts, "host", qstring_from_str(s->host));
+qdict_put(opts, "port", qstring_from_str(port));
 }
 if (s->export) {
-qdict_put_obj(opts, "export", QOBJECT(qstring_from_str(s->export)));
+qdict_put(opts, "export", qstring_from_str(s->export));
 }
 if (s->tlscredsid) {
-qdict_put_obj(opts, "tls-creds",
-  QOBJECT(qstring_from_str(s->tlscredsid)));
+qdict_put(opts, "tls-creds", qstring_from_str(s->tlscredsid));
 }
 
 bs->full_open_options = opts;
-- 
2.10.1




[Qemu-block] [PATCH v4 4/5] block/ssh: Use InetSocketAddress options

2016-10-25 Thread Ashijeet Acharya
Drop the use of legacy options in favour of the InetSocketAddress
options.

Signed-off-by: Ashijeet Acharya 
Reviewed-by: Max Reitz 
Reviewed-by: Kevin Wolf 
---
 block/ssh.c | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/block/ssh.c b/block/ssh.c
index d814006..6e6966d 100644
--- a/block/ssh.c
+++ b/block/ssh.c
@@ -197,6 +197,7 @@ static int parse_uri(const char *filename, QDict *options, 
Error **errp)
 {
 URI *uri = NULL;
 QueryParams *qp;
+char *port_str;
 int i;
 
 uri = uri_parse(filename);
@@ -229,11 +230,11 @@ static int parse_uri(const char *filename, QDict 
*options, Error **errp)
 qdict_put(options, "user", qstring_from_str(uri->user));
 }
 
-qdict_put(options, "host", qstring_from_str(uri->server));
+qdict_put(options, "server.host", qstring_from_str(uri->server));
 
-if (uri->port) {
-qdict_put(options, "port", qint_from_int(uri->port));
-}
+port_str = g_strdup_printf("%d", uri->port ?: 22);
+qdict_put(options, "server.port", qstring_from_str(port_str));
+g_free(port_str);
 
 qdict_put(options, "path", qstring_from_str(uri->path));
 
-- 
2.6.2




[Qemu-block] [PATCH v4 5/5] qapi: allow blockdev-add for ssh

2016-10-25 Thread Ashijeet Acharya
Introduce new object 'BlockdevOptionsSsh' in qapi/block-core.json to
support blockdev-add for SSH network protocol driver. Use only 'struct
InetSocketAddress' since SSH only supports connection over TCP.

Signed-off-by: Ashijeet Acharya 
Reviewed-by: Kevin Wolf 
---
 qapi/block-core.json | 26 --
 1 file changed, 24 insertions(+), 2 deletions(-)

diff --git a/qapi/block-core.json b/qapi/block-core.json
index 9d797b8..689dc0a 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -1716,7 +1716,8 @@
 'dmg', 'file', 'ftp', 'ftps', 'gluster', 'host_cdrom',
 'host_device', 'http', 'https', 'luks', 'null-aio', 'null-co',
 'parallels', 'qcow', 'qcow2', 'qed', 'quorum', 'raw',
-   'replication', 'tftp', 'vdi', 'vhdx', 'vmdk', 'vpc', 'vvfat' ] }
+'replication', 'ssh', 'tftp', 'vdi', 'vhdx', 'vmdk', 'vpc',
+'vvfat' ] }
 
 ##
 # @BlockdevOptionsFile
@@ -1953,6 +1954,27 @@
 '*vport': 'int',
 '*segment': 'str' } }
 
+##
+# @BlockdevOptionsSsh
+#
+# @server:  host address
+#
+# @path:path to the image on the host
+#
+# @user:#optional user as which to connect, defaults to current
+#   local user name
+#
+# @host_key_check   #optional defines how and what to check the host
+#   key against, defaults to "yes"
+#
+# Since 2.8
+##
+{ 'struct': 'BlockdevOptionsSsh',
+  'data': { 'server': 'InetSocketAddress',
+'path': 'str',
+'*user': 'str',
+'*host_key_check': 'str' } }
+
 
 ##
 # @BlkdebugEvent
@@ -2281,7 +2303,7 @@
 # TODO rbd: Wait for structured options
   'replication':'BlockdevOptionsReplication',
 # TODO sheepdog: Wait for structured options
-# TODO ssh: Should take InetSocketAddress for 'host'?
+  'ssh':'BlockdevOptionsSsh',
   'tftp':   'BlockdevOptionsCurl',
   'vdi':'BlockdevOptionsGenericFormat',
   'vhdx':   'BlockdevOptionsGenericFormat',
-- 
2.6.2




[Qemu-block] [PATCH v4 1/5] block/ssh: Add ssh_has_filename_options_conflict()

2016-10-25 Thread Ashijeet Acharya
We have 5 options plus ("server") option which is added in the next
patch that conflict with specifying a SSH filename. We need to iterate
over all the options to check whether its key has an "server." prefix.

This iteration will help us adding the new option "server" easily.

Signed-off-by: Ashijeet Acharya 
Reviewed-by: Max Reitz 
Reviewed-by: Kevin Wolf 
---
 block/ssh.c | 27 +--
 1 file changed, 21 insertions(+), 6 deletions(-)

diff --git a/block/ssh.c b/block/ssh.c
index 5ce12b6..75cb7bc 100644
--- a/block/ssh.c
+++ b/block/ssh.c
@@ -254,15 +254,30 @@ static int parse_uri(const char *filename, QDict 
*options, Error **errp)
 return -EINVAL;
 }
 
+static bool ssh_has_filename_options_conflict(QDict *options, Error **errp)
+{
+const QDictEntry *qe;
+
+for (qe = qdict_first(options); qe; qe = qdict_next(options, qe)) {
+if (!strcmp(qe->key, "host") ||
+!strcmp(qe->key, "port") ||
+!strcmp(qe->key, "path") ||
+!strcmp(qe->key, "user") ||
+!strcmp(qe->key, "host_key_check"))
+{
+error_setg(errp, "Option '%s' cannot be used with a file name",
+   qe->key);
+return true;
+}
+}
+
+return false;
+}
+
 static void ssh_parse_filename(const char *filename, QDict *options,
Error **errp)
 {
-if (qdict_haskey(options, "user") ||
-qdict_haskey(options, "host") ||
-qdict_haskey(options, "port") ||
-qdict_haskey(options, "path") ||
-qdict_haskey(options, "host_key_check")) {
-error_setg(errp, "user, host, port, path, host_key_check cannot be 
used at the same time as a file option");
+if (ssh_has_filename_options_conflict(options, errp)) {
 return;
 }
 
-- 
2.6.2




[Qemu-block] [PATCH v4 0/5] Allow blockdev-add for SSH

2016-10-25 Thread Ashijeet Acharya
Previously posted series patches:
v3: https://lists.gnu.org/archive/html/qemu-devel/2016-10/msg03781.html
v2: https://lists.gnu.org/archive/html/qemu-devel/2016-10/msg03403.html
v1: http://lists.nongnu.org/archive/html/qemu-devel/2016-10/msg02137.html

This series adds blockdev-add support for SSH block driver.

Patch 1 prepares the code for the addition of a new option prefix,
which is "server.". This is accomplished by adding a
ssh_has_filename_options_conflict() function which helps to iterate
over the various options and check for conflict.

Patch 2 makes inet_connect_saddr() in util/qemu-sockets.c public.

Patch 3 first adds InetSocketAddress compatibility to SSH block driver
and then makes it accept a InetSocketAddress under the "server" option.
The old options "host" and "port" are supported as legacy options and
then translated to the respective InetSocketAddress representation.

Patch 4 drops the usage of "host" and "port" outside of
ssh_has_filename_options_conflict() and
ssh_process_legacy_socket_options() functions in order to make them
legacy options completely.

Patch 5 helps to allow blockdev-add support for the SSH block driver
by making the SSH option available.


*** This series depends on the following patch: ***
"qdict: implement a qdict_crumple method for un-flattening a dict"
from Daniel's "QAPI/QOM work for non-scalar object properties"
series.

Changes in v4:
- remove hostport from BDRVSSHState and use s->inet->host in warning message

Changes in v3:
- reorder patch 2 and 3 from v2 (Max)
- fix coding-style issue in patch 2 (Max)
- drop testing to check for "server" as an object itself (Max)
- fix strstart() bug (Max)
- fix a segfault bug when host gets set to NULL (Max)
- revert back to using qobject_input_visitor_new() (Max & Kevin)
- use qemu_strtol() instead of atoi() for better error handling (Kevin)
- make @user an optional argument in qapi/block-core.json (Max)
- update documentation for BlockdevOptionsSsh (Max)

Changes in v2:
- Use strstart() instead of strcmp() (Kevin)
- Use qobject_input_visitor_new_autocast() instead of
  qmp_input_visitor_new() and change header files accordingly (Kevin)
- Use inet_connect_saddr() instead of inet_connect() (Kevin)
- Drop the contruction of : string (Kevin)
- Fix the bug in ssh_process_legacy_socket_options()




Ashijeet Acharya (5):
  block/ssh: Add ssh_has_filename_options_conflict()
  util/qemu-sockets: Make inet_connect_saddr() public
  block/ssh: Add InetSocketAddress and accept it
  block/ssh: Use InetSocketAddress options
  qapi: allow blockdev-add for ssh

 block/ssh.c| 132 +++--
 include/qemu/sockets.h |   2 +
 qapi/block-core.json   |  26 +-
 util/qemu-sockets.c|   4 +-
 4 files changed, 135 insertions(+), 29 deletions(-)

-- 
2.6.2




[Qemu-block] [PATCH v4 2/5] util/qemu-sockets: Make inet_connect_saddr() public

2016-10-25 Thread Ashijeet Acharya
Make inet_connect_saddr() in util/qemu-sockets.c public in order to be
able to use it with InetSocketAddress sockets outside of
util/qemu-sockets.c independently.

Signed-off-by: Ashijeet Acharya 
Reviewed-by: Kevin Wolf 
---
 include/qemu/sockets.h | 2 ++
 util/qemu-sockets.c| 4 ++--
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/include/qemu/sockets.h b/include/qemu/sockets.h
index 9eb2470..5589e68 100644
--- a/include/qemu/sockets.h
+++ b/include/qemu/sockets.h
@@ -34,6 +34,8 @@ typedef void NonBlockingConnectHandler(int fd, Error *err, 
void *opaque);
 
 InetSocketAddress *inet_parse(const char *str, Error **errp);
 int inet_connect(const char *str, Error **errp);
+int inet_connect_saddr(InetSocketAddress *saddr, Error **errp,
+   NonBlockingConnectHandler *callback, void *opaque);
 
 NetworkAddressFamily inet_netfamily(int family);
 
diff --git a/util/qemu-sockets.c b/util/qemu-sockets.c
index 4cef549..31f7fc6 100644
--- a/util/qemu-sockets.c
+++ b/util/qemu-sockets.c
@@ -412,8 +412,8 @@ static struct addrinfo 
*inet_parse_connect_saddr(InetSocketAddress *saddr,
  * function succeeds, callback will be called when the connection
  * completes, with the file descriptor on success, or -1 on error.
  */
-static int inet_connect_saddr(InetSocketAddress *saddr, Error **errp,
-  NonBlockingConnectHandler *callback, void 
*opaque)
+int inet_connect_saddr(InetSocketAddress *saddr, Error **errp,
+   NonBlockingConnectHandler *callback, void *opaque)
 {
 Error *local_err = NULL;
 struct addrinfo *res, *e;
-- 
2.6.2




[Qemu-block] ping Re: [PATCH 00/18] Dirty bitmaps postcopy migration

2016-10-25 Thread Vladimir Sementsov-Ogievskiy

ping

For now there are some notes mostly about accessory patches. What about 
migration itself? Is it ok? Has it a chance of being merged one day?


16.08.2016 13:25, Vladimir Sementsov-Ogievskiy wrote:

v2:
some bugs fixed, iotests a bit changed and merged into one test.
based on block-next (https://github.com/XanClic/qemu/commits/block-next)
clone: tag postcopy-v2 from https://src.openvz.org/scm/~vsementsov/qemu.git
online: 
https://src.openvz.org/users/vsementsov/repos/qemu/browse?at=refs%2Ftags%2Fpostcopy-v2

v1:

These series are derived from my 'Dirty bitmaps migration' series. The
core idea is switch to postcopy migration and drop usage of meta
bitmaps.

These patches provide dirty bitmap postcopy migration feature. Only
named dirty bitmaps are to be migrated. Migration may be enabled using
migration capabilities.

The overall method (thanks to John Snow):

1. migrate bitmaps meta data in .save_live_setup
- create/find related bitmaps on target
- disable them
- create successors (anonimous children) only for enabled migrated
  bitmaps
2. do nothing in precopy stage
3. just before target vm start: enable successors, created in (1)
4. migrate bitmap data
5. reclaime bitmaps (merge successors to their parents)
6. enable bitmaps (only bitmaps, which was enabled in source)


Some patches are unchnaged from (v7) of 'Dirty bitmaps migration'
(DBMv7). I've left Reviewed-by's for them, if you don't like it, say me
and I'll drop them in the following version.

So, relatively to last DBMv7:

01-04: new patches, splitting common postcopy migration out of ram
postcopy migration
05: equal to DBMv7.05
06: new
07: equal to DBMv7.06
08: new
09: equal to DBMv7.07
10: new
11: derived from DBMv7.08, see below
12-15: equal to DBMv7.09-12
16: derived from DBMv7.13
- switch from fifo to socket, as postcopy don't work with fifo
  for now
- change parameters: size, granularity, regions
- add time.sleep, to wait for postcopy migration phase (bad
  temporary solution.
- drop Reviewed-by
17: new

11: the core patch of the series, it is derived from
[DBMv7.08: migration: add migration_block-dirty-bitmap.c]
There are a lot of changes related to switching from precopy to
postcopy, but functions related to migration stream itself
(structs, send/load sequences) are mostly unchnaged.

So, changes, to switch from precopy to postcopy:
- removed all staff related to meta bitmaps and dirty phase!!!
- add dirty_bitmap_mig_enable_successors, and call it before
  target vm start in loadvm_postcopy_handle_run
- add enabled_bitmaps list of bitmaps for
  dirty_bitmap_mig_enable_successors

- enabled flag is send with start bitmap chunk instead of
  completion chunk
- sectors_per_chunk is calculated directly from CHUNK_SIZE, not
  using meta bitmap granularity

- dirty_bitmap_save_iterate: remove dirty_phase, move bulk_phase
  to postcopy stage
- dirty_bitmap_save_pending: remove dirty phase related pending,
  switch pending to non-postcopyable
- dirty_bitmap_load_start: get enabled flag and prepare
  successors for enabled bitmaps, also add them to
  enabled_bitmaps list
- dirty_bitmap_load_complete: for enabled bitmaps: merge them
  with successors and enable

- savevm handlers:
  * remove separate savevm_dirty_bitmap_live_iterate_handlers state
(it was bad idea, any way), and move its save_live_iterate to
savevm_dirty_bitmap_handlers
  * add is_active_iterate savevm handler, which allows iterations
only in postcopy stage (after stopping source vm)
  * add has_postcopy savevm handler. (ofcourse, just returning true)
  * use save_live_complete_postcopy instead of
save_live_complete_precopy

Other changes:
- some debug output changed
- remove MIN_LIVE_SIZE, is_live_iterative and related staff (it
  was needed to omit iterations if bitmap data is small, possibly
  this should be reimplemented)

Vladimir Sementsov-Ogievskiy (18):
   migration: add has_postcopy savevm handler
   migration: fix ram_save_pending
   migration: split common postcopy out of ram postcopy
   migration: introduce postcopy-only pending
   block: add bdrv_next_dirty_bitmap()
   block: add bdrv_dirty_bitmap_enable_successor()
   qapi: add dirty-bitmaps migration capability
   block/dirty-bitmap: add bdrv_dirty_bitmap_release_successor
   migration: include migrate_dirty_bitmaps in migrate_postcopy
   migration/qemu-file: add qemu_put_counted_string()
   migration: add is_active_iterate handler
   migration: add postcopy migration of dirty bitmaps
   iotests: maintain several vms in test
   iotests: add add_incoming_migration to VM class
   qapi: add md5 

Re: [Qemu-block] [PATCH 4/4] block: Cater to iscsi with non-power-of-2 discard

2016-10-25 Thread Peter Lieven

Am 25.10.2016 um 14:19 schrieb Paolo Bonzini:


On 25/10/2016 14:12, Peter Lieven wrote:

Am 25.10.2016 um 14:09 schrieb Paolo Bonzini:

On 25/10/2016 14:03, Peter Lieven wrote:

Am 01.08.2016 um 11:22 schrieb Paolo Bonzini:

On 28/07/2016 04:39, Eric Blake wrote:

On 07/27/2016 01:25 AM, Fam Zheng wrote:

On Thu, 07/21 13:34, Eric Blake wrote:

+max_write_zeroes = max_write_zeroes / alignment * alignment;

Not using QEMU_ALIGN_DOWN despite patch 3?

Looks like I missed that on the rebase. Can fix if there is a
reason for
a respin.

Since Stefan acked this, I'm applying the patch and fixing it to use
QEMU_ALIGN_DOWN.

Paolo

Hi,

I came across a sort of regression we introduced with the dropping of
head and tail
of an unaligned discard.

The discard alignment that we use to trim the discard request is just a
hint.

I learned on the equallogics that a page (which is this unusal 15MB
large) is
unallocated even if the discard happens in pieces. E.g. in slices of 1MB
requests.

  From my point of view I would like to restore the old behaviour.
What do
you think?

The right logic should be the one in Linux: if splitting a request, and
the next starting sector would be misaligned, stop the discard at the
previous aligned sector.  Otherwise leave everything alone.

Just to clarify. I mean the guest would send incremental 1MB discards
we would now drop all of them if the alignment is 15MB. Previously,
we have sent all of the 1MB requests.

Yes.  In this case there would be no need at all to split the request,
so each request should be passed through.

But hey, that firmware is seriously weird. :)


Yes, so you would not change the new implementation?

Even if the discard is e.g. 1MB it could theretically be that internally
the device has a finer granularity. Its an optimal discard alignment
not the minimum required discard size. I think thats a difference.
It does not say I can't handle smaller discards.

Peter




Re: [Qemu-block] [Qemu-devel] [PATCH v14 02/21] qdict: implement a qdict_crumple method for un-flattening a dict

2016-10-25 Thread Markus Armbruster
"Daniel P. Berrange"  writes:

> On Tue, Oct 25, 2016 at 12:03:33PM +0200, Markus Armbruster wrote:
>> Max Reitz  writes:
>> 
>> > On 21.10.2016 11:58, Markus Armbruster wrote:
>> >> "Daniel P. Berrange"  writes:
>> >> 
>> >>> On Tue, Oct 18, 2016 at 04:32:13PM +0200, Markus Armbruster wrote:
>>  "Daniel P. Berrange"  writes:
>> 
>> > The qdict_flatten() method will take a dict whose elements are
>> > further nested dicts/lists and flatten them by concatenating
>> > keys.
>> >
>> > The qdict_crumple() method aims to do the reverse, taking a flat
>> > qdict, and turning it into a set of nested dicts/lists. It will
>> > apply nesting based on the key name, with a '.' indicating a
>> > new level in the hierarchy. If the keys in the nested structure
>> > are all numeric, it will create a list, otherwise it will create
>> > a dict.
>> >
>> > If the keys are a mixture of numeric and non-numeric, or the
>> > numeric keys are not in strictly ascending order, an error will
>> > be reported.
>> >
>> > As an example, a flat dict containing
>> >
>> >  {
>> >'foo.0.bar': 'one',
>> >'foo.0.wizz': '1',
>> >'foo.1.bar': 'two',
>> >'foo.1.wizz': '2'
>> >  }
>> >
>> > will get turned into a dict with one element 'foo' whose
>> > value is a list. The list elements will each in turn be
>> > dicts.
>> >
>> >  {
>> >'foo': [
>> >  { 'bar': 'one', 'wizz': '1' },
>> >  { 'bar': 'two', 'wizz': '2' }
>> >],
>> >  }
>> >
>> > If the key is intended to contain a literal '.', then it must
>> > be escaped as '..'. ie a flat dict
>> >
>> >   {
>> >  'foo..bar': 'wizz',
>> >  'bar.foo..bar': 'eek',
>> >  'bar.hello': 'world'
>> >   }
>> >
>> > Will end up as
>> >
>> >   {
>> >  'foo.bar': 'wizz',
>> >  'bar': {
>> > 'foo.bar': 'eek',
>> > 'hello': 'world'
>> >  }
>> >   }
>> >
>> > The intent of this function is that it allows a set of QemuOpts
>> > to be turned into a nested data structure that mirrors the nesting
>> > used when the same object is defined over QMP.
>> >
>> > Reviewed-by: Eric Blake 
>> > Reviewed-by: Kevin Wolf 
>> > Reviewed-by: Marc-André Lureau 
>> > Signed-off-by: Daniel P. Berrange 
>> > ---
>> >  include/qapi/qmp/qdict.h |   1 +
>> >  qobject/qdict.c  | 289 
>> > +++
>> >  tests/check-qdict.c  | 261 
>> > ++
>> >  3 files changed, 551 insertions(+)
>> >
>> > diff --git a/include/qapi/qmp/qdict.h b/include/qapi/qmp/qdict.h
>> > index 71b8eb0..e0d24e1 100644
>> > --- a/include/qapi/qmp/qdict.h
>> > +++ b/include/qapi/qmp/qdict.h
>> > @@ -73,6 +73,7 @@ void qdict_flatten(QDict *qdict);
>> >  void qdict_extract_subqdict(QDict *src, QDict **dst, const char 
>> > *start);
>> >  void qdict_array_split(QDict *src, QList **dst);
>> >  int qdict_array_entries(QDict *src, const char *subqdict);
>> > +QObject *qdict_crumple(const QDict *src, bool recursive, Error 
>> > **errp);
>> >  
>> >  void qdict_join(QDict *dest, QDict *src, bool overwrite);
>> >  
>> > diff --git a/qobject/qdict.c b/qobject/qdict.c
>> > index 60f158c..c38e90e 100644
>> > --- a/qobject/qdict.c
>> > +++ b/qobject/qdict.c
>>  [...]
>> > +/**
>> > + * qdict_crumple:
>> > + * @src: the original flat dictionary (only scalar values) to crumple
>> > + * @recursive: true to recursively crumple nested dictionaries
>> 
>>  Is recursive=false used outside tests in this series?
>> >>>
>> >>> No, its not used.
>> >>>
>> >>> It was suggested in a way earlier version by Max, but not sure if his
>> >>> code uses it or not.
>> >> 
>> >> In general, I prefer features to be added right before they're used, and
>> >> I really dislike adding features "just in case".  YAGNI.
>> >> 
>> >> Max, do you actually need this one?  If yes, please explain your use
>> >> case.
>> >
>> > As far as I can tell from a quick glance, I made the point for v1 that
>> > qdict_crumple() could be simplified by using qdict_array_split() and
>> > qdict_array_entries().
>> >
>> > Dan then (correctly) said that using these functions would worsen
>> > runtime performance of qdict_crumple() and that instead we can replace
>> > qdict_array_split() by qdict_crumple(). However, for that to work, we
>> > need to be able to make qdict_crumple() non-recursive (because
>> > qdict_array_split() is non-recursive).
>> >
>> > Dan also said that in the long run we want to keep the tree structure in
>> 

Re: [Qemu-block] [PATCH 4/4] block: Cater to iscsi with non-power-of-2 discard

2016-10-25 Thread Paolo Bonzini


On 25/10/2016 14:12, Peter Lieven wrote:
> Am 25.10.2016 um 14:09 schrieb Paolo Bonzini:
>>
>> On 25/10/2016 14:03, Peter Lieven wrote:
>>> Am 01.08.2016 um 11:22 schrieb Paolo Bonzini:
 On 28/07/2016 04:39, Eric Blake wrote:
> On 07/27/2016 01:25 AM, Fam Zheng wrote:
>> On Thu, 07/21 13:34, Eric Blake wrote:
>>> +max_write_zeroes = max_write_zeroes / alignment * alignment;
>> Not using QEMU_ALIGN_DOWN despite patch 3?
> Looks like I missed that on the rebase. Can fix if there is a
> reason for
> a respin.
 Since Stefan acked this, I'm applying the patch and fixing it to use
 QEMU_ALIGN_DOWN.

 Paolo
>>> Hi,
>>>
>>> I came across a sort of regression we introduced with the dropping of
>>> head and tail
>>> of an unaligned discard.
>>>
>>> The discard alignment that we use to trim the discard request is just a
>>> hint.
>>>
>>> I learned on the equallogics that a page (which is this unusal 15MB
>>> large) is
>>> unallocated even if the discard happens in pieces. E.g. in slices of 1MB
>>> requests.
>>>
>>>  From my point of view I would like to restore the old behaviour.
>>> What do
>>> you think?
>> The right logic should be the one in Linux: if splitting a request, and
>> the next starting sector would be misaligned, stop the discard at the
>> previous aligned sector.  Otherwise leave everything alone.
> 
> Just to clarify. I mean the guest would send incremental 1MB discards
> we would now drop all of them if the alignment is 15MB. Previously,
> we have sent all of the 1MB requests.

Yes.  In this case there would be no need at all to split the request,
so each request should be passed through.

But hey, that firmware is seriously weird. :)

Paolo



Re: [Qemu-block] [PATCH 4/4] block: Cater to iscsi with non-power-of-2 discard

2016-10-25 Thread Peter Lieven

Am 25.10.2016 um 14:09 schrieb Paolo Bonzini:


On 25/10/2016 14:03, Peter Lieven wrote:

Am 01.08.2016 um 11:22 schrieb Paolo Bonzini:

On 28/07/2016 04:39, Eric Blake wrote:

On 07/27/2016 01:25 AM, Fam Zheng wrote:

On Thu, 07/21 13:34, Eric Blake wrote:

+max_write_zeroes = max_write_zeroes / alignment * alignment;

Not using QEMU_ALIGN_DOWN despite patch 3?

Looks like I missed that on the rebase. Can fix if there is a reason for
a respin.

Since Stefan acked this, I'm applying the patch and fixing it to use
QEMU_ALIGN_DOWN.

Paolo

Hi,

I came across a sort of regression we introduced with the dropping of
head and tail
of an unaligned discard.

The discard alignment that we use to trim the discard request is just a
hint.

I learned on the equallogics that a page (which is this unusal 15MB
large) is
unallocated even if the discard happens in pieces. E.g. in slices of 1MB
requests.

 From my point of view I would like to restore the old behaviour. What do
you think?

The right logic should be the one in Linux: if splitting a request, and
the next starting sector would be misaligned, stop the discard at the
previous aligned sector.  Otherwise leave everything alone.


Just to clarify. I mean the guest would send incremental 1MB discards
we would now drop all of them if the alignment is 15MB. Previously,
we have sent all of the 1MB requests.

Peter




Re: [Qemu-block] [PATCH 4/4] block: Cater to iscsi with non-power-of-2 discard

2016-10-25 Thread Paolo Bonzini


On 25/10/2016 14:03, Peter Lieven wrote:
> Am 01.08.2016 um 11:22 schrieb Paolo Bonzini:
>>
>> On 28/07/2016 04:39, Eric Blake wrote:
>>> On 07/27/2016 01:25 AM, Fam Zheng wrote:
 On Thu, 07/21 13:34, Eric Blake wrote:
> +max_write_zeroes = max_write_zeroes / alignment * alignment;
 Not using QEMU_ALIGN_DOWN despite patch 3?
>>> Looks like I missed that on the rebase. Can fix if there is a reason for
>>> a respin.
>> Since Stefan acked this, I'm applying the patch and fixing it to use
>> QEMU_ALIGN_DOWN.
>>
>> Paolo
> 
> Hi,
> 
> I came across a sort of regression we introduced with the dropping of
> head and tail
> of an unaligned discard.
> 
> The discard alignment that we use to trim the discard request is just a
> hint.
> 
> I learned on the equallogics that a page (which is this unusal 15MB
> large) is
> unallocated even if the discard happens in pieces. E.g. in slices of 1MB
> requests.
> 
> From my point of view I would like to restore the old behaviour. What do
> you think?

The right logic should be the one in Linux: if splitting a request, and
the next starting sector would be misaligned, stop the discard at the
previous aligned sector.  Otherwise leave everything alone.

Paolo



Re: [Qemu-block] [PATCH 4/4] block: Cater to iscsi with non-power-of-2 discard

2016-10-25 Thread Peter Lieven

Am 01.08.2016 um 11:22 schrieb Paolo Bonzini:


On 28/07/2016 04:39, Eric Blake wrote:

On 07/27/2016 01:25 AM, Fam Zheng wrote:

On Thu, 07/21 13:34, Eric Blake wrote:

+max_write_zeroes = max_write_zeroes / alignment * alignment;

Not using QEMU_ALIGN_DOWN despite patch 3?

Looks like I missed that on the rebase. Can fix if there is a reason for
a respin.

Since Stefan acked this, I'm applying the patch and fixing it to use
QEMU_ALIGN_DOWN.

Paolo


Hi,

I came across a sort of regression we introduced with the dropping of head and 
tail
of an unaligned discard.

The discard alignment that we use to trim the discard request is just a hint.

I learned on the equallogics that a page (which is this unusal 15MB large) is
unallocated even if the discard happens in pieces. E.g. in slices of 1MB 
requests.

From my point of view I would like to restore the old behaviour. What do you 
think?

Thanks,
Peter




Re: [Qemu-block] [PATCH 16/22] qmp: add persistent flag to block-dirty-bitmap-add

2016-10-25 Thread Vladimir Sementsov-Ogievskiy

24.10.2016 20:30, Max Reitz wrote:

On 24.10.2016 17:12, Vladimir Sementsov-Ogievskiy wrote:

10.10.2016 19:08, Max Reitz wrote:

On 30.09.2016 12:53, Vladimir Sementsov-Ogievskiy wrote:


[...]


+}
  out:
   aio_context_release(aio_context);
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 31f9990..2bf56cd 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -1235,10 +1235,15 @@
   # @granularity: #optional the bitmap granularity, default is 64k for
   #   block-dirty-bitmap-add
   #
+# @persistent: #optional the bitmap is persistent, i.e. it will be
saved to
+#  corresponding block device on it's close. Default is
false.
+#  For block-dirty-bitmap-add. (Since 2.8)

I'm not sure what the "For block-dirty-bitmap-add." is supposed to mean,
because this whole struct is for block-dirty-bitmap-add (and for
block-dirty-bitmap-add transactions, to be exact, but @persistent will
surely work there, too, won't it?).

Also, I'd say "will be saved to the corresponding block device image
file" instead of just "block device", because in my understanding a
block device and its image file are two separate things.

Hmm.. But 'its close' is block device close, not file close.

In my understanding, it is the file close.


  And we call
common bdrv_* function to save it, so we actually save it to device, and
then the device puzzles out, how to actually save it.

Well, OK, it depends on what you mean by "block device". There are many
things we call a "block device", but nowadays I think it mostly refers
to either a guest block device or a BlockBackend (and as of lately,
we're more and more trying to hide the BlockBackend from the user, so
you could argue that it's only the guest device now).

The bdrv_* functions operate on block layer BDS nodes, and I don't think
we call them "block devices" (at least not any more).

In any case, I think for users the term "block device" refers to either
the device presented to the guest or to all of the block layer stuff
that's underneath, and it's not quite clear how you could save a bitmap
to that, or what it's supposed to mean to "close" a block device (you
can remove it, you can destroy it, you can delete it, but "close" it?).

But saying that it will be saved to the image file that is attached to
the block device will make it absolutely clear what we mean.

So what you have called a "device" here is neither what I'd call a
device (I'd call it a "node" or "BDS"), nor what I think users would
call a device. Also, it's not the BDS that puzzles out how to save it
but some block driver.



Ok, thank you.


+#
   # Since 2.4
   ##
   { 'struct': 'BlockDirtyBitmapAdd',
-  'data': { 'node': 'str', 'name': 'str', '*granularity': 'uint32' } }
+  'data': { 'node': 'str', 'name': 'str', '*granularity': 'uint32',
+  '*persistent': 'bool' } }

I think normally we'd align the new line so that the opening ' of
'*persistent' is under the opening ' of 'node'.


 ##
   # @block-dirty-bitmap-add
diff --git a/qmp-commands.hx b/qmp-commands.hx
index ba2a916..434b418 100644
--- a/qmp-commands.hx
+++ b/qmp-commands.hx
@@ -1441,7 +1441,7 @@ EQMP
 {
   .name   = "block-dirty-bitmap-add",
-.args_type  = "node:B,name:s,granularity:i?",
+.args_type  = "node:B,name:s,granularity:i?,persistent:b?",
   .mhandler.cmd_new = qmp_marshal_block_dirty_bitmap_add,
   },
   @@ -1458,6 +1458,9 @@ Arguments:
   - "node": device/node on which to create dirty bitmap (json-string)
   - "name": name of the new dirty bitmap (json-string)
   - "granularity": granularity to track writes with (int, optional)
+- "persistent": bitmap will be saved to corresponding block device
+on it's close. Block driver should maintain
persistent bitmaps
+(json-bool, optional, default false) (Since 2.8)

And I don't know what the user is supposed to make of the information
that block drivers will take care of maintaining persistent bitmaps. All
they care about is that it will be stored in the corresponding image
file, so in my opinion it would be better to just omit the last sentence
here.

Users shoud know, that only qcow2 supports persistent bitmaps. Instead
of last sentence I can write "For now only Qcow2 disks supports
persistent bitmaps".

s/supports/support/, but yes, that sounds preferable to me.

Max




--
Best regards,
Vladimir




Re: [Qemu-block] [PATCH 12/22] qcow2-bitmap: add IN_USE flag

2016-10-25 Thread Vladimir Sementsov-Ogievskiy

24.10.2016 20:18, Max Reitz wrote:

On 24.10.2016 19:08, Max Reitz wrote:

On 24.10.2016 13:35, Vladimir Sementsov-Ogievskiy wrote:

24.10.2016 13:32, Vladimir Sementsov-Ogievskiy пишет:

21.10.2016 22:58, Max Reitz пишет:

On 21.10.2016 17:34, Vladimir Sementsov-Ogievskiy wrote:

07.10.2016 22:44, Max Reitz пишет:

On 30.09.2016 12:53, Vladimir Sementsov-Ogievskiy wrote:

This flag means that the bitmap is now in use by the software or
was not
successfully saved. In any way, with this flag set the bitmap data
must
be considered inconsistent and should not be loaded.

With current implementation this flag is never set, as we just remove
bitmaps from the image after loading. But it defined in qcow2 spec
and
must be handled. Also, it can be used in future, if async schemes of
bitmap loading/saving are implemented.

We also remove in-use bitmaps from the image on open.

Signed-off-by: Vladimir Sementsov-Ogievskiy

---
block/qcow2-bitmap.c | 17 -
1 file changed, 16 insertions(+), 1 deletion(-)

Don't you want to make use of this flag? It would appear useful to
me if
you just marked autoload bitmaps as in_use instead of deleting them
from
the image when it's opened and then overwrite them when the image is
closed.

And what is the use of it?

You don't need to free any bitmaps when opening the file, and you don't
need to allocate any new bitmap space when closing it.

As bitmaps are sparce in file, I need to allocate new space when
closing. Or free it...


Is it a real problem to reallocate space in qcow2? If so, to minimaze
allocations, I will have to make a list of clusters of in_use bitmaps on
close, and then save current bitmaps to these clusters (and allocated
some additional clusters, or free extra clusters).

It's not a real problem, but it does take time, and I maintain that it's
time it doesn't need to take because you can just use the in_use flag.

I wouldn't worry about reusing clusters of other bitmaps. Of course we
can do that later on in some optimization but not now.

I just mean the basic case of some auto-loaded bitmap that is not being
deleted while the VM is running and is just saved back to the image file
once it is closed. I don't expect that users will always consume all of
the auto-loaded bitmaps while the VM is active...


Also, if I don't free them on open, I'll have to free them on remove of
bdrv dirty bitmap..

Yes, so? That takes time, but that is something the user will probably
expect. I wouldn't expect opening of the file to take that time.

Overall, it doesn't matter time-wise whether you free the bitmap data
when opening the image file or when the dirty bitmap is actually deleted
by the user or when the image file is closed. Just setting the single
in_use flag for all of the auto-loaded bitmaps will basically not take
any time.

On the other hand, as soon as just a single auto-loaded bitmap survives
a VM (or qemu-img) invocation, you will very, very likely safe at least
some time because writing the bitmap to the disk can reuse at least some
of the existing clusters.

By the way, dealing with removal of bitmaps when they are deleted by the
user is also positive when it comes to migration or read-only disks that
are later reopened R/W: Currently, as far as I can see, you just keep
the auto-load bitmaps in the image if the image is part of an incoming
migration or opened read-only, but then you continue under the
assumption that they are removed from the image. That's not very good.


You are right, I need to handle reopening more carefully. In my way, I 
need to delete bitmaps when reopening R/W and in yours - set in_use bit.




If you delete the bitmaps only when the user asks you to, then you can
just return an error that the bitmap cannot be removed at this time
because the image file is read-only or used in migration (and I don't
think the latter can even happen when it's the user who has requested
removal of the bitmap).

Max



Ok, I'll rewrite this way, using 'in_use' bit and trying to avoid 
allocate/free overhead.



--
Best regards,
Vladimir




Re: [Qemu-block] [PATCH v3 2/5] util/qemu-sockets: Make inet_connect_saddr() public

2016-10-25 Thread Kevin Wolf
Am 17.10.2016 um 19:32 hat Ashijeet Acharya geschrieben:
> Make inet_connect_saddr() in util/qemu-sockets.c public in order to be
> able to use it with InetSocketAddress sockets outside of
> util/qemu-sockets.c independently.
> 
> Signed-off-by: Ashijeet Acharya 

Reviewed-by: Kevin Wolf 



Re: [Qemu-block] [Qemu-devel] [PATCH v14 02/21] qdict: implement a qdict_crumple method for un-flattening a dict

2016-10-25 Thread Markus Armbruster
Max Reitz  writes:

> On 21.10.2016 11:58, Markus Armbruster wrote:
>> "Daniel P. Berrange"  writes:
>> 
>>> On Tue, Oct 18, 2016 at 04:32:13PM +0200, Markus Armbruster wrote:
 "Daniel P. Berrange"  writes:

> The qdict_flatten() method will take a dict whose elements are
> further nested dicts/lists and flatten them by concatenating
> keys.
>
> The qdict_crumple() method aims to do the reverse, taking a flat
> qdict, and turning it into a set of nested dicts/lists. It will
> apply nesting based on the key name, with a '.' indicating a
> new level in the hierarchy. If the keys in the nested structure
> are all numeric, it will create a list, otherwise it will create
> a dict.
>
> If the keys are a mixture of numeric and non-numeric, or the
> numeric keys are not in strictly ascending order, an error will
> be reported.
>
> As an example, a flat dict containing
>
>  {
>'foo.0.bar': 'one',
>'foo.0.wizz': '1',
>'foo.1.bar': 'two',
>'foo.1.wizz': '2'
>  }
>
> will get turned into a dict with one element 'foo' whose
> value is a list. The list elements will each in turn be
> dicts.
>
>  {
>'foo': [
>  { 'bar': 'one', 'wizz': '1' },
>  { 'bar': 'two', 'wizz': '2' }
>],
>  }
>
> If the key is intended to contain a literal '.', then it must
> be escaped as '..'. ie a flat dict
>
>   {
>  'foo..bar': 'wizz',
>  'bar.foo..bar': 'eek',
>  'bar.hello': 'world'
>   }
>
> Will end up as
>
>   {
>  'foo.bar': 'wizz',
>  'bar': {
> 'foo.bar': 'eek',
> 'hello': 'world'
>  }
>   }
>
> The intent of this function is that it allows a set of QemuOpts
> to be turned into a nested data structure that mirrors the nesting
> used when the same object is defined over QMP.
>
> Reviewed-by: Eric Blake 
> Reviewed-by: Kevin Wolf 
> Reviewed-by: Marc-André Lureau 
> Signed-off-by: Daniel P. Berrange 
> ---
>  include/qapi/qmp/qdict.h |   1 +
>  qobject/qdict.c  | 289 
> +++
>  tests/check-qdict.c  | 261 ++
>  3 files changed, 551 insertions(+)
>
> diff --git a/include/qapi/qmp/qdict.h b/include/qapi/qmp/qdict.h
> index 71b8eb0..e0d24e1 100644
> --- a/include/qapi/qmp/qdict.h
> +++ b/include/qapi/qmp/qdict.h
> @@ -73,6 +73,7 @@ void qdict_flatten(QDict *qdict);
>  void qdict_extract_subqdict(QDict *src, QDict **dst, const char *start);
>  void qdict_array_split(QDict *src, QList **dst);
>  int qdict_array_entries(QDict *src, const char *subqdict);
> +QObject *qdict_crumple(const QDict *src, bool recursive, Error **errp);
>  
>  void qdict_join(QDict *dest, QDict *src, bool overwrite);
>  
> diff --git a/qobject/qdict.c b/qobject/qdict.c
> index 60f158c..c38e90e 100644
> --- a/qobject/qdict.c
> +++ b/qobject/qdict.c
 [...]
> +/**
> + * qdict_crumple:
> + * @src: the original flat dictionary (only scalar values) to crumple
> + * @recursive: true to recursively crumple nested dictionaries

 Is recursive=false used outside tests in this series?
>>>
>>> No, its not used.
>>>
>>> It was suggested in a way earlier version by Max, but not sure if his
>>> code uses it or not.
>> 
>> In general, I prefer features to be added right before they're used, and
>> I really dislike adding features "just in case".  YAGNI.
>> 
>> Max, do you actually need this one?  If yes, please explain your use
>> case.
>
> As far as I can tell from a quick glance, I made the point for v1 that
> qdict_crumple() could be simplified by using qdict_array_split() and
> qdict_array_entries().
>
> Dan then (correctly) said that using these functions would worsen
> runtime performance of qdict_crumple() and that instead we can replace
> qdict_array_split() by qdict_crumple(). However, for that to work, we
> need to be able to make qdict_crumple() non-recursive (because
> qdict_array_split() is non-recursive).
>
> Dan also said that in the long run we want to keep the tree structure in
> the block layer instead of flattening everything down which would rid us
> of several other QDict functions (and would make non-recursive behavior
> obsolete again). I believe that this is an idea for the (far?) future,
> as can be seen from the discussion you and others had on this very topic
> in this version here.
>
> However, clearly there are no patches out yet that replace
> qdict_array_split() by qdict_crumple(recursive=false), and I certainly
> won't write any until qdict_crumple() is merged.

I prefer not 

Re: [Qemu-block] [PATCH RFC 3/7] replication: add shared-disk and shared-disk-id options

2016-10-25 Thread Changlong Xie

On 10/20/2016 09:57 PM, zhanghailiang wrote:

We use these two options to identify which disk is
shared

Signed-off-by: zhanghailiang 
Signed-off-by: Wen Congyang 
Signed-off-by: Zhang Chen 
---
  block/replication.c | 33 +
  1 file changed, 33 insertions(+)

diff --git a/block/replication.c b/block/replication.c
index 3bd1cf1..2a2fdb2 100644
--- a/block/replication.c
+++ b/block/replication.c
@@ -25,9 +25,12 @@
  typedef struct BDRVReplicationState {
  ReplicationMode mode;
  int replication_state;
+bool is_shared_disk;
+char *shared_disk_id;
  BdrvChild *active_disk;
  BdrvChild *hidden_disk;
  BdrvChild *secondary_disk;
+BdrvChild *primary_disk;
  char *top_id;
  ReplicationState *rs;
  Error *blocker;
@@ -53,6 +56,9 @@ static void replication_stop(ReplicationState *rs, bool 
failover,

  #define REPLICATION_MODE"mode"
  #define REPLICATION_TOP_ID  "top-id"
+#define REPLICATION_SHARED_DISK "shared-disk"
+#define REPLICATION_SHARED_DISK_ID "shared-disk-id"
+
  static QemuOptsList replication_runtime_opts = {
  .name = "replication",
  .head = QTAILQ_HEAD_INITIALIZER(replication_runtime_opts.head),
@@ -65,6 +71,14 @@ static QemuOptsList replication_runtime_opts = {
  .name = REPLICATION_TOP_ID,
  .type = QEMU_OPT_STRING,
  },
+{
+.name = REPLICATION_SHARED_DISK_ID,
+.type = QEMU_OPT_STRING,
+},
+{
+.name = REPLICATION_SHARED_DISK,
+.type = QEMU_OPT_BOOL,
+},
  { /* end of list */ }
  },
  };
@@ -85,6 +99,8 @@ static int replication_open(BlockDriverState *bs, QDict 
*options,
  QemuOpts *opts = NULL;
  const char *mode;
  const char *top_id;
+const char *shared_disk_id;
+BlockBackend *blk;

  ret = -EINVAL;
  opts = qemu_opts_create(_runtime_opts, NULL, 0, _abort);
@@ -114,6 +130,22 @@ static int replication_open(BlockDriverState *bs, QDict 
*options,
 "The option mode's value should be primary or secondary");
  goto fail;
  }
+s->is_shared_disk = qemu_opt_get_bool(opts, REPLICATION_SHARED_DISK,
+false);
+if (s->is_shared_disk && (s->mode == REPLICATION_MODE_PRIMARY)) {
+shared_disk_id = qemu_opt_get(opts, REPLICATION_SHARED_DISK_ID);
+if (!shared_disk_id) {
+error_setg(_err, "Missing shared disk blk");
+goto fail;
+}
+s->shared_disk_id = g_strdup(shared_disk_id);
+blk = blk_by_name(s->shared_disk_id);
+if (!blk) {
+error_setg(_err, "There is no %s block", s->shared_disk_id);


g_free(s->shared_disk_id);


+goto fail;
+}
+s->primary_disk = blk_root(blk);
+}

  s->rs = replication_new(bs, _ops);

@@ -130,6 +162,7 @@ static void replication_close(BlockDriverState *bs)
  {
  BDRVReplicationState *s = bs->opaque;

+g_free(s->shared_disk_id);
  if (s->replication_state == BLOCK_REPLICATION_RUNNING) {
  replication_stop(s->rs, false, NULL);
  }







Re: [Qemu-block] [PATCH RFC 2/7] block-backend: Introduce blk_root() helper

2016-10-25 Thread Changlong Xie
I know you need blk->root in the next patch, but we strongly don't 
recommend your current solution.  Please refer Kevin's cf2ab8fc


1409 /* XXX Ugly way to get blk->root, but that's a feature, not a 
bug. This
1410  * hack makes it obvious that vhdx_write_header() bypasses the 
BlockBackend

1411  * here, which it really shouldn't be doing. */
1412 child = QLIST_FIRST(>parents);
1413 assert(!QLIST_NEXT(child, next_parent));

Then you can drop this commit.

On 10/20/2016 09:57 PM, zhanghailiang wrote:

With this helper function, we can get the BdrvChild struct
from BlockBackend

Signed-off-by: zhanghailiang 
---
  block/block-backend.c  | 5 +
  include/sysemu/block-backend.h | 1 +
  2 files changed, 6 insertions(+)

diff --git a/block/block-backend.c b/block/block-backend.c
index 1a724a8..66387f0 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -389,6 +389,11 @@ BlockDriverState *blk_bs(BlockBackend *blk)
  return blk->root ? blk->root->bs : NULL;
  }

+BdrvChild *blk_root(BlockBackend *blk)
+{
+return blk->root;
+}
+
  static BlockBackend *bdrv_first_blk(BlockDriverState *bs)
  {
  BdrvChild *child;
diff --git a/include/sysemu/block-backend.h b/include/sysemu/block-backend.h
index b07159b..867f9f5 100644
--- a/include/sysemu/block-backend.h
+++ b/include/sysemu/block-backend.h
@@ -99,6 +99,7 @@ void blk_remove_bs(BlockBackend *blk);
  void blk_insert_bs(BlockBackend *blk, BlockDriverState *bs);
  bool bdrv_has_blk(BlockDriverState *bs);
  bool bdrv_is_root_node(BlockDriverState *bs);
+BdrvChild *blk_root(BlockBackend *blk);

  void blk_set_allow_write_beyond_eof(BlockBackend *blk, bool allow);
  void blk_iostatus_enable(BlockBackend *blk);







[Qemu-block] [PATCH v5 3/4] fdc: Move qdev properties to FloppyDrive

2016-10-25 Thread Kevin Wolf
This makes the FloppyDrive qdev object actually useful: Now that it has
all properties that don't belong to the controller, you can actually
use '-device floppy' and get a working result.

Command line semantics is consistent with CD-ROM drives: By default you
get a single empty floppy drive. You can override it with -drive and
using the same index, but if you use -drive to add a floppy to a
different index, you get both of them. However, as soon as you use any
'-device floppy', even to a different slot, the default drive is
disabled.

Using '-device floppy' without specifying the unit will choose the first
free slot on the controller.

Signed-off-by: Kevin Wolf 
---
 hw/block/fdc.c | 120 ++---
 vl.c   |   1 +
 2 files changed, 89 insertions(+), 32 deletions(-)

diff --git a/hw/block/fdc.c b/hw/block/fdc.c
index 06ae6cf..17d29e7 100644
--- a/hw/block/fdc.c
+++ b/hw/block/fdc.c
@@ -35,6 +35,7 @@
 #include "qemu/timer.h"
 #include "hw/isa/isa.h"
 #include "hw/sysbus.h"
+#include "hw/block/block.h"
 #include "sysemu/block-backend.h"
 #include "sysemu/blockdev.h"
 #include "sysemu/sysemu.h"
@@ -487,12 +488,18 @@ static const BlockDevOps fd_block_ops = {
  OBJECT_CHECK(FloppyDrive, (obj), TYPE_FLOPPY_DRIVE)
 
 typedef struct FloppyDrive {
-DeviceState qdev;
-uint32_tunit;
+DeviceState qdev;
+uint32_tunit;
+BlockConf   conf;
+FloppyDriveType type;
 } FloppyDrive;
 
 static Property floppy_drive_properties[] = {
 DEFINE_PROP_UINT32("unit", FloppyDrive, unit, -1),
+DEFINE_BLOCK_PROPERTIES(FloppyDrive, conf),
+DEFINE_PROP_DEFAULT("drive-type", FloppyDrive, type,
+FLOPPY_DRIVE_TYPE_AUTO, qdev_prop_fdc_drive_type,
+FloppyDriveType),
 DEFINE_PROP_END_OF_LIST(),
 };
 
@@ -501,6 +508,7 @@ static int floppy_drive_init(DeviceState *qdev)
 FloppyDrive *dev = FLOPPY_DRIVE(qdev);
 FloppyBus *bus = FLOPPY_BUS(qdev->parent_bus);
 FDrive *drive;
+int ret;
 
 if (dev->unit == -1) {
 for (dev->unit = 0; dev->unit < MAX_FD; dev->unit++) {
@@ -517,29 +525,57 @@ static int floppy_drive_init(DeviceState *qdev)
 return -1;
 }
 
-/* TODO Check whether unit is in use */
-
 drive = get_drv(bus->fdc, dev->unit);
-
 if (drive->blk) {
-if (blk_get_on_error(drive->blk, 0) != BLOCKDEV_ON_ERROR_ENOSPC) {
-error_report("fdc doesn't support drive option werror");
-return -1;
-}
-if (blk_get_on_error(drive->blk, 1) != BLOCKDEV_ON_ERROR_REPORT) {
-error_report("fdc doesn't support drive option rerror");
-return -1;
-}
-} else {
+error_report("Floppy unit %d is in use", dev->unit);
+return -1;
+}
+
+if (!dev->conf.blk) {
 /* Anonymous BlockBackend for an empty drive */
-drive->blk = blk_new();
+dev->conf.blk = blk_new();
+ret = blk_attach_dev(dev->conf.blk, qdev);
+assert(ret == 0);
 }
 
-fd_init(drive);
-if (drive->blk) {
-blk_set_dev_ops(drive->blk, _block_ops, drive);
-pick_drive_type(drive);
+blkconf_blocksizes(>conf);
+if (dev->conf.logical_block_size != 512 ||
+dev->conf.physical_block_size != 512)
+{
+error_report("Physical and logical block size must be 512 for floppy");
+return -1;
+}
+
+/* rerror/werror aren't supported by fdc and therefore not even registered
+ * with qdev. So set the defaults manually before they are used in
+ * blkconf_apply_backend_options(). */
+dev->conf.rerror = BLOCKDEV_ON_ERROR_AUTO;
+dev->conf.werror = BLOCKDEV_ON_ERROR_AUTO;
+blkconf_apply_backend_options(>conf);
+
+/* 'enospc' is the default for -drive, 'report' is what blk_new() gives us
+ * for empty drives. */
+if (blk_get_on_error(dev->conf.blk, 0) != BLOCKDEV_ON_ERROR_ENOSPC &&
+blk_get_on_error(dev->conf.blk, 0) != BLOCKDEV_ON_ERROR_REPORT) {
+error_report("fdc doesn't support drive option werror");
+return -1;
 }
+if (blk_get_on_error(dev->conf.blk, 1) != BLOCKDEV_ON_ERROR_REPORT) {
+error_report("fdc doesn't support drive option rerror");
+return -1;
+}
+
+drive->blk = dev->conf.blk;
+drive->fdctrl = bus->fdc;
+
+fd_init(drive);
+blk_set_dev_ops(drive->blk, _block_ops, drive);
+
+/* Keep 'type' qdev property and FDrive->drive in sync */
+drive->drive = dev->type;
+pick_drive_type(drive);
+dev->type = drive->drive;
+
 fd_revalidate(drive);
 
 return 0;
@@ -808,6 +844,10 @@ struct FDCtrl {
 FloppyBus bus;
 uint8_t num_floppies;
 FDrive drives[MAX_FD];
+struct {
+BlockBackend *blk;
+FloppyDriveType type;
+} qdev_for_drives[MAX_FD];
 int reset_sensei;
 uint32_t check_media_rate;
 FloppyDriveType fallback; /* type=auto failure 

[Qemu-block] [PATCH v5 4/4] qemu-iotests: Test creating floppy drives

2016-10-25 Thread Kevin Wolf
This tests the different supported methods to create floppy drives and
how they interact.

Signed-off-by: Kevin Wolf 
---
 tests/qemu-iotests/172 |  246 ++
 tests/qemu-iotests/172.out | 1170 
 tests/qemu-iotests/group   |1 +
 3 files changed, 1417 insertions(+)
 create mode 100755 tests/qemu-iotests/172
 create mode 100644 tests/qemu-iotests/172.out

diff --git a/tests/qemu-iotests/172 b/tests/qemu-iotests/172
new file mode 100755
index 000..1b7d3a1
--- /dev/null
+++ b/tests/qemu-iotests/172
@@ -0,0 +1,246 @@
+#!/bin/bash
+#
+# Test floppy configuration
+#
+# Copyright (C) 2016 Red Hat, Inc.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see .
+#
+
+# creator
+owner=kw...@redhat.com
+
+seq=`basename $0`
+echo "QA output created by $seq"
+
+here=`pwd`
+status=1   # failure is the default!
+
+_cleanup()
+{
+   _cleanup_test_img
+}
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+# get standard environment, filters and checks
+. ./common.rc
+. ./common.filter
+
+_supported_fmt qcow2
+_supported_proto file
+_supported_os Linux
+
+if [ "$QEMU_DEFAULT_MACHINE" != "pc" ]; then
+_notrun "Requires a PC machine"
+fi
+
+function do_run_qemu()
+{
+(
+if ! test -t 0; then
+while read cmd; do
+echo $cmd
+done
+fi
+echo quit
+) | $QEMU -nographic -monitor stdio -serial none "$@"
+echo
+}
+
+function check_floppy_qtree()
+{
+echo
+echo Testing: "$@" | _filter_testdir
+
+# QEMU_OPTIONS contains -nodefaults, we don't want that here because the
+# defaults are part of what should be checked here.
+#
+# Apply the sed filter to stdout only, but keep the stderr output and
+# filter the qemu program name in it.
+echo "info qtree" |
+(QEMU_OPTIONS="" do_run_qemu "$@" |
+sed -ne '/^  dev: isa-fdc/,/^  dev:/{x;p}' ) 2>&1 |
+_filter_win32 | _filter_qemu
+}
+
+function check_cache_mode()
+{
+echo "info block none0" |
+QEMU_OPTIONS="" do_run_qemu -drive if=none,file="$TEST_IMG" "$@" |
+_filter_win32 | _filter_qemu | grep "Cache mode"
+}
+
+
+size=720k
+
+_make_test_img $size
+
+# Default drive semantics:
+#
+# By default you get a single empty floppy drive. You can override it with
+# -drive and using the same index, but if you use -drive to add a floppy to a
+# different index, you get both of them. However, as soon as you use any
+# '-device floppy', even to a different slot, the default drive is disabled.
+
+echo
+echo
+echo === Default ===
+
+check_floppy_qtree
+
+echo
+echo
+echo === Using -fda/-fdb options ===
+
+check_floppy_qtree -fda "$TEST_IMG"
+check_floppy_qtree -fdb "$TEST_IMG"
+check_floppy_qtree -fda "$TEST_IMG" -fdb "$TEST_IMG"
+
+
+echo
+echo
+echo === Using -drive options ===
+
+check_floppy_qtree -drive if=floppy,file="$TEST_IMG"
+check_floppy_qtree -drive if=floppy,file="$TEST_IMG",index=1
+check_floppy_qtree -drive if=floppy,file="$TEST_IMG" -drive 
if=floppy,file="$TEST_IMG",index=1
+
+echo
+echo
+echo === Using -drive if=none and -global ===
+
+check_floppy_qtree -drive if=none,file="$TEST_IMG" -global isa-fdc.driveA=none0
+check_floppy_qtree -drive if=none,file="$TEST_IMG" -global isa-fdc.driveB=none0
+check_floppy_qtree -drive if=none,file="$TEST_IMG" -drive 
if=none,file="$TEST_IMG" \
+   -global isa-fdc.driveA=none0 -global isa-fdc.driveB=none1
+
+echo
+echo
+echo === Using -drive if=none and -device ===
+
+check_floppy_qtree -drive if=none,file="$TEST_IMG" -device floppy,drive=none0
+check_floppy_qtree -drive if=none,file="$TEST_IMG" -device 
floppy,drive=none0,unit=1
+check_floppy_qtree -drive if=none,file="$TEST_IMG" -drive 
if=none,file="$TEST_IMG" \
+   -device floppy,drive=none0 -device floppy,drive=none1,unit=1
+
+echo
+echo
+echo === Mixing -fdX and -global ===
+
+# Working
+check_floppy_qtree -fda "$TEST_IMG" -drive if=none,file="$TEST_IMG" -global 
isa-fdc.driveB=none0
+check_floppy_qtree -fdb "$TEST_IMG" -drive if=none,file="$TEST_IMG" -global 
isa-fdc.driveA=none0
+
+# Conflicting (-fdX wins)
+check_floppy_qtree -fda "$TEST_IMG" -drive if=none,file="$TEST_IMG" -global 
isa-fdc.driveA=none0
+check_floppy_qtree -fdb "$TEST_IMG" -drive if=none,file="$TEST_IMG" -global 
isa-fdc.driveB=none0
+
+echo
+echo
+echo === Mixing -fdX and -device ===
+

Re: [Qemu-block] [PATCH v8 00/36] block: Image locking series

2016-10-25 Thread Fam Zheng
On Tue, 10/25 09:06, Richard W.M. Jones wrote:
> On Tue, Oct 25, 2016 at 03:09:51PM +0800, Fam Zheng wrote:
> > On Mon, 10/24 12:11, Kevin Wolf wrote:
> > > Am 22.10.2016 um 03:00 hat Max Reitz geschrieben:
> > > > 
> > > > 
> > > > I personally still don't like making locking a qdev property very much
> > > > because it doesn't make sense to me*. But I remember Kevin had his
> > > > reasons (even though I can no longer remember them) for asking for it,
> > > > and I don't have any besides "It doesn't make sense to me". After having
> > > > though a bit about it (= having written three paragraphs and deleted
> > > > them again), I guess I can make some sense of it, though it seems to be
> > > > a rather esoteric use case still; it appears to me that a guest could
> > > > use it to signal that it's fine with some block device being shared;
> > > > then we could use a shared lock or none at all or I don't know.
> > > > Otherwise, we should get an exclusive lock for write access and a shared
> > > > lock for read access.
> > > 
> > > The reason is pretty simple if you think about this question: Why do we
> > > need user input in the first place? 
> > 
> > I think the reason why we have an option at all is rather because of the 
> > special
> > case of libguestfs [1], otherwise locks should just be acquired sensibly as 
> > the
> > "auto" mode does.
> 
> It's not just that.
> 
> Qemu doesn't have enough information to make the correct decision
> about locking automatically -- for example, Qemu doesn't know if the
> guest is using cluster filesystems or not (or for some other reason
> the guest wants a shared writable block device, eg. for running Disk
> Paxos).

Yeah, okay, but I agree with Max that this doesn't make any sense on many
emulated device types that inherently aren't sharable in real world (like IDE).
It feels we are trying to solve two problems at the same time which makes a
questionable (qdev) interface.

Fam



[Qemu-block] [PATCH v5 2/4] fdc: Add a floppy drive qdev

2016-10-25 Thread Kevin Wolf
Floppy controllers automatically create two floppy drive devices in qdev
now. (They always created two drives, but managed them only internally.)

Signed-off-by: Kevin Wolf 
Reviewed-by: John Snow 
---
 hw/block/fdc.c | 151 +
 1 file changed, 120 insertions(+), 31 deletions(-)

diff --git a/hw/block/fdc.c b/hw/block/fdc.c
index 094c1e8..06ae6cf 100644
--- a/hw/block/fdc.c
+++ b/hw/block/fdc.c
@@ -60,6 +60,8 @@
 #define FLOPPY_BUS(obj) OBJECT_CHECK(FloppyBus, (obj), TYPE_FLOPPY_BUS)
 
 typedef struct FDCtrl FDCtrl;
+typedef struct FDrive FDrive;
+static FDrive *get_drv(FDCtrl *fdctrl, int unit);
 
 typedef struct FloppyBus {
 BusState bus;
@@ -180,7 +182,7 @@ typedef enum FDiskFlags {
 FDISK_DBL_SIDES  = 0x01,
 } FDiskFlags;
 
-typedef struct FDrive {
+struct FDrive {
 FDCtrl *fdctrl;
 BlockBackend *blk;
 /* Drive status */
@@ -201,7 +203,7 @@ typedef struct FDrive {
 uint8_t media_rate;   /* Data rate of medium*/
 
 bool media_validated; /* Have we validated the media? */
-} FDrive;
+};
 
 
 static FloppyDriveType get_fallback_drive_type(FDrive *drv);
@@ -466,6 +468,100 @@ static void fd_revalidate(FDrive *drv)
 }
 }
 
+static void fd_change_cb(void *opaque, bool load)
+{
+FDrive *drive = opaque;
+
+drive->media_changed = 1;
+drive->media_validated = false;
+fd_revalidate(drive);
+}
+
+static const BlockDevOps fd_block_ops = {
+.change_media_cb = fd_change_cb,
+};
+
+
+#define TYPE_FLOPPY_DRIVE "floppy"
+#define FLOPPY_DRIVE(obj) \
+ OBJECT_CHECK(FloppyDrive, (obj), TYPE_FLOPPY_DRIVE)
+
+typedef struct FloppyDrive {
+DeviceState qdev;
+uint32_tunit;
+} FloppyDrive;
+
+static Property floppy_drive_properties[] = {
+DEFINE_PROP_UINT32("unit", FloppyDrive, unit, -1),
+DEFINE_PROP_END_OF_LIST(),
+};
+
+static int floppy_drive_init(DeviceState *qdev)
+{
+FloppyDrive *dev = FLOPPY_DRIVE(qdev);
+FloppyBus *bus = FLOPPY_BUS(qdev->parent_bus);
+FDrive *drive;
+
+if (dev->unit == -1) {
+for (dev->unit = 0; dev->unit < MAX_FD; dev->unit++) {
+drive = get_drv(bus->fdc, dev->unit);
+if (!drive->blk) {
+break;
+}
+}
+}
+
+if (dev->unit >= MAX_FD) {
+error_report("Can't create floppy unit %d, bus supports only %d units",
+ dev->unit, MAX_FD);
+return -1;
+}
+
+/* TODO Check whether unit is in use */
+
+drive = get_drv(bus->fdc, dev->unit);
+
+if (drive->blk) {
+if (blk_get_on_error(drive->blk, 0) != BLOCKDEV_ON_ERROR_ENOSPC) {
+error_report("fdc doesn't support drive option werror");
+return -1;
+}
+if (blk_get_on_error(drive->blk, 1) != BLOCKDEV_ON_ERROR_REPORT) {
+error_report("fdc doesn't support drive option rerror");
+return -1;
+}
+} else {
+/* Anonymous BlockBackend for an empty drive */
+drive->blk = blk_new();
+}
+
+fd_init(drive);
+if (drive->blk) {
+blk_set_dev_ops(drive->blk, _block_ops, drive);
+pick_drive_type(drive);
+}
+fd_revalidate(drive);
+
+return 0;
+}
+
+static void floppy_drive_class_init(ObjectClass *klass, void *data)
+{
+DeviceClass *k = DEVICE_CLASS(klass);
+k->init = floppy_drive_init;
+set_bit(DEVICE_CATEGORY_STORAGE, k->categories);
+k->bus_type = TYPE_FLOPPY_BUS;
+k->props = floppy_drive_properties;
+k->desc = "virtual floppy drive";
+}
+
+static const TypeInfo floppy_drive_info = {
+.name = TYPE_FLOPPY_DRIVE,
+.parent = TYPE_DEVICE,
+.instance_size = sizeof(FloppyDrive),
+.class_init = floppy_drive_class_init,
+};
+
 //
 /* Intel 82078 floppy disk controller emulation  */
 
@@ -1185,9 +1281,9 @@ static inline FDrive *drv3(FDCtrl *fdctrl)
 }
 #endif
 
-static FDrive *get_cur_drv(FDCtrl *fdctrl)
+static FDrive *get_drv(FDCtrl *fdctrl, int unit)
 {
-switch (fdctrl->cur_drv) {
+switch (unit) {
 case 0: return drv0(fdctrl);
 case 1: return drv1(fdctrl);
 #if MAX_FD == 4
@@ -1198,6 +1294,11 @@ static FDrive *get_cur_drv(FDCtrl *fdctrl)
 }
 }
 
+static FDrive *get_cur_drv(FDCtrl *fdctrl)
+{
+return get_drv(fdctrl, fdctrl->cur_drv);
+}
+
 /* Status A register : 0x00 (read-only) */
 static uint32_t fdctrl_read_statusA(FDCtrl *fdctrl)
 {
@@ -2357,46 +2458,33 @@ static void fdctrl_result_timer(void *opaque)
 }
 }
 
-static void fdctrl_change_cb(void *opaque, bool load)
-{
-FDrive *drive = opaque;
-
-drive->media_changed = 1;
-drive->media_validated = false;
-fd_revalidate(drive);
-}
-
-static const BlockDevOps fdctrl_block_ops = {
-.change_media_cb = fdctrl_change_cb,
-};
-
 /* Init functions */
 static void fdctrl_connect_drives(FDCtrl *fdctrl, Error **errp)
 {
 unsigned int i;
 

[Qemu-block] [PATCH real v5 0/4] fdc: Use separate qdev device for drives

2016-10-25 Thread Kevin Wolf
We have been complaining for a long time about how the floppy controller and
floppy drives are combined in a single qdev device and how this makes the
device awkward to work with because it behaves different from all other block
devices.

The latest reason to complain was when I noticed that using qdev device names
in QMP commands (e.g. for media change) doesn't really work when only the
controller is a qdev device, but the drives aren't.

So I decided to have a go at it, and this is the result.

It doesn't actually change any of the inner workings of the floppy controller,
but it wires things up differently on the qdev layer so that a floppy
controller now exposes a bus on which the floppy drives sit. This results in a
structure that is similar to IDE where the actual drive state is still in the
controller and the qdev device basically just contains the qdev properties -
not pretty, but quite workable.

The commit message of patch 3 explains how to use it. In short, there is a
'-device floppy' now and it does what you would expect if you ever used ide-cd.

The other problem is old command lines, especially those using things like
'-global isa-fdc,driveA=...'. In order to keep them working, we need to forward
the property to an internally created floppy drive device. This is a bit like
usb-storage, which we know is ugly, but works well enough in practice. The good
news here is that in contrast to usb-storage, the floppy controller only does
the forwarding for legacy configurations; as soon as you start using '-device
floppy', it doesn't happen any more.

So as you may have expected, this conversion doesn't result in a perfect
device, but I think it's definitely an improvement over the old state. I hope
you like it despite the warts. :-)

v5:
- Apply _filter_qemu to stderr, too [John]
- Rename the bus to floppy-bus [Frederic]
- Use FLOPPY_BUS() instead of DO_UPDATE() [Frederic]

v4:
- John says that his grep is broken and hangs at 100% CPU with my attempt to
  extract the floppy controller from info qtree. Use a simpler sed command
  instead (which, unlike the grep command, doesn't handle arbitrary indentation
  level of the next item, but we know what comes next, so just hardcoding 10
  spaces works, too).

v3:
- Fixed omissons in the conversion sysbus-fdc and the Sun one. Nice, combining
  floppy controllers and weird platforms in a single series. [John]

v2:
- Added patch 4 (qemu-iotests case for floppy config on the command line)
- Patch 2: Create a floppy device only if a BlockBackend exists instead of
  always creating two of them
- Patch 2: Initialise drive->fdctrl even if no drive is attached, it is
  accessed anyway during migration
- Patch 3: Keep 'type' qdev property and FDrive->drive in sync
- Patch 3: Removed if with condition that is always true

Kevin Wolf (4):
  fdc: Add a floppy qbus
  fdc: Add a floppy drive qdev
  fdc: Move qdev properties to FloppyDrive
  qemu-iotests: Test creating floppy drives

 hw/block/fdc.c |  271 --
 tests/qemu-iotests/172 |  246 ++
 tests/qemu-iotests/172.out | 1170 
 tests/qemu-iotests/group   |1 +
 vl.c   |1 +
 5 files changed, 1641 insertions(+), 48 deletions(-)
 create mode 100755 tests/qemu-iotests/172
 create mode 100644 tests/qemu-iotests/172.out

-- 
1.8.3.1




[Qemu-block] [PATCH v5 1/4] fdc: Add a floppy qbus

2016-10-25 Thread Kevin Wolf
This adds a qbus to the floppy controller that should contain the floppy
drives eventually. At the moment it just exists and is empty.

Signed-off-by: Kevin Wolf 
Reviewed-by: John Snow 
---
 hw/block/fdc.c | 40 +++-
 1 file changed, 35 insertions(+), 5 deletions(-)

diff --git a/hw/block/fdc.c b/hw/block/fdc.c
index b79873a..094c1e8 100644
--- a/hw/block/fdc.c
+++ b/hw/block/fdc.c
@@ -52,6 +52,33 @@
 }   \
 } while (0)
 
+
+//
+/* qdev floppy bus  */
+
+#define TYPE_FLOPPY_BUS "floppy-bus"
+#define FLOPPY_BUS(obj) OBJECT_CHECK(FloppyBus, (obj), TYPE_FLOPPY_BUS)
+
+typedef struct FDCtrl FDCtrl;
+
+typedef struct FloppyBus {
+BusState bus;
+FDCtrl *fdc;
+} FloppyBus;
+
+static const TypeInfo floppy_bus_info = {
+.name = TYPE_FLOPPY_BUS,
+.parent = TYPE_BUS,
+.instance_size = sizeof(FloppyBus),
+};
+
+static void floppy_bus_create(FDCtrl *fdc, FloppyBus *bus, DeviceState *dev)
+{
+qbus_create_inplace(bus, sizeof(FloppyBus), TYPE_FLOPPY_BUS, dev, NULL);
+bus->fdc = fdc;
+}
+
+
 //
 /* Floppy drive emulation   */
 
@@ -148,8 +175,6 @@ static FDriveSize drive_size(FloppyDriveType drive)
 #define FD_SECTOR_SC   2   /* Sector size code */
 #define FD_RESET_SENSEI_COUNT  4   /* Number of sense interrupts on RESET */
 
-typedef struct FDCtrl FDCtrl;
-
 /* Floppy disk drive emulation */
 typedef enum FDiskFlags {
 FDISK_DBL_SIDES  = 0x01,
@@ -684,6 +709,7 @@ struct FDCtrl {
 /* Power down config (also with status regB access mode */
 uint8_t pwrd;
 /* Floppy drives */
+FloppyBus bus;
 uint8_t num_floppies;
 FDrive drives[MAX_FD];
 int reset_sensei;
@@ -2442,7 +2468,8 @@ void sun4m_fdctrl_init(qemu_irq irq, hwaddr io_base,
 *fdc_tc = qdev_get_gpio_in(dev, 0);
 }
 
-static void fdctrl_realize_common(FDCtrl *fdctrl, Error **errp)
+static void fdctrl_realize_common(DeviceState *dev, FDCtrl *fdctrl,
+  Error **errp)
 {
 int i, j;
 static int command_tables_inited = 0;
@@ -2480,6 +2507,8 @@ static void fdctrl_realize_common(FDCtrl *fdctrl, Error 
**errp)
 k->register_channel(fdctrl->dma, fdctrl->dma_chann,
 _transfer_handler, fdctrl);
 }
+
+floppy_bus_create(fdctrl, >bus, dev);
 fdctrl_connect_drives(fdctrl, errp);
 }
 
@@ -2508,7 +2537,7 @@ static void isabus_fdc_realize(DeviceState *dev, Error 
**errp)
 }
 
 qdev_set_legacy_instance_id(dev, isa->iobase, 2);
-fdctrl_realize_common(fdctrl, );
+fdctrl_realize_common(dev, fdctrl, );
 if (err != NULL) {
 error_propagate(errp, err);
 return;
@@ -2559,7 +2588,7 @@ static void sysbus_fdc_common_realize(DeviceState *dev, 
Error **errp)
 FDCtrlSysBus *sys = SYSBUS_FDC(dev);
 FDCtrl *fdctrl = >state;
 
-fdctrl_realize_common(fdctrl, errp);
+fdctrl_realize_common(dev, fdctrl, errp);
 }
 
 FloppyDriveType isa_fdc_get_drive_type(ISADevice *fdc, int i)
@@ -2744,6 +2773,7 @@ static void fdc_register_types(void)
 type_register_static(_fdc_type_info);
 type_register_static(_fdc_info);
 type_register_static(_fdc_info);
+type_register_static(_bus_info);
 }
 
 type_init(fdc_register_types)
-- 
1.8.3.1




Re: [Qemu-block] [Qemu-devel] [PATCH v5 0/4] fdc: Use separate qdev device for drives

2016-10-25 Thread Kevin Wolf
Am 24.10.2016 um 20:44 hat John Snow geschrieben:
> On 10/24/2016 07:37 AM, Kevin Wolf wrote:
> >We have been complaining for a long time about how the floppy controller and
> >floppy drives are combined in a single qdev device and how this makes the
> >device awkward to work with because it behaves different from all other block
> >devices.
>
> [...]
>
> >Alberto Garcia (2):
> >  throttle: Correct access to wrong BlockBackendPublic structures
> >  qemu-iotests: Test I/O in a single drive from a throttling group
> >
> >Halil Pasic (1):
> >  block: improve error handling in raw_open
> >
> >Pino Toscano (1):
> >  qapi: fix memory leak in bdrv_image_info_specific_dump
> >
> 
> uhhh
> 
> http://i.imgur.com/60YQvmg.png

wtf did I do there? :-/

I'll send a new version... But if it continues like this, we'll miss the
freeze...

Kevin



Re: [Qemu-block] [PATCH RFC 1/7] docs/block-replication: Add description for shared-disk case

2016-10-25 Thread Changlong Xie

On 10/20/2016 09:57 PM, zhanghailiang wrote:

Introuduce the scenario of shared-disk block replication
and how to use it.

Signed-off-by: zhanghailiang 
Signed-off-by: Wen Congyang 
Signed-off-by: Zhang Chen 
---
  docs/block-replication.txt | 131 +++--
  1 file changed, 127 insertions(+), 4 deletions(-)

diff --git a/docs/block-replication.txt b/docs/block-replication.txt
index 6bde673..97fcfc1 100644
--- a/docs/block-replication.txt
+++ b/docs/block-replication.txt
@@ -24,7 +24,7 @@ only dropped at next checkpoint time. To reduce the network 
transportation
  effort during a vmstate checkpoint, the disk modification operations of
  the Primary disk are asynchronously forwarded to the Secondary node.

-== Workflow ==
+== Non-shared disk workflow ==
  The following is the image of block replication workflow:

  +--+++
@@ -57,7 +57,7 @@ The following is the image of block replication workflow:
  4) Secondary write requests will be buffered in the Disk buffer and it
 will overwrite the existing sector content in the buffer.

-== Architecture ==
+== None-shared disk architecture ==


s/None-shared/Non-shared/g


  We are going to implement block replication from many basic
  blocks that are already in QEMU.

@@ -106,6 +106,74 @@ any state that would otherwise be lost by the speculative 
write-through
  of the NBD server into the secondary disk. So before block replication,
  the primary disk and secondary disk should contain the same data.

+== Shared Disk Mode Workflow ==
+The following is the image of block replication workflow:
+
++--+++
+|Primary Write Requests||Secondary Write Requests|
++--+++
+  |   |
+  |  (4)
+  |   V
+  |  /-\
+  | (2)Forward and write through | |
+  | +--> | Disk Buffer |
+  | || |
+  | |\-/
+  | |(1)read   |
+  | |  |
+   (3)write   | |  | backing file
+  V |  |
+ +-+   |
+ | Shared Disk | <-+
+ +-+
+
+1) Primary writes will read original data and forward it to Secondary
+   QEMU.
+2) Before Primary write requests are written to Shared disk, the
+   original sector content will be read from Shared disk and
+   forwarded and buffered in the Disk buffer on the secondary site,
+   but it will not overwrite the existing


extra spaces at the end of line


+   sector content(it could be from either "Secondary Write Requests" or


Need a space before "(" for better style.


+   previous COW of "Primary Write Requests") in the Disk buffer.
+3) Primary write requests will be written to Shared disk.
+4) Secondary write requests will be buffered in the Disk buffer and it
+   will overwrite the existing sector content in the buffer.
+
+== Shared Disk Mode Architecture ==
+We are going to implement block replication from many basic
+blocks that are already in QEMU.
+ virtio-blk ||   
.--
+ /  ||   | 
Secondary
+/   ||   
'--
+   /|| 
virtio-blk
+  / || 
 |
+  | ||   
replication(5)
+  |NBD  >   NBD   (2)  
 |
+  |  client ||server ---> hidden disk <-- 
active disk(4)
+  | ^   ||  |
+  |  replication(1) ||  |
+  | |   ||  |
+  |   +-'   ||  |
+ (3)  |drive-backup sync=none   ||  |
+. |   +-+   ||  |
+Primary | | |   ||   backing|
+' |  

Re: [Qemu-block] [Qemu-devel] Assertion failure on qcow2 disk with cluster_size != 64k

2016-10-25 Thread Kevin Wolf
Am 25.10.2016 um 01:06 hat Ed Swierk geschrieben:
> On Mon, Oct 24, 2016 at 2:21 PM, Eric Blake  wrote:
> > How are you getting max_transfer == 65536?  I can't reproduce it with
> > the following setup:
> >
> > $ qemu-img create -f qcow2 -o cluster_size=1M file 10M
> > $ qemu-io -f qcow2 -c 'w 7m 1k' file
> > $ qemu-io -f qcow2 -c 'w -z 8003584 2093056' file
> >
> > although I did confirm that the above sequence was enough to get the
> > -ENOTSUP failure and fall into the code calculating max_transfer.
> >
> > I'm guessing that you are using something other than a file system as
> > the backing protocol for your qcow2 image.  But do you really have a
> > protocol that takes AT MOST 64k per transaction, while still trying to a
> > cluster size of 1M in the qcow2 format?  That's rather awkward, as it
> > means that you are required to do 16 transactions per cluster (the whole
> > point of using larger clusters is usually to get fewer transactions).  I
> > think we need to get to a root cause of why you are seeing such a small
> > max_transfer, before I can propose the right patch, since I haven't been
> > able to reproduce it locally yet (although I admit I haven't tried to
> > see if blkdebug could reliably introduce artificial limits to simulate
> > your setup).  And it may turn out that I just have to fix the
> > bdrv_co_do_pwrite_zeroes() code to loop multiple times if the size of
> > the unaligned head really does exceed the max_transfer size that the
> > underlying protocol is able to support, rather than assuming that the
> > unaligned head/tail always fit in a single fallback write.
> 
> In this case I'm using a qcow2 image that's stored directly in a raw
> dm-crypt/LUKS container, which is itself a loop device on an ext4
> filesystem.
> 
> It appears loop devices (with or without dm-crypt/LUKS) report a
> 255-sector maximum per request via the BLKSECTGET ioctl, which qemu
> rounds down to 64k in raw_refresh_limits(). However this maximum
> appears to be just a hint: bdrv_driver_pwritev() succeeds even with a
> 385024-byte buffer of zeroes.

I suppose what happens is that we get short writes, but the raw-posix
driver actually has the loop to deal with this, so eventually we return
with the whole thing written.

Considering the presence of this loop, maybe we shouldn't set
bs->bl.max_transfer at all for raw-posix. Hm, except that for Linux AIO
we might actually need it.

> As for the 1M cluster size, this is a temporary workaround for another
> qemu issue (the default qcow2 L2 table cache size performs well with
> random reads covering only up to 8 GB of image data with 64k clusters;
> beyond that the L2 table cache thrashes). I agree this is not an
> optimal configuration for writes.

You can configure the qcow2 cache size without changing the cluster
size (though of course larger cluster sizes help the total metadata size
to stay smaller for larger image sizes):

-drive file=test.qcow2,l2-cache-size=16M

Kevin



Re: [Qemu-block] [PATCH v8 00/36] block: Image locking series

2016-10-25 Thread Kevin Wolf
Am 24.10.2016 um 20:03 hat Max Reitz geschrieben:
> On 24.10.2016 12:11, Kevin Wolf wrote:
> 
> [...]
> 
> > Now, the big question is how to translate this into file locking. This
> > could become a little tricky. I had a few thoughts involving another
> > lock on byte 2, but none of them actually worked out so far, because
> > what we want is essentially a lock that can be shared by readers, that
> > can also be shared by writers, but not by readers and writers at the
> > same time.
> 
> You can also share it between readers and writers, as long as everyone
> can cope with volatile data.

Sorry, that was ambiguous. I meant a file-level lock rather than the
high-level one. If we had a lock that can be shared by one or the other,
but not both, then two locks would be enough to build what we really
want.

> I agree that it's very similar to the proposed op blocker style, but I
> can't really come up with a meaningful translation either.
> 
> Maybe something like this (?): All readers who do not want the file to
> be modified grab a shared lock on byte 1. All writers who can deal with
> volatile data grab a shared lock on byte 2. Exclusive writers grab an
> exclusive lock on byte 1 and 2. Readers who can cope with volatile data
> get no lock at all.
> 
> When opening, the first and second group would always have to test
> whether there is a lock on the other byte, respectively. E.g. sharing
> writers would first grab an exclusive lock on byte 1, then the shared
> lock on byte 2 and then release the exclusive lock again.
> 
> Would that work?

I'm afraid it wouldn't. If you start the sharing writer first and then
the writer-blocking reader, the writer doesn't hold a lock on byte 1 any
more, so the reader can start even though someone is writing to the
image. On the other hand, the writer can't keep an exclusive lock
because it would block other users that can share the image.

Kevin


pgp3Met8LMvAM.pgp
Description: PGP signature


Re: [Qemu-block] [PATCH v8 00/36] block: Image locking series

2016-10-25 Thread Richard W.M. Jones
On Tue, Oct 25, 2016 at 03:09:51PM +0800, Fam Zheng wrote:
> On Mon, 10/24 12:11, Kevin Wolf wrote:
> > Am 22.10.2016 um 03:00 hat Max Reitz geschrieben:
> > > 
> > > 
> > > I personally still don't like making locking a qdev property very much
> > > because it doesn't make sense to me*. But I remember Kevin had his
> > > reasons (even though I can no longer remember them) for asking for it,
> > > and I don't have any besides "It doesn't make sense to me". After having
> > > though a bit about it (= having written three paragraphs and deleted
> > > them again), I guess I can make some sense of it, though it seems to be
> > > a rather esoteric use case still; it appears to me that a guest could
> > > use it to signal that it's fine with some block device being shared;
> > > then we could use a shared lock or none at all or I don't know.
> > > Otherwise, we should get an exclusive lock for write access and a shared
> > > lock for read access.
> > 
> > The reason is pretty simple if you think about this question: Why do we
> > need user input in the first place? 
> 
> I think the reason why we have an option at all is rather because of the 
> special
> case of libguestfs [1], otherwise locks should just be acquired sensibly as 
> the
> "auto" mode does.

It's not just that.

Qemu doesn't have enough information to make the correct decision
about locking automatically -- for example, Qemu doesn't know if the
guest is using cluster filesystems or not (or for some other reason
the guest wants a shared writable block device, eg. for running Disk
Paxos).

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-builder quickly builds VMs from scratch
http://libguestfs.org/virt-builder.1.html



Re: [Qemu-block] [PATCH v8 00/36] block: Image locking series

2016-10-25 Thread Fam Zheng
On Mon, 10/24 12:11, Kevin Wolf wrote:
> Am 22.10.2016 um 03:00 hat Max Reitz geschrieben:
> > 
> > 
> > I personally still don't like making locking a qdev property very much
> > because it doesn't make sense to me*. But I remember Kevin had his
> > reasons (even though I can no longer remember them) for asking for it,
> > and I don't have any besides "It doesn't make sense to me". After having
> > though a bit about it (= having written three paragraphs and deleted
> > them again), I guess I can make some sense of it, though it seems to be
> > a rather esoteric use case still; it appears to me that a guest could
> > use it to signal that it's fine with some block device being shared;
> > then we could use a shared lock or none at all or I don't know.
> > Otherwise, we should get an exclusive lock for write access and a shared
> > lock for read access.
> 
> The reason is pretty simple if you think about this question: Why do we
> need user input in the first place? 

I think the reason why we have an option at all is rather because of the special
case of libguestfs [1], otherwise locks should just be acquired sensibly as the
"auto" mode does.

Other than that, I don't think we have a concrete use case of non-auto lock
mode.  Do we?

[1]: https://lists.nongnu.org/archive/html/qemu-block/2016-04/msg00414.html

Fam



Re: [Qemu-block] [PATCH v8 03/36] block: Introduce image file locking

2016-10-25 Thread Fam Zheng
On Fri, 10/21 23:04, Max Reitz wrote:
> > +ImageLockMode bdrv_lock_mode_from_flags(int flags)
> > +{
> > +if (flags & BDRV_O_NO_LOCK) {
> > +return IMAGE_LOCK_MODE_NOLOCK;
> > +} else if (flags & BDRV_O_SHARED_LOCK) {
> > +return IMAGE_LOCK_MODE_SHARED;
> > +} else if (flags & BDRV_O_EXCLUSIVE_LOCK) {
> > +return IMAGE_LOCK_MODE_EXCLUSIVE;
> > +} else {
> > +return IMAGE_LOCK_MODE_AUTO;
> > +}
> > +}
> 
> I don't know if there's been any discussion about the order of the flags
> here, but I personally would order them exactly the other way around:
> Asking for exclusive locking should override nolock, in my opinion.

The idea was to assert no two bits are set at the same time. But I seem to have
forgotten to actually add the assertion.

> 
> > +
> > +ImageLockMode bdrv_get_lock_mode(BlockDriverState *bs)
> > +{
> > +return bs->cur_lock;
> > +}
> > +
> > +int bdrv_set_lock_mode(BlockDriverState *bs, ImageLockMode mode)
> > +{
> > +int ret;
> > +
> > +if (bs->cur_lock == mode) {
> > +return 0;
> > +} else if (!bs->drv) {
> > +return -ENOMEDIUM;
> > +} else if (!bs->drv->bdrv_lockf) {
> > +if (bs->file) {
> > +return bdrv_set_lock_mode(bs->file->bs, mode);
> > +}
> > +return 0;
> > +}
> > +ret = bs->drv->bdrv_lockf(bs, mode);
> > +if (ret == -ENOTSUP) {
> > +/* Handle it the same way as !bs->drv->bdrv_lockf */
> > +ret = 0;
> 
> Yes, well, why do you handle both as success? Wouldn't returning
> -ENOTSUP make more sense?
> 
> I guess the caller can find out itself by checking whether bs->cur_lock
> has changed, but...

I can't think of a reason for any caller to do something different for -ENOTSUP
from success, hence the check here.

> 
> > +} else if (ret == 0) {
> > +bs->cur_lock = mode;
> > +}
> > +return ret;
> > +}
> > +
> >  static QemuOptsList bdrv_runtime_opts = {
> >  .name = "bdrv_common",
> >  .head = QTAILQ_HEAD_INITIALIZER(bdrv_runtime_opts.head),
> > @@ -1076,6 +1119,10 @@ static int bdrv_open_common(BlockDriverState *bs, 
> > BdrvChild *file,
> >  goto free_and_fail;
> >  }
> >  
> > +if (open_flags & BDRV_O_INACTIVE) {
> > +open_flags = (open_flags & ~BDRV_O_LOCK_MASK) & BDRV_O_NO_LOCK;
> 
> I suppose the second & is supposed to be a |?

Yes. Thanks for catching it.

> 
> > +}
> > +
> >  ret = refresh_total_sectors(bs, bs->total_sectors);
> >  if (ret < 0) {
> >  error_setg_errno(errp, -ret, "Could not refresh total sector 
> > count");
> > @@ -2273,6 +2320,7 @@ static void bdrv_close(BlockDriverState *bs)
> >  if (bs->drv) {
> >  BdrvChild *child, *next;
> >  
> > +bdrv_set_lock_mode(bs, IMAGE_LOCK_MODE_NOLOCK);
> >  bs->drv->bdrv_close(bs);
> >  bs->drv = NULL;
> >  
> > @@ -3188,6 +3236,9 @@ void bdrv_invalidate_cache(BlockDriverState *bs, 
> > Error **errp)
> 
> This function's name is pretty weird... Maybe it would be better to
> rename it to "bdrv_complete_incoming" or something. (Unrelated to this
> series, of course.)
> 
> >  error_setg_errno(errp, -ret, "Could not refresh total sector 
> > count");
> >  return;
> >  }
> > +if (bs->cur_lock != IMAGE_LOCK_MODE__MAX) {
> > +bdrv_set_lock_mode(bs, bs->cur_lock);
> > +}
> >  }
> >  
> >  void bdrv_invalidate_cache_all(Error **errp)
> > @@ -3230,6 +3281,7 @@ static int bdrv_inactivate_recurse(BlockDriverState 
> > *bs,
> >  }
> >  
> >  if (setting_flag) {
> > +ret = bdrv_set_lock_mode(bs, IMAGE_LOCK_MODE_NOLOCK);
> 
> Maybe it would make sense to do something with the return value...? :-)

Yes, sounds good.

Fam



Re: [Qemu-block] [Qemu-devel] [PATCH v8 15/36] qdev: Add "lock-mode" to block device options

2016-10-25 Thread Fam Zheng
On Sat, 10/22 02:11, Max Reitz wrote:
> On 30.09.2016 14:09, Fam Zheng wrote:
> > Signed-off-by: Fam Zheng 
> > ---
> >  hw/core/qdev-properties.c| 10 ++
> >  include/hw/block/block.h |  1 +
> >  include/hw/qdev-properties.h |  3 +++
> >  3 files changed, 14 insertions(+)
> 
> Why don't you add _conf.lock_mode in this very patch (instead of patch
> 3) and pull it in front of patch 12? Setting the mode won't do anything
> until it's implemented for the various block devices, but I don't think
> that's a bad thing.

Sounds good!

Fam