Re: [Qemu-devel] [PATCH 2/3] pc-bios/s390-ccw/net: Add support for pxelinux-style config files

2018-05-31 Thread Thomas Huth
On 30.05.2018 13:07, Viktor VM Mihajlovski wrote:
> On 30.05.2018 11:16, Thomas Huth wrote:
>> Since it is quite cumbersome to manually create a combined kernel with
>> initrd image for network booting, we now support loading via pxelinux
>> configuration files, too. In these files, the kernel, initrd and command
>> line parameters can be specified seperately, and the firmware then takes
>> care of glueing everything together in memory after the files have been
>> downloaded. See this URL for details about the config file layout:
>> https://www.syslinux.org/wiki/index.php?title=PXELINUX
>>
>> The user can either specify a config file directly as bootfile via DHCP
>> (but in this case, the file has to start either with "default" or a "#"
>> comment so we can distinguish it from binary kernels), or a folder (i.e.
>> the bootfile name must end with "/") where the firmware should look for
>> the typical pxelinux.cfg file names, e.g. based on MAC or IP address.
>> We also support the pxelinux.cfg DHCP options 209 and 210 from RFC 5071.
>>
>> Signed-off-by: Thomas Huth 
>> ---
>>  pc-bios/s390-ccw/netboot.mak |  7 ++--
>>  pc-bios/s390-ccw/netmain.c   | 79 
>> +++-
>>  2 files changed, 82 insertions(+), 4 deletions(-)
> [...]
>> diff --git a/pc-bios/s390-ccw/netmain.c b/pc-bios/s390-ccw/netmain.c
>> index 7533cf7..e84bb2b 100644
>> --- a/pc-bios/s390-ccw/netmain.c
>> +++ b/pc-bios/s390-ccw/netmain.c
> [...]
>> @@ -301,6 +363,18 @@ static int net_try_direct_tftp_load(filename_ip_t 
>> *fn_ip)
>>  if (!strncmp("* ", cfgbuf, 2)) {
>>  return handle_ins_cfg(fn_ip, cfgbuf, rc);
>>  }
>> +if (!strncasecmp("default", cfgbuf, 7) || !strncmp("# ", cfgbuf, 
>> 2)) {
> Minor, but I'm wondering whether this is not too cautious and could rule
> out valid config files. You might just unconditionally call
> pxelinux_parse_cfg and let it find out if this is as pxelinux config
> file or not.

I thought about this for a while, but I think I'd rather avoid this.
It's also possible that the user tried to load a small binary which
accidentally contained a string like "label" and "kernel", and then the
bios would try to interpret this as config file instead of just running
the binary.

If you feel really unhappy about the magic string matching with
"default" and "# " here, I think it would be better to remove this magic
completely instead. The original pxelinux loader and petitboot also do
not support this, but rely on the DHCP options 209 and 210 instead
(which we also support now). I only added this magic here for my own
convenience, since the built-in DHCP server of QEMU does not support the
options 209 and 210, and I still wanted to have a way to quickly change
from one config file to another... but now that the code is basically up
and running, it's not really required anymore, so removing this should
be fine. Alternatively, we could use a more sophisticated magic here,
like "# pxelinux" for example, just for the developers' convenience ...
what do you think?

 Thomas



Re: [Qemu-devel] Recording I/O activity after KVM does a VMEXIT

2018-05-31 Thread Pavel Dovgalyuk
Hi,

 

I’m not familiar with KVM, but I know successful attempts of replaying the 
execution by logging IO and MMIO in TCG mode.

The difference in CPU I/O and VM I/O is the following. In icount we record 
anything coming into the VM, but not into the CPU.

It means that the whole packet is recorded. Virtual hardware behaves 
deterministically and therefore CPU will get identical

input in case of replay, because the whole recorded packet is injected again by 
the filter.

 

Pavel Dovgalyuk

 

From: Arnabjyoti Kalita [mailto:akal...@cs.stonybrook.edu] 
Sent: Thursday, May 31, 2018 11:14 PM
To: Pavel Dovgalyuk
Cc: Stefan Hajnoczi; qemu-devel@nongnu.org; Pavel Dovgalyuk
Subject: Re: [Qemu-devel] Recording I/O activity after KVM does a VMEXIT

 

Dear Pavel,

 

Thank you for your answer. I am not being able to understand the difference 
between CPU I/Os and VM I/Os. Would any network packet that comes into the 
Guest OS from the outside be a part of VM I/O or CPU I/O ? I am only interested 
in "recording" and "replaying" those network packets that come from the outside 
into the networking backend and not the other way around. Say for example when 
I get a VMExit because of the arrival of a network packet, I will use the 
VMExit reason : "KVM_EXIT_MMIO"  to trace back to "e1000_mmio_write()" which I 
expect should be enough to record network packets that come from the outside 
and write to the guest address space for "e1000" devices. In such a case, I 
think I will not have to use the "network-filter" backend that you use to 
record VM I/O only. Let me know if you find errors in my approach.

 

I will try to see how I can record disk packets. If disk packets use other ways 
of writing to the guest memory apart from a normal VMExit, I will try to find 
it out. Eventually I hope that it will use one of the available disk front-end 
functions to write to the guest memory from the disk, just like e1000 does with 
an "e1000_mmio_write()" call. 

 

Thanks and best regards,

Arnab

 

 

 

 

 



 

On Thu, May 31, 2018 at 8:44 AM, Pavel Dovgalyuk  wrote:

> From: Stefan Hajnoczi [mailto:stefa...@gmail.com]
> On Wed, May 30, 2018 at 11:19:13PM -0400, Arnabjyoti Kalita wrote:
> > I am trying to implement a 'minimal' record-replay mechanism for KVM, which
> > is similar to the one existing for TCG via -icount. I am trying to record
> > I/O events only (specifically disk and network events) when KVM does a
> > VMEXIT. This has led me to the function kvm_cpu_exec where I can clearly
> > see the different ways of handling all of the possible VMExit cases (like
> > PIO, MMIO etc.). To record network packets, I am working with the e1000
> > hardware device.
> >
> > Can I make sure that all of the network I/O, atleast for the e1000 device
> > happens through the KVM_EXIT_MMIO case and subsequent use of the
> > address_space_rw() function ? Do I also need to look at other functions as
> > well ? Also for recording disk activity, can I make sure that looking out
> > for the KVM_EXIT_MMIO and/or KVM_EXIT_PIO cases in the vmexit mechanism,
> > will be enough ?
> >
> > Let me know if there are other details that I need to take care of. I am
> > using QEMU 2.11 on a x86-64 CPU and the guest runs a Linux Kernel 4.4 with
> > Ubuntu 16.04.

The main icount-based record/replay advantage is that we don't record
any CPU IO. We record only VM IO (e.g., by using the network filter).

Disk devices may transfer data to CPU using DMA, therefore intercepting
only VMExit cases will not be enough.

Pavel Dovgalyuk

 



Re: [Qemu-devel] [PATCH 2/9 V5] hostmem-file: add the 'pmem' option

2018-05-31 Thread Junyan He
Because we have disk backend imitation for NVDimm kind memory, I think it is 
more

flexible for user to specify the real or not real pmem rather than we check in 
qemu

using the pmem_is_pmem API


From: Stefan Hajnoczi 
Sent: Thursday, May 31, 2018 1:09:42 PM
To: junyan...@gmx.com
Cc: qemu-devel@nongnu.org; Haozhong Zhang; xiaoguangrong.e...@gmail.com; 
crosthwaite.pe...@gmail.com; m...@redhat.com; dgilb...@redhat.com; 
ehabk...@redhat.com; quint...@redhat.com; Junyan He; stefa...@redhat.com; 
pbonz...@redhat.com; imamm...@redhat.com; r...@twiddle.net
Subject: Re: [Qemu-devel] [PATCH 2/9 V5] hostmem-file: add the 'pmem' option

On Thu, May 10, 2018 at 10:08:51AM +0800, junyan...@gmx.com wrote:
> From: Junyan He 
>
> When QEMU emulates vNVDIMM labels and migrates vNVDIMM devices, it
> needs to know whether the backend storage is a real persistent memory,
> in order to decide whether special operations should be performed to
> ensure the data persistence.
>
> This boolean option 'pmem' allows users to specify whether the backend
> storage of memory-backend-file is a real persistent memory. If
> 'pmem=on', QEMU will set the flag RAM_PMEM in the RAM block of the
> corresponding memory region.

I'm still not sure if this option is necessary since pmem_is_pmem() is
available with the introduction of the libpmem dependency.  Why can't it
be used?

Stefan


Re: [Qemu-devel] [qemu PATCH v4 0/4] support NFIT platform capabilities

2018-05-31 Thread Elliott, Robert (Persistent Memory)
> Ping on this series.  Rob, I think I've addressed all your feedback.
> Can you please verify?

I haven't tested it, but it reads OK.  I'm OK with just extending
the valid count for bits set to one for now; we can add a new
argument later if a need arises for extending it to express new bits
set to zero.





Re: [Qemu-devel] [PATCH v7 0/5] virtio-balloon: free page hint reporting support

2018-05-31 Thread Peter Xu
On Fri, Jun 01, 2018 at 12:58:24PM +0800, Peter Xu wrote:
> On Tue, Apr 24, 2018 at 02:13:43PM +0800, Wei Wang wrote:
> > This is the deivce part implementation to add a new feature,
> > VIRTIO_BALLOON_F_FREE_PAGE_HINT to the virtio-balloon device. The device
> > receives the guest free page hints from the driver and clears the
> > corresponding bits in the dirty bitmap, so that those free pages are
> > not transferred by the migration thread to the destination.
> > 
> > - Test Environment
> > Host: Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz
> > Guest: 8G RAM, 4 vCPU
> > Migration setup: migrate_set_speed 100G, migrate_set_downtime 2 second
> > 
> > - Test Results
> > - Idle Guest Live Migration Time (results are averaged over 10 runs):
> > - Optimization v.s. Legacy = 271ms vs 1769ms --> ~86% reduction
> > - Guest with Linux Compilation Workload (make bzImage -j4):
> > - Live Migration Time (average)
> >   Optimization v.s. Legacy = 1265ms v.s. 2634ms --> ~51% reduction
> > - Linux Compilation Time
> >   Optimization v.s. Legacy = 4min56s v.s. 5min3s
> >   --> no obvious difference
> > 
> > - Source Code
> > - QEMU:  https://github.com/wei-w-wang/qemu-free-page-lm.git
> > - Linux: https://github.com/wei-w-wang/linux-free-page-lm.git
> 
> Hi, Wei,
> 
> I have a very high-level question to the series.
> 
> IIUC the core idea for this series is that we can avoid sending some
> of the pages if we know that we don't need to send them.  I think this
> is based on the fact that on the destination side all the pages are by
> default zero after they are malloced.  While before this series, IIUC
> any migration will send every single page to destination, no matter
> whether it's zeroed or not.  So I'm uncertain about whether this will
> affect the received bitmap on the destination side.  Say, before this
> series, the received bitmap will directly cover the whole RAM bitmap
> after migration is finished, now it's won't.  Will there be any side
> effect?  I don't see obvious issue now, but just raise this question
> up.
> 
> Meanwhile, this reminds me about a more funny idea: whether we can
> just avoid sending the zero pages directly from QEMU's perspective.
> In other words, can we just do nothing if save_zero_page() detected
> that the page is zero (I guess the is_zero_range() can be fast too,
> but I don't know exactly how fast it is)?  And how that would be
> differed from this page hinting way in either performance and other
> aspects.

I noticed a problem (after I wrote the above paragraph 5 minutes
ago...): when a page was valid and sent to the destination (with
non-zero data), however after a while that page was zeroed.  Then if
we don't send zero pages at all, we won't send the page after it's
zeroed.  Then on the destination side we'll have a stale non-zero
page.  Is my understanding correct?  Will that be a problem to this
series too where a valid page can be possibly freed and hinted?

> 
> I haven't digged into the kernel patches yet so I have totally no idea
> on the detailed implementation of the page hinting.  Please feel free
> to correct me if there is obvious misunderstandings.
> 
> Regards,
> 
> -- 
> Peter Xu

-- 
Peter Xu



Re: [Qemu-devel] [PATCH v7 0/5] virtio-balloon: free page hint reporting support

2018-05-31 Thread Peter Xu
On Tue, Apr 24, 2018 at 02:13:43PM +0800, Wei Wang wrote:
> This is the deivce part implementation to add a new feature,
> VIRTIO_BALLOON_F_FREE_PAGE_HINT to the virtio-balloon device. The device
> receives the guest free page hints from the driver and clears the
> corresponding bits in the dirty bitmap, so that those free pages are
> not transferred by the migration thread to the destination.
> 
> - Test Environment
> Host: Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz
> Guest: 8G RAM, 4 vCPU
> Migration setup: migrate_set_speed 100G, migrate_set_downtime 2 second
> 
> - Test Results
> - Idle Guest Live Migration Time (results are averaged over 10 runs):
> - Optimization v.s. Legacy = 271ms vs 1769ms --> ~86% reduction
> - Guest with Linux Compilation Workload (make bzImage -j4):
> - Live Migration Time (average)
>   Optimization v.s. Legacy = 1265ms v.s. 2634ms --> ~51% reduction
> - Linux Compilation Time
>   Optimization v.s. Legacy = 4min56s v.s. 5min3s
>   --> no obvious difference
> 
> - Source Code
> - QEMU:  https://github.com/wei-w-wang/qemu-free-page-lm.git
> - Linux: https://github.com/wei-w-wang/linux-free-page-lm.git

Hi, Wei,

I have a very high-level question to the series.

IIUC the core idea for this series is that we can avoid sending some
of the pages if we know that we don't need to send them.  I think this
is based on the fact that on the destination side all the pages are by
default zero after they are malloced.  While before this series, IIUC
any migration will send every single page to destination, no matter
whether it's zeroed or not.  So I'm uncertain about whether this will
affect the received bitmap on the destination side.  Say, before this
series, the received bitmap will directly cover the whole RAM bitmap
after migration is finished, now it's won't.  Will there be any side
effect?  I don't see obvious issue now, but just raise this question
up.

Meanwhile, this reminds me about a more funny idea: whether we can
just avoid sending the zero pages directly from QEMU's perspective.
In other words, can we just do nothing if save_zero_page() detected
that the page is zero (I guess the is_zero_range() can be fast too,
but I don't know exactly how fast it is)?  And how that would be
differed from this page hinting way in either performance and other
aspects.

I haven't digged into the kernel patches yet so I have totally no idea
on the detailed implementation of the page hinting.  Please feel free
to correct me if there is obvious misunderstandings.

Regards,

-- 
Peter Xu



Re: [Qemu-devel] [PATCH v7 3/5] migration: API to clear bits of guest free pages from the dirty bitmap

2018-05-31 Thread Peter Xu
On Tue, Apr 24, 2018 at 02:13:46PM +0800, Wei Wang wrote:
> This patch adds an API to clear bits corresponding to guest free pages
> from the dirty bitmap. Spilt the free page block if it crosses the QEMU
> RAMBlock boundary.
> 
> Signed-off-by: Wei Wang 
> CC: Dr. David Alan Gilbert 
> CC: Juan Quintela 
> CC: Michael S. Tsirkin 
> ---
>  include/migration/misc.h |  2 ++
>  migration/ram.c  | 44 
>  2 files changed, 46 insertions(+)
> 
> diff --git a/include/migration/misc.h b/include/migration/misc.h
> index 4ebf24c..113320e 100644
> --- a/include/migration/misc.h
> +++ b/include/migration/misc.h
> @@ -14,11 +14,13 @@
>  #ifndef MIGRATION_MISC_H
>  #define MIGRATION_MISC_H
>  
> +#include "exec/cpu-common.h"
>  #include "qemu/notify.h"
>  
>  /* migration/ram.c */
>  
>  void ram_mig_init(void);
> +void qemu_guest_free_page_hint(void *addr, size_t len);
>  
>  /* migration/block.c */
>  
> diff --git a/migration/ram.c b/migration/ram.c
> index 9a72b1a..0147548 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -2198,6 +2198,50 @@ static int ram_init_all(RAMState **rsp)
>  }
>  
>  /*
> + * This function clears bits of the free pages reported by the caller from 
> the
> + * migration dirty bitmap. @addr is the host address corresponding to the
> + * start of the continuous guest free pages, and @len is the total bytes of
> + * those pages.
> + */
> +void qemu_guest_free_page_hint(void *addr, size_t len)
> +{
> +RAMBlock *block;
> +ram_addr_t offset;
> +size_t used_len, start, npages;

Do we need to check here on whether a migration is in progress?  Since
if not I'm not sure whether this hint still makes any sense any more,
and more importantly it seems to me that block->bmap below at [1] is
only valid during a migration.  So I'm not sure whether QEMU will
crash if this function is called without a running migration.

> +
> +for (; len > 0; len -= used_len) {
> +block = qemu_ram_block_from_host(addr, false, );
> +if (unlikely(!block)) {
> +return;

We should never reach here, should we?  Assuming the callers of this
function should always pass in a correct host address. If we are very
sure that the host addr should be valid, could we just assert?

> +}
> +
> +/*
> + * This handles the case that the RAMBlock is resized after the free
> + * page hint is reported.
> + */
> +if (unlikely(offset > block->used_length)) {
> +return;
> +}
> +
> +if (len <= block->used_length - offset) {
> +used_len = len;
> +} else {
> +used_len = block->used_length - offset;
> +addr += used_len;
> +}
> +
> +start = offset >> TARGET_PAGE_BITS;
> +npages = used_len >> TARGET_PAGE_BITS;
> +
> +qemu_mutex_lock(_state->bitmap_mutex);

So now I think I understand the lock can still be meaningful since
this function now can be called outside the migration thread (e.g., in
vcpu thread).  But still it would be nice to mention it somewhere on
the truth of the lock.

Regards,

> +ram_state->migration_dirty_pages -=
> +  bitmap_count_one_with_offset(block->bmap, start, 
> npages);
> +bitmap_clear(block->bmap, start, npages);

[1]

> +qemu_mutex_unlock(_state->bitmap_mutex);
> +}
> +}
> +
> +/*
>   * Each of ram_save_setup, ram_save_iterate and ram_save_complete has
>   * long-running RCU critical section.  When rcu-reclaims in the code
>   * start to become numerous it will be necessary to reduce the
> -- 
> 1.8.3.1
> 
> 

-- 
Peter Xu



Re: [Qemu-devel] [Qemu-block] [PATCH v2 8/8] block: Remove unused sector-based vectored I/O

2018-05-31 Thread Jeff Cody
On Thu, May 31, 2018 at 03:50:46PM -0500, Eric Blake wrote:
> We are gradually moving away from sector-based interfaces, towards
> byte-based.  Now that all callers of vectored I/O have been converted
> to use our preferred byte-based bdrv_co_p{read,write}v(), we can
> delete the unused bdrv_co_{read,write}v().
> 
> Furthermore, this gets rid of the signature difference between the
> public bdrv_co_writev() and the callback .bdrv_co_writev (the
> latter still exists, because some drivers still need more work
> before they are fully byte-based).
> 
> Signed-off-by: Eric Blake 
> Reviewed-by: Stefan Hajnoczi 

Reviewed-by: Jeff Cody 

> ---
> v2: commit typo fix [Kashyap]
> ---
>  include/block/block.h |  4 
>  block/io.c| 36 
>  2 files changed, 40 deletions(-)
> 
> diff --git a/include/block/block.h b/include/block/block.h
> index 3894edda9de..fe40d2929ac 100644
> --- a/include/block/block.h
> +++ b/include/block/block.h
> @@ -285,10 +285,6 @@ int bdrv_pwrite(BdrvChild *child, int64_t offset, const 
> void *buf, int bytes);
>  int bdrv_pwritev(BdrvChild *child, int64_t offset, QEMUIOVector *qiov);
>  int bdrv_pwrite_sync(BdrvChild *child, int64_t offset,
>   const void *buf, int count);
> -int coroutine_fn bdrv_co_readv(BdrvChild *child, int64_t sector_num,
> -   int nb_sectors, QEMUIOVector *qiov);
> -int coroutine_fn bdrv_co_writev(BdrvChild *child, int64_t sector_num,
> -   int nb_sectors, QEMUIOVector *qiov);
>  /*
>   * Efficiently zero a region of the disk image.  Note that this is a regular
>   * I/O request like read or write and should have a reasonable size.  This
> diff --git a/block/io.c b/block/io.c
> index ca96b487eb8..1d86bfc0072 100644
> --- a/block/io.c
> +++ b/block/io.c
> @@ -1341,24 +1341,6 @@ int coroutine_fn bdrv_co_preadv(BdrvChild *child,
>  return ret;
>  }
> 
> -static int coroutine_fn bdrv_co_do_readv(BdrvChild *child,
> -int64_t sector_num, int nb_sectors, QEMUIOVector *qiov,
> -BdrvRequestFlags flags)
> -{
> -if (nb_sectors < 0 || nb_sectors > BDRV_REQUEST_MAX_SECTORS) {
> -return -EINVAL;
> -}
> -
> -return bdrv_co_preadv(child, sector_num << BDRV_SECTOR_BITS,
> -  nb_sectors << BDRV_SECTOR_BITS, qiov, flags);
> -}
> -
> -int coroutine_fn bdrv_co_readv(BdrvChild *child, int64_t sector_num,
> -   int nb_sectors, QEMUIOVector *qiov)
> -{
> -return bdrv_co_do_readv(child, sector_num, nb_sectors, qiov, 0);
> -}
> -
>  static int coroutine_fn bdrv_co_do_pwrite_zeroes(BlockDriverState *bs,
>  int64_t offset, int bytes, BdrvRequestFlags flags)
>  {
> @@ -1801,24 +1783,6 @@ out:
>  return ret;
>  }
> 
> -static int coroutine_fn bdrv_co_do_writev(BdrvChild *child,
> -int64_t sector_num, int nb_sectors, QEMUIOVector *qiov,
> -BdrvRequestFlags flags)
> -{
> -if (nb_sectors < 0 || nb_sectors > BDRV_REQUEST_MAX_SECTORS) {
> -return -EINVAL;
> -}
> -
> -return bdrv_co_pwritev(child, sector_num << BDRV_SECTOR_BITS,
> -   nb_sectors << BDRV_SECTOR_BITS, qiov, flags);
> -}
> -
> -int coroutine_fn bdrv_co_writev(BdrvChild *child, int64_t sector_num,
> -int nb_sectors, QEMUIOVector *qiov)
> -{
> -return bdrv_co_do_writev(child, sector_num, nb_sectors, qiov, 0);
> -}
> -
>  int coroutine_fn bdrv_co_pwrite_zeroes(BdrvChild *child, int64_t offset,
> int bytes, BdrvRequestFlags flags)
>  {
> -- 
> 2.14.3
> 
> 



Re: [Qemu-devel] [Qemu-block] [PATCH v2 6/8] replication: Switch to byte-based calls

2018-05-31 Thread Jeff Cody
On Thu, May 31, 2018 at 03:50:44PM -0500, Eric Blake wrote:
> We are gradually moving away from sector-based interfaces, towards
> byte-based.  Make the change for the last few sector-based calls
> into the block layer from the replication driver.
> 
> Ideally, the replication driver should switch to doing everything
> byte-based, but that's a more invasive change that requires a
> bit more auditing.
> 
> Signed-off-by: Eric Blake 

Reviewed-by: Jeff Cody 

> ---
>  block/replication.c | 14 --
>  1 file changed, 8 insertions(+), 6 deletions(-)
> 
> diff --git a/block/replication.c b/block/replication.c
> index 826db7b3049..6349d6958e4 100644
> --- a/block/replication.c
> +++ b/block/replication.c
> @@ -246,13 +246,14 @@ static coroutine_fn int 
> replication_co_readv(BlockDriverState *bs,
>  backup_cow_request_begin(, child->bs->job,
>   sector_num * BDRV_SECTOR_SIZE,
>   remaining_bytes);
> -ret = bdrv_co_readv(bs->file, sector_num, remaining_sectors,
> -qiov);
> +ret = bdrv_co_preadv(bs->file, sector_num * BDRV_SECTOR_SIZE,
> + remaining_bytes, qiov, 0);
>  backup_cow_request_end();
>  goto out;
>  }
> 
> -ret = bdrv_co_readv(bs->file, sector_num, remaining_sectors, qiov);
> +ret = bdrv_co_preadv(bs->file, sector_num * BDRV_SECTOR_SIZE,
> + remaining_sectors * BDRV_SECTOR_SIZE, qiov, 0);
>  out:
>  return replication_return_value(s, ret);
>  }
> @@ -279,8 +280,8 @@ static coroutine_fn int 
> replication_co_writev(BlockDriverState *bs,
>  }
> 
>  if (ret == 0) {
> -ret = bdrv_co_writev(top, sector_num,
> - remaining_sectors, qiov);
> +ret = bdrv_co_pwritev(top, sector_num * BDRV_SECTOR_SIZE,
> +  remaining_sectors * BDRV_SECTOR_SIZE, qiov, 0);
>  return replication_return_value(s, ret);
>  }
> 
> @@ -306,7 +307,8 @@ static coroutine_fn int 
> replication_co_writev(BlockDriverState *bs,
>  qemu_iovec_concat(_qiov, qiov, bytes_done, count);
> 
>  target = ret ? top : base;
> -ret = bdrv_co_writev(target, sector_num, n, _qiov);
> +ret = bdrv_co_pwritev(target, sector_num * BDRV_SECTOR_SIZE,
> +  n * BDRV_SECTOR_SIZE, _qiov, 0);
>  if (ret < 0) {
>  goto out1;
>  }
> -- 
> 2.14.3
> 
> 



[Qemu-devel] vIOMMU Posted-interrupt implementation - atomic operation?

2018-05-31 Thread Jintack Lim
Hi,

I'm implementing Posted-interrupt functionality in vIOMMU. According
to Vt-d spec 5.2.3, IOMMU performs a coherent atomic read-modify-write
operation of the posted-interrupt descriptor. I wonder how can we
achieve this considering the guest can modify the same
posted-interrupt descriptor anytime. Is there any existing mechanism
that I can use in QEMU?

Thanks,
Jintack




Re: [Qemu-devel] [PATCH v7 2/5] migration: use bitmap_mutex in migration_bitmap_clear_dirty

2018-05-31 Thread Peter Xu
On Tue, Apr 24, 2018 at 02:13:45PM +0800, Wei Wang wrote:
> The bitmap mutex is used to synchronize threads to update the dirty
> bitmap and the migration_dirty_pages counter. This patch makes
> migration_bitmap_clear_dirty update the bitmap and counter under the
> mutex.
> 
> Signed-off-by: Wei Wang 
> CC: Dr. David Alan Gilbert 
> CC: Juan Quintela 
> CC: Michael S. Tsirkin 
> ---
>  migration/ram.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/migration/ram.c b/migration/ram.c
> index 0e90efa..9a72b1a 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -795,11 +795,14 @@ static inline bool 
> migration_bitmap_clear_dirty(RAMState *rs,
>  {
>  bool ret;
>  
> +qemu_mutex_lock(>bitmap_mutex);
>  ret = test_and_clear_bit(page, rb->bmap);
>  
>  if (ret) {
>  rs->migration_dirty_pages--;
>  }
> +qemu_mutex_unlock(>bitmap_mutex);
> +
>  return ret;
>  }
>  
> -- 
> 1.8.3.1
> 
> 

Do we need the lock after all?

I see that we introduced this lock due to device hotplug at dd63169766
("migration: extend migration_bitmap", 2015-07-07), however now we
actually don't allow that to happen any more after commit b06424de62
("migration: Disable hotplug/unplug during migration", 2017-04-21).
I'm not sure whether it means that the lock is not needed now.

If so, this patch seems to be unecessary.

Regards,

-- 
Peter Xu



Re: [Qemu-devel] [PATCH 2/3] pc-bios/s390-ccw/net: Add support for pxelinux-style config files

2018-05-31 Thread Thomas Huth
On 31.05.2018 23:25, Farhan Ali wrote:
> 
> 
> On 05/30/2018 05:16 AM, Thomas Huth wrote:
>> Since it is quite cumbersome to manually create a combined kernel with
>> initrd image for network booting, we now support loading via pxelinux
>> configuration files, too. In these files, the kernel, initrd and command
>> line parameters can be specified seperately, and the firmware then takes
>> care of glueing everything together in memory after the files have been
>> downloaded. See this URL for details about the config file layout:
>> https://www.syslinux.org/wiki/index.php?title=PXELINUX
>>
>> The user can either specify a config file directly as bootfile via DHCP
>> (but in this case, the file has to start either with "default" or a "#"
>> comment so we can distinguish it from binary kernels), or a folder (i.e.
>> the bootfile name must end with "/") where the firmware should look for
>> the typical pxelinux.cfg file names, e.g. based on MAC or IP address.
>> We also support the pxelinux.cfg DHCP options 209 and 210 from RFC 5071.
>>
>> Signed-off-by: Thomas Huth 
>> ---
>>   pc-bios/s390-ccw/netboot.mak |  7 ++--
>>   pc-bios/s390-ccw/netmain.c   | 79
>> +++-
>>   2 files changed, 82 insertions(+), 4 deletions(-)
>>
>> diff --git a/pc-bios/s390-ccw/netboot.mak b/pc-bios/s390-ccw/netboot.mak
>> index a73be36..8af0cfd 100644
>> --- a/pc-bios/s390-ccw/netboot.mak
>> +++ b/pc-bios/s390-ccw/netboot.mak
>> @@ -25,8 +25,9 @@ CTYPE_OBJS = isdigit.o isxdigit.o toupper.o
>>   %.o : $(SLOF_DIR)/lib/libc/ctype/%.c
>>   $(call quiet-command,$(CC) $(LIBC_CFLAGS) -c -o $@
>> $<,"CC","$(TARGET_DIR)$@")
>>   -STRING_OBJS = strcat.o strchr.o strcmp.o strcpy.o strlen.o
>> strncmp.o strncpy.o \
>> -  strstr.o memset.o memcpy.o memmove.o memcmp.o
>> +STRING_OBJS = strcat.o strchr.o strrchr.o strcpy.o strlen.o strncpy.o \
>> +  strcmp.o strncmp.o strcasecmp.o strncasecmp.o strstr.o \
>> +  memset.o memcpy.o memmove.o memcmp.o
>>   %.o : $(SLOF_DIR)/lib/libc/string/%.c
>>   $(call quiet-command,$(CC) $(LIBC_CFLAGS) -c -o $@
>> $<,"CC","$(TARGET_DIR)$@")
>>   @@ -50,7 +51,7 @@ libc.a: $(LIBCOBJS)
>>   # libnet files:
>>     LIBNETOBJS := args.o dhcp.o dns.o icmpv6.o ipv6.o tcp.o udp.o
>> bootp.o \
>> -  dhcpv6.o ethernet.o ipv4.o ndp.o tftp.o
>> +  dhcpv6.o ethernet.o ipv4.o ndp.o tftp.o pxelinux.o
>>   LIBNETCFLAGS := $(QEMU_CFLAGS) -DDHCPARCH=0x1F $(LIBC_INC)
>> $(LIBNET_INC)
>>     %.o : $(SLOF_DIR)/lib/libnet/%.c
>> diff --git a/pc-bios/s390-ccw/netmain.c b/pc-bios/s390-ccw/netmain.c
>> index 7533cf7..e84bb2b 100644
>> --- a/pc-bios/s390-ccw/netmain.c
>> +++ b/pc-bios/s390-ccw/netmain.c
>> @@ -30,6 +30,7 @@
>>   #include 
>>   #include 
>>   #include 
>> +#include 
>>     #include "s390-ccw.h"
>>   #include "virtio.h"
>> @@ -41,12 +42,14 @@ extern char _start[];
>>     #define KERNEL_ADDR ((void *)0L)
>>   #define KERNEL_MAX_SIZE ((long)_start)
>> +#define ARCH_COMMAND_LINE_SIZE  896  /* Taken from Linux
>> kernel */
>>     char stack[PAGE_SIZE * 8] __attribute__((aligned(PAGE_SIZE)));
>>   IplParameterBlock iplb __attribute__((aligned(PAGE_SIZE)));
>>   static char cfgbuf[2048];
>>     static SubChannelId net_schid = { .one = 1 };
>> +static uint8_t mac[6];
>>   static uint64_t dest_timer;
>>     static uint64_t get_timer_ms(void)
>> @@ -158,7 +161,6 @@ static int tftp_load(filename_ip_t *fnip, void
>> *buffer, int len)
>>     static int net_init(filename_ip_t *fn_ip)
>>   {
>> -    uint8_t mac[6];
>>   int rc;
>>     memset(fn_ip, 0, sizeof(filename_ip_t));
>> @@ -233,6 +235,66 @@ static void net_release(filename_ip_t *fn_ip)
>>   }
>>     /**
>> + * Load a kernel with initrd (i.e. with the information that we've
>> got from
>> + * a pxelinux.cfg config file)
>> + */
>> +static int load_kernel_with_initrd(filename_ip_t *fn_ip,
>> +   struct pl_cfg_entry *entry)
>> +{
>> +    int rc;
>> +
>> +    printf("Loading pxelinux.cfg entry '%s'\n", entry->label);
>> +
>> +    if (!entry->kernel) {
>> +    printf("Kernel entry is missing!\n");
>> +    return -1;
>> +    }
>> +
>> +    strncpy(fn_ip->filename, entry->kernel, sizeof(fn_ip->filename));
>> +    rc = tftp_load(fn_ip, KERNEL_ADDR, KERNEL_MAX_SIZE);
>> +    if (rc < 0) {
>> +    return rc;
>> +    }
>> +
>> +    if (entry->initrd) {
>> +    uint64_t iaddr = (rc + 0xfff) & ~0xfffUL;
>> +
>> +    strncpy(fn_ip->filename, entry->initrd,
>> sizeof(fn_ip->filename));
>> +    rc = tftp_load(fn_ip, (void *)iaddr, KERNEL_MAX_SIZE - iaddr);
>> +    if (rc < 0) {
>> +    return rc;
>> +    }
>> +    /* Patch location and size: */
>> +    *(uint64_t *)0x10408 = iaddr;
>> +    *(uint64_t *)0x10410 = rc;
>> +    rc += iaddr;
>> +    }
>> +
>> +    if (entry->append) {
>> +    strncpy((char *)0x10480, entry->append, ARCH_COMMAND_LINE_SIZE);
>> +    }
>> +
>> +    return 

Re: [Qemu-devel] [PATCH RFC] qemu-error: introduce error_report_once

2018-05-31 Thread Peter Xu
On Thu, May 31, 2018 at 12:03:17PM +0100, Stefan Hajnoczi wrote:
> On Tue, May 15, 2018 at 05:13:56PM +0800, Peter Xu wrote:
> > I stole the printk_once() macro.
> > 
> > I always wanted to be able to print some error directly if there is a
> > buffer to dump, however we can't use error_report() really quite often
> > when there can be any DDOS attack.  To avoid that, we can introduce a
> > print-once function for it.
> 
> I like the idea.  It solves the problem of guest-triggerable error
> messages that we should know about for troubleshooting but can't due to
> DoS.
> 
> Stefan

Thanks for the positive feedback.  Please feel free to have a look on
the latest version:

Subject: [PATCH v4 0/2] error-report: introduce {error|warn}_report_once
Message-Id: <20180524044454.11792-1-pet...@redhat.com>

Regards,

-- 
Peter Xu



Re: [Qemu-devel] [PATCH v7 4/5] virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT

2018-05-31 Thread Wei Wang

On 06/01/2018 01:42 AM, Michael S. Tsirkin wrote:

On Thu, May 31, 2018 at 10:27:00AM +0800, Wei Wang wrote:

On 05/30/2018 08:47 PM, Michael S. Tsirkin wrote:

On Wed, May 30, 2018 at 05:12:09PM +0800, Wei Wang wrote:

On 05/29/2018 11:24 PM, Michael S. Tsirkin wrote:

On Tue, Apr 24, 2018 at 02:13:47PM +0800, Wei Wang wrote:

+/*
+ * Balloon will report pages which were free at the time of this call. As the
+ * reporting happens asynchronously, dirty bit logging must be enabled before
+ * this call is made.
+ */
+void balloon_free_page_start(void)
+{
+balloon_free_page_start_fn(balloon_opaque);
+}

Please create notifier support, not a single global.

OK. The start is called at the end of bitmap_sync, and the stop is called at
the beginning of bitmap_sync. In this case, we will need to add two
migration states, MIGRATION_STATUS_BEFORE_BITMAP_SYNC and
MIGRATION_STATUS_AFTER_BITMAP_SYNC, right?

If that's the way you do it, you need to ask migration guys, not me.

Yeah, I know.. thanks for the virtio part.


+
+static bool virtio_balloon_free_page_support(void *opaque)
+{
+VirtIOBalloon *s = opaque;
+VirtIODevice *vdev = VIRTIO_DEVICE(s);
+
+return virtio_vdev_has_feature(vdev, VIRTIO_BALLOON_F_FREE_PAGE_HINT);
or if poison is negotiated.

Will make it
return virtio_vdev_has_feature(vdev, VIRTIO_BALLOON_F_FREE_PAGE_HINT) &&
!virtio_vdev_has_feature(vdev, VIRTIO_BALLOON_F_PAGE_POISON)

I mean the reverse:
virtio_vdev_has_feature(vdev, VIRTIO_BALLOON_F_FREE_PAGE_HINT) ||
virtio_vdev_has_feature(vdev, VIRTIO_BALLOON_F_PAGE_POISON)


If poison has been negotiated you must migrate the
guest supplied value even if you don't use it for hints.


Just a little confused with the logic. Writing it that way means that we are
taking this possibility "virtio_vdev_has_feature(vdev,
VIRTIO_BALLOON_F_FREE_PAGE_HINT)=fasle, virtio_vdev_has_feature(vdev,
VIRTIO_BALLOON_F_PAGE_POISON)=true" into account, and let the support
function return true when F_FREE_PAGE_HINT isn't supported.

All I am saying is that in this configuration, you must migrate
the poison value programmed by guest even if you do not
yet use it without VIRTIO_BALLOON_F_FREE_PAGE_HINT.

Right now you have
a section:
+.needed = virtio_balloon_free_page_support,

which includes the poison value.

So if guest migrates after writing the poison value,
it's lost. Not nice.


If guest doesn't support F_FREE_PAGE_HINT, it doesn't support the free page
reporting (even the free page vq). I'm not sure why we tell the migration
thread that the free page reporting feature is supported via this support
function. If the support function simply returns false when F_FREE_PAGE_HINT
isn't negotiated, the legacy migration already migrates the poisoned pages
(not skipped, but may be compressed).

I think it would be better to simply use the original "return
virtio_vdev_has_feature(vdev, VIRTIO_BALLOON_F_FREE_PAGE_HINT)" here.


So maybe you should put the poison value in a separate section then.


Yes, that looks good to me, thanks.

Best,
Wei




Re: [Qemu-devel] [PATCH v2 7/8] vhdx: Switch to byte-based calls

2018-05-31 Thread Jeff Cody
On Thu, May 31, 2018 at 03:50:45PM -0500, Eric Blake wrote:
> We are gradually moving away from sector-based interfaces, towards
> byte-based.  Make the change for the last few sector-based calls
> into the block layer from the vhdx driver.
> 
> Ideally, the vhdx driver should switch to doing everything
> byte-based, but that's a more invasive change that requires a
> bit more auditing.
> 
> Signed-off-by: Eric Blake 

Reviewed-by: Jeff Cody 

> ---
>  block/vhdx.c | 12 ++--
>  1 file changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/block/vhdx.c b/block/vhdx.c
> index 0b1e21c7501..295d3276120 100644
> --- a/block/vhdx.c
> +++ b/block/vhdx.c
> @@ -1126,9 +1126,9 @@ static coroutine_fn int vhdx_co_readv(BlockDriverState 
> *bs, int64_t sector_num,
>  break;
>  case PAYLOAD_BLOCK_FULLY_PRESENT:
>  qemu_co_mutex_unlock(>lock);
> -ret = bdrv_co_readv(bs->file,
> -sinfo.file_offset >> BDRV_SECTOR_BITS,
> -sinfo.sectors_avail, _qiov);
> +ret = bdrv_co_preadv(bs->file, sinfo.file_offset,
> + sinfo.sectors_avail * BDRV_SECTOR_SIZE,
> + _qiov, 0);
>  qemu_co_mutex_lock(>lock);
>  if (ret < 0) {
>  goto exit;
> @@ -1348,9 +1348,9 @@ static coroutine_fn int vhdx_co_writev(BlockDriverState 
> *bs, int64_t sector_num,
>  }
>  /* block exists, so we can just overwrite it */
>  qemu_co_mutex_unlock(>lock);
> -ret = bdrv_co_writev(bs->file,
> -sinfo.file_offset >> BDRV_SECTOR_BITS,
> -sectors_to_write, _qiov);
> +ret = bdrv_co_pwritev(bs->file, sinfo.file_offset,
> +  sectors_to_write * BDRV_SECTOR_SIZE,
> +  _qiov, 0);
>  qemu_co_mutex_lock(>lock);
>  if (ret < 0) {
>  goto error_bat_restore;
> -- 
> 2.14.3
> 



[Qemu-devel] [PATCH] docker: Update fedora image to 28

2018-05-31 Thread Fam Zheng
Signed-off-by: Fam Zheng 
---
 tests/docker/dockerfiles/fedora.docker | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tests/docker/dockerfiles/fedora.docker 
b/tests/docker/dockerfiles/fedora.docker
index b706f42405..65d7761cf5 100644
--- a/tests/docker/dockerfiles/fedora.docker
+++ b/tests/docker/dockerfiles/fedora.docker
@@ -1,4 +1,4 @@
-FROM fedora:27
+FROM fedora:28
 ENV PACKAGES \
 ccache gettext git tar PyYAML sparse flex bison python3 bzip2 hostname \
 gcc gcc-c++ llvm clang make perl which bc findutils glib2-devel \
-- 
2.14.3




Re: [Qemu-devel] [PATCH v7 00/10] qemu-img convert with copy offloading

2018-05-31 Thread Fam Zheng
On Wed, 05/30 17:06, Stefan Hajnoczi wrote:
> On Tue, May 29, 2018 at 01:59:49PM +0800, Fam Zheng wrote:
> > v7: Fix qcow2.
> > 
> > v6: Pick up rev-by from Stefan and Eric.
> > Tweak patch 2 commit message.
> > 
> > v5: - Fix raw offset/bytes check for read. [Eric]
> > - Fix qcow2_handle_l2meta. [Stefan]
> > - Add coroutine_fn whereever appropriate. [Stefan]
> > 
> > v4: - Fix raw offset and size. [Eric]
> > - iscsi: Drop unnecessary return values and variables in favor of
> >   constants. [Stefan]
> > - qcow2: Handle small backing case. [Stefan]
> > - file-posix: Translate ENOSYS to ENOTSUP. [Stefan]
> > - API documentation and commit message. [Stefan]
> > - Add rev-by to patches 3, 5 - 10. [Stefan, Eric]
> > 
> > This series introduces block layer API for copy offloading and makes use of 
> > it
> > in qemu-img convert.
> > 
> > For now we implemented the operation in local file protocol with
> > copy_file_range(2).  Besides that it's possible to add similar to iscsi, nfs
> > and potentially more.
> > 
> > As far as its usage goes, in addition to qemu-img convert, we can emulate
> > offloading in scsi-disk (handle EXTENDED COPY command), and use the API in
> > block jobs too.
> 
> Fails to compile on Fedora 28.
> 
> Oops, I guess my glibc is too new.  This type of bug doesn't happen very
> often :D :D :D.
> 
> block/file-posix.c:1452:14: error: static declaration of ‘copy_file_range’ 
> follows non-static declaration
>  static off_t copy_file_range(int in_fd, off_t *in_off, int out_fd,
>   ^~~
> In file included from /home/stefanha/qemu/include/qemu/osdep.h:75,
>  from block/file-posix.c:25:
> /usr/include/unistd.h:1107:9: note: previous declaration of ‘copy_file_range’ 
> was here
>  ssize_t copy_file_range (int __infd, __off64_t *__pinoff,
>  ^~~

It means it's time to update our Fedora docker image! Will fix this by adding a
configure test.

Fam



Re: [Qemu-devel] [Qemu-block] [PATCH v2 1/8] parallels: Switch to byte-based calls

2018-05-31 Thread Jeff Cody
On Thu, May 31, 2018 at 03:50:39PM -0500, Eric Blake wrote:
> We are gradually moving away from sector-based interfaces, towards
> byte-based.  Make the change for the last few sector-based calls
> into the block layer from the parallels driver.
> 
> Ideally, the parallels driver should switch to doing everything
> byte-based, but that's a more invasive change that requires a
> bit more auditing.
> 
> Signed-off-by: Eric Blake 
> Reviewed-by: Stefan Hajnoczi 
> Reviewed-by: Denis V. Lunev 

Reviewed-by: Jeff Cody 

> ---
>  block/parallels.c | 16 ++--
>  1 file changed, 10 insertions(+), 6 deletions(-)
> 
> diff --git a/block/parallels.c b/block/parallels.c
> index 6e9c37f44e1..a761c896d35 100644
> --- a/block/parallels.c
> +++ b/block/parallels.c
> @@ -226,14 +226,15 @@ static int64_t allocate_clusters(BlockDriverState *bs, 
> int64_t sector_num,
>  };
>  qemu_iovec_init_external(, , 1);
> 
> -ret = bdrv_co_readv(bs->backing, idx * s->tracks, nb_cow_sectors,
> -);
> +ret = bdrv_co_preadv(bs->backing, idx * s->tracks * BDRV_SECTOR_SIZE,
> + nb_cow_bytes, , 0);
>  if (ret < 0) {
>  qemu_vfree(iov.iov_base);
>  return ret;
>  }
> 
> -ret = bdrv_co_writev(bs->file, s->data_end, nb_cow_sectors, );
> +ret = bdrv_co_pwritev(bs->file, s->data_end * BDRV_SECTOR_SIZE,
> +  nb_cow_bytes, , 0);
>  qemu_vfree(iov.iov_base);
>  if (ret < 0) {
>  return ret;
> @@ -339,7 +340,8 @@ static coroutine_fn int 
> parallels_co_writev(BlockDriverState *bs,
>  qemu_iovec_reset(_qiov);
>  qemu_iovec_concat(_qiov, qiov, bytes_done, nbytes);
> 
> -ret = bdrv_co_writev(bs->file, position, n, _qiov);
> +ret = bdrv_co_pwritev(bs->file, position * BDRV_SECTOR_SIZE, nbytes,
> +  _qiov, 0);
>  if (ret < 0) {
>  break;
>  }
> @@ -378,7 +380,8 @@ static coroutine_fn int 
> parallels_co_readv(BlockDriverState *bs,
> 
>  if (position < 0) {
>  if (bs->backing) {
> -ret = bdrv_co_readv(bs->backing, sector_num, n, _qiov);
> +ret = bdrv_co_preadv(bs->backing, sector_num * 
> BDRV_SECTOR_SIZE,
> + nbytes, _qiov, 0);
>  if (ret < 0) {
>  break;
>  }
> @@ -386,7 +389,8 @@ static coroutine_fn int 
> parallels_co_readv(BlockDriverState *bs,
>  qemu_iovec_memset(_qiov, 0, 0, nbytes);
>  }
>  } else {
> -ret = bdrv_co_readv(bs->file, position, n, _qiov);
> +ret = bdrv_co_preadv(bs->file, position * BDRV_SECTOR_SIZE, 
> nbytes,
> + _qiov, 0);
>  if (ret < 0) {
>  break;
>  }
> -- 
> 2.14.3
> 
> 



[Qemu-devel] [PATCH v2 16/20] 9p: darwin: Compatibility for f/l*xattr

2018-05-31 Thread Keno Fischer
On darwin `fgetxattr` takes two extra optional arguments,
and the l* variants are not defined (in favor of an extra
flag to the regular variants.

Signed-off-by: Keno Fischer 
---

Changes from v1: New patch, qemu_fgetxattr had previously been
 moved to 9p-util as fgetxattr_follow. The remaining functions
 are required by the proxy-helper.

 Makefile|  6 ++
 fsdev/virtfs-proxy-helper.c |  9 +
 hw/9pfs/9p-local.c  | 12 
 hw/9pfs/9p-util.h   | 17 +
 4 files changed, 36 insertions(+), 8 deletions(-)

diff --git a/Makefile b/Makefile
index 6d588d1..dac6efd 100644
--- a/Makefile
+++ b/Makefile
@@ -544,7 +544,13 @@ qemu-bridge-helper$(EXESUF): qemu-bridge-helper.o 
$(COMMON_LDADDS)
 qemu-keymap$(EXESUF): qemu-keymap.o ui/input-keymap.o $(COMMON_LDADDS)
 
 fsdev/virtfs-proxy-helper$(EXESUF): fsdev/virtfs-proxy-helper.o 
fsdev/9p-marshal.o fsdev/9p-iov-marshal.o $(COMMON_LDADDS)
+ifdef CONFIG_DARWIN
+fsdev/virtfs-proxy-helper$(EXESUF): hw/9pfs/9p-util-darwin.o
+endif
+ifdef CONFIG_LINUX
+fsdev/virtfs-proxy-helper$(EXESUF): hw/9pfs/9p-util-linux.o
 fsdev/virtfs-proxy-helper$(EXESUF): LIBS += -lcap
+endif
 
 scsi/qemu-pr-helper$(EXESUF): scsi/qemu-pr-helper.o scsi/utils.o 
$(crypto-obj-y) $(io-obj-y) $(qom-obj-y) $(COMMON_LDADDS)
 ifdef CONFIG_MPATH
diff --git a/fsdev/virtfs-proxy-helper.c b/fsdev/virtfs-proxy-helper.c
index 3bc1269..a26f8b8 100644
--- a/fsdev/virtfs-proxy-helper.c
+++ b/fsdev/virtfs-proxy-helper.c
@@ -28,6 +28,7 @@
 #include "qemu/statfs.h"
 #include "9p-iov-marshal.h"
 #include "hw/9pfs/9p-proxy.h"
+#include "hw/9pfs/9p-util.h"
 #include "fsdev/9p-iov-marshal.h"
 
 #define PROGNAME "virtfs-proxy-helper"
@@ -459,7 +460,7 @@ static int do_getxattr(int type, struct iovec *iovec, 
struct iovec *out_iovec)
 v9fs_string_init();
 retval = proxy_unmarshal(iovec, offset, "s", );
 if (retval > 0) {
-retval = lgetxattr(path.data, name.data, xattr.data, size);
+retval = qemu_lgetxattr(path.data, name.data, xattr.data, size);
 if (retval < 0) {
 retval = -errno;
 } else {
@@ -469,7 +470,7 @@ static int do_getxattr(int type, struct iovec *iovec, 
struct iovec *out_iovec)
 v9fs_string_free();
 break;
 case T_LLISTXATTR:
-retval = llistxattr(path.data, xattr.data, size);
+retval = qemu_llistxattr(path.data, xattr.data, size);
 if (retval < 0) {
 retval = -errno;
 } else {
@@ -1000,7 +1001,7 @@ static int process_requests(int sock)
 retval = proxy_unmarshal(_iovec, PROXY_HDR_SZ, "sssdd", ,
  , , , );
 if (retval > 0) {
-retval = lsetxattr(path.data,
+retval = qemu_lsetxattr(path.data,
name.data, value.data, size, flags);
 if (retval < 0) {
 retval = -errno;
@@ -1016,7 +1017,7 @@ static int process_requests(int sock)
 retval = proxy_unmarshal(_iovec,
  PROXY_HDR_SZ, "ss", , );
 if (retval > 0) {
-retval = lremovexattr(path.data, name.data);
+retval = qemu_lremovexattr(path.data, name.data);
 if (retval < 0) {
 retval = -errno;
 }
diff --git a/hw/9pfs/9p-local.c b/hw/9pfs/9p-local.c
index 1f0b1b0..7830526 100644
--- a/hw/9pfs/9p-local.c
+++ b/hw/9pfs/9p-local.c
@@ -776,16 +776,20 @@ static int local_fstat(FsContext *fs_ctx, int fid_type,
 mode_t tmp_mode;
 dev_t tmp_dev;
 
-if (fgetxattr(fd, "user.virtfs.uid", _uid, sizeof(uid_t)) > 0) {
+if (qemu_fgetxattr(fd, "user.virtfs.uid",
+   _uid, sizeof(uid_t)) > 0) {
 stbuf->st_uid = le32_to_cpu(tmp_uid);
 }
-if (fgetxattr(fd, "user.virtfs.gid", _gid, sizeof(gid_t)) > 0) {
+if (qemu_fgetxattr(fd, "user.virtfs.gid",
+   _gid, sizeof(gid_t)) > 0) {
 stbuf->st_gid = le32_to_cpu(tmp_gid);
 }
-if (fgetxattr(fd, "user.virtfs.mode", _mode, sizeof(mode_t)) > 0) {
+if (qemu_fgetxattr(fd, "user.virtfs.mode",
+   _mode, sizeof(mode_t)) > 0) {
 stbuf->st_mode = le32_to_cpu(tmp_mode);
 }
-if (fgetxattr(fd, "user.virtfs.rdev", _dev, sizeof(dev_t)) > 0) {
+if (qemu_fgetxattr(fd, "user.virtfs.rdev",
+   _dev, sizeof(dev_t)) > 0) {
 stbuf->st_rdev = le64_to_cpu(tmp_dev);
 }
 } else if (fs_ctx->export_flags & V9FS_SM_MAPPED_FILE) {
diff --git a/hw/9pfs/9p-util.h b/hw/9pfs/9p-util.h
index 79ed6b2..50a03c7 100644
--- a/hw/9pfs/9p-util.h
+++ b/hw/9pfs/9p-util.h
@@ -19,6 +19,23 @@
 #define O_PATH_9P_UTIL 0
 #endif
 
+#ifdef CONFIG_DARWIN
+#define qemu_fgetxattr(...) fgetxattr(__VA_ARGS__, 0, 0)
+#define 

[Qemu-devel] [PATCH v2 20/20] 9p: darwin: configure: Allow VirtFS on Darwin

2018-05-31 Thread Keno Fischer
Signed-off-by: Keno Fischer 
---

Changes from v1: Now builds the proxy-helper on Darwin.

 Makefile.objs |  1 +
 configure | 22 +++---
 2 files changed, 16 insertions(+), 7 deletions(-)

diff --git a/Makefile.objs b/Makefile.objs
index c6c3554..a2245c9 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -104,6 +104,7 @@ common-obj-$(CONFIG_WIN32) += os-win32.o
 common-obj-$(CONFIG_POSIX) += os-posix.o
 
 common-obj-$(CONFIG_LINUX) += fsdev/
+common-obj-$(CONFIG_DARWIN) += fsdev/
 
 common-obj-y += migration/
 
diff --git a/configure b/configure
index a6a4616..4808459 100755
--- a/configure
+++ b/configure
@@ -5535,16 +5535,28 @@ if test "$want_tools" = "yes" ; then
   fi
 fi
 if test "$softmmu" = yes ; then
-  if test "$linux" = yes; then
-if test "$virtfs" != no && test "$cap" = yes && test "$attr" = yes ; then
+  if test "$virtfs" != no; then
+if test "$linux" = yes; then
+  if test "$cap" = yes && test "$attr" = yes ; then
+virtfs=yes
+tools="$tools fsdev/virtfs-proxy-helper\$(EXESUF)"
+  else
+if test "$virtfs" = yes; then
+  error_exit "VirtFS requires libcap devel and libattr devel under 
Linux"
+fi
+virtfs=no
+  fi
+elif test "$darwin" = yes; then
   virtfs=yes
   tools="$tools fsdev/virtfs-proxy-helper\$(EXESUF)"
 else
   if test "$virtfs" = yes; then
-error_exit "VirtFS requires libcap devel and libattr devel"
+error_exit "VirtFS is supported only on Linux and Darwin"
   fi
   virtfs=no
 fi
+  fi
+  if test "$linux" = yes; then
 if test "$mpath" != no && test "$mpathpersist" = yes ; then
   mpath=yes
 else
@@ -,10 +5567,6 @@ if test "$softmmu" = yes ; then
 fi
 tools="$tools scsi/qemu-pr-helper\$(EXESUF)"
   else
-if test "$virtfs" = yes; then
-  error_exit "VirtFS is supported only on Linux"
-fi
-virtfs=no
 if test "$mpath" = yes; then
   error_exit "Multipath is supported only on Linux"
 fi
-- 
2.8.1




[Qemu-devel] [PATCH v2 18/20] 9p: darwin: Implement compatibility for mknodat

2018-05-31 Thread Keno Fischer
Darwin does not support mknodat. However, to avoid race conditions
with later setting the permissions, we must avoid using mknod on
the full path instead. We could try to fchdir, but that would cause
problems if multiple threads try to call mknodat at the same time.
However, luckily there is a solution: Darwin as an (unexposed in the
C library) system call that sets the cwd for the current thread only.
This should suffice to use mknod safely.

Signed-off-by: Keno Fischer 
---

Changes from v1: New patch. The previous series marked mknodat unsupported.

 hw/9pfs/9p-local.c   |  5 +++--
 hw/9pfs/9p-util-darwin.c | 25 +
 hw/9pfs/9p-util-linux.c  |  5 +
 hw/9pfs/9p-util.h|  2 ++
 4 files changed, 35 insertions(+), 2 deletions(-)

diff --git a/hw/9pfs/9p-local.c b/hw/9pfs/9p-local.c
index 47e8580..c7a2b08 100644
--- a/hw/9pfs/9p-local.c
+++ b/hw/9pfs/9p-local.c
@@ -668,7 +668,7 @@ static int local_mknod(FsContext *fs_ctx, V9fsPath 
*dir_path,
 
 if (fs_ctx->export_flags & V9FS_SM_MAPPED ||
 fs_ctx->export_flags & V9FS_SM_MAPPED_FILE) {
-err = mknodat(dirfd, name, fs_ctx->fmode | S_IFREG, 0);
+err = qemu_mknodat(dirfd, name, fs_ctx->fmode | S_IFREG, 0);
 if (err == -1) {
 goto out;
 }
@@ -683,7 +683,7 @@ static int local_mknod(FsContext *fs_ctx, V9fsPath 
*dir_path,
 }
 } else if (fs_ctx->export_flags & V9FS_SM_PASSTHROUGH ||
fs_ctx->export_flags & V9FS_SM_NONE) {
-err = mknodat(dirfd, name, credp->fc_mode, credp->fc_rdev);
+err = qemu_mknodat(dirfd, name, credp->fc_mode, credp->fc_rdev);
 if (err == -1) {
 goto out;
 }
@@ -696,6 +696,7 @@ static int local_mknod(FsContext *fs_ctx, V9fsPath 
*dir_path,
 
 err_end:
 unlinkat_preserve_errno(dirfd, name, 0);
+
 out:
 close_preserve_errno(dirfd);
 return err;
diff --git a/hw/9pfs/9p-util-darwin.c b/hw/9pfs/9p-util-darwin.c
index ac414bc..49fe7d3 100644
--- a/hw/9pfs/9p-util-darwin.c
+++ b/hw/9pfs/9p-util-darwin.c
@@ -158,3 +158,28 @@ done:
 close_preserve_errno(fd);
 return ret;
 }
+
+#ifndef SYS___pthread_fchdir
+# define SYS___pthread_fchdir 349
+#endif
+
+static int fchdir_thread_local(int fd)
+{
+return syscall(SYS___pthread_fchdir, fd);
+}
+
+int qemu_mknodat(int dirfd, const char *filename, mode_t mode, dev_t dev)
+{
+int preserved_errno, err;
+if (fchdir_thread_local(dirfd) < 0) {
+return -1;
+}
+err = mknod(filename, mode, dev);
+preserved_errno = errno;
+/* Stop using the thread-local cwd */
+fchdir_thread_local(-1);
+if (err < 0) {
+errno = preserved_errno;
+}
+return err;
+}
diff --git a/hw/9pfs/9p-util-linux.c b/hw/9pfs/9p-util-linux.c
index 3902378..06399c5 100644
--- a/hw/9pfs/9p-util-linux.c
+++ b/hw/9pfs/9p-util-linux.c
@@ -63,3 +63,8 @@ int utimensat_nofollow(int dirfd, const char *filename,
 {
 return utimensat(dirfd, filename, times, AT_SYMLINK_NOFOLLOW);
 }
+
+int qemu_mknodat(int dirfd, const char *filename, mode_t mode, dev_t dev)
+{
+return mknodat(dirfd, filename, mode, dev);
+}
diff --git a/hw/9pfs/9p-util.h b/hw/9pfs/9p-util.h
index b1dc08a..127564d 100644
--- a/hw/9pfs/9p-util.h
+++ b/hw/9pfs/9p-util.h
@@ -90,4 +90,6 @@ ssize_t fremovexattrat_nofollow(int dirfd, const char 
*filename,
 int utimensat_nofollow(int dirfd, const char *filename,
const struct timespec times[2]);
 
+int qemu_mknodat(int dirfd, const char *filename, mode_t mode, dev_t dev);
+
 #endif
-- 
2.8.1




[Qemu-devel] [PATCH v6] arm_gicv3_kvm: kvm_dist_get/put: skip the registers banked by GICR

2018-05-31 Thread Shannon Zhao
While we skip the GIC_INTERNAL irqs, we don't change the register offset
accordingly. This will overlap the GICR registers value and leave the
last GIC_INTERNAL irq's registers out of update.

Fix this by skipping the registers banked by GICR.

Also for migration compatibility if the migration source (old version
qemu) doesn't send gicd_no_migration_shift_bug = 1 to destination, then
we shift the data of PPI to get the right data for SPI.

Fixes: 367b9f527becdd20ddf116e17a3c0c2bbc486920
Cc: qemu-sta...@nongnu.org
Reviewed-by: Eric Auger 
Reviewed-by: Peter Maydell 
Signed-off-by: Shannon Zhao 
---
V6: Fix typos and keep same with offset for clroffset 
---
 hw/intc/arm_gicv3_common.c | 79 ++
 hw/intc/arm_gicv3_kvm.c| 38 ++
 include/hw/intc/arm_gicv3_common.h |  1 +
 3 files changed, 118 insertions(+)

diff --git a/hw/intc/arm_gicv3_common.c b/hw/intc/arm_gicv3_common.c
index 7b54d52..864b7c6 100644
--- a/hw/intc/arm_gicv3_common.c
+++ b/hw/intc/arm_gicv3_common.c
@@ -27,6 +27,7 @@
 #include "hw/intc/arm_gicv3_common.h"
 #include "gicv3_internal.h"
 #include "hw/arm/linux-boot-if.h"
+#include "sysemu/kvm.h"
 
 static int gicv3_pre_save(void *opaque)
 {
@@ -141,6 +142,79 @@ static const VMStateDescription vmstate_gicv3_cpu = {
 }
 };
 
+static int gicv3_gicd_no_migration_shift_bug_pre_load(void *opaque)
+{
+GICv3State *cs = opaque;
+
+   /*
+* The gicd_no_migration_shift_bug flag is used for migration compatibility
+* for old version QEMU which may have the GICD bmp shift bug under KVM 
mode.
+* Strictly, what we want to know is whether the migration source is using
+* KVM. Since we don't have any way to determine that, we look at whether 
the
+* destination is using KVM; this is close enough because for the older QEMU
+* versions with this bug KVM -> TCG migration didn't work anyway. If the
+* source is a newer QEMU without this bug it will transmit the migration
+* subsection which sets the flag to true; otherwise it will remain set to
+* the value we select here.
+*/
+if (kvm_enabled()) {
+cs->gicd_no_migration_shift_bug = false;
+}
+
+return 0;
+}
+
+static int gicv3_gicd_no_migration_shift_bug_post_load(void *opaque,
+   int version_id)
+{
+GICv3State *cs = opaque;
+
+if (cs->gicd_no_migration_shift_bug) {
+return 0;
+}
+
+/* Older versions of QEMU had a bug in the handling of state save/restore
+ * to the KVM GICv3: they got the offset in the bitmap arrays wrong,
+ * so that instead of the data for external interrupts 32 and up
+ * starting at bit position 32 in the bitmap, it started at bit
+ * position 64. If we're receiving data from a QEMU with that bug,
+ * we must move the data down into the right place.
+ */
+memmove(cs->group, (uint8_t *)cs->group + GIC_INTERNAL / 8,
+sizeof(cs->group) - GIC_INTERNAL / 8);
+memmove(cs->grpmod, (uint8_t *)cs->grpmod + GIC_INTERNAL / 8,
+sizeof(cs->grpmod) - GIC_INTERNAL / 8);
+memmove(cs->enabled, (uint8_t *)cs->enabled + GIC_INTERNAL / 8,
+sizeof(cs->enabled) - GIC_INTERNAL / 8);
+memmove(cs->pending, (uint8_t *)cs->pending + GIC_INTERNAL / 8,
+sizeof(cs->pending) - GIC_INTERNAL / 8);
+memmove(cs->active, (uint8_t *)cs->active + GIC_INTERNAL / 8,
+sizeof(cs->active) - GIC_INTERNAL / 8);
+memmove(cs->edge_trigger, (uint8_t *)cs->edge_trigger + GIC_INTERNAL / 8,
+sizeof(cs->edge_trigger) - GIC_INTERNAL / 8);
+
+/*
+ * While this new version QEMU doesn't have this kind of bug as we fix it,
+ * so it needs to set the flag to true to indicate that and it's necessary
+ * for next migration to work from this new version QEMU.
+ */
+cs->gicd_no_migration_shift_bug = true;
+
+return 0;
+}
+
+const VMStateDescription vmstate_gicv3_gicd_no_migration_shift_bug = {
+.name = "arm_gicv3/gicd_no_migration_shift_bug",
+.version_id = 1,
+.minimum_version_id = 1,
+.pre_load = gicv3_gicd_no_migration_shift_bug_pre_load,
+.post_load = gicv3_gicd_no_migration_shift_bug_post_load,
+.fields = (VMStateField[]) {
+VMSTATE_BOOL(gicd_no_migration_shift_bug, GICv3State),
+VMSTATE_END_OF_LIST()
+}
+};
+
 static const VMStateDescription vmstate_gicv3 = {
 .name = "arm_gicv3",
 .version_id = 1,
@@ -165,6 +239,10 @@ static const VMStateDescription vmstate_gicv3 = {
 VMSTATE_STRUCT_VARRAY_POINTER_UINT32(cpu, GICv3State, num_cpu,
  vmstate_gicv3_cpu, GICv3CPUState),
 VMSTATE_END_OF_LIST()
+},
+.subsections = (const VMStateDescription * []) {
+_gicv3_gicd_no_migration_shift_bug,
+NULL
 }
 };
 
@@ -364,6 +442,7 @@ static void arm_gicv3_common_reset(DeviceState *dev)
 gicv3_gicd_group_set(s, 

[Qemu-devel] [PATCH v2 19/20] 9p: darwin: virtfs-proxy: Implement setuid code for darwin

2018-05-31 Thread Keno Fischer
Darwin does not have linux capabilities, so make that code linux-only.
Darwin also does not have setresuid/gid. The correct way to temporarily
drop capabilities is to call seteuid/gid.

Also factor out the code that acquires acquire_dac_override into a separate
function in the linux implementation. I had originally done this when
I thought it made sense to have only one `setugid` function, but I retained
this because it seems clearer this way.

Signed-off-by: Keno Fischer 
---

Changes from v1: New patch.

 fsdev/virtfs-proxy-helper.c | 200 +++-
 1 file changed, 125 insertions(+), 75 deletions(-)

diff --git a/fsdev/virtfs-proxy-helper.c b/fsdev/virtfs-proxy-helper.c
index d8dd3f5..6baf2a6 100644
--- a/fsdev/virtfs-proxy-helper.c
+++ b/fsdev/virtfs-proxy-helper.c
@@ -82,6 +82,7 @@ static void do_perror(const char *string)
 }
 }
 
+#ifdef CONFIG_LINUX
 static int do_cap_set(cap_value_t *cap_value, int size, int reset)
 {
 cap_t caps;
@@ -121,6 +122,85 @@ error:
 return -1;
 }
 
+static int acquire_dac_override(void)
+{
+cap_value_t cap_list[] = {
+CAP_DAC_OVERRIDE,
+};
+return do_cap_set(cap_list, ARRAY_SIZE(cap_list), 0);
+}
+
+/*
+ * from man 7 capabilities, section
+ * Effect of User ID Changes on Capabilities:
+ * If the effective user ID is changed from nonzero to 0, then the permitted
+ * set is copied to the effective set.  If the effective user ID is changed
+ * from 0 to nonzero, then all capabilities are are cleared from the effective
+ * set.
+ *
+ * The setfsuid/setfsgid man pages warn that changing the effective user ID may
+ * expose the program to unwanted signals, but this is not true anymore: for an
+ * unprivileged (without CAP_KILL) program to send a signal, the real or
+ * effective user ID of the sending process must equal the real or saved user
+ * ID of the target process.  Even when dropping privileges, it is enough to
+ * keep the saved UID to a "privileged" value and virtfs-proxy-helper won't
+ * be exposed to signals.  So just use setresuid/setresgid.
+ */
+static int setugid(int uid, int gid, int *suid, int *sgid)
+{
+int retval;
+
+*suid = geteuid();
+*sgid = getegid();
+
+if (setresgid(-1, gid, *sgid) == -1) {
+retval = -errno;
+goto err_out;
+}
+
+if (setresuid(-1, uid, *suid) == -1) {
+retval = -errno;
+goto err_sgid;
+}
+
+if (uid != 0 || gid != 0) {
+/*
+* We still need DAC_OVERRIDE because we don't change
+* supplementary group ids, and hence may be subjected DAC rules
+*/
+if (acquire_dac_override() < 0) {
+retval = -errno;
+goto err_suid;
+}
+}
+return 0;
+
+err_suid:
+if (setresuid(-1, *suid, *suid) == -1) {
+abort();
+}
+err_sgid:
+if (setresgid(-1, *sgid, *sgid) == -1) {
+abort();
+}
+err_out:
+return retval;
+}
+
+/*
+ * This is used to reset the ugid back with the saved values
+ * There is nothing much we can do checking error values here.
+ */
+static void resetugid(int suid, int sgid)
+{
+if (setresgid(-1, sgid, sgid) == -1) {
+abort();
+}
+if (setresuid(-1, suid, suid) == -1) {
+abort();
+}
+}
+
 static int init_capabilities(void)
 {
 /* helper needs following capabilities only */
@@ -135,6 +215,51 @@ static int init_capabilities(void)
 };
 return do_cap_set(cap_list, ARRAY_SIZE(cap_list), 1);
 }
+#else
+static int setugid(int uid, int gid, int *suid, int *sgid)
+{
+int retval;
+
+*suid = geteuid();
+*sgid = getegid();
+
+if (setegid(gid) == -1) {
+retval = -errno;
+goto err_out;
+}
+
+if (seteuid(uid) == -1) {
+retval = -errno;
+goto err_sgid;
+}
+
+err_sgid:
+if (setgid(*sgid) == -1) {
+abort();
+}
+err_out:
+return retval;
+}
+
+/*
+ * This is used to reset the ugid back with the saved values
+ * There is nothing much we can do checking error values here.
+ */
+static void resetugid(int suid, int sgid)
+{
+if (setegid(sgid) == -1) {
+abort();
+}
+if (seteuid(suid) == -1) {
+abort();
+}
+}
+
+static int init_capabilities(void)
+{
+return 0;
+}
+#endif
 
 static int socket_read(int sockfd, void *buff, ssize_t size)
 {
@@ -279,81 +404,6 @@ static int send_status(int sockfd, struct iovec *iovec, 
int status)
 }
 
 /*
- * from man 7 capabilities, section
- * Effect of User ID Changes on Capabilities:
- * If the effective user ID is changed from nonzero to 0, then the permitted
- * set is copied to the effective set.  If the effective user ID is changed
- * from 0 to nonzero, then all capabilities are are cleared from the effective
- * set.
- *
- * The setfsuid/setfsgid man pages warn that changing the effective user ID may
- * expose the program to unwanted signals, but this is not true anymore: for an
- * unprivileged (without CAP_KILL) program to send a 

[Qemu-devel] [PATCH v2 12/20] 9p: darwin: Explicitly cast comparisons of mode_t with -1

2018-05-31 Thread Keno Fischer
Comparisons of mode_t with -1 require an explicit cast, since mode_t
is unsigned on Darwin.

Signed-off-by: Keno Fischer 
---

Changes from v1: Split from strchrnul change

 hw/9pfs/9p-local.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/9pfs/9p-local.c b/hw/9pfs/9p-local.c
index 6222891..1f0b1b0 100644
--- a/hw/9pfs/9p-local.c
+++ b/hw/9pfs/9p-local.c
@@ -310,7 +310,7 @@ update_map_file:
 if (credp->fc_gid != -1) {
 gid = credp->fc_gid;
 }
-if (credp->fc_mode != -1) {
+if (credp->fc_mode != (mode_t)-1) {
 mode = credp->fc_mode;
 }
 if (credp->fc_rdev != -1) {
@@ -416,7 +416,7 @@ static int local_set_xattrat(int dirfd, const char *path, 
FsCred *credp)
 return err;
 }
 }
-if (credp->fc_mode != -1) {
+if (credp->fc_mode != (mode_t)-1) {
 uint32_t tmp_mode = cpu_to_le32(credp->fc_mode);
 err = fsetxattrat_nofollow(dirfd, path, "user.virtfs.mode", _mode,
sizeof(mode_t), 0);
-- 
2.8.1




[Qemu-devel] [PATCH v2 15/20] 9p: darwin: *xattr_nofollow implementations

2018-05-31 Thread Keno Fischer
This implements the darwin equivalent of the functions that were
moved to 9p-util(-linux) earlier in this series in the new
9p-util-darwin file.

Signed-off-by: Keno Fischer 
---

Changes from v1:
 * New 9p-util-darwin.c rather than ifdefs in 9p-util.c
 * Drop incorrect AT_NOFOLLOW from the actual call

 hw/9pfs/9p-util-darwin.c | 64 
 hw/9pfs/Makefile.objs|  1 +
 2 files changed, 65 insertions(+)
 create mode 100644 hw/9pfs/9p-util-darwin.c

diff --git a/hw/9pfs/9p-util-darwin.c b/hw/9pfs/9p-util-darwin.c
new file mode 100644
index 000..cdb4c9e
--- /dev/null
+++ b/hw/9pfs/9p-util-darwin.c
@@ -0,0 +1,64 @@
+/*
+ * 9p utilities (Darwin Implementation)
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/xattr.h"
+#include "9p-util.h"
+
+ssize_t fgetxattrat_nofollow(int dirfd, const char *filename, const char *name,
+ void *value, size_t size)
+{
+int ret;
+int fd = openat_file(dirfd, filename,
+ O_RDONLY | O_PATH_9P_UTIL | O_NOFOLLOW, 0);
+if (fd == -1) {
+return -1;
+}
+ret = fgetxattr(fd, name, value, size, 0, 0);
+close_preserve_errno(fd);
+return ret;
+}
+
+ssize_t flistxattrat_nofollow(int dirfd, const char *filename,
+  char *list, size_t size)
+{
+int ret;
+int fd = openat_file(dirfd, filename,
+ O_RDONLY | O_PATH_9P_UTIL | O_NOFOLLOW, 0);
+if (fd == -1) {
+return -1;
+}
+ret = flistxattr(fd, list, size, 0);
+close_preserve_errno(fd);
+return ret;
+}
+
+ssize_t fremovexattrat_nofollow(int dirfd, const char *filename,
+const char *name)
+{
+int ret;
+int fd = openat_file(dirfd, filename, O_PATH_9P_UTIL | O_NOFOLLOW, 0);
+if (fd == -1) {
+return -1;
+}
+ret = fremovexattr(fd, name, 0);
+close_preserve_errno(fd);
+return ret;
+}
+
+int fsetxattrat_nofollow(int dirfd, const char *filename, const char *name,
+ void *value, size_t size, int flags)
+{
+int ret;
+int fd = openat_file(dirfd, filename, O_PATH_9P_UTIL | O_NOFOLLOW, 0);
+if (fd == -1) {
+return -1;
+}
+ret = fsetxattr(fd, name, value, size, 0, flags);
+close_preserve_errno(fd);
+return ret;
+}
diff --git a/hw/9pfs/Makefile.objs b/hw/9pfs/Makefile.objs
index 083508f..24a8695 100644
--- a/hw/9pfs/Makefile.objs
+++ b/hw/9pfs/Makefile.objs
@@ -1,5 +1,6 @@
 common-obj-y  = 9p.o
 common-obj-$(CONFIG_LINUX) += 9p-util-linux.o
+common-obj-$(CONFIG_DARWIN) += 9p-util-darwin.o
 common-obj-y += 9p-local.o 9p-xattr.o
 common-obj-y += 9p-xattr-user.o 9p-posix-acl.o
 common-obj-y += coth.o cofs.o codir.o cofile.o
-- 
2.8.1




[Qemu-devel] [PATCH v2 17/20] 9p: darwin: Provide a fallback implementation for utimensat

2018-05-31 Thread Keno Fischer
This function is new in Mac OS 10.13. Provide a fallback implementation
when building against older SDKs. The complication in the definition comes
having to separately handle the used SDK version and the target OS version.

- If the SDK version is too low (__MAC_10_13 not defined), utimensat is not
  defined in the header, so we must not try to use it (doing so would error).
- Otherwise, if the targetted OS version is at least 10.13, we know this
  function is available, so we can unconditionally call it.
- Lastly, we check for the availability of the __builtin_available macro to
  potentially insert a dynamic check for this OS version. However, 
__builtin_available
  is only available with sufficiently recent versions of clang and while all
  Apple clang versions that ship with Xcode versions that support the 10.13
  SDK support with builtin, we want to allow building with compilers other
  than Apple clang that may not support this builtin.

Signed-off-by: Keno Fischer 
---

Changes from v1:
 * Correct calculation of tv_usec
 * Support UTIME_NOW/UTIME/OMIT
 * Now covers fsdev/virtfs-proxy-helper.c

 fsdev/virtfs-proxy-helper.c |  3 +-
 hw/9pfs/9p-local.c  |  2 +-
 hw/9pfs/9p-util-darwin.c| 96 +
 hw/9pfs/9p-util-linux.c |  6 +++
 hw/9pfs/9p-util.h   |  8 
 hw/9pfs/9p.c|  1 +
 6 files changed, 113 insertions(+), 3 deletions(-)

diff --git a/fsdev/virtfs-proxy-helper.c b/fsdev/virtfs-proxy-helper.c
index a26f8b8..d8dd3f5 100644
--- a/fsdev/virtfs-proxy-helper.c
+++ b/fsdev/virtfs-proxy-helper.c
@@ -957,8 +957,7 @@ static int process_requests(int sock)
  [0].tv_sec, [0].tv_nsec,
  [1].tv_sec, [1].tv_nsec);
 if (retval > 0) {
-retval = utimensat(AT_FDCWD, path.data, spec,
-   AT_SYMLINK_NOFOLLOW);
+retval = utimensat_nofollow(AT_FDCWD, path.data, spec);
 if (retval < 0) {
 retval = -errno;
 }
diff --git a/hw/9pfs/9p-local.c b/hw/9pfs/9p-local.c
index 7830526..47e8580 100644
--- a/hw/9pfs/9p-local.c
+++ b/hw/9pfs/9p-local.c
@@ -1071,7 +1071,7 @@ static int local_utimensat(FsContext *s, V9fsPath 
*fs_path,
 goto out;
 }
 
-ret = utimensat(dirfd, name, buf, AT_SYMLINK_NOFOLLOW);
+ret = utimensat_nofollow(dirfd, name, buf);
 close_preserve_errno(dirfd);
 out:
 g_free(dirpath);
diff --git a/hw/9pfs/9p-util-darwin.c b/hw/9pfs/9p-util-darwin.c
index cdb4c9e..ac414bc 100644
--- a/hw/9pfs/9p-util-darwin.c
+++ b/hw/9pfs/9p-util-darwin.c
@@ -62,3 +62,99 @@ int fsetxattrat_nofollow(int dirfd, const char *filename, 
const char *name,
 close_preserve_errno(fd);
 return ret;
 }
+
+#ifndef __has_builtin
+#define __has_builtin(x) 0
+#endif
+
+static int update_times_from_stat(int fd, struct timespec times[2],
+  int update0, int update1)
+{
+struct stat buf;
+int ret = fstat(fd, );
+if (ret == -1) {
+return ret;
+}
+if (update0) {
+times[0] = buf.st_atimespec;
+}
+if (update1) {
+times[1] = buf.st_mtimespec;
+}
+return 0;
+}
+
+int utimensat_nofollow(int dirfd, const char *filename,
+   const struct timespec times_in[2])
+{
+int ret, fd;
+int special0, special1;
+struct timeval futimes_buf[2];
+struct timespec times[2];
+memcpy(times, times_in, 2 * sizeof(struct timespec));
+
+/* Check whether we have an SDK version that defines utimensat */
+#if defined(__MAC_10_13)
+# if __MAC_OS_X_VERSION_MIN_REQUIRED >= __MAC_10_13
+#  define UTIMENSAT_AVAILABLE 1
+# elif __has_builtin(__builtin_available)
+#  define UTIMENSAT_AVAILABLE __builtin_available(macos 10.13, *)
+# else
+#  define UTIMENSAT_AVAILABLE 0
+# endif
+if (UTIMENSAT_AVAILABLE) {
+return utimensat(dirfd, filename, times, AT_SYMLINK_NOFOLLOW);
+}
+#endif
+
+/* utimensat not available. Use futimes. */
+fd = openat_file(dirfd, filename, O_PATH_9P_UTIL | O_NOFOLLOW, 0);
+if (fd == -1) {
+return -1;
+}
+
+special0 = times[0].tv_nsec == UTIME_OMIT;
+special1 = times[1].tv_nsec == UTIME_OMIT;
+if (special0 || special1) {
+/* If both are set, nothing to do */
+if (special0 && special1) {
+ret = 0;
+goto done;
+}
+
+ret = update_times_from_stat(fd, times, special0, special1);
+if (ret < 0) {
+goto done;
+}
+}
+
+special0 = times[0].tv_nsec == UTIME_NOW;
+special1 = times[1].tv_nsec == UTIME_NOW;
+if (special0 || special1) {
+ret = futimes(fd, NULL);
+if (ret < 0) {
+goto done;
+}
+
+/* If both are set, we are done */
+if (special0 && special1) {
+ret = 0;
+goto done;
+}
+
+ret = 

[Qemu-devel] [PATCH v2 10/20] 9p: darwin: Handle struct stat(fs) differences

2018-05-31 Thread Keno Fischer
Signed-off-by: Keno Fischer 
---

Changes since v1:
 * Now also covers fsdev/virtfs-proxy-helper.c

 fsdev/virtfs-proxy-helper.c | 14 +++---
 hw/9pfs/9p-proxy.c  | 17 ++---
 hw/9pfs/9p-synth.c  |  2 ++
 hw/9pfs/9p.c| 16 ++--
 4 files changed, 41 insertions(+), 8 deletions(-)

diff --git a/fsdev/virtfs-proxy-helper.c b/fsdev/virtfs-proxy-helper.c
index 94fb069..3bc1269 100644
--- a/fsdev/virtfs-proxy-helper.c
+++ b/fsdev/virtfs-proxy-helper.c
@@ -506,12 +506,15 @@ static void stat_to_prstat(ProxyStat *pr_stat, struct 
stat *stat)
 pr_stat->st_size = stat->st_size;
 pr_stat->st_blksize = stat->st_blksize;
 pr_stat->st_blocks = stat->st_blocks;
+#ifdef CONFIG_DARWIN
+pr_stat->st_atim_nsec = stat->st_atimespec.tv_nsec;
+pr_stat->st_mtim_nsec = stat->st_mtimespec.tv_nsec;
+pr_stat->st_ctim_nsec = stat->st_ctimespec.tv_nsec;
+#else
 pr_stat->st_atim_sec = stat->st_atim.tv_sec;
-pr_stat->st_atim_nsec = stat->st_atim.tv_nsec;
 pr_stat->st_mtim_sec = stat->st_mtim.tv_sec;
-pr_stat->st_mtim_nsec = stat->st_mtim.tv_nsec;
 pr_stat->st_ctim_sec = stat->st_ctim.tv_sec;
-pr_stat->st_ctim_nsec = stat->st_ctim.tv_nsec;
+#endif
 }
 
 static void statfs_to_prstatfs(ProxyStatFS *pr_stfs, struct statfs *stfs)
@@ -524,10 +527,15 @@ static void statfs_to_prstatfs(ProxyStatFS *pr_stfs, 
struct statfs *stfs)
 pr_stfs->f_bavail = stfs->f_bavail;
 pr_stfs->f_files = stfs->f_files;
 pr_stfs->f_ffree = stfs->f_ffree;
+#ifdef CONFIG_DARWIN
+pr_stfs->f_fsid[0] = stfs->f_fsid.val[0];
+pr_stfs->f_fsid[1] = stfs->f_fsid.val[1];
+#else
 pr_stfs->f_fsid[0] = stfs->f_fsid.__val[0];
 pr_stfs->f_fsid[1] = stfs->f_fsid.__val[1];
 pr_stfs->f_namelen = stfs->f_namelen;
 pr_stfs->f_frsize = stfs->f_frsize;
+#endif
 }
 
 /*
diff --git a/hw/9pfs/9p-proxy.c b/hw/9pfs/9p-proxy.c
index 47a94e0..8a2c174 100644
--- a/hw/9pfs/9p-proxy.c
+++ b/hw/9pfs/9p-proxy.c
@@ -117,10 +117,15 @@ static void prstatfs_to_statfs(struct statfs *stfs, 
ProxyStatFS *prstfs)
 stfs->f_bavail = prstfs->f_bavail;
 stfs->f_files = prstfs->f_files;
 stfs->f_ffree = prstfs->f_ffree;
+#ifdef CONFIG_DARWIN
+stfs->f_fsid.val[0] = prstfs->f_fsid[0] & 0xU;
+stfs->f_fsid.val[1] = prstfs->f_fsid[1] >> 32 & 0xU;
+#else
 stfs->f_fsid.__val[0] = prstfs->f_fsid[0] & 0xU;
 stfs->f_fsid.__val[1] = prstfs->f_fsid[1] >> 32 & 0xU;
 stfs->f_namelen = prstfs->f_namelen;
 stfs->f_frsize = prstfs->f_frsize;
+#endif
 }
 
 /* Converts proxy_stat structure to VFS stat structure */
@@ -137,12 +142,18 @@ static void prstat_to_stat(struct stat *stbuf, ProxyStat 
*prstat)
stbuf->st_size = prstat->st_size;
stbuf->st_blksize = prstat->st_blksize;
stbuf->st_blocks = prstat->st_blocks;
-   stbuf->st_atim.tv_sec = prstat->st_atim_sec;
-   stbuf->st_atim.tv_nsec = prstat->st_atim_nsec;
+   stbuf->st_atime = prstat->st_atim_sec;
stbuf->st_mtime = prstat->st_mtim_sec;
-   stbuf->st_mtim.tv_nsec = prstat->st_mtim_nsec;
stbuf->st_ctime = prstat->st_ctim_sec;
+#ifdef CONFIG_DARWIN
+   stbuf->st_atimespec.tv_nsec = prstat->st_atim_nsec;
+   stbuf->st_mtimespec.tv_nsec = prstat->st_mtim_nsec;
+   stbuf->st_ctimespec.tv_nsec = prstat->st_ctim_nsec;
+#else
+   stbuf->st_atim.tv_nsec = prstat->st_atim_nsec;
+   stbuf->st_mtim.tv_nsec = prstat->st_mtim_nsec;
stbuf->st_ctim.tv_nsec = prstat->st_ctim_nsec;
+#endif
 }
 
 /*
diff --git a/hw/9pfs/9p-synth.c b/hw/9pfs/9p-synth.c
index 54239c9..eb68b42 100644
--- a/hw/9pfs/9p-synth.c
+++ b/hw/9pfs/9p-synth.c
@@ -426,7 +426,9 @@ static int synth_statfs(FsContext *s, V9fsPath *fs_path,
 stbuf->f_bsize = 512;
 stbuf->f_blocks = 0;
 stbuf->f_files = synth_node_count;
+#ifndef CONFIG_DARWIN
 stbuf->f_namelen = NAME_MAX;
+#endif
 return 0;
 }
 
diff --git a/hw/9pfs/9p.c b/hw/9pfs/9p.c
index a757374..212f569 100644
--- a/hw/9pfs/9p.c
+++ b/hw/9pfs/9p.c
@@ -905,11 +905,17 @@ static void stat_to_v9stat_dotl(V9fsState *s, const 
struct stat *stbuf,
 v9lstat->st_blksize = stbuf->st_blksize;
 v9lstat->st_blocks = stbuf->st_blocks;
 v9lstat->st_atime_sec = stbuf->st_atime;
-v9lstat->st_atime_nsec = stbuf->st_atim.tv_nsec;
 v9lstat->st_mtime_sec = stbuf->st_mtime;
-v9lstat->st_mtime_nsec = stbuf->st_mtim.tv_nsec;
 v9lstat->st_ctime_sec = stbuf->st_ctime;
+#ifdef CONFIG_DARWIN
+v9lstat->st_atime_nsec = stbuf->st_atimespec.tv_nsec;
+v9lstat->st_mtime_nsec = stbuf->st_mtimespec.tv_nsec;
+v9lstat->st_ctime_nsec = stbuf->st_ctimespec.tv_nsec;
+#else
+v9lstat->st_atime_nsec = stbuf->st_atim.tv_nsec;
+v9lstat->st_mtime_nsec = stbuf->st_mtim.tv_nsec;
 v9lstat->st_ctime_nsec = stbuf->st_ctim.tv_nsec;
+#endif
 /* Currently we only support BASIC fields in stat */
 v9lstat->st_result_mask = P9_STATS_BASIC;
 
@@ -2959,9 +2965,15 @@ static int v9fs_fill_statfs(V9fsState *s, V9fsPDU *pdu, 

[Qemu-devel] [PATCH v2 09/20] 9p: Properly check/translate flags in unlinkat

2018-05-31 Thread Keno Fischer
This code previously relied on P9_DOTL_AT_REMOVEDIR and AT_REMOVEDIR
having the same numerical value and deferred any errorchecking to the
syscall itself. However, while the former assumption is true on Linux,
it is not true in general. Thus, add appropriate error checking and
translation to the 9p unlinkat server code.

Signed-off-by: Keno Fischer 
---

Changes since v1:
 * Code was moved from 9p-local.c to server entry point in 9p.c

 hw/9pfs/9p.c | 13 +++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/hw/9pfs/9p.c b/hw/9pfs/9p.c
index b80db65..a757374 100644
--- a/hw/9pfs/9p.c
+++ b/hw/9pfs/9p.c
@@ -2522,7 +2522,7 @@ static void coroutine_fn v9fs_unlinkat(void *opaque)
 {
 int err = 0;
 V9fsString name;
-int32_t dfid, flags;
+int32_t dfid, flags, rflags = 0;
 size_t offset = 7;
 V9fsPath path;
 V9fsFidState *dfidp;
@@ -2549,6 +2549,15 @@ static void coroutine_fn v9fs_unlinkat(void *opaque)
 goto out_nofid;
 }
 
+if (flags & ~P9_DOTL_AT_REMOVEDIR) {
+err = -EINVAL;
+goto out_nofid;
+}
+
+if (flags & P9_DOTL_AT_REMOVEDIR) {
+rflags |= AT_REMOVEDIR;
+}
+
 dfidp = get_fid(pdu, dfid);
 if (dfidp == NULL) {
 err = -EINVAL;
@@ -2567,7 +2576,7 @@ static void coroutine_fn v9fs_unlinkat(void *opaque)
 if (err < 0) {
 goto out_err;
 }
-err = v9fs_co_unlinkat(pdu, >path, , flags);
+err = v9fs_co_unlinkat(pdu, >path, , rflags);
 if (!err) {
 err = offset;
 }
-- 
2.8.1




[Qemu-devel] [PATCH v2 14/20] 9p: darwin: Provide a compatibility definition for XATTR_SIZE_MAX

2018-05-31 Thread Keno Fischer
Signed-off-by: Keno Fischer 
---

No change from v1.

 hw/9pfs/9p.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/hw/9pfs/9p.c b/hw/9pfs/9p.c
index 70cfab9..24802b9 100644
--- a/hw/9pfs/9p.c
+++ b/hw/9pfs/9p.c
@@ -3374,6 +3374,13 @@ out_nofid:
 v9fs_string_free();
 }
 
+#if defined(CONFIG_DARWIN) && !defined(XATTR_SIZE_MAX)
+/* Darwin doesn't seem to define a maximum xattr size in its user
+   user space header, but looking at the kernel source, HFS supports
+   up to INT32_MAX, so use that as the maximum.
+*/
+#define XATTR_SIZE_MAX INT32_MAX
+#endif
 static void coroutine_fn v9fs_xattrcreate(void *opaque)
 {
 int flags;
-- 
2.8.1




[Qemu-devel] [PATCH v2 08/20] 9p: Rename 9p-util -> 9p-util-linux

2018-05-31 Thread Keno Fischer
The current file only has the Linux versions of these functions.
Rename the file accordingly and update the Makefile to only build
it on Linux. A Darwin version of these will follow later in the
series.

Signed-off-by: Keno Fischer 
---

Changes since v1: New patch

 hw/9pfs/9p-util-linux.c | 59 +
 hw/9pfs/9p-util.c   | 59 -
 hw/9pfs/Makefile.objs   |  3 ++-
 3 files changed, 61 insertions(+), 60 deletions(-)
 create mode 100644 hw/9pfs/9p-util-linux.c
 delete mode 100644 hw/9pfs/9p-util.c

diff --git a/hw/9pfs/9p-util-linux.c b/hw/9pfs/9p-util-linux.c
new file mode 100644
index 000..defa3a4
--- /dev/null
+++ b/hw/9pfs/9p-util-linux.c
@@ -0,0 +1,59 @@
+/*
+ * 9p utilities (Linux Implementation)
+ *
+ * Copyright IBM, Corp. 2017
+ *
+ * Authors:
+ *  Greg Kurz 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/xattr.h"
+#include "9p-util.h"
+
+ssize_t fgetxattrat_nofollow(int dirfd, const char *filename, const char *name,
+ void *value, size_t size)
+{
+char *proc_path = g_strdup_printf("/proc/self/fd/%d/%s", dirfd, filename);
+int ret;
+
+ret = lgetxattr(proc_path, name, value, size);
+g_free(proc_path);
+return ret;
+}
+
+ssize_t flistxattrat_nofollow(int dirfd, const char *filename,
+  char *list, size_t size)
+{
+char *proc_path = g_strdup_printf("/proc/self/fd/%d/%s", dirfd, filename);
+int ret;
+
+ret = llistxattr(proc_path, list, size);
+g_free(proc_path);
+return ret;
+}
+
+ssize_t fremovexattrat_nofollow(int dirfd, const char *filename,
+const char *name)
+{
+char *proc_path = g_strdup_printf("/proc/self/fd/%d/%s", dirfd, filename);
+int ret;
+
+ret = lremovexattr(proc_path, name);
+g_free(proc_path);
+return ret;
+}
+
+int fsetxattrat_nofollow(int dirfd, const char *filename, const char *name,
+ void *value, size_t size, int flags)
+{
+char *proc_path = g_strdup_printf("/proc/self/fd/%d/%s", dirfd, filename);
+int ret;
+
+ret = lsetxattr(proc_path, name, value, size, flags);
+g_free(proc_path);
+return ret;
+}
diff --git a/hw/9pfs/9p-util.c b/hw/9pfs/9p-util.c
deleted file mode 100644
index 614b7fc..000
--- a/hw/9pfs/9p-util.c
+++ /dev/null
@@ -1,59 +0,0 @@
-/*
- * 9p utilities
- *
- * Copyright IBM, Corp. 2017
- *
- * Authors:
- *  Greg Kurz 
- *
- * This work is licensed under the terms of the GNU GPL, version 2 or later.
- * See the COPYING file in the top-level directory.
- */
-
-#include "qemu/osdep.h"
-#include "qemu/xattr.h"
-#include "9p-util.h"
-
-ssize_t fgetxattrat_nofollow(int dirfd, const char *filename, const char *name,
- void *value, size_t size)
-{
-char *proc_path = g_strdup_printf("/proc/self/fd/%d/%s", dirfd, filename);
-int ret;
-
-ret = lgetxattr(proc_path, name, value, size);
-g_free(proc_path);
-return ret;
-}
-
-ssize_t flistxattrat_nofollow(int dirfd, const char *filename,
-  char *list, size_t size)
-{
-char *proc_path = g_strdup_printf("/proc/self/fd/%d/%s", dirfd, filename);
-int ret;
-
-ret = llistxattr(proc_path, list, size);
-g_free(proc_path);
-return ret;
-}
-
-ssize_t fremovexattrat_nofollow(int dirfd, const char *filename,
-const char *name)
-{
-char *proc_path = g_strdup_printf("/proc/self/fd/%d/%s", dirfd, filename);
-int ret;
-
-ret = lremovexattr(proc_path, name);
-g_free(proc_path);
-return ret;
-}
-
-int fsetxattrat_nofollow(int dirfd, const char *filename, const char *name,
- void *value, size_t size, int flags)
-{
-char *proc_path = g_strdup_printf("/proc/self/fd/%d/%s", dirfd, filename);
-int ret;
-
-ret = lsetxattr(proc_path, name, value, size, flags);
-g_free(proc_path);
-return ret;
-}
diff --git a/hw/9pfs/Makefile.objs b/hw/9pfs/Makefile.objs
index fd90b62..083508f 100644
--- a/hw/9pfs/Makefile.objs
+++ b/hw/9pfs/Makefile.objs
@@ -1,4 +1,5 @@
-common-obj-y  = 9p.o 9p-util.o
+common-obj-y  = 9p.o
+common-obj-$(CONFIG_LINUX) += 9p-util-linux.o
 common-obj-y += 9p-local.o 9p-xattr.o
 common-obj-y += 9p-xattr-user.o 9p-posix-acl.o
 common-obj-y += coth.o cofs.o codir.o cofile.o
-- 
2.8.1




[Qemu-devel] [PATCH v2 13/20] 9p: darwin: Ignore O_{NOATIME, DIRECT}

2018-05-31 Thread Keno Fischer
Darwin doesn't have either of these flags. Darwin does have
F_NOCACHE, which is similar to O_DIRECT, but has different
enough semantics that other projects don't generally map
them automatically. In any case, we don't support O_DIRECT
on Linux at the moment either.

Signed-off-by: Keno Fischer 
---

Changes from v1: Undo accidental formatting change

 hw/9pfs/9p.c | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/hw/9pfs/9p.c b/hw/9pfs/9p.c
index 9751246..70cfab9 100644
--- a/hw/9pfs/9p.c
+++ b/hw/9pfs/9p.c
@@ -123,11 +123,18 @@ static int dotl_to_open_flags(int flags)
 { P9_DOTL_NONBLOCK, O_NONBLOCK } ,
 { P9_DOTL_DSYNC, O_DSYNC },
 { P9_DOTL_FASYNC, FASYNC },
+#ifndef CONFIG_DARWIN
+{ P9_DOTL_NOATIME, O_NOATIME },
+/* On Darwin, we could map to F_NOCACHE, which is
+   similar, but doesn't quite have the same
+   semantics. However, we don't support O_DIRECT
+   even on linux at the moment, so we just ignore
+   it here. */
 { P9_DOTL_DIRECT, O_DIRECT },
+#endif
 { P9_DOTL_LARGEFILE, O_LARGEFILE },
 { P9_DOTL_DIRECTORY, O_DIRECTORY },
 { P9_DOTL_NOFOLLOW, O_NOFOLLOW },
-{ P9_DOTL_NOATIME, O_NOATIME },
 { P9_DOTL_SYNC, O_SYNC },
 };
 
@@ -156,10 +163,12 @@ static int get_dotl_openflags(V9fsState *s, int oflags)
  */
 flags = dotl_to_open_flags(oflags);
 flags &= ~(O_NOCTTY | O_ASYNC | O_CREAT);
+#ifndef CONFIG_DARWIN
 /*
  * Ignore direct disk access hint until the server supports it.
  */
 flags &= ~O_DIRECT;
+#endif
 return flags;
 }
 
-- 
2.8.1




[Qemu-devel] [PATCH v2 07/20] 9p: Move a couple xattr functions to 9p-util

2018-05-31 Thread Keno Fischer
These functions will need custom implementations on Darwin. Since the
implementation is very similar among all of them, and 9p-util already
has the _nofollow version of fgetxattrat, let's move them all there.

Signed-off-by: Keno Fischer 
---

Changes since v1:
 * fgetxattr_follow is dropped in favor of a different approach
   later in the series.

 hw/9pfs/9p-util.c  | 33 +
 hw/9pfs/9p-util.h  |  4 
 hw/9pfs/9p-xattr.c | 33 -
 3 files changed, 37 insertions(+), 33 deletions(-)

diff --git a/hw/9pfs/9p-util.c b/hw/9pfs/9p-util.c
index f709c27..614b7fc 100644
--- a/hw/9pfs/9p-util.c
+++ b/hw/9pfs/9p-util.c
@@ -24,3 +24,36 @@ ssize_t fgetxattrat_nofollow(int dirfd, const char 
*filename, const char *name,
 g_free(proc_path);
 return ret;
 }
+
+ssize_t flistxattrat_nofollow(int dirfd, const char *filename,
+  char *list, size_t size)
+{
+char *proc_path = g_strdup_printf("/proc/self/fd/%d/%s", dirfd, filename);
+int ret;
+
+ret = llistxattr(proc_path, list, size);
+g_free(proc_path);
+return ret;
+}
+
+ssize_t fremovexattrat_nofollow(int dirfd, const char *filename,
+const char *name)
+{
+char *proc_path = g_strdup_printf("/proc/self/fd/%d/%s", dirfd, filename);
+int ret;
+
+ret = lremovexattr(proc_path, name);
+g_free(proc_path);
+return ret;
+}
+
+int fsetxattrat_nofollow(int dirfd, const char *filename, const char *name,
+ void *value, size_t size, int flags)
+{
+char *proc_path = g_strdup_printf("/proc/self/fd/%d/%s", dirfd, filename);
+int ret;
+
+ret = lsetxattr(proc_path, name, value, size, flags);
+g_free(proc_path);
+return ret;
+}
diff --git a/hw/9pfs/9p-util.h b/hw/9pfs/9p-util.h
index dc0d2e2..79ed6b2 100644
--- a/hw/9pfs/9p-util.h
+++ b/hw/9pfs/9p-util.h
@@ -60,5 +60,9 @@ ssize_t fgetxattrat_nofollow(int dirfd, const char *path, 
const char *name,
  void *value, size_t size);
 int fsetxattrat_nofollow(int dirfd, const char *path, const char *name,
  void *value, size_t size, int flags);
+ssize_t flistxattrat_nofollow(int dirfd, const char *filename,
+  char *list, size_t size);
+ssize_t fremovexattrat_nofollow(int dirfd, const char *filename,
+const char *name);
 
 #endif
diff --git a/hw/9pfs/9p-xattr.c b/hw/9pfs/9p-xattr.c
index d05c1a1..c696d8f 100644
--- a/hw/9pfs/9p-xattr.c
+++ b/hw/9pfs/9p-xattr.c
@@ -60,17 +60,6 @@ ssize_t pt_listxattr(FsContext *ctx, const char *path,
 return name_size;
 }
 
-static ssize_t flistxattrat_nofollow(int dirfd, const char *filename,
- char *list, size_t size)
-{
-char *proc_path = g_strdup_printf("/proc/self/fd/%d/%s", dirfd, filename);
-int ret;
-
-ret = llistxattr(proc_path, list, size);
-g_free(proc_path);
-return ret;
-}
-
 /*
  * Get the list and pass to each layer to find out whether
  * to send the data or not
@@ -196,17 +185,6 @@ ssize_t pt_getxattr(FsContext *ctx, const char *path, 
const char *name,
 return local_getxattr_nofollow(ctx, path, name, value, size);
 }
 
-int fsetxattrat_nofollow(int dirfd, const char *filename, const char *name,
- void *value, size_t size, int flags)
-{
-char *proc_path = g_strdup_printf("/proc/self/fd/%d/%s", dirfd, filename);
-int ret;
-
-ret = lsetxattr(proc_path, name, value, size, flags);
-g_free(proc_path);
-return ret;
-}
-
 ssize_t local_setxattr_nofollow(FsContext *ctx, const char *path,
 const char *name, void *value, size_t size,
 int flags)
@@ -235,17 +213,6 @@ int pt_setxattr(FsContext *ctx, const char *path, const 
char *name, void *value,
 return local_setxattr_nofollow(ctx, path, name, value, size, flags);
 }
 
-static ssize_t fremovexattrat_nofollow(int dirfd, const char *filename,
-   const char *name)
-{
-char *proc_path = g_strdup_printf("/proc/self/fd/%d/%s", dirfd, filename);
-int ret;
-
-ret = lremovexattr(proc_path, name);
-g_free(proc_path);
-return ret;
-}
-
 ssize_t local_removexattr_nofollow(FsContext *ctx, const char *path,
const char *name)
 {
-- 
2.8.1




[Qemu-devel] [PATCH v2 05/20] 9p: Properly set errp in fstatfs error path

2018-05-31 Thread Keno Fischer
In the review of

9p: Avoid warning if FS_IOC_GETVERSION is not defined

Grep Kurz noted this error path was failing to set errp.
Fix that.

Signed-off-by: Keno Fischer 
---

Changes since v1: New patch

 hw/9pfs/9p-local.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/hw/9pfs/9p-local.c b/hw/9pfs/9p-local.c
index adc169a..576c8e3 100644
--- a/hw/9pfs/9p-local.c
+++ b/hw/9pfs/9p-local.c
@@ -1420,6 +1420,8 @@ static int local_init(FsContext *ctx, Error **errp)
  */
 if (fstatfs(data->mountfd, ) < 0) {
 close_preserve_errno(data->mountfd);
+error_setg_errno(errp, errno,
+"failed to stat file system at '%s'", ctx->fs_root);
 goto err;
 }
 switch (stbuf.f_type) {
-- 
2.8.1




[Qemu-devel] [PATCH v2 06/20] 9p: Avoid warning if FS_IOC_GETVERSION is not defined

2018-05-31 Thread Keno Fischer
Both `stbuf` and `local_ioc_getversion` where unused when
FS_IOC_GETVERSION was not defined, causing a compiler warning.

Reorgnaize the code to avoid this warning.

Signed-off-by: Keno Fischer 
---

Changes since v1:
 * As request in review, logic is factored into a
   local_ioc_getversion_init function.

 hw/9pfs/9p-local.c | 43 +--
 1 file changed, 25 insertions(+), 18 deletions(-)

diff --git a/hw/9pfs/9p-local.c b/hw/9pfs/9p-local.c
index 576c8e3..6222891 100644
--- a/hw/9pfs/9p-local.c
+++ b/hw/9pfs/9p-local.c
@@ -1375,10 +1375,10 @@ static int local_unlinkat(FsContext *ctx, V9fsPath *dir,
 return ret;
 }
 
+#ifdef FS_IOC_GETVERSION
 static int local_ioc_getversion(FsContext *ctx, V9fsPath *path,
 mode_t st_mode, uint64_t *st_gen)
 {
-#ifdef FS_IOC_GETVERSION
 int err;
 V9fsFidOpenState fid_open;
 
@@ -1397,32 +1397,19 @@ static int local_ioc_getversion(FsContext *ctx, 
V9fsPath *path,
 err = ioctl(fid_open.fd, FS_IOC_GETVERSION, st_gen);
 local_close(ctx, _open);
 return err;
-#else
-errno = ENOTTY;
-return -1;
-#endif
 }
+#endif
 
-static int local_init(FsContext *ctx, Error **errp)
+static int local_ioc_getversion_init(FsContext *ctx, LocalData *data)
 {
+#ifdef FS_IOC_GETVERSION
 struct statfs stbuf;
-LocalData *data = g_malloc(sizeof(*data));
 
-data->mountfd = open(ctx->fs_root, O_DIRECTORY | O_RDONLY);
-if (data->mountfd == -1) {
-error_setg_errno(errp, errno, "failed to open '%s'", ctx->fs_root);
-goto err;
-}
-
-#ifdef FS_IOC_GETVERSION
 /*
  * use ioc_getversion only if the ioctl is definied
  */
 if (fstatfs(data->mountfd, ) < 0) {
-close_preserve_errno(data->mountfd);
-error_setg_errno(errp, errno,
-"failed to stat file system at '%s'", ctx->fs_root);
-goto err;
+return -1;
 }
 switch (stbuf.f_type) {
 case EXT2_SUPER_MAGIC:
@@ -1433,6 +1420,26 @@ static int local_init(FsContext *ctx, Error **errp)
 break;
 }
 #endif
+return 0;
+}
+
+static int local_init(FsContext *ctx, Error **errp)
+{
+LocalData *data = g_malloc(sizeof(*data));
+
+data->mountfd = open(ctx->fs_root, O_DIRECTORY | O_RDONLY);
+if (data->mountfd == -1) {
+error_setg_errno(errp, errno, "failed to open '%s'", ctx->fs_root);
+goto err;
+}
+
+if (local_ioc_getversion_init(ctx, data) < 0) {
+close_preserve_errno(data->mountfd);
+error_setg_errno(errp, errno,
+"failed initialize ioc_getversion for file system at '%s'",
+ctx->fs_root);
+goto err;
+}
 
 if (ctx->export_flags & V9FS_SM_PASSTHROUGH) {
 ctx->xops = passthrough_xattr_ops;
-- 
2.8.1




[Qemu-devel] [PATCH v2 11/20] 9p: darwin: Handle struct dirent differences

2018-05-31 Thread Keno Fischer
On darwin d_seekoff exists, but is optional and does not seem to
be commonly used by file systems. Use `telldir` instead to obtain
the seek offset.

Signed-off-by: Keno Fischer 
---

Changes since v1:
 * Drop setting d_seekoff in synth_direntry
 * Factor telldir vs d_off logic into v9fs_dent_telldir
 * Error path for telldir failure

 hw/9pfs/9p-synth.c |  2 ++
 hw/9pfs/9p.c   | 36 
 2 files changed, 34 insertions(+), 4 deletions(-)

diff --git a/hw/9pfs/9p-synth.c b/hw/9pfs/9p-synth.c
index eb68b42..a312f8c 100644
--- a/hw/9pfs/9p-synth.c
+++ b/hw/9pfs/9p-synth.c
@@ -221,7 +221,9 @@ static void synth_direntry(V9fsSynthNode *node,
 {
 strcpy(entry->d_name, node->name);
 entry->d_ino = node->attr->inode;
+#ifndef CONFIG_DARWIN
 entry->d_off = off + 1;
+#endif
 }
 
 static struct dirent *synth_get_dentry(V9fsSynthNode *dir,
diff --git a/hw/9pfs/9p.c b/hw/9pfs/9p.c
index 212f569..9751246 100644
--- a/hw/9pfs/9p.c
+++ b/hw/9pfs/9p.c
@@ -1738,6 +1738,25 @@ static int v9fs_xattr_read(V9fsState *s, V9fsPDU *pdu, 
V9fsFidState *fidp,
 return offset;
 }
 
+/**
+ * Get the seek offset of a dirent. If not available from the structure itself,
+ * obtain it by calling telldir.
+ */
+static int v9fs_dent_telldir(V9fsPDU *pdu, V9fsFidState *fidp,
+ struct dirent *dent)
+{
+#ifdef CONFIG_DARWIN
+/*
+ * Darwin has d_seekoff, which appears to function similarly to d_off.
+ * However, it does not appear to be supported on all file systems,
+ * so use telldir for correctness.
+ */
+return v9fs_co_telldir(pdu, fidp);
+#else
+return dent->d_off;
+#endif
+}
+
 static int coroutine_fn v9fs_do_readdir_with_stat(V9fsPDU *pdu,
   V9fsFidState *fidp,
   uint32_t max_count)
@@ -1801,7 +1820,11 @@ static int coroutine_fn 
v9fs_do_readdir_with_stat(V9fsPDU *pdu,
 count += len;
 v9fs_stat_free();
 v9fs_path_free();
-saved_dir_pos = dent->d_off;
+saved_dir_pos = v9fs_dent_telldir(pdu, fidp, dent);
+if (saved_dir_pos < 0) {
+err = saved_dir_pos;
+break;
+}
 }
 
 v9fs_readdir_unlock(>fs.dir);
@@ -1915,7 +1938,7 @@ static int coroutine_fn v9fs_do_readdir(V9fsPDU *pdu, 
V9fsFidState *fidp,
 V9fsString name;
 int len, err = 0;
 int32_t count = 0;
-off_t saved_dir_pos;
+off_t saved_dir_pos, off;
 struct dirent *dent;
 
 /* save the directory position */
@@ -1951,10 +1974,15 @@ static int coroutine_fn v9fs_do_readdir(V9fsPDU *pdu, 
V9fsFidState *fidp,
 /* Fill the other fields with dummy values */
 qid.type = 0;
 qid.version = 0;
+off = v9fs_dent_telldir(pdu, fidp, dent);
+if (off < 0) {
+err = off;
+break;
+}
 
 /* 11 = 7 + 4 (7 = start offset, 4 = space for storing count) */
 len = pdu_marshal(pdu, 11 + count, "Qqbs",
-  , dent->d_off,
+  , off,
   dent->d_type, );
 
 v9fs_readdir_unlock(>fs.dir);
@@ -1966,7 +1994,7 @@ static int coroutine_fn v9fs_do_readdir(V9fsPDU *pdu, 
V9fsFidState *fidp,
 }
 count += len;
 v9fs_string_free();
-saved_dir_pos = dent->d_off;
+saved_dir_pos = off;
 }
 
 v9fs_readdir_unlock(>fs.dir);
-- 
2.8.1




[Qemu-devel] [PATCH v2 02/20] 9p: proxy: Fix size passed to `connect`

2018-05-31 Thread Keno Fischer
The size to pass to the `connect` call is the size of the entire
`struct sockaddr_un`. Passing anything shorter than this causes errors
on darwin.

Signed-off-by: Keno Fischer 
---

Changes since v1: New patch

 hw/9pfs/9p-proxy.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/hw/9pfs/9p-proxy.c b/hw/9pfs/9p-proxy.c
index e2e0329..47a94e0 100644
--- a/hw/9pfs/9p-proxy.c
+++ b/hw/9pfs/9p-proxy.c
@@ -1088,7 +1088,7 @@ static int proxy_ioc_getversion(FsContext *fs_ctx, 
V9fsPath *path,
 
 static int connect_namedsocket(const char *path, Error **errp)
 {
-int sockfd, size;
+int sockfd;
 struct sockaddr_un helper;
 
 if (strlen(path) >= sizeof(helper.sun_path)) {
@@ -1102,8 +1102,7 @@ static int connect_namedsocket(const char *path, Error 
**errp)
 }
 strcpy(helper.sun_path, path);
 helper.sun_family = AF_UNIX;
-size = strlen(helper.sun_path) + sizeof(helper.sun_family);
-if (connect(sockfd, (struct sockaddr *), size) < 0) {
+if (connect(sockfd, (struct sockaddr *), sizeof(helper)) < 0) {
 error_setg_errno(errp, errno, "failed to connect to '%s'", path);
 close(sockfd);
 return -1;
-- 
2.8.1




[Qemu-devel] [PATCH v2 04/20] 9p: linux: Fix a couple Linux assumptions

2018-05-31 Thread Keno Fischer
 - Guard Linux only headers.
 - Add qemu/statfs.h header to abstract over the which
   headers are needed for struct statfs
 - Define `ENOATTR` only if not only defined
   (it's defined in system headers on Darwin).

Signed-off-by: Keno Fischer 
---

Changes since v1:
 * New qemu/statfs.h header to factor out the header selection
   to a common place. I did not write a configure check,
   since the rest of 9p only supports linux/darwin anyway,
   so there didn't seem to be much point, but I can write
   one if requested.
 * Now also covers fsdev/virtfs-proxy-helper.c

 fsdev/file-op-9p.h  |  2 +-
 fsdev/virtfs-proxy-helper.c |  4 +++-
 hw/9pfs/9p-local.c  |  2 ++
 include/qemu/statfs.h   | 19 +++
 include/qemu/xattr.h|  4 +++-
 5 files changed, 28 insertions(+), 3 deletions(-)
 create mode 100644 include/qemu/statfs.h

diff --git a/fsdev/file-op-9p.h b/fsdev/file-op-9p.h
index 3fa062b..111f804 100644
--- a/fsdev/file-op-9p.h
+++ b/fsdev/file-op-9p.h
@@ -16,7 +16,7 @@
 
 #include 
 #include 
-#include 
+#include "qemu/statfs.h"
 #include "qemu-fsdev-throttle.h"
 
 #define SM_LOCAL_MODE_BITS0600
diff --git a/fsdev/virtfs-proxy-helper.c b/fsdev/virtfs-proxy-helper.c
index 6f132c5..94fb069 100644
--- a/fsdev/virtfs-proxy-helper.c
+++ b/fsdev/virtfs-proxy-helper.c
@@ -13,17 +13,19 @@
 #include 
 #include 
 #include 
+#ifdef CONFIG_LINUX
 #include 
 #include 
-#include 
 #include 
 #include 
 #ifdef CONFIG_LINUX_MAGIC_H
 #include 
 #endif
+#endif
 #include "qemu-common.h"
 #include "qemu/sockets.h"
 #include "qemu/xattr.h"
+#include "qemu/statfs.h"
 #include "9p-iov-marshal.h"
 #include "hw/9pfs/9p-proxy.h"
 #include "fsdev/9p-iov-marshal.h"
diff --git a/hw/9pfs/9p-local.c b/hw/9pfs/9p-local.c
index bcf2798..adc169a 100644
--- a/hw/9pfs/9p-local.c
+++ b/hw/9pfs/9p-local.c
@@ -27,10 +27,12 @@
 #include "qemu/error-report.h"
 #include "qemu/option.h"
 #include 
+#ifdef CONFIG_LINUX
 #include 
 #ifdef CONFIG_LINUX_MAGIC_H
 #include 
 #endif
+#endif
 #include 
 
 #ifndef XFS_SUPER_MAGIC
diff --git a/include/qemu/statfs.h b/include/qemu/statfs.h
new file mode 100644
index 000..dde289f
--- /dev/null
+++ b/include/qemu/statfs.h
@@ -0,0 +1,19 @@
+/*
+ * Host statfs header abstraction
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2, or any
+ * later version.  See the COPYING file in the top-level directory.
+ *
+ */
+#ifndef QEMU_STATFS_H
+#define QEMU_STATFS_H
+
+#ifdef CONFIG_LINUX
+# include 
+#endif
+#ifdef CONFIG_DARWIN
+# include 
+# include 
+#endif
+
+#endif
diff --git a/include/qemu/xattr.h b/include/qemu/xattr.h
index a83fe8e..f1d0f7b 100644
--- a/include/qemu/xattr.h
+++ b/include/qemu/xattr.h
@@ -22,7 +22,9 @@
 #ifdef CONFIG_LIBATTR
 #  include 
 #else
-#  define ENOATTR ENODATA
+#  if !defined(ENOATTR)
+#define ENOATTR ENODATA
+#  endif
 #  include 
 #endif
 
-- 
2.8.1




[Qemu-devel] [PATCH v2 03/20] 9p: xattr: Fix crash due to free of uninitialized value

2018-05-31 Thread Keno Fischer
If the size returned from llistxattr is 0, we skipped the malloc
call, leaving xattr.value uninitialized. However, this value is
later passed to `g_free` without any further checks, causing an
error. Fix that by always calling g_malloc unconditionally. If
`size` is 0, it will return a pointer that is safe to pass to g_free,
likely NULL.

Signed-off-by: Keno Fischer 
---

Changes since v1: New patch

 hw/9pfs/9p.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/9pfs/9p.c b/hw/9pfs/9p.c
index d74302d..b80db65 100644
--- a/hw/9pfs/9p.c
+++ b/hw/9pfs/9p.c
@@ -3256,8 +3256,8 @@ static void coroutine_fn v9fs_xattrwalk(void *opaque)
 xattr_fidp->fs.xattr.len = size;
 xattr_fidp->fid_type = P9_FID_XATTR;
 xattr_fidp->fs.xattr.xattrwalk_fid = true;
+xattr_fidp->fs.xattr.value = g_malloc0(size);
 if (size) {
-xattr_fidp->fs.xattr.value = g_malloc0(size);
 err = v9fs_co_llistxattr(pdu, _fidp->path,
  xattr_fidp->fs.xattr.value,
  xattr_fidp->fs.xattr.len);
-- 
2.8.1




[Qemu-devel] [PATCH v2 01/20] cutils: Provide strchrnul

2018-05-31 Thread Keno Fischer
strchrnul is a GNU extension and thus unavailable on a number of targets.
In the review for a commit removing strchrnul from 9p, I was asked to
create a qemu_strchrnul helper to factor out this functionality.
Do so, and use it in a number of other places in the code base that inlined
the replacement pattern in a place where strchrnul could be used.

Signed-off-by: Keno Fischer 
---

Changes since v1: New patch

 hw/9pfs/9p-local.c|  2 +-
 include/qemu/cutils.h |  1 +
 monitor.c |  8 ++--
 util/cutils.c | 13 +
 util/qemu-option.c|  6 +-
 util/uri.c|  6 ++
 6 files changed, 20 insertions(+), 16 deletions(-)

diff --git a/hw/9pfs/9p-local.c b/hw/9pfs/9p-local.c
index b37b1db..bcf2798 100644
--- a/hw/9pfs/9p-local.c
+++ b/hw/9pfs/9p-local.c
@@ -65,7 +65,7 @@ int local_open_nofollow(FsContext *fs_ctx, const char *path, 
int flags,
 assert(*path != '/');
 
 head = g_strdup(path);
-c = strchrnul(path, '/');
+c = qemu_strchrnul(path, '/');
 if (*c) {
 /* Intermediate path element */
 head[c - path] = 0;
diff --git a/include/qemu/cutils.h b/include/qemu/cutils.h
index a663340..bc40c30 100644
--- a/include/qemu/cutils.h
+++ b/include/qemu/cutils.h
@@ -122,6 +122,7 @@ int qemu_strnlen(const char *s, int max_len);
  * Returns: the pointer originally in @input.
  */
 char *qemu_strsep(char **input, const char *delim);
+const char *qemu_strchrnul(const char *s, int c);
 time_t mktimegm(struct tm *tm);
 int qemu_fdatasync(int fd);
 int fcntl_setfl(int fd, int flag);
diff --git a/monitor.c b/monitor.c
index 922cfc0..e1f01c4 100644
--- a/monitor.c
+++ b/monitor.c
@@ -798,9 +798,7 @@ static int compare_cmd(const char *name, const char *list)
 p = list;
 for(;;) {
 pstart = p;
-p = strchr(p, '|');
-if (!p)
-p = pstart + strlen(pstart);
+p = qemu_strchrnul(p, '|');
 if ((p - pstart) == len && !memcmp(pstart, name, len))
 return 1;
 if (*p == '\0')
@@ -3401,9 +3399,7 @@ static void cmd_completion(Monitor *mon, const char 
*name, const char *list)
 p = list;
 for(;;) {
 pstart = p;
-p = strchr(p, '|');
-if (!p)
-p = pstart + strlen(pstart);
+p = qemu_strchrnul(p, '|');
 len = p - pstart;
 if (len > sizeof(cmd) - 2)
 len = sizeof(cmd) - 2;
diff --git a/util/cutils.c b/util/cutils.c
index 0de69e6..6e078b0 100644
--- a/util/cutils.c
+++ b/util/cutils.c
@@ -545,6 +545,19 @@ int qemu_strtou64(const char *nptr, const char **endptr, 
int base,
 }
 
 /**
+ * Searches for the first occurrence of 'c' in 's', and returns a pointer
+ * to the trailing null byte if none was found.
+ */
+const char *qemu_strchrnul(const char *s, int c)
+{
+const char *e = strchr(s, c);
+if (!e) {
+e = s + strlen(s);
+}
+return e;
+}
+
+/**
  * parse_uint:
  *
  * @s: String to parse
diff --git a/util/qemu-option.c b/util/qemu-option.c
index 58d1c23..54eca12 100644
--- a/util/qemu-option.c
+++ b/util/qemu-option.c
@@ -77,11 +77,7 @@ const char *get_opt_value(const char *p, char **value)
 
 *value = NULL;
 while (1) {
-offset = strchr(p, ',');
-if (!offset) {
-offset = p + strlen(p);
-}
-
+offset = qemu_strchrnul(p, ',');
 length = offset - p;
 if (*offset != '\0' && *(offset + 1) == ',') {
 length++;
diff --git a/util/uri.c b/util/uri.c
index 8624a7a..8bdef84 100644
--- a/util/uri.c
+++ b/util/uri.c
@@ -52,6 +52,7 @@
  */
 
 #include "qemu/osdep.h"
+#include "qemu/cutils.h"
 
 #include "qemu/uri.h"
 
@@ -2266,10 +2267,7 @@ struct QueryParams *query_params_parse(const char *query)
 /* Find the next separator, or end of the string. */
 end = strchr(query, '&');
 if (!end) {
-end = strchr(query, ';');
-}
-if (!end) {
-end = query + strlen(query);
+end = qemu_strchrnul(query, ';');
 }
 
 /* Find the first '=' character between here and end. */
-- 
2.8.1




[Qemu-devel] [PATCH v2 00/20] 9p: Add support for Darwin

2018-05-31 Thread Keno Fischer
This is v2 of my patch series to provide 9p server
support for Darwin.

The patches in this series address review from v1,
now support building the virtfs proxy, as well
as fix bugs found since v1.

Keno Fischer (20):
  cutils: Provide strchrnul
  9p: proxy: Fix size passed to `connect`
  9p: xattr: Fix crash due to free of uninitialized value
  9p: linux: Fix a couple Linux assumptions
  9p: Properly set errp in fstatfs error path
  9p: Avoid warning if FS_IOC_GETVERSION is not defined
  9p: Move a couple xattr functions to 9p-util
  9p: Rename 9p-util -> 9p-util-linux
  9p: Properly error check and translate flags in unlinkat
  9p: darwin: Handle struct stat(fs) differences
  9p: darwin: Handle struct dirent differences
  9p: darwin: Explicitly cast comparisons of mode_t with -1
  9p: darwin: Ignore O_{NOATIME, DIRECT}
  9p: darwin: Provide a compatibility definition for XATTR_SIZE_MAX
  9p: darwin: *xattr_nofollow implementations
  9p: darwin: Compatibility for f/l*xattr
  9p: darwin: Provide a fallback implementation for utimensat
  9p: darwin: Implement compatibility for mknodat
  9p: darwin: virtfs-proxy: Implement setuid code for darwin
  9p: darwin: configure: Allow VirtFS on Darwin

 Makefile|   6 ++
 Makefile.objs   |   1 +
 configure   |  22 +++--
 fsdev/file-op-9p.h  |   2 +-
 fsdev/virtfs-proxy-helper.c | 230 
 hw/9pfs/9p-local.c  |  68 -
 hw/9pfs/9p-proxy.c  |  22 +++--
 hw/9pfs/9p-synth.c  |   4 +
 hw/9pfs/9p-util-darwin.c| 185 +++
 hw/9pfs/9p-util-linux.c |  70 ++
 hw/9pfs/9p-util.c   |  26 -
 hw/9pfs/9p-util.h   |  31 ++
 hw/9pfs/9p-xattr.c  |  33 ---
 hw/9pfs/9p.c|  86 +++--
 hw/9pfs/Makefile.objs   |   4 +-
 include/qemu/cutils.h   |   1 +
 include/qemu/statfs.h   |  19 
 include/qemu/xattr.h|   4 +-
 monitor.c   |   8 +-
 util/cutils.c   |  13 +++
 util/qemu-option.c  |   6 +-
 util/uri.c  |   6 +-
 22 files changed, 636 insertions(+), 211 deletions(-)
 create mode 100644 hw/9pfs/9p-util-darwin.c
 create mode 100644 hw/9pfs/9p-util-linux.c
 delete mode 100644 hw/9pfs/9p-util.c
 create mode 100644 include/qemu/statfs.h

-- 
2.8.1




Re: [Qemu-devel] [PATCH 1/6] gdbstub: Return the fd from gdbserver_start

2018-05-31 Thread Richard Henderson
On 05/31/2018 04:15 PM, Philippe Mathieu-Daudé wrote:
> Hi Richard,
> 
> On 05/31/2018 07:49 PM, Richard Henderson wrote:
>> This will allow us to protect gdbserver_fd from the guest.
>>
>> Signed-off-by: Richard Henderson 
>> ---
>>  gdbstub.c | 5 +++--
>>  1 file changed, 3 insertions(+), 2 deletions(-)
>>
>> diff --git a/gdbstub.c b/gdbstub.c
>> index 6081e719c5..057d0d65c5 100644
>> --- a/gdbstub.c
>> +++ b/gdbstub.c
>> @@ -1890,15 +1890,16 @@ static int gdbserver_open(int port)
>>  int gdbserver_start(int port)
>>  {
>>  gdbserver_fd = gdbserver_open(port);
>> -if (gdbserver_fd < 0)
>> +if (gdbserver_fd < 0) {
>>  return -1;
>> +}
>>  /* accept connections */
>>  if (!gdb_accept()) {
>>  close(gdbserver_fd);
>>  gdbserver_fd = -1;
>>  return -1;
>>  }
>> -return 0;
>> +return gdbserver_fd;
> 
> I agree with your change, but what about !CONFIG_USER_ONLY?

It's still a non-negative number, so the success value is still the same.


r~



Re: [Qemu-devel] [PATCH v3 0/4] Add new CD-ROM related qtests

2018-05-31 Thread John Snow



On 04/27/2018 05:40 AM, Thomas Huth wrote:
> With one of my clean-up patches (see commit 1454509726719e0933c800), I
> recently accidentially broke the "-cdrom" parameter (more precisely
> "-drive if=scsi") on a couple of boards, since there was no error
> detected during the "make check" regression testing. This is clearly an
> indication that we are lacking tests in this area.
> So this small patch series now introduces some tests for CD-ROM drives:
> The first two patches introduce the possibility to check that booting
> from CD-ROM drives still works fine for x86 and s390x, and the third
> patch adds a test that certain machines can at least still be started
> with the "-cdrom" parameter (i.e. that test would have catched the
> mistake that I did with my SCSI cleanup patch).
> 
> v3:
>  - Rebased to current master branch
>  - Add a final patch that adds an entry to the MAINTAINERS file
> 
> v2:
>  - Use g_spawn_sync() instead of execlp() to run genisoimage
>  - The "-cdrom" parameter test is now run on all architectures (with
>machine "none" for the machines that are not explicitly checked)
>  - Some rewordings and improved comments here and there
> 
> Thomas Huth (4):
>   tests/boot-sector: Add magic bytes to s390x boot code header
>   tests/cdrom-test: Test booting from CD-ROM ISO image file
>   tests/cdrom-test: Test that -cdrom parameter is working
>   MAINTAINERS: Add the cdrom-test to John's section
> 
>  MAINTAINERS|   1 +
>  tests/Makefile.include |   2 +
>  tests/boot-sector.c|   9 +-
>  tests/cdrom-test.c | 222 
> +
>  4 files changed, 231 insertions(+), 3 deletions(-)
>  create mode 100644 tests/cdrom-test.c
> 

*coughs*

Thanks, applied to my IDE tree:

https://github.com/jnsnow/qemu/commits/ide
https://github.com/jnsnow/qemu.git

--js



Re: [Qemu-devel] [PATCH v2 00/16] AHCI: tracing improvements

2018-05-31 Thread John Snow



On 05/31/2018 06:28 PM, John Snow wrote:
> This set just adds register names so that the read/write traces make
> more sense on their own without having to memorize register offsets.
> It also splits read/write traces into supported/unsupported subsets,
> so you can just monitor for things that QEMU is likely doing wrong.
> 
> v2:
>  - Added qemu_log_mask(LOG_UNIMP, ...) statements in addition to traces
>for writes to unknown/unsupported registers. (Philippe)
> 
> John Snow (16):
>   ahci: add port register enumeration
>   ahci: modify ahci_port_read to use register numbers
>   ahci: make port read traces more descriptive
>   ahci: fix spacing damage on ahci_port_write
>   ahci: combine identical clauses in port write
>   ahci: modify ahci_port_write to use register numbers
>   ahci: make port write traces more descriptive
>   ahci: delete old port register address definitions
>   ahci: add host register enumeration
>   ahci: fix host register max address
>   ahci: modify ahci_mem_read_32 to work on register numbers
>   ahci: make mem_read_32 traces more descriptive
>   ahci: fix spacing damage on ahci_mem_write
>   ahci: adjust ahci_mem_write to work on registers
>   ahci: delete old host register address definitions
>   ahci: make ahci_mem_write traces more descriptive
> 
>  hw/ide/ahci.c  | 314 
> ++---
>  hw/ide/ahci_internal.h |  63 ++
>  hw/ide/trace-events|  13 +-
>  3 files changed, 241 insertions(+), 149 deletions(-)
> 

Touched up the format strings, and

Thanks, applied to my IDE tree:

https://github.com/jnsnow/qemu/commits/ide
https://github.com/jnsnow/qemu.git

--js



Re: [Qemu-devel] [PATCH 0/3] ahci: fix completion race condition

2018-05-31 Thread John Snow



On 05/30/2018 08:43 PM, John Snow wrote:
> Commit d759c951f changed the main thread lock release/reacquisition,
> and in so doing apparently jostled loose a race condition in the AHCI
> code.
> 
> Patch 2 should be sufficient to fix this, and patches 1 and 3 are just
> little trivial fixes.
> 
> This might be sufficient to fix the bug as reported at
> https://bugs.launchpad.net/qemu/+bug/1769189
> but the nature of the timing changes make it difficult to confirm,
> so I am posting this patchset for the reporters to help test.
> 
> John Snow (3):
>   ahci: trim signatures on raise/lower
>   ahci: fix PxCI register race
>   ahci: don't schedule unnecessary BH
> 
>  hw/ide/ahci.c | 24 +++-
>  1 file changed, 11 insertions(+), 13 deletions(-)
> 

Thanks for the testing and reviews, everyone!

Thanks, applied to my IDE tree:

https://github.com/jnsnow/qemu/commits/ide
https://github.com/jnsnow/qemu.git

--js



Re: [Qemu-devel] [PATCH v2 00/16] AHCI: tracing improvements

2018-05-31 Thread John Snow



On 05/31/2018 06:49 PM, no-re...@patchew.org wrote:
> Hi,
> 
> This series failed docker-mingw@fedora build test. Please find the testing 
> commands and
> their output below. If you have Docker installed, you can probably reproduce 
> it
> locally.
> 

[ blah blah blah ]

> In file included from /tmp/qemu-test/src/hw/ide/ahci.c:30:0:
> /tmp/qemu-test/src/hw/ide/ahci.c: In function 'ahci_mem_write':
> /tmp/qemu-test/src/hw/ide/ahci.c:497:38: error: format '%lx' expects argument 
> of type 'long unsigned int', but argument 3 has type 'hwaddr {aka long long 
> unsigned int}' [-Werror=format=]
>  qemu_log_mask(LOG_UNIMP, "Attempted write to unimplemented 
> register:"
>   ^
> /tmp/qemu-test/src/include/qemu/log.h:85:22: note: in definition of macro 
> 'qemu_log_mask'
>  qemu_log(FMT, ## __VA_ARGS__);  \
>   ^~~
> /tmp/qemu-test/src/hw/ide/ahci.c:498:63: note: format string is defined here
>" AHCI host register %s, offset 0x%lx: 0x%"PRIu64,
>  ~~^
>  %llx
> In file included from /tmp/qemu-test/src/hw/ide/ahci.c:30:0:
> /tmp/qemu-test/src/hw/ide/ahci.c:511:34: error: format '%lx' expects argument 
> of type 'long unsigned int', but argument 2 has type 'hwaddr {aka long long 
> unsigned int}' [-Werror=format=]
>  qemu_log_mask(LOG_UNIMP, "Attempted write to unimplemented register: 
> "
>   ^
> /tmp/qemu-test/src/include/qemu/log.h:85:22: note: in definition of macro 
> 'qemu_log_mask'
>  qemu_log(FMT, ## __VA_ARGS__);  \
>   ^~~
> /tmp/qemu-test/src/hw/ide/ahci.c:512:59: note: format string is defined here
>"AHCI global register at offset 0x%lx: 0x%"PRIu64,
>  ~~^
>  %llx

Rookie stuff; I forgot hwaddr was just uint64_t, which is implemented as
long on x86_64. Replaced address and value specifications both with
"0x%"PRIx64 in both cases.

--js



Re: [Qemu-devel] [PATCH v2 00/16] AHCI: tracing improvements

2018-05-31 Thread John Snow



On 05/31/2018 06:47 PM, no-re...@patchew.org wrote:
> Hi,
> 
> This series seems to have some coding style problems. See output below for
> more information:
> 
> Type: series
> Message-id: 20180531222835.16558-1-js...@redhat.com
> Subject: [Qemu-devel] [PATCH v2 00/16] AHCI: tracing improvements
> 
> === TEST SCRIPT BEGIN ===
> #!/bin/bash
> 
> BASE=base
> n=1
> total=$(git log --oneline $BASE.. | wc -l)
> failed=0
> 
> git config --local diff.renamelimit 0
> git config --local diff.renames True
> git config --local diff.algorithm histogram
> 
> commits="$(git log --format=%H --reverse $BASE..)"
> for c in $commits; do
> echo "Checking PATCH $n/$total: $(git log -n 1 --format=%s $c)..."
> if ! git show $c --format=email | ./scripts/checkpatch.pl --mailback -; 
> then
> failed=1
> echo
> fi
> n=$((n+1))
> done
> 
> exit $failed
> === TEST SCRIPT END ===
> 
> Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
> From https://github.com/patchew-project/qemu
>  t [tag update]patchew/20180531212435.165261-1-ebl...@redhat.com 
> -> patchew/20180531212435.165261-1-ebl...@redhat.com
>  * [new tag]   patchew/20180531222835.16558-1-js...@redhat.com -> 
> patchew/20180531222835.16558-1-js...@redhat.com
> Switched to a new branch 'test'
> 9932482af2 ahci: make ahci_mem_write traces more descriptive
> c89f4076bd ahci: delete old host register address definitions
> 70f11b4d95 ahci: adjust ahci_mem_write to work on registers
> c13dce7973 ahci: fix spacing damage on ahci_mem_write
> fab5307968 ahci: make mem_read_32 traces more descriptive
> ace1b4ffac ahci: modify ahci_mem_read_32 to work on register numbers
> 3021a82d31 ahci: fix host register max address
> 3c4808a257 ahci: add host register enumeration
> f57758869b ahci: delete old port register address definitions
> 52d04d9b2c ahci: make port write traces more descriptive
> 08f435fad3 ahci: modify ahci_port_write to use register numbers
> 634c79d46b ahci: combine identical clauses in port write
> 50b162ae60 ahci: fix spacing damage on ahci_port_write
> 5895ec8f28 ahci: make port read traces more descriptive
> 68fd051a9a ahci: modify ahci_port_read to use register numbers
> 0478088585 ahci: add port register enumeration
> 
> === OUTPUT BEGIN ===
> Checking PATCH 1/16: ahci: add port register enumeration...
> WARNING: line over 80 characters
> #71: FILE: hw/ide/ahci_internal.h:94:
> +AHCI_PORT_REG_SCR_NOTIF   = 15, /* PxSNTF: SATA phy register: 
> SNotification */
> 
> total: 0 errors, 1 warnings, 72 lines checked
> 
> Your patch has style problems, please review.  If any of these errors
> are false positives report them to the maintainer, see
> CHECKPATCH in MAINTAINERS.
> Checking PATCH 2/16: ahci: modify ahci_port_read to use register numbers...
> Checking PATCH 3/16: ahci: make port read traces more descriptive...
> Checking PATCH 4/16: ahci: fix spacing damage on ahci_port_write...
> ERROR: spaces required around that '|' (ctx:VxV)
> #83: FILE: hw/ide/ahci.c:316:
> +(val & ~(PORT_CMD_RO_MASK|PORT_CMD_ICC_MASK));

I'll fix that in this patch.

>   ^
> 
> total: 1 errors, 0 warnings, 156 lines checked
> 
> Your patch has style problems, please review.  If any of these errors
> are false positives report them to the maintainer, see
> CHECKPATCH in MAINTAINERS.
> 
> Checking PATCH 5/16: ahci: combine identical clauses in port write...
> Checking PATCH 6/16: ahci: modify ahci_port_write to use register numbers...
> Checking PATCH 7/16: ahci: make port write traces more descriptive...
> Checking PATCH 8/16: ahci: delete old port register address definitions...
> Checking PATCH 9/16: ahci: add host register enumeration...
> Checking PATCH 10/16: ahci: fix host register max address...
> Checking PATCH 11/16: ahci: modify ahci_mem_read_32 to work on register 
> numbers...
> Checking PATCH 12/16: ahci: make mem_read_32 traces more descriptive...
> Checking PATCH 13/16: ahci: fix spacing damage on ahci_mem_write...
> Checking PATCH 14/16: ahci: adjust ahci_mem_write to work on registers...
> Checking PATCH 15/16: ahci: delete old host register address definitions...
> Checking PATCH 16/16: ahci: make ahci_mem_write traces more descriptive...
> WARNING: line over 80 characters
> #18: FILE: hw/ide/ahci.c:497:
> +qemu_log_mask(LOG_UNIMP, "Attempted write to unimplemented 
> register:"
> 

TIL that my emacs configuration considers column 80 in-bounds.
Fixed.

> total: 0 errors, 1 warnings, 35 lines checked
> 
> Your patch has style problems, please review.  If any of these errors
> are false positives report them to the maintainer, see
> CHECKPATCH in MAINTAINERS.
> === OUTPUT END ===
> 
> Test command exited with code: 1
> 
> 
> ---
> Email generated automatically by Patchew [http://patchew.org/].
> Please send your feedback to patchew-de...@redhat.com
> 

You're a harsh master, patchew.

--js



Re: [Qemu-devel] [PATCH v2 00/16] AHCI: tracing improvements

2018-05-31 Thread John Snow



On 05/31/2018 07:24 PM, Philippe Mathieu-Daudé wrote:
> On 05/31/2018 07:28 PM, John Snow wrote:
>> This set just adds register names so that the read/write traces make
>> more sense on their own without having to memorize register offsets.
>> It also splits read/write traces into supported/unsupported subsets,
>> so you can just monitor for things that QEMU is likely doing wrong.
>>
>> v2:
>>  - Added qemu_log_mask(LOG_UNIMP, ...) statements in addition to traces
>>for writes to unknown/unsupported registers. (Philippe)
> 
> Thanks,
> 
> With format string and style issues reported by checkpatch fixed:
> Reviewed-by: Philippe Mathieu-Daudé 
> 
> (I recommend you to setup the scripts/git.orderfile to ease reviews).
> 

Ah, yeah, my environment is kind of broken at the moment, sorry :(

Just haven't had the time to figure out why git-publish is broken so I'm
missing a few of my usual pieces.

--js



Re: [Qemu-devel] [PATCH v2 00/16] AHCI: tracing improvements

2018-05-31 Thread Philippe Mathieu-Daudé
On 05/31/2018 07:28 PM, John Snow wrote:
> This set just adds register names so that the read/write traces make
> more sense on their own without having to memorize register offsets.
> It also splits read/write traces into supported/unsupported subsets,
> so you can just monitor for things that QEMU is likely doing wrong.
> 
> v2:
>  - Added qemu_log_mask(LOG_UNIMP, ...) statements in addition to traces
>for writes to unknown/unsupported registers. (Philippe)

Thanks,

With format string and style issues reported by checkpatch fixed:
Reviewed-by: Philippe Mathieu-Daudé 

(I recommend you to setup the scripts/git.orderfile to ease reviews).

> 
> John Snow (16):
>   ahci: add port register enumeration
>   ahci: modify ahci_port_read to use register numbers
>   ahci: make port read traces more descriptive
>   ahci: fix spacing damage on ahci_port_write
>   ahci: combine identical clauses in port write
>   ahci: modify ahci_port_write to use register numbers
>   ahci: make port write traces more descriptive
>   ahci: delete old port register address definitions
>   ahci: add host register enumeration
>   ahci: fix host register max address
>   ahci: modify ahci_mem_read_32 to work on register numbers
>   ahci: make mem_read_32 traces more descriptive
>   ahci: fix spacing damage on ahci_mem_write
>   ahci: adjust ahci_mem_write to work on registers
>   ahci: delete old host register address definitions
>   ahci: make ahci_mem_write traces more descriptive
> 
>  hw/ide/ahci.c  | 314 
> ++---
>  hw/ide/ahci_internal.h |  63 ++
>  hw/ide/trace-events|  13 +-
>  3 files changed, 241 insertions(+), 149 deletions(-)
> 



Re: [Qemu-devel] [PATCH 11/13] 9p: darwin: Mark mknod as unsupported

2018-05-31 Thread Keno Fischer
On Thu, May 31, 2018 at 7:06 PM, Keno Fischer  wrote:
> On Thu, May 31, 2018 at 6:56 PM, Keno Fischer  wrote:
 My concern was that allowing this would cause unexpected
 behavior, since the device numbers will differ between OS X
 and Linux. Though maybe this isn't the place to worry about
 that.
>>>
>>> The numbers may differ indeed but we don't really care since the
>>> server never opens device files. This is just a directory entry.
>>
>> Ok, let me try to implement it. However, I don't think it is possible
>> to implement mknodat (or at least I can't think of how) on Darwin
>> directly. I could use regular mknod, but presumably, this is used
>> to avoid a race condition between creating the device and setting
>> the permissions. Can you think of a good way to resolve that?
>
> Would it work to fchdir in to the directory and use a cwd-relative
> mknod, then fchdir back? Or do we need to maintain the cwd
> while in this code?

Sorry for the triple-self-post here, but I took a look at the Darwin kernel
source and there's an unexposed (from the Darwin C library) syscall that
only changes the current thread's cwd. That seems like it should be safe,
so I'll go ahead and use that to implement mknodat. I'll also submit a
feature request to apple to implement mknodat.



Re: [Qemu-devel] [PATCH 1/6] gdbstub: Return the fd from gdbserver_start

2018-05-31 Thread Philippe Mathieu-Daudé
Hi Richard,

On 05/31/2018 07:49 PM, Richard Henderson wrote:
> This will allow us to protect gdbserver_fd from the guest.
> 
> Signed-off-by: Richard Henderson 
> ---
>  gdbstub.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/gdbstub.c b/gdbstub.c
> index 6081e719c5..057d0d65c5 100644
> --- a/gdbstub.c
> +++ b/gdbstub.c
> @@ -1890,15 +1890,16 @@ static int gdbserver_open(int port)
>  int gdbserver_start(int port)
>  {
>  gdbserver_fd = gdbserver_open(port);
> -if (gdbserver_fd < 0)
> +if (gdbserver_fd < 0) {
>  return -1;
> +}
>  /* accept connections */
>  if (!gdb_accept()) {
>  close(gdbserver_fd);
>  gdbserver_fd = -1;
>  return -1;
>  }
> -return 0;
> +return gdbserver_fd;

I agree with your change, but what about !CONFIG_USER_ONLY?

It should be safe enough documenting the different behaviors in
include/exec/gdbstub.h.

>  }
>  
>  /* Disable gdb stub for child processes.  */
> 



Re: [Qemu-devel] [PATCH v1 4/8] docker: update Travis docker image

2018-05-31 Thread Philippe Mathieu-Daudé
On 05/31/2018 05:14 PM, Alex Bennée wrote:
> 
> Philippe Mathieu-Daudé  writes:
> 
>> Hi Alex,
>>
>> On 05/30/2018 08:06 AM, Alex Bennée wrote:
>>> This is still poorly documented by Travis but according to:
>>>
>>>   
>>> https://docs.travis-ci.com/user/common-build-problems/#Running-a-Container-Based-Docker-Image-Locally
>>>
>>> their reference images are now hosted on Docker Hub. So we update the
>>> FROM line to refer to the new default image. We also need a few
>>> additional tweaks:
>>>
>>>   - re-enable deb-src lines for our build-dep install
>>>   - add explicit PATH definition for tools
>>
>> I don't understand how this is related to QEMU testing, isn't it rather
>> some Travis-ci bug? We don't need to use PhantomJS / Neo4j / Maven.
> 
> It's oddly constructed I'll grant you but I just set the path to what a
> running image has. The normal image is started up with a full systemd
> init whereas we drop directly into the shell.

OK, can you add a comment about it? (so we don't remove what seems
unrelated).

> 
>>
>>>   - force the build USER to be Travis
>>>   - add clang to FEATURES for our test-clang machinery
>>>
>>> Signed-off-by: Alex Bennée 
>>> ---
>>>  tests/docker/dockerfiles/travis.docker | 7 +--
>>>  1 file changed, 5 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/tests/docker/dockerfiles/travis.docker 
>>> b/tests/docker/dockerfiles/travis.docker
>>> index 605b6e429b..6e90f033d5 100644
>>> --- a/tests/docker/dockerfiles/travis.docker
>>> +++ b/tests/docker/dockerfiles/travis.docker
>>> @@ -1,8 +1,11 @@
>>> -FROM quay.io/travisci/travis-ruby
>>> +FROM travisci/ci-garnet:packer-1512502276-986baf0
>>>  ENV DEBIAN_FRONTEND noninteractive
>>>  ENV LANG en_US.UTF-8
>>>  ENV LC_ALL en_US.UTF-8
>>> +RUN cat /etc/apt/sources.list | sed "s/# deb-src/deb-src/" >> 
>>> /etc/apt/sources.list
>>>  RUN apt-get update
>>>  RUN apt-get -y build-dep qemu
>>>  RUN apt-get -y install device-tree-compiler python2.7 python-yaml 
>>> dh-autoreconf gdb strace lsof net-tools
>>> -ENV FEATURES pyyaml

# Travis tools require PhantomJS / Neo4j / Maven accessible
# in their PATH (QEMU build won't access them).

Reviewed-by: Philippe Mathieu-Daudé 

>>> +ENV PATH 
>>> /usr/local/phantomjs/bin:/usr/local/phantomjs:/usr/local/neo4j-3.2.7/bin:/usr/local/maven-3.5.2/bin:/usr/local/cmake-3.9.2/bin:/usr/local/clang-5.0.0/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
>>> +ENV FEATURES clang pyyaml
>>> +USER travis
>>>
> 
> 
> --
> Alex Bennée
> 



Re: [Qemu-devel] [PATCH 11/13] 9p: darwin: Mark mknod as unsupported

2018-05-31 Thread Keno Fischer
On Thu, May 31, 2018 at 6:56 PM, Keno Fischer  wrote:
>>> My concern was that allowing this would cause unexpected
>>> behavior, since the device numbers will differ between OS X
>>> and Linux. Though maybe this isn't the place to worry about
>>> that.
>>
>> The numbers may differ indeed but we don't really care since the
>> server never opens device files. This is just a directory entry.
>
> Ok, let me try to implement it. However, I don't think it is possible
> to implement mknodat (or at least I can't think of how) on Darwin
> directly. I could use regular mknod, but presumably, this is used
> to avoid a race condition between creating the device and setting
> the permissions. Can you think of a good way to resolve that?

Would it work to fchdir in to the directory and use a cwd-relative
mknod, then fchdir back? Or do we need to maintain the cwd
while in this code?



Re: [Qemu-devel] [PATCH v5 0/6] linux-user: Reorg interp_prefix handling

2018-05-31 Thread no-reply
Hi,

This series seems to have some coding style problems. See output below for
more information:

Type: series
Message-id: 20180531224911.23725-1-richard.hender...@linaro.org
Subject: [Qemu-devel] [PATCH v5 0/6] linux-user: Reorg interp_prefix handling

=== TEST SCRIPT BEGIN ===
#!/bin/bash

BASE=base
n=1
total=$(git log --oneline $BASE.. | wc -l)
failed=0

git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram

commits="$(git log --format=%H --reverse $BASE..)"
for c in $commits; do
echo "Checking PATCH $n/$total: $(git log -n 1 --format=%s $c)..."
if ! git show $c --format=email | ./scripts/checkpatch.pl --mailback -; then
failed=1
echo
fi
n=$((n+1))
done

exit $failed
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
From https://github.com/patchew-project/qemu
 * [new tag]   
patchew/20180531224911.23725-1-richard.hender...@linaro.org -> 
patchew/20180531224911.23725-1-richard.hender...@linaro.org
Switched to a new branch 'test'
2755fa1986 linux-user: Use *at functions to implement interp_prefix
8af5f75489 linux-user: Check is_hostfd in mmap syscalls
83374c532a linux-user: Check contains_hostfd in select syscalls
41b80ec144 linux-user: Check is_hostfd in do_syscall
fb3aa3a0d8 linux-user: Add host_fds and manipulators
11567891c5 gdbstub: Return the fd from gdbserver_start

=== OUTPUT BEGIN ===
Checking PATCH 1/6: gdbstub: Return the fd from gdbserver_start...
Checking PATCH 2/6: linux-user: Add host_fds and manipulators...
Checking PATCH 3/6: linux-user: Check is_hostfd in do_syscall...
Checking PATCH 4/6: linux-user: Check contains_hostfd in select syscalls...
Checking PATCH 5/6: linux-user: Check is_hostfd in mmap syscalls...
Checking PATCH 6/6: linux-user: Use *at functions to implement interp_prefix...
ERROR: do not use assignment in if condition
#200: FILE: linux-user/syscall.c:8541:
+if (!(fn = lock_user_string(arg2))) {

ERROR: do not use assignment in if condition
#218: FILE: linux-user/syscall.c:8562:
+if (!(fn = lock_user_string(arg1))) {

ERROR: do not use assignment in if condition
#236: FILE: linux-user/syscall.c:8577:
+if (!(fn = lock_user_string(arg2))) {

ERROR: do not use assignment in if condition
#254: FILE: linux-user/syscall.c:8735:
+if (!(fn = lock_user_string(arg1))) {

ERROR: do not use assignment in if condition
#333: FILE: linux-user/syscall.c:9769:
+if (!(fn = lock_user_string(arg1))) {

ERROR: do not use assignment in if condition
#349: FILE: linux-user/syscall.c:9809:
+if (!(fn = lock_user_string(arg1))) {

ERROR: do not use assignment in if condition
#365: FILE: linux-user/syscall.c:10099:
+if (!(fn = lock_user_string(arg1))) {

ERROR: do not use assignment in if condition
#380: FILE: linux-user/syscall.c:10111:
+if (!(fn = lock_user_string(arg1))) {

ERROR: do not use assignment in if condition
#398: FILE: linux-user/syscall.c:11303:
+if (!(fn = lock_user_string(arg1))) {

ERROR: do not use assignment in if condition
#415: FILE: linux-user/syscall.c:11317:
+if (!(fn = lock_user_string(arg1))) {

ERROR: do not use assignment in if condition
#433: FILE: linux-user/syscall.c:11349:
+if (!(fn = lock_user_string(arg2))) {

ERROR: do not use assignment in if condition
#452: FILE: linux-user/syscall.c:12394:
+if (!(fn = lock_user_string(arg2))) {

ERROR: do not use assignment in if condition
#472: FILE: linux-user/syscall.c:12433:
+if (!(fn = lock_user_string(arg2))) {

total: 13 errors, 0 warnings, 423 lines checked

Your patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

=== OUTPUT END ===

Test command exited with code: 1


---
Email generated automatically by Patchew [http://patchew.org/].
Please send your feedback to patchew-de...@redhat.com

Re: [Qemu-devel] [PATCHv2] slirp: Fix spurious error report when sending directly

2018-05-31 Thread Philippe Mathieu-Daudé
On 05/31/2018 05:31 PM, Samuel Thibault wrote:
> Move check to where it actually is useful, and reduce scope of 'len'
> variable along the way.
> 
> Signed-off-by: Samuel Thibault 

Thanks!

Reviewed-by: Philippe Mathieu-Daudé 

> ---
> 
> Difference from v1:
> - move check instead of initializing len.
> 
>  slirp/socket.c | 14 +++---
>  1 file changed, 7 insertions(+), 7 deletions(-)
> 
> diff --git a/slirp/socket.c b/slirp/socket.c
> index e2a71c9b04..08fe98907d 100644
> --- a/slirp/socket.c
> +++ b/slirp/socket.c
> @@ -340,7 +340,7 @@ sosendoob(struct socket *so)
>   struct sbuf *sb = >so_rcv;
>   char buff[2048]; /* XXX Shouldn't be sending more oob data than this */
>  
> - int n, len;
> + int n;
>  
>   DEBUG_CALL("sosendoob");
>   DEBUG_ARG("so = %p", so);
> @@ -359,7 +359,7 @@ sosendoob(struct socket *so)
>* send it all
>*/
>   uint32_t urgc = so->so_urgc;
> - len = (sb->sb_data + sb->sb_datalen) - sb->sb_rptr;
> + int len = (sb->sb_data + sb->sb_datalen) - sb->sb_rptr;
>   if (len > urgc) {
>   len = urgc;
>   }
> @@ -374,13 +374,13 @@ sosendoob(struct socket *so)
>   len += n;
>   }
>   n = slirp_send(so, buff, len, (MSG_OOB)); /* |MSG_DONTWAIT)); */
> - }
> -
>  #ifdef DEBUG
> - if (n != len) {
> - DEBUG_ERROR((dfd, "Didn't send all data urgently X\n"));
> - }
> + if (n != len) {
> + DEBUG_ERROR((dfd, "Didn't send all data urgently 
> X\n"));
> + }
>  #endif
> + }
> +
>   if (n < 0) {
>   return n;
>   }
> 



Re: [Qemu-devel] [PATCH 11/13] 9p: darwin: Mark mknod as unsupported

2018-05-31 Thread Keno Fischer
>> My concern was that allowing this would cause unexpected
>> behavior, since the device numbers will differ between OS X
>> and Linux. Though maybe this isn't the place to worry about
>> that.
>
> The numbers may differ indeed but we don't really care since the
> server never opens device files. This is just a directory entry.

Ok, let me try to implement it. However, I don't think it is possible
to implement mknodat (or at least I can't think of how) on Darwin
directly. I could use regular mknod, but presumably, this is used
to avoid a race condition between creating the device and setting
the permissions. Can you think of a good way to resolve that?



[Qemu-devel] [PATCH 3/6] linux-user: Check is_hostfd in do_syscall

2018-05-31 Thread Richard Henderson
This is the vast majority of all fd inputs, but not all.

Signed-off-by: Richard Henderson 
---
 linux-user/syscall.c | 303 +++
 1 file changed, 280 insertions(+), 23 deletions(-)

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index d02c16bbc6..5339f0bc1c 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -8034,6 +8034,9 @@ abi_long do_syscall(void *cpu_env, int num, abi_long arg1,
 if (arg3 == 0)
 ret = 0;
 else {
+if (is_hostfd(arg1)) {
+goto ebadf;
+}
 if (!(p = lock_user(VERIFY_WRITE, arg2, arg3, 0)))
 goto efault;
 ret = get_errno(safe_read(arg1, p, arg3));
@@ -8045,6 +8048,9 @@ abi_long do_syscall(void *cpu_env, int num, abi_long arg1,
 }
 break;
 case TARGET_NR_write:
+if (is_hostfd(arg1)) {
+goto ebadf;
+}
 if (!(p = lock_user(VERIFY_READ, arg2, arg3, 1)))
 goto efault;
 if (fd_trans_target_to_host_data(arg1)) {
@@ -8072,6 +8078,9 @@ abi_long do_syscall(void *cpu_env, int num, abi_long arg1,
 break;
 #endif
 case TARGET_NR_openat:
+if (is_hostfd(arg1)) {
+goto ebadf;
+}
 if (!(p = lock_user_string(arg2)))
 goto efault;
 ret = get_errno(do_openat(cpu_env, arg1, p,
@@ -8082,16 +8091,25 @@ abi_long do_syscall(void *cpu_env, int num, abi_long 
arg1,
 break;
 #if defined(TARGET_NR_name_to_handle_at) && defined(CONFIG_OPEN_BY_HANDLE)
 case TARGET_NR_name_to_handle_at:
+if (is_hostfd(arg1)) {
+goto ebadf;
+}
 ret = do_name_to_handle_at(arg1, arg2, arg3, arg4, arg5);
 break;
 #endif
 #if defined(TARGET_NR_open_by_handle_at) && defined(CONFIG_OPEN_BY_HANDLE)
 case TARGET_NR_open_by_handle_at:
+if (is_hostfd(arg1)) {
+goto ebadf;
+}
 ret = do_open_by_handle_at(arg1, arg2, arg3);
 fd_trans_unregister(ret);
 break;
 #endif
 case TARGET_NR_close:
+if (is_hostfd(arg1)) {
+goto ebadf;
+}
 fd_trans_unregister(arg1);
 ret = get_errno(close(arg1));
 break;
@@ -8155,7 +8173,9 @@ abi_long do_syscall(void *cpu_env, int num, abi_long arg1,
 #endif
 #if defined(TARGET_NR_linkat)
 case TARGET_NR_linkat:
-{
+if (is_hostfd(arg1)) {
+goto ebadf;
+} else {
 void * p2 = NULL;
 if (!arg2 || !arg4)
 goto efault;
@@ -8180,6 +8200,9 @@ abi_long do_syscall(void *cpu_env, int num, abi_long arg1,
 #endif
 #if defined(TARGET_NR_unlinkat)
 case TARGET_NR_unlinkat:
+if (is_hostfd(arg1)) {
+goto ebadf;
+}
 if (!(p = lock_user_string(arg2)))
 goto efault;
 ret = get_errno(unlinkat(arg1, p, arg3));
@@ -8311,6 +8334,9 @@ abi_long do_syscall(void *cpu_env, int num, abi_long arg1,
 #endif
 #if defined(TARGET_NR_mknodat)
 case TARGET_NR_mknodat:
+if (is_hostfd(arg1)) {
+goto ebadf;
+}
 if (!(p = lock_user_string(arg2)))
 goto efault;
 ret = get_errno(mknodat(arg1, p, arg3, arg4));
@@ -8334,6 +8360,9 @@ abi_long do_syscall(void *cpu_env, int num, abi_long arg1,
 goto unimplemented;
 #endif
 case TARGET_NR_lseek:
+if (is_hostfd(arg1)) {
+goto ebadf;
+}
 ret = get_errno(lseek(arg1, arg2, arg3));
 break;
 #if defined(TARGET_NR_getxpid) && defined(TARGET_ALPHA)
@@ -8484,7 +8513,9 @@ abi_long do_syscall(void *cpu_env, int num, abi_long arg1,
 #endif
 #if defined(TARGET_NR_futimesat)
 case TARGET_NR_futimesat:
-{
+if (is_hostfd(arg1)) {
+goto ebadf;
+} else {
 struct timeval *tvp, tv[2];
 if (arg3) {
 if (copy_from_user_timeval([0], arg3)
@@ -8520,6 +8551,9 @@ abi_long do_syscall(void *cpu_env, int num, abi_long arg1,
 #endif
 #if defined(TARGET_NR_faccessat) && defined(__NR_faccessat)
 case TARGET_NR_faccessat:
+if (is_hostfd(arg1)) {
+goto ebadf;
+}
 if (!(p = lock_user_string(arg2)))
 goto efault;
 ret = get_errno(faccessat(arg1, p, arg3, 0));
@@ -8564,7 +8598,9 @@ abi_long do_syscall(void *cpu_env, int num, abi_long arg1,
 #endif
 #if defined(TARGET_NR_renameat)
 case TARGET_NR_renameat:
-{
+if (is_hostfd(arg1)) {
+goto ebadf;
+} else {
 void *p2;
 p  = lock_user_string(arg2);
 p2 = lock_user_string(arg4);
@@ -8579,7 +8615,9 @@ abi_long do_syscall(void *cpu_env, int num, abi_long arg1,
 #endif
 #if defined(TARGET_NR_renameat2)
 case TARGET_NR_renameat2:
-{
+if (is_hostfd(arg1)) {
+goto ebadf;
+} else {
 void *p2;
 p  = lock_user_string(arg2);
   

Re: [Qemu-devel] [PATCH v2 00/16] AHCI: tracing improvements

2018-05-31 Thread no-reply
Hi,

This series failed docker-mingw@fedora build test. Please find the testing 
commands and
their output below. If you have Docker installed, you can probably reproduce it
locally.

Type: series
Message-id: 20180531222835.16558-1-js...@redhat.com
Subject: [Qemu-devel] [PATCH v2 00/16] AHCI: tracing improvements

=== TEST SCRIPT BEGIN ===
#!/bin/bash
set -e
git submodule update --init dtc
# Let docker tests dump environment info
export SHOW_ENV=1
export J=8
time make docker-test-mingw@fedora
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
Switched to a new branch 'test'
9932482af2 ahci: make ahci_mem_write traces more descriptive
c89f4076bd ahci: delete old host register address definitions
70f11b4d95 ahci: adjust ahci_mem_write to work on registers
c13dce7973 ahci: fix spacing damage on ahci_mem_write
fab5307968 ahci: make mem_read_32 traces more descriptive
ace1b4ffac ahci: modify ahci_mem_read_32 to work on register numbers
3021a82d31 ahci: fix host register max address
3c4808a257 ahci: add host register enumeration
f57758869b ahci: delete old port register address definitions
52d04d9b2c ahci: make port write traces more descriptive
08f435fad3 ahci: modify ahci_port_write to use register numbers
634c79d46b ahci: combine identical clauses in port write
50b162ae60 ahci: fix spacing damage on ahci_port_write
5895ec8f28 ahci: make port read traces more descriptive
68fd051a9a ahci: modify ahci_port_read to use register numbers
0478088585 ahci: add port register enumeration

=== OUTPUT BEGIN ===
Submodule 'dtc' (git://git.qemu-project.org/dtc.git) registered for path 'dtc'
Cloning into '/var/tmp/patchew-tester-tmp-v_cvirkw/src/dtc'...
Submodule path 'dtc': checked out 'e54388015af1fb4bf04d0bca99caba1074d9cc42'
  BUILD   fedora
make[1]: Entering directory '/var/tmp/patchew-tester-tmp-v_cvirkw/src'
  GEN 
/var/tmp/patchew-tester-tmp-v_cvirkw/src/docker-src.2018-05-31-18.47.55.18742/qemu.tar
Cloning into 
'/var/tmp/patchew-tester-tmp-v_cvirkw/src/docker-src.2018-05-31-18.47.55.18742/qemu.tar.vroot'...
done.
Checking out files:  49% (3050/6180)   
Checking out files:  50% (3090/6180)   
Checking out files:  51% (3152/6180)   
Checking out files:  52% (3214/6180)   
Checking out files:  53% (3276/6180)   
Checking out files:  54% (3338/6180)   
Checking out files:  55% (3399/6180)   
Checking out files:  56% (3461/6180)   
Checking out files:  57% (3523/6180)   
Checking out files:  58% (3585/6180)   
Checking out files:  59% (3647/6180)   
Checking out files:  60% (3708/6180)   
Checking out files:  61% (3770/6180)   
Checking out files:  62% (3832/6180)   
Checking out files:  63% (3894/6180)   
Checking out files:  64% (3956/6180)   
Checking out files:  65% (4017/6180)   
Checking out files:  66% (4079/6180)   
Checking out files:  67% (4141/6180)   
Checking out files:  68% (4203/6180)   
Checking out files:  69% (4265/6180)   
Checking out files:  70% (4326/6180)   
Checking out files:  71% (4388/6180)   
Checking out files:  72% (4450/6180)   
Checking out files:  73% (4512/6180)   
Checking out files:  74% (4574/6180)   
Checking out files:  75% (4635/6180)   
Checking out files:  76% (4697/6180)   
Checking out files:  77% (4759/6180)   
Checking out files:  78% (4821/6180)   
Checking out files:  79% (4883/6180)   
Checking out files:  80% (4944/6180)   
Checking out files:  81% (5006/6180)   
Checking out files:  82% (5068/6180)   
Checking out files:  83% (5130/6180)   
Checking out files:  84% (5192/6180)   
Checking out files:  85% (5253/6180)   
Checking out files:  86% (5315/6180)   
Checking out files:  87% (5377/6180)   
Checking out files:  88% (5439/6180)   
Checking out files:  89% (5501/6180)   
Checking out files:  90% (5562/6180)   
Checking out files:  91% (5624/6180)   
Checking out files:  92% (5686/6180)   
Checking out files:  93% (5748/6180)   
Checking out files:  94% (5810/6180)   
Checking out files:  95% (5871/6180)   
Checking out files:  96% (5933/6180)   
Checking out files:  97% (5995/6180)   
Checking out files:  98% (6057/6180)   
Checking out files:  98% (6105/6180)   
Checking out files:  99% (6119/6180)   
Checking out files: 100% (6180/6180)   
Checking out files: 100% (6180/6180), done.
Your branch is up-to-date with 'origin/test'.
Submodule 'dtc' (git://git.qemu-project.org/dtc.git) registered for path 'dtc'
Cloning into 
'/var/tmp/patchew-tester-tmp-v_cvirkw/src/docker-src.2018-05-31-18.47.55.18742/qemu.tar.vroot/dtc'...
Submodule path 'dtc': checked out 'e54388015af1fb4bf04d0bca99caba1074d9cc42'
Submodule 'ui/keycodemapdb' (git://git.qemu.org/keycodemapdb.git) registered 
for path 'ui/keycodemapdb'
Cloning into 
'/var/tmp/patchew-tester-tmp-v_cvirkw/src/docker-src.2018-05-31-18.47.55.18742/qemu.tar.vroot/ui/keycodemapdb'...
Submodule path 'ui/keycodemapdb': checked out 
'6b3d716e2b6472eb7189d3220552280ef3d832ce'
  COPYRUNNER
RUN test-mingw in qemu:fedora 
Packages installed:
PyYAML-3.12-5.fc27.x86_64

[Qemu-devel] [PATCH 6/6] linux-user: Use *at functions to implement interp_prefix

2018-05-31 Thread Richard Henderson
If the interp_prefix is a complete chroot, it may have a *lot* of files.
Setting up the cache for this is quite expensive.

For the most part, we can use the *at versions of various syscalls to
attempt the operation in the prefix.  For the few cases that remain,
attempt the operation in the prefix via concatenation and then retry
if that fails.

Signed-off-by: Richard Henderson 
---
 linux-user/qemu.h|  37 ++
 linux-user/elfload.c |   5 +-
 linux-user/main.c|  26 ++-
 linux-user/syscall.c | 163 +--
 4 files changed, 174 insertions(+), 57 deletions(-)

diff --git a/linux-user/qemu.h b/linux-user/qemu.h
index 33dafbe0e4..05a82a3628 100644
--- a/linux-user/qemu.h
+++ b/linux-user/qemu.h
@@ -471,7 +471,44 @@ void mmap_fork_start(void);
 void mmap_fork_end(int child);
 
 /* main.c */
+extern int interp_dirfd;
 extern unsigned long guest_stack_size;
+char *interp_prefix_path(const char *path);
+
+/* If PATH is absolute, attempt an operation first within interp_dirfd,
+ * using OPENAT_EXPR.  If that fails with ENOENT, or if PATH is not
+ * absolute, only use NORMAL_EXPR.
+ */
+#define TRY_INTERP_FD(RET, PATH, OPENAT_EXPR, NORMAL_EXPR)  \
+do {\
+if (interp_dirfd >= 0 && (PATH)[0] == '/') {\
+RET = OPENAT_EXPR;  \
+if (RET != -1 || errno != ENOENT) { \
+break;  \
+}   \
+}   \
+RET = NORMAL_EXPR;  \
+} while (0)
+
+/* If PATH is absolute, attempt an operation first with interp_prefix
+ * prefixed.  If that fails with ENOENT, or if PATH is not absolute,
+ * only attempt with PATH.
+ */
+#define TRY_INTERP_PATH(RET, PATH, EXPR)\
+do {\
+char *new_##PATH = interp_prefix_path(PATH);\
+if (new_##PATH) {   \
+__typeof(PATH) save_##PATH = PATH;  \
+PATH = new_##PATH;  \
+RET = EXPR; \
+free(new_##PATH);   \
+PATH = save_##PATH; \
+if (RET != -1 || errno != ENOENT) { \
+break;  \
+}   \
+}   \
+RET = EXPR; \
+} while (0)
 
 /* user access */
 
diff --git a/linux-user/elfload.c b/linux-user/elfload.c
index 13bc78d0c8..abdf5bbf01 100644
--- a/linux-user/elfload.c
+++ b/linux-user/elfload.c
@@ -6,7 +6,6 @@
 
 #include "qemu.h"
 #include "disas/disas.h"
-#include "qemu/path.h"
 
 #ifdef _ARCH_PPC64
 #undef ARCH_DLINFO
@@ -2375,7 +2374,9 @@ static void load_elf_interp(const char *filename, struct 
image_info *info,
 {
 int fd, retval;
 
-fd = open(path(filename), O_RDONLY);
+TRY_INTERP_FD(fd, filename,
+  openat(interp_dirfd, filename + 1, O_RDONLY),
+  open(filename, O_RDONLY));
 if (fd < 0) {
 goto exit_perror;
 }
diff --git a/linux-user/main.c b/linux-user/main.c
index ee3f323c08..4e956fc00c 100644
--- a/linux-user/main.c
+++ b/linux-user/main.c
@@ -23,7 +23,6 @@
 
 #include "qapi/error.h"
 #include "qemu.h"
-#include "qemu/path.h"
 #include "qemu/config-file.h"
 #include "qemu/cutils.h"
 #include "qemu/help_option.h"
@@ -89,9 +88,27 @@ unsigned long reserved_va;
 
 static void usage(int exitcode);
 
+int interp_dirfd;
 static const char *interp_prefix = CONFIG_QEMU_INTERP_PREFIX;
 const char *qemu_uname_release;
 
+char *interp_prefix_path(const char *path)
+{
+size_t i_len, p_len;
+char *ret;
+
+if (interp_prefix == NULL || path[0] != '/') {
+return NULL;
+}
+i_len = strlen(interp_prefix);
+p_len = strlen(path) + 1;
+ret = g_malloc(i_len + p_len);
+
+memcpy(ret, interp_prefix, i_len);
+memcpy(ret + i_len, path, p_len);
+return ret;
+}
+
 /* XXX: on x86 MAP_GROWSDOWN only works if ESP <= address + 32, so
we allocate a bigger stack. Need a better solution, for example
by remapping the process stack directly at the right place */
@@ -671,7 +688,12 @@ int main(int argc, char **argv, char **envp)
 memset(, 0, sizeof (bprm));
 
 /* Scan interp_prefix dir for replacement files. */
-init_paths(interp_prefix);
+interp_dirfd = open(interp_prefix, O_CLOEXEC | O_DIRECTORY | O_PATH);
+if (interp_dirfd >= 0) {
+add_hostfd(interp_dirfd);
+} else {
+interp_prefix = NULL;
+}
 
 init_qemu_uname_release();
 
diff --git 

[Qemu-devel] [PATCH 4/6] linux-user: Check contains_hostfd in select syscalls

2018-05-31 Thread Richard Henderson
Signed-off-by: Richard Henderson 
---
 linux-user/syscall.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index 5339f0bc1c..b98125829b 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -1506,6 +1506,11 @@ static abi_long do_select(int n,
 if (ret) {
 return ret;
 }
+if (contains_hostfd() ||
+contains_hostfd() ||
+contains_hostfd()) {
+return -TARGET_EBADF;
+}
 
 if (target_tv_addr) {
 if (copy_from_user_timeval(, target_tv_addr))
@@ -9392,6 +9397,11 @@ abi_long do_syscall(void *cpu_env, int num, abi_long 
arg1,
 if (ret) {
 goto fail;
 }
+if (contains_hostfd() ||
+contains_hostfd() ||
+contains_hostfd()) {
+goto ebadf;
+}
 
 /*
  * This takes a timespec, and not a timeval, so we cannot
-- 
2.17.0




[Qemu-devel] [PATCH 2/6] linux-user: Add host_fds and manipulators

2018-05-31 Thread Richard Henderson
This allows emulation of guest syscalls to reject
manipulations to fds used by the host.

Signed-off-by: Richard Henderson 
---
 linux-user/qemu.h | 30 ++
 linux-user/main.c | 27 ++-
 2 files changed, 56 insertions(+), 1 deletion(-)

diff --git a/linux-user/qemu.h b/linux-user/qemu.h
index c55c8e294b..33dafbe0e4 100644
--- a/linux-user/qemu.h
+++ b/linux-user/qemu.h
@@ -155,6 +155,36 @@ void task_settid(TaskState *);
 void stop_all_tasks(void);
 extern const char *qemu_uname_release;
 extern unsigned long mmap_min_addr;
+extern fd_set host_fds;
+
+/**
+ * is_hostfd:
+ * @fd: file descriptor to check
+ *
+ * Return true if @fd is being used by the host and therefore any
+ * guest system call referencing @fd should return EBADF.
+ */
+static inline bool is_hostfd(int fd)
+{
+return fd >= 0 && fd < FD_SETSIZE && FD_ISSET(fd, _fds);
+}
+
+/**
+ * contains_hostfd:
+ * @fds: fd_set of descriptors to check
+ *
+ * Return true if any descriptor in @fds are being used by the host
+ * and therefore the guest system call should return EBADF.
+ */
+bool contains_hostfd(const fd_set *fds);
+
+/**
+ * add_hostfd:
+ * @fd: file descriptor to reserve
+ *
+ * Add @fd to the set of file descriptors to reserve for the host.
+ */
+void add_hostfd(int fd);
 
 /* ??? See if we can avoid exposing so much of the loader internals.  */
 
diff --git a/linux-user/main.c b/linux-user/main.c
index 78d6d3e7eb..ee3f323c08 100644
--- a/linux-user/main.c
+++ b/linux-user/main.c
@@ -49,6 +49,7 @@ static const char *cpu_type;
 unsigned long mmap_min_addr;
 unsigned long guest_base;
 int have_guest_base;
+fd_set host_fds;
 
 /*
  * When running 32-on-64 we should make sure we can fit all of the possible
@@ -112,6 +113,23 @@ int cpu_get_pic_interrupt(CPUX86State *env)
 }
 #endif
 
+bool contains_hostfd(const fd_set *fds)
+{
+int i;
+for (i = 0; i < ARRAY_SIZE(__FDS_BITS(fds)); ++i) {
+if (__FDS_BITS(fds)[i] & __FDS_BITS(_fds)[i]) {
+return true;
+}
+}
+return true;
+}
+
+void add_hostfd(int fd)
+{
+g_assert(fd >= 0 && fd < FD_SETSIZE);
+FD_SET(fd, _fds);
+}
+
 /***/
 /* Helper routines for implementing atomic operations.  */
 
@@ -805,12 +823,19 @@ int main(int argc, char **argv, char **envp)
 
 target_cpu_copy_regs(env, regs);
 
+/* Prevent the guest from closing the log file.  */
+if (qemu_logfile && qemu_logfile != stderr) {
+add_hostfd(fileno(qemu_logfile));
+}
+
 if (gdbstub_port) {
-if (gdbserver_start(gdbstub_port) < 0) {
+int fd = gdbserver_start(gdbstub_port);
+if (fd < 0) {
 fprintf(stderr, "qemu: could not open gdbserver on port %d\n",
 gdbstub_port);
 exit(EXIT_FAILURE);
 }
+add_hostfd(fd);
 gdb_handlesig(cpu, 0);
 }
 cpu_loop(env);
-- 
2.17.0




[Qemu-devel] [PATCH 1/6] gdbstub: Return the fd from gdbserver_start

2018-05-31 Thread Richard Henderson
This will allow us to protect gdbserver_fd from the guest.

Signed-off-by: Richard Henderson 
---
 gdbstub.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/gdbstub.c b/gdbstub.c
index 6081e719c5..057d0d65c5 100644
--- a/gdbstub.c
+++ b/gdbstub.c
@@ -1890,15 +1890,16 @@ static int gdbserver_open(int port)
 int gdbserver_start(int port)
 {
 gdbserver_fd = gdbserver_open(port);
-if (gdbserver_fd < 0)
+if (gdbserver_fd < 0) {
 return -1;
+}
 /* accept connections */
 if (!gdb_accept()) {
 close(gdbserver_fd);
 gdbserver_fd = -1;
 return -1;
 }
-return 0;
+return gdbserver_fd;
 }
 
 /* Disable gdb stub for child processes.  */
-- 
2.17.0




[Qemu-devel] [PATCH v5 0/6] linux-user: Reorg interp_prefix handling

2018-05-31 Thread Richard Henderson
Changes since v4 (2018-01-28)
  * Protect host fds from guest manipulation.
  * Rename the interp_dirfd macro to TRY_INTERP_FD
  * Add TRY_INTERP_PATH for acct, statfs, inotify_add_watch

Changes since v3 (2017-12-29)
  * Use DO/WHILE as the control construct; wrap it in a macro.
  * Introduce linux_user_path to handle the cases *at syscalls
do not cover.

Changes since v2 (2017-12-04)
  * Use IF as the control construct instead of SWITCH.

Changes since v1 (2016-11-??)
  * Require interp_dirfd set before trying the *at path.


r~


Richard Henderson (6):
  gdbstub: Return the fd from gdbserver_start
  linux-user: Add host_fds and manipulators
  linux-user: Check is_hostfd in do_syscall
  linux-user: Check contains_hostfd in select syscalls
  linux-user: Check is_hostfd in mmap syscalls
  linux-user: Use *at functions to implement interp_prefix

 linux-user/qemu.h|  67 ++
 gdbstub.c|   5 +-
 linux-user/elfload.c |   5 +-
 linux-user/main.c|  53 -
 linux-user/syscall.c | 485 ---
 5 files changed, 532 insertions(+), 83 deletions(-)

-- 
2.17.0




[Qemu-devel] [PATCH 5/6] linux-user: Check is_hostfd in mmap syscalls

2018-05-31 Thread Richard Henderson
Signed-off-by: Richard Henderson 
---
 linux-user/syscall.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index b98125829b..d7513d5dac 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -9605,11 +9605,17 @@ abi_long do_syscall(void *cpu_env, int num, abi_long 
arg1,
 v5 = tswapal(v[4]);
 v6 = tswapal(v[5]);
 unlock_user(v, arg1, 0);
+if (is_hostfd(v5)) {
+goto ebadf;
+}
 ret = get_errno(target_mmap(v1, v2, v3,
 target_to_host_bitmask(v4, 
mmap_flags_tbl),
 v5, v6));
 }
 #else
+if (is_hostfd(arg5)) {
+goto ebadf;
+}
 ret = get_errno(target_mmap(arg1, arg2, arg3,
 target_to_host_bitmask(arg4, 
mmap_flags_tbl),
 arg5,
@@ -9622,6 +9628,9 @@ abi_long do_syscall(void *cpu_env, int num, abi_long arg1,
 #ifndef MMAP_SHIFT
 #define MMAP_SHIFT 12
 #endif
+if (is_hostfd(arg5)) {
+goto ebadf;
+}
 ret = get_errno(target_mmap(arg1, arg2, arg3,
 target_to_host_bitmask(arg4, 
mmap_flags_tbl),
 arg5,
-- 
2.17.0




Re: [Qemu-devel] [PATCH v2 00/16] AHCI: tracing improvements

2018-05-31 Thread no-reply
Hi,

This series seems to have some coding style problems. See output below for
more information:

Type: series
Message-id: 20180531222835.16558-1-js...@redhat.com
Subject: [Qemu-devel] [PATCH v2 00/16] AHCI: tracing improvements

=== TEST SCRIPT BEGIN ===
#!/bin/bash

BASE=base
n=1
total=$(git log --oneline $BASE.. | wc -l)
failed=0

git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram

commits="$(git log --format=%H --reverse $BASE..)"
for c in $commits; do
echo "Checking PATCH $n/$total: $(git log -n 1 --format=%s $c)..."
if ! git show $c --format=email | ./scripts/checkpatch.pl --mailback -; then
failed=1
echo
fi
n=$((n+1))
done

exit $failed
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
From https://github.com/patchew-project/qemu
 t [tag update]patchew/20180531212435.165261-1-ebl...@redhat.com -> 
patchew/20180531212435.165261-1-ebl...@redhat.com
 * [new tag]   patchew/20180531222835.16558-1-js...@redhat.com -> 
patchew/20180531222835.16558-1-js...@redhat.com
Switched to a new branch 'test'
9932482af2 ahci: make ahci_mem_write traces more descriptive
c89f4076bd ahci: delete old host register address definitions
70f11b4d95 ahci: adjust ahci_mem_write to work on registers
c13dce7973 ahci: fix spacing damage on ahci_mem_write
fab5307968 ahci: make mem_read_32 traces more descriptive
ace1b4ffac ahci: modify ahci_mem_read_32 to work on register numbers
3021a82d31 ahci: fix host register max address
3c4808a257 ahci: add host register enumeration
f57758869b ahci: delete old port register address definitions
52d04d9b2c ahci: make port write traces more descriptive
08f435fad3 ahci: modify ahci_port_write to use register numbers
634c79d46b ahci: combine identical clauses in port write
50b162ae60 ahci: fix spacing damage on ahci_port_write
5895ec8f28 ahci: make port read traces more descriptive
68fd051a9a ahci: modify ahci_port_read to use register numbers
0478088585 ahci: add port register enumeration

=== OUTPUT BEGIN ===
Checking PATCH 1/16: ahci: add port register enumeration...
WARNING: line over 80 characters
#71: FILE: hw/ide/ahci_internal.h:94:
+AHCI_PORT_REG_SCR_NOTIF   = 15, /* PxSNTF: SATA phy register: 
SNotification */

total: 0 errors, 1 warnings, 72 lines checked

Your patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
Checking PATCH 2/16: ahci: modify ahci_port_read to use register numbers...
Checking PATCH 3/16: ahci: make port read traces more descriptive...
Checking PATCH 4/16: ahci: fix spacing damage on ahci_port_write...
ERROR: spaces required around that '|' (ctx:VxV)
#83: FILE: hw/ide/ahci.c:316:
+(val & ~(PORT_CMD_RO_MASK|PORT_CMD_ICC_MASK));
  ^

total: 1 errors, 0 warnings, 156 lines checked

Your patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

Checking PATCH 5/16: ahci: combine identical clauses in port write...
Checking PATCH 6/16: ahci: modify ahci_port_write to use register numbers...
Checking PATCH 7/16: ahci: make port write traces more descriptive...
Checking PATCH 8/16: ahci: delete old port register address definitions...
Checking PATCH 9/16: ahci: add host register enumeration...
Checking PATCH 10/16: ahci: fix host register max address...
Checking PATCH 11/16: ahci: modify ahci_mem_read_32 to work on register 
numbers...
Checking PATCH 12/16: ahci: make mem_read_32 traces more descriptive...
Checking PATCH 13/16: ahci: fix spacing damage on ahci_mem_write...
Checking PATCH 14/16: ahci: adjust ahci_mem_write to work on registers...
Checking PATCH 15/16: ahci: delete old host register address definitions...
Checking PATCH 16/16: ahci: make ahci_mem_write traces more descriptive...
WARNING: line over 80 characters
#18: FILE: hw/ide/ahci.c:497:
+qemu_log_mask(LOG_UNIMP, "Attempted write to unimplemented 
register:"

total: 0 errors, 1 warnings, 35 lines checked

Your patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
=== OUTPUT END ===

Test command exited with code: 1


---
Email generated automatically by Patchew [http://patchew.org/].
Please send your feedback to patchew-de...@redhat.com

[Qemu-devel] [PATCH v2 16/16] ahci: make ahci_mem_write traces more descriptive

2018-05-31 Thread John Snow
Signed-off-by: John Snow 
---
 hw/ide/ahci.c   | 13 -
 hw/ide/trace-events |  4 +++-
 2 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/hw/ide/ahci.c b/hw/ide/ahci.c
index 1309f80458..6cab853741 100644
--- a/hw/ide/ahci.c
+++ b/hw/ide/ahci.c
@@ -494,13 +494,24 @@ static void ahci_mem_write(void *opaque, hwaddr addr,
 /* FIXME report write? */
 break;
 default:
-trace_ahci_mem_write_unknown(s, size, addr, val);
+qemu_log_mask(LOG_UNIMP, "Attempted write to unimplemented 
register:"
+  " AHCI host register %s, offset 0x%lx: 0x%"PRIu64,
+  AHCIHostReg_lookup[regnum], addr, val);
+trace_ahci_mem_write_host_unimpl(s, size,
+ AHCIHostReg_lookup[regnum], addr);
 }
+trace_ahci_mem_write_host(s, size, AHCIHostReg_lookup[regnum],
+ addr, val);
 } else if ((addr >= AHCI_PORT_REGS_START_ADDR) &&
(addr < (AHCI_PORT_REGS_START_ADDR +
 (s->ports * AHCI_PORT_ADDR_OFFSET_LEN {
 ahci_port_write(s, (addr - AHCI_PORT_REGS_START_ADDR) >> 7,
 addr & AHCI_PORT_ADDR_OFFSET_MASK, val);
+} else {
+qemu_log_mask(LOG_UNIMP, "Attempted write to unimplemented register: "
+  "AHCI global register at offset 0x%lx: 0x%"PRIu64,
+  addr, val);
+trace_ahci_mem_write_unimpl(s, size, addr, val);
 }
 }
 
diff --git a/hw/ide/trace-events b/hw/ide/trace-events
index 8149a54db8..e6bd95f52f 100644
--- a/hw/ide/trace-events
+++ b/hw/ide/trace-events
@@ -77,7 +77,9 @@ ahci_mem_read_32_host(void *s, const char *reg, uint64_t 
addr, uint32_t val) "ah
 ahci_mem_read_32_host_default(void *s, const char *reg, uint64_t addr) 
"ahci(%p): unimplemented mem read [reg:%s] @ 0x%"PRIx64
 ahci_mem_read(void *s, unsigned size, uint64_t addr, uint64_t val) "ahci(%p): 
read%u @ 0x%"PRIx64": 0x%016"PRIx64
 ahci_mem_write(void *s, unsigned size, uint64_t addr, uint64_t val) "ahci(%p): 
write%u @ 0x%"PRIx64": 0x%016"PRIx64
-ahci_mem_write_unknown(void *s, unsigned size, uint64_t addr, uint64_t val) 
"ahci(%p): write%u to unknown register 0x%"PRIx64": 0x%016"PRIx64
+ahci_mem_write_host_unimpl(void *s, unsigned size, const char *reg, uint64_t 
addr) "ahci(%p) unimplemented write%u [reg:%s] @ 0x%"PRIx64
+ahci_mem_write_host(void *s, unsigned size, const char *reg, uint64_t addr, 
uint64_t val) "ahci(%p) write%u [reg:%s] @ 0x%"PRIx64": 0x%016"PRIx64
+ahci_mem_write_unimpl(void *s, unsigned size, uint64_t addr, uint64_t val) 
"ahci(%p): write%u to unknown register 0x%"PRIx64": 0x%016"PRIx64
 ahci_set_signature(void *s, int port, uint8_t nsector, uint8_t sector, uint8_t 
lcyl, uint8_t hcyl, uint32_t sig) "ahci(%p)[%d]: set signature sector:0x%02x 
nsector:0x%02x lcyl:0x%02x hcyl:0x%02x (cumulatively: 0x%08x)"
 ahci_reset_port(void *s, int port) "ahci(%p)[%d]: reset port"
 ahci_unmap_fis_address_null(void *s, int port) "ahci(%p)[%d]: Attempt to unmap 
NULL FIS address"
-- 
2.14.3




[Qemu-devel] [PATCH v2 12/16] ahci: make mem_read_32 traces more descriptive

2018-05-31 Thread John Snow
Signed-off-by: John Snow 
---
 hw/ide/ahci.c   | 7 +--
 hw/ide/trace-events | 3 +++
 2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/hw/ide/ahci.c b/hw/ide/ahci.c
index 7edbb30f60..5e902c6615 100644
--- a/hw/ide/ahci.c
+++ b/hw/ide/ahci.c
@@ -47,7 +47,6 @@ static bool ahci_map_fis_address(AHCIDevice *ad);
 static void ahci_unmap_clb_address(AHCIDevice *ad);
 static void ahci_unmap_fis_address(AHCIDevice *ad);
 
-__attribute__((__unused__)) /* TODO */
 static const char *AHCIHostReg_lookup[AHCI_HOST_REG__COUNT] = {
 [AHCI_HOST_REG_CAP]= "CAP",
 [AHCI_HOST_REG_CTL]= "GHC",
@@ -406,13 +405,17 @@ static uint64_t ahci_mem_read_32(void *opaque, hwaddr 
addr)
 val = s->control_regs.version;
 break;
 default:
-break;
+trace_ahci_mem_read_32_host_default(s, AHCIHostReg_lookup[regnum],
+addr);
 }
+trace_ahci_mem_read_32_host(s, AHCIHostReg_lookup[regnum], addr, val);
 } else if ((addr >= AHCI_PORT_REGS_START_ADDR) &&
(addr < (AHCI_PORT_REGS_START_ADDR +
 (s->ports * AHCI_PORT_ADDR_OFFSET_LEN {
 val = ahci_port_read(s, (addr - AHCI_PORT_REGS_START_ADDR) >> 7,
  addr & AHCI_PORT_ADDR_OFFSET_MASK);
+} else {
+trace_ahci_mem_read_32_default(s, addr, val);
 }
 
 trace_ahci_mem_read_32(s, addr, val);
diff --git a/hw/ide/trace-events b/hw/ide/trace-events
index 1efbbb8114..8149a54db8 100644
--- a/hw/ide/trace-events
+++ b/hw/ide/trace-events
@@ -72,6 +72,9 @@ ahci_trigger_irq(void *s, int port, const char *name, 
uint32_t val, uint32_t old
 ahci_port_write(void *s, int port, const char *reg, int offset, uint32_t val) 
"ahci(%p)[%d]: port write [reg:%s] @ 0x%x: 0x%08x"
 ahci_port_write_unimpl(void *s, int port, const char *reg, int offset, 
uint32_t val) "ahci(%p)[%d]: unimplemented port write [reg:%s] @ 0x%x: 0x%08x"
 ahci_mem_read_32(void *s, uint64_t addr, uint32_t val) "ahci(%p): mem read @ 
0x%"PRIx64": 0x%08x"
+ahci_mem_read_32_default(void *s, uint64_t addr, uint32_t val) "ahci(%p): mem 
read @ 0x%"PRIx64": 0x%08x"
+ahci_mem_read_32_host(void *s, const char *reg, uint64_t addr, uint32_t val) 
"ahci(%p): mem read [reg:%s] @ 0x%"PRIx64": 0x%08x"
+ahci_mem_read_32_host_default(void *s, const char *reg, uint64_t addr) 
"ahci(%p): unimplemented mem read [reg:%s] @ 0x%"PRIx64
 ahci_mem_read(void *s, unsigned size, uint64_t addr, uint64_t val) "ahci(%p): 
read%u @ 0x%"PRIx64": 0x%016"PRIx64
 ahci_mem_write(void *s, unsigned size, uint64_t addr, uint64_t val) "ahci(%p): 
write%u @ 0x%"PRIx64": 0x%016"PRIx64
 ahci_mem_write_unknown(void *s, unsigned size, uint64_t addr, uint64_t val) 
"ahci(%p): write%u to unknown register 0x%"PRIx64": 0x%016"PRIx64
-- 
2.14.3




[Qemu-devel] [PATCH v2 11/16] ahci: modify ahci_mem_read_32 to work on register numbers

2018-05-31 Thread John Snow
Signed-off-by: John Snow 
---
 hw/ide/ahci.c | 17 +++--
 1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/hw/ide/ahci.c b/hw/ide/ahci.c
index f2e8bce797..7edbb30f60 100644
--- a/hw/ide/ahci.c
+++ b/hw/ide/ahci.c
@@ -386,22 +386,27 @@ static uint64_t ahci_mem_read_32(void *opaque, hwaddr 
addr)
 uint32_t val = 0;
 
 if (addr < AHCI_GENERIC_HOST_CONTROL_REGS_MAX_ADDR) {
-switch (addr) {
-case HOST_CAP:
+enum AHCIHostReg regnum = addr / 4;
+assert(regnum < AHCI_HOST_REG__COUNT);
+
+switch (regnum) {
+case AHCI_HOST_REG_CAP:
 val = s->control_regs.cap;
 break;
-case HOST_CTL:
+case AHCI_HOST_REG_CTL:
 val = s->control_regs.ghc;
 break;
-case HOST_IRQ_STAT:
+case AHCI_HOST_REG_IRQ_STAT:
 val = s->control_regs.irqstatus;
 break;
-case HOST_PORTS_IMPL:
+case AHCI_HOST_REG_PORTS_IMPL:
 val = s->control_regs.impl;
 break;
-case HOST_VERSION:
+case AHCI_HOST_REG_VERSION:
 val = s->control_regs.version;
 break;
+default:
+break;
 }
 } else if ((addr >= AHCI_PORT_REGS_START_ADDR) &&
(addr < (AHCI_PORT_REGS_START_ADDR +
-- 
2.14.3




[Qemu-devel] [PATCH v2 01/16] ahci: add port register enumeration

2018-05-31 Thread John Snow
Instead of tracking offsets, lets count the registers.

Signed-off-by: John Snow 
---
 hw/ide/ahci.c  | 25 +
 hw/ide/ahci_internal.h | 29 +
 2 files changed, 54 insertions(+)

diff --git a/hw/ide/ahci.c b/hw/ide/ahci.c
index e22d7be05f..48130c6439 100644
--- a/hw/ide/ahci.c
+++ b/hw/ide/ahci.c
@@ -46,6 +46,31 @@ static bool ahci_map_fis_address(AHCIDevice *ad);
 static void ahci_unmap_clb_address(AHCIDevice *ad);
 static void ahci_unmap_fis_address(AHCIDevice *ad);
 
+__attribute__((__unused__)) /* TODO */
+static const char *AHCIPortReg_lookup[AHCI_PORT_REG__COUNT] = {
+[AHCI_PORT_REG_LST_ADDR]= "PxCLB",
+[AHCI_PORT_REG_LST_ADDR_HI] = "PxCLBU",
+[AHCI_PORT_REG_FIS_ADDR]= "PxFB",
+[AHCI_PORT_REG_FIS_ADDR_HI] = "PxFBU",
+[AHCI_PORT_REG_IRQ_STAT]= "PxIS",
+[AHCI_PORT_REG_IRQ_MASK]= "PXIE",
+[AHCI_PORT_REG_CMD] = "PxCMD",
+[7] = "Reserved",
+[AHCI_PORT_REG_TFDATA]  = "PxTFD",
+[AHCI_PORT_REG_SIG] = "PxSIG",
+[AHCI_PORT_REG_SCR_STAT]= "PxSSTS",
+[AHCI_PORT_REG_SCR_CTL] = "PxSCTL",
+[AHCI_PORT_REG_SCR_ERR] = "PxSERR",
+[AHCI_PORT_REG_SCR_ACT] = "PxSACT",
+[AHCI_PORT_REG_CMD_ISSUE]   = "PxCI",
+[AHCI_PORT_REG_SCR_NOTIF]   = "PxSNTF",
+[AHCI_PORT_REG_FIS_CTL] = "PxFBS",
+[AHCI_PORT_REG_DEV_SLEEP]   = "PxDEVSLP",
+[18 ... 27] = "Reserved",
+[AHCI_PORT_REG_VENDOR_1 ...
+ AHCI_PORT_REG_VENDOR_4]= "PxVS",
+};
+
 static const char *AHCIPortIRQ_lookup[AHCI_PORT_IRQ__COUNT] = {
 [AHCI_PORT_IRQ_BIT_DHRS] = "DHRS",
 [AHCI_PORT_IRQ_BIT_PSS]  = "PSS",
diff --git a/hw/ide/ahci_internal.h b/hw/ide/ahci_internal.h
index 1a25d6c039..eb7e1eefc0 100644
--- a/hw/ide/ahci_internal.h
+++ b/hw/ide/ahci_internal.h
@@ -74,6 +74,34 @@
 #define HOST_CAP_NCQ  (1 << 30) /* Native Command Queueing */
 #define HOST_CAP_64   (1U << 31) /* PCI DAC (64-bit DMA) support */
 
+/* registers for each SATA port */
+enum AHCIPortReg {
+AHCI_PORT_REG_LST_ADDR= 0, /* PxCLB: command list DMA addr */
+AHCI_PORT_REG_LST_ADDR_HI = 1, /* PxCLBU: command list DMA addr hi */
+AHCI_PORT_REG_FIS_ADDR= 2, /* PxFB: FIS rx buf addr */
+AHCI_PORT_REG_FIS_ADDR_HI = 3, /* PxFBU: FIX rx buf addr hi */
+AHCI_PORT_REG_IRQ_STAT= 4, /* PxIS: interrupt status */
+AHCI_PORT_REG_IRQ_MASK= 5, /* PxIE: interrupt enable/disable mask */
+AHCI_PORT_REG_CMD = 6, /* PxCMD: port command */
+/* RESERVED */
+AHCI_PORT_REG_TFDATA  = 8, /* PxTFD: taskfile data */
+AHCI_PORT_REG_SIG = 9, /* PxSIG: device TF signature */
+AHCI_PORT_REG_SCR_STAT= 10, /* PxSSTS: SATA phy register: SStatus */
+AHCI_PORT_REG_SCR_CTL = 11, /* PxSCTL: SATA phy register: SControl */
+AHCI_PORT_REG_SCR_ERR = 12, /* PxSERR: SATA phy register: SError */
+AHCI_PORT_REG_SCR_ACT = 13, /* PxSACT: SATA phy register: SActive */
+AHCI_PORT_REG_CMD_ISSUE   = 14, /* PxCI: command issue */
+AHCI_PORT_REG_SCR_NOTIF   = 15, /* PxSNTF: SATA phy register: 
SNotification */
+AHCI_PORT_REG_FIS_CTL = 16, /* PxFBS: Port multiplier switching ctl */
+AHCI_PORT_REG_DEV_SLEEP   = 17, /* PxDEVSLP: device sleep control */
+/* RESERVED */
+AHCI_PORT_REG_VENDOR_1= 28, /* PxVS: Vendor Specific */
+AHCI_PORT_REG_VENDOR_2= 29,
+AHCI_PORT_REG_VENDOR_3= 30,
+AHCI_PORT_REG_VENDOR_4= 31,
+AHCI_PORT_REG__COUNT  = 32
+};
+
 /* registers for each SATA port */
 #define PORT_LST_ADDR 0x00 /* command list DMA addr */
 #define PORT_LST_ADDR_HI  0x04 /* command list DMA addr hi */
@@ -82,6 +110,7 @@
 #define PORT_IRQ_STAT 0x10 /* interrupt status */
 #define PORT_IRQ_MASK 0x14 /* interrupt enable/disable mask */
 #define PORT_CMD  0x18 /* port command */
+
 #define PORT_TFDATA   0x20 /* taskfile data */
 #define PORT_SIG  0x24 /* device TF signature */
 #define PORT_SCR_STAT 0x28 /* SATA phy register: SStatus */
-- 
2.14.3




[Qemu-devel] [PATCH v2 14/16] ahci: adjust ahci_mem_write to work on registers

2018-05-31 Thread John Snow
Actually, this function looks pretty broken, but for now, let's finish
up what this series of commits came here to do.

Signed-off-by: John Snow 
---
 hw/ide/ahci.c | 15 +--
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/hw/ide/ahci.c b/hw/ide/ahci.c
index 568e6b848a..1309f80458 100644
--- a/hw/ide/ahci.c
+++ b/hw/ide/ahci.c
@@ -468,11 +468,14 @@ static void ahci_mem_write(void *opaque, hwaddr addr,
 }
 
 if (addr < AHCI_GENERIC_HOST_CONTROL_REGS_MAX_ADDR) {
-switch (addr) {
-case HOST_CAP: /* R/WO, RO */
+enum AHCIHostReg regnum = addr / 4;
+assert(regnum < AHCI_HOST_REG__COUNT);
+
+switch (regnum) {
+case AHCI_HOST_REG_CAP: /* R/WO, RO */
 /* FIXME handle R/WO */
 break;
-case HOST_CTL: /* R/W */
+case AHCI_HOST_REG_CTL: /* R/W */
 if (val & HOST_CTL_RESET) {
 ahci_reset(s);
 } else {
@@ -480,14 +483,14 @@ static void ahci_mem_write(void *opaque, hwaddr addr,
 ahci_check_irq(s);
 }
 break;
-case HOST_IRQ_STAT: /* R/WC, RO */
+case AHCI_HOST_REG_IRQ_STAT: /* R/WC, RO */
 s->control_regs.irqstatus &= ~val;
 ahci_check_irq(s);
 break;
-case HOST_PORTS_IMPL: /* R/WO, RO */
+case AHCI_HOST_REG_PORTS_IMPL: /* R/WO, RO */
 /* FIXME handle R/WO */
 break;
-case HOST_VERSION: /* RO */
+case AHCI_HOST_REG_VERSION: /* RO */
 /* FIXME report write? */
 break;
 default:
-- 
2.14.3




[Qemu-devel] [PATCH v2 03/16] ahci: make port read traces more descriptive

2018-05-31 Thread John Snow
A trace is added to let us watch unimplemented registers specifically,
as these are more likely to cause us trouble. Otherwise, the port read
traces now tell us what register is getting hit, which is nicer.

Signed-off-by: John Snow 
---
 hw/ide/ahci.c   | 5 +++--
 hw/ide/trace-events | 3 ++-
 2 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/hw/ide/ahci.c b/hw/ide/ahci.c
index 5187bf34ad..5b5ab823f4 100644
--- a/hw/ide/ahci.c
+++ b/hw/ide/ahci.c
@@ -46,7 +46,6 @@ static bool ahci_map_fis_address(AHCIDevice *ad);
 static void ahci_unmap_clb_address(AHCIDevice *ad);
 static void ahci_unmap_fis_address(AHCIDevice *ad);
 
-__attribute__((__unused__)) /* TODO */
 static const char *AHCIPortReg_lookup[AHCI_PORT_REG__COUNT] = {
 [AHCI_PORT_REG_LST_ADDR]= "PxCLB",
 [AHCI_PORT_REG_LST_ADDR_HI] = "PxCLBU",
@@ -149,10 +148,12 @@ static uint32_t ahci_port_read(AHCIState *s, int port, 
int offset)
 val = pr->cmd_issue;
 break;
 default:
+trace_ahci_port_read_default(s, port, AHCIPortReg_lookup[regnum],
+ offset);
 val = 0;
 }
 
-trace_ahci_port_read(s, port, offset, val);
+trace_ahci_port_read(s, port, AHCIPortReg_lookup[regnum], offset, val);
 return val;
 }
 
diff --git a/hw/ide/trace-events b/hw/ide/trace-events
index 5c0e59ec42..0db18d8271 100644
--- a/hw/ide/trace-events
+++ b/hw/ide/trace-events
@@ -63,7 +63,8 @@ ide_atapi_cmd_read_dma_cb_aio(void *s, int lba, int n) 
"IDEState: %p; aio read:
 ide_atapi_cmd_packet(void *s, uint16_t limit, const char *packet) "IDEState: 
%p; limit=0x%x packet: %s"
 
 # hw/ide/ahci.c
-ahci_port_read(void *s, int port, int offset, uint32_t ret) "ahci(%p)[%d]: 
port read @ 0x%x: 0x%08x"
+ahci_port_read(void *s, int port, const char *reg, int offset, uint32_t ret) 
"ahci(%p)[%d]: port read [reg:%s] @ 0x%x: 0x%08x"
+ahci_port_read_default(void *s, int port, const char *reg, int offset) 
"ahci(%p)[%d]: unimplemented port read [reg:%s] @ 0x%x"
 ahci_irq_raise(void *s) "ahci(%p): raise irq"
 ahci_irq_lower(void *s) "ahci(%p): lower irq"
 ahci_check_irq(void *s, uint32_t old, uint32_t new) "ahci(%p): check irq 
0x%08x --> 0x%08x"
-- 
2.14.3




[Qemu-devel] [PATCH v2 15/16] ahci: delete old host register address definitions

2018-05-31 Thread John Snow
Signed-off-by: John Snow 
---
 hw/ide/ahci_internal.h | 6 --
 1 file changed, 6 deletions(-)

diff --git a/hw/ide/ahci_internal.h b/hw/ide/ahci_internal.h
index f9dcf8b6e6..2953243929 100644
--- a/hw/ide/ahci_internal.h
+++ b/hw/ide/ahci_internal.h
@@ -55,12 +55,6 @@
 #define RX_FIS_UNK0x60 /* offset of Unknown FIS data */
 
 /* global controller registers */
-#define HOST_CAP  0x00 /* host capabilities */
-#define HOST_CTL  0x04 /* global host control */
-#define HOST_IRQ_STAT 0x08 /* interrupt status */
-#define HOST_PORTS_IMPL   0x0c /* bitmap of implemented ports */
-#define HOST_VERSION  0x10 /* AHCI spec. version compliancy */
-
 enum AHCIHostReg {
 AHCI_HOST_REG_CAP= 0,  /* CAP: host capabilities */
 AHCI_HOST_REG_CTL= 1,  /* GHC: global host control */
-- 
2.14.3




[Qemu-devel] [PATCH v2 09/16] ahci: add host register enumeration

2018-05-31 Thread John Snow
Signed-off-by: John Snow 
---
 hw/ide/ahci.c  | 15 +++
 hw/ide/ahci_internal.h | 15 +++
 2 files changed, 30 insertions(+)

diff --git a/hw/ide/ahci.c b/hw/ide/ahci.c
index 01463f0f51..f2e8bce797 100644
--- a/hw/ide/ahci.c
+++ b/hw/ide/ahci.c
@@ -47,6 +47,21 @@ static bool ahci_map_fis_address(AHCIDevice *ad);
 static void ahci_unmap_clb_address(AHCIDevice *ad);
 static void ahci_unmap_fis_address(AHCIDevice *ad);
 
+__attribute__((__unused__)) /* TODO */
+static const char *AHCIHostReg_lookup[AHCI_HOST_REG__COUNT] = {
+[AHCI_HOST_REG_CAP]= "CAP",
+[AHCI_HOST_REG_CTL]= "GHC",
+[AHCI_HOST_REG_IRQ_STAT]   = "IS",
+[AHCI_HOST_REG_PORTS_IMPL] = "PI",
+[AHCI_HOST_REG_VERSION]= "VS",
+[AHCI_HOST_REG_CCC_CTL]= "CCC_CTL",
+[AHCI_HOST_REG_CCC_PORTS]  = "CCC_PORTS",
+[AHCI_HOST_REG_EM_LOC] = "EM_LOC",
+[AHCI_HOST_REG_EM_CTL] = "EM_CTL",
+[AHCI_HOST_REG_CAP2]   = "CAP2",
+[AHCI_HOST_REG_BOHC]   = "BOHC",
+};
+
 static const char *AHCIPortReg_lookup[AHCI_PORT_REG__COUNT] = {
 [AHCI_PORT_REG_LST_ADDR]= "PxCLB",
 [AHCI_PORT_REG_LST_ADDR_HI] = "PxCLBU",
diff --git a/hw/ide/ahci_internal.h b/hw/ide/ahci_internal.h
index db00c9aa39..af366db6f3 100644
--- a/hw/ide/ahci_internal.h
+++ b/hw/ide/ahci_internal.h
@@ -61,6 +61,21 @@
 #define HOST_PORTS_IMPL   0x0c /* bitmap of implemented ports */
 #define HOST_VERSION  0x10 /* AHCI spec. version compliancy */
 
+enum AHCIHostReg {
+AHCI_HOST_REG_CAP= 0,  /* CAP: host capabilities */
+AHCI_HOST_REG_CTL= 1,  /* GHC: global host control */
+AHCI_HOST_REG_IRQ_STAT   = 2,  /* IS: interrupt status */
+AHCI_HOST_REG_PORTS_IMPL = 3,  /* PI: bitmap of implemented ports */
+AHCI_HOST_REG_VERSION= 4,  /* VS: AHCI spec. version compliancy */
+AHCI_HOST_REG_CCC_CTL= 5,  /* CCC_CTL: CCC Control */
+AHCI_HOST_REG_CCC_PORTS  = 6,  /* CCC_PORTS: CCC Ports */
+AHCI_HOST_REG_EM_LOC = 7,  /* EM_LOC: Enclosure Mgmt Location */
+AHCI_HOST_REG_EM_CTL = 8,  /* EM_CTL: Enclosure Mgmt Control */
+AHCI_HOST_REG_CAP2   = 9,  /* CAP2: host capabilities, extended */
+AHCI_HOST_REG_BOHC   = 10, /* BOHC: firmare/os handoff ctrl & status */
+AHCI_HOST_REG__COUNT = 11
+};
+
 /* HOST_CTL bits */
 #define HOST_CTL_RESET(1 << 0)  /* reset controller; self-clear */
 #define HOST_CTL_IRQ_EN   (1 << 1)  /* global IRQ enable */
-- 
2.14.3




[Qemu-devel] [PATCH v2 02/16] ahci: modify ahci_port_read to use register numbers

2018-05-31 Thread John Snow
Signed-off-by: John Snow 
---
 hw/ide/ahci.c | 38 +++---
 1 file changed, 19 insertions(+), 19 deletions(-)

diff --git a/hw/ide/ahci.c b/hw/ide/ahci.c
index 48130c6439..5187bf34ad 100644
--- a/hw/ide/ahci.c
+++ b/hw/ide/ahci.c
@@ -93,41 +93,42 @@ static const char *AHCIPortIRQ_lookup[AHCI_PORT_IRQ__COUNT] 
= {
 [AHCI_PORT_IRQ_BIT_CPDS] = "CPDS"
 };
 
-static uint32_t  ahci_port_read(AHCIState *s, int port, int offset)
+static uint32_t ahci_port_read(AHCIState *s, int port, int offset)
 {
 uint32_t val;
-AHCIPortRegs *pr;
-pr = >dev[port].port_regs;
+AHCIPortRegs *pr = >dev[port].port_regs;
+enum AHCIPortReg regnum = offset / sizeof(uint32_t);
+assert(regnum < (AHCI_PORT_ADDR_OFFSET_LEN / sizeof(uint32_t)));
 
-switch (offset) {
-case PORT_LST_ADDR:
+switch (regnum) {
+case AHCI_PORT_REG_LST_ADDR:
 val = pr->lst_addr;
 break;
-case PORT_LST_ADDR_HI:
+case AHCI_PORT_REG_LST_ADDR_HI:
 val = pr->lst_addr_hi;
 break;
-case PORT_FIS_ADDR:
+case AHCI_PORT_REG_FIS_ADDR:
 val = pr->fis_addr;
 break;
-case PORT_FIS_ADDR_HI:
+case AHCI_PORT_REG_FIS_ADDR_HI:
 val = pr->fis_addr_hi;
 break;
-case PORT_IRQ_STAT:
+case AHCI_PORT_REG_IRQ_STAT:
 val = pr->irq_stat;
 break;
-case PORT_IRQ_MASK:
+case AHCI_PORT_REG_IRQ_MASK:
 val = pr->irq_mask;
 break;
-case PORT_CMD:
+case AHCI_PORT_REG_CMD:
 val = pr->cmd;
 break;
-case PORT_TFDATA:
+case AHCI_PORT_REG_TFDATA:
 val = pr->tfdata;
 break;
-case PORT_SIG:
+case AHCI_PORT_REG_SIG:
 val = pr->sig;
 break;
-case PORT_SCR_STAT:
+case AHCI_PORT_REG_SCR_STAT:
 if (s->dev[port].port.ifs[0].blk) {
 val = SATA_SCR_SSTATUS_DET_DEV_PRESENT_PHY_UP |
   SATA_SCR_SSTATUS_SPD_GEN1 | SATA_SCR_SSTATUS_IPM_ACTIVE;
@@ -135,19 +136,18 @@ static uint32_t  ahci_port_read(AHCIState *s, int port, 
int offset)
 val = SATA_SCR_SSTATUS_DET_NODEV;
 }
 break;
-case PORT_SCR_CTL:
+case AHCI_PORT_REG_SCR_CTL:
 val = pr->scr_ctl;
 break;
-case PORT_SCR_ERR:
+case AHCI_PORT_REG_SCR_ERR:
 val = pr->scr_err;
 break;
-case PORT_SCR_ACT:
+case AHCI_PORT_REG_SCR_ACT:
 val = pr->scr_act;
 break;
-case PORT_CMD_ISSUE:
+case AHCI_PORT_REG_CMD_ISSUE:
 val = pr->cmd_issue;
 break;
-case PORT_RESERVED:
 default:
 val = 0;
 }
-- 
2.14.3




[Qemu-devel] [PATCH v2 13/16] ahci: fix spacing damage on ahci_mem_write

2018-05-31 Thread John Snow
Signed-off-by: John Snow 
---
 hw/ide/ahci.c | 47 +++
 1 file changed, 23 insertions(+), 24 deletions(-)

diff --git a/hw/ide/ahci.c b/hw/ide/ahci.c
index 5e902c6615..568e6b848a 100644
--- a/hw/ide/ahci.c
+++ b/hw/ide/ahci.c
@@ -469,37 +469,36 @@ static void ahci_mem_write(void *opaque, hwaddr addr,
 
 if (addr < AHCI_GENERIC_HOST_CONTROL_REGS_MAX_ADDR) {
 switch (addr) {
-case HOST_CAP: /* R/WO, RO */
-/* FIXME handle R/WO */
-break;
-case HOST_CTL: /* R/W */
-if (val & HOST_CTL_RESET) {
-ahci_reset(s);
-} else {
-s->control_regs.ghc = (val & 0x3) | HOST_CTL_AHCI_EN;
-ahci_check_irq(s);
-}
-break;
-case HOST_IRQ_STAT: /* R/WC, RO */
-s->control_regs.irqstatus &= ~val;
+case HOST_CAP: /* R/WO, RO */
+/* FIXME handle R/WO */
+break;
+case HOST_CTL: /* R/W */
+if (val & HOST_CTL_RESET) {
+ahci_reset(s);
+} else {
+s->control_regs.ghc = (val & 0x3) | HOST_CTL_AHCI_EN;
 ahci_check_irq(s);
-break;
-case HOST_PORTS_IMPL: /* R/WO, RO */
-/* FIXME handle R/WO */
-break;
-case HOST_VERSION: /* RO */
-/* FIXME report write? */
-break;
-default:
-trace_ahci_mem_write_unknown(s, size, addr, val);
+}
+break;
+case HOST_IRQ_STAT: /* R/WC, RO */
+s->control_regs.irqstatus &= ~val;
+ahci_check_irq(s);
+break;
+case HOST_PORTS_IMPL: /* R/WO, RO */
+/* FIXME handle R/WO */
+break;
+case HOST_VERSION: /* RO */
+/* FIXME report write? */
+break;
+default:
+trace_ahci_mem_write_unknown(s, size, addr, val);
 }
 } else if ((addr >= AHCI_PORT_REGS_START_ADDR) &&
(addr < (AHCI_PORT_REGS_START_ADDR +
-(s->ports * AHCI_PORT_ADDR_OFFSET_LEN {
+(s->ports * AHCI_PORT_ADDR_OFFSET_LEN {
 ahci_port_write(s, (addr - AHCI_PORT_REGS_START_ADDR) >> 7,
 addr & AHCI_PORT_ADDR_OFFSET_MASK, val);
 }
-
 }
 
 static const MemoryRegionOps ahci_mem_ops = {
-- 
2.14.3




[Qemu-devel] [PATCH v2 00/16] AHCI: tracing improvements

2018-05-31 Thread John Snow
This set just adds register names so that the read/write traces make
more sense on their own without having to memorize register offsets.
It also splits read/write traces into supported/unsupported subsets,
so you can just monitor for things that QEMU is likely doing wrong.

v2:
 - Added qemu_log_mask(LOG_UNIMP, ...) statements in addition to traces
   for writes to unknown/unsupported registers. (Philippe)

John Snow (16):
  ahci: add port register enumeration
  ahci: modify ahci_port_read to use register numbers
  ahci: make port read traces more descriptive
  ahci: fix spacing damage on ahci_port_write
  ahci: combine identical clauses in port write
  ahci: modify ahci_port_write to use register numbers
  ahci: make port write traces more descriptive
  ahci: delete old port register address definitions
  ahci: add host register enumeration
  ahci: fix host register max address
  ahci: modify ahci_mem_read_32 to work on register numbers
  ahci: make mem_read_32 traces more descriptive
  ahci: fix spacing damage on ahci_mem_write
  ahci: adjust ahci_mem_write to work on registers
  ahci: delete old host register address definitions
  ahci: make ahci_mem_write traces more descriptive

 hw/ide/ahci.c  | 314 ++---
 hw/ide/ahci_internal.h |  63 ++
 hw/ide/trace-events|  13 +-
 3 files changed, 241 insertions(+), 149 deletions(-)

-- 
2.14.3




[Qemu-devel] [PATCH v2 10/16] ahci: fix host register max address

2018-05-31 Thread John Snow
Yes, comment, it ought to be 0x2C.

Signed-off-by: John Snow 
---
 hw/ide/ahci_internal.h | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/hw/ide/ahci_internal.h b/hw/ide/ahci_internal.h
index af366db6f3..f9dcf8b6e6 100644
--- a/hw/ide/ahci_internal.h
+++ b/hw/ide/ahci_internal.h
@@ -224,8 +224,7 @@ enum AHCIPortIRQ {
 #define SATA_SIGNATURE_CDROM   0xeb140101
 #define SATA_SIGNATURE_DISK0x0101
 
-#define AHCI_GENERIC_HOST_CONTROL_REGS_MAX_ADDR 0x20
-/* Shouldn't this be 0x2c? */
+#define AHCI_GENERIC_HOST_CONTROL_REGS_MAX_ADDR 0x2c
 
 #define AHCI_PORT_REGS_START_ADDR  0x100
 #define AHCI_PORT_ADDR_OFFSET_MASK 0x7f
-- 
2.14.3




[Qemu-devel] [PATCH v2 04/16] ahci: fix spacing damage on ahci_port_write

2018-05-31 Thread John Snow
Churn.

Signed-off-by: John Snow 
---
 hw/ide/ahci.c | 142 +-
 1 file changed, 71 insertions(+), 71 deletions(-)

diff --git a/hw/ide/ahci.c b/hw/ide/ahci.c
index 5b5ab823f4..92780990a3 100644
--- a/hw/ide/ahci.c
+++ b/hw/ide/ahci.c
@@ -279,85 +279,85 @@ static int ahci_cond_start_engines(AHCIDevice *ad)
 return 0;
 }
 
-static void  ahci_port_write(AHCIState *s, int port, int offset, uint32_t val)
+static void ahci_port_write(AHCIState *s, int port, int offset, uint32_t val)
 {
 AHCIPortRegs *pr = >dev[port].port_regs;
 
 trace_ahci_port_write(s, port, offset, val);
 switch (offset) {
-case PORT_LST_ADDR:
-pr->lst_addr = val;
-break;
-case PORT_LST_ADDR_HI:
-pr->lst_addr_hi = val;
-break;
-case PORT_FIS_ADDR:
-pr->fis_addr = val;
-break;
-case PORT_FIS_ADDR_HI:
-pr->fis_addr_hi = val;
-break;
-case PORT_IRQ_STAT:
-pr->irq_stat &= ~val;
-ahci_check_irq(s);
-break;
-case PORT_IRQ_MASK:
-pr->irq_mask = val & 0xfdc000ff;
-ahci_check_irq(s);
-break;
-case PORT_CMD:
-/* Block any Read-only fields from being set;
- * including LIST_ON and FIS_ON.
- * The spec requires to set ICC bits to zero after the ICC change
- * is done. We don't support ICC state changes, therefore always
- * force the ICC bits to zero.
- */
-pr->cmd = (pr->cmd & PORT_CMD_RO_MASK) |
-  (val & ~(PORT_CMD_RO_MASK|PORT_CMD_ICC_MASK));
+case PORT_LST_ADDR:
+pr->lst_addr = val;
+break;
+case PORT_LST_ADDR_HI:
+pr->lst_addr_hi = val;
+break;
+case PORT_FIS_ADDR:
+pr->fis_addr = val;
+break;
+case PORT_FIS_ADDR_HI:
+pr->fis_addr_hi = val;
+break;
+case PORT_IRQ_STAT:
+pr->irq_stat &= ~val;
+ahci_check_irq(s);
+break;
+case PORT_IRQ_MASK:
+pr->irq_mask = val & 0xfdc000ff;
+ahci_check_irq(s);
+break;
+case PORT_CMD:
+/* Block any Read-only fields from being set;
+ * including LIST_ON and FIS_ON.
+ * The spec requires to set ICC bits to zero after the ICC change
+ * is done. We don't support ICC state changes, therefore always
+ * force the ICC bits to zero.
+ */
+pr->cmd = (pr->cmd & PORT_CMD_RO_MASK) |
+(val & ~(PORT_CMD_RO_MASK|PORT_CMD_ICC_MASK));
 
-/* Check FIS RX and CLB engines */
-ahci_cond_start_engines(>dev[port]);
+/* Check FIS RX and CLB engines */
+ahci_cond_start_engines(>dev[port]);
 
-/* XXX usually the FIS would be pending on the bus here and
-   issuing deferred until the OS enables FIS receival.
-   Instead, we only submit it once - which works in most
-   cases, but is a hack. */
-if ((pr->cmd & PORT_CMD_FIS_ON) &&
-!s->dev[port].init_d2h_sent) {
-ahci_init_d2h(>dev[port]);
-}
+/* XXX usually the FIS would be pending on the bus here and
+   issuing deferred until the OS enables FIS receival.
+   Instead, we only submit it once - which works in most
+   cases, but is a hack. */
+if ((pr->cmd & PORT_CMD_FIS_ON) &&
+!s->dev[port].init_d2h_sent) {
+ahci_init_d2h(>dev[port]);
+}
 
-check_cmd(s, port);
-break;
-case PORT_TFDATA:
-/* Read Only. */
-break;
-case PORT_SIG:
-/* Read Only */
-break;
-case PORT_SCR_STAT:
-/* Read Only */
-break;
-case PORT_SCR_CTL:
-if (((pr->scr_ctl & AHCI_SCR_SCTL_DET) == 1) &&
-((val & AHCI_SCR_SCTL_DET) == 0)) {
-ahci_reset_port(s, port);
-}
-pr->scr_ctl = val;
-break;
-case PORT_SCR_ERR:
-pr->scr_err &= ~val;
-break;
-case PORT_SCR_ACT:
-/* RW1 */
-pr->scr_act |= val;
-break;
-case PORT_CMD_ISSUE:
-pr->cmd_issue |= val;
-check_cmd(s, port);
-break;
-default:
-break;
+check_cmd(s, port);
+break;
+case PORT_TFDATA:
+/* Read Only. */
+break;
+case PORT_SIG:
+/* Read Only */
+break;
+case PORT_SCR_STAT:
+/* Read Only */
+break;
+case PORT_SCR_CTL:
+if (((pr->scr_ctl & AHCI_SCR_SCTL_DET) == 1) &&
+((val & AHCI_SCR_SCTL_DET) == 0)) {
+ahci_reset_port(s, port);
+}
+pr->scr_ctl = val;
+break;
+case PORT_SCR_ERR:

[Qemu-devel] [PATCH v2 05/16] ahci: combine identical clauses in port write

2018-05-31 Thread John Snow
Signed-off-by: John Snow 
---
 hw/ide/ahci.c | 4 
 1 file changed, 4 deletions(-)

diff --git a/hw/ide/ahci.c b/hw/ide/ahci.c
index 92780990a3..476841df58 100644
--- a/hw/ide/ahci.c
+++ b/hw/ide/ahci.c
@@ -330,11 +330,7 @@ static void ahci_port_write(AHCIState *s, int port, int 
offset, uint32_t val)
 check_cmd(s, port);
 break;
 case PORT_TFDATA:
-/* Read Only. */
-break;
 case PORT_SIG:
-/* Read Only */
-break;
 case PORT_SCR_STAT:
 /* Read Only */
 break;
-- 
2.14.3




[Qemu-devel] [PATCH v2 08/16] ahci: delete old port register address definitions

2018-05-31 Thread John Snow
They're now unused.

Signed-off-by: John Snow 
---
 hw/ide/ahci_internal.h | 18 --
 1 file changed, 18 deletions(-)

diff --git a/hw/ide/ahci_internal.h b/hw/ide/ahci_internal.h
index eb7e1eefc0..db00c9aa39 100644
--- a/hw/ide/ahci_internal.h
+++ b/hw/ide/ahci_internal.h
@@ -102,24 +102,6 @@ enum AHCIPortReg {
 AHCI_PORT_REG__COUNT  = 32
 };
 
-/* registers for each SATA port */
-#define PORT_LST_ADDR 0x00 /* command list DMA addr */
-#define PORT_LST_ADDR_HI  0x04 /* command list DMA addr hi */
-#define PORT_FIS_ADDR 0x08 /* FIS rx buf addr */
-#define PORT_FIS_ADDR_HI  0x0c /* FIS rx buf addr hi */
-#define PORT_IRQ_STAT 0x10 /* interrupt status */
-#define PORT_IRQ_MASK 0x14 /* interrupt enable/disable mask */
-#define PORT_CMD  0x18 /* port command */
-
-#define PORT_TFDATA   0x20 /* taskfile data */
-#define PORT_SIG  0x24 /* device TF signature */
-#define PORT_SCR_STAT 0x28 /* SATA phy register: SStatus */
-#define PORT_SCR_CTL  0x2c /* SATA phy register: SControl */
-#define PORT_SCR_ERR  0x30 /* SATA phy register: SError */
-#define PORT_SCR_ACT  0x34 /* SATA phy register: SActive */
-#define PORT_CMD_ISSUE0x38 /* command issue */
-#define PORT_RESERVED 0x3c /* reserved */
-
 /* Port interrupt bit descriptors */
 enum AHCIPortIRQ {
 AHCI_PORT_IRQ_BIT_DHRS = 0,
-- 
2.14.3




[Qemu-devel] [PATCH v2 06/16] ahci: modify ahci_port_write to use register numbers

2018-05-31 Thread John Snow
Signed-off-by: John Snow 
---
 hw/ide/ahci.c | 32 +---
 1 file changed, 17 insertions(+), 15 deletions(-)

diff --git a/hw/ide/ahci.c b/hw/ide/ahci.c
index 476841df58..efecf849a9 100644
--- a/hw/ide/ahci.c
+++ b/hw/ide/ahci.c
@@ -282,30 +282,32 @@ static int ahci_cond_start_engines(AHCIDevice *ad)
 static void ahci_port_write(AHCIState *s, int port, int offset, uint32_t val)
 {
 AHCIPortRegs *pr = >dev[port].port_regs;
+enum AHCIPortReg regnum = offset / sizeof(uint32_t);
+assert(regnum < (AHCI_PORT_ADDR_OFFSET_LEN / sizeof(uint32_t)));
 
 trace_ahci_port_write(s, port, offset, val);
-switch (offset) {
-case PORT_LST_ADDR:
+switch (regnum) {
+case AHCI_PORT_REG_LST_ADDR:
 pr->lst_addr = val;
 break;
-case PORT_LST_ADDR_HI:
+case AHCI_PORT_REG_LST_ADDR_HI:
 pr->lst_addr_hi = val;
 break;
-case PORT_FIS_ADDR:
+case AHCI_PORT_REG_FIS_ADDR:
 pr->fis_addr = val;
 break;
-case PORT_FIS_ADDR_HI:
+case AHCI_PORT_REG_FIS_ADDR_HI:
 pr->fis_addr_hi = val;
 break;
-case PORT_IRQ_STAT:
+case AHCI_PORT_REG_IRQ_STAT:
 pr->irq_stat &= ~val;
 ahci_check_irq(s);
 break;
-case PORT_IRQ_MASK:
+case AHCI_PORT_REG_IRQ_MASK:
 pr->irq_mask = val & 0xfdc000ff;
 ahci_check_irq(s);
 break;
-case PORT_CMD:
+case AHCI_PORT_REG_CMD:
 /* Block any Read-only fields from being set;
  * including LIST_ON and FIS_ON.
  * The spec requires to set ICC bits to zero after the ICC change
@@ -329,26 +331,26 @@ static void ahci_port_write(AHCIState *s, int port, int 
offset, uint32_t val)
 
 check_cmd(s, port);
 break;
-case PORT_TFDATA:
-case PORT_SIG:
-case PORT_SCR_STAT:
+case AHCI_PORT_REG_TFDATA:
+case AHCI_PORT_REG_SIG:
+case AHCI_PORT_REG_SCR_STAT:
 /* Read Only */
 break;
-case PORT_SCR_CTL:
+case AHCI_PORT_REG_SCR_CTL:
 if (((pr->scr_ctl & AHCI_SCR_SCTL_DET) == 1) &&
 ((val & AHCI_SCR_SCTL_DET) == 0)) {
 ahci_reset_port(s, port);
 }
 pr->scr_ctl = val;
 break;
-case PORT_SCR_ERR:
+case AHCI_PORT_REG_SCR_ERR:
 pr->scr_err &= ~val;
 break;
-case PORT_SCR_ACT:
+case AHCI_PORT_REG_SCR_ACT:
 /* RW1 */
 pr->scr_act |= val;
 break;
-case PORT_CMD_ISSUE:
+case AHCI_PORT_REG_CMD_ISSUE:
 pr->cmd_issue |= val;
 check_cmd(s, port);
 break;
-- 
2.14.3




[Qemu-devel] [PATCH v2 07/16] ahci: make port write traces more descriptive

2018-05-31 Thread John Snow
Signed-off-by: John Snow 
---
 hw/ide/ahci.c   | 8 +++-
 hw/ide/trace-events | 3 ++-
 2 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/hw/ide/ahci.c b/hw/ide/ahci.c
index efecf849a9..01463f0f51 100644
--- a/hw/ide/ahci.c
+++ b/hw/ide/ahci.c
@@ -27,6 +27,7 @@
 #include "hw/pci/pci.h"
 
 #include "qemu/error-report.h"
+#include "qemu/log.h"
 #include "sysemu/block-backend.h"
 #include "sysemu/dma.h"
 #include "hw/ide/internal.h"
@@ -284,8 +285,8 @@ static void ahci_port_write(AHCIState *s, int port, int 
offset, uint32_t val)
 AHCIPortRegs *pr = >dev[port].port_regs;
 enum AHCIPortReg regnum = offset / sizeof(uint32_t);
 assert(regnum < (AHCI_PORT_ADDR_OFFSET_LEN / sizeof(uint32_t)));
+trace_ahci_port_write(s, port, AHCIPortReg_lookup[regnum], offset, val);
 
-trace_ahci_port_write(s, port, offset, val);
 switch (regnum) {
 case AHCI_PORT_REG_LST_ADDR:
 pr->lst_addr = val;
@@ -355,6 +356,11 @@ static void ahci_port_write(AHCIState *s, int port, int 
offset, uint32_t val)
 check_cmd(s, port);
 break;
 default:
+trace_ahci_port_write_unimpl(s, port, AHCIPortReg_lookup[regnum],
+ offset, val);
+qemu_log_mask(LOG_UNIMP, "Attempted write to unimplemented register: "
+  "AHCI port %d register %s, offset 0x%x: 0x%"PRIu32,
+  port, AHCIPortReg_lookup[regnum], offset, val);
 break;
 }
 }
diff --git a/hw/ide/trace-events b/hw/ide/trace-events
index 0db18d8271..1efbbb8114 100644
--- a/hw/ide/trace-events
+++ b/hw/ide/trace-events
@@ -69,7 +69,8 @@ ahci_irq_raise(void *s) "ahci(%p): raise irq"
 ahci_irq_lower(void *s) "ahci(%p): lower irq"
 ahci_check_irq(void *s, uint32_t old, uint32_t new) "ahci(%p): check irq 
0x%08x --> 0x%08x"
 ahci_trigger_irq(void *s, int port, const char *name, uint32_t val, uint32_t 
old, uint32_t new, uint32_t effective) "ahci(%p)[%d]: trigger irq +%s (0x%08x); 
irqstat: 0x%08x --> 0x%08x; effective: 0x%08x"
-ahci_port_write(void *s, int port, int offset, uint32_t val) "ahci(%p)[%d]: 
port write @ 0x%x: 0x%08x"
+ahci_port_write(void *s, int port, const char *reg, int offset, uint32_t val) 
"ahci(%p)[%d]: port write [reg:%s] @ 0x%x: 0x%08x"
+ahci_port_write_unimpl(void *s, int port, const char *reg, int offset, 
uint32_t val) "ahci(%p)[%d]: unimplemented port write [reg:%s] @ 0x%x: 0x%08x"
 ahci_mem_read_32(void *s, uint64_t addr, uint32_t val) "ahci(%p): mem read @ 
0x%"PRIx64": 0x%08x"
 ahci_mem_read(void *s, unsigned size, uint64_t addr, uint64_t val) "ahci(%p): 
read%u @ 0x%"PRIx64": 0x%016"PRIx64
 ahci_mem_write(void *s, unsigned size, uint64_t addr, uint64_t val) "ahci(%p): 
write%u @ 0x%"PRIx64": 0x%016"PRIx64
-- 
2.14.3




Re: [Qemu-devel] [qemu PATCH v4 0/4] support NFIT platform capabilities

2018-05-31 Thread Ross Zwisler
On Fri, May 25, 2018 at 08:51:22PM +0300, Michael S. Tsirkin wrote:
> On Mon, May 21, 2018 at 10:31:59AM -0600, Ross Zwisler wrote:
> > Changes since v3:
> >  * Updated the text in docs/nvdimm.txt to make it clear that the value
> >being passed in on the command line in an integer made up of various
> >bit fields. (Rob Elliott)
> >  
> >  * Updated the "Highest Valid Capability" byte to be dynamic based on
> >the highest valid bit in the user's input. (Rob Elliott)
> 
> Igor could you review pls? I think your comments have been addressed.

Ping?  Igor, can you pick this up?



Re: [Qemu-devel] [Qemu-block] [PATCH] block: Ignore generated job QAPI files

2018-05-31 Thread Jeff Cody
On Thu, May 31, 2018 at 04:24:35PM -0500, Eric Blake wrote:
> Commit bf42508f introduced new generated files; make sure they
> don't get accidentally committed from an in-tree build.
> 
> Signed-off-by: Eric Blake 

Reviewed-by: Jeff Cody 

> ---
>  .gitignore | 4 
>  1 file changed, 4 insertions(+)
> 
> diff --git a/.gitignore b/.gitignore
> index 81e1f2fb0f1..9da3b3e6267 100644
> --- a/.gitignore
> +++ b/.gitignore
> @@ -36,6 +36,7 @@
>  /qapi/qapi-commands-common.[ch]
>  /qapi/qapi-commands-crypto.[ch]
>  /qapi/qapi-commands-introspect.[ch]
> +/qapi/qapi-commands-job.[ch]
>  /qapi/qapi-commands-migration.[ch]
>  /qapi/qapi-commands-misc.[ch]
>  /qapi/qapi-commands-net.[ch]
> @@ -53,6 +54,7 @@
>  /qapi/qapi-events-common.[ch]
>  /qapi/qapi-events-crypto.[ch]
>  /qapi/qapi-events-introspect.[ch]
> +/qapi/qapi-events-job.[ch]
>  /qapi/qapi-events-migration.[ch]
>  /qapi/qapi-events-misc.[ch]
>  /qapi/qapi-events-net.[ch]
> @@ -71,6 +73,7 @@
>  /qapi/qapi-types-common.[ch]
>  /qapi/qapi-types-crypto.[ch]
>  /qapi/qapi-types-introspect.[ch]
> +/qapi/qapi-types-job.[ch]
>  /qapi/qapi-types-migration.[ch]
>  /qapi/qapi-types-misc.[ch]
>  /qapi/qapi-types-net.[ch]
> @@ -88,6 +91,7 @@
>  /qapi/qapi-visit-common.[ch]
>  /qapi/qapi-visit-crypto.[ch]
>  /qapi/qapi-visit-introspect.[ch]
> +/qapi/qapi-visit-job.[ch]
>  /qapi/qapi-visit-migration.[ch]
>  /qapi/qapi-visit-misc.[ch]
>  /qapi/qapi-visit-net.[ch]
> -- 
> 2.14.3
> 
> 



Re: [Qemu-devel] [PATCH 2/3] pc-bios/s390-ccw/net: Add support for pxelinux-style config files

2018-05-31 Thread Farhan Ali




On 05/30/2018 05:16 AM, Thomas Huth wrote:

Since it is quite cumbersome to manually create a combined kernel with
initrd image for network booting, we now support loading via pxelinux
configuration files, too. In these files, the kernel, initrd and command
line parameters can be specified seperately, and the firmware then takes
care of glueing everything together in memory after the files have been
downloaded. See this URL for details about the config file layout:
https://www.syslinux.org/wiki/index.php?title=PXELINUX

The user can either specify a config file directly as bootfile via DHCP
(but in this case, the file has to start either with "default" or a "#"
comment so we can distinguish it from binary kernels), or a folder (i.e.
the bootfile name must end with "/") where the firmware should look for
the typical pxelinux.cfg file names, e.g. based on MAC or IP address.
We also support the pxelinux.cfg DHCP options 209 and 210 from RFC 5071.

Signed-off-by: Thomas Huth 
---
  pc-bios/s390-ccw/netboot.mak |  7 ++--
  pc-bios/s390-ccw/netmain.c   | 79 +++-
  2 files changed, 82 insertions(+), 4 deletions(-)

diff --git a/pc-bios/s390-ccw/netboot.mak b/pc-bios/s390-ccw/netboot.mak
index a73be36..8af0cfd 100644
--- a/pc-bios/s390-ccw/netboot.mak
+++ b/pc-bios/s390-ccw/netboot.mak
@@ -25,8 +25,9 @@ CTYPE_OBJS = isdigit.o isxdigit.o toupper.o
  %.o : $(SLOF_DIR)/lib/libc/ctype/%.c
$(call quiet-command,$(CC) $(LIBC_CFLAGS) -c -o $@ 
$<,"CC","$(TARGET_DIR)$@")
  
-STRING_OBJS = strcat.o strchr.o strcmp.o strcpy.o strlen.o strncmp.o strncpy.o \

- strstr.o memset.o memcpy.o memmove.o memcmp.o
+STRING_OBJS = strcat.o strchr.o strrchr.o strcpy.o strlen.o strncpy.o \
+ strcmp.o strncmp.o strcasecmp.o strncasecmp.o strstr.o \
+ memset.o memcpy.o memmove.o memcmp.o
  %.o : $(SLOF_DIR)/lib/libc/string/%.c
$(call quiet-command,$(CC) $(LIBC_CFLAGS) -c -o $@ 
$<,"CC","$(TARGET_DIR)$@")
  
@@ -50,7 +51,7 @@ libc.a: $(LIBCOBJS)

  # libnet files:
  
  LIBNETOBJS := args.o dhcp.o dns.o icmpv6.o ipv6.o tcp.o udp.o bootp.o \

- dhcpv6.o ethernet.o ipv4.o ndp.o tftp.o
+ dhcpv6.o ethernet.o ipv4.o ndp.o tftp.o pxelinux.o
  LIBNETCFLAGS := $(QEMU_CFLAGS) -DDHCPARCH=0x1F $(LIBC_INC) $(LIBNET_INC)
  
  %.o : $(SLOF_DIR)/lib/libnet/%.c

diff --git a/pc-bios/s390-ccw/netmain.c b/pc-bios/s390-ccw/netmain.c
index 7533cf7..e84bb2b 100644
--- a/pc-bios/s390-ccw/netmain.c
+++ b/pc-bios/s390-ccw/netmain.c
@@ -30,6 +30,7 @@
  #include 
  #include 
  #include 
+#include 
  
  #include "s390-ccw.h"

  #include "virtio.h"
@@ -41,12 +42,14 @@ extern char _start[];
  
  #define KERNEL_ADDR ((void *)0L)

  #define KERNEL_MAX_SIZE ((long)_start)
+#define ARCH_COMMAND_LINE_SIZE  896  /* Taken from Linux kernel */
  
  char stack[PAGE_SIZE * 8] __attribute__((aligned(PAGE_SIZE)));

  IplParameterBlock iplb __attribute__((aligned(PAGE_SIZE)));
  static char cfgbuf[2048];
  
  static SubChannelId net_schid = { .one = 1 };

+static uint8_t mac[6];
  static uint64_t dest_timer;
  
  static uint64_t get_timer_ms(void)

@@ -158,7 +161,6 @@ static int tftp_load(filename_ip_t *fnip, void *buffer, int 
len)
  
  static int net_init(filename_ip_t *fn_ip)

  {
-uint8_t mac[6];
  int rc;
  
  memset(fn_ip, 0, sizeof(filename_ip_t));

@@ -233,6 +235,66 @@ static void net_release(filename_ip_t *fn_ip)
  }
  
  /**

+ * Load a kernel with initrd (i.e. with the information that we've got from
+ * a pxelinux.cfg config file)
+ */
+static int load_kernel_with_initrd(filename_ip_t *fn_ip,
+   struct pl_cfg_entry *entry)
+{
+int rc;
+
+printf("Loading pxelinux.cfg entry '%s'\n", entry->label);
+
+if (!entry->kernel) {
+printf("Kernel entry is missing!\n");
+return -1;
+}
+
+strncpy(fn_ip->filename, entry->kernel, sizeof(fn_ip->filename));
+rc = tftp_load(fn_ip, KERNEL_ADDR, KERNEL_MAX_SIZE);
+if (rc < 0) {
+return rc;
+}
+
+if (entry->initrd) {
+uint64_t iaddr = (rc + 0xfff) & ~0xfffUL;
+
+strncpy(fn_ip->filename, entry->initrd, sizeof(fn_ip->filename));
+rc = tftp_load(fn_ip, (void *)iaddr, KERNEL_MAX_SIZE - iaddr);
+if (rc < 0) {
+return rc;
+}
+/* Patch location and size: */
+*(uint64_t *)0x10408 = iaddr;
+*(uint64_t *)0x10410 = rc;
+rc += iaddr;
+}
+
+if (entry->append) {
+strncpy((char *)0x10480, entry->append, ARCH_COMMAND_LINE_SIZE);
+}
+
+return rc;
+}
+
+#define MAX_PXELINUX_ENTRIES 16
+
+static int net_try_pxelinux_cfg(filename_ip_t *fn_ip)
+{
+struct pl_cfg_entry entries[MAX_PXELINUX_ENTRIES];
+int num_ent, def_ent = 0;
+
+num_ent = pxelinux_load_parse_cfg(fn_ip, mac, NULL, DEFAULT_TFTP_RETRIES,
+  cfgbuf, sizeof(cfgbuf),
+   

[Qemu-devel] [PATCH] block: Ignore generated job QAPI files

2018-05-31 Thread Eric Blake
Commit bf42508f introduced new generated files; make sure they
don't get accidentally committed from an in-tree build.

Signed-off-by: Eric Blake 
---
 .gitignore | 4 
 1 file changed, 4 insertions(+)

diff --git a/.gitignore b/.gitignore
index 81e1f2fb0f1..9da3b3e6267 100644
--- a/.gitignore
+++ b/.gitignore
@@ -36,6 +36,7 @@
 /qapi/qapi-commands-common.[ch]
 /qapi/qapi-commands-crypto.[ch]
 /qapi/qapi-commands-introspect.[ch]
+/qapi/qapi-commands-job.[ch]
 /qapi/qapi-commands-migration.[ch]
 /qapi/qapi-commands-misc.[ch]
 /qapi/qapi-commands-net.[ch]
@@ -53,6 +54,7 @@
 /qapi/qapi-events-common.[ch]
 /qapi/qapi-events-crypto.[ch]
 /qapi/qapi-events-introspect.[ch]
+/qapi/qapi-events-job.[ch]
 /qapi/qapi-events-migration.[ch]
 /qapi/qapi-events-misc.[ch]
 /qapi/qapi-events-net.[ch]
@@ -71,6 +73,7 @@
 /qapi/qapi-types-common.[ch]
 /qapi/qapi-types-crypto.[ch]
 /qapi/qapi-types-introspect.[ch]
+/qapi/qapi-types-job.[ch]
 /qapi/qapi-types-migration.[ch]
 /qapi/qapi-types-misc.[ch]
 /qapi/qapi-types-net.[ch]
@@ -88,6 +91,7 @@
 /qapi/qapi-visit-common.[ch]
 /qapi/qapi-visit-crypto.[ch]
 /qapi/qapi-visit-introspect.[ch]
+/qapi/qapi-visit-job.[ch]
 /qapi/qapi-visit-migration.[ch]
 /qapi/qapi-visit-misc.[ch]
 /qapi/qapi-visit-net.[ch]
-- 
2.14.3




[Qemu-devel] [PATCH 1/5] block: Add blklogwrites

2018-05-31 Thread Ari Sundholm
From: Aapo Vienamo 

Implements a block device write logging system, similar to Linux kernel
device mapper dm-log-writes. The write operations that are performed
on a block device are logged to a file or another block device. The
write log format is identical to the dm-log-writes format. Currently,
log markers are not supported.

This functionality can be used for fail-safe and fs consistency
testing. By implementing it in qemu, tests utilizing write logs can be
be used to test non-Linux drivers and older kernels.

The implementation is based on the blkverify and blkdebug block drivers.

Signed-off-by: Aapo Vienamo 
Signed-off-by: Ari Sundholm 
---
 block.c   |   6 -
 block/Makefile.objs   |   1 +
 block/blklogwrites.c  | 441 ++
 include/block/block.h |   7 +
 4 files changed, 449 insertions(+), 6 deletions(-)
 create mode 100644 block/blklogwrites.c

diff --git a/block.c b/block.c
index 501b64c..c8cffe1 100644
--- a/block.c
+++ b/block.c
@@ -1914,12 +1914,6 @@ int bdrv_child_try_set_perm(BdrvChild *c, uint64_t perm, 
uint64_t shared,
 return 0;
 }
 
-#define DEFAULT_PERM_PASSTHROUGH (BLK_PERM_CONSISTENT_READ \
- | BLK_PERM_WRITE \
- | BLK_PERM_WRITE_UNCHANGED \
- | BLK_PERM_RESIZE)
-#define DEFAULT_PERM_UNCHANGED (BLK_PERM_ALL & ~DEFAULT_PERM_PASSTHROUGH)
-
 void bdrv_filter_default_perms(BlockDriverState *bs, BdrvChild *c,
const BdrvChildRole *role,
BlockReopenQueue *reopen_queue,
diff --git a/block/Makefile.objs b/block/Makefile.objs
index 899bfb5..c8337bf 100644
--- a/block/Makefile.objs
+++ b/block/Makefile.objs
@@ -5,6 +5,7 @@ block-obj-y += qed-check.o
 block-obj-y += vhdx.o vhdx-endian.o vhdx-log.o
 block-obj-y += quorum.o
 block-obj-y += parallels.o blkdebug.o blkverify.o blkreplay.o
+block-obj-y += blklogwrites.o
 block-obj-y += block-backend.o snapshot.o qapi.o
 block-obj-$(CONFIG_WIN32) += file-win32.o win32-aio.o
 block-obj-$(CONFIG_POSIX) += file-posix.o
diff --git a/block/blklogwrites.c b/block/blklogwrites.c
new file mode 100644
index 000..78835bf
--- /dev/null
+++ b/block/blklogwrites.c
@@ -0,0 +1,441 @@
+/*
+ * Write logging blk driver based on blkverify and blkdebug.
+ *
+ * Copyright (c) 2017 Tuomas Tynkkynen 
+ * Copyright (c) 2018 Aapo Vienamo 
+ * Copyright (c) 2018 Ari Sundholm 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#include "qemu/osdep.h"
+#include "qapi/error.h"
+#include "qemu/sockets.h" /* for EINPROGRESS on Windows */
+#include "block/block_int.h"
+#include "qapi/qmp/qdict.h"
+#include "qapi/qmp/qstring.h"
+#include "qemu/cutils.h"
+#include "qemu/option.h"
+
+/* Disk format stuff - taken from Linux drivers/md/dm-log-writes.c */
+
+#define LOG_FLUSH_FLAG (1 << 0)
+#define LOG_FUA_FLAG (1 << 1)
+#define LOG_DISCARD_FLAG (1 << 2)
+#define LOG_MARK_FLAG (1 << 3)
+
+#define WRITE_LOG_VERSION 1ULL
+#define WRITE_LOG_MAGIC 0x6a736677736872ULL
+
+/* All fields are little-endian. */
+struct log_write_super {
+uint64_t magic;
+uint64_t version;
+uint64_t nr_entries;
+uint32_t sectorsize;
+};
+
+struct log_write_entry {
+uint64_t sector;
+uint64_t nr_sectors;
+uint64_t flags;
+uint64_t data_len;
+};
+
+/* End of disk format structures. */
+
+typedef struct {
+BdrvChild *log_file;
+uint64_t cur_log_sector;
+uint64_t nr_entries;
+} BDRVBlkLogWritesState;
+
+/* Valid blk_log_writes filenames look like:
+ *  blk_log_writes:path/to/raw_image:path/to/logfile */
+static void blk_log_writes_parse_filename(const char *filename, QDict *options,
+  Error **errp)
+{
+const char *c;
+QString *raw_path;
+
+/* Parse the blk_log_writes: prefix */
+if (!strstart(filename, "blk_log_writes:", )) {
+/* There was no prefix; therefore, all options have to be already
+ * present in the QDict (except for the filename) */
+qdict_put(options, "x-log", qstring_from_str(filename));
+return;
+}
+
+/* Parse the raw image filename */
+c = strchr(filename, ':');
+if (c == NULL) {
+error_setg(errp,
+   "blk_log_writes requires paths to both image and log");
+return;
+}
+
+raw_path = qstring_from_substr(filename, 0, c - filename - 1);
+qdict_put(options, "x-raw", raw_path);
+
+filename = c + 1;
+qdict_put(options, "x-log", qstring_from_str(filename));
+}
+
+static QemuOptsList runtime_opts = {
+.name = "blk_log_writes",
+.head = QTAILQ_HEAD_INITIALIZER(runtime_opts.head),
+.desc = {
+{
+.name = "x-raw",
+.type = QEMU_OPT_STRING,
+.help = "[internal use only, will be removed]",
+},
+{
+.name = "x-log",
+ 

Re: [Qemu-devel] [PATCH v2 34/40] job: Introduce qapi/job.json

2018-05-31 Thread Eric Blake

On 05/18/2018 08:21 AM, Kevin Wolf wrote:

This adds a separate schema file for all job-related definitions that
aren't tied to the block layer.

For a start, move the enums JobType, JobStatus and JobVerb.

Signed-off-by: Kevin Wolf 
---
  qapi/block-core.json  | 90 +---
  qapi/job.json | 94 +++
  qapi/qapi-schema.json |  1 +
  Makefile  |  9 +
  Makefile.objs |  4 +++
  5 files changed, 109 insertions(+), 89 deletions(-)
  create mode 100644 qapi/job.json


This commit now results in a dirty 'git status' for in-tree builds:

qapi/qapi-commands-job.c
qapi/qapi-commands-job.h
qapi/qapi-events-job.c
qapi/qapi-events-job.h
qapi/qapi-types-job.c
qapi/qapi-types-job.h
qapi/qapi-visit-job.c
qapi/qapi-visit-job.h

I'll propose a cleanup patch.

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



[Qemu-devel] [PATCH 5/5] block/blklogwrites: Use the block device logical sector size when logging writes

2018-05-31 Thread Ari Sundholm
The guest OS may perform writes which are aligned to the logical
sector size instead of the physical one, so logging at this granularity
records the writes performed on the block device most faithfully.

Signed-off-by: Ari Sundholm 
---
 block/blklogwrites.c | 43 ++-
 1 file changed, 30 insertions(+), 13 deletions(-)

diff --git a/block/blklogwrites.c b/block/blklogwrites.c
index 6cdad2c..3dc2fdf 100644
--- a/block/blklogwrites.c
+++ b/block/blklogwrites.c
@@ -47,6 +47,8 @@ struct log_write_entry {
 
 typedef struct {
 BdrvChild *log_file;
+uint32_t sectorsize;
+uint32_t sectorbits;
 uint64_t cur_log_sector;
 uint64_t nr_entries;
 } BDRVBlkLogWritesState;
@@ -125,6 +127,8 @@ static int blk_log_writes_open(BlockDriverState *bs, QDict 
*options, int flags,
 goto fail;
 }
 
+s->sectorsize = BDRV_SECTOR_SIZE; /* May be updated later */
+s->sectorbits = BDRV_SECTOR_BITS;
 s->cur_log_sector = 1;
 s->nr_entries = 0;
 qdict_put_str(options, "log.driver", "raw");
@@ -229,11 +233,20 @@ static void 
blk_log_writes_refresh_limits(BlockDriverState *bs, Error **errp)
 }
 }
 
+static inline uint32_t blk_log_writes_log2(uint32_t value)
+{
+assert(value > 0);
+return 31 - clz32(value);
+}
+
 static void blk_log_writes_apply_blkconf(BlockDriverState *bs, BlockConf *conf)
 {
-assert(bs && conf && conf->blk);
+BDRVBlkLogWritesState *s = bs->opaque;
+assert(bs && conf && conf->blk && s);
 
-bs->bl.request_alignment = conf->logical_block_size;
+s->sectorsize = conf->logical_block_size;
+s->sectorbits = blk_log_writes_log2(s->sectorsize);
+bs->bl.request_alignment = s->sectorsize;
 if (conf->discard_granularity != (uint32_t)-1)
 bs->bl.pdiscard_alignment = conf->discard_granularity;
 
@@ -279,11 +292,11 @@ static void coroutine_fn blk_log_writes_co_do_log(void 
*opaque)
 {
 BlkLogWritesLogReq *lr = opaque;
 BDRVBlkLogWritesState *s = lr->r->bs->opaque;
-uint64_t cur_log_offset = s->cur_log_sector * BDRV_SECTOR_SIZE;
+uint64_t cur_log_offset = s->cur_log_sector << s->sectorbits;
 
 s->nr_entries++;
 s->cur_log_sector +=
-ROUND_UP(lr->qiov->size, BDRV_SECTOR_SIZE) >> BDRV_SECTOR_BITS;
+ROUND_UP(lr->qiov->size, s->sectorsize) >> s->sectorbits;
 
 lr->log_ret = bdrv_co_pwritev(s->log_file, cur_log_offset, lr->qiov->size,
   lr->qiov, 0);
@@ -294,20 +307,21 @@ static void coroutine_fn blk_log_writes_co_do_log(void 
*opaque)
 .magic  = cpu_to_le64(WRITE_LOG_MAGIC),
 .version= cpu_to_le64(WRITE_LOG_VERSION),
 .nr_entries = cpu_to_le64(s->nr_entries),
-.sectorsize = cpu_to_le32(1 << BDRV_SECTOR_BITS),
+.sectorsize = cpu_to_le32(s->sectorsize),
 };
-static const char zeroes[BDRV_SECTOR_SIZE - sizeof(super)] = { '\0' };
+void *zeroes = g_malloc0(s->sectorsize - sizeof(super));
 QEMUIOVector qiov;
 
 qemu_iovec_init(, 2);
 qemu_iovec_add(, , sizeof(super));
-qemu_iovec_add(, (void *)zeroes, sizeof(zeroes));
+qemu_iovec_add(, zeroes, s->sectorsize - sizeof(super));
 
 lr->log_ret =
-bdrv_co_pwritev(s->log_file, 0, BDRV_SECTOR_SIZE, , 0);
+bdrv_co_pwritev(s->log_file, 0, s->sectorsize, , 0);
 if (lr->log_ret == 0)
 lr->log_ret = bdrv_co_flush(s->log_file->bs);
 qemu_iovec_destroy();
+g_free(zeroes);
 }
 
 lr->r->done++;
@@ -332,6 +346,7 @@ blk_log_writes_co_log(BlockDriverState *bs, uint64_t 
offset, uint64_t bytes,
 {
 QEMUIOVector log_qiov;
 size_t niov = qiov ? qiov->niov : 0;
+BDRVBlkLogWritesState *s = bs->opaque;
 Coroutine *co_log, *co_file;
 BlkLogWritesReq r = {
 .co = qemu_coroutine_self(),
@@ -349,21 +364,22 @@ blk_log_writes_co_log(BlockDriverState *bs, uint64_t 
offset, uint64_t bytes,
 .r  = ,
 .qiov   = _qiov,
 .entry = {
-.sector = cpu_to_le64(offset >> BDRV_SECTOR_BITS),
-.nr_sectors = cpu_to_le64(bytes >> BDRV_SECTOR_BITS),
+.sector = cpu_to_le64(offset >> s->sectorbits),
+.nr_sectors = cpu_to_le64(bytes >> s->sectorbits),
 .flags  = cpu_to_le64(entry_flags),
 .data_len   = 0,
 },
 };
-static const char zeroes[BDRV_SECTOR_SIZE - sizeof(struct log_write_entry)]
-= { '\0' };
+void *zeroes = g_malloc0(s->sectorsize - sizeof(lr.entry));
 
+assert((1 << s->sectorbits) == s->sectorsize);
+assert(bs->bl.request_alignment == s->sectorsize);
 assert(QEMU_IS_ALIGNED(offset, bs->bl.request_alignment));
 assert(QEMU_IS_ALIGNED(bytes, bs->bl.request_alignment));
 
 qemu_iovec_init(_qiov, niov + 2);
 qemu_iovec_add(_qiov, , sizeof(lr.entry));
-qemu_iovec_add(_qiov, (void *)zeroes, sizeof(zeroes));

[Qemu-devel] [PATCH 4/5] block/blklogwrites: Use block limits from the backend block configuration

2018-05-31 Thread Ari Sundholm
This is to ensure that writes are aligned properly for logging writes
to the virtual block device. This is important because the
dm-log-writes log format has a granularity of one sector for both the
offset and the size of each write. By using the logical sector size
for alignment, the log records the writes more faithfully for those
devices that have a non-512 logical sector size.

Note that even with this patch, blklogwrites still uses
BDRV_SECTOR_SIZE for logging. This will change in a subsequent patch
which will introduce support for non-512 sector sizes.

Signed-off-by: Ari Sundholm 
---
 block/blklogwrites.c | 17 +
 1 file changed, 17 insertions(+)

diff --git a/block/blklogwrites.c b/block/blklogwrites.c
index 78835bf..6cdad2c 100644
--- a/block/blklogwrites.c
+++ b/block/blklogwrites.c
@@ -229,6 +229,22 @@ static void blk_log_writes_refresh_limits(BlockDriverState 
*bs, Error **errp)
 }
 }
 
+static void blk_log_writes_apply_blkconf(BlockDriverState *bs, BlockConf *conf)
+{
+assert(bs && conf && conf->blk);
+
+bs->bl.request_alignment = conf->logical_block_size;
+if (conf->discard_granularity != (uint32_t)-1)
+bs->bl.pdiscard_alignment = conf->discard_granularity;
+
+if (bs->bl.pdiscard_alignment &&
+bs->bl.pdiscard_alignment < bs->bl.request_alignment)
+bs->bl.pdiscard_alignment = bs->bl.request_alignment;
+if (bs->bl.pwrite_zeroes_alignment &&
+bs->bl.pwrite_zeroes_alignment < bs->bl.request_alignment)
+bs->bl.pwrite_zeroes_alignment = bs->bl.request_alignment;
+}
+
 static int coroutine_fn
 blk_log_writes_co_preadv(BlockDriverState *bs, uint64_t offset, uint64_t bytes,
  QEMUIOVector *qiov, int flags)
@@ -424,6 +440,7 @@ static BlockDriver bdrv_blk_log_writes = {
 .bdrv_refresh_filename  = blk_log_writes_refresh_filename,
 .bdrv_child_perm= blk_log_writes_child_perm,
 .bdrv_refresh_limits= blk_log_writes_refresh_limits,
+.bdrv_apply_blkconf = blk_log_writes_apply_blkconf,
 
 .bdrv_co_preadv = blk_log_writes_co_preadv,
 .bdrv_co_pwritev= blk_log_writes_co_pwritev,
-- 
2.7.4




[Qemu-devel] [PATCH 2/5] block: Add a mechanism for passing a block driver a block configuration

2018-05-31 Thread Ari Sundholm
A block driver may need to know about the block configuration, most
critically the sector sizes, of a block backend for alignment purposes
or for some other reason. It doesn't seem like qemu has an existing
mechanism for a block backend to convey the required information to
the relevant block driver when it is being realized.

The need for this mechanism rises from the fact that a drive may not
have a backend at the time it is created, as devices are created after
drives during qemu startup. Therefore, a driver requiring information
about the block configuration can get this information when a backend
is created for it at the earliest. The most natural place for this to
take place seems to be in the realization functions of the various
block device drivers, such as scsi-hd. The interface proposed here
allows the block driver to receive information about the block
configuration and the associated backend through a new callback.

Signed-off-by: Ari Sundholm 
---
 block/io.c| 19 +++
 hw/block/block.c  | 11 ++-
 include/block/block.h | 10 ++
 include/block/block_int.h |  9 +
 include/hw/block/block.h  |  1 +
 5 files changed, 49 insertions(+), 1 deletion(-)

diff --git a/block/io.c b/block/io.c
index ca96b48..b806c91 100644
--- a/block/io.c
+++ b/block/io.c
@@ -2835,3 +2835,22 @@ void bdrv_unregister_buf(BlockDriverState *bs, void 
*host)
 bdrv_unregister_buf(child->bs, host);
 }
 }
+
+void bdrv_apply_blkconf(BlockDriverState *bs, BlockConf *conf)
+{
+BlockDriver *drv = bs->drv;
+
+if (!drv)
+return;
+
+if (drv->bdrv_apply_blkconf) {
+drv->bdrv_apply_blkconf(bs, conf);
+return;
+}
+
+if (bs->file && bs->file->bs)
+bdrv_apply_blkconf(bs->file->bs, conf);
+
+if (bs->drv->supports_backing && backing_bs(bs))
+bdrv_apply_blkconf(backing_bs(bs), conf);
+}
diff --git a/hw/block/block.c b/hw/block/block.c
index b91e2b6..a2d33df 100644
--- a/hw/block/block.c
+++ b/hw/block/block.c
@@ -38,7 +38,7 @@ void blkconf_blocksizes(BlockConf *conf)
 /* fill in detected values if they are not defined via qemu command line */
 if (!conf->physical_block_size) {
 if (!backend_ret) {
-   conf->physical_block_size = blocksizes.phys;
+conf->physical_block_size = blocksizes.phys;
 } else {
 conf->physical_block_size = BDRV_SECTOR_SIZE;
 }
@@ -52,6 +52,15 @@ void blkconf_blocksizes(BlockConf *conf)
 }
 }
 
+void blkconf_apply_to_blkdrv(BlockConf *conf)
+{
+BlockBackend *blk = conf->blk;
+BlockDriverState *bs = blk_bs(blk);
+
+if (bs)
+bdrv_apply_blkconf(bs, conf);
+}
+
 bool blkconf_apply_backend_options(BlockConf *conf, bool readonly,
bool resizable, Error **errp)
 {
diff --git a/include/block/block.h b/include/block/block.h
index fb7d379..57a8b65 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -10,6 +10,7 @@
 #include "block/dirty-bitmap.h"
 #include "block/blockjob.h"
 #include "qemu/hbitmap.h"
+#include "hw/block/block.h"
 
 /* block.c */
 typedef struct BlockDriver BlockDriver;
@@ -618,4 +619,13 @@ bool bdrv_can_store_new_dirty_bitmap(BlockDriverState *bs, 
const char *name,
  */
 void bdrv_register_buf(BlockDriverState *bs, void *host, size_t size);
 void bdrv_unregister_buf(BlockDriverState *bs, void *host);
+
+/**
+ * bdrv_apply_blkconf:
+ *
+ * Recursively finds the highest-level block drivers among the files and
+ * backing files that accept a block configuration and applies the given block
+ * configuration to them.
+ */
+void bdrv_apply_blkconf(BlockDriverState *bs, BlockConf *conf);
 #endif
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 6c0927b..1d99a0d 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -433,6 +433,15 @@ struct BlockDriver {
 void (*bdrv_abort_perm_update)(BlockDriverState *bs);
 
 /**
+ * Called to inform the driver of the block configuration of the virtual
+ * block device.
+ *
+ * This function is called by a block device realization function if the
+ * device wants to inform the block driver of its block configuration.
+ */
+void (*bdrv_apply_blkconf)(BlockDriverState *bs, BlockConf *conf);
+
+/**
  * Returns in @nperm and @nshared the permissions that the driver for @bs
  * needs on its child @c, based on the cumulative permissions requested by
  * the parents in @parent_perm and @parent_shared.
diff --git a/include/hw/block/block.h b/include/hw/block/block.h
index d4f4dff..2861995 100644
--- a/include/hw/block/block.h
+++ b/include/hw/block/block.h
@@ -77,6 +77,7 @@ bool blkconf_geometry(BlockConf *conf, int *trans,
   unsigned cyls_max, unsigned heads_max, unsigned secs_max,
   Error **errp);
 void blkconf_blocksizes(BlockConf *conf);
+void blkconf_apply_to_blkdrv(BlockConf *conf);

[Qemu-devel] [PATCH 3/5] hw/scsi/scsi-disk: Always apply block configuration to block driver

2018-05-31 Thread Ari Sundholm
This is an example of what all block devices should ideally do in their
realization functions to convey the block configuration to the block
driver.

Signed-off-by: Ari Sundholm 
---
 hw/scsi/scsi-disk.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/hw/scsi/scsi-disk.c b/hw/scsi/scsi-disk.c
index ded23d3..d50517e 100644
--- a/hw/scsi/scsi-disk.c
+++ b/hw/scsi/scsi-disk.c
@@ -2413,6 +2413,7 @@ static void scsi_realize(SCSIDevice *dev, Error **errp)
 blk_set_dev_ops(s->qdev.conf.blk, _disk_block_ops, s);
 }
 blk_set_guest_block_size(s->qdev.conf.blk, s->qdev.blocksize);
+blkconf_apply_to_blkdrv(>qdev.conf);
 
 blk_iostatus_enable(s->qdev.conf.blk);
 }
-- 
2.7.4




[Qemu-devel] [PATCH v2 7/8] vhdx: Switch to byte-based calls

2018-05-31 Thread Eric Blake
We are gradually moving away from sector-based interfaces, towards
byte-based.  Make the change for the last few sector-based calls
into the block layer from the vhdx driver.

Ideally, the vhdx driver should switch to doing everything
byte-based, but that's a more invasive change that requires a
bit more auditing.

Signed-off-by: Eric Blake 
---
 block/vhdx.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/block/vhdx.c b/block/vhdx.c
index 0b1e21c7501..295d3276120 100644
--- a/block/vhdx.c
+++ b/block/vhdx.c
@@ -1126,9 +1126,9 @@ static coroutine_fn int vhdx_co_readv(BlockDriverState 
*bs, int64_t sector_num,
 break;
 case PAYLOAD_BLOCK_FULLY_PRESENT:
 qemu_co_mutex_unlock(>lock);
-ret = bdrv_co_readv(bs->file,
-sinfo.file_offset >> BDRV_SECTOR_BITS,
-sinfo.sectors_avail, _qiov);
+ret = bdrv_co_preadv(bs->file, sinfo.file_offset,
+ sinfo.sectors_avail * BDRV_SECTOR_SIZE,
+ _qiov, 0);
 qemu_co_mutex_lock(>lock);
 if (ret < 0) {
 goto exit;
@@ -1348,9 +1348,9 @@ static coroutine_fn int vhdx_co_writev(BlockDriverState 
*bs, int64_t sector_num,
 }
 /* block exists, so we can just overwrite it */
 qemu_co_mutex_unlock(>lock);
-ret = bdrv_co_writev(bs->file,
-sinfo.file_offset >> BDRV_SECTOR_BITS,
-sectors_to_write, _qiov);
+ret = bdrv_co_pwritev(bs->file, sinfo.file_offset,
+  sectors_to_write * BDRV_SECTOR_SIZE,
+  _qiov, 0);
 qemu_co_mutex_lock(>lock);
 if (ret < 0) {
 goto error_bat_restore;
-- 
2.14.3




[Qemu-devel] [PATCH v2 8/8] block: Remove unused sector-based vectored I/O

2018-05-31 Thread Eric Blake
We are gradually moving away from sector-based interfaces, towards
byte-based.  Now that all callers of vectored I/O have been converted
to use our preferred byte-based bdrv_co_p{read,write}v(), we can
delete the unused bdrv_co_{read,write}v().

Furthermore, this gets rid of the signature difference between the
public bdrv_co_writev() and the callback .bdrv_co_writev (the
latter still exists, because some drivers still need more work
before they are fully byte-based).

Signed-off-by: Eric Blake 
Reviewed-by: Stefan Hajnoczi 
---
v2: commit typo fix [Kashyap]
---
 include/block/block.h |  4 
 block/io.c| 36 
 2 files changed, 40 deletions(-)

diff --git a/include/block/block.h b/include/block/block.h
index 3894edda9de..fe40d2929ac 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -285,10 +285,6 @@ int bdrv_pwrite(BdrvChild *child, int64_t offset, const 
void *buf, int bytes);
 int bdrv_pwritev(BdrvChild *child, int64_t offset, QEMUIOVector *qiov);
 int bdrv_pwrite_sync(BdrvChild *child, int64_t offset,
  const void *buf, int count);
-int coroutine_fn bdrv_co_readv(BdrvChild *child, int64_t sector_num,
-   int nb_sectors, QEMUIOVector *qiov);
-int coroutine_fn bdrv_co_writev(BdrvChild *child, int64_t sector_num,
-   int nb_sectors, QEMUIOVector *qiov);
 /*
  * Efficiently zero a region of the disk image.  Note that this is a regular
  * I/O request like read or write and should have a reasonable size.  This
diff --git a/block/io.c b/block/io.c
index ca96b487eb8..1d86bfc0072 100644
--- a/block/io.c
+++ b/block/io.c
@@ -1341,24 +1341,6 @@ int coroutine_fn bdrv_co_preadv(BdrvChild *child,
 return ret;
 }

-static int coroutine_fn bdrv_co_do_readv(BdrvChild *child,
-int64_t sector_num, int nb_sectors, QEMUIOVector *qiov,
-BdrvRequestFlags flags)
-{
-if (nb_sectors < 0 || nb_sectors > BDRV_REQUEST_MAX_SECTORS) {
-return -EINVAL;
-}
-
-return bdrv_co_preadv(child, sector_num << BDRV_SECTOR_BITS,
-  nb_sectors << BDRV_SECTOR_BITS, qiov, flags);
-}
-
-int coroutine_fn bdrv_co_readv(BdrvChild *child, int64_t sector_num,
-   int nb_sectors, QEMUIOVector *qiov)
-{
-return bdrv_co_do_readv(child, sector_num, nb_sectors, qiov, 0);
-}
-
 static int coroutine_fn bdrv_co_do_pwrite_zeroes(BlockDriverState *bs,
 int64_t offset, int bytes, BdrvRequestFlags flags)
 {
@@ -1801,24 +1783,6 @@ out:
 return ret;
 }

-static int coroutine_fn bdrv_co_do_writev(BdrvChild *child,
-int64_t sector_num, int nb_sectors, QEMUIOVector *qiov,
-BdrvRequestFlags flags)
-{
-if (nb_sectors < 0 || nb_sectors > BDRV_REQUEST_MAX_SECTORS) {
-return -EINVAL;
-}
-
-return bdrv_co_pwritev(child, sector_num << BDRV_SECTOR_BITS,
-   nb_sectors << BDRV_SECTOR_BITS, qiov, flags);
-}
-
-int coroutine_fn bdrv_co_writev(BdrvChild *child, int64_t sector_num,
-int nb_sectors, QEMUIOVector *qiov)
-{
-return bdrv_co_do_writev(child, sector_num, nb_sectors, qiov, 0);
-}
-
 int coroutine_fn bdrv_co_pwrite_zeroes(BdrvChild *child, int64_t offset,
int bytes, BdrvRequestFlags flags)
 {
-- 
2.14.3




[Qemu-devel] [PATCH v2 4/8] qcow: Switch qcow_co_writev to byte-based calls

2018-05-31 Thread Eric Blake
We are gradually moving away from sector-based interfaces, towards
byte-based.  Make the change for the internals of the qcow
driver write function, by iterating over offset/bytes instead of
sector_num/nb_sectors, and with a rename of index_in_cluster and
repurposing of n to track bytes instead of sectors.

A later patch will then switch the qcow driver as a whole over
to byte-based operation.

Signed-off-by: Eric Blake 

---
v2: prefer 64-bit * over 23-bit <<, rename variable for legibility [Kevin]
---
 block/qcow.c | 36 +---
 1 file changed, 17 insertions(+), 19 deletions(-)

diff --git a/block/qcow.c b/block/qcow.c
index b90915218ff..44adeba276c 100644
--- a/block/qcow.c
+++ b/block/qcow.c
@@ -723,13 +723,15 @@ static coroutine_fn int qcow_co_writev(BlockDriverState 
*bs, int64_t sector_num,
int flags)
 {
 BDRVQcowState *s = bs->opaque;
-int index_in_cluster;
+int offset_in_cluster;
 uint64_t cluster_offset;
 int ret = 0, n;
 struct iovec hd_iov;
 QEMUIOVector hd_qiov;
 uint8_t *buf;
 void *orig_buf;
+int64_t offset = sector_num * BDRV_SECTOR_SIZE;
+int64_t bytes = nb_sectors * BDRV_SECTOR_SIZE;

 assert(!flags);
 s->cluster_cache_offset = -1; /* disable compressed cache */
@@ -749,16 +751,14 @@ static coroutine_fn int qcow_co_writev(BlockDriverState 
*bs, int64_t sector_num,

 qemu_co_mutex_lock(>lock);

-while (nb_sectors != 0) {
-
-index_in_cluster = sector_num & (s->cluster_sectors - 1);
-n = s->cluster_sectors - index_in_cluster;
-if (n > nb_sectors) {
-n = nb_sectors;
+while (bytes != 0) {
+offset_in_cluster = offset & (s->cluster_size - 1);
+n = s->cluster_size - offset_in_cluster;
+if (n > bytes) {
+n = bytes;
 }
-ret = get_cluster_offset(bs, sector_num << 9, 1, 0,
- index_in_cluster << 9,
- (index_in_cluster + n) << 9, _offset);
+ret = get_cluster_offset(bs, offset, 1, 0, offset_in_cluster,
+ offset_in_cluster + n, _offset);
 if (ret < 0) {
 break;
 }
@@ -768,30 +768,28 @@ static coroutine_fn int qcow_co_writev(BlockDriverState 
*bs, int64_t sector_num,
 }
 if (bs->encrypted) {
 assert(s->crypto);
-if (qcrypto_block_encrypt(s->crypto, sector_num * BDRV_SECTOR_SIZE,
-  buf, n * BDRV_SECTOR_SIZE, NULL) < 0) {
+if (qcrypto_block_encrypt(s->crypto, offset, buf, n, NULL) < 0) {
 ret = -EIO;
 break;
 }
 }

 hd_iov.iov_base = (void *)buf;
-hd_iov.iov_len = n * 512;
+hd_iov.iov_len = n;
 qemu_iovec_init_external(_qiov, _iov, 1);
 qemu_co_mutex_unlock(>lock);
 BLKDBG_EVENT(bs->file, BLKDBG_WRITE_AIO);
-ret = bdrv_co_writev(bs->file,
- (cluster_offset >> 9) + index_in_cluster,
- n, _qiov);
+ret = bdrv_co_pwritev(bs->file, cluster_offset + offset_in_cluster,
+  n, _qiov, 0);
 qemu_co_mutex_lock(>lock);
 if (ret < 0) {
 break;
 }
 ret = 0;

-nb_sectors -= n;
-sector_num += n;
-buf += n * 512;
+bytes -= n;
+offset += n;
+buf += n;
 }
 qemu_co_mutex_unlock(>lock);

-- 
2.14.3




[Qemu-devel] [PATCH v2 6/8] replication: Switch to byte-based calls

2018-05-31 Thread Eric Blake
We are gradually moving away from sector-based interfaces, towards
byte-based.  Make the change for the last few sector-based calls
into the block layer from the replication driver.

Ideally, the replication driver should switch to doing everything
byte-based, but that's a more invasive change that requires a
bit more auditing.

Signed-off-by: Eric Blake 
---
 block/replication.c | 14 --
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/block/replication.c b/block/replication.c
index 826db7b3049..6349d6958e4 100644
--- a/block/replication.c
+++ b/block/replication.c
@@ -246,13 +246,14 @@ static coroutine_fn int 
replication_co_readv(BlockDriverState *bs,
 backup_cow_request_begin(, child->bs->job,
  sector_num * BDRV_SECTOR_SIZE,
  remaining_bytes);
-ret = bdrv_co_readv(bs->file, sector_num, remaining_sectors,
-qiov);
+ret = bdrv_co_preadv(bs->file, sector_num * BDRV_SECTOR_SIZE,
+ remaining_bytes, qiov, 0);
 backup_cow_request_end();
 goto out;
 }

-ret = bdrv_co_readv(bs->file, sector_num, remaining_sectors, qiov);
+ret = bdrv_co_preadv(bs->file, sector_num * BDRV_SECTOR_SIZE,
+ remaining_sectors * BDRV_SECTOR_SIZE, qiov, 0);
 out:
 return replication_return_value(s, ret);
 }
@@ -279,8 +280,8 @@ static coroutine_fn int 
replication_co_writev(BlockDriverState *bs,
 }

 if (ret == 0) {
-ret = bdrv_co_writev(top, sector_num,
- remaining_sectors, qiov);
+ret = bdrv_co_pwritev(top, sector_num * BDRV_SECTOR_SIZE,
+  remaining_sectors * BDRV_SECTOR_SIZE, qiov, 0);
 return replication_return_value(s, ret);
 }

@@ -306,7 +307,8 @@ static coroutine_fn int 
replication_co_writev(BlockDriverState *bs,
 qemu_iovec_concat(_qiov, qiov, bytes_done, count);

 target = ret ? top : base;
-ret = bdrv_co_writev(target, sector_num, n, _qiov);
+ret = bdrv_co_pwritev(target, sector_num * BDRV_SECTOR_SIZE,
+  n * BDRV_SECTOR_SIZE, _qiov, 0);
 if (ret < 0) {
 goto out1;
 }
-- 
2.14.3




[Qemu-devel] [PATCH v2 3/8] qcow: Switch qcow_co_readv to byte-based calls

2018-05-31 Thread Eric Blake
We are gradually moving away from sector-based interfaces, towards
byte-based.  Make the change for the internals of the qcow
driver read function, by iterating over offset/bytes instead of
sector_num/nb_sectors, and with a rename of index_in_cluster and
repurposing of n to track bytes instead of sectors.

A later patch will then switch the qcow driver as a whole over
to byte-based operation.

Signed-off-by: Eric Blake 

---
v2: prefer 64-bit * over 32-bit <<, rename variable for legibility [Kevin]
---
 block/qcow.c | 42 --
 1 file changed, 20 insertions(+), 22 deletions(-)

diff --git a/block/qcow.c b/block/qcow.c
index 42d1bbb7643..b90915218ff 100644
--- a/block/qcow.c
+++ b/block/qcow.c
@@ -617,13 +617,15 @@ static coroutine_fn int qcow_co_readv(BlockDriverState 
*bs, int64_t sector_num,
  int nb_sectors, QEMUIOVector *qiov)
 {
 BDRVQcowState *s = bs->opaque;
-int index_in_cluster;
+int offset_in_cluster;
 int ret = 0, n;
 uint64_t cluster_offset;
 struct iovec hd_iov;
 QEMUIOVector hd_qiov;
 uint8_t *buf;
 void *orig_buf;
+int64_t offset = sector_num * BDRV_SECTOR_SIZE;
+int64_t bytes = nb_sectors * BDRV_SECTOR_SIZE;

 if (qiov->niov > 1) {
 buf = orig_buf = qemu_try_blockalign(bs, qiov->size);
@@ -637,36 +639,35 @@ static coroutine_fn int qcow_co_readv(BlockDriverState 
*bs, int64_t sector_num,

 qemu_co_mutex_lock(>lock);

-while (nb_sectors != 0) {
+while (bytes != 0) {
 /* prepare next request */
-ret = get_cluster_offset(bs, sector_num << 9,
- 0, 0, 0, 0, _offset);
+ret = get_cluster_offset(bs, offset, 0, 0, 0, 0, _offset);
 if (ret < 0) {
 break;
 }
-index_in_cluster = sector_num & (s->cluster_sectors - 1);
-n = s->cluster_sectors - index_in_cluster;
-if (n > nb_sectors) {
-n = nb_sectors;
+offset_in_cluster = offset & (s->cluster_size - 1);
+n = s->cluster_size - offset_in_cluster;
+if (n > bytes) {
+n = bytes;
 }

 if (!cluster_offset) {
 if (bs->backing) {
 /* read from the base image */
 hd_iov.iov_base = (void *)buf;
-hd_iov.iov_len = n * 512;
+hd_iov.iov_len = n;
 qemu_iovec_init_external(_qiov, _iov, 1);
 qemu_co_mutex_unlock(>lock);
 /* qcow2 emits this on bs->file instead of bs->backing */
 BLKDBG_EVENT(bs->file, BLKDBG_READ_BACKING_AIO);
-ret = bdrv_co_readv(bs->backing, sector_num, n, _qiov);
+ret = bdrv_co_preadv(bs->backing, offset, n, _qiov, 0);
 qemu_co_mutex_lock(>lock);
 if (ret < 0) {
 break;
 }
 } else {
 /* Note: in this case, no need to wait */
-memset(buf, 0, 512 * n);
+memset(buf, 0, n);
 }
 } else if (cluster_offset & QCOW_OFLAG_COMPRESSED) {
 /* add AIO support for compressed blocks ? */
@@ -674,21 +675,19 @@ static coroutine_fn int qcow_co_readv(BlockDriverState 
*bs, int64_t sector_num,
 ret = -EIO;
 break;
 }
-memcpy(buf,
-   s->cluster_cache + index_in_cluster * 512, 512 * n);
+memcpy(buf, s->cluster_cache + offset_in_cluster, n);
 } else {
 if ((cluster_offset & 511) != 0) {
 ret = -EIO;
 break;
 }
 hd_iov.iov_base = (void *)buf;
-hd_iov.iov_len = n * 512;
+hd_iov.iov_len = n;
 qemu_iovec_init_external(_qiov, _iov, 1);
 qemu_co_mutex_unlock(>lock);
 BLKDBG_EVENT(bs->file, BLKDBG_READ_AIO);
-ret = bdrv_co_readv(bs->file,
-(cluster_offset >> 9) + index_in_cluster,
-n, _qiov);
+ret = bdrv_co_preadv(bs->file, cluster_offset + offset_in_cluster,
+ n, _qiov, 0);
 qemu_co_mutex_lock(>lock);
 if (ret < 0) {
 break;
@@ -696,8 +695,7 @@ static coroutine_fn int qcow_co_readv(BlockDriverState *bs, 
int64_t sector_num,
 if (bs->encrypted) {
 assert(s->crypto);
 if (qcrypto_block_decrypt(s->crypto,
-  sector_num * BDRV_SECTOR_SIZE, buf,
-  n * BDRV_SECTOR_SIZE, NULL) < 0) {
+  offset, buf, n, NULL) < 0) {
 ret = -EIO;
 break;
 }
@@ -705,9 +703,9 @@ static coroutine_fn int qcow_co_readv(BlockDriverState *bs, 
int64_t sector_num,
 }
 ret = 0;

-

[Qemu-devel] [PATCH v2 0/8] block: more byte-based cleanups: vectored I/O

2018-05-31 Thread Eric Blake
My quest continues.  I spent some time pruning sector-based usage
out of qcow as far as possible (and was dismayed at how long it
took to prove no iotests regressions); so for the other drivers, I
did the bare minimum to get rid of an interface, but will leave it
to those file owners if they want to get rid of further pointless
sector manipulations in their files.

In v2:
- throughout: add collected R-b tags
- patch 2: add assert [Kevin]
- patch 3-4: improve readability [Kevin]
- patch 8: retitle to fix typo [Kashyap]

001/8:[] [--] 'parallels: Switch to byte-based calls'
002/8:[0005] [FC] 'qcow: Switch get_cluster_offset to be byte-based'
003/8:[0017] [FC] 'qcow: Switch qcow_co_readv to byte-based calls'
004/8:[0016] [FC] 'qcow: Switch qcow_co_writev to byte-based calls'
005/8:[0008] [FC] 'qcow: Switch to a byte-based driver'
006/8:[] [--] 'replication: Switch to byte-based calls'
007/8:[] [--] 'vhdx: Switch to byte-based calls'
008/8:[down] 'block: Remove unused sector-based vectored I/O'

Eric Blake (8):
  parallels: Switch to byte-based calls
  qcow: Switch get_cluster_offset to be byte-based
  qcow: Switch qcow_co_readv to byte-based calls
  qcow: Switch qcow_co_writev to byte-based calls
  qcow: Switch to a byte-based driver
  replication: Switch to byte-based calls
  vhdx: Switch to byte-based calls
  block: Remove unused sector-based vectored I/O

 include/block/block.h |   4 --
 block/io.c|  36 --
 block/parallels.c |  16 ---
 block/qcow.c  | 130 +-
 block/replication.c   |  14 +++---
 block/vhdx.c  |  12 ++---
 6 files changed, 90 insertions(+), 122 deletions(-)

-- 
2.14.3




[Qemu-devel] [PATCH v2 2/8] qcow: Switch get_cluster_offset to be byte-based

2018-05-31 Thread Eric Blake
We are gradually moving away from sector-based interfaces, towards
byte-based.  Make the change for the internal helper function
get_cluster_offset(), by changing n_start and n_end to by byte
offsets rather than sector indices within the cluster being
allocated.  However, assert that these values are still
sector-aligned (at least qcrypto_block_encrypt() still wants that).
For now we get that alignment for free because we still use
sector-based driver callbacks.

A later patch will then switch the qcow driver as a whole over
to byte-based operation; but will still leave things at sector
alignments as it is not worth auditing the qcow image format
to worry about sub-sector requests.

Signed-off-by: Eric Blake 

---
v2: assert sector alignment [Max]
---
 block/qcow.c | 29 +++--
 1 file changed, 15 insertions(+), 14 deletions(-)

diff --git a/block/qcow.c b/block/qcow.c
index 3ba2ca25ea5..42d1bbb7643 100644
--- a/block/qcow.c
+++ b/block/qcow.c
@@ -345,8 +345,8 @@ static int qcow_reopen_prepare(BDRVReopenState *state,
  *
  * 0 to not allocate.
  *
- * 1 to allocate a normal cluster (for sector indexes 'n_start' to
- * 'n_end')
+ * 1 to allocate a normal cluster (for sector-aligned byte offsets 'n_start'
+ * to 'n_end' within the cluster)
  *
  * 2 to allocate a compressed cluster of size
  * 'compressed_size'. 'compressed_size' must be > 0 and <
@@ -440,9 +440,10 @@ static int get_cluster_offset(BlockDriverState *bs,
 if (!allocate)
 return 0;
 BLKDBG_EVENT(bs->file, BLKDBG_CLUSTER_ALLOC);
+assert(QEMU_IS_ALIGNED(n_start | n_end, BDRV_SECTOR_SIZE));
 /* allocate a new cluster */
 if ((cluster_offset & QCOW_OFLAG_COMPRESSED) &&
-(n_end - n_start) < s->cluster_sectors) {
+(n_end - n_start) < s->cluster_size) {
 /* if the cluster is already compressed, we must
decompress it in the case it is not completely
overwritten */
@@ -480,16 +481,15 @@ static int get_cluster_offset(BlockDriverState *bs,
 /* if encrypted, we must initialize the cluster
content which won't be written */
 if (bs->encrypted &&
-(n_end - n_start) < s->cluster_sectors) {
-uint64_t start_sect;
+(n_end - n_start) < s->cluster_size) {
+uint64_t start_offset;
 assert(s->crypto);
-start_sect = (offset & ~(s->cluster_size - 1)) >> 9;
-for(i = 0; i < s->cluster_sectors; i++) {
+start_offset = offset & ~(s->cluster_size - 1);
+for (i = 0; i < s->cluster_size; i += BDRV_SECTOR_SIZE) {
 if (i < n_start || i >= n_end) {
-memset(s->cluster_data, 0x00, 512);
+memset(s->cluster_data, 0x00, BDRV_SECTOR_SIZE);
 if (qcrypto_block_encrypt(s->crypto,
-  (start_sect + i) *
-  BDRV_SECTOR_SIZE,
+  start_offset + i,
   s->cluster_data,
   BDRV_SECTOR_SIZE,
   NULL) < 0) {
@@ -497,8 +497,9 @@ static int get_cluster_offset(BlockDriverState *bs,
 }
 BLKDBG_EVENT(bs->file, BLKDBG_WRITE_AIO);
 ret = bdrv_pwrite(bs->file,
-  cluster_offset + i * 512,
-  s->cluster_data, 512);
+  cluster_offset + i,
+  s->cluster_data,
+  BDRV_SECTOR_SIZE);
 if (ret < 0) {
 return ret;
 }
@@ -758,8 +759,8 @@ static coroutine_fn int qcow_co_writev(BlockDriverState 
*bs, int64_t sector_num,
 n = nb_sectors;
 }
 ret = get_cluster_offset(bs, sector_num << 9, 1, 0,
- index_in_cluster,
- index_in_cluster + n, _offset);
+ index_in_cluster << 9,
+ (index_in_cluster + n) << 9, _offset);
 if (ret < 0) {
 break;
 }
-- 
2.14.3




[Qemu-devel] [PATCH v2 1/8] parallels: Switch to byte-based calls

2018-05-31 Thread Eric Blake
We are gradually moving away from sector-based interfaces, towards
byte-based.  Make the change for the last few sector-based calls
into the block layer from the parallels driver.

Ideally, the parallels driver should switch to doing everything
byte-based, but that's a more invasive change that requires a
bit more auditing.

Signed-off-by: Eric Blake 
Reviewed-by: Stefan Hajnoczi 
Reviewed-by: Denis V. Lunev 
---
 block/parallels.c | 16 ++--
 1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/block/parallels.c b/block/parallels.c
index 6e9c37f44e1..a761c896d35 100644
--- a/block/parallels.c
+++ b/block/parallels.c
@@ -226,14 +226,15 @@ static int64_t allocate_clusters(BlockDriverState *bs, 
int64_t sector_num,
 };
 qemu_iovec_init_external(, , 1);

-ret = bdrv_co_readv(bs->backing, idx * s->tracks, nb_cow_sectors,
-);
+ret = bdrv_co_preadv(bs->backing, idx * s->tracks * BDRV_SECTOR_SIZE,
+ nb_cow_bytes, , 0);
 if (ret < 0) {
 qemu_vfree(iov.iov_base);
 return ret;
 }

-ret = bdrv_co_writev(bs->file, s->data_end, nb_cow_sectors, );
+ret = bdrv_co_pwritev(bs->file, s->data_end * BDRV_SECTOR_SIZE,
+  nb_cow_bytes, , 0);
 qemu_vfree(iov.iov_base);
 if (ret < 0) {
 return ret;
@@ -339,7 +340,8 @@ static coroutine_fn int 
parallels_co_writev(BlockDriverState *bs,
 qemu_iovec_reset(_qiov);
 qemu_iovec_concat(_qiov, qiov, bytes_done, nbytes);

-ret = bdrv_co_writev(bs->file, position, n, _qiov);
+ret = bdrv_co_pwritev(bs->file, position * BDRV_SECTOR_SIZE, nbytes,
+  _qiov, 0);
 if (ret < 0) {
 break;
 }
@@ -378,7 +380,8 @@ static coroutine_fn int parallels_co_readv(BlockDriverState 
*bs,

 if (position < 0) {
 if (bs->backing) {
-ret = bdrv_co_readv(bs->backing, sector_num, n, _qiov);
+ret = bdrv_co_preadv(bs->backing, sector_num * 
BDRV_SECTOR_SIZE,
+ nbytes, _qiov, 0);
 if (ret < 0) {
 break;
 }
@@ -386,7 +389,8 @@ static coroutine_fn int parallels_co_readv(BlockDriverState 
*bs,
 qemu_iovec_memset(_qiov, 0, 0, nbytes);
 }
 } else {
-ret = bdrv_co_readv(bs->file, position, n, _qiov);
+ret = bdrv_co_preadv(bs->file, position * BDRV_SECTOR_SIZE, nbytes,
+ _qiov, 0);
 if (ret < 0) {
 break;
 }
-- 
2.14.3




[Qemu-devel] [PATCH v2 5/8] qcow: Switch to a byte-based driver

2018-05-31 Thread Eric Blake
We are gradually moving away from sector-based interfaces, towards
byte-based.  The qcow driver is now ready to fully utilize the
byte-based callback interface, as long as we override the default
alignment to still be 512 (needed at least for asserts present
because of encryption, but easier to do everywhere than to audit
which sub-sector requests are handled correctly, especially since
we no longer recommend qcow for new disk images).

Signed-off-by: Eric Blake 

---
v2: minor rebase to changes earlier in series
---
 block/qcow.c | 35 ---
 1 file changed, 20 insertions(+), 15 deletions(-)

diff --git a/block/qcow.c b/block/qcow.c
index 44adeba276c..55720b006ef 100644
--- a/block/qcow.c
+++ b/block/qcow.c
@@ -69,7 +69,6 @@ typedef struct QCowHeader {
 typedef struct BDRVQcowState {
 int cluster_bits;
 int cluster_size;
-int cluster_sectors;
 int l2_bits;
 int l2_size;
 unsigned int l1_size;
@@ -235,7 +234,6 @@ static int qcow_open(BlockDriverState *bs, QDict *options, 
int flags,
 }
 s->cluster_bits = header.cluster_bits;
 s->cluster_size = 1 << s->cluster_bits;
-s->cluster_sectors = 1 << (s->cluster_bits - 9);
 s->l2_bits = header.l2_bits;
 s->l2_size = 1 << s->l2_bits;
 bs->total_sectors = header.size / 512;
@@ -613,8 +611,18 @@ static int decompress_cluster(BlockDriverState *bs, 
uint64_t cluster_offset)
 return 0;
 }

-static coroutine_fn int qcow_co_readv(BlockDriverState *bs, int64_t sector_num,
- int nb_sectors, QEMUIOVector *qiov)
+static void qcow_refresh_limits(BlockDriverState *bs, Error **errp)
+{
+/* At least encrypted images require 512-byte alignment. Apply the
+ * limit universally, rather than just on encrypted images, as
+ * it's easier to let the block layer handle rounding than to
+ * audit this code further. */
+bs->bl.request_alignment = BDRV_SECTOR_SIZE;
+}
+
+static coroutine_fn int qcow_co_preadv(BlockDriverState *bs, uint64_t offset,
+   uint64_t bytes, QEMUIOVector *qiov,
+   int flags)
 {
 BDRVQcowState *s = bs->opaque;
 int offset_in_cluster;
@@ -624,9 +632,8 @@ static coroutine_fn int qcow_co_readv(BlockDriverState *bs, 
int64_t sector_num,
 QEMUIOVector hd_qiov;
 uint8_t *buf;
 void *orig_buf;
-int64_t offset = sector_num * BDRV_SECTOR_SIZE;
-int64_t bytes = nb_sectors * BDRV_SECTOR_SIZE;

+assert(!flags);
 if (qiov->niov > 1) {
 buf = orig_buf = qemu_try_blockalign(bs, qiov->size);
 if (buf == NULL) {
@@ -718,9 +725,9 @@ static coroutine_fn int qcow_co_readv(BlockDriverState *bs, 
int64_t sector_num,
 return ret;
 }

-static coroutine_fn int qcow_co_writev(BlockDriverState *bs, int64_t 
sector_num,
-   int nb_sectors, QEMUIOVector *qiov,
-   int flags)
+static coroutine_fn int qcow_co_pwritev(BlockDriverState *bs, uint64_t offset,
+uint64_t bytes, QEMUIOVector *qiov,
+int flags)
 {
 BDRVQcowState *s = bs->opaque;
 int offset_in_cluster;
@@ -730,8 +737,6 @@ static coroutine_fn int qcow_co_writev(BlockDriverState 
*bs, int64_t sector_num,
 QEMUIOVector hd_qiov;
 uint8_t *buf;
 void *orig_buf;
-int64_t offset = sector_num * BDRV_SECTOR_SIZE;
-int64_t bytes = nb_sectors * BDRV_SECTOR_SIZE;

 assert(!flags);
 s->cluster_cache_offset = -1; /* disable compressed cache */
@@ -1108,8 +1113,7 @@ qcow_co_pwritev_compressed(BlockDriverState *bs, uint64_t 
offset,

 if (ret != Z_STREAM_END || out_len >= s->cluster_size) {
 /* could not compress: write normal cluster */
-ret = qcow_co_writev(bs, offset >> BDRV_SECTOR_BITS,
- bytes >> BDRV_SECTOR_BITS, qiov, 0);
+ret = qcow_co_pwritev(bs, offset, bytes, qiov, 0);
 if (ret < 0) {
 goto fail;
 }
@@ -1194,9 +1198,10 @@ static BlockDriver bdrv_qcow = {
 .bdrv_co_create_opts= qcow_co_create_opts,
 .bdrv_has_zero_init = bdrv_has_zero_init_1,
 .supports_backing   = true,
+.bdrv_refresh_limits= qcow_refresh_limits,

-.bdrv_co_readv  = qcow_co_readv,
-.bdrv_co_writev = qcow_co_writev,
+.bdrv_co_preadv = qcow_co_preadv,
+.bdrv_co_pwritev= qcow_co_pwritev,
 .bdrv_co_block_status   = qcow_co_block_status,

 .bdrv_make_empty= qcow_make_empty,
-- 
2.14.3




  1   2   3   4   >