Re: [Qemu-devel] [Qemu-ppc] [GIT PULL for qemu-pseries REPOST] pseries: Update SLOF firmware image

2019-08-24 Thread David Gibson
On Fri, Aug 23, 2019 at 02:09:44PM +1000, Alexey Kardashevskiy wrote:
> 
> 
> On 14/08/2019 14:33, Aravinda Prasad wrote:
> > 
> > 
> > On Tuesday 13 August 2019 07:47 PM, David Gibson wrote:
> > > On Tue, Aug 13, 2019 at 01:00:24PM +0530, Aravinda Prasad wrote:
> > > > 
> > > > 
> > > > On Monday 12 August 2019 03:38 PM, David Gibson wrote:
> > > > > On Mon, Aug 05, 2019 at 02:14:39PM +0530, Aravinda Prasad wrote:
> > > > > > Alexey/David,
> > > > > > 
> > > > > > With the SLOF changes, QEMU cannot resize the RTAS blob. Resizing is
> > > > > > required for FWNMI support which extends the RTAS blob to include an
> > > > > > error log upon a machine check.
> > > > > > 
> > > > > > The check to valid RTAS buffer fails in the guest because the 
> > > > > > rtas-size
> > > > > > updated in QEMU is not reflecting in the guest.
> > > > > > 
> > > > > > Any workaround for this?
> > > > > 
> > > > > Well, we should still be able to do it, it just means fwnmi would need
> > > > > a SLOF change.  It's an inconvenience, but not really a big deal.
> > > > 
> > > > Yes. Alexey and I were discussing about the following changes to SLOf:
> > > > 
> > > > diff --git a/lib/libhvcall/hvcall.S b/lib/libhvcall/hvcall.S
> > > > index b19f6dbeff2c..880d29a29122 100644
> > > > --- a/lib/libhvcall/hvcall.S
> > > > +++ b/lib/libhvcall/hvcall.S
> > > > @@ -134,6 +134,7 @@ ENTRY(hv_rtas)
> > > >  ori r3,r3,KVMPPC_H_RTAS@l
> > > >  HVCALL
> > > >  blr
> > > > +.space 2048
> > > >  .globl hv_rtas_size
> > > >   hv_rtas_size:
> > > >  .long . - hv_rtas;
> > > > 
> > > > 
> > > > But this will statically reserve space for RTAS even when
> > > > SPAPR_CAP_FWNMI_MCE is OFF.
> > > 
> > > Sure.  We could flag that in the DT somehow, and have SLOF reserve the
> > > space conditionally.
> > > 
> > > Or we could just ignore it. 2 kiB is miniscule compared to our minimum
> > > guest size, and our current RTAS is microscopic compared to PowerVM.
> > 
> > I also think so, 2kiB is miniscule so we can allocate it statically.
> > 
> > Alexey,
> > 
> > Can you please include the above one line fix to SLOF?
> 
> 
> I am thinking of:
> ===
> @@ -132,20 +132,22 @@ ENTRY(hv_rtas)
>   mr  r4,r3
>   lis r3,KVMPPC_H_RTAS@h
>   ori r3,r3,KVMPPC_H_RTAS@l
>   HVCALL
>   blr
> + .space 2048 - (. - hv_rtas)
>   .globl hv_rtas_size
>  hv_rtas_size:
>   .long . - hv_rtas;
> 
>  ENTRY(hv_rtas_broken_sc1)
>   mr  r4,r3
>   lis r3,KVMPPC_H_RTAS@h
>   ori r3,r3,KVMPPC_H_RTAS@l
>   .long   0x7c000268
>   blr
> + .space 2048 - (. - hv_rtas_broken_sc1)
>   .globl hv_rtas_broken_sc1_size
>  hv_rtas_broken_sc1_size:
>   .long . - hv_rtas_broken_sc1;
> ===
> 
> to align the rtas blob to 2k precisely. But QEMU hardcoded
> RTAS_ERROR_LOG_OFFSET bothers me a bit, I should probably put some sort of a
> magic marker at which RTAS log can start.
> 
> David, any thoughts? The marker could be as simple as a zero, for example.

Eh, TBH I don't think an agreed upon magic marker has all that much
advantage on an agreed upon offset.  Let's keep it simple and retain
the fixed offset for now.  If that's ever a problem we'll need a
synchronized qemu & SLOF update, but that's ok - that's an
inconvenience, not a disaster.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature


Re: [Qemu-devel] [PATCH] block: posix: Always allocate the first block

2019-08-24 Thread Nir Soffer
On Fri, Aug 23, 2019 at 8:53 PM Max Reitz  wrote:

> On 23.08.19 18:48, Nir Soffer wrote:
> > On Fri, Aug 23, 2019 at 4:58 PM Max Reitz  > > wrote:
>
> [...]
>
> > If you have a format layer that truncates the image to a fixed size
> and
> > does not write anything into the first block itself (say because it
> uses
> > a footer), then (with O_DIRECT) allocate_first_block() will fail
> > (silently, because while it does return an error value, it is never
> > checked and there is no comment that explains why we don’t check it)
> >
> >
> > The motivation is that this is an optimization for the special case of
> using
> > empty image, so it does not worth failing image creation.
> > I will add a comment about that.
>
> Thanks!
>
> [...]
>
> > Thanks for the example.
> >
> > I will need time to play with blockdev and understand the flows when
> image
> > are created. Do you think is would be useful to fix now only image
> creation
> > via qemu-img, and handle blockdev later?
>
> Well, it isn’t about blockdev, it’s simply about the fact that this
> function doesn’t work for O_DIRECT files.  I showed how to reproduce the
> issue without blockdev (namely block_resize).  Sure, that is an edge
> case, but it is a completely valid case.
>
> Also, it seems to me the fix is rather simple.  Just something like:
>
> static int allocate_first_block(int fd, int64_t max_size)
> {
> int write_size = MIN(max_size, MAX_BLOCKSIZE);
> void *buf;
> ssize_t n;
>
> /* Round down to power of two */
> assert(write_size > 0);
> write_size = 1 << (31 - clz32(write_size));
>
> buf = qemu_memalign(MAX(getpagesize(), write_size), write_size);
> memset(buf, 0, write_size);
>
> do {
> n = pwrite(fd, buf, write_size, 0);
> } while (n < 0 && errno == EINTR);
>
> qemu_vfree(buf);
>
> return n < 0 ? -errno : 0;
> }
>
> Wouldn’t that work?
>

Sure, it should work.

But I think we can make this simpler, always writing MIN(max_size,
MAX_BLOCKSIZE).

vdsm is enforcing now 4k alignment, and there is no way to create images
with unaligned
size. Maybe qemu should adapt this rule?

Nir


Re: [Qemu-devel] [PATCH 8/9] exec: Delete DEVICE_HOST_ENDIAN

2019-08-24 Thread Richard Henderson
On 8/23/19 12:42 PM, Tony Nguyen wrote:
> Simplify code with MemOp short hand for host endianness, 0.
> 
>  typedef enum MemOp {
>  /* snip */
>  #ifdef HOST_WORDS_BIGENDIAN
>  MO_LE= MO_BSWAP,
>  MO_BE= 0,
>  #else
>  MO_LE= 0,
>  MO_BE= MO_BSWAP,
>  #endif
>  /* snip */
>  };
> 
> Signed-off-by: Tony Nguyen 
> ---
>  include/exec/cpu-common.h | 8 
>  memory.c  | 2 +-
>  2 files changed, 1 insertion(+), 9 deletions(-)

Reviewed-by: Richard Henderson 


r~




Re: [Qemu-devel] [PATCH 9/9] memory: Delete memory_region_big_endian

2019-08-24 Thread Richard Henderson
On 8/23/19 12:42 PM, Tony Nguyen wrote:
> memory_region_big_endian is no longer useful now we are consistently
> using MemOp for endianness.
> 
> Signed-off-by: Tony Nguyen 
> ---
>  memory.c | 11 +--
>  1 file changed, 1 insertion(+), 10 deletions(-)

Reviewed-by: Richard Henderson 


r~




Re: [Qemu-devel] [PATCH 6/9] memory: Delete devend_memop

2019-08-24 Thread Richard Henderson
On 8/23/19 12:42 PM, Tony Nguyen wrote:
> device_endian has been made redundant by MemOp.
> 
> Signed-off-by: Tony Nguyen 
> ---
>  include/exec/memory.h |  3 ---
>  memory.c  | 19 +--
>  memory_ldst.inc.c | 18 ++
>  3 files changed, 7 insertions(+), 33 deletions(-)

Reviewed-by: Richard Henderson 


r~



[Qemu-devel] [PATCH 6/6] tcg: Check for watchpoints in probe_write()

2019-08-24 Thread Richard Henderson
From: David Hildenbrand 

Let size > 0 indicate a promise to write to those bytes.
Check for write watchpoints in the probed range.

Suggested-by: Richard Henderson 
Signed-off-by: David Hildenbrand 
Message-Id: <20190823100741.9621-10-da...@redhat.com>
[rth: Recompute index after tlb_fill; check TLB_WATCHPOINT.]
Signed-off-by: Richard Henderson 
---
 accel/tcg/cputlb.c | 15 +--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index f7a414a131..7fc7aa9482 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -1086,13 +1086,24 @@ void probe_write(CPUArchState *env, target_ulong addr, 
int size, int mmu_idx,
 {
 uintptr_t index = tlb_index(env, mmu_idx, addr);
 CPUTLBEntry *entry = tlb_entry(env, mmu_idx, addr);
+target_ulong tlb_addr = tlb_addr_write(entry);
 
-if (!tlb_hit(tlb_addr_write(entry), addr)) {
-/* TLB entry is for a different page */
+if (unlikely(!tlb_hit(tlb_addr, addr))) {
 if (!VICTIM_TLB_HIT(addr_write, addr)) {
 tlb_fill(env_cpu(env), addr, size, MMU_DATA_STORE,
  mmu_idx, retaddr);
+/* TLB resize via tlb_fill may have moved the entry. */
+index = tlb_index(env, mmu_idx, addr);
+entry = tlb_entry(env, mmu_idx, addr);
 }
+tlb_addr = tlb_addr_write(entry);
+}
+
+/* Handle watchpoints.  */
+if ((tlb_addr & TLB_WATCHPOINT) && size > 0) {
+cpu_check_watchpoint(env_cpu(env), addr, size,
+ env_tlb(env)->d[mmu_idx].iotlb[index].attrs,
+ BP_MEM_WRITE, retaddr);
 }
 }
 
-- 
2.17.1




[Qemu-devel] [PATCH 4/6] exec: Factor out cpu_watchpoint_address_matches

2019-08-24 Thread Richard Henderson
We want to move the check for watchpoints from
memory_region_section_get_iotlb to tlb_set_page_with_attrs.
Isolate the loop over watchpoints to an exported function.

Rename the existing cpu_watchpoint_address_matches to
watchpoint_address_matches, since it doesn't actually
have a cpu argument.

Signed-off-by: Richard Henderson 
---
 include/hw/core/cpu.h |  7 +++
 exec.c| 45 ---
 2 files changed, 36 insertions(+), 16 deletions(-)

diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
index 7bd8bed5b2..c7cda65c66 100644
--- a/include/hw/core/cpu.h
+++ b/include/hw/core/cpu.h
@@ -1096,6 +1096,12 @@ static inline void cpu_check_watchpoint(CPUState *cpu, 
vaddr addr, vaddr len,
 MemTxAttrs atr, int fl, uintptr_t ra)
 {
 }
+
+static inline int cpu_watchpoint_address_matches(CPUState *cpu,
+ vaddr addr, vaddr len)
+{
+return 0;
+}
 #else
 int cpu_watchpoint_insert(CPUState *cpu, vaddr addr, vaddr len,
   int flags, CPUWatchpoint **watchpoint);
@@ -1105,6 +,7 @@ void cpu_watchpoint_remove_by_ref(CPUState *cpu, 
CPUWatchpoint *watchpoint);
 void cpu_watchpoint_remove_all(CPUState *cpu, int mask);
 void cpu_check_watchpoint(CPUState *cpu, vaddr addr, vaddr len,
   MemTxAttrs attrs, int flags, uintptr_t ra);
+int cpu_watchpoint_address_matches(CPUState *cpu, vaddr addr, vaddr len);
 #endif
 
 /**
diff --git a/exec.c b/exec.c
index cb6f5763dc..8575ce51ad 100644
--- a/exec.c
+++ b/exec.c
@@ -1138,9 +1138,8 @@ void cpu_watchpoint_remove_all(CPUState *cpu, int mask)
  * partially or completely with the address range covered by the
  * access).
  */
-static inline bool cpu_watchpoint_address_matches(CPUWatchpoint *wp,
-  vaddr addr,
-  vaddr len)
+static inline bool watchpoint_address_matches(CPUWatchpoint *wp,
+  vaddr addr, vaddr len)
 {
 /* We know the lengths are non-zero, but a little caution is
  * required to avoid errors in the case where the range ends
@@ -1152,6 +1151,20 @@ static inline bool 
cpu_watchpoint_address_matches(CPUWatchpoint *wp,
 
 return !(addr > wpend || wp->vaddr > addrend);
 }
+
+/* Return flags for watchpoints that match addr + prot.  */
+int cpu_watchpoint_address_matches(CPUState *cpu, vaddr addr, vaddr len)
+{
+CPUWatchpoint *wp;
+int ret = 0;
+
+QTAILQ_FOREACH(wp, >watchpoints, entry) {
+if (watchpoint_address_matches(wp, addr, TARGET_PAGE_SIZE)) {
+ret |= wp->flags;
+}
+}
+return ret;
+}
 #endif /* !CONFIG_USER_ONLY */
 
 /* Add a breakpoint.  */
@@ -1459,7 +1472,7 @@ hwaddr memory_region_section_get_iotlb(CPUState *cpu,
target_ulong *address)
 {
 hwaddr iotlb;
-CPUWatchpoint *wp;
+int flags, match;
 
 if (memory_region_is_ram(section->mr)) {
 /* Normal RAM.  */
@@ -1477,17 +1490,17 @@ hwaddr memory_region_section_get_iotlb(CPUState *cpu,
 iotlb += xlat;
 }
 
-/* Make accesses to pages with watchpoints go via the
-   watchpoint trap routines.  */
-QTAILQ_FOREACH(wp, >watchpoints, entry) {
-if (cpu_watchpoint_address_matches(wp, vaddr, TARGET_PAGE_SIZE)) {
-/* Avoid trapping reads of pages with a write breakpoint. */
-if ((prot & PAGE_WRITE) || (wp->flags & BP_MEM_READ)) {
-iotlb = PHYS_SECTION_WATCH + paddr;
-*address |= TLB_MMIO;
-break;
-}
-}
+/* Avoid trapping reads of pages with a write breakpoint. */
+match = (prot & PAGE_READ ? BP_MEM_READ : 0)
+  | (prot & PAGE_WRITE ? BP_MEM_WRITE : 0);
+flags = cpu_watchpoint_address_matches(cpu, vaddr, TARGET_PAGE_SIZE);
+if (flags & match) {
+/*
+ * Make accesses to pages with watchpoints go via the
+ * watchpoint trap routines.
+ */
+iotlb = PHYS_SECTION_WATCH + paddr;
+*address |= TLB_MMIO;
 }
 
 return iotlb;
@@ -2806,7 +2819,7 @@ void cpu_check_watchpoint(CPUState *cpu, vaddr addr, 
vaddr len,
 
 addr = cc->adjust_watchpoint_address(cpu, addr, len);
 QTAILQ_FOREACH(wp, >watchpoints, entry) {
-if (cpu_watchpoint_address_matches(wp, addr, len)
+if (watchpoint_address_matches(wp, addr, len)
 && (wp->flags & flags)) {
 if (flags == BP_MEM_READ) {
 wp->flags |= BP_WATCHPOINT_HIT_READ;
-- 
2.17.1




[Qemu-devel] [PATCH 2/6] exec: Factor out core logic of check_watchpoint()

2019-08-24 Thread Richard Henderson
From: David Hildenbrand 

We want to perform the same checks in probe_write() to trigger a cpu
exit before doing any modifications. We'll have to pass a PC.

Signed-off-by: David Hildenbrand 
Reviewed-by: Richard Henderson 
Message-Id: <20190823100741.9621-9-da...@redhat.com>
[rth: Use vaddr for len, like other watchpoint functions;
Move user-only stub to static inline.]
Signed-off-by: Richard Henderson 
---
 include/hw/core/cpu.h |  7 +++
 exec.c| 26 ++
 2 files changed, 25 insertions(+), 8 deletions(-)

diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
index 6de688059d..7bd8bed5b2 100644
--- a/include/hw/core/cpu.h
+++ b/include/hw/core/cpu.h
@@ -1091,6 +1091,11 @@ static inline void cpu_watchpoint_remove_by_ref(CPUState 
*cpu,
 static inline void cpu_watchpoint_remove_all(CPUState *cpu, int mask)
 {
 }
+
+static inline void cpu_check_watchpoint(CPUState *cpu, vaddr addr, vaddr len,
+MemTxAttrs atr, int fl, uintptr_t ra)
+{
+}
 #else
 int cpu_watchpoint_insert(CPUState *cpu, vaddr addr, vaddr len,
   int flags, CPUWatchpoint **watchpoint);
@@ -1098,6 +1103,8 @@ int cpu_watchpoint_remove(CPUState *cpu, vaddr addr,
   vaddr len, int flags);
 void cpu_watchpoint_remove_by_ref(CPUState *cpu, CPUWatchpoint *watchpoint);
 void cpu_watchpoint_remove_all(CPUState *cpu, int mask);
+void cpu_check_watchpoint(CPUState *cpu, vaddr addr, vaddr len,
+  MemTxAttrs attrs, int flags, uintptr_t ra);
 #endif
 
 /**
diff --git a/exec.c b/exec.c
index 31fb75901f..cb6f5763dc 100644
--- a/exec.c
+++ b/exec.c
@@ -2789,11 +2789,10 @@ static const MemoryRegionOps notdirty_mem_ops = {
 };
 
 /* Generate a debug exception if a watchpoint has been hit.  */
-static void check_watchpoint(int offset, int len, MemTxAttrs attrs, int flags)
+void cpu_check_watchpoint(CPUState *cpu, vaddr addr, vaddr len,
+  MemTxAttrs attrs, int flags, uintptr_t ra)
 {
-CPUState *cpu = current_cpu;
 CPUClass *cc = CPU_GET_CLASS(cpu);
-target_ulong vaddr;
 CPUWatchpoint *wp;
 
 assert(tcg_enabled());
@@ -2804,17 +2803,17 @@ static void check_watchpoint(int offset, int len, 
MemTxAttrs attrs, int flags)
 cpu_interrupt(cpu, CPU_INTERRUPT_DEBUG);
 return;
 }
-vaddr = (cpu->mem_io_vaddr & TARGET_PAGE_MASK) + offset;
-vaddr = cc->adjust_watchpoint_address(cpu, vaddr, len);
+
+addr = cc->adjust_watchpoint_address(cpu, addr, len);
 QTAILQ_FOREACH(wp, >watchpoints, entry) {
-if (cpu_watchpoint_address_matches(wp, vaddr, len)
+if (cpu_watchpoint_address_matches(wp, addr, len)
 && (wp->flags & flags)) {
 if (flags == BP_MEM_READ) {
 wp->flags |= BP_WATCHPOINT_HIT_READ;
 } else {
 wp->flags |= BP_WATCHPOINT_HIT_WRITE;
 }
-wp->hitaddr = vaddr;
+wp->hitaddr = MAX(addr, wp->vaddr);
 wp->hitattrs = attrs;
 if (!cpu->watchpoint_hit) {
 if (wp->flags & BP_CPU &&
@@ -2829,11 +2828,14 @@ static void check_watchpoint(int offset, int len, 
MemTxAttrs attrs, int flags)
 if (wp->flags & BP_STOP_BEFORE_ACCESS) {
 cpu->exception_index = EXCP_DEBUG;
 mmap_unlock();
-cpu_loop_exit(cpu);
+cpu_loop_exit_restore(cpu, ra);
 } else {
 /* Force execution of one insn next time.  */
 cpu->cflags_next_tb = 1 | curr_cflags();
 mmap_unlock();
+if (ra) {
+cpu_restore_state(cpu, ra, true);
+}
 cpu_loop_exit_noexc(cpu);
 }
 }
@@ -2843,6 +2845,14 @@ static void check_watchpoint(int offset, int len, 
MemTxAttrs attrs, int flags)
 }
 }
 
+static void check_watchpoint(int offset, int len, MemTxAttrs attrs, int flags)
+{
+CPUState *cpu = current_cpu;
+vaddr addr = (cpu->mem_io_vaddr & TARGET_PAGE_MASK) + offset;
+
+cpu_check_watchpoint(cpu, addr, len, attrs, flags, 0);
+}
+
 /* Watchpoint access routines.  Watchpoints are inserted using TLB tricks,
so these check for a hit then pass through to the normal out-of-line
phys routines.  */
-- 
2.17.1




[Qemu-devel] [PATCH 5/6] cputlb: Handle watchpoints via TLB_WATCHPOINT

2019-08-24 Thread Richard Henderson
The raising of exceptions from check_watchpoint, buried inside
of the I/O subsystem, is fundamentally broken.  We do not have
the helper return address with which we can unwind guest state.

Replace PHYS_SECTION_WATCH and io_mem_watch with TLB_WATCHPOINT.
Move the call to cpu_check_watchpoint into the cputlb helpers
where we do have the helper return address.

This also allows us to handle watchpoints on RAM to bypass the
full i/o access path.

Signed-off-by: Richard Henderson 
---
 include/exec/cpu-all.h |   5 +-
 accel/tcg/cputlb.c |  83 +++---
 exec.c | 114 +++--
 3 files changed, 87 insertions(+), 115 deletions(-)

diff --git a/include/exec/cpu-all.h b/include/exec/cpu-all.h
index 8d07ae23a5..d2d443c4f9 100644
--- a/include/exec/cpu-all.h
+++ b/include/exec/cpu-all.h
@@ -329,11 +329,14 @@ CPUArchState *cpu_copy(CPUArchState *env);
 #define TLB_NOTDIRTY(1 << (TARGET_PAGE_BITS - 2))
 /* Set if TLB entry is an IO callback.  */
 #define TLB_MMIO(1 << (TARGET_PAGE_BITS - 3))
+/* Set if TLB entry contains a watchpoint.  */
+#define TLB_WATCHPOINT  (1 << (TARGET_PAGE_BITS - 4))
 
 /* Use this mask to check interception with an alignment mask
  * in a TCG backend.
  */
-#define TLB_FLAGS_MASK  (TLB_INVALID_MASK | TLB_NOTDIRTY | TLB_MMIO)
+#define TLB_FLAGS_MASK \
+(TLB_INVALID_MASK | TLB_NOTDIRTY | TLB_MMIO | TLB_WATCHPOINT)
 
 /**
  * tlb_hit_page: return true if page aligned @addr is a hit against the
diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index c9576bebcf..f7a414a131 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -710,6 +710,7 @@ void tlb_set_page_with_attrs(CPUState *cpu, target_ulong 
vaddr,
 hwaddr iotlb, xlat, sz, paddr_page;
 target_ulong vaddr_page;
 int asidx = cpu_asidx_from_attrs(cpu, attrs);
+int wp_flags;
 
 assert_cpu_is_self(cpu);
 
@@ -752,6 +753,8 @@ void tlb_set_page_with_attrs(CPUState *cpu, target_ulong 
vaddr,
 code_address = address;
 iotlb = memory_region_section_get_iotlb(cpu, section, vaddr_page,
 paddr_page, xlat, prot, );
+wp_flags = cpu_watchpoint_address_matches(cpu, vaddr_page,
+  TARGET_PAGE_SIZE);
 
 index = tlb_index(env, mmu_idx, vaddr_page);
 te = tlb_entry(env, mmu_idx, vaddr_page);
@@ -805,6 +808,9 @@ void tlb_set_page_with_attrs(CPUState *cpu, target_ulong 
vaddr,
 tn.addend = addend - vaddr_page;
 if (prot & PAGE_READ) {
 tn.addr_read = address;
+if (wp_flags & BP_MEM_READ) {
+tn.addr_read |= TLB_WATCHPOINT;
+}
 } else {
 tn.addr_read = -1;
 }
@@ -831,6 +837,9 @@ void tlb_set_page_with_attrs(CPUState *cpu, target_ulong 
vaddr,
 if (prot & PAGE_WRITE_INV) {
 tn.addr_write |= TLB_INVALID_MASK;
 }
+if (wp_flags & BP_MEM_WRITE) {
+tn.addr_write |= TLB_WATCHPOINT;
+}
 }
 
 copy_tlb_helper_locked(te, );
@@ -1264,13 +1273,33 @@ load_helper(CPUArchState *env, target_ulong addr, 
TCGMemOpIdx oi,
 tlb_addr &= ~TLB_INVALID_MASK;
 }
 
-/* Handle an IO access.  */
+/* Handle anything that isn't just a straight memory access.  */
 if (unlikely(tlb_addr & ~TARGET_PAGE_MASK)) {
+CPUIOTLBEntry *iotlbentry;
+
+/* For anything that is unaligned, recurse through full_load.  */
 if ((addr & (size - 1)) != 0) {
 goto do_unaligned_access;
 }
-return io_readx(env, _tlb(env)->d[mmu_idx].iotlb[index],
-mmu_idx, addr, retaddr, access_type, op);
+
+iotlbentry = _tlb(env)->d[mmu_idx].iotlb[index];
+
+/* Handle watchpoints.  */
+if (unlikely(tlb_addr & TLB_WATCHPOINT)) {
+/* On watchpoint hit, this will longjmp out.  */
+cpu_check_watchpoint(env_cpu(env), addr, size,
+ iotlbentry->attrs, BP_MEM_READ, retaddr);
+
+/* The backing page may or may not require I/O.  */
+tlb_addr &= ~TLB_WATCHPOINT;
+if ((tlb_addr & ~TARGET_PAGE_MASK) == 0) {
+goto do_aligned_access;
+}
+}
+
+/* Handle I/O access.  */
+return io_readx(env, iotlbentry, mmu_idx, addr,
+retaddr, access_type, op);
 }
 
 /* Handle slow unaligned access (it spans two pages or IO).  */
@@ -1297,6 +1326,7 @@ load_helper(CPUArchState *env, target_ulong addr, 
TCGMemOpIdx oi,
 return res & MAKE_64BIT_MASK(0, size * 8);
 }
 
+ do_aligned_access:
 haddr = (void *)((uintptr_t)addr + entry->addend);
 switch (op) {
 case MO_UB:
@@ -1486,13 +1516,32 @@ store_helper(CPUArchState *env, target_ulong addr, 
uint64_t val,
 tlb_addr = tlb_addr_write(entry) & ~TLB_INVALID_MASK;
 }
 
-/* Handle an IO access.  */
+/* Handle anything that isn't 

[Qemu-devel] [PATCH 1/6] exec: Move user-only watchpoint stubs inline

2019-08-24 Thread Richard Henderson
Let the user-only watchpoint stubs resolve to empty inline functions.

Signed-off-by: Richard Henderson 
---
 include/hw/core/cpu.h | 23 +++
 exec.c| 26 ++
 2 files changed, 25 insertions(+), 24 deletions(-)

diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
index 77fca95a40..6de688059d 100644
--- a/include/hw/core/cpu.h
+++ b/include/hw/core/cpu.h
@@ -1070,12 +1070,35 @@ static inline bool cpu_breakpoint_test(CPUState *cpu, 
vaddr pc, int mask)
 return false;
 }
 
+#ifdef CONFIG_USER_ONLY
+static inline int cpu_watchpoint_insert(CPUState *cpu, vaddr addr, vaddr len,
+int flags, CPUWatchpoint **watchpoint)
+{
+return -ENOSYS;
+}
+
+static inline int cpu_watchpoint_remove(CPUState *cpu, vaddr addr,
+vaddr len, int flags)
+{
+return -ENOSYS;
+}
+
+static inline void cpu_watchpoint_remove_by_ref(CPUState *cpu,
+CPUWatchpoint *wp)
+{
+}
+
+static inline void cpu_watchpoint_remove_all(CPUState *cpu, int mask)
+{
+}
+#else
 int cpu_watchpoint_insert(CPUState *cpu, vaddr addr, vaddr len,
   int flags, CPUWatchpoint **watchpoint);
 int cpu_watchpoint_remove(CPUState *cpu, vaddr addr,
   vaddr len, int flags);
 void cpu_watchpoint_remove_by_ref(CPUState *cpu, CPUWatchpoint *watchpoint);
 void cpu_watchpoint_remove_all(CPUState *cpu, int mask);
+#endif
 
 /**
  * cpu_get_address_space:
diff --git a/exec.c b/exec.c
index 53a15b7ad7..31fb75901f 100644
--- a/exec.c
+++ b/exec.c
@@ -1062,28 +1062,7 @@ static void breakpoint_invalidate(CPUState *cpu, 
target_ulong pc)
 }
 #endif
 
-#if defined(CONFIG_USER_ONLY)
-void cpu_watchpoint_remove_all(CPUState *cpu, int mask)
-
-{
-}
-
-int cpu_watchpoint_remove(CPUState *cpu, vaddr addr, vaddr len,
-  int flags)
-{
-return -ENOSYS;
-}
-
-void cpu_watchpoint_remove_by_ref(CPUState *cpu, CPUWatchpoint *watchpoint)
-{
-}
-
-int cpu_watchpoint_insert(CPUState *cpu, vaddr addr, vaddr len,
-  int flags, CPUWatchpoint **watchpoint)
-{
-return -ENOSYS;
-}
-#else
+#ifndef CONFIG_USER_ONLY
 /* Add a watchpoint.  */
 int cpu_watchpoint_insert(CPUState *cpu, vaddr addr, vaddr len,
   int flags, CPUWatchpoint **watchpoint)
@@ -1173,8 +1152,7 @@ static inline bool 
cpu_watchpoint_address_matches(CPUWatchpoint *wp,
 
 return !(addr > wpend || wp->vaddr > addrend);
 }
-
-#endif
+#endif /* !CONFIG_USER_ONLY */
 
 /* Add a breakpoint.  */
 int cpu_breakpoint_insert(CPUState *cpu, vaddr pc, int flags,
-- 
2.17.1




[Qemu-devel] [PATCH 0/6] exec: Cleanup watchpoints

2019-08-24 Thread Richard Henderson
As discussed with David earlier this week, the current implementation
of watchpoints cannot work, at least reliably.  We are raising an
exception out of the middle of the i/o access path which does not
even attempt to unwind the guest cpu state, nor does it have the
information required to do so.

This moves the implementation to the cputlb helpers.  This is a point
at which we can and do raise exceptions properly.

In addition, this fixes a bug in that unaligned stores were detecting
watchpoints in the middle of the byte-by-byte operation, which means
that we didn't signal the watchpoint early enough to avoid state change.


r~


David Hildenbrand (2):
  exec: Factor out core logic of check_watchpoint()
  tcg: Check for watchpoints in probe_write()

Richard Henderson (4):
  exec: Move user-only watchpoint stubs inline
  cputlb: Fold TLB_RECHECK into TLB_INVALID_MASK
  exec: Factor out cpu_watchpoint_address_matches
  cputlb: Handle watchpoints via TLB_WATCHPOINT

 include/exec/cpu-all.h |   8 +-
 include/hw/core/cpu.h  |  37 +
 accel/tcg/cputlb.c | 156 --
 exec.c | 167 +
 4 files changed, 173 insertions(+), 195 deletions(-)

-- 
2.17.1




[Qemu-devel] [PATCH 3/6] cputlb: Fold TLB_RECHECK into TLB_INVALID_MASK

2019-08-24 Thread Richard Henderson
We had two different mechanisms to force a recheck of the tlb.

Before TLB_RECHECK was introduced, we had a PAGE_WRITE_INV bit
that would immediate set TLB_INVALID_MASK, which automatically
means that a second check of the tlb entry fails.

We can use the same mechanism to handle small pages.
Conserve TLB_* bits by removing TLB_RECHECK.

Signed-off-by: Richard Henderson 
---
 include/exec/cpu-all.h |  5 +--
 accel/tcg/cputlb.c | 86 +++---
 2 files changed, 24 insertions(+), 67 deletions(-)

diff --git a/include/exec/cpu-all.h b/include/exec/cpu-all.h
index 8323094648..8d07ae23a5 100644
--- a/include/exec/cpu-all.h
+++ b/include/exec/cpu-all.h
@@ -329,14 +329,11 @@ CPUArchState *cpu_copy(CPUArchState *env);
 #define TLB_NOTDIRTY(1 << (TARGET_PAGE_BITS - 2))
 /* Set if TLB entry is an IO callback.  */
 #define TLB_MMIO(1 << (TARGET_PAGE_BITS - 3))
-/* Set if TLB entry must have MMU lookup repeated for every access */
-#define TLB_RECHECK (1 << (TARGET_PAGE_BITS - 4))
 
 /* Use this mask to check interception with an alignment mask
  * in a TCG backend.
  */
-#define TLB_FLAGS_MASK  (TLB_INVALID_MASK | TLB_NOTDIRTY | TLB_MMIO \
- | TLB_RECHECK)
+#define TLB_FLAGS_MASK  (TLB_INVALID_MASK | TLB_NOTDIRTY | TLB_MMIO)
 
 /**
  * tlb_hit_page: return true if page aligned @addr is a hit against the
diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index d9787cc893..c9576bebcf 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -732,11 +732,8 @@ void tlb_set_page_with_attrs(CPUState *cpu, target_ulong 
vaddr,
 
 address = vaddr_page;
 if (size < TARGET_PAGE_SIZE) {
-/*
- * Slow-path the TLB entries; we will repeat the MMU check and TLB
- * fill on every access.
- */
-address |= TLB_RECHECK;
+/* Repeat the MMU check and TLB fill on every access.  */
+address |= TLB_INVALID_MASK;
 }
 if (attrs.byte_swap) {
 /* Force the access through the I/O slow path.  */
@@ -1026,10 +1023,15 @@ static bool victim_tlb_hit(CPUArchState *env, size_t 
mmu_idx, size_t index,
   victim_tlb_hit(env, mmu_idx, index, offsetof(CPUTLBEntry, TY), \
  (ADDR) & TARGET_PAGE_MASK)
 
-/* NOTE: this function can trigger an exception */
-/* NOTE2: the returned address is not exactly the physical address: it
- * is actually a ram_addr_t (in system mode; the user mode emulation
- * version of this function returns a guest virtual address).
+/*
+ * Return a ram_addr_t for the virtual address for execution.
+ *
+ * Return -1 if we can't translate and execute from an entire page
+ * of RAM.  This will force us to execute by loading and translating
+ * one insn at a time, without caching.
+ *
+ * NOTE: This function will trigger an exception if the page is
+ * not executable.
  */
 tb_page_addr_t get_page_addr_code(CPUArchState *env, target_ulong addr)
 {
@@ -1043,19 +1045,20 @@ tb_page_addr_t get_page_addr_code(CPUArchState *env, 
target_ulong addr)
 tlb_fill(env_cpu(env), addr, 0, MMU_INST_FETCH, mmu_idx, 0);
 index = tlb_index(env, mmu_idx, addr);
 entry = tlb_entry(env, mmu_idx, addr);
+
+if (unlikely(entry->addr_code & TLB_INVALID_MASK)) {
+/*
+ * The MMU protection covers a smaller range than a target
+ * page, so we must redo the MMU check for every insn.
+ */
+return -1;
+}
 }
 assert(tlb_hit(entry->addr_code, addr));
 }
 
-if (unlikely(entry->addr_code & (TLB_RECHECK | TLB_MMIO))) {
-/*
- * Return -1 if we can't translate and execute from an entire
- * page of RAM here, which will cause us to execute by loading
- * and translating one insn at a time, without caching:
- *  - TLB_RECHECK: means the MMU protection covers a smaller range
- *than a target page, so we must redo the MMU check every insn
- *  - TLB_MMIO: region is not backed by RAM
- */
+if (unlikely(entry->addr_code & TLB_MMIO)) {
+/* The region is not backed by RAM.  */
 return -1;
 }
 
@@ -1180,7 +1183,7 @@ static void *atomic_mmu_lookup(CPUArchState *env, 
target_ulong addr,
 }
 
 /* Notice an IO access or a needs-MMU-lookup access */
-if (unlikely(tlb_addr & (TLB_MMIO | TLB_RECHECK))) {
+if (unlikely(tlb_addr & TLB_MMIO)) {
 /* There's really nothing that can be done to
support this apart from stop-the-world.  */
 goto stop_the_world;
@@ -1258,6 +1261,7 @@ load_helper(CPUArchState *env, target_ulong addr, 
TCGMemOpIdx oi,
 entry = tlb_entry(env, mmu_idx, addr);
 }
 tlb_addr = code_read ? entry->addr_code : entry->addr_read;
+tlb_addr &= ~TLB_INVALID_MASK;
 }
 
 /* Handle an IO access.  */
@@ -1265,27 +1269,6 @@ load_helper(CPUArchState *env, 

Re: [Qemu-devel] [PATCH v1 9/9] tcg: Check for watchpoints in probe_write()

2019-08-24 Thread Richard Henderson
On 8/23/19 3:07 AM, David Hildenbrand wrote:
> @@ -1071,8 +1072,23 @@ void probe_write(CPUArchState *env, target_ulong addr, 
>  if (!VICTIM_TLB_HIT(addr_write, addr)) {
>  tlb_fill(env_cpu(env), addr, size, MMU_DATA_STORE,
>   mmu_idx, retaddr);
> +/* TLB resize via tlb_fill may have moved the entry. */
> +entry = tlb_entry(env, mmu_idx, addr);
>  }
>  }
> +
> +if (!size) {
> +return;
> +}
> +tlb_addr = tlb_addr_write(entry);
> +
> +/* Watchpoints for this entry only apply if TLB_MMIO was set. */
> +if (tlb_addr & TLB_MMIO) {
> +MemTxAttrs attrs = env_tlb(env)->d[mmu_idx].iotlb[index].attrs;

We need to recompute index above as well, since we use it here.
Fixed up and applied to tcg-next.


r~



[Qemu-devel] [PATCH v29 5/8] target/avr: Add limited support for USART and 16 bit timer peripherals

2019-08-24 Thread Michael Rolnik
From: Sarah Harris 

These were designed to facilitate testing but should provide enough function to 
be useful in other contexts.
Only a subset of the functions of each peripheral is implemented, mainly due to 
the lack of a standard way to handle electrical connections (like GPIO pins).

Signed-off-by: Michael Rolnik 
---
 hw/char/Kconfig|   3 +
 hw/char/Makefile.objs  |   1 +
 hw/char/avr_usart.c| 324 ++
 hw/misc/Kconfig|   3 +
 hw/misc/Makefile.objs  |   2 +
 hw/misc/avr_mask.c | 112 ++
 hw/timer/Kconfig   |   3 +
 hw/timer/Makefile.objs |   1 +
 hw/timer/avr_timer16.c | 605 +
 include/hw/char/avr_usart.h|  97 ++
 include/hw/misc/avr_mask.h |  47 +++
 include/hw/timer/avr_timer16.h |  97 ++
 12 files changed, 1295 insertions(+)
 create mode 100644 hw/char/avr_usart.c
 create mode 100644 hw/misc/avr_mask.c
 create mode 100644 hw/timer/avr_timer16.c
 create mode 100644 include/hw/char/avr_usart.h
 create mode 100644 include/hw/misc/avr_mask.h
 create mode 100644 include/hw/timer/avr_timer16.h

diff --git a/hw/char/Kconfig b/hw/char/Kconfig
index 40e7a8b8bb..331b20983f 100644
--- a/hw/char/Kconfig
+++ b/hw/char/Kconfig
@@ -46,3 +46,6 @@ config SCLPCONSOLE
 
 config TERMINAL3270
 bool
+
+config AVR_USART
+bool
diff --git a/hw/char/Makefile.objs b/hw/char/Makefile.objs
index 02d8a66925..09ed50f1d0 100644
--- a/hw/char/Makefile.objs
+++ b/hw/char/Makefile.objs
@@ -21,6 +21,7 @@ obj-$(CONFIG_PSERIES) += spapr_vty.o
 obj-$(CONFIG_DIGIC) += digic-uart.o
 obj-$(CONFIG_STM32F2XX_USART) += stm32f2xx_usart.o
 obj-$(CONFIG_RASPI) += bcm2835_aux.o
+obj-$(CONFIG_AVR_USART) += avr_usart.o
 
 common-obj-$(CONFIG_CMSDK_APB_UART) += cmsdk-apb-uart.o
 common-obj-$(CONFIG_ETRAXFS) += etraxfs_ser.o
diff --git a/hw/char/avr_usart.c b/hw/char/avr_usart.c
new file mode 100644
index 00..9ca3c2a1cd
--- /dev/null
+++ b/hw/char/avr_usart.c
@@ -0,0 +1,324 @@
+/*
+ * AVR USART
+ *
+ * Copyright (c) 2018 University of Kent
+ * Author: Sarah Harris
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to 
deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING 
FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+
+#include "qemu/osdep.h"
+#include "hw/char/avr_usart.h"
+#include "qemu/log.h"
+#include "hw/irq.h"
+#include "hw/qdev-properties.h"
+
+static int avr_usart_can_receive(void *opaque)
+{
+AVRUsartState *usart = opaque;
+
+if (usart->data_valid || !(usart->csrb & USART_CSRB_RXEN)) {
+return 0;
+}
+return 1;
+}
+
+static void avr_usart_receive(void *opaque, const uint8_t *buffer, int size)
+{
+AVRUsartState *usart = opaque;
+assert(size == 1);
+assert(!usart->data_valid);
+usart->data = buffer[0];
+usart->data_valid = true;
+usart->csra |= USART_CSRA_RXC;
+if (usart->csrb & USART_CSRB_RXCIE) {
+qemu_set_irq(usart->rxc_irq, 1);
+}
+}
+
+static void update_char_mask(AVRUsartState *usart)
+{
+uint8_t mode = ((usart->csrc & USART_CSRC_CSZ0) ? 1 : 0) |
+((usart->csrc & USART_CSRC_CSZ1) ? 2 : 0) |
+((usart->csrb & USART_CSRB_CSZ2) ? 4 : 0);
+switch (mode) {
+case 0:
+usart->char_mask = 0b1;
+break;
+case 1:
+usart->char_mask = 0b11;
+break;
+case 2:
+usart->char_mask = 0b111;
+break;
+case 3:
+usart->char_mask = 0b;
+break;
+case 4:
+/* Fallthrough. */
+case 5:
+/* Fallthrough. */
+case 6:
+qemu_log_mask(
+LOG_GUEST_ERROR,
+"%s: Reserved character size 0x%x\n",
+__func__,
+mode);
+break;
+case 7:
+qemu_log_mask(
+LOG_GUEST_ERROR,
+"%s: Nine bit character size not supported (forcing eight)\n",
+__func__);
+usart->char_mask = 0b;
+break;
+default:
+assert(0);
+}
+}
+
+static 

[Qemu-devel] [PATCH v29 7/8] target/avr: Register AVR support with the rest of QEMU, the build system, and the MAINTAINERS file

2019-08-24 Thread Michael Rolnik
Signed-off-by: Michael Rolnik 
---
 MAINTAINERS |  6 ++
 arch_init.c |  2 ++
 configure   |  7 +++
 default-configs/avr-softmmu.mak |  5 +
 include/disas/dis-asm.h |  6 ++
 include/sysemu/arch_init.h  |  1 +
 qapi/machine.json   |  3 ++-
 target/avr/Makefile.objs| 33 +
 tests/machine-none-test.c   |  1 +
 9 files changed, 63 insertions(+), 1 deletion(-)
 create mode 100644 default-configs/avr-softmmu.mak
 create mode 100644 target/avr/Makefile.objs

diff --git a/MAINTAINERS b/MAINTAINERS
index ef6c01084b..763370e6ff 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -163,6 +163,12 @@ S: Maintained
 F: hw/arm/smmu*
 F: include/hw/arm/smmu*
 
+AVR TCG CPUs
+M: Michael Rolnik 
+S: Maintained
+F: target/avr/
+F: hw/avr/
+
 CRIS TCG CPUs
 M: Edgar E. Iglesias 
 S: Maintained
diff --git a/arch_init.c b/arch_init.c
index 0a1531124c..fb308aa802 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -85,6 +85,8 @@ int graphic_depth = 32;
 #define QEMU_ARCH QEMU_ARCH_UNICORE32
 #elif defined(TARGET_XTENSA)
 #define QEMU_ARCH QEMU_ARCH_XTENSA
+#elif defined(TARGET_AVR)
+#define QEMU_ARCH QEMU_ARCH_AVR
 #endif
 
 const uint32_t arch_type = QEMU_ARCH;
diff --git a/configure b/configure
index e44e454c43..1dbecba329 100755
--- a/configure
+++ b/configure
@@ -7510,6 +7510,10 @@ case "$target_name" in
 target_compiler=$cross_cc_aarch64
 eval "target_compiler_cflags=\$cross_cc_cflags_${target_name}"
   ;;
+  avr)
+   gdb_xml_files="avr-cpu.xml"
+target_compiler=$cross_cc_avr
+  ;;
   cris)
 target_compiler=$cross_cc_cris
   ;;
@@ -7790,6 +7794,9 @@ for i in $ARCH $TARGET_BASE_ARCH ; do
   disas_config "ARM_A64"
 fi
   ;;
+  avr)
+disas_config "AVR"
+  ;;
   cris)
 disas_config "CRIS"
   ;;
diff --git a/default-configs/avr-softmmu.mak b/default-configs/avr-softmmu.mak
new file mode 100644
index 00..d1e1c28118
--- /dev/null
+++ b/default-configs/avr-softmmu.mak
@@ -0,0 +1,5 @@
+# Default configuration for avr-softmmu
+
+# Boards:
+#
+CONFIG_AVR_SAMPLE=y
diff --git a/include/disas/dis-asm.h b/include/disas/dis-asm.h
index e9c7dd8eb4..8bedce17ac 100644
--- a/include/disas/dis-asm.h
+++ b/include/disas/dis-asm.h
@@ -211,6 +211,12 @@ enum bfd_architecture
 #define bfd_mach_m32r  0  /* backwards compatibility */
   bfd_arch_mn10200,/* Matsushita MN10200 */
   bfd_arch_mn10300,/* Matsushita MN10300 */
+  bfd_arch_avr,   /* Atmel AVR microcontrollers.  */
+#define bfd_mach_avr1  1
+#define bfd_mach_avr2  2
+#define bfd_mach_avr3  3
+#define bfd_mach_avr4  4
+#define bfd_mach_avr5  5
   bfd_arch_cris,   /* Axis CRIS */
 #define bfd_mach_cris_v0_v10   255
 #define bfd_mach_cris_v32  32
diff --git a/include/sysemu/arch_init.h b/include/sysemu/arch_init.h
index 62c6fe4cf1..893df26ce2 100644
--- a/include/sysemu/arch_init.h
+++ b/include/sysemu/arch_init.h
@@ -24,6 +24,7 @@ enum {
 QEMU_ARCH_NIOS2 = (1 << 17),
 QEMU_ARCH_HPPA = (1 << 18),
 QEMU_ARCH_RISCV = (1 << 19),
+QEMU_ARCH_AVR = (1 << 20),
 };
 
 extern const uint32_t arch_type;
diff --git a/qapi/machine.json b/qapi/machine.json
index de5c742d72..a82ba088f4 100644
--- a/qapi/machine.json
+++ b/qapi/machine.json
@@ -21,11 +21,12 @@
 #is true even for "qemu-system-x86_64".
 #
 # ppcemb: dropped in 3.1
+# avr: addin in 4.2
 #
 # Since: 3.0
 ##
 { 'enum' : 'SysEmuTarget',
-  'data' : [ 'aarch64', 'alpha', 'arm', 'cris', 'hppa', 'i386', 'lm32',
+  'data' : [ 'aarch64', 'alpha', 'arm', 'avr', 'cris', 'hppa', 'i386', 'lm32',
  'm68k', 'microblaze', 'microblazeel', 'mips', 'mips64',
  'mips64el', 'mipsel', 'moxie', 'nios2', 'or1k', 'ppc',
  'ppc64', 'riscv32', 'riscv64', 's390x', 'sh4',
diff --git a/target/avr/Makefile.objs b/target/avr/Makefile.objs
new file mode 100644
index 00..2976affd95
--- /dev/null
+++ b/target/avr/Makefile.objs
@@ -0,0 +1,33 @@
+#
+#  QEMU AVR CPU
+#
+#  Copyright (c) 2019 Michael Rolnik
+#
+#  This library is free software; you can redistribute it and/or
+#  modify it under the terms of the GNU Lesser General Public
+#  License as published by the Free Software Foundation; either
+#  version 2.1 of the License, or (at your option) any later version.
+#
+#  This library is distributed in the hope that it will be useful,
+#  but WITHOUT ANY WARRANTY; without even the implied warranty of
+#  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+#  Lesser General Public License for more details.
+#
+#  You should have received a copy of the GNU Lesser General Public
+#  License along with this library; if not, see
+#  
+#
+
+DECODETREE = $(SRC_PATH)/scripts/decodetree.py
+decode-y = $(SRC_PATH)/target/avr/insn.decode
+
+target/avr/decode_insn.inc.c: $(decode-y) $(DECODETREE)
+   $(call quiet-command, \
+ 

[Qemu-devel] [PATCH v29 3/8] target/avr: Add instruction decoding

2019-08-24 Thread Michael Rolnik
This includes:
- encoding of all 16 bit instructions
- encoding of all 32 bit instructions

Signed-off-by: Michael Rolnik 
---
 target/avr/insn.decode | 175 +
 1 file changed, 175 insertions(+)
 create mode 100644 target/avr/insn.decode

diff --git a/target/avr/insn.decode b/target/avr/insn.decode
new file mode 100644
index 00..6b387762c6
--- /dev/null
+++ b/target/avr/insn.decode
@@ -0,0 +1,175 @@
+#
+#   A = [16 .. 31]
+#   B = [16 .. 23]
+#   C = [24, 26, 28, 30]
+#   D = [0, 2, 4, 6, 8, .. 30]
+
+%rd 4:5
+%rr 9:1 0:4
+
+_rr  rd rr
+_imm rd imm
+
+@op_rd_rr    .. . . _rr  rd=%rd rr=%rr
+ADD  11 . . @op_rd_rr
+ADC 0001 11 . . @op_rd_rr
+AND 0010 00 . . @op_rd_rr
+CP  0001 01 . . @op_rd_rr
+CPC  01 . . @op_rd_rr
+CPSE0001 00 . . @op_rd_rr
+EOR 0010 01 . . @op_rd_rr
+MOV 0010 11 . . @op_rd_rr
+MUL 1001 11 . . @op_rd_rr
+OR  0010 10 . . @op_rd_rr
+SBC  10 . . @op_rd_rr
+SUB 0001 10 . . @op_rd_rr
+
+
+%rd_c   4:2 !function=to_C
+%imm6   6:2 0:4
+
+@op_rd_imm6   .. .. _imm rd=%rd_c imm=%imm6
+ADIW1001 0110 .. .. @op_rd_imm6
+SBIW1001 0111 .. .. @op_rd_imm6
+
+
+%rd_a   4:4 !function=to_A
+%rr_a   0:4 !function=to_A
+%rd_d   4:4 !function=to_D
+%rr_d   0:4 !function=to_D
+%imm8   8:4 0:4
+
+@op_rd_imm8     _imm rd=%rd_a imm=%imm8
+ANDI0111    @op_rd_imm8
+CPI 0011    @op_rd_imm8
+LDI 1110    @op_rd_imm8
+ORI 0110    @op_rd_imm8
+SBCI0100    @op_rd_imm8
+SUBI0101    @op_rd_imm8
+
+
+@op_rd   ... rd:5 
+ASR 1001 010 . 0101 @op_rd
+COM 1001 010 .  @op_rd
+DEC 1001 010 . 1010 @op_rd
+ELPM2   1001 000 . 0110 @op_rd
+ELPMX   1001 000 . 0111 @op_rd
+INC 1001 010 . 0011 @op_rd
+LDX11001 000 . 1100 @op_rd
+LDX21001 000 . 1101 @op_rd
+LDX31001 000 . 1110 @op_rd
+LDY21001 000 . 1001 @op_rd
+LDY31001 000 . 1010 @op_rd
+LDZ21001 000 . 0001 @op_rd
+LDZ31001 000 . 0010 @op_rd
+LPM21001 000 . 0100 @op_rd
+LPMX1001 000 . 0101 @op_rd
+LSR 1001 010 . 0110 @op_rd
+NEG 1001 010 . 0001 @op_rd
+POP 1001 000 .  @op_rd
+PUSH1001 001 .  @op_rd
+ROR 1001 010 . 0111 @op_rd
+STY21001 001 . 1001 @op_rd
+STY31001 001 . 1010 @op_rd
+STZ21001 001 . 0001 @op_rd
+STZ31001 001 . 0010 @op_rd
+SWAP1001 010 . 0010 @op_rd
+
+
+@op_bit   . bit:3 
+BCLR1001 0100 1 ... 1000@op_bit
+BSET1001 0100 0 ... 1000@op_bit
+
+
+@op_rd_bit   ... rd:5 . bit:3
+BLD  100 . 0 ...@op_rd_bit
+BST  101 . 0 ...@op_rd_bit
+
+
+@op_bit_imm  .. imm:s7 bit:3
+BRBC 01 ... ... @op_bit_imm
+BRBS 00 ... ... @op_bit_imm
+
+
+BREAK   1001 0101 1001 1000
+EICALL  1001 0101 0001 1001
+EIJMP   1001 0100 0001 1001
+ELPM1   1001 0101 1101 1000
+ICALL   1001 0101  1001
+IJMP1001 0100  1001
+LPM11001 0101 1100 1000
+NOP    
+RET 1001 0101  1000
+RETI1001 0101 0001 1000
+SLEEP   1001 0101 1000 1000
+SPM 1001 0101 1110 1000
+SPMX1001 0101  1000
+WDR 1001 0101 1010 1000
+
+
+@op_reg_bit   reg:5 bit:3
+CBI 1001 1000 . ... @op_reg_bit
+SBI 1001 1010 . ... @op_reg_bit
+SBIC1001 1001 . ... @op_reg_bit
+SBIS1001 1011 . ... @op_reg_bit
+
+
+DES 1001 0100 imm:4 1011
+
+
+%rd_b   4:3   

[Qemu-devel] [PATCH v29 8/8] target/avr: Add tests

2019-08-24 Thread Michael Rolnik
1. Avocado test
The test is based on
https://github.com/seharris/qemu-avr-tests/tree/master/free-rtos/Demo
demo which. If working correctly, prints 'ABCDEFGHIJKLMNOPQRSTUVWX' out.
it also demostrates that timer and IRQ are working

2. Boot serial test
Prinit out 'T' through serial port

Signed-off-by: Michael Rolnik 
Reviewed-by: Philippe Mathieu-Daudé 
Tested-by: Philippe Mathieu-Daudé 
---
 tests/Makefile.include   |  2 ++
 tests/acceptance/machine_avr6.py | 36 
 tests/boot-serial-test.c | 10 +
 3 files changed, 48 insertions(+)
 create mode 100644 tests/acceptance/machine_avr6.py

diff --git a/tests/Makefile.include b/tests/Makefile.include
index 49684fd4f4..dbaeb87572 100644
--- a/tests/Makefile.include
+++ b/tests/Makefile.include
@@ -202,6 +202,8 @@ check-qtest-i386-y += tests/test-x86-cpuid-compat$(EXESUF)
 check-qtest-i386-y += tests/numa-test$(EXESUF)
 check-qtest-x86_64-y += $(check-qtest-i386-y)
 
+check-qtest-avr-y += tests/boot-serial-test$(EXESUF)
+
 check-qtest-alpha-y += tests/boot-serial-test$(EXESUF)
 check-qtest-alpha-$(CONFIG_VGA) += tests/display-vga-test$(EXESUF)
 
diff --git a/tests/acceptance/machine_avr6.py b/tests/acceptance/machine_avr6.py
new file mode 100644
index 00..0601080b01
--- /dev/null
+++ b/tests/acceptance/machine_avr6.py
@@ -0,0 +1,36 @@
+import logging
+import time
+import distutils.spawn
+
+from avocado import skipUnless
+from avocado_qemu import Test
+from avocado.utils import process
+
+class AVR6Machine(Test):
+timeout = 5
+
+def test_freertos(self):
+"""
+:avocado: tags=arch:avr
+:avocado: tags=machine:sample
+"""
+"""
+
https://github.com/seharris/qemu-avr-tests/raw/master/free-rtos/Demo/AVR_ATMega2560_GCC/demo.elf
+constantly prints out 
'ABCDEFGHIJKLMNOPQRSTUVWXABCDEFGHIJKLMNOPQRSTUVWX'
+"""
+rom_url = 'https://github.com/seharris/qemu-avr-tests'
+rom_url += '/raw/master/free-rtos/Demo/AVR_ATMega2560_GCC/demo.elf'
+rom_hash = '7eb521f511ca8f2622e0a3c5e8dd686efbb911d4'
+rom_path = self.fetch_asset(rom_url, asset_hash=rom_hash)
+
+self.vm.set_machine('sample')
+self.vm.add_args('-bios', rom_path)
+self.vm.add_args('-nographic')
+self.vm.launch()
+
+time.sleep(2)
+self.vm.shutdown()
+
+match = 'ABCDEFGHIJKLMNOPQRSTUVWXABCDEFGHIJKLMNOPQRSTUVWX'
+
+self.assertIn(match, self.vm.get_log())
diff --git a/tests/boot-serial-test.c b/tests/boot-serial-test.c
index 24852d4c7d..22cbaccc1b 100644
--- a/tests/boot-serial-test.c
+++ b/tests/boot-serial-test.c
@@ -16,6 +16,15 @@
 #include "qemu/osdep.h"
 #include "libqtest.h"
 
+static const uint8_t bios_avr[] = {
+0x88, 0xe0, /* ldi r24, 0x08   */
+0x80, 0x93, 0xc1, 0x00, /* sts 0x00C1, r24 ; Enable tx */
+0x86, 0xe0, /* ldi r24, 0x06   */
+0x80, 0x93, 0xc2, 0x00, /* sts 0x00C2, r24 ; Set the data bits to 8 */
+0x84, 0xe5, /* ldi r24, 0x54   */
+0x80, 0x93, 0xc6, 0x00, /* sts 0x00C6, r24 ; Output 'T' */
+};
+
 static const uint8_t kernel_mcf5208[] = {
 0x41, 0xf9, 0xfc, 0x06, 0x00, 0x00, /* lea 0xfc06,%a0 */
 0x10, 0x3c, 0x00, 0x54, /* move.b #'T',%d0 */
@@ -92,6 +101,7 @@ typedef struct testdef {
 
 static testdef_t tests[] = {
 { "alpha", "clipper", "", "PCI:" },
+{ "avr", "sample", "", "T", sizeof(bios_avr), NULL, bios_avr },
 { "ppc", "ppce500", "", "U-Boot" },
 { "ppc", "40p", "-vga none -boot d", "Trying cd:," },
 { "ppc", "g3beige", "", "PowerPC,750" },
-- 
2.17.2 (Apple Git-113)




[Qemu-devel] [PATCH v29 4/8] target/avr: Add instruction translation

2019-08-24 Thread Michael Rolnik
This includes:
- TCG translations for each instruction

Signed-off-by: Michael Rolnik 
---
 target/avr/translate.c | 2888 
 1 file changed, 2888 insertions(+)
 create mode 100644 target/avr/translate.c

diff --git a/target/avr/translate.c b/target/avr/translate.c
new file mode 100644
index 00..42cb4a690c
--- /dev/null
+++ b/target/avr/translate.c
@@ -0,0 +1,2888 @@
+/*
+ * QEMU AVR CPU
+ *
+ * Copyright (c) 2019 Michael Rolnik
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see
+ * 
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/qemu-print.h"
+#include "tcg/tcg.h"
+#include "cpu.h"
+#include "exec/exec-all.h"
+#include "disas/disas.h"
+#include "tcg-op.h"
+#include "exec/cpu_ldst.h"
+#include "exec/helper-proto.h"
+#include "exec/helper-gen.h"
+#include "exec/log.h"
+#include "exec/gdbstub.h"
+#include "exec/translator.h"
+
+/*
+ *  Define if you want a BREAK instruction translated to a breakpoint
+ *  Active debugging connection is assumed
+ *  This is for
+ *  https://github.com/seharris/qemu-avr-tests/tree/master/instruction-tests
+ *  tests
+ */
+#undef BREAKPOINT_ON_BREAK
+
+static TCGv cpu_pc;
+
+static TCGv cpu_Cf;
+static TCGv cpu_Zf;
+static TCGv cpu_Nf;
+static TCGv cpu_Vf;
+static TCGv cpu_Sf;
+static TCGv cpu_Hf;
+static TCGv cpu_Tf;
+static TCGv cpu_If;
+
+static TCGv cpu_rampD;
+static TCGv cpu_rampX;
+static TCGv cpu_rampY;
+static TCGv cpu_rampZ;
+
+static TCGv cpu_r[NO_CPU_REGISTERS];
+static TCGv cpu_eind;
+static TCGv cpu_sp;
+
+static TCGv cpu_skip;
+
+static const char reg_names[NO_CPU_REGISTERS][8] = {
+"r0",  "r1",  "r2",  "r3",  "r4",  "r5",  "r6",  "r7",
+"r8",  "r9",  "r10", "r11", "r12", "r13", "r14", "r15",
+"r16", "r17", "r18", "r19", "r20", "r21", "r22", "r23",
+"r24", "r25", "r26", "r27", "r28", "r29", "r30", "r31",
+};
+#define REG(x) (cpu_r[x])
+
+enum {
+DISAS_EXIT   = DISAS_TARGET_0,  /* We want return to the cpu main loop.  */
+DISAS_LOOKUP = DISAS_TARGET_1,  /* We have a variable condition exit.  */
+DISAS_CHAIN  = DISAS_TARGET_2,  /* We have a single condition exit.  */
+};
+
+typedef struct DisasContext DisasContext;
+
+/* This is the state at translation time. */
+struct DisasContext {
+TranslationBlock *tb;
+
+CPUAVRState *env;
+CPUState *cs;
+
+target_long npc;
+uint32_t opcode;
+
+/* Routine used to access memory */
+int memidx;
+int bstate;
+int singlestep;
+
+TCGv skip_var0;
+TCGv skip_var1;
+TCGCond skip_cond;
+bool free_skip_var0;
+};
+
+static int to_A(DisasContext *ctx, int indx) { return 16 + (indx % 16); }
+static int to_B(DisasContext *ctx, int indx) { return 16 + (indx % 8); }
+static int to_C(DisasContext *ctx, int indx) { return 24 + (indx % 4) * 2; }
+static int to_D(DisasContext *ctx, int indx) { return (indx % 16) * 2; }
+
+static uint16_t next_word(DisasContext *ctx)
+{
+return cpu_lduw_code(ctx->env, ctx->npc++ * 2);
+}
+
+static int append_16(DisasContext *ctx, int x)
+{
+return x << 16 | next_word(ctx);
+}
+
+static bool decode_insn(DisasContext *ctx, uint16_t insn);
+#include "decode_insn.inc.c"
+
+static bool avr_have_feature(DisasContext *ctx, int feature)
+{
+if (!avr_feature(ctx->env, feature)) {
+gen_helper_unsupported(cpu_env);
+ctx->bstate = DISAS_NORETURN;
+return false;
+}
+return true;
+}
+
+static void gen_goto_tb(DisasContext *ctx, int n, target_ulong dest)
+{
+TranslationBlock *tb = ctx->tb;
+
+if (ctx->singlestep == 0) {
+tcg_gen_goto_tb(n);
+tcg_gen_movi_i32(cpu_pc, dest);
+tcg_gen_exit_tb(tb, n);
+} else {
+tcg_gen_movi_i32(cpu_pc, dest);
+gen_helper_debug(cpu_env);
+tcg_gen_exit_tb(NULL, 0);
+}
+ctx->bstate = DISAS_NORETURN;
+}
+
+#include "exec/gen-icount.h"
+
+static void gen_add_CHf(TCGv R, TCGv Rd, TCGv Rr)
+{
+TCGv t1 = tcg_temp_new_i32();
+TCGv t2 = tcg_temp_new_i32();
+TCGv t3 = tcg_temp_new_i32();
+
+tcg_gen_and_tl(t1, Rd, Rr); /* t1 = Rd & Rr */
+tcg_gen_andc_tl(t2, Rd, R); /* t2 = Rd & ~R */
+tcg_gen_andc_tl(t3, Rr, R); /* t3 = Rr & ~R */
+tcg_gen_or_tl(t1, t1, t2); /* t1 = t1 | t2 | t3 */
+tcg_gen_or_tl(t1, t1, t3);
+
+tcg_gen_shri_tl(cpu_Cf, t1, 7); /* Cf = t1(7) */
+tcg_gen_shri_tl(cpu_Hf, t1, 3); /* Hf = 

[Qemu-devel] [PATCH v29 1/8] target/avr: Add outward facing interfaces and core CPU logic

2019-08-24 Thread Michael Rolnik
From: Sarah Harris 

This includes:
- CPU data structures
- object model classes and functions
- migration functions
- GDB hooks

Signed-off-by: Michael Rolnik 
Acked-by: Igor Mammedov 
---
 gdb-xml/avr-cpu.xml|  49 
 target/avr/cpu-param.h |  37 +++
 target/avr/cpu-qom.h   |  54 
 target/avr/cpu.c   | 576 +
 target/avr/cpu.h   | 254 ++
 target/avr/gdbstub.c   |  85 ++
 target/avr/machine.c   | 121 +
 7 files changed, 1176 insertions(+)
 create mode 100644 gdb-xml/avr-cpu.xml
 create mode 100644 target/avr/cpu-param.h
 create mode 100644 target/avr/cpu-qom.h
 create mode 100644 target/avr/cpu.c
 create mode 100644 target/avr/cpu.h
 create mode 100644 target/avr/gdbstub.c
 create mode 100644 target/avr/machine.c

diff --git a/gdb-xml/avr-cpu.xml b/gdb-xml/avr-cpu.xml
new file mode 100644
index 00..c4747f5b40
--- /dev/null
+++ b/gdb-xml/avr-cpu.xml
@@ -0,0 +1,49 @@
+
+
+
+
+
+
+
+  
+  
+  
+  
+  
+  
+  
+  
+  
+  
+  
+  
+  
+  
+  
+  
+  
+  
+  
+  
+  
+  
+  
+  
+  
+  
+  
+  
+  
+  
+  
+  
+  
+  
+  
+
diff --git a/target/avr/cpu-param.h b/target/avr/cpu-param.h
new file mode 100644
index 00..ccd1ea3429
--- /dev/null
+++ b/target/avr/cpu-param.h
@@ -0,0 +1,37 @@
+/*
+ * QEMU AVR CPU
+ *
+ * Copyright (c) 2019 Michael Rolnik
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see
+ * 
+ */
+
+#ifndef AVR_CPU_PARAM_H
+#define AVR_CPU_PARAM_H 1
+
+#define TARGET_LONG_BITS 32
+/*
+ * TARGET_PAGE_BITS cannot be more than 8 bits because
+ * 1.  all IO registers occupy [0x .. 0x00ff] address range, and they
+ * should be implemented as a device and not memory
+ * 2.  SRAM starts at the address 0x0100
+ */
+#define TARGET_PAGE_BITS 8
+#define TARGET_PHYS_ADDR_SPACE_BITS 24
+#define TARGET_VIRT_ADDR_SPACE_BITS 24
+#define NB_MMU_MODES 2
+
+
+#endif
diff --git a/target/avr/cpu-qom.h b/target/avr/cpu-qom.h
new file mode 100644
index 00..e28b58c897
--- /dev/null
+++ b/target/avr/cpu-qom.h
@@ -0,0 +1,54 @@
+/*
+ * QEMU AVR CPU
+ *
+ * Copyright (c) 2019 Michael Rolnik
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see
+ * 
+ */
+
+#ifndef QEMU_AVR_QOM_H
+#define QEMU_AVR_QOM_H
+
+#include "hw/core/cpu.h"
+
+#define TYPE_AVR_CPU "avr-cpu"
+
+#define AVR_CPU_CLASS(klass) \
+OBJECT_CLASS_CHECK(AVRCPUClass, (klass), TYPE_AVR_CPU)
+#define AVR_CPU(obj) \
+OBJECT_CHECK(AVRCPU, (obj), TYPE_AVR_CPU)
+#define AVR_CPU_GET_CLASS(obj) \
+OBJECT_GET_CLASS(AVRCPUClass, (obj), TYPE_AVR_CPU)
+
+/**
+ *  AVRCPUClass:
+ *  @parent_realize: The parent class' realize handler.
+ *  @parent_reset: The parent class' reset handler.
+ *  @vr: Version Register value.
+ *
+ *  A AVR CPU model.
+ */
+typedef struct AVRCPUClass {
+/*< private >*/
+CPUClass parent_class;
+/*< public >*/
+DeviceRealize parent_realize;
+void (*parent_reset)(CPUState *cpu);
+} AVRCPUClass;
+
+typedef struct AVRCPU AVRCPU;
+
+
+#endif /* !defined (QEMU_AVR_CPU_QOM_H) */
diff --git a/target/avr/cpu.c b/target/avr/cpu.c
new file mode 100644
index 00..dae56d7845
--- /dev/null
+++ b/target/avr/cpu.c
@@ -0,0 +1,576 @@
+/*
+ * QEMU AVR CPU
+ *
+ * Copyright (c) 2019 Michael Rolnik
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the 

[Qemu-devel] [PATCH v29 2/8] target/avr: Add instruction helpers

2019-08-24 Thread Michael Rolnik
From: Sarah Harris 

Stubs for unimplemented instructions and helpers for instructions that need to 
interact with QEMU.
SPM and WDR are unimplemented because they require emulation of complex 
peripherals.
The implementation of SLEEP is very limited due to the lack of peripherals to 
generate wake interrupts.
Memory access instructions are implemented here because some address ranges 
actually refer to CPU registers.

Signed-off-by: Michael Rolnik 
---
 target/avr/helper.c | 354 
 target/avr/helper.h |  29 
 2 files changed, 383 insertions(+)
 create mode 100644 target/avr/helper.c
 create mode 100644 target/avr/helper.h

diff --git a/target/avr/helper.c b/target/avr/helper.c
new file mode 100644
index 00..f0f0d4f15a
--- /dev/null
+++ b/target/avr/helper.c
@@ -0,0 +1,354 @@
+/*
+ * QEMU AVR CPU
+ *
+ * Copyright (c) 2019 Michael Rolnik
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see
+ * 
+ */
+
+#include "qemu/osdep.h"
+
+#include "cpu.h"
+#include "hw/irq.h"
+#include "hw/sysbus.h"
+#include "sysemu/sysemu.h"
+#include "exec/exec-all.h"
+#include "exec/cpu_ldst.h"
+#include "exec/helper-proto.h"
+#include "exec/ioport.h"
+#include "qemu/host-utils.h"
+#include "qemu/error-report.h"
+
+bool avr_cpu_exec_interrupt(CPUState *cs, int interrupt_request)
+{
+bool ret = false;
+CPUClass *cc = CPU_GET_CLASS(cs);
+AVRCPU *cpu = AVR_CPU(cs);
+CPUAVRState *env = >env;
+
+if (interrupt_request & CPU_INTERRUPT_RESET) {
+if (cpu_interrupts_enabled(env)) {
+cs->exception_index = EXCP_RESET;
+cc->do_interrupt(cs);
+
+cs->interrupt_request &= ~CPU_INTERRUPT_RESET;
+
+ret = true;
+}
+}
+if (interrupt_request & CPU_INTERRUPT_HARD) {
+if (cpu_interrupts_enabled(env) && env->intsrc != 0) {
+int index = ctz32(env->intsrc);
+cs->exception_index = EXCP_INT(index);
+cc->do_interrupt(cs);
+
+env->intsrc &= env->intsrc - 1; /* clear the interrupt */
+cs->interrupt_request &= ~CPU_INTERRUPT_HARD;
+
+ret = true;
+}
+}
+return ret;
+}
+
+void avr_cpu_do_interrupt(CPUState *cs)
+{
+AVRCPU *cpu = AVR_CPU(cs);
+CPUAVRState *env = >env;
+
+uint32_t ret = env->pc_w;
+int vector = 0;
+int size = avr_feature(env, AVR_FEATURE_JMP_CALL) ? 2 : 1;
+int base = 0;
+
+if (cs->exception_index == EXCP_RESET) {
+vector = 0;
+} else if (env->intsrc != 0) {
+vector = ctz32(env->intsrc) + 1;
+}
+
+if (avr_feature(env, AVR_FEATURE_3_BYTE_PC)) {
+cpu_stb_data(env, env->sp--, (ret & 0xff));
+cpu_stb_data(env, env->sp--, (ret & 0x00ff00) >> 8);
+cpu_stb_data(env, env->sp--, (ret & 0xff) >> 16);
+} else if (avr_feature(env, AVR_FEATURE_2_BYTE_PC)) {
+cpu_stb_data(env, env->sp--, (ret & 0xff));
+cpu_stb_data(env, env->sp--, (ret & 0x00ff00) >> 8);
+} else {
+cpu_stb_data(env, env->sp--, (ret & 0xff));
+}
+
+env->pc_w = base + vector * size;
+env->sregI = 0; /* clear Global Interrupt Flag */
+
+cs->exception_index = -1;
+}
+
+int avr_cpu_memory_rw_debug(CPUState *cs, vaddr addr, uint8_t *buf,
+int len, bool is_write)
+{
+return cpu_memory_rw_debug(cs, addr, buf, len, is_write);
+}
+
+hwaddr avr_cpu_get_phys_page_debug(CPUState *cs, vaddr addr)
+{
+return addr; /* I assume 1:1 address correspondance */
+}
+
+int avr_cpu_handle_mmu_fault(
+CPUState *cs, vaddr address, int size, int rw, int mmu_idx)
+{
+/* currently it's assumed that this will never happen */
+cs->exception_index = EXCP_DEBUG;
+cpu_dump_state(cs, stderr, 0);
+return 1;
+}
+
+bool avr_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
+MMUAccessType access_type, int mmu_idx,
+bool probe, uintptr_t retaddr)
+{
+int prot = 0;
+MemTxAttrs attrs = {};
+uint32_t paddr;
+
+address &= TARGET_PAGE_MASK;
+
+if (mmu_idx == MMU_CODE_IDX) {
+/* access to code in flash */
+paddr = OFFSET_CODE + address;
+prot = PAGE_READ | PAGE_EXEC;
+if (paddr + TARGET_PAGE_SIZE > OFFSET_DATA) {
+error_report("execution left flash memory");

[Qemu-devel] [PATCH v29 6/8] target/avr: Add example board configuration

2019-08-24 Thread Michael Rolnik
From: Sarah Harris 

A simple board setup that configures an AVR CPU to run a given firmware image.
This is all that's useful to implement without peripheral emulation as AVR CPUs 
include a lot of on-board peripherals.

NOTE: this is not a real board 
NOTE: it's used for CPU testing

Signed-off-by: Michael Rolnik 
---
 hw/Kconfig   |   1 +
 hw/avr/Kconfig   |   6 +
 hw/avr/Makefile.objs |   1 +
 hw/avr/sample.c  | 282 +++
 4 files changed, 290 insertions(+)
 create mode 100644 hw/avr/Kconfig
 create mode 100644 hw/avr/Makefile.objs
 create mode 100644 hw/avr/sample.c

diff --git a/hw/Kconfig b/hw/Kconfig
index b45db3c813..f185fcb19e 100644
--- a/hw/Kconfig
+++ b/hw/Kconfig
@@ -42,6 +42,7 @@ source watchdog/Kconfig
 # arch Kconfig
 source arm/Kconfig
 source alpha/Kconfig
+source avr/Kconfig
 source cris/Kconfig
 source hppa/Kconfig
 source i386/Kconfig
diff --git a/hw/avr/Kconfig b/hw/avr/Kconfig
new file mode 100644
index 00..92aa1e6afb
--- /dev/null
+++ b/hw/avr/Kconfig
@@ -0,0 +1,6 @@
+config AVR_SAMPLE
+bool
+select AVR_TIMER16
+select AVR_USART
+select AVR_MASK
+select UNIMP
diff --git a/hw/avr/Makefile.objs b/hw/avr/Makefile.objs
new file mode 100644
index 00..626b7064b3
--- /dev/null
+++ b/hw/avr/Makefile.objs
@@ -0,0 +1 @@
+obj-y += sample.o
diff --git a/hw/avr/sample.c b/hw/avr/sample.c
new file mode 100644
index 00..2295ec1b79
--- /dev/null
+++ b/hw/avr/sample.c
@@ -0,0 +1,282 @@
+/*
+ * QEMU AVR CPU
+ *
+ * Copyright (c) 2019 Michael Rolnik
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see
+ * 
+ */
+
+/*
+ *  NOTE:
+ *  This is not a real AVR board, this is an example!
+ *  The CPU is an approximation of an ATmega2560, but is missing various
+ *  built-in peripherals.
+ *
+ *  This example board loads provided binary file into flash memory and
+ *  executes it from 0x address in the code memory space.
+ *
+ *  Currently used for AVR CPU validation
+ *
+ */
+
+#include "qemu/osdep.h"
+#include "qapi/error.h"
+#include "qemu-common.h"
+#include "cpu.h"
+#include "hw/hw.h"
+#include "sysemu/sysemu.h"
+#include "sysemu/qtest.h"
+#include "ui/console.h"
+#include "hw/boards.h"
+#include "hw/loader.h"
+#include "qemu/error-report.h"
+#include "exec/address-spaces.h"
+#include "include/hw/sysbus.h"
+#include "include/hw/char/avr_usart.h"
+#include "include/hw/timer/avr_timer16.h"
+#include "include/hw/misc/avr_mask.h"
+#include "elf.h"
+#include "hw/misc/unimp.h"
+
+#define SIZE_FLASH 0x0004
+#define SIZE_SRAM 0x2000
+/*
+ * Size of additional "external" memory, as if the AVR were configured to use
+ * an external RAM chip.
+ * Note that the configuration registers that normally enable this feature are
+ * unimplemented.
+ */
+#define SIZE_EXMEM 0x
+
+/* Offsets of peripherals in emulated memory space (i.e. not host addresses)  
*/
+#define PRR0_BASE 0x64
+#define PRR1_BASE 0x65
+#define USART_BASE 0xc0
+#define TIMER1_BASE 0x80
+#define TIMER1_IMSK_BASE 0x6f
+#define TIMER1_IFR_BASE 0x36
+
+/* Interrupt numbers used by peripherals */
+#define USART_RXC_IRQ 24
+#define USART_DRE_IRQ 25
+#define USART_TXC_IRQ 26
+
+#define TIMER1_CAPT_IRQ 15
+#define TIMER1_COMPA_IRQ 16
+#define TIMER1_COMPB_IRQ 17
+#define TIMER1_COMPC_IRQ 18
+#define TIMER1_OVF_IRQ 19
+
+/*  Power reduction */
+#define PRR1_BIT_PRTIM5 0x05/*  Timer/Counter5  */
+#define PRR1_BIT_PRTIM4 0x04/*  Timer/Counter4  */
+#define PRR1_BIT_PRTIM3 0x03/*  Timer/Counter3  */
+#define PRR1_BIT_PRUSART3   0x02/*  USART3  */
+#define PRR1_BIT_PRUSART2   0x01/*  USART2  */
+#define PRR1_BIT_PRUSART1   0x00/*  USART1  */
+
+#define PRR0_BIT_PRTWI  0x06/*  TWI */
+#define PRR0_BIT_PRTIM2 0x05/*  Timer/Counter2  */
+#define PRR0_BIT_PRTIM0 0x04/*  Timer/Counter0  */
+#define PRR0_BIT_PRTIM1 0x03/*  Timer/Counter1  */
+#define PRR0_BIT_PRSPI  0x02/*  Serial Peripheral Interface */
+#define PRR0_BIT_PRUSART0   0x01/*  USART0  */
+#define PRR0_BIT_PRADC  0x00/*  ADC */
+
+typedef struct {
+MachineClass parent;
+} SampleMachineClass;
+
+typedef struct {
+MachineState parent;
+MemoryRegion *ram;
+MemoryRegion *flash;
+AVRUsartState *usart0;
+

[Qemu-devel] [PATCH v29 0/8] QEMU AVR 8 bit cores

2019-08-24 Thread Michael Rolnik
This series of patches adds 8bit AVR cores to QEMU.
All instruction, except BREAK/DES/SPM/SPMX, are implemented. Not fully tested 
yet.
However I was able to execute simple code with functions. e.g fibonacci 
calculation.
This series of patches include a non real, sample board.
No fuses support yet. PC is set to 0 at reset.

the patches include the following
1. just a basic 8bit AVR CPU, without instruction decoding or translation
2. CPU features which allow define the following 8bit AVR cores
 avr1
 avr2 avr25
 avr3 avr31 avr35
 avr4
 avr5 avr51
 avr6
 xmega2 xmega4 xmega5 xmega6 xmega7
3. a definition of sample machine with SRAM, FLASH and CPU which allows to 
execute simple code
4. encoding for all AVR instructions
5. interrupt handling
6. helpers for IN, OUT, SLEEP, WBR & unsupported instructions
7. a decoder which given an opcode decides what istruction it is
8. translation of AVR instruction into TCG
9. all features together

changes since v3
1. rampD/X/Y/Z registers are encoded as 0x00ff (instead of 0x00ff) for 
faster address manipulaton
2. ffs changed to ctz32
3. duplicate code removed at avr_cpu_do_interrupt
4. using andc instead of not + and
5. fixing V flag calculation in varios instructions
6. freeing local variables in PUSH
7. tcg_const_local_i32 -> tcg_const_i32
8. using sextract32 instead of my implementation
9. fixing BLD instruction
10.xor(r) instead of 0xff - r at COM
11.fixing MULS/MULSU not to modify inputs' content
12.using SUB for NEG
13.fixing tcg_gen_qemu_ld/st call in XCH

changes since v4
1. target is now defined as big endian in order to optimize push_ret/pop_ret
2. all style warnings are fixed
3. adding cpu_set/get_sreg functions
4. simplifying gen_goto_tb as there is no real paging
5. env->pc -> env->pc_w
6. making flag dump more compact
7. more spacing
8. renaming CODE/DATA_INDEX -> MMU_CODE/DATA_IDX
9. removing avr_set_feature
10. SPL/SPH set bug fix
11. switching stb_phys to cpu_stb_data
12. cleaning up avr_decode
13. saving sreg, rampD/X/Y/Z, eind in HW format (savevm)
14. saving CPU features (savevm)

changes since v5
1. BLD bug fix
2. decoder generator is added

chages since v6
1. using cpu_get_sreg/cpu_set_sreg in 
avr_cpu_gdb_read_register/avr_cpu_gdb_write_register
2. configure the target as little endian because otherwise GDB does not work
3. fixing and testing gen_push_ret/gen_pop_ret

changes since v7
1. folding back v6
2. logging at helper_outb and helper_inb are done for non supported yet 
registers only
3. MAINTAINERS updated

changes since v8
1. removing hw/avr from hw/Makefile.obj as it should not be built for all
2. making linux compilable
3. testing on
a. Mac, Apple LLVM version 7.0.0
b. Ubuntu 12.04, gcc 4.9.2
c. Fedora 23, gcc 5.3.1
4. folding back some patches
5. translation bug fixes for ORI, CPI, XOR instructions
6. propper handling of cpu register writes though memory

changes since v9
1. removing forward declarations of static functions
2. disabling debug prints
3. switching to case range instead of if else if ...
4. LD/ST IN/OUT accessing CPU maintainder registers are not routed to any device
5. commenst about sample board and sample IO device added
6. sample board description is more descriptive now
7. memory_region_allocate_system_memory is used to create RAM
8. now there are helper_fullrd & helper_fullwr when LD/ST try to access 
registers

changes since v10
1. movig back fullwr & fullrd into the commit where outb and inb were introduced
2. changing tlb_fill function signature
3. adding empty line between functions
4. adding newline on the last line of the file
5. using tb->flags to generae full access ST/LD instructions
6. fixing SBRC bug
7. folding back 10th commit
8. whenever a new file is introduced it's added to Makefile.objs

changes since v11
1. updating to v2.7.0-rc
2. removing assignment to env->fullacc from gen_intermediate_code

changes since v12
1. fixing spacing
2. fixing get/put_segment functions
3. removing target-avr/machine.h file
4. VMSTATE_SINGLE_TEST -> VMSTATE_SINGLE
5. comment spelling
6. removing hw/avr/sample_io.c
7. char const* -> const char*
8. proper ram allocation
9. fixing breakpoint functionality.
10.env1 -> env
11.fixing avr_cpu_gdb_write_register & avr_cpu_gdb_read_register functions
12.any cpu is removed
12.feature bits are not saved into vm state

changes since v13
1. rebasing to v2.7.0-rc1

changes since v14
1. I made self review with git gui tool. (I did not know such a thing exists)
2. removing all double/tripple spaces
3. removing comment reference to SampleIO
4. folding back some changes, so there is not deleted lines in my code
5. moving avr configuration, within configure file, before chris

changes since v15
1. removing IO registers cache from CPU
2. implementing CBI/SBI as read(helper_inb), modify, write(helper_outb)
3. implementing CBIC/SBIC as read(helper_inb), check, branch
4. adding missing tcg_temp_free_i32 for tcg_const_i32

changes since v16
1. 

[Qemu-devel] [PATCH v2 2/2] nbd: Tolerate more errors to structured reply request

2019-08-24 Thread Eric Blake
A server may have a reason to reject a request for structured replies,
beyond just not recognizing them as a valid request; similarly, it may
have a reason for rejecting a request for a meta context.  It doesn't
hurt us to continue talking to such a server; otherwise 'qemu-nbd
--list' of such a server fails to display all available details about
the export.

Encountered when temporarily tweaking nbdkit to reply with
NBD_REP_ERR_POLICY.  Present since structured reply support was first
added (commit d795299b reused starttls handling, but starttls is
different in that we can't fall back to other behavior on any error).

Signed-off-by: Eric Blake 
---
 nbd/client.c | 63 +---
 nbd/trace-events |  2 +-
 2 files changed, 34 insertions(+), 31 deletions(-)

diff --git a/nbd/client.c b/nbd/client.c
index a9d8d32feff7..b9dc829175f9 100644
--- a/nbd/client.c
+++ b/nbd/client.c
@@ -1,5 +1,5 @@
 /*
- *  Copyright (C) 2016-2018 Red Hat, Inc.
+ *  Copyright (C) 2016-2019 Red Hat, Inc.
  *  Copyright (C) 2005  Anthony Liguori 
  *
  *  Network Block Device Client Side
@@ -142,17 +142,18 @@ static int nbd_receive_option_reply(QIOChannel *ioc, 
uint32_t opt,
 return 0;
 }

-/* If reply represents success, return 1 without further action.
- * If reply represents an error, consume the optional payload of
- * the packet on ioc.  Then return 0 for unsupported (so the client
- * can fall back to other approaches), or -1 with errp set for other
- * errors.
+/*
+ * If reply represents success, return 1 without further action.  If
+ * reply represents an error, consume the optional payload of the
+ * packet on ioc.  Then return 0 for unsupported (so the client can
+ * fall back to other approaches), where @strict determines if only
+ * ERR_UNSUP or all errors fit that category, or -1 with errp set for
+ * other errors.
  */
 static int nbd_handle_reply_err(QIOChannel *ioc, NBDOptionReply *reply,
-Error **errp)
+bool strict, Error **errp)
 {
-char *msg = NULL;
-int result = -1;
+g_autofree char *msg = NULL;

 if (!(reply->type & (1 << 31))) {
 return 1;
@@ -163,26 +164,28 @@ static int nbd_handle_reply_err(QIOChannel *ioc, 
NBDOptionReply *reply,
 error_setg(errp, "server error %" PRIu32
" (%s) message is too long",
reply->type, nbd_rep_lookup(reply->type));
-goto cleanup;
+goto err;
 }
 msg = g_malloc(reply->length + 1);
 if (nbd_read(ioc, msg, reply->length, NULL, errp) < 0) {
 error_prepend(errp, "Failed to read option error %" PRIu32
   " (%s) message: ",
   reply->type, nbd_rep_lookup(reply->type));
-goto cleanup;
+goto err;
 }
 msg[reply->length] = '\0';
 trace_nbd_server_error_msg(reply->type,
nbd_reply_type_lookup(reply->type), msg);
 }

+if (reply->type == NBD_REP_ERR_UNSUP || !strict) {
+trace_nbd_reply_err_ignored(reply->option,
+nbd_opt_lookup(reply->option),
+reply->type, nbd_rep_lookup(reply->type));
+return 0;
+}
+
 switch (reply->type) {
-case NBD_REP_ERR_UNSUP:
-trace_nbd_reply_err_unsup(reply->option, 
nbd_opt_lookup(reply->option));
-result = 0;
-goto cleanup;
-
 case NBD_REP_ERR_POLICY:
 error_setg(errp, "Denied by server for option %" PRIu32 " (%s)",
reply->option, nbd_opt_lookup(reply->option));
@@ -227,12 +230,9 @@ static int nbd_handle_reply_err(QIOChannel *ioc, 
NBDOptionReply *reply,
 error_append_hint(errp, "server reported: %s\n", msg);
 }

- cleanup:
-g_free(msg);
-if (result < 0) {
-nbd_send_opt_abort(ioc);
-}
-return result;
+ err:
+nbd_send_opt_abort(ioc);
+return -1;
 }

 /* nbd_receive_list:
@@ -257,7 +257,7 @@ static int nbd_receive_list(QIOChannel *ioc, char **name, 
char **description,
 if (nbd_receive_option_reply(ioc, NBD_OPT_LIST, , errp) < 0) {
 return -1;
 }
-error = nbd_handle_reply_err(ioc, , errp);
+error = nbd_handle_reply_err(ioc, , true, errp);
 if (error <= 0) {
 return error;
 }
@@ -363,7 +363,7 @@ static int nbd_opt_info_or_go(QIOChannel *ioc, uint32_t opt,
 if (nbd_receive_option_reply(ioc, opt, , errp) < 0) {
 return -1;
 }
-error = nbd_handle_reply_err(ioc, , errp);
+error = nbd_handle_reply_err(ioc, , true, errp);
 if (error <= 0) {
 return error;
 }
@@ -538,12 +538,15 @@ static int nbd_receive_query_exports(QIOChannel *ioc,
 }
 }

-/* nbd_request_simple_option: Send an option request, and parse the reply
+/*
+ * nbd_request_simple_option: Send an option request, and parse 

[Qemu-devel] [PATCH v2 0/2] nbd: tolerate more errors to extension requests

2019-08-24 Thread Eric Blake
Since v1;
- new patch 1
- adjust patch 2 to not set errp when not in strict mode

Eric Blake (2):
  nbd: Use g_autofree in a few places
  nbd: Tolerate more errors to structured reply request

 block/nbd.c  | 11 +++
 nbd/client.c | 85 +++-
 nbd/server.c | 12 +++
 nbd/trace-events |  2 +-
 4 files changed, 49 insertions(+), 61 deletions(-)

-- 
2.21.0




[Qemu-devel] [PATCH v2 1/2] nbd: Use g_autofree in a few places

2019-08-24 Thread Eric Blake
Thanks to our recent move to use glib's g_autofree, I can join the
bandwagon.  Getting rid of gotos is fun ;)

There are probably more places where we could register cleanup
functions and get rid of more gotos; this patch just focuses on the
labels that existed merely to call g_free.

Signed-off-by: Eric Blake 
---
 block/nbd.c  | 11 ---
 nbd/client.c | 22 +++---
 nbd/server.c | 12 
 3 files changed, 15 insertions(+), 30 deletions(-)

diff --git a/block/nbd.c b/block/nbd.c
index beed46fb3414..c4c91a158602 100644
--- a/block/nbd.c
+++ b/block/nbd.c
@@ -1374,7 +1374,7 @@ static bool nbd_has_filename_options_conflict(QDict 
*options, Error **errp)
 static void nbd_parse_filename(const char *filename, QDict *options,
Error **errp)
 {
-char *file;
+g_autofree char *file = NULL;
 char *export_name;
 const char *host_spec;
 const char *unixpath;
@@ -1396,7 +1396,7 @@ static void nbd_parse_filename(const char *filename, 
QDict *options,
 export_name = strstr(file, EN_OPTSTR);
 if (export_name) {
 if (export_name[strlen(EN_OPTSTR)] == 0) {
-goto out;
+return;
 }
 export_name[0] = 0; /* truncate 'file' */
 export_name += strlen(EN_OPTSTR);
@@ -1407,11 +1407,11 @@ static void nbd_parse_filename(const char *filename, 
QDict *options,
 /* extract the host_spec - fail if it's not nbd:... */
 if (!strstart(file, "nbd:", _spec)) {
 error_setg(errp, "File name string for NBD must start with 'nbd:'");
-goto out;
+return;
 }

 if (!*host_spec) {
-goto out;
+return;
 }

 /* are we a UNIX or TCP socket? */
@@ -1431,9 +1431,6 @@ static void nbd_parse_filename(const char *filename, 
QDict *options,
 out_inet:
 qapi_free_InetSocketAddress(addr);
 }
-
-out:
-g_free(file);
 }

 static bool nbd_process_legacy_socket_options(QDict *output_options,
diff --git a/nbd/client.c b/nbd/client.c
index 49bf9906f94b..a9d8d32feff7 100644
--- a/nbd/client.c
+++ b/nbd/client.c
@@ -247,12 +247,11 @@ static int nbd_handle_reply_err(QIOChannel *ioc, 
NBDOptionReply *reply,
 static int nbd_receive_list(QIOChannel *ioc, char **name, char **description,
 Error **errp)
 {
-int ret = -1;
 NBDOptionReply reply;
 uint32_t len;
 uint32_t namelen;
-char *local_name = NULL;
-char *local_desc = NULL;
+g_autofree char *local_name = NULL;
+g_autofree char *local_desc = NULL;
 int error;

 if (nbd_receive_option_reply(ioc, NBD_OPT_LIST, , errp) < 0) {
@@ -298,7 +297,7 @@ static int nbd_receive_list(QIOChannel *ioc, char **name, 
char **description,
 local_name = g_malloc(namelen + 1);
 if (nbd_read(ioc, local_name, namelen, "export name", errp) < 0) {
 nbd_send_opt_abort(ioc);
-goto out;
+return -1;
 }
 local_name[namelen] = '\0';
 len -= namelen;
@@ -306,24 +305,17 @@ static int nbd_receive_list(QIOChannel *ioc, char **name, 
char **description,
 local_desc = g_malloc(len + 1);
 if (nbd_read(ioc, local_desc, len, "export description", errp) < 0) {
 nbd_send_opt_abort(ioc);
-goto out;
+return -1;
 }
 local_desc[len] = '\0';
 }

 trace_nbd_receive_list(local_name, local_desc ?: "");
-*name = local_name;
-local_name = NULL;
+*name = g_steal_pointer(_name);
 if (description) {
-*description = local_desc;
-local_desc = NULL;
+*description = g_steal_pointer(_desc);
 }
-ret = 1;
-
- out:
-g_free(local_name);
-g_free(local_desc);
-return ret;
+return 1;
 }


diff --git a/nbd/server.c b/nbd/server.c
index 0fb41c6c50ea..74d205812fee 100644
--- a/nbd/server.c
+++ b/nbd/server.c
@@ -206,7 +206,7 @@ static int GCC_FMT_ATTR(4, 0)
 nbd_negotiate_send_rep_verr(NBDClient *client, uint32_t type,
 Error **errp, const char *fmt, va_list va)
 {
-char *msg;
+g_autofree char *msg = NULL;
 int ret;
 size_t len;

@@ -216,18 +216,14 @@ nbd_negotiate_send_rep_verr(NBDClient *client, uint32_t 
type,
 trace_nbd_negotiate_send_rep_err(msg);
 ret = nbd_negotiate_send_rep_len(client, type, len, errp);
 if (ret < 0) {
-goto out;
+return ret;
 }
 if (nbd_write(client->ioc, msg, len, errp) < 0) {
 error_prepend(errp, "write failed (error message): ");
-ret = -EIO;
-} else {
-ret = 0;
+return -EIO;
 }

-out:
-g_free(msg);
-return ret;
+return 0;
 }

 /* Send an error reply.
-- 
2.21.0




Re: [Qemu-devel] [PATCH 2/2] migration/qemu-file: fix potential buf waste for extra buf_index adjustment

2019-08-24 Thread Wei Yang
On Fri, Aug 23, 2019 at 02:05:02PM +0100, Dr. David Alan Gilbert wrote:
>* Daniel P. Berrang? (berra...@redhat.com) wrote:
>> On Fri, Aug 23, 2019 at 12:06:09PM +0100, Dr. David Alan Gilbert wrote:
>> > (Copying Dan in)
>> > 
>> > * Wei Yang (richardw.y...@linux.intel.com) wrote:
>> > > In add_to_iovec(), qemu_fflush() will be called if iovec is full. If
>> > > this happens, buf_index is reset. Currently, this is not checked and
>> > > buf_index would always been adjust with buf size.
>> > > 
>> > > This is not harmful, but will waste some space in file buffer.
>> > 
>> > That's a nice find.
>> > 
>> > > This patch make add_to_iovec() return 1 when it has flushed the file.
>> > > Then the caller could check the return value to see whether it is
>> > > necessary to adjust the buf_index any more.
>> > > 
>> > > Signed-off-by: Wei Yang 
>> > 
>> > Reviewed-by: Dr. David Alan Gilbert 
>> > 
>> > (I wonder if there's a way to wrap that little add_to_iovec, check, add
>> > to index, flush in a little wrapper).
>> 
>> Given the name "add_to_iovec" I think it is pretty surprising
>> that it calls "qemu_flush" at all.
>> 
>> It is also pretty wierd that we're checking two different
>> conditions in two different places.
>> 
>> Right now the code is essentially doing this:
>> 
>>  if (f->iovcnt >= MAX_IOV_SIZE) {
>> qemu_fflush(f);
>>  }
>>  if (f->buf_index == IO_BUF_SIZE) {
>> qemu_fflush(f);
>>  }
>> 
>> Except that in the qemu_put_buffer_async() case, we're
>> only doing the first of these two checks. This feels
>> very odd indeed - I would have thought either it should
>> do both, or do neither.
>
>No, there's two separate types of buffers.
>
>There's f->buf which is a single allocated buffer in the QEMUFile
>with an offset buf_index, and there are arbitrary RAM pages
>added typically via qemu_put_buffer_async.
>

qemu_put_buffer_async is the only one which put a range not in f->buf into the
iovec.

And one thing confused me is even its name is async, add_to_iovec still would
call qemu_fflush when iovec is full. So it is not a always async function.
>From the function name, it is a little difficult to differentiate
qemu_put_buffer and qemu_put_buffer_async.

>The check for >= IO_BUF_SIZE is only done when adding to the f->buf,
>where as the check on f->iovcnt is done when you add an element to
>the iovec and that can happen potentially in either case.
>
>Dave
>
>> Assuming doing both flushs is ok for qemu_put_buffer_async
>> then I'd suggest renaming 'add_to_iovec' to 'queue_buffer'
>> and have that method do both of these qemu_fflush() calls.
>> 
>> > > ---
>> > >  migration/qemu-file.c | 42 --
>> > >  1 file changed, 28 insertions(+), 14 deletions(-)
>> > > 
>> > > diff --git a/migration/qemu-file.c b/migration/qemu-file.c
>> > > index 35c22605dd..05d9f42ddb 100644
>> > > --- a/migration/qemu-file.c
>> > > +++ b/migration/qemu-file.c
>> > > @@ -343,8 +343,16 @@ int qemu_fclose(QEMUFile *f)
>> > >  return ret;
>> > >  }
>> > >  
>> > > -static void add_to_iovec(QEMUFile *f, const uint8_t *buf, size_t size,
>> > > - bool may_free)
>> > > +/*
>> > > + * Add buf to iovec. Do flush if iovec is full.
>> > > + *
>> > > + * Return values:
>> > > + * 1 iovec is full and flushed
>> > > + * 0 iovec is not flushed
>> > > + *
>> > > + */
>> > > +static int add_to_iovec(QEMUFile *f, const uint8_t *buf, size_t size,
>> > > +bool may_free)
>> > >  {
>> > >  /* check for adjacent buffer and coalesce them */
>> > >  if (f->iovcnt > 0 && buf == f->iov[f->iovcnt - 1].iov_base +
>> > > @@ -362,7 +370,10 @@ static void add_to_iovec(QEMUFile *f, const uint8_t 
>> > > *buf, size_t size,
>> > >  
>> > >  if (f->iovcnt >= MAX_IOV_SIZE) {
>> > >  qemu_fflush(f);
>> > > +return 1;
>> > >  }
>> > > +
>> > > +return 0;
>> > >  }
>> > >  
>> > >  void qemu_put_buffer_async(QEMUFile *f, const uint8_t *buf, size_t size,
>> > > @@ -391,10 +402,11 @@ void qemu_put_buffer(QEMUFile *f, const uint8_t 
>> > > *buf, size_t size)
>> > >  }
>> > >  memcpy(f->buf + f->buf_index, buf, l);
>> > >  f->bytes_xfer += l;
>> > > -add_to_iovec(f, f->buf + f->buf_index, l, false);
>> > > -f->buf_index += l;
>> > > -if (f->buf_index == IO_BUF_SIZE) {
>> > > -qemu_fflush(f);
>> > > +if (!add_to_iovec(f, f->buf + f->buf_index, l, false)) {
>> > > +f->buf_index += l;
>> > > +if (f->buf_index == IO_BUF_SIZE) {
>> > > +qemu_fflush(f);
>> > > +}
>> > >  }
>> > >  if (qemu_file_get_error(f)) {
>> > >  break;
>> > > @@ -412,10 +424,11 @@ void qemu_put_byte(QEMUFile *f, int v)
>> > >  
>> > >  f->buf[f->buf_index] = v;
>> > >  f->bytes_xfer++;
>> > > -add_to_iovec(f, f->buf + f->buf_index, 1, false);
>> > > -f->buf_index++;
>> > > -if 

Re: [Qemu-devel] [PATCH 2/2] migration/qemu-file: fix potential buf waste for extra buf_index adjustment

2019-08-24 Thread Wei Yang
On Fri, Aug 23, 2019 at 12:06:09PM +0100, Dr. David Alan Gilbert wrote:
>(Copying Dan in)
>
>* Wei Yang (richardw.y...@linux.intel.com) wrote:
>> In add_to_iovec(), qemu_fflush() will be called if iovec is full. If
>> this happens, buf_index is reset. Currently, this is not checked and
>> buf_index would always been adjust with buf size.
>> 
>> This is not harmful, but will waste some space in file buffer.
>
>That's a nice find.
>
>> This patch make add_to_iovec() return 1 when it has flushed the file.
>> Then the caller could check the return value to see whether it is
>> necessary to adjust the buf_index any more.
>> 
>> Signed-off-by: Wei Yang 
>
>Reviewed-by: Dr. David Alan Gilbert 
>
>(I wonder if there's a way to wrap that little add_to_iovec, check, add
>to index, flush in a little wrapper).
>
>Dave
>
>> ---
>>  migration/qemu-file.c | 42 --
>>  1 file changed, 28 insertions(+), 14 deletions(-)
>> 
>> diff --git a/migration/qemu-file.c b/migration/qemu-file.c
>> index 35c22605dd..05d9f42ddb 100644
>> --- a/migration/qemu-file.c
>> +++ b/migration/qemu-file.c
>> @@ -343,8 +343,16 @@ int qemu_fclose(QEMUFile *f)
>>  return ret;
>>  }
>>  
>> -static void add_to_iovec(QEMUFile *f, const uint8_t *buf, size_t size,
>> - bool may_free)
>> +/*
>> + * Add buf to iovec. Do flush if iovec is full.
>> + *
>> + * Return values:
>> + * 1 iovec is full and flushed
>> + * 0 iovec is not flushed
>> + *
>> + */
>> +static int add_to_iovec(QEMUFile *f, const uint8_t *buf, size_t size,
>> +bool may_free)
>>  {
>>  /* check for adjacent buffer and coalesce them */
>>  if (f->iovcnt > 0 && buf == f->iov[f->iovcnt - 1].iov_base +
>> @@ -362,7 +370,10 @@ static void add_to_iovec(QEMUFile *f, const uint8_t 
>> *buf, size_t size,
>>  
>>  if (f->iovcnt >= MAX_IOV_SIZE) {
>>  qemu_fflush(f);
>> +return 1;
>>  }
>> +
>> +return 0;
>>  }
>>  
>>  void qemu_put_buffer_async(QEMUFile *f, const uint8_t *buf, size_t size,
>> @@ -391,10 +402,11 @@ void qemu_put_buffer(QEMUFile *f, const uint8_t *buf, 
>> size_t size)
>>  }
>>  memcpy(f->buf + f->buf_index, buf, l);
>>  f->bytes_xfer += l;
>> -add_to_iovec(f, f->buf + f->buf_index, l, false);
>> -f->buf_index += l;
>> -if (f->buf_index == IO_BUF_SIZE) {
>> -qemu_fflush(f);
>> +if (!add_to_iovec(f, f->buf + f->buf_index, l, false)) {
>> +f->buf_index += l;
>> +if (f->buf_index == IO_BUF_SIZE) {
>> +qemu_fflush(f);
>> +}

You mean put these four lines into a wrapper?

Name it as add_buf_to_iovec?

>>  }
>>  if (qemu_file_get_error(f)) {
>>  break;
>> @@ -412,10 +424,11 @@ void qemu_put_byte(QEMUFile *f, int v)
>>  
>>  f->buf[f->buf_index] = v;
>>  f->bytes_xfer++;
>> -add_to_iovec(f, f->buf + f->buf_index, 1, false);
>> -f->buf_index++;
>> -if (f->buf_index == IO_BUF_SIZE) {
>> -qemu_fflush(f);
>> +if (!add_to_iovec(f, f->buf + f->buf_index, 1, false)) {
>> +f->buf_index++;
>> +if (f->buf_index == IO_BUF_SIZE) {
>> +qemu_fflush(f);
>> +}
>>  }
>>  }
>>  
>> @@ -717,10 +730,11 @@ ssize_t qemu_put_compression_data(QEMUFile *f, 
>> z_stream *stream,
>>  }
>>  
>>  qemu_put_be32(f, blen);
>> -add_to_iovec(f, f->buf + f->buf_index, blen, false);
>> -f->buf_index += blen;
>> -if (f->buf_index == IO_BUF_SIZE) {
>> -qemu_fflush(f);
>> +if (!add_to_iovec(f, f->buf + f->buf_index, blen, false)) {
>> +f->buf_index += blen;
>> +if (f->buf_index == IO_BUF_SIZE) {
>> +qemu_fflush(f);
>> +}
>>  }
>>  return blen + sizeof(int32_t);
>>  }
>> -- 
>> 2.17.1
>> 
>--
>Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

-- 
Wei Yang
Help you, Help me



Re: [Qemu-devel] [PATCH v1 8/9] exec.c: Factor out core logic of check_watchpoint()

2019-08-24 Thread Richard Henderson
On 8/23/19 4:27 AM, David Hildenbrand wrote:
> On 23.08.19 12:07, David Hildenbrand wrote:
>> We want to perform the same checks in probe_write() to trigger a cpu
>> exit before doing any modifications. We'll have to pass a PC.
>>
>> Signed-off-by: David Hildenbrand 
>> ---
>>  exec.c| 23 +--
>>  include/hw/core/cpu.h |  2 ++
>>  2 files changed, 19 insertions(+), 6 deletions(-)
>>
>> diff --git a/exec.c b/exec.c
>> index 1df966d17a..d233a4250b 100644
>> --- a/exec.c
>> +++ b/exec.c
>> @@ -2810,12 +2810,10 @@ static const MemoryRegionOps notdirty_mem_ops = {
>>  },
>>  };
>>  
>> -/* Generate a debug exception if a watchpoint has been hit.  */
>> -static void check_watchpoint(int offset, int len, MemTxAttrs attrs, int 
>> flags)
>> +void cpu_check_watchpoint(CPUState *cpu, vaddr vaddr, int len,
>> +  MemTxAttrs attrs, int flags, uintptr_t ra)
>>  {
>> -CPUState *cpu = current_cpu;
>>  CPUClass *cc = CPU_GET_CLASS(cpu);
>> -target_ulong vaddr;
>>  CPUWatchpoint *wp;
>>  
>>  assert(tcg_enabled());
>> @@ -2826,7 +2824,7 @@ static void check_watchpoint(int offset, int len, 
>> MemTxAttrs attrs, int flags)
>>  cpu_interrupt(cpu, CPU_INTERRUPT_DEBUG);
>>  return;
>>  }
>> -vaddr = (cpu->mem_io_vaddr & TARGET_PAGE_MASK) + offset;
>> +
>>  vaddr = cc->adjust_watchpoint_address(cpu, vaddr, len);
>>  QTAILQ_FOREACH(wp, >watchpoints, entry) {
>>  if (cpu_watchpoint_address_matches(wp, vaddr, len)
>> @@ -2851,11 +2849,14 @@ static void check_watchpoint(int offset, int len, 
>> MemTxAttrs attrs, int flags)
>>  if (wp->flags & BP_STOP_BEFORE_ACCESS) {
>>  cpu->exception_index = EXCP_DEBUG;
>>  mmap_unlock();
>> -cpu_loop_exit(cpu);
>> +cpu_loop_exit_restore(cpu, ra);
>>  } else {
>>  /* Force execution of one insn next time.  */
>>  cpu->cflags_next_tb = 1 | curr_cflags();
>>  mmap_unlock();
>> +if (ra) {
>> +cpu_restore_state(cpu, ra, true);
>> +}
>>  cpu_loop_exit_noexc(cpu);
>>  }
>>  }
>> @@ -2865,6 +2866,16 @@ static void check_watchpoint(int offset, int len, 
>> MemTxAttrs attrs, int flags)
>>  }
>>  }
>>  
>> +/* Generate a debug exception if a watchpoint has been hit.  */
>> +static void check_watchpoint(int offset, int len, MemTxAttrs attrs, int 
>> flags)
>> +{
>> +CPUState *cpu = current_cpu;
>> +vaddr vaddr;
>> +
>> +vaddr = (cpu->mem_io_vaddr & TARGET_PAGE_MASK) + offset;
>> +cpu_check_watchpoint(cpu, vaddr, len, attrs, flags, 0);
>> +}
>> +
>>  /* Watchpoint access routines.  Watchpoints are inserted using TLB tricks,
>> so these check for a hit then pass through to the normal out-of-line
>> phys routines.  */
>> diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
>> index 77fca95a40..3a2d76b32c 100644
>> --- a/include/hw/core/cpu.h
>> +++ b/include/hw/core/cpu.h
>> @@ -1070,6 +1070,8 @@ static inline bool cpu_breakpoint_test(CPUState *cpu, 
>> vaddr pc, int mask)
>>  return false;
>>  }
>>  
>> +void cpu_check_watchpoint(CPUState *cpu, vaddr vaddr, int len,
>> +  MemTxAttrs attrs, int flags, uintptr_t ra);
>>  int cpu_watchpoint_insert(CPUState *cpu, vaddr addr, vaddr len,
>>int flags, CPUWatchpoint **watchpoint);
>>  int cpu_watchpoint_remove(CPUState *cpu, vaddr addr,
>>
> 
> As we will have bigger accesses with probe_write(), we should do
> 
> diff --git a/exec.c b/exec.c
> index d233a4250b..4f8cc62a5f 100644
> --- a/exec.c
> +++ b/exec.c
> @@ -2834,7 +2834,7 @@ void cpu_check_watchpoint(CPUState *cpu, vaddr
> vaddr, int len,
>  } else {
>  wp->flags |= BP_WATCHPOINT_HIT_WRITE;
>  }
> -wp->hitaddr = vaddr;
> +wp->hitaddr = MAX(vaddr, wp->vaddr);
>  wp->hitattrs = attrs;
>  if (!cpu->watchpoint_hit) {
>  if (wp->flags & BP_CPU &&
> 
> I guess, to make sure we actually indicate the watchpoint.
> 

I have applied this patch with this fix to tcg-next.
I'm working on fixing the other problems we discovered re watchpoints.


r~



Re: [Qemu-devel] [PATCH v2 00/12] block: qiov_offset parameter for io

2019-08-24 Thread Vladimir Sementsov-Ogievskiy
23.08.2019 19:21, Stefan Hajnoczi wrote:
> On Thu, Aug 22, 2019 at 05:59:43PM +, Vladimir Sementsov-Ogievskiy wrote:
>> 22.08.2019 20:39, Vladimir Sementsov-Ogievskiy wrote:
>>> 22.08.2019 20:24, Vladimir Sementsov-Ogievskiy wrote:
 22.08.2019 18:50, Stefan Hajnoczi wrote:
> On Tue, Jun 04, 2019 at 07:15:02PM +0300, Vladimir Sementsov-Ogievskiy 
> wrote:
>> Hi all!
>>
>> Here is new parameter qiov_offset for io path, to avoid
>> a lot of places with same pattern of creating local_qiov or hd_qiov
>> variables.
>>
>> These series also includes my
>> "[Qemu-devel] [PATCH 0/2] block/io: refactor padding"
>> with some changes [described in 01 and 03 emails]
>>
>> Vladimir Sementsov-Ogievskiy (12):
>>     util/iov: introduce qemu_iovec_init_extended
>>     util/iov: improve qemu_iovec_is_zero
>>     block/io: refactor padding
>>     block: define .*_part io handlers in BlockDriver
>>     block/io: bdrv_co_do_copy_on_readv: use and support qiov_offset
>>     block/io: bdrv_co_do_copy_on_readv: lazy allocation
>>     block/io: bdrv_aligned_preadv: use and support qiov_offset
>>     block/io: bdrv_aligned_pwritev: use and support qiov_offset
>>     block/io: introduce bdrv_co_p{read,write}v_part
>>     block/qcow2: refactor qcow2_co_preadv to use buffer-based io
>>     block/qcow2: implement .bdrv_co_preadv_part
>>     block/qcow2: implement .bdrv_co_pwritev(_compressed)_part
>>
>>    block/qcow2.h |   1 +
>>    include/block/block_int.h |  21 ++
>>    include/qemu/iov.h    |  10 +-
>>    block/backup.c    |   2 +-
>>    block/io.c    | 532 ++
>>    block/qcow2-cluster.c |  14 +-
>>    block/qcow2.c | 131 +-
>>    qemu-img.c    |   4 +-
>>    util/iov.c    | 153 +--
>>    9 files changed, 559 insertions(+), 309 deletions(-)
>
> qemu-iotests 077 fails after I apply this series (including your
> uninitialized variable fix).  I'm afraid I can't include it in the block
> pull request, sorry!
>
> Stefan
>

 Hmm, 77 don't work on master for me:
 077  fail   [20:23:57] [20:23:59]    output 
 mismatch (see 077.out.bad)
 --- /work/src/qemu/up-block-drop-hd-qiov/tests/qemu-iotests/077.out 
 2019-04-22 15:06:56.162045432 +0300
 +++ /work/src/qemu/up-block-drop-hd-qiov/tests/qemu-iotests/077.out.bad 
 2019-08-22 20:23:59.124122307 +0300
 @@ -1,7 +1,15 @@
    QA output created by 077
 +==117186==WARNING: ASan doesn't fully support makecontext/swapcontext 
 functions and may produce false positives in some cases!
 +==117186==WARNING: ASan is ignoring requested __asan_handle_no_return: 
 stack top: 0x7ffefc071000; bottom 0x7fad7277b000; size: 0x0051898f6000 
 (350200225792)
 +False positive error reports may follow
 +For details see 
 http://code.google.com/p/address-sanitizer/issues/detail?id=189
    Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=134217728

    == Some concurrent requests involving RMW ==
 +==117197==WARNING: ASan doesn't fully support makecontext/swapcontext 
 functions and may produce false positives in some cases!
 +==117197==WARNING: ASan is ignoring requested __asan_handle_no_return: 
 stack top: 0x7fffa4fcc000; bottom 0x7fa93a2c1000; size: 0x00566ad0b000 
 (371159248896)
 +False positive error reports may follow
 +For details see 
 http://code.google.com/p/address-sanitizer/issues/detail?id=189
    wrote XXX/XXX bytes at offset XXX
    XXX bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
    wrote XXX/XXX bytes at offset XXX
 @@ -66,6 +74,10 @@
    XXX bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)

    == Verify image content ==
 +==117219==WARNING: ASan doesn't fully support makecontext/swapcontext 
 functions and may produce false positives in some cases!
 +==117219==WARNING: ASan is ignoring requested __asan_handle_no_return: 
 stack top: 0x7ffc722de000; bottom 0x7f0848232000; size: 0x00f42a0ac000 
 (1048677367808)
 +False positive error reports may follow
 +For details see 
 http://code.google.com/p/address-sanitizer/issues/detail?id=189
    read 512/512 bytes at offset 0
    512 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
    read 512/512 bytes at offset 512
 @@ -156,5 +168,9 @@
    1 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
    read 2048/2048 bytes at offset 71680
    2 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
 +==117229==WARNING: ASan doesn't fully support makecontext/swapcontext 
 functions and may produce false positives in some cases!
 +==117229==WARNING: ASan is ignoring requested 

[Qemu-devel] [PATCH] block: fix permission update in bdrv_replace_node

2019-08-24 Thread Vladimir Sementsov-Ogievskiy
It's wrong to OR shared permissions. It may lead to crash on further
permission updates.
Also, no needs to consider previously calculated permissions, as at
this point we already bind all new parents and bdrv_get_cumulative_perm
result is enough. So fix the bug by just set permissions by
bdrv_get_cumulative_perm result.

Bug was introduced in long ago 234ac1a9025, in 2.9.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---

Hi all!

I found this bug during my work around backup-top filter. It happens that
on filter removing, bdrv_replace_node() breaks permissions in graph which
lead to bdrv_set_backing_hd(new backing: NULL) on
assert(tighten_restrictions == false).

 block.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/block.c b/block.c
index 874a29a983..5944124845 100644
--- a/block.c
+++ b/block.c
@@ -4165,7 +4165,6 @@ void bdrv_replace_node(BlockDriverState *from, 
BlockDriverState *to,
 {
 BdrvChild *c, *next;
 GSList *list = NULL, *p;
-uint64_t old_perm, old_shared;
 uint64_t perm = 0, shared = BLK_PERM_ALL;
 int ret;
 
@@ -4211,8 +4210,8 @@ void bdrv_replace_node(BlockDriverState *from, 
BlockDriverState *to,
 bdrv_unref(from);
 }
 
-bdrv_get_cumulative_perm(to, _perm, _shared);
-bdrv_set_perm(to, old_perm | perm, old_shared | shared);
+bdrv_get_cumulative_perm(to, , );
+bdrv_set_perm(to, perm, shared);
 
 out:
 g_slist_free(list);
-- 
2.18.0




Re: [Qemu-devel] [PATCH v4 01/10] add device_legacy_reset function to prepare for reset api change

2019-08-24 Thread David Gibson
On Wed, Aug 21, 2019 at 06:33:32PM +0200, Damien Hedde wrote:
> Provide a temporary device_legacy_reset function doing what
> device_reset does to prepare for the transition with Resettable
> API.
> 
> All occurrence of device_reset in the code tree are also replaced
> by device_legacy_reset.
> 
> The new resettable API has different prototype and semantics
> (resetting child buses as well as the specified device). Subsequent
> commits will make the changeover for each call site individually; once
> that is complete device_legacy_reset() will be removed.
> 
> Signed-off-by: Damien Hedde 
> Reviewed-by: Peter Maydell 

ppc parts
Acked-by: David Gibson 

> ---
> Cc: Gerd Hoffmann 
> Cc: Paolo Bonzini 
> Cc: "Daniel P. Berrangé" 
> Cc: Eduardo Habkost 
> Cc: Richard Henderson 
> Cc: "Michael S. Tsirkin" 
> Cc: Marcel Apfelbaum 
> Cc: John Snow 
> Cc: "Cédric Le Goater" 
> Cc: Collin Walling 
> Cc: Cornelia Huck 
> Cc: David Hildenbrand 
> Cc: Halil Pasic 
> Cc: Christian Borntraeger 
> Cc: Dmitry Fleytman 
> Cc: Fam Zheng 
> Cc: qemu-bl...@nongnu.org
> Cc: qemu-...@nongnu.org
> Cc: qemu-s3...@nongnu.org
> Cc: qemu-...@nongnu.org
> ---
>  hw/audio/intel-hda.c | 2 +-
>  hw/core/qdev.c   | 6 +++---
>  hw/hyperv/hyperv.c   | 2 +-
>  hw/i386/pc.c | 2 +-
>  hw/ide/microdrive.c  | 8 
>  hw/intc/spapr_xive.c | 2 +-
>  hw/ppc/pnv_psi.c | 2 +-
>  hw/ppc/spapr_pci.c   | 2 +-
>  hw/ppc/spapr_vio.c   | 2 +-
>  hw/s390x/s390-pci-inst.c | 2 +-
>  hw/scsi/vmw_pvscsi.c | 2 +-
>  hw/sd/omap_mmc.c | 2 +-
>  hw/sd/pl181.c| 2 +-
>  include/hw/qdev-core.h   | 4 ++--
>  14 files changed, 20 insertions(+), 20 deletions(-)
> 
> diff --git a/hw/audio/intel-hda.c b/hw/audio/intel-hda.c
> index 6ecd383540..27b71c57cf 100644
> --- a/hw/audio/intel-hda.c
> +++ b/hw/audio/intel-hda.c
> @@ -1087,7 +1087,7 @@ static void intel_hda_reset(DeviceState *dev)
>  QTAILQ_FOREACH(kid, >codecs.qbus.children, sibling) {
>  DeviceState *qdev = kid->child;
>  cdev = HDA_CODEC_DEVICE(qdev);
> -device_reset(DEVICE(cdev));
> +device_legacy_reset(DEVICE(cdev));
>  d->state_sts |= (1 << cdev->cad);
>  }
>  intel_hda_update_irq(d);
> diff --git a/hw/core/qdev.c b/hw/core/qdev.c
> index 60d66c2f39..a95d4fa87d 100644
> --- a/hw/core/qdev.c
> +++ b/hw/core/qdev.c
> @@ -257,7 +257,7 @@ HotplugHandler *qdev_get_hotplug_handler(DeviceState *dev)
>  
>  static int qdev_reset_one(DeviceState *dev, void *opaque)
>  {
> -device_reset(dev);
> +device_legacy_reset(dev);
>  
>  return 0;
>  }
> @@ -865,7 +865,7 @@ static void device_set_realized(Object *obj, bool value, 
> Error **errp)
>  }
>  }
>  if (dev->hotplugged) {
> -device_reset(dev);
> +device_legacy_reset(dev);
>  }
>  dev->pending_deleted_event = false;
>  
> @@ -1087,7 +1087,7 @@ void device_class_set_parent_unrealize(DeviceClass *dc,
>  dc->unrealize = dev_unrealize;
>  }
>  
> -void device_reset(DeviceState *dev)
> +void device_legacy_reset(DeviceState *dev)
>  {
>  DeviceClass *klass = DEVICE_GET_CLASS(dev);
>  
> diff --git a/hw/hyperv/hyperv.c b/hw/hyperv/hyperv.c
> index 6ebf31c310..cd9db3cb5c 100644
> --- a/hw/hyperv/hyperv.c
> +++ b/hw/hyperv/hyperv.c
> @@ -140,7 +140,7 @@ void hyperv_synic_reset(CPUState *cs)
>  SynICState *synic = get_synic(cs);
>  
>  if (synic) {
> -device_reset(DEVICE(synic));
> +device_legacy_reset(DEVICE(synic));
>  }
>  }
>  
> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> index 3ab4bcb3ca..f33a8de42f 100644
> --- a/hw/i386/pc.c
> +++ b/hw/i386/pc.c
> @@ -2826,7 +2826,7 @@ static void pc_machine_reset(MachineState *machine)
>  cpu = X86_CPU(cs);
>  
>  if (cpu->apic_state) {
> -device_reset(cpu->apic_state);
> +device_legacy_reset(cpu->apic_state);
>  }
>  }
>  }
> diff --git a/hw/ide/microdrive.c b/hw/ide/microdrive.c
> index b0272ea14b..6b30e36ed8 100644
> --- a/hw/ide/microdrive.c
> +++ b/hw/ide/microdrive.c
> @@ -173,7 +173,7 @@ static void md_attr_write(PCMCIACardState *card, uint32_t 
> at, uint8_t value)
>  case 0x00:   /* Configuration Option Register */
>  s->opt = value & 0xcf;
>  if (value & OPT_SRESET) {
> -device_reset(DEVICE(s));
> +device_legacy_reset(DEVICE(s));
>  }
>  md_interrupt_update(s);
>  break;
> @@ -316,7 +316,7 @@ static void md_common_write(PCMCIACardState *card, 
> uint32_t at, uint16_t value)
>  case 0xe:/* Device Control */
>  s->ctrl = value;
>  if (value & CTRL_SRST) {
> -device_reset(DEVICE(s));
> +device_legacy_reset(DEVICE(s));
>  }
>  md_interrupt_update(s);
>  break;
> @@ -541,7 +541,7 @@ static int dscm1_attach(PCMCIACardState *card)
>  md->attr_base = pcc->cis[0x74] | (pcc->cis[0x76] << 

Re: [Qemu-devel] [RFC PATCH 07/17] hw/i2c: Declare device little or big endian

2019-08-24 Thread David Gibson
On Sat, Aug 24, 2019 at 04:56:28AM +1000, Tony Nguyen wrote:
> For each device declared with DEVICE_NATIVE_ENDIAN, find the set of
> targets from the set of target/hw/*/device.o.
> 
> If the set of targets are all little or all big endian, re-declare
> the device endianness as DEVICE_LITTLE_ENDIAN or DEVICE_BIG_ENDIAN
> respectively.
> 
> This *naive* deduction may result in genuinely native endian devices
> being incorrectly declared as little or big endian, but should not
> introduce regressions for current targets.
> 
> These devices should be re-declared as DEVICE_NATIVE_ENDIAN if 1) it
> has a new target with an opposite endian or 2) someone informed knows
> better =)
> 
> Signed-off-by: Tony Nguyen 
> ---
>  hw/i2c/imx_i2c.c   | 2 +-
>  hw/i2c/mpc_i2c.c   | 2 +-
>  hw/i2c/versatile_i2c.c | 2 +-
>  3 files changed, 3 insertions(+), 3 deletions(-)

ppc part (mpc_i2c.c)
Acked-by: David Gibson 

> 
> diff --git a/hw/i2c/imx_i2c.c b/hw/i2c/imx_i2c.c
> index 30b9aea247..cc2689d967 100644
> --- a/hw/i2c/imx_i2c.c
> +++ b/hw/i2c/imx_i2c.c
> @@ -278,7 +278,7 @@ static const MemoryRegionOps imx_i2c_ops = {
>  .write = imx_i2c_write,
>  .valid.min_access_size = 1,
>  .valid.max_access_size = 2,
> -.endianness = DEVICE_NATIVE_ENDIAN,
> +.endianness = DEVICE_LITTLE_ENDIAN,
>  };
>  
>  static const VMStateDescription imx_i2c_vmstate = {
> diff --git a/hw/i2c/mpc_i2c.c b/hw/i2c/mpc_i2c.c
> index 0aa1be3ce7..b71b5ff7d5 100644
> --- a/hw/i2c/mpc_i2c.c
> +++ b/hw/i2c/mpc_i2c.c
> @@ -306,7 +306,7 @@ static const MemoryRegionOps i2c_ops = {
>  .read =  mpc_i2c_read,
>  .write =  mpc_i2c_write,
>  .valid.max_access_size = 1,
> -.endianness = DEVICE_NATIVE_ENDIAN,
> +.endianness = DEVICE_BIG_ENDIAN,
>  };
>  
>  static const VMStateDescription mpc_i2c_vmstate = {
> diff --git a/hw/i2c/versatile_i2c.c b/hw/i2c/versatile_i2c.c
> index 1ac2a6f59a..c92d3b115c 100644
> --- a/hw/i2c/versatile_i2c.c
> +++ b/hw/i2c/versatile_i2c.c
> @@ -77,7 +77,7 @@ static void versatile_i2c_write(void *opaque, hwaddr offset,
>  static const MemoryRegionOps versatile_i2c_ops = {
>  .read = versatile_i2c_read,
>  .write = versatile_i2c_write,
> -.endianness = DEVICE_NATIVE_ENDIAN,
> +.endianness = DEVICE_LITTLE_ENDIAN,
>  };
>  
>  static void versatile_i2c_init(Object *obj)

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature


Re: [Qemu-devel] [PATCH V2 1/2] tests.acceptance.avocado_qemu: Add support for powerpc

2019-08-24 Thread David Gibson
On Mon, Aug 19, 2019 at 01:58:20PM +0530, sathn...@linux.vnet.ibm.com wrote:
> From: Satheesh Rajendran 
> 
> Current acceptance test will not run properly in powerpc
> environment due qemu target is different from arch, this
> usually matches, except with bi-endian architectures like ppc64.
> uname would return `ppc64` or `ppc64le` based `big` or `little`
> endian but qemu `target` is always `ppc64`. Let's handle it.
> 
> Reviewed-by: Cédric Le Goater 
> Signed-off-by: Satheesh Rajendran 

Reviewed-by: David Gibson 

I sent a similar patch a little while back, but it seems it got lost.


> ---
>  tests/acceptance/avocado_qemu/__init__.py | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/tests/acceptance/avocado_qemu/__init__.py 
> b/tests/acceptance/avocado_qemu/__init__.py
> index aee5d820ed..bd41e0443c 100644
> --- a/tests/acceptance/avocado_qemu/__init__.py
> +++ b/tests/acceptance/avocado_qemu/__init__.py
> @@ -39,6 +39,9 @@ def pick_default_qemu_bin(arch=None):
>  """
>  if arch is None:
>  arch = os.uname()[4]
> +# qemu binary path does not match arch for powerpc, handle it
> +if 'ppc64le' in arch:
> +arch = 'ppc64'
>  qemu_bin_relative_path = os.path.join("%s-softmmu" % arch,
>"qemu-system-%s" % arch)
>  if is_readable_executable_file(qemu_bin_relative_path):

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature


Re: [Qemu-devel] [PATCH v5 0/8] target/ppc: Optimize emulation of some Altivec instructions

2019-08-24 Thread David Gibson
On Mon, Jul 15, 2019 at 04:22:46PM +0200, Stefan Brankovic wrote:
> Optimize emulation of ten Altivec instructions: lvsl, lvsr, vsl, vsr, vpkpx,
> vgbbd, vclzb, vclzh, vclzw and vclzd.
> 
> This series buils up on and complements recent work of Thomas Murta, Mark
> Cave-Ayland and Richard Henderson in the same area. It is based on devising 
> TCG
> translation implementation for selected instructions rather than using 
> helpers.
> The selected instructions are most of the time idiosyncratic to ppc platform,
> so relatively complex TCG translation (without direct mapping to host
> instruction that is not possible in these cases) seems to be the best option,
> and that approach is presented in this series. The performance improvements
> are significant in all cases.

I was waiting for acks from rth on the remainder of this series, but
it seems to have been forgotten.  Can you rebase and resend the
remaining patches for inclusion in ppc-for-4.2.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature


Re: [Qemu-devel] more automated/public CI for QEMU pullreqs

2019-08-24 Thread Alex Bennée


Philippe Mathieu-Daudé  writes:

> On 8/22/19 7:05 PM, Paolo Bonzini wrote:
>> On 22/08/19 18:50, Dr. David Alan Gilbert wrote:
>>> * Paolo Bonzini (pbonz...@redhat.com) wrote:
 On 22/08/19 18:31, Dr. David Alan Gilbert wrote:
>> With both these points in mind, I think it is  pretty hard sell to
>> say we should write & maintain a custom CI system just for QEMU
>> unless it is offering major compelling functionality we can't do
>> without.
>>>
>>> (That was Dan's comment)
>>>
 In theory I agree.

 In practice, the major compelling functionality is portability.  If it
 is true that setting up runners is problematic even on aarch64, frankly
 GitLab CI is dead on arrival.  If it is not true, then I'd be very happy
 to use GitLab CI too.
>>>
>>> IMHO if for some weird reason Gitlab has problems on aarch64 then we
>>> just need to get that fixed.
>>
>> I'm sure it's just some packaging or deployment issue.  But
>> https://gitlab.com/gitlab-org/gitlab-runner/merge_requests/725 has been
>> open for more than one year; the last two messages are:
>>
>> * 1 month ago: "I hope we will be able to merge it soon"
>>
>> * 3 weeks ago: "Today I tried use gitlab-runner on my arm64 box, however
>> it kept mysteriously failing"
>>
>> So the question is simply who does the work.
>
> IIRC Samuel Ortiz told he was using GitLab with Aarch64 runners around
> Nov 2018, but "compiling from source". Alex Bennée tried building it on
> our Packet server during early 2019.
> Later an (unattended?) Ubuntu upgrade installed a package that does not
> work anymore with current GitLab server. I noticed this few months ago,
> built it again and tested it, then looked at what was wrong with the
> upstream MR. The Aarch64 packaging succeed when cross-building on x86_64
> host, but fails when building natively... Since part of it is "built or
> tested in the cloud" and involving Go, I simply let a comment:
> https://gitlab.com/gitlab-org/gitlab-runner/merge_requests/725#note_183470145

I need to have another look at this but from memory it is because the
"cross-build" approach they use is to try and build all arches with a
qemu-user controlled docker build. However if the host architecture is
the target you are aiming for that should run normally - however it
failed on qemu-test because you can't build the armhf binaries there as
not all AArch64 machines support AArch32. There is a bug in the Debian
binfmt_misc scripts that I raised but needs proper fixing.

I think the easiest thing to do would be to document how to build
exactly 1 architecture at a time so you don't have to succeed in
building everything at once and can build something natively without
jumping through hoops.

>
> So to confirm what Paolo said, GitLab runners work on Aarch64
> (and we have it well tested), however there is a packaging issue,
> so it does not work "out of the box".
>
>
> Related to:
>
>   Runner compiled with Go 1.8.7 seems to not work properly with
>   multiarch support. Executing the binary built with Go 1.8.7
>   results with an error [...]
>
> There has been 1 recent fix for the go runner:
> https://bugs.launchpad.net/qemu/+bug/1838946/comments/1
>
> And there is an ongoing discussion about "patch to swap SIGRTMIN + 1
> and SIGRTMAX - 1".

Much as I champion qemu-user for solving build problems I think being
able to build natively should be the quicker and easier fix (not that we
shouldn't fix qemu-user bugs as well).

--
Alex Bennée



Re: [Qemu-devel] [PATCH] configure: more resilient Python version capture

2019-08-24 Thread tony.nguyen
> -python_version=$($python -V 2>&1 | sed -e 's/Python\ //')
> +python_version=$(python2 -c 'import sys; print("%d.%d.%d" % 
> (sys.version_info[0], sys.version_info[1], sys.version_info[2]))' 2>/dev/null)

On a Python 3 only system, configure will no longer print the version.

e.g. 
/* snip */
install   install
pythonpython3 -B ()
slirp support git 
/* snip */

> @@ -6511,6 +6511,7 @@ if ! $python -c 'import sys; sys.exit(sys.version_info 
> < (3,0))'; then
>echo
>echo "warning: Python 2 support is deprecated" >&2 
>echo "warning: Python 3 will be required for building future versions of 
> QEMU" >&2 
> +  python2="y"
>  fi  
...
> -echo "PYTHON_VERSION=$python_version" >> $config_host_mak
> +echo "PYTHON2=$python2" >> $config_host_mak
...
> -ifneq ($(findstring v2,"v$(PYTHON_VERSION)"),v2)
> +ifneq ($(PYTHON2),y)

Succinctly, if Python 3.

We can further ween the world off Python 2 by replacing python2="y" for 
python3="y" and PYTHON2 for PYTHON3.


Re: [Qemu-devel] [PATCH 0/3] mailmap: Clean up

2019-08-24 Thread Alex Bennée


Aleksandar Markovic  writes:

> 23.08.2019. 08.13, "Markus Armbruster"  је написао/ла:
>>
>> Philippe Mathieu-Daudé  writes:
>>
>> > Trivial cleanup of .mailmap to have a nice 'git shortlog' output.
>> >
>> > Philippe Mathieu-Daudé (3):
>> >   mailmap: Reorder by sections
>> >   mailmap: Update philmd email address
>> >   mailmap: Add many entries to improve 'git shortlog' statistics
>> >
>> >  .mailmap | 123 +++
>> >  1 file changed, 115 insertions(+), 8 deletions(-)
>>
>> Series
>> Reviewed-by: Markus Armbruster 
>>
>> However, it increases the difference to contrib/gitdm/aliases.
>
> Alex' initial gitdm effort, as I understood it, was not meant to cover all
> history from 2007 or so, but just to give reasonable statistics for 2018
> (amd future years).
>
> In that light, .mailmap and gitdm aliases do not need to be equivalent.
>
> But perhaps Alex would now want gitdm to be used for all QEMU history? Is
> this desirable?

It would be of interest historically but not something I'd want to spend
a lot of time adding code churn for.

>
> Aleksandar
>
>> I'm just
>> as guilty; my recent "[PATCH 2/2] contrib/gitdm: Add arm...@pond.sub.org
>> to group-map-redhat" updates only that. and not .mailmap.
>>
>> Perhaps we want to keep the two in sync manually.  We should then add
>> suitable comments to each file.
>>
>> Could we instead teach gitdm to use .mailmap, and ditch
>> contrib/gitdm/aliases?
>>
>> aliases' format is documented in gitdm's README.  Each line maps a
>> non-canonical e-mail address to a canonical one.
>>
>> .mailmap's format is documented in git-shortlog(1).  It can do a bit
>> more.  Even the common part differs: it has two addresses in different
>> order *boggle*.
>>


--
Alex Bennée



Re: [Qemu-devel] [PATCH 1/1] protocol: Add NBD_CMD_FLAG_FAST_ZERO

2019-08-24 Thread Wouter Verhelst
On Fri, Aug 23, 2019 at 01:58:44PM -0500, Eric Blake wrote:
> On 8/23/19 1:48 PM, Wouter Verhelst wrote:
> > On Fri, Aug 23, 2019 at 09:34:26AM -0500, Eric Blake wrote:
> >> +- bit 4, `NBD_CMD_FLAG_FAST_ZERO`; valid during
> >> +  `NBD_CMD_WRITE_ZEROES`. If set, but the server cannot perform the
> >> +  write zeroes any faster than it would for an equivalent
> >> +  `NBD_CMD_WRITE`,
> > 
> > One way of fulfilling the letter of this requirement but not its spirit
> > could be to have background writes; that is, the server makes a note
> > that the zeroed region should contain zeroes, makes an error-free reply
> > to the client, and then starts updating things in the background (with
> > proper layering so that an NBD_CMD_READ would see zeroes).
> 
> For writes, this should still be viable IF the server can also cancel
> that background write of zeroes in favor of a foreground request for
> actual data to be written to the same offset.  In other words, as long
> as the behavior to the client is "as if" there is no duplicated I/O
> cost, the zero appears fast, even if it kicked off a long-running async
> process to actually accomplish it.

That's kind of what I was thinking of, yeah.

A background write would cause disk I/O, which *will* cause any write
that happen concurrently with it to slow down. If we need to write
several orders of magnitude of zeroes, then the "fast zero" will
actually cause the following writes to slow down, which could impact
performance.

The cancelling should indeed happen (otherwise ordering of writes will
be wrong, as per the spec), but that doesn't negate the performance
impact.

> > This could negatively impact performance after that command to the
> > effect that syncing the device would be slower rather than faster, if
> > not done right.
> 
> Oh. I see - for flush requests, you're worried about the cost of the
> flush forcing the I/O for the background zero to complete before flush
> can return.
> 
> Perhaps that merely means that a client using fast zero requests as a
> means of probing whether it can do a bulk pre-zero pass even though it
> will be rewriting part of that image with data later SHOULD NOT attempt
> to flush the disk until all other interesting write requests are also
> ready to queue.  In the 'qemu-img convert' case which spawned this
> discussion, that's certainly the case (qemu-img does not call flush
> after the pre-zeroing, but only after all data is copied - and then it
> really DOES want to wait for any remaining backgrounded zeroing to land
> on the disk along with any normal writes when it does its final flush).

Not what I meant, but also a good point, thanks :)

> > Do we want to keep that in consideration?
> 
> Ideas on how best to add what I mentioned above into the specification?

Perhaps clarify that the "fast zero" flag is meant to *improve*
performance, and that it therefore should either be implemented in a way
that does in fact improve performance, or not at all?

-- 
 Home is where you have to wash the dishes.
  -- #debian-devel, Freenode, 2004-09-22