Re: [Qemu-devel] [PATCH for-2.5] tcg: Increase the highwater reservation

2015-12-04 Thread Aurelien Jarno
On 2015-12-03 13:19, Aurelien Jarno wrote:
> On 2015-12-02 10:36, Richard Henderson wrote:
> > On 12/01/2015 08:32 AM, Aurelien Jarno wrote:
> > >On 2015-12-01 08:19, Richard Henderson wrote:
> > >>If there are a lot of guest memory ops in the TB, the amount of
> > >>code generated by tcg_out_tb_finalize could be well more than 1k.
> > >>In the short term, increase the reservation larger than any TB
> > >>seen in practice.
> > >>
> > >>Reported-by: Aurelien Jarno 
> > >>Signed-off-by: Richard Henderson 
> > >
> > >Thanks! I was testing the same patch locally after our discussion, and I
> > >confirm it fixes the issue.
> > 
> > Here's the proper patch for 2.6.  If you'd like to run it through your same
> > tests, please?
> 
> The patch looks fine to me, the only tiny nitpick I have is that you can
> return a bool instead of a hint.
> 
> I am currently testing it (which is basically running a glibc build and
> testsuite multiple time in a VM), I'll let you know the result by
> tomorrow. In the meantime, you can add:
> 
> Reviewed-by: Aurelien Jarno 

I haven't seen any crash so far, which is more time than it usually
needs to trigger the issue. So I guess we can consider that the patch is
correct. Therefore:

Tested-by: Aurelien Jarno 

-- 
Aurelien Jarno  GPG: 4096R/1DDD8C9B
aurel...@aurel32.net http://www.aurel32.net



Re: [Qemu-devel] [PATCH for-2.5] tcg: Increase the highwater reservation

2015-12-03 Thread Aurelien Jarno
On 2015-12-02 10:36, Richard Henderson wrote:
> On 12/01/2015 08:32 AM, Aurelien Jarno wrote:
> >On 2015-12-01 08:19, Richard Henderson wrote:
> >>If there are a lot of guest memory ops in the TB, the amount of
> >>code generated by tcg_out_tb_finalize could be well more than 1k.
> >>In the short term, increase the reservation larger than any TB
> >>seen in practice.
> >>
> >>Reported-by: Aurelien Jarno 
> >>Signed-off-by: Richard Henderson 
> >
> >Thanks! I was testing the same patch locally after our discussion, and I
> >confirm it fixes the issue.
> 
> Here's the proper patch for 2.6.  If you'd like to run it through your same
> tests, please?

The patch looks fine to me, the only tiny nitpick I have is that you can
return a bool instead of a hint.

I am currently testing it (which is basically running a glibc build and
testsuite multiple time in a VM), I'll let you know the result by
tomorrow. In the meantime, you can add:

Reviewed-by: Aurelien Jarno 

-- 
Aurelien Jarno  GPG: 4096R/1DDD8C9B
aurel...@aurel32.net http://www.aurel32.net



Re: [Qemu-devel] [PATCH for-2.5] tcg: Increase the highwater reservation

2015-12-02 Thread Richard Henderson

On 12/01/2015 08:32 AM, Aurelien Jarno wrote:

On 2015-12-01 08:19, Richard Henderson wrote:

If there are a lot of guest memory ops in the TB, the amount of
code generated by tcg_out_tb_finalize could be well more than 1k.
In the short term, increase the reservation larger than any TB
seen in practice.

Reported-by: Aurelien Jarno 
Signed-off-by: Richard Henderson 


Thanks! I was testing the same patch locally after our discussion, and I
confirm it fixes the issue.


Here's the proper patch for 2.6.  If you'd like to run it through your same 
tests, please?



r~
diff --git a/tcg/ia64/tcg-target.c b/tcg/ia64/tcg-target.c
index 647e9a6..b30d643 100644
--- a/tcg/ia64/tcg-target.c
+++ b/tcg/ia64/tcg-target.c
@@ -1572,7 +1572,7 @@ static void add_qemu_ldst_label(TCGContext *s, bool 
is_ld, TCGMemOp opc,
 be->labels = l;
 }
 
-static void tcg_out_tb_finalize(TCGContext *s)
+static int tcg_out_tb_finalize(TCGContext *s)
 {
 static const void * const helpers[8] = {
 helper_ret_stb_mmu,
@@ -1620,7 +1620,16 @@ static void tcg_out_tb_finalize(TCGContext *s)
 }
 
 reloc_pcrel21b_slot2(l->label_ptr, dest);
+
+/* Test for (pending) buffer overflow.  The assumption is that any
+   one operation beginning below the high water mark cannot overrun
+   the buffer completely.  Thus we can test for overflow after
+   generating code without having to check during generation.  */
+if (unlikely((void *)s->code_ptr > s->code_gen_highwater)) {
+return -1;
+}
 }
+return 0;
 }
 
 static inline void tcg_out_qemu_ld(TCGContext *s, const TCGArg *args)
diff --git a/tcg/tcg-be-ldst.h b/tcg/tcg-be-ldst.h
index 40a2369..08222a3 100644
--- a/tcg/tcg-be-ldst.h
+++ b/tcg/tcg-be-ldst.h
@@ -56,7 +56,7 @@ static inline void tcg_out_tb_init(TCGContext *s)
 static void tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *l);
 static void tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *l);
 
-static void tcg_out_tb_finalize(TCGContext *s)
+static int tcg_out_tb_finalize(TCGContext *s)
 {
 TCGLabelQemuLdst *lb;
 
@@ -67,7 +67,16 @@ static void tcg_out_tb_finalize(TCGContext *s)
 } else {
 tcg_out_qemu_st_slow_path(s, lb);
 }
+
+/* Test for (pending) buffer overflow.  The assumption is that any
+   one operation beginning below the high water mark cannot overrun
+   the buffer completely.  Thus we can test for overflow after
+   generating code without having to check during generation.  */
+if (unlikely((void *)s->code_ptr > s->code_gen_highwater)) {
+return -1;
+}
 }
+return 0;
 }
 
 /*
diff --git a/tcg/tcg-be-null.h b/tcg/tcg-be-null.h
index 74c57d5..051f609 100644
--- a/tcg/tcg-be-null.h
+++ b/tcg/tcg-be-null.h
@@ -38,6 +38,7 @@ static inline void tcg_out_tb_init(TCGContext *s)
  * Generate TB finalization at the end of block
  */
 
-static inline void tcg_out_tb_finalize(TCGContext *s)
+static inline int tcg_out_tb_finalize(TCGContext *s)
 {
+return 0;
 }
diff --git a/tcg/tcg.c b/tcg/tcg.c
index a163541..841ed53 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -110,7 +110,7 @@ static void tcg_out_call(TCGContext *s, tcg_insn_unit 
*target);
 static int tcg_target_const_match(tcg_target_long val, TCGType type,
   const TCGArgConstraint *arg_ct);
 static void tcg_out_tb_init(TCGContext *s);
-static void tcg_out_tb_finalize(TCGContext *s);
+static int tcg_out_tb_finalize(TCGContext *s);
 
 
 
@@ -388,11 +388,7 @@ void tcg_prologue_init(TCGContext *s)
 /* Compute a high-water mark, at which we voluntarily flush the buffer
and start over.  The size here is arbitrary, significantly larger
than we expect the code generation for any one opcode to require.  */
-/* ??? We currently have no good estimate for, or checks in,
-   tcg_out_tb_finalize.  If there are quite a lot of guest memory ops,
-   the number of out-of-line fragments could be quite high.  In the
-   short-term, increase the highwater buffer.  */
-s->code_gen_highwater = s->code_gen_buffer + (total_size - 64*1024);
+s->code_gen_highwater = s->code_gen_buffer + (total_size - 1024);
 
 tcg_register_jit(s->code_gen_buffer, total_size);
 
@@ -2455,7 +2451,9 @@ int tcg_gen_code(TCGContext *s, tcg_insn_unit 
*gen_code_buf)
 s->gen_insn_end_off[num_insns] = tcg_current_code_size(s);
 
 /* Generate TB finalization at the end of block */
-tcg_out_tb_finalize(s);
+if (tcg_out_tb_finalize(s) < 0) {
+return -1;
+}
 
 /* flush instruction cache */
 flush_icache_range((uintptr_t)s->code_buf, (uintptr_t)s->code_ptr);


Re: [Qemu-devel] [PATCH for-2.5] tcg: Increase the highwater reservation

2015-12-01 Thread Richard Henderson

On 12/01/2015 08:40 AM, Aurelien Jarno wrote:

On 2015-12-01 17:34, Aurelien Jarno wrote:

On 2015-12-01 16:28, Peter Maydell wrote:

On 1 December 2015 at 16:19, Richard Henderson  wrote:

If there are a lot of guest memory ops in the TB, the amount of
code generated by tcg_out_tb_finalize could be well more than 1k.
In the short term, increase the reservation larger than any TB
seen in practice.

Reported-by: Aurelien Jarno 
Signed-off-by: Richard Henderson 
---

Reported and discussed with Aurelien on IRC yesterday.  This seems
to be the easiest fix for the upcoming release.  I will fix this
properly (by modifying every backend's finalize routines) for 2.6.


What would be the result of our hitting this bug? I ask because
there's a report on qemu-discuss about a qemu-i386-on-ARM-host
bug: http://lists.nongnu.org/archive/html/qemu-discuss/2015-11/msg00042.html
and the debug log (http://www.mediafire.com/download/ge611be9vbebbw7/qemu.log)
suggests we're segfaulting in translation on the TB shortly
after we (successfully) translate a TB whose final 'out' size
is 1100 and which has 64 guest writes in it. So I'm wondering
if that's actually the same bug this is fixing...


I don't think this is the same bug. The problem happens because the slow
path of the softmmu load/store access is written at the end of the TB.
In user mode, there is no slow path, so nothing is written at the end.


Oh quite right.  Duh.


That said the problem reported is likely fixed by this commit that went
just after it has been reported:


It does seem likely, but I don't see how we can know that the out size is 1100 
in that situation.  The disassembler dump doesn't happen until after we've done 
all of the writes that would have resulted in a highwater overflow segv.


I don't see how this can be the same bug.


r~



Re: [Qemu-devel] [PATCH for-2.5] tcg: Increase the highwater reservation

2015-12-01 Thread Peter Maydell
On 1 December 2015 at 23:06, Richard Henderson  wrote:
> On 12/01/2015 08:40 AM, Aurelien Jarno wrote:
>> That said the problem reported is likely fixed by this commit that went
>> just after it has been reported:
>
>
> It does seem likely, but I don't see how we can know that the out size is
> 1100 in that situation.  The disassembler dump doesn't happen until after
> we've done all of the writes that would have resulted in a highwater
> overflow segv.

Yeah, if we always cleanly segv immediately on highwater overflow
(as opposed to corrupting something so a later translation crashes)
then this can't be the bug that's reported for qemu-i386. The
actual TB that we never finish translating is quite small:

IN:
0x419552e0:  push   %ebp
0x419552e1:  mov%esp,%ebp
0x419552e3:  sub$0x18,%esp
0x419552e6:  fldl   0x8(%ebp)
0x419552e9:  fstpl  -0x8(%ebp)
0x419552ec:  movl   $0x1400,0x4(%esp)
0x419552f4:  movl   $0x2,(%esp)
0x419552fb:  call   0x41954b96

thanks
-- PMM



Re: [Qemu-devel] [PATCH for-2.5] tcg: Increase the highwater reservation

2015-12-01 Thread Richard Henderson

On 12/01/2015 08:28 AM, Peter Maydell wrote:

On 1 December 2015 at 16:19, Richard Henderson  wrote:

If there are a lot of guest memory ops in the TB, the amount of
code generated by tcg_out_tb_finalize could be well more than 1k.
In the short term, increase the reservation larger than any TB
seen in practice.

Reported-by: Aurelien Jarno 
Signed-off-by: Richard Henderson 
---

Reported and discussed with Aurelien on IRC yesterday.  This seems
to be the easiest fix for the upcoming release.  I will fix this
properly (by modifying every backend's finalize routines) for 2.6.


What would be the result of our hitting this bug?


A segfault, writing to the guard page for the code_gen buffer.


I ask because
there's a report on qemu-discuss about a qemu-i386-on-ARM-host
bug: http://lists.nongnu.org/archive/html/qemu-discuss/2015-11/msg00042.html
and the debug log (http://www.mediafire.com/download/ge611be9vbebbw7/qemu.log)
suggests we're segfaulting in translation on the TB shortly
after we (successfully) translate a TB whose final 'out' size
is 1100 and which has 64 guest writes in it. So I'm wondering
if that's actually the same bug this is fixing...


It's plausible.

The maximum 32-bit memory op for arm requires 9 insns in the slow path.  Times 
64 that's 2304 bytes, which exceeds the current highwater buffer space.


The new 64k buffer allows for 1820 (arm backend) writes before exceeding the 
highwater buffer.  Which is significantly more than TCG_MAX_INSNS (512), though 
not even close to OPC_MAX_SIZE (170240), which would require > 6MB in highwater 
space.



r~



Re: [Qemu-devel] [PATCH for-2.5] tcg: Increase the highwater reservation

2015-12-01 Thread Peter Maydell
On 1 December 2015 at 16:19, Richard Henderson  wrote:
> If there are a lot of guest memory ops in the TB, the amount of
> code generated by tcg_out_tb_finalize could be well more than 1k.
> In the short term, increase the reservation larger than any TB
> seen in practice.
>
> Reported-by: Aurelien Jarno 
> Signed-off-by: Richard Henderson 
> ---
>
> Reported and discussed with Aurelien on IRC yesterday.  This seems
> to be the easiest fix for the upcoming release.  I will fix this
> properly (by modifying every backend's finalize routines) for 2.6.

What would be the result of our hitting this bug? I ask because
there's a report on qemu-discuss about a qemu-i386-on-ARM-host
bug: http://lists.nongnu.org/archive/html/qemu-discuss/2015-11/msg00042.html
and the debug log (http://www.mediafire.com/download/ge611be9vbebbw7/qemu.log)
suggests we're segfaulting in translation on the TB shortly
after we (successfully) translate a TB whose final 'out' size
is 1100 and which has 64 guest writes in it. So I'm wondering
if that's actually the same bug this is fixing...

thanks
-- PMM



Re: [Qemu-devel] [PATCH for-2.5] tcg: Increase the highwater reservation

2015-12-01 Thread Aurelien Jarno
On 2015-12-01 08:19, Richard Henderson wrote:
> If there are a lot of guest memory ops in the TB, the amount of
> code generated by tcg_out_tb_finalize could be well more than 1k.
> In the short term, increase the reservation larger than any TB
> seen in practice.
> 
> Reported-by: Aurelien Jarno 
> Signed-off-by: Richard Henderson 

Thanks! I was testing the same patch locally after our discussion, and I
confirm it fixes the issue.

Reviewed-by: Aurelien Jarno 
Tested-by: Aurelien Jarno 


-- 
Aurelien Jarno  GPG: 4096R/1DDD8C9B
aurel...@aurel32.net http://www.aurel32.net



Re: [Qemu-devel] [PATCH for-2.5] tcg: Increase the highwater reservation

2015-12-01 Thread Aurelien Jarno
On 2015-12-01 16:28, Peter Maydell wrote:
> On 1 December 2015 at 16:19, Richard Henderson  wrote:
> > If there are a lot of guest memory ops in the TB, the amount of
> > code generated by tcg_out_tb_finalize could be well more than 1k.
> > In the short term, increase the reservation larger than any TB
> > seen in practice.
> >
> > Reported-by: Aurelien Jarno 
> > Signed-off-by: Richard Henderson 
> > ---
> >
> > Reported and discussed with Aurelien on IRC yesterday.  This seems
> > to be the easiest fix for the upcoming release.  I will fix this
> > properly (by modifying every backend's finalize routines) for 2.6.
> 
> What would be the result of our hitting this bug? I ask because
> there's a report on qemu-discuss about a qemu-i386-on-ARM-host
> bug: http://lists.nongnu.org/archive/html/qemu-discuss/2015-11/msg00042.html
> and the debug log (http://www.mediafire.com/download/ge611be9vbebbw7/qemu.log)
> suggests we're segfaulting in translation on the TB shortly
> after we (successfully) translate a TB whose final 'out' size
> is 1100 and which has 64 guest writes in it. So I'm wondering
> if that's actually the same bug this is fixing...

I don't think this is the same bug. The problem happens because the slow
path of the softmmu load/store access is written at the end of the TB.
In user mode, there is no slow path, so nothing is written at the end.

-- 
Aurelien Jarno  GPG: 4096R/1DDD8C9B
aurel...@aurel32.net http://www.aurel32.net



Re: [Qemu-devel] [PATCH for-2.5] tcg: Increase the highwater reservation

2015-12-01 Thread Aurelien Jarno
On 2015-12-01 17:34, Aurelien Jarno wrote:
> On 2015-12-01 16:28, Peter Maydell wrote:
> > On 1 December 2015 at 16:19, Richard Henderson  wrote:
> > > If there are a lot of guest memory ops in the TB, the amount of
> > > code generated by tcg_out_tb_finalize could be well more than 1k.
> > > In the short term, increase the reservation larger than any TB
> > > seen in practice.
> > >
> > > Reported-by: Aurelien Jarno 
> > > Signed-off-by: Richard Henderson 
> > > ---
> > >
> > > Reported and discussed with Aurelien on IRC yesterday.  This seems
> > > to be the easiest fix for the upcoming release.  I will fix this
> > > properly (by modifying every backend's finalize routines) for 2.6.
> > 
> > What would be the result of our hitting this bug? I ask because
> > there's a report on qemu-discuss about a qemu-i386-on-ARM-host
> > bug: http://lists.nongnu.org/archive/html/qemu-discuss/2015-11/msg00042.html
> > and the debug log 
> > (http://www.mediafire.com/download/ge611be9vbebbw7/qemu.log)
> > suggests we're segfaulting in translation on the TB shortly
> > after we (successfully) translate a TB whose final 'out' size
> > is 1100 and which has 64 guest writes in it. So I'm wondering
> > if that's actually the same bug this is fixing...
> 
> I don't think this is the same bug. The problem happens because the slow
> path of the softmmu load/store access is written at the end of the TB.
> In user mode, there is no slow path, so nothing is written at the end.

That said the problem reported is likely fixed by this commit that went
just after it has been reported:

commit 644da9b39e477caa80bab69d2847dfcb468f0d33
Author: John Clarke 
Date:   Thu Nov 19 10:30:50 2015 +0100

tcg: Fix highwater check

A simple typo in the variable to use when comparing vs the highwater mark.
Reports are that qemu can in fact segfault occasionally due to this mistake.

Signed-off-by: John Clarke 
Signed-off-by: Richard Henderson 

-- 
Aurelien Jarno  GPG: 4096R/1DDD8C9B
aurel...@aurel32.net http://www.aurel32.net