Re: [Qemu-devel] [PULL 07/11] cpu-exec: don't overwrite exception_index

2018-01-10 Thread Pavel Dovgalyuk
> From: Peter Maydell [mailto:peter.mayd...@linaro.org]
> On 10 January 2018 at 07:04, Pavel Dovgalyuk  wrote:
> > The failure cause is in incorrect interrupt processing.
> > When ARM processes hardware interrupt in arm_cpu_exec_interrupt(),
> > it executes cs->exception_index = excp_idx;
> >
> > This assumes, that the exception will be processed later.
> > But it is processed immediately by calling cc->do_interrupt(cs);
> > instead of leaving this job to cpu_exec.
> 
> That seems fine to me. The code knows it needs to take
> an interrupt, it has all the information it needs to do
> it, it can just go ahead and call the code that takes
> the interrupt. The comment in accel/tcg/cpu-exec.c says:
> 
> /* The target hook has 3 exit conditions:
>False when the interrupt isn't processed,
>True when it is, and we should restart on a new TB,
>and via longjmp via cpu_loop_exit.  */
> 
> and here we have processed the interrupt and returned true.
> We set exception_index because that's how you tell the
> do_interrupt hook which interrupt to deal with.
> 
> That is, the pattern that the arm target code assumes for
> cs->exception_index is "you don't set this unless you're
> about to call do_interrupt; if you do set it then definitely
> call do_interrupt and don't do anything much in between".
> In that view of the world there's no need to reset it or
> check it because nothing is permitted to happen between
> "set value" and "call do_interrupt". This is in line with
> the way we handle other arm-specific bits of information
> associated with the exception, like env->exception.target_el.
> Having a long gap between "set value" and "do_interrupt"
> is worrying because it means that maybe we might end
> up doing something else in that gap that corrupts the
> various bits of information associated with the exception,
> or something that's not architecturally permitted to
> happen at that point.

I see.
I found the same pattern in other targets.
But only MIPS resets exception_index after processing
the interrupt. Others do not bother.
Then the following change of cpu_exec should be correct?

if (cc->cpu_exec_interrupt(cpu, interrupt_request)) {
replay_interrupt();
+   cpu->exception_index = -1;
*last_tb = NULL;
}

Pavel Dovgalyuk




Re: [Qemu-devel] [PULL 07/11] cpu-exec: don't overwrite exception_index

2018-01-10 Thread Peter Maydell
On 10 January 2018 at 07:04, Pavel Dovgalyuk  wrote:
> The failure cause is in incorrect interrupt processing.
> When ARM processes hardware interrupt in arm_cpu_exec_interrupt(),
> it executes cs->exception_index = excp_idx;
>
> This assumes, that the exception will be processed later.
> But it is processed immediately by calling cc->do_interrupt(cs);
> instead of leaving this job to cpu_exec.

That seems fine to me. The code knows it needs to take
an interrupt, it has all the information it needs to do
it, it can just go ahead and call the code that takes
the interrupt. The comment in accel/tcg/cpu-exec.c says:

/* The target hook has 3 exit conditions:
   False when the interrupt isn't processed,
   True when it is, and we should restart on a new TB,
   and via longjmp via cpu_loop_exit.  */

and here we have processed the interrupt and returned true.
We set exception_index because that's how you tell the
do_interrupt hook which interrupt to deal with.

That is, the pattern that the arm target code assumes for
cs->exception_index is "you don't set this unless you're
about to call do_interrupt; if you do set it then definitely
call do_interrupt and don't do anything much in between".
In that view of the world there's no need to reset it or
check it because nothing is permitted to happen between
"set value" and "call do_interrupt". This is in line with
the way we handle other arm-specific bits of information
associated with the exception, like env->exception.target_el.
Having a long gap between "set value" and "do_interrupt"
is worrying because it means that maybe we might end
up doing something else in that gap that corrupts the
various bits of information associated with the exception,
or something that's not architecturally permitted to
happen at that point.

thanks
-- PMM



Re: [Qemu-devel] [PULL 07/11] cpu-exec: don't overwrite exception_index

2018-01-09 Thread Pavel Dovgalyuk
> From: Peter Maydell [mailto:peter.mayd...@linaro.org]
> On 9 January 2018 at 13:21, Pavel Dovgalyuk  wrote:
> > I tried to get some logs with the following code.
> > It prints that there was an exception 5 and it was overwritten by the 
> > standard code.
> > Fixed code prevents this overwrite.
> >
> > I guess that one of the following is true:
> >  - unfixed version misses some exceptions
> >  - fixed version processes some exceptions twice (e.g., when there is no 
> > clear exception)
> >
> > diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
> > index 280200f..fa810f7 100644
> > --- a/accel/tcg/cpu-exec.c
> > +++ b/accel/tcg/cpu-exec.c
> > @@ -605,6 +605,8 @@ static inline bool cpu_handle_interrupt(CPUState *cpu,
> >  /* Finally, check if we need to exit to the main loop.  */
> >  if (unlikely(atomic_read(>exit_request)
> >  || (use_icount && cpu->icount_decr.u16.low + cpu->icount_extra == 
> > 0)))
> > +if (cpu->exception_index != -1 && cpu->exception_index != 
> > EXCP_INTERRUP
> > +qemu_log("overwriting excp_index %x\n", cpu->exception_index);
> >  atomic_set(>exit_request, 0);
> >  cpu->exception_index = EXCP_INTERRUPT;
> >  return true;
> 
> This looks like it's just working around whatever is going on
> (why should EXCP_INTERRUPT be special?). What we need to do is
> find out what's actually happening here...

The failure cause is in incorrect interrupt processing.
When ARM processes hardware interrupt in arm_cpu_exec_interrupt(),
it executes cs->exception_index = excp_idx;

This assumes, that the exception will be processed later.
But it is processed immediately by calling cc->do_interrupt(cs);
instead of leaving this job to cpu_exec.

I guess these calls should be removed to match the cpu_exec execution pattern.

Pavel Dovgalyuk




Re: [Qemu-devel] [PULL 07/11] cpu-exec: don't overwrite exception_index

2018-01-09 Thread Peter Maydell
On 9 January 2018 at 13:21, Pavel Dovgalyuk  wrote:
> I tried to get some logs with the following code.
> It prints that there was an exception 5 and it was overwritten by the 
> standard code.
> Fixed code prevents this overwrite.
>
> I guess that one of the following is true:
>  - unfixed version misses some exceptions
>  - fixed version processes some exceptions twice (e.g., when there is no 
> clear exception)
>
> diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
> index 280200f..fa810f7 100644
> --- a/accel/tcg/cpu-exec.c
> +++ b/accel/tcg/cpu-exec.c
> @@ -605,6 +605,8 @@ static inline bool cpu_handle_interrupt(CPUState *cpu,
>  /* Finally, check if we need to exit to the main loop.  */
>  if (unlikely(atomic_read(>exit_request)
>  || (use_icount && cpu->icount_decr.u16.low + cpu->icount_extra == 
> 0)))
> +if (cpu->exception_index != -1 && cpu->exception_index != 
> EXCP_INTERRUP
> +qemu_log("overwriting excp_index %x\n", cpu->exception_index);
>  atomic_set(>exit_request, 0);
>  cpu->exception_index = EXCP_INTERRUPT;
>  return true;

This looks like it's just working around whatever is going on
(why should EXCP_INTERRUPT be special?). What we need to do is
find out what's actually happening here...

thanks
-- PMM



Re: [Qemu-devel] [PULL 07/11] cpu-exec: don't overwrite exception_index

2018-01-09 Thread Pavel Dovgalyuk
> From: Peter Maydell [mailto:peter.mayd...@linaro.org]
> On 20 November 2017 at 11:06, Peter Maydell  wrote:
> > On 20 November 2017 at 10:25, Pavel Dovgalyuk  wrote:
> >>> From: Peter Maydell [mailto:peter.mayd...@linaro.org]
> >>> On 17 November 2017 at 20:26, Paolo Bonzini  wrote:
> >>> > On 17/11/2017 21:07, Peter Maydell wrote:
> >>> >> Hi. This commit breaks booting of Debian on aarch64 virt board.
> >>> >> (repro instructions for creating the image available at:
> >>> >> https://translatedcode.wordpress.com/2017/07/24/installing-debian-on-qemus-64-bit-arm-
> virt-
> >>> board/)
> >>> >> The guest kernel never prints anything to the serial port.
> >>> >>
> >>> >> Reverting this commit fixes master for me, so I plan to do
> >>> >> that on Monday.
> >>> >
> >>> > Maybe you can also test moving the atomic_set inside the "if".  It does
> >>> > seem to be a genuine bugfix.
> >>>
> >>> No, that doesn't help: guest still sits there like a lemon.
> >>
> >> Maybe this is a more complex problem?
> >> I tried removing this if and aarch64 still does not work.
> >
> > Reverting the commit fixes it for me; I have that going through
> > build tests and will push the revert later today.
> 
> Revert pushed to git master.
> 
> More generally, this commit seems to assume that QEMU always
> does:
>  * set exception_index to something
>  * handle that
>  * clear exception_index to -1
> 
> but it's not clear to me that it's actually always the case
> that it gets cleared back to -1.

I tried to get some logs with the following code.
It prints that there was an exception 5 and it was overwritten by the standard 
code.
Fixed code prevents this overwrite.

I guess that one of the following is true:
 - unfixed version misses some exceptions
 - fixed version processes some exceptions twice (e.g., when there is no clear 
exception)

diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
index 280200f..fa810f7 100644
--- a/accel/tcg/cpu-exec.c
+++ b/accel/tcg/cpu-exec.c
@@ -605,6 +605,8 @@ static inline bool cpu_handle_interrupt(CPUState *cpu,
 /* Finally, check if we need to exit to the main loop.  */
 if (unlikely(atomic_read(>exit_request)
 || (use_icount && cpu->icount_decr.u16.low + cpu->icount_extra == 0))) 
+if (cpu->exception_index != -1 && cpu->exception_index != EXCP_INTERRUP
+qemu_log("overwriting excp_index %x\n", cpu->exception_index);
 atomic_set(>exit_request, 0);
 cpu->exception_index = EXCP_INTERRUPT;
 return true;

Pavel Dovgalyuk




Re: [Qemu-devel] [PULL 07/11] cpu-exec: don't overwrite exception_index

2017-11-20 Thread Paolo Bonzini
On 20/11/2017 13:50, Peter Maydell wrote:
> More generally, this commit seems to assume that QEMU always
> does:
>  * set exception_index to something
>  * handle that
>  * clear exception_index to -1
> 
> but it's not clear to me that it's actually always the case
> that it gets cleared back to -1.

After returning from cpu_handle_interrupt, cpu_exec goes to
cpu_handle_exception which does

if (cpu->exception_index >= EXCP_INTERRUPT) {
*ret = cpu->exception_index;
if (*ret == EXCP_DEBUG) {
cpu_handle_debug_exception(cpu);
}
cpu->exception_index = -1;
return true;
} else {
CPUClass *cc = CPU_GET_CLASS(cpu);
qemu_mutex_lock_iothread();
cc->do_interrupt(cpu);
qemu_mutex_unlock_iothread();
cpu->exception_index = -1;
}

return false;

Does ARM have a case where cc->do_interrupt can longjmp back to the
beginning of cpu_handle_exception?  But I still do not understand why
you don't eventually clear exception_index to -1.  Maybe there should be
an assertion for that before and after cpu_handle_interrupt.

Thanks,

Paolo



Re: [Qemu-devel] [PULL 07/11] cpu-exec: don't overwrite exception_index

2017-11-20 Thread Peter Maydell
On 20 November 2017 at 11:06, Peter Maydell  wrote:
> On 20 November 2017 at 10:25, Pavel Dovgalyuk  wrote:
>>> From: Peter Maydell [mailto:peter.mayd...@linaro.org]
>>> On 17 November 2017 at 20:26, Paolo Bonzini  wrote:
>>> > On 17/11/2017 21:07, Peter Maydell wrote:
>>> >> Hi. This commit breaks booting of Debian on aarch64 virt board.
>>> >> (repro instructions for creating the image available at:
>>> >> https://translatedcode.wordpress.com/2017/07/24/installing-debian-on-qemus-64-bit-arm-virt-
>>> board/)
>>> >> The guest kernel never prints anything to the serial port.
>>> >>
>>> >> Reverting this commit fixes master for me, so I plan to do
>>> >> that on Monday.
>>> >
>>> > Maybe you can also test moving the atomic_set inside the "if".  It does
>>> > seem to be a genuine bugfix.
>>>
>>> No, that doesn't help: guest still sits there like a lemon.
>>
>> Maybe this is a more complex problem?
>> I tried removing this if and aarch64 still does not work.
>
> Reverting the commit fixes it for me; I have that going through
> build tests and will push the revert later today.

Revert pushed to git master.

More generally, this commit seems to assume that QEMU always
does:
 * set exception_index to something
 * handle that
 * clear exception_index to -1

but it's not clear to me that it's actually always the case
that it gets cleared back to -1.

thanks
-- PMM



Re: [Qemu-devel] [PULL 07/11] cpu-exec: don't overwrite exception_index

2017-11-20 Thread Peter Maydell
On 20 November 2017 at 10:25, Pavel Dovgalyuk  wrote:
>> From: Peter Maydell [mailto:peter.mayd...@linaro.org]
>> On 17 November 2017 at 20:26, Paolo Bonzini  wrote:
>> > On 17/11/2017 21:07, Peter Maydell wrote:
>> >> Hi. This commit breaks booting of Debian on aarch64 virt board.
>> >> (repro instructions for creating the image available at:
>> >> https://translatedcode.wordpress.com/2017/07/24/installing-debian-on-qemus-64-bit-arm-virt-
>> board/)
>> >> The guest kernel never prints anything to the serial port.
>> >>
>> >> Reverting this commit fixes master for me, so I plan to do
>> >> that on Monday.
>> >
>> > Maybe you can also test moving the atomic_set inside the "if".  It does
>> > seem to be a genuine bugfix.
>>
>> No, that doesn't help: guest still sits there like a lemon.
>
> Maybe this is a more complex problem?
> I tried removing this if and aarch64 still does not work.

Reverting the commit fixes it for me; I have that going through
build tests and will push the revert later today.

thanks
-- PMM



Re: [Qemu-devel] [PULL 07/11] cpu-exec: don't overwrite exception_index

2017-11-20 Thread Pavel Dovgalyuk
> From: Peter Maydell [mailto:peter.mayd...@linaro.org]
> On 17 November 2017 at 20:26, Paolo Bonzini  wrote:
> > On 17/11/2017 21:07, Peter Maydell wrote:
> >> Hi. This commit breaks booting of Debian on aarch64 virt board.
> >> (repro instructions for creating the image available at:
> >> https://translatedcode.wordpress.com/2017/07/24/installing-debian-on-qemus-64-bit-arm-virt-
> board/)
> >> The guest kernel never prints anything to the serial port.
> >>
> >> Reverting this commit fixes master for me, so I plan to do
> >> that on Monday.
> >
> > Maybe you can also test moving the atomic_set inside the "if".  It does
> > seem to be a genuine bugfix.
> 
> No, that doesn't help: guest still sits there like a lemon.

Maybe this is a more complex problem?
I tried removing this if and aarch64 still does not work.

Pavel Dovgalyuk




Re: [Qemu-devel] [PULL 07/11] cpu-exec: don't overwrite exception_index

2017-11-17 Thread Peter Maydell
On 17 November 2017 at 20:26, Paolo Bonzini  wrote:
> On 17/11/2017 21:07, Peter Maydell wrote:
>> Hi. This commit breaks booting of Debian on aarch64 virt board.
>> (repro instructions for creating the image available at:
>> https://translatedcode.wordpress.com/2017/07/24/installing-debian-on-qemus-64-bit-arm-virt-board/)
>> The guest kernel never prints anything to the serial port.
>>
>> Reverting this commit fixes master for me, so I plan to do
>> that on Monday.
>
> Maybe you can also test moving the atomic_set inside the "if".  It does
> seem to be a genuine bugfix.

No, that doesn't help: guest still sits there like a lemon.

thanks
-- PMM



Re: [Qemu-devel] [PULL 07/11] cpu-exec: don't overwrite exception_index

2017-11-17 Thread Paolo Bonzini
On 17/11/2017 21:07, Peter Maydell wrote:
> On 16 November 2017 at 11:59, Paolo Bonzini  wrote:
>> From: Pavel Dovgalyuk 
>>
>> This patch adds a condition before overwriting exception_index fiels.
>> It is needed when exception_index is already set to some meaningful value.
>>
>> Signed-off-by: Pavel Dovgalyuk 
>>
>> Message-Id: <20171114081812.27640.26372.stgit@pasha-VirtualBox>
>> Signed-off-by: Paolo Bonzini 
>> ---
>>  accel/tcg/cpu-exec.c | 4 +++-
>>  1 file changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
>> index 61297f8f4a..0473055a08 100644
>> --- a/accel/tcg/cpu-exec.c
>> +++ b/accel/tcg/cpu-exec.c
>> @@ -594,7 +594,9 @@ static inline bool cpu_handle_interrupt(CPUState *cpu,
>>  if (unlikely(atomic_read(>exit_request)
>>  || (use_icount && cpu->icount_decr.u16.low + cpu->icount_extra == 
>> 0))) {
>>  atomic_set(>exit_request, 0);
>> -cpu->exception_index = EXCP_INTERRUPT;
>> +if (cpu->exception_index == -1) {
>> +cpu->exception_index = EXCP_INTERRUPT;
>> +}
>>  return true;
>>  }
> 
> Hi. This commit breaks booting of Debian on aarch64 virt board.
> (repro instructions for creating the image available at:
> https://translatedcode.wordpress.com/2017/07/24/installing-debian-on-qemus-64-bit-arm-virt-board/)
> The guest kernel never prints anything to the serial port.
> 
> Reverting this commit fixes master for me, so I plan to do
> that on Monday.

Maybe you can also test moving the atomic_set inside the "if".  It does
seem to be a genuine bugfix.

Paolo



Re: [Qemu-devel] [PULL 07/11] cpu-exec: don't overwrite exception_index

2017-11-17 Thread Peter Maydell
On 16 November 2017 at 11:59, Paolo Bonzini  wrote:
> From: Pavel Dovgalyuk 
>
> This patch adds a condition before overwriting exception_index fiels.
> It is needed when exception_index is already set to some meaningful value.
>
> Signed-off-by: Pavel Dovgalyuk 
>
> Message-Id: <20171114081812.27640.26372.stgit@pasha-VirtualBox>
> Signed-off-by: Paolo Bonzini 
> ---
>  accel/tcg/cpu-exec.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
> index 61297f8f4a..0473055a08 100644
> --- a/accel/tcg/cpu-exec.c
> +++ b/accel/tcg/cpu-exec.c
> @@ -594,7 +594,9 @@ static inline bool cpu_handle_interrupt(CPUState *cpu,
>  if (unlikely(atomic_read(>exit_request)
>  || (use_icount && cpu->icount_decr.u16.low + cpu->icount_extra == 
> 0))) {
>  atomic_set(>exit_request, 0);
> -cpu->exception_index = EXCP_INTERRUPT;
> +if (cpu->exception_index == -1) {
> +cpu->exception_index = EXCP_INTERRUPT;
> +}
>  return true;
>  }

Hi. This commit breaks booting of Debian on aarch64 virt board.
(repro instructions for creating the image available at:
https://translatedcode.wordpress.com/2017/07/24/installing-debian-on-qemus-64-bit-arm-virt-board/)
The guest kernel never prints anything to the serial port.

Reverting this commit fixes master for me, so I plan to do
that on Monday.

thanks
-- PMM



[Qemu-devel] [PULL 07/11] cpu-exec: don't overwrite exception_index

2017-11-16 Thread Paolo Bonzini
From: Pavel Dovgalyuk 

This patch adds a condition before overwriting exception_index fiels.
It is needed when exception_index is already set to some meaningful value.

Signed-off-by: Pavel Dovgalyuk 

Message-Id: <20171114081812.27640.26372.stgit@pasha-VirtualBox>
Signed-off-by: Paolo Bonzini 
---
 accel/tcg/cpu-exec.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
index 61297f8f4a..0473055a08 100644
--- a/accel/tcg/cpu-exec.c
+++ b/accel/tcg/cpu-exec.c
@@ -594,7 +594,9 @@ static inline bool cpu_handle_interrupt(CPUState *cpu,
 if (unlikely(atomic_read(>exit_request)
 || (use_icount && cpu->icount_decr.u16.low + cpu->icount_extra == 0))) 
{
 atomic_set(>exit_request, 0);
-cpu->exception_index = EXCP_INTERRUPT;
+if (cpu->exception_index == -1) {
+cpu->exception_index = EXCP_INTERRUPT;
+}
 return true;
 }
 
-- 
2.14.3