date:20110122

[Qemu-devel] Re: MIPS, io-thread, icount and wfi

2011-01-22 Thread Edgar E. Iglesias

On Wed, Jan 19, 2011 at 08:02:28PM +0100, Edgar E. Iglesias wrote:
> On Wed, Jan 19, 2011 at 03:02:26PM -0200, Marcelo Tosatti wrote:
> > On Tue, Jan 18, 2011 at 11:00:57AM +0100, Jan Kiszka wrote:
> > > On 2011-01-18 01:19, Edgar E. Iglesias wrote:
> > > > On Mon, Jan 17, 2011 at 11:03:08AM +0100, Edgar E. Iglesias wrote:
> > > >> Hi,
> > > >>
> > > >> I'm running an io-thread enabled qemu-system-mipsel with icount.
> > > >> When the guest (linux) goes to sleep through the wait insn (waiting
> > > >> to be woken up by future timer interrupts), the thing deadlocks.
> > > >>
> > > >> IIUC, this is because vm timers are driven by icount, but the CPU is
> > > >> halted so icount makes no progress and time stands still.
> > > >>
> > > >> I've locally disabled vcpu halting when icount is enabled, that
> > > >> works around my problem but of course makes qemu consume 100% host cpu.
> > > >>
> > > >> I don't know why I only see this problem with io-thread builds?
> > > >> Could be related timing and luck.
> > > >>
> > > >> Would be interesting to know if someone has any info on how this was
> > > >> intended to work (if it was)? And if there are ideas for better
> > > >> workarounds or fixes that don't disable vcpu halting entirely.
> > > > 
> > > > Hi,
> > > > 
> > > > I've found the problem. For some reason io-thread builds use a
> > > > static timeout for wait loops. The entire chunk of code that
> > > > makes sure qemu_icount makes forward progress when the CPU's
> > > > are idle has been ifdef'ed away...
> > > > 
> > > > This fixes the problem for me, hopefully without affecting
> > > > io-thread runs without icount.
> > > > 
> > > > commit 0f4f3a919952500b487b438c5520f07a1c6be35b
> > > > Author: Edgar E. Iglesias 
> > > > Date:   Tue Jan 18 01:01:57 2011 +0100
> > > > 
> > > > qemu-timer: Fix timeout calc for io-thread with icount
> > > > 
> > > > Make sure we always make forward progress with qemu_icount to
> > > > avoid deadlocks. For io-thread, use the static 1000 timeout
> > > > only if icount is disabled.
> > > > 
> > > > Signed-off-by: Edgar E. Iglesias 
> > > > 
> > > > diff --git a/qemu-timer.c b/qemu-timer.c
> > > > index 95814af..db1ec49 100644
> > > > --- a/qemu-timer.c
> > > > +++ b/qemu-timer.c
> > > > @@ -110,7 +110,6 @@ static int64_t cpu_get_clock(void)
> > > >  }
> > > >  }
> > > >  
> > > > -#ifndef CONFIG_IOTHREAD
> > > >  static int64_t qemu_icount_delta(void)
> > > >  {
> > > >  if (!use_icount) {
> > > > @@ -124,7 +123,6 @@ static int64_t qemu_icount_delta(void)
> > > >  return cpu_get_icount() - cpu_get_clock();
> > > >  }
> > > >  }
> > > > -#endif
> > > >  
> > > >  /* enable cpu_get_ticks() */
> > > >  void cpu_enable_ticks(void)
> > > > @@ -1077,9 +1075,17 @@ void quit_timers(void)
> > > >  
> > > >  int qemu_calculate_timeout(void)
> > > >  {
> > > > -#ifndef CONFIG_IOTHREAD
> > > >  int timeout;
> > > >  
> > > > +#ifdef CONFIG_IOTHREAD
> > > > +/* When using icount, making forward progress with qemu_icount 
> > > > when the
> > > > +   guest CPU is idle is critical. We only use the static io-thread 
> > > > timeout
> > > > +   for non icount runs.  */
> > > > +if (!use_icount) {
> > > > +return 1000;
> > > > +}
> > > > +#endif
> > > > +
> > > >  if (!vm_running)
> > > >  timeout = 5000;
> > > >  else {
> > > > @@ -1110,8 +1116,5 @@ int qemu_calculate_timeout(void)
> > > >  }
> > > >  
> > > >  return timeout;
> > > > -#else /* CONFIG_IOTHREAD */
> > > > -return 1000;
> > > > -#endif
> > > >  }
> > > >  
> > > > 
> > > > 
> > > 
> > > This logic and timeout values were imported on iothread merge. And I bet
> > > at least the timeout value of 1s (vs. 5s) can still be found in
> > > qemu-kvm. Maybe someone over there can remember the rationales behind
> > > choosing this value.
> > > 
> > > Jan
> > 
> > This timeout is for the main select() call. So there is not a lot
> > of reasoning, how long to wait when there's no activity on the file
> > descriptors.
> 
> OK, I suspected something like that. Thanks both of you for the info.
> I'll give people a couple of days to complain at the patch, if noone
> does I'll apply it.

Silence - so I've applied this one, thanks.

Cheers

Re: [Qemu-devel] Re: [PATCH 8/8] ahci: fix !msi interrupts

2011-01-22 Thread Edgar E. Iglesias

On Sat, Jan 22, 2011 at 03:40:34PM +0100, Aurelien Jarno wrote:
> On Sat, Jan 22, 2011 at 03:32:36PM +0100, Alexander Graf wrote:
> > 
> > On 22.01.2011, at 15:14, Edgar E. Iglesias wrote:
> > 
> > > On Sat, Jan 22, 2011 at 02:13:03PM +0100, Aurelien Jarno wrote:
> > >> On Tue, Jan 18, 2011 at 01:58:25PM +0100, Jan Kiszka wrote:
> > >>> On 2011-01-18 13:05, Alexander Graf wrote:
> >  
> >  On 18.01.2011, at 10:08, Gerd Hoffmann wrote:
> >  
> > > Hi,
> > > 
> > >>> Worse might also be that unknown issue that force you to inject an 
> > >>> IRQ
> > >>> here. We don't know. That's probably worst.
> > >> 
> > >> Well, IIRC the issue was that usually a level high interrupt line 
> > >> would
> > >> simply retrigger an interrupt after enabling the interrupt line in 
> > >> the
> > >> APIC again.
> > > 
> > > edge triggered interrupts wouldn't though.
> >  
> >  The code change doesn't change anything for edge triggered interrupts 
> >  - they work fine. Only !msi (== level) ones are broken.
> >  
> > > 
> > >> Unless my memory completely fails on me, that didn't happen,
> > >> so I added the manual retrigger on a partial command ACK in ahci 
> > >> code.
> > > 
> > > That re-trigger smells like you are not clearing the interrupt line 
> > > where you should.  For starters try calling ahci_check_irq() after 
> > > guest writes to PORT_IRQ_STAT.
> >  
> >  The problem happened when I had the following:
> >  
> >  ahci irq bits = 0
> >  
> >  ahci irq bits = 1 | 2
> >  irq line trigger
> >  guest clears 1
> >  ahci bits = 2
> >  
> >  
> >  On normal hardware, the high state of the irq line would simply 
> >  trigger another interrupt in the guest when it gets ACKed on the 
> >  LAPIC. Somehow it doesn't get triggered here. I don't see why clearing 
> >  the interrupt line would help? It's not what real hardware would do, 
> >  no?
> >  
> > >>> 
> > >>> No, real devices would simply leave the line asserted as long as there
> > >>> is a reason to.
> > >>> 
> > >>> So again my question: With which irqchips do you see this effect, kvm's
> > >>> in-kernel model and/or qemu's user space model? If there is an issue
> > >>> with retriggering a CPU interrupt while the source is still asserted,
> > >>> that probably needs to be fixed.
> > >>> 
> > >> 
> > >> I guess it might be related to a problem identified long time ago on
> > >> some targets, and that leads to the following #ifdef in i8259.c:
> > >> 
> > >> | /* all targets should do this rather than acking the IRQ in the cpu */
> > >> | #if defined(TARGET_MIPS) || defined(TARGET_PPC) || 
> > >> defined(TARGET_ALPHA)
> > >> 
> > >> For your information it has been fixed on MIPS in commit 
> > >> 4de9b249d37c1b382cc3e5a21fad1b4a11cec2fa.
> > >> 
> > >> Basically when an interrupt line is high, the interrupt is getting
> > >> delivered to the CPU. This part works correctly on x86. The CPU will
> > >> take a corresponding action, basically either disabling the interrupt
> > >> at the CPU or controller level or doing something on the device so that
> > >> it lower its IRQ line. This new IRQ line level should then propagate
> > >> through the various controllers, which should also lower their IRQ line
> > >> if no other interrupt line are active. This ACK process should then
> > >> continue up to the CPU.
> > > 
> > > I totally agree.
> > > 
> > > 
> > >> For x86 the interrupt state is cleared as soon as the interrupt is
> > >> signaled to the CPU (see cpu-exec.c line 407), therefore if an interrupt
> > >> is still pending, it won't be seen by the CPU. It's probably what you
> > >> observed with AHCI. 
> > > 
> > > Yes, essentially, the CPU_INTERRUPT_HARD signal is an input to the CPU.
> > > The CPU cannot drive it directly. To lower it, it must take some kind
> > > of indirect action (IO or whatever) to clear the condition that is
> > > forcing it high. Any assignments to clear or set the CPU_INTERRUPT_HARD
> > > signal from within the CPU core are likely wrong..
> > > 
> > > FWIW, PPC code in cpu-exec.c:443 looks suspicious aswell...
> > 
> > How's that suspicious? As long as pending_interrupts is only reset on 
> > actual interrupt delivery, everything's fine. Interrupts like the 
> > decrementor are not level based on some PPCs, so the actual semantics are 
> > implementation dependent.
> > 
> 
> It's suspicious because it's not the right place to do that. It should
> be done in the interrupt controller. This seems to be already done in
> hw/ppc.c when updating pending_interrupts, so this code seems to be
> useless.

Exactly.

The CPU model is trying to drive an input signal. We should find a way to
somehow forbid that in C.

Cheers

[Qemu-devel] Re: microblaze: unused variable

2011-01-22 Thread Edgar E. Iglesias

On Sat, Jan 22, 2011 at 02:43:36PM +, Blue Swirl wrote:
> helper_addkc() in op_helper.c looks buggy, GCC 4.6.0 complains:
>   CCmicroblaze-softmmu/op_helper.o
> /src/qemu/target-microblaze/op_helper.c: In function 'helper_addkc':
> /src/qemu/target-microblaze/op_helper.c:133:14: error: variable 'd'
> set but not used [-Werror=unused-but-set-variable]

Hi,

I've commited something that hopefully fixes the problem now.

Sorry for the previously crazy drunken andriod keyboarded msg :)

Cheers

Re: [Qemu-devel] [sparc] Floating point exception issue

2011-01-22 Thread Artyom Tarasenko

On Sat, Jan 22, 2011 at 5:45 PM, Robert Reif  wrote:
> Blue Swirl wrote:
>>
>> On Sat, Jan 22, 2011 at 3:30 PM, Mateusz Loskot
>>  wrote:
>>
>>>
>>> On 18/01/11 21:51, Blue Swirl wrote:
>>>

 On Tue, Jan 18, 2011 at 6:00 PM, Mateusz Loskot
  wrote:

>
> On 18/01/11 17:36, Blue Swirl wrote:
>
>>
>> On Tue, Jan 18, 2011 at 3:27 PM, Mateusz Loskot
>>  wrote:
>>
>>>
>>> Hi,
>>>
>>> Recently, I have reported mysterious issues on NetBSD 5.1
>>> emulated on SPARC. The whole first thread is here:
>>>
>>> http://lists.gnu.org/archive/html/qemu-devel/2011-01/msg01509.html
>>>
>>> I decided to investigate the problem deeper and with great help
>>> from NetBSD folks, I managed to find reproducible test case.
>>> Initially, it was AWK command:
>>>
>>> # echo NaN | awk '{print "test"}'
>>> awk: floating point exception 8
>>>  source line number 1
>>>
>>> and next it boiled down to simple C program (see below).
>>> Details of the investigation are archived in the NetBSD Problem
>>> Report #44389 here:
>>>
>>> http://gnats.netbsd.org/cgi-bin/query-pr-single.pl?number=44389
>>>
>>>
>>> Here is final version of the test program which reproduces the
>>> problem:
>>>
>>> #include
>>> #include
>>> #include
>>> #include
>>>
>>> int is_number(const char *s)
>>> {
>>>        double r;
>>>        char *ep;
>>>        errno = 0;
>>>        r = strtod(s,&ep);
>>>        if (r == HUGE_VAL)
>>>                printf("X:%g\n", r);
>>>
>>>        if (ep == s || r == HUGE_VAL || errno == ERANGE)
>>>                return 0;
>>>        while (*ep == ' ' || *ep == '\t' || *ep == '\n')
>>>                ep++;
>>>        if (*ep == '\0')
>>>                return 1;
>>>        else
>>>                return 0;
>>> }
>>>
>>> int main(int argc, char **argv)
>>> {
>>>        double v;
>>>
>>>        if (is_number("NaN")) {
>>>                printf("is a number\n");
>>>                v = atof("NaN");
>>>        } else {
>>>                printf("not a number\n");
>>>                v = 0.0;
>>>        }
>>>        printf("%.4f\n", v);
>>>
>>>        return 0;
>>> }
>>>
>>>
>>> On NetBSD/SPARC, the program receives SIGFPE:
>>>
>>> $ gcc ./nan_test_2.c
>>>  $ ./a.out
>>>  [1]   Floating point exception (core dumped) ./a.out
>>>
>>> Specifically, it's caused by r == HUGE_VAL condition in
>>>  if (ep == s || r == HUGE_VAL || errno == ERANGE)
>>> where r is NaN.
>>>
>>> All the signs indicate there is a bug in QEMU.
>>>
>>
>> I'll install 5.1, but on 4.0 which I had installed, the program works
>> fine:
>> $ ./sigfpe
>> is a number
>> nan
>>
>
> I just tested on NetBSD 5.0/SPARC under QEMU 0.13 (same version I use
> with
> NetBSD 5.1/SPARC) and it works well indeed:
>
> mloskot@qemu-netbsd-50-sparc:~/tmp# ./a.out
> is a number
> nan
> mloskot@qemu-netbsd-50-sparc:~/tmp#
>
> Hmm, this is becoming interesting.
>
> I run QEMU 0.13 on Windows Vista (64-bit).
> Perhaps host system and QEMU binaries are relevant here.
> I will try on Linux host system later tonight.
>
> BTW, here are my images:
>
> http://mateusz.loskot.net/tmp/qemu/
>

 The problem was with NaN handling in fcmped instruction. I've
 committed a patch that fixes the problem, please test. Thanks for
 reporting and the test case.

>>>
>>> FYI, this problem seems to be occurring in qemu on Windows only.
>>> I tested git version dated before you applied the fix
>>> and built on Linux and the problem does not happening there.
>>>
>>
>> No, it was generic problem. I could reproduce it with your test case
>> and with a small assembler program which only used fcmped using user
>> emulator.
>>
>>
>>
>
> There is still a floating point exception problem that may or may not be
> related.

AFAICS they are not related. Strictly speaking they are not FP
problems but deferred trap processing problems.

> Booting an ss5-170 with a real sun ROM image still fails FP tests:
>
> 4.1.1  fpu    reg          regfile                  Pass
> 4.1.2  fpu    reg          misalign                 Pass
> 4.1.3  fpu    reg          single precision         Pass
> 4.1.4  fpu    reg          double precision         Pass
> 4.2.1  fpu    exceptions   single precision         Failed
> Exception Didn't Block Store : 0 : exp= , obs= 
> 4.2.2  fpu    exceptions   double precision         Failed
> Exception Didn't Block Store : 0 : exp= , obs= 
>
>
>



-- 
Regards,
Artyom Tarasenko

solaris/sparc under qemu blog: http://tyom.blogspot.com/

[Qemu-devel] Re: microblaze: unused variable

2011-01-22 Thread Edgar E. Iglesias

Thanks ble,
The addk. Helper is now there only there to compute carry. I
Feel free to remove the unused statement and unusrd vars. Otherwise ill do
when im back again. Cheers

On Jan 22, 2011 3:43 PM, "Blue Swirl"  wrote:
>
> helper_addkc() in op_helper.c looks buggy, GCC 4.6.0 complains:
>  CCmicroblaze-softmmu/op_helper.o
> /src/qemu/target-microblaze/op_helper.c: In function 'helper_addkc':
> /src/qemu/target-microblaze/op_helper.c:133:14: error: variable 'd'
> set but not used [-Werror=unused-but-set-variable]

Re: [Qemu-devel] Changing the content of target cpu registers

2011-01-22 Thread Blue Swirl

2011/1/21 Raphaël Lefèvre :
> On Wed, Jan 19, 2011 at 2:13 AM, Stefano Bonifazi
>  wrote:
>> On 01/18/2011 06:17 PM, Blue Swirl wrote:
>>>
>>> On Tue, Jan 18, 2011 at 9:29 AM, Stefano Bonifazi
>>>   wrote:

 Hi all!
  I am working on qemu-user (qemu-ppc).
 I'd like to edit the values of target registers during the execution. Can
 I
 do that by simply changing the content of env->gpr[] or do these only
 contain a copy of the values of the registers?
 In this last case, where are the real values of the target registers
 stored
 so that by modifying them I can alter the behavior of the target code
 execution?
>>>
>>> env->gpr is the canonical location, but the translator assigns TCG
>>> variables to them (cpu_gpr[] in translate.c), so GPR contents may be
>>> cached to these. But when helpers are called or the TB finishes,
>>> env->gpr should be valid again.
>>
>> Hi!
>>  Thank you for your answer!
>> So if I understand well if I set env->gpr in a code section where there is
>> no TCG translation on progress, I can edit directly the target CPU register
>> right?
>> Best Regards!
>> Stefano B.
>>
>>
>
> In fact, I need to apologize for my poor comprehension to your
> questions even after digesting the explinations from Blue Swirl. By
> tracing code of qemu, "env->gpr" should be able to be modified any
> place directly(or indirectly) whether the TCG involved or not.

Not exactly: you can't mix TCGv cpu_gpr[x] variable use with
env->gpr[x] access using tcg_gen_{ld,st}_tl(foo, cpu_env,
offsetof(CPUState, gpr[x])). The loads/stores would use env->gpr[x]
directly, bypassing possibly cached values in registers assigned by
TCG.

This may be visible by tracing the generated code, not code in
cpu-exec.c or op_helper.c.

Re: [Qemu-devel] [sparc] Floating point exception issue

2011-01-22 Thread Robert Reif


Blue Swirl wrote:

On Sat, Jan 22, 2011 at 3:30 PM, Mateusz Loskot  wrote:
   

On 18/01/11 21:51, Blue Swirl wrote:
 

On Tue, Jan 18, 2011 at 6:00 PM, Mateusz Loskot  wrote:
   

On 18/01/11 17:36, Blue Swirl wrote:
 

On Tue, Jan 18, 2011 at 3:27 PM, Mateusz Loskot
  wrote:
   

Hi,

Recently, I have reported mysterious issues on NetBSD 5.1
emulated on SPARC. The whole first thread is here:

http://lists.gnu.org/archive/html/qemu-devel/2011-01/msg01509.html

I decided to investigate the problem deeper and with great help
from NetBSD folks, I managed to find reproducible test case.
Initially, it was AWK command:

# echo NaN | awk '{print "test"}'
awk: floating point exception 8
  source line number 1

and next it boiled down to simple C program (see below).
Details of the investigation are archived in the NetBSD Problem
Report #44389 here:

http://gnats.netbsd.org/cgi-bin/query-pr-single.pl?number=44389


Here is final version of the test program which reproduces the problem:

#include
#include
#include
#include

int is_number(const char *s)
{
double r;
char *ep;
errno = 0;
r = strtod(s,&ep);
if (r == HUGE_VAL)
printf("X:%g\n", r);

if (ep == s || r == HUGE_VAL || errno == ERANGE)
return 0;
while (*ep == ' ' || *ep == '\t' || *ep == '\n')
ep++;
if (*ep == '\0')
return 1;
else
return 0;
}

int main(int argc, char **argv)
{
double v;

if (is_number("NaN")) {
printf("is a number\n");
v = atof("NaN");
} else {
printf("not a number\n");
v = 0.0;
}
printf("%.4f\n", v);

return 0;
}


On NetBSD/SPARC, the program receives SIGFPE:

$ gcc ./nan_test_2.c
  $ ./a.out
  [1]   Floating point exception (core dumped) ./a.out

Specifically, it's caused by r == HUGE_VAL condition in
  if (ep == s || r == HUGE_VAL || errno == ERANGE)
where r is NaN.

All the signs indicate there is a bug in QEMU.
 

I'll install 5.1, but on 4.0 which I had installed, the program works
fine:
$ ./sigfpe
is a number
nan
   

I just tested on NetBSD 5.0/SPARC under QEMU 0.13 (same version I use with
NetBSD 5.1/SPARC) and it works well indeed:

mloskot@qemu-netbsd-50-sparc:~/tmp# ./a.out
is a number
nan
mloskot@qemu-netbsd-50-sparc:~/tmp#

Hmm, this is becoming interesting.

I run QEMU 0.13 on Windows Vista (64-bit).
Perhaps host system and QEMU binaries are relevant here.
I will try on Linux host system later tonight.

BTW, here are my images:

http://mateusz.loskot.net/tmp/qemu/
 

The problem was with NaN handling in fcmped instruction. I've
committed a patch that fixes the problem, please test. Thanks for
reporting and the test case.
   

FYI, this problem seems to be occurring in qemu on Windows only.
I tested git version dated before you applied the fix
and built on Linux and the problem does not happening there.
 

No, it was generic problem. I could reproduce it with your test case
and with a small assembler program which only used fcmped using user
emulator.


   
There is still a floating point exception problem that may or may not be 
related.


Booting an ss5-170 with a real sun ROM image still fails FP tests:

4.1.1  fpureg  regfile  Pass
4.1.2  fpureg  misalign Pass
4.1.3  fpureg  single precision Pass
4.1.4  fpureg  double precision Pass
4.2.1  fpuexceptions   single precision Failed
Exception Didn't Block Store : 0 : exp= , obs= 
4.2.2  fpuexceptions   double precision Failed
Exception Didn't Block Store : 0 : exp= , obs=

Re: [Qemu-devel] [sparc] Floating point exception issue

2011-01-22 Thread Blue Swirl

On Sat, Jan 22, 2011 at 3:30 PM, Mateusz Loskot  wrote:
> On 18/01/11 21:51, Blue Swirl wrote:
>> On Tue, Jan 18, 2011 at 6:00 PM, Mateusz Loskot  wrote:
>>> On 18/01/11 17:36, Blue Swirl wrote:

 On Tue, Jan 18, 2011 at 3:27 PM, Mateusz Loskot
  wrote:
>
> Hi,
>
> Recently, I have reported mysterious issues on NetBSD 5.1
> emulated on SPARC. The whole first thread is here:
>
> http://lists.gnu.org/archive/html/qemu-devel/2011-01/msg01509.html
>
> I decided to investigate the problem deeper and with great help
> from NetBSD folks, I managed to find reproducible test case.
> Initially, it was AWK command:
>
> # echo NaN | awk '{print "test"}'
> awk: floating point exception 8
>  source line number 1
>
> and next it boiled down to simple C program (see below).
> Details of the investigation are archived in the NetBSD Problem
> Report #44389 here:
>
> http://gnats.netbsd.org/cgi-bin/query-pr-single.pl?number=44389
>
>
> Here is final version of the test program which reproduces the problem:
>
> #include
> #include
> #include
> #include
>
> int is_number(const char *s)
> {
>        double r;
>        char *ep;
>        errno = 0;
>        r = strtod(s,&ep);
>        if (r == HUGE_VAL)
>                printf("X:%g\n", r);
>
>        if (ep == s || r == HUGE_VAL || errno == ERANGE)
>                return 0;
>        while (*ep == ' ' || *ep == '\t' || *ep == '\n')
>                ep++;
>        if (*ep == '\0')
>                return 1;
>        else
>                return 0;
> }
>
> int main(int argc, char **argv)
> {
>        double v;
>
>        if (is_number("NaN")) {
>                printf("is a number\n");
>                v = atof("NaN");
>        } else {
>                printf("not a number\n");
>                v = 0.0;
>        }
>        printf("%.4f\n", v);
>
>        return 0;
> }
>
>
> On NetBSD/SPARC, the program receives SIGFPE:
>
> $ gcc ./nan_test_2.c
>  $ ./a.out
>  [1]   Floating point exception (core dumped) ./a.out
>
> Specifically, it's caused by r == HUGE_VAL condition in
>  if (ep == s || r == HUGE_VAL || errno == ERANGE)
> where r is NaN.
>
> All the signs indicate there is a bug in QEMU.

 I'll install 5.1, but on 4.0 which I had installed, the program works
 fine:
 $ ./sigfpe
 is a number
 nan
>>>
>>> I just tested on NetBSD 5.0/SPARC under QEMU 0.13 (same version I use with
>>> NetBSD 5.1/SPARC) and it works well indeed:
>>>
>>> mloskot@qemu-netbsd-50-sparc:~/tmp# ./a.out
>>> is a number
>>> nan
>>> mloskot@qemu-netbsd-50-sparc:~/tmp#
>>>
>>> Hmm, this is becoming interesting.
>>>
>>> I run QEMU 0.13 on Windows Vista (64-bit).
>>> Perhaps host system and QEMU binaries are relevant here.
>>> I will try on Linux host system later tonight.
>>>
>>> BTW, here are my images:
>>>
>>> http://mateusz.loskot.net/tmp/qemu/
>>
>> The problem was with NaN handling in fcmped instruction. I've
>> committed a patch that fixes the problem, please test. Thanks for
>> reporting and the test case.
>
> FYI, this problem seems to be occurring in qemu on Windows only.
> I tested git version dated before you applied the fix
> and built on Linux and the problem does not happening there.

No, it was generic problem. I could reproduce it with your test case
and with a small assembler program which only used fcmped using user
emulator.

Re: [Qemu-devel] [sparc] Floating point exception issue

2011-01-22 Thread Mateusz Loskot

On 18/01/11 21:51, Blue Swirl wrote:
> On Tue, Jan 18, 2011 at 6:00 PM, Mateusz Loskot  wrote:
>> On 18/01/11 17:36, Blue Swirl wrote:
>>>
>>> On Tue, Jan 18, 2011 at 3:27 PM, Mateusz Loskot
>>>  wrote:

 Hi,

 Recently, I have reported mysterious issues on NetBSD 5.1
 emulated on SPARC. The whole first thread is here:

 http://lists.gnu.org/archive/html/qemu-devel/2011-01/msg01509.html

 I decided to investigate the problem deeper and with great help
 from NetBSD folks, I managed to find reproducible test case.
 Initially, it was AWK command:

 # echo NaN | awk '{print "test"}'
 awk: floating point exception 8
  source line number 1

 and next it boiled down to simple C program (see below).
 Details of the investigation are archived in the NetBSD Problem
 Report #44389 here:

 http://gnats.netbsd.org/cgi-bin/query-pr-single.pl?number=44389


 Here is final version of the test program which reproduces the problem:

 #include
 #include
 #include
 #include

 int is_number(const char *s)
 {
double r;
char *ep;
errno = 0;
r = strtod(s,&ep);
if (r == HUGE_VAL)
printf("X:%g\n", r);

if (ep == s || r == HUGE_VAL || errno == ERANGE)
return 0;
while (*ep == ' ' || *ep == '\t' || *ep == '\n')
ep++;
if (*ep == '\0')
return 1;
else
return 0;
 }

 int main(int argc, char **argv)
 {
double v;

if (is_number("NaN")) {
printf("is a number\n");
v = atof("NaN");
} else {
printf("not a number\n");
v = 0.0;
}
printf("%.4f\n", v);

return 0;
 }


 On NetBSD/SPARC, the program receives SIGFPE:

 $ gcc ./nan_test_2.c
  $ ./a.out
  [1]   Floating point exception (core dumped) ./a.out

 Specifically, it's caused by r == HUGE_VAL condition in
  if (ep == s || r == HUGE_VAL || errno == ERANGE)
 where r is NaN.

 All the signs indicate there is a bug in QEMU.
>>>
>>> I'll install 5.1, but on 4.0 which I had installed, the program works
>>> fine:
>>> $ ./sigfpe
>>> is a number
>>> nan
>>
>> I just tested on NetBSD 5.0/SPARC under QEMU 0.13 (same version I use with
>> NetBSD 5.1/SPARC) and it works well indeed:
>>
>> mloskot@qemu-netbsd-50-sparc:~/tmp# ./a.out
>> is a number
>> nan
>> mloskot@qemu-netbsd-50-sparc:~/tmp#
>>
>> Hmm, this is becoming interesting.
>>
>> I run QEMU 0.13 on Windows Vista (64-bit).
>> Perhaps host system and QEMU binaries are relevant here.
>> I will try on Linux host system later tonight.
>>
>> BTW, here are my images:
>>
>> http://mateusz.loskot.net/tmp/qemu/
> 
> The problem was with NaN handling in fcmped instruction. I've
> committed a patch that fixes the problem, please test. Thanks for
> reporting and the test case.

FYI, this problem seems to be occurring in qemu on Windows only.
I tested git version dated before you applied the fix
and built on Linux and the problem does not happening there.

Best regards,
-- 
Mateusz Loskot, http://mateusz.loskot.net
Charter Member of OSGeo, http://osgeo.org
Member of ACCU, http://accu.org

[Qemu-devel] microblaze: unused variable

2011-01-22 Thread Blue Swirl

helper_addkc() in op_helper.c looks buggy, GCC 4.6.0 complains:
  CCmicroblaze-softmmu/op_helper.o
/src/qemu/target-microblaze/op_helper.c: In function 'helper_addkc':
/src/qemu/target-microblaze/op_helper.c:133:14: error: variable 'd'
set but not used [-Werror=unused-but-set-variable]

Re: [Qemu-devel] Re: [PATCH 8/8] ahci: fix !msi interrupts

2011-01-22 Thread Aurelien Jarno

On Sat, Jan 22, 2011 at 03:32:36PM +0100, Alexander Graf wrote:
> 
> On 22.01.2011, at 15:14, Edgar E. Iglesias wrote:
> 
> > On Sat, Jan 22, 2011 at 02:13:03PM +0100, Aurelien Jarno wrote:
> >> On Tue, Jan 18, 2011 at 01:58:25PM +0100, Jan Kiszka wrote:
> >>> On 2011-01-18 13:05, Alexander Graf wrote:
>  
>  On 18.01.2011, at 10:08, Gerd Hoffmann wrote:
>  
> > Hi,
> > 
> >>> Worse might also be that unknown issue that force you to inject an IRQ
> >>> here. We don't know. That's probably worst.
> >> 
> >> Well, IIRC the issue was that usually a level high interrupt line would
> >> simply retrigger an interrupt after enabling the interrupt line in the
> >> APIC again.
> > 
> > edge triggered interrupts wouldn't though.
>  
>  The code change doesn't change anything for edge triggered interrupts - 
>  they work fine. Only !msi (== level) ones are broken.
>  
> > 
> >> Unless my memory completely fails on me, that didn't happen,
> >> so I added the manual retrigger on a partial command ACK in ahci code.
> > 
> > That re-trigger smells like you are not clearing the interrupt line 
> > where you should.  For starters try calling ahci_check_irq() after 
> > guest writes to PORT_IRQ_STAT.
>  
>  The problem happened when I had the following:
>  
>  ahci irq bits = 0
>  
>  ahci irq bits = 1 | 2
>  irq line trigger
>  guest clears 1
>  ahci bits = 2
>  
>  
>  On normal hardware, the high state of the irq line would simply trigger 
>  another interrupt in the guest when it gets ACKed on the LAPIC. Somehow 
>  it doesn't get triggered here. I don't see why clearing the interrupt 
>  line would help? It's not what real hardware would do, no?
>  
> >>> 
> >>> No, real devices would simply leave the line asserted as long as there
> >>> is a reason to.
> >>> 
> >>> So again my question: With which irqchips do you see this effect, kvm's
> >>> in-kernel model and/or qemu's user space model? If there is an issue
> >>> with retriggering a CPU interrupt while the source is still asserted,
> >>> that probably needs to be fixed.
> >>> 
> >> 
> >> I guess it might be related to a problem identified long time ago on
> >> some targets, and that leads to the following #ifdef in i8259.c:
> >> 
> >> | /* all targets should do this rather than acking the IRQ in the cpu */
> >> | #if defined(TARGET_MIPS) || defined(TARGET_PPC) || defined(TARGET_ALPHA)
> >> 
> >> For your information it has been fixed on MIPS in commit 
> >> 4de9b249d37c1b382cc3e5a21fad1b4a11cec2fa.
> >> 
> >> Basically when an interrupt line is high, the interrupt is getting
> >> delivered to the CPU. This part works correctly on x86. The CPU will
> >> take a corresponding action, basically either disabling the interrupt
> >> at the CPU or controller level or doing something on the device so that
> >> it lower its IRQ line. This new IRQ line level should then propagate
> >> through the various controllers, which should also lower their IRQ line
> >> if no other interrupt line are active. This ACK process should then
> >> continue up to the CPU.
> > 
> > I totally agree.
> > 
> > 
> >> For x86 the interrupt state is cleared as soon as the interrupt is
> >> signaled to the CPU (see cpu-exec.c line 407), therefore if an interrupt
> >> is still pending, it won't be seen by the CPU. It's probably what you
> >> observed with AHCI. 
> > 
> > Yes, essentially, the CPU_INTERRUPT_HARD signal is an input to the CPU.
> > The CPU cannot drive it directly. To lower it, it must take some kind
> > of indirect action (IO or whatever) to clear the condition that is
> > forcing it high. Any assignments to clear or set the CPU_INTERRUPT_HARD
> > signal from within the CPU core are likely wrong..
> > 
> > FWIW, PPC code in cpu-exec.c:443 looks suspicious aswell...
> 
> How's that suspicious? As long as pending_interrupts is only reset on actual 
> interrupt delivery, everything's fine. Interrupts like the decrementor are 
> not level based on some PPCs, so the actual semantics are implementation 
> dependent.
> 

It's suspicious because it's not the right place to do that. It should
be done in the interrupt controller. This seems to be already done in
hw/ppc.c when updating pending_interrupts, so this code seems to be
useless.

-- 
Aurelien Jarno  GPG: 1024D/F1BCDB73
aurel...@aurel32.net http://www.aurel32.net

RE: [Qemu-devel] [PATCH] target-arm: Fix loading of scalar valueforNeon multiply-by-scalar

2011-01-22 Thread Schildbach, Wolfgang

> From: qemu-devel-bounces+wschi=dolby@nongnu.org 
> [mailto:qemu-devel-bounces+wschi=dolby@nongnu.org] On 
> Behalf Of Schildbach, Wolfgang
> 
> > -Original Message-
> > From: qemu-devel-bounces+wschi=dolby@nongnu.org
> > [mailto:qemu-devel-bounces+wschi=dolby@nongnu.org] On Behalf Of 
> > Peter Maydell
> > 
> > Fix the register and part of register we get the scalar from in the 
> > various "multiply vector by scalar" ops (VMUL by scalar and 
> friends).
> 
> FWIW, on the two test cases that I have, this patch (together with
> Christophe's) does not improve behaviour (see 
> https://bugs.launchpad.net/bugs/702885).

Duh. I had missed the greater part of Christophe's patch (I am still
having trouble with my mail client; applying patches off the list is
manual for me).

With both patches applied, indeed the bug filed on launchpad seems
fixed. On my second test case, behaviour is also much improved. Thanks
much!

- Wolfgang

Re: [Qemu-devel] Re: [PATCH 8/8] ahci: fix !msi interrupts

2011-01-22 Thread Aurelien Jarno

On Sat, Jan 22, 2011 at 03:28:06PM +0100, Alexander Graf wrote:
> 
> On 22.01.2011, at 14:13, Aurelien Jarno wrote:
> 
> > On Tue, Jan 18, 2011 at 01:58:25PM +0100, Jan Kiszka wrote:
> >> On 2011-01-18 13:05, Alexander Graf wrote:
> >>> 
> >>> On 18.01.2011, at 10:08, Gerd Hoffmann wrote:
> >>> 
>  Hi,
>  
> >> Worse might also be that unknown issue that force you to inject an IRQ
> >> here. We don't know. That's probably worst.
> > 
> > Well, IIRC the issue was that usually a level high interrupt line would
> > simply retrigger an interrupt after enabling the interrupt line in the
> > APIC again.
>  
>  edge triggered interrupts wouldn't though.
> >>> 
> >>> The code change doesn't change anything for edge triggered interrupts - 
> >>> they work fine. Only !msi (== level) ones are broken.
> >>> 
>  
> > Unless my memory completely fails on me, that didn't happen,
> > so I added the manual retrigger on a partial command ACK in ahci code.
>  
>  That re-trigger smells like you are not clearing the interrupt line 
>  where you should.  For starters try calling ahci_check_irq() after guest 
>  writes to PORT_IRQ_STAT.
> >>> 
> >>> The problem happened when I had the following:
> >>> 
> >>> ahci irq bits = 0
> >>> 
> >>> ahci irq bits = 1 | 2
> >>> irq line trigger
> >>> guest clears 1
> >>> ahci bits = 2
> >>> 
> >>> 
> >>> On normal hardware, the high state of the irq line would simply trigger 
> >>> another interrupt in the guest when it gets ACKed on the LAPIC. Somehow 
> >>> it doesn't get triggered here. I don't see why clearing the interrupt 
> >>> line would help? It's not what real hardware would do, no?
> >>> 
> >> 
> >> No, real devices would simply leave the line asserted as long as there
> >> is a reason to.
> >> 
> >> So again my question: With which irqchips do you see this effect, kvm's
> >> in-kernel model and/or qemu's user space model? If there is an issue
> >> with retriggering a CPU interrupt while the source is still asserted,
> >> that probably needs to be fixed.
> >> 
> > 
> > I guess it might be related to a problem identified long time ago on
> > some targets, and that leads to the following #ifdef in i8259.c:
> > 
> > | /* all targets should do this rather than acking the IRQ in the cpu */
> > | #if defined(TARGET_MIPS) || defined(TARGET_PPC) || defined(TARGET_ALPHA)
> > 
> > For your information it has been fixed on MIPS in commit 
> > 4de9b249d37c1b382cc3e5a21fad1b4a11cec2fa.
> > 
> > Basically when an interrupt line is high, the interrupt is getting
> > delivered to the CPU. This part works correctly on x86. The CPU will
> > take a corresponding action, basically either disabling the interrupt
> > at the CPU or controller level or doing something on the device so that
> > it lower its IRQ line. This new IRQ line level should then propagate
> > through the various controllers, which should also lower their IRQ line
> > if no other interrupt line are active. This ACK process should then
> > continue up to the CPU.
> > 
> > For x86 the interrupt state is cleared as soon as the interrupt is
> > signaled to the CPU (see cpu-exec.c line 407), therefore if an interrupt
> > is still pending, it won't be seen by the CPU. It's probably what you
> > observed with AHCI. 
> 
> I'm not sure what the best way to solve it would be, but maybe a callback on 
> iret would work? Then the interrupt can be checked on again.

This looks really like a hack, and the one you put in AHCI is probably
better than this one

> Alternatively, env->interrupt_request really should get cleared by the LAPIC 
> when the interrupt line is pulled down there.

I think this is the way to go, as it matches what happens on real
hardware. The changes might be a bit more invasive though.

-- 
Aurelien Jarno  GPG: 1024D/F1BCDB73
aurel...@aurel32.net http://www.aurel32.net

Re: [Qemu-devel] Re: [PATCH 8/8] ahci: fix !msi interrupts

2011-01-22 Thread Alexander Graf


On 22.01.2011, at 15:14, Edgar E. Iglesias wrote:

> On Sat, Jan 22, 2011 at 02:13:03PM +0100, Aurelien Jarno wrote:
>> On Tue, Jan 18, 2011 at 01:58:25PM +0100, Jan Kiszka wrote:
>>> On 2011-01-18 13:05, Alexander Graf wrote:
 
 On 18.01.2011, at 10:08, Gerd Hoffmann wrote:
 
> Hi,
> 
>>> Worse might also be that unknown issue that force you to inject an IRQ
>>> here. We don't know. That's probably worst.
>> 
>> Well, IIRC the issue was that usually a level high interrupt line would
>> simply retrigger an interrupt after enabling the interrupt line in the
>> APIC again.
> 
> edge triggered interrupts wouldn't though.
 
 The code change doesn't change anything for edge triggered interrupts - 
 they work fine. Only !msi (== level) ones are broken.
 
> 
>> Unless my memory completely fails on me, that didn't happen,
>> so I added the manual retrigger on a partial command ACK in ahci code.
> 
> That re-trigger smells like you are not clearing the interrupt line where 
> you should.  For starters try calling ahci_check_irq() after guest writes 
> to PORT_IRQ_STAT.
 
 The problem happened when I had the following:
 
 ahci irq bits = 0
 
 ahci irq bits = 1 | 2
 irq line trigger
 guest clears 1
 ahci bits = 2
 
 
 On normal hardware, the high state of the irq line would simply trigger 
 another interrupt in the guest when it gets ACKed on the LAPIC. Somehow it 
 doesn't get triggered here. I don't see why clearing the interrupt line 
 would help? It's not what real hardware would do, no?
 
>>> 
>>> No, real devices would simply leave the line asserted as long as there
>>> is a reason to.
>>> 
>>> So again my question: With which irqchips do you see this effect, kvm's
>>> in-kernel model and/or qemu's user space model? If there is an issue
>>> with retriggering a CPU interrupt while the source is still asserted,
>>> that probably needs to be fixed.
>>> 
>> 
>> I guess it might be related to a problem identified long time ago on
>> some targets, and that leads to the following #ifdef in i8259.c:
>> 
>> | /* all targets should do this rather than acking the IRQ in the cpu */
>> | #if defined(TARGET_MIPS) || defined(TARGET_PPC) || defined(TARGET_ALPHA)
>> 
>> For your information it has been fixed on MIPS in commit 
>> 4de9b249d37c1b382cc3e5a21fad1b4a11cec2fa.
>> 
>> Basically when an interrupt line is high, the interrupt is getting
>> delivered to the CPU. This part works correctly on x86. The CPU will
>> take a corresponding action, basically either disabling the interrupt
>> at the CPU or controller level or doing something on the device so that
>> it lower its IRQ line. This new IRQ line level should then propagate
>> through the various controllers, which should also lower their IRQ line
>> if no other interrupt line are active. This ACK process should then
>> continue up to the CPU.
> 
> I totally agree.
> 
> 
>> For x86 the interrupt state is cleared as soon as the interrupt is
>> signaled to the CPU (see cpu-exec.c line 407), therefore if an interrupt
>> is still pending, it won't be seen by the CPU. It's probably what you
>> observed with AHCI. 
> 
> Yes, essentially, the CPU_INTERRUPT_HARD signal is an input to the CPU.
> The CPU cannot drive it directly. To lower it, it must take some kind
> of indirect action (IO or whatever) to clear the condition that is
> forcing it high. Any assignments to clear or set the CPU_INTERRUPT_HARD
> signal from within the CPU core are likely wrong..
> 
> FWIW, PPC code in cpu-exec.c:443 looks suspicious aswell...

How's that suspicious? As long as pending_interrupts is only reset on actual 
interrupt delivery, everything's fine. Interrupts like the decrementor are not 
level based on some PPCs, so the actual semantics are implementation dependent.

Either way, I agree that getting interrupts right is very hard :).


Alex

Re: [Qemu-devel] Re: [PATCH 8/8] ahci: fix !msi interrupts

2011-01-22 Thread Alexander Graf


On 22.01.2011, at 14:13, Aurelien Jarno wrote:

> On Tue, Jan 18, 2011 at 01:58:25PM +0100, Jan Kiszka wrote:
>> On 2011-01-18 13:05, Alexander Graf wrote:
>>> 
>>> On 18.01.2011, at 10:08, Gerd Hoffmann wrote:
>>> 
 Hi,
 
>> Worse might also be that unknown issue that force you to inject an IRQ
>> here. We don't know. That's probably worst.
> 
> Well, IIRC the issue was that usually a level high interrupt line would
> simply retrigger an interrupt after enabling the interrupt line in the
> APIC again.
 
 edge triggered interrupts wouldn't though.
>>> 
>>> The code change doesn't change anything for edge triggered interrupts - 
>>> they work fine. Only !msi (== level) ones are broken.
>>> 
 
> Unless my memory completely fails on me, that didn't happen,
> so I added the manual retrigger on a partial command ACK in ahci code.
 
 That re-trigger smells like you are not clearing the interrupt line where 
 you should.  For starters try calling ahci_check_irq() after guest writes 
 to PORT_IRQ_STAT.
>>> 
>>> The problem happened when I had the following:
>>> 
>>> ahci irq bits = 0
>>> 
>>> ahci irq bits = 1 | 2
>>> irq line trigger
>>> guest clears 1
>>> ahci bits = 2
>>> 
>>> 
>>> On normal hardware, the high state of the irq line would simply trigger 
>>> another interrupt in the guest when it gets ACKed on the LAPIC. Somehow it 
>>> doesn't get triggered here. I don't see why clearing the interrupt line 
>>> would help? It's not what real hardware would do, no?
>>> 
>> 
>> No, real devices would simply leave the line asserted as long as there
>> is a reason to.
>> 
>> So again my question: With which irqchips do you see this effect, kvm's
>> in-kernel model and/or qemu's user space model? If there is an issue
>> with retriggering a CPU interrupt while the source is still asserted,
>> that probably needs to be fixed.
>> 
> 
> I guess it might be related to a problem identified long time ago on
> some targets, and that leads to the following #ifdef in i8259.c:
> 
> | /* all targets should do this rather than acking the IRQ in the cpu */
> | #if defined(TARGET_MIPS) || defined(TARGET_PPC) || defined(TARGET_ALPHA)
> 
> For your information it has been fixed on MIPS in commit 
> 4de9b249d37c1b382cc3e5a21fad1b4a11cec2fa.
> 
> Basically when an interrupt line is high, the interrupt is getting
> delivered to the CPU. This part works correctly on x86. The CPU will
> take a corresponding action, basically either disabling the interrupt
> at the CPU or controller level or doing something on the device so that
> it lower its IRQ line. This new IRQ line level should then propagate
> through the various controllers, which should also lower their IRQ line
> if no other interrupt line are active. This ACK process should then
> continue up to the CPU.
> 
> For x86 the interrupt state is cleared as soon as the interrupt is
> signaled to the CPU (see cpu-exec.c line 407), therefore if an interrupt
> is still pending, it won't be seen by the CPU. It's probably what you
> observed with AHCI. 

I'm not sure what the best way to solve it would be, but maybe a callback on 
iret would work? Then the interrupt can be checked on again.

Alternatively, env->interrupt_request really should get cleared by the LAPIC 
when the interrupt line is pulled down there.


Alex

Re: [Qemu-devel] [ARM] Contributing tests for Neon

2011-01-22 Thread Edgar E. Iglesias

On Fri, Jan 21, 2011 at 11:25:05PM +, Peter Maydell wrote:
> On 21 January 2011 10:07, Christophe Lyon  wrote:
> > I have developed some tests for ARM-Neon in the form of C sources files
> > calling ARM Neon intrinsics, and comparing the results of the resulting
> > program with a known reference (eg execution on actual CPU) shows
> > if the execution engine is follows the spec.
> >
> > These tests currently represent 750KB of sources (150 files, 12000 lines),
> 
> That's a larger and more comprehensive test suite than I was
> expecting :-)  (which is fantastic)
> 
> > I have a few questions on how to proceed:
> > - we wish to release the files under the MIT license, is it OK to deliver
> > them as-is in the 'tests' qemu subdir?
> > - given the size of the whole stuff, I don't think it's suitable that I
> > send it on the list for review, what should I do?
> 
> How about you make it available somewhere (tarball via http, git
> repository on gitorious, or other method of your choice) for the
> moment? Then we can take a look at it and proceed from there.

I agree, this is a good first step. If possible, its good if you
can publish source and binaries so people with the appropriate
tools can rebuild the tests. Also, when possible, binaries with
debug info can be helpful when analyzing test failures.

Cheers

Re: [Qemu-devel] Re: [PATCH 8/8] ahci: fix !msi interrupts

2011-01-22 Thread Edgar E. Iglesias

On Sat, Jan 22, 2011 at 02:13:03PM +0100, Aurelien Jarno wrote:
> On Tue, Jan 18, 2011 at 01:58:25PM +0100, Jan Kiszka wrote:
> > On 2011-01-18 13:05, Alexander Graf wrote:
> > > 
> > > On 18.01.2011, at 10:08, Gerd Hoffmann wrote:
> > > 
> > >>  Hi,
> > >>
> >  Worse might also be that unknown issue that force you to inject an IRQ
> >  here. We don't know. That's probably worst.
> > >>>
> > >>> Well, IIRC the issue was that usually a level high interrupt line would
> > >>> simply retrigger an interrupt after enabling the interrupt line in the
> > >>> APIC again.
> > >>
> > >> edge triggered interrupts wouldn't though.
> > > 
> > > The code change doesn't change anything for edge triggered interrupts - 
> > > they work fine. Only !msi (== level) ones are broken.
> > > 
> > >>
> > >>> Unless my memory completely fails on me, that didn't happen,
> > >>> so I added the manual retrigger on a partial command ACK in ahci code.
> > >>
> > >> That re-trigger smells like you are not clearing the interrupt line 
> > >> where you should.  For starters try calling ahci_check_irq() after guest 
> > >> writes to PORT_IRQ_STAT.
> > > 
> > > The problem happened when I had the following:
> > > 
> > > ahci irq bits = 0
> > > 
> > > ahci irq bits = 1 | 2
> > > irq line trigger
> > > guest clears 1
> > > ahci bits = 2
> > > 
> > > 
> > > On normal hardware, the high state of the irq line would simply trigger 
> > > another interrupt in the guest when it gets ACKed on the LAPIC. Somehow 
> > > it doesn't get triggered here. I don't see why clearing the interrupt 
> > > line would help? It's not what real hardware would do, no?
> > > 
> > 
> > No, real devices would simply leave the line asserted as long as there
> > is a reason to.
> > 
> > So again my question: With which irqchips do you see this effect, kvm's
> > in-kernel model and/or qemu's user space model? If there is an issue
> > with retriggering a CPU interrupt while the source is still asserted,
> > that probably needs to be fixed.
> > 
> 
> I guess it might be related to a problem identified long time ago on
> some targets, and that leads to the following #ifdef in i8259.c:
> 
> | /* all targets should do this rather than acking the IRQ in the cpu */
> | #if defined(TARGET_MIPS) || defined(TARGET_PPC) || defined(TARGET_ALPHA)
> 
> For your information it has been fixed on MIPS in commit 
> 4de9b249d37c1b382cc3e5a21fad1b4a11cec2fa.
> 
> Basically when an interrupt line is high, the interrupt is getting
> delivered to the CPU. This part works correctly on x86. The CPU will
> take a corresponding action, basically either disabling the interrupt
> at the CPU or controller level or doing something on the device so that
> it lower its IRQ line. This new IRQ line level should then propagate
> through the various controllers, which should also lower their IRQ line
> if no other interrupt line are active. This ACK process should then
> continue up to the CPU.

I totally agree.

 
> For x86 the interrupt state is cleared as soon as the interrupt is
> signaled to the CPU (see cpu-exec.c line 407), therefore if an interrupt
> is still pending, it won't be seen by the CPU. It's probably what you
> observed with AHCI. 

Yes, essentially, the CPU_INTERRUPT_HARD signal is an input to the CPU.
The CPU cannot drive it directly. To lower it, it must take some kind
of indirect action (IO or whatever) to clear the condition that is
forcing it high. Any assignments to clear or set the CPU_INTERRUPT_HARD
signal from within the CPU core are likely wrong..

FWIW, PPC code in cpu-exec.c:443 looks suspicious aswell...

Cheers

Re: [Qemu-devel] Re: [PATCH 8/8] ahci: fix !msi interrupts

2011-01-22 Thread Aurelien Jarno

On Tue, Jan 18, 2011 at 01:58:25PM +0100, Jan Kiszka wrote:
> On 2011-01-18 13:05, Alexander Graf wrote:
> > 
> > On 18.01.2011, at 10:08, Gerd Hoffmann wrote:
> > 
> >>  Hi,
> >>
>  Worse might also be that unknown issue that force you to inject an IRQ
>  here. We don't know. That's probably worst.
> >>>
> >>> Well, IIRC the issue was that usually a level high interrupt line would
> >>> simply retrigger an interrupt after enabling the interrupt line in the
> >>> APIC again.
> >>
> >> edge triggered interrupts wouldn't though.
> > 
> > The code change doesn't change anything for edge triggered interrupts - 
> > they work fine. Only !msi (== level) ones are broken.
> > 
> >>
> >>> Unless my memory completely fails on me, that didn't happen,
> >>> so I added the manual retrigger on a partial command ACK in ahci code.
> >>
> >> That re-trigger smells like you are not clearing the interrupt line where 
> >> you should.  For starters try calling ahci_check_irq() after guest writes 
> >> to PORT_IRQ_STAT.
> > 
> > The problem happened when I had the following:
> > 
> > ahci irq bits = 0
> > 
> > ahci irq bits = 1 | 2
> > irq line trigger
> > guest clears 1
> > ahci bits = 2
> > 
> > 
> > On normal hardware, the high state of the irq line would simply trigger 
> > another interrupt in the guest when it gets ACKed on the LAPIC. Somehow it 
> > doesn't get triggered here. I don't see why clearing the interrupt line 
> > would help? It's not what real hardware would do, no?
> > 
> 
> No, real devices would simply leave the line asserted as long as there
> is a reason to.
> 
> So again my question: With which irqchips do you see this effect, kvm's
> in-kernel model and/or qemu's user space model? If there is an issue
> with retriggering a CPU interrupt while the source is still asserted,
> that probably needs to be fixed.
> 

I guess it might be related to a problem identified long time ago on
some targets, and that leads to the following #ifdef in i8259.c:

| /* all targets should do this rather than acking the IRQ in the cpu */
| #if defined(TARGET_MIPS) || defined(TARGET_PPC) || defined(TARGET_ALPHA)

For your information it has been fixed on MIPS in commit 
4de9b249d37c1b382cc3e5a21fad1b4a11cec2fa.

Basically when an interrupt line is high, the interrupt is getting
delivered to the CPU. This part works correctly on x86. The CPU will
take a corresponding action, basically either disabling the interrupt
at the CPU or controller level or doing something on the device so that
it lower its IRQ line. This new IRQ line level should then propagate
through the various controllers, which should also lower their IRQ line
if no other interrupt line are active. This ACK process should then
continue up to the CPU.

For x86 the interrupt state is cleared as soon as the interrupt is
signaled to the CPU (see cpu-exec.c line 407), therefore if an interrupt
is still pending, it won't be seen by the CPU. It's probably what you
observed with AHCI. 

-- 
Aurelien Jarno  GPG: 1024D/F1BCDB73
aurel...@aurel32.net http://www.aurel32.net

[Qemu-devel] [PATCH] qemu-char: Check for missing backend name

2011-01-22 Thread Stefan Hajnoczi

Check if the backend option is missing before searching the backend
table.  This fixes a NULL pointer dereference when QEMU is invoked with
the following invalid command-line:

  $ qemu -chardev id=foo,path=/tmp/socket

Previously QEMU would segfault, now it produces this error message:

  chardev: "foo" missing backend

Signed-off-by: Stefan Hajnoczi 
---
 qemu-char.c |5 +
 1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/qemu-char.c b/qemu-char.c
index edc9ad6..8a424d6 100644
--- a/qemu-char.c
+++ b/qemu-char.c
@@ -2508,6 +2508,11 @@ CharDriverState *qemu_chr_open_opts(QemuOpts *opts,
 return NULL;
 }
 
+if (qemu_opt_get(opts, "backend") == NULL) {
+fprintf(stderr, "chardev: \"%s\" missing backend\n",
+qemu_opts_id(opts));
+return NULL;
+}
 for (i = 0; i < ARRAY_SIZE(backend_table); i++) {
 if (strcmp(backend_table[i].name, qemu_opt_get(opts, "backend")) == 0)
 break;
-- 
1.7.2.3

[Qemu-devel] [RFC][PATCH 02/12] continuation: Fix container_of() redefinition

2011-01-22 Thread Stefan Hajnoczi

QEMU already has a visible container_of() macro definition.  Avoid the
compiler error.

Signed-off-by: Stefan Hajnoczi 
---
 continuation.h |2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/continuation.h b/continuation.h
index 585788e..112df65 100644
--- a/continuation.h
+++ b/continuation.h
@@ -45,8 +45,10 @@ int cc_release(struct continuation *cc);
 int cc_swap(struct continuation *from, struct continuation *to);
 
 #define offset_of(type, member) ((unsigned long)(&((type *)0)->member))
+#ifndef container_of
 #define container_of(obj, type, member) \
 (type *)(((char *)obj) - offset_of(type, member))
+#endif
 
 #endif
 /*
-- 
1.7.2.3

[Qemu-devel] [RFC][PATCH 03/12] Make sure to release allocated stack when coroutine is released.

2011-01-22 Thread Stefan Hajnoczi

Copied from a fix to the original code by Anthony Liguori
.

Signed-off-by: Stefan Hajnoczi 
---
 coroutine_ucontext.c |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/coroutine_ucontext.c b/coroutine_ucontext.c
index 25e8687..f76da94 100644
--- a/coroutine_ucontext.c
+++ b/coroutine_ucontext.c
@@ -38,6 +38,7 @@ static int _coroutine_release(struct continuation *cc)
if (ret < 0)
return ret;
}
+munmap(co->cc.stack, co->cc.stack_size);
 
co->caller = NULL;
 
-- 
1.7.2.3

[Qemu-devel] [PATCH 2/3] ppc405: Fix memory leak

2011-01-22 Thread Stefan Weil

Signed-off-by: Stefan Weil 
---
 hw/ppc405_boards.c |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/hw/ppc405_boards.c b/hw/ppc405_boards.c
index b1d2b94..912757b 100644
--- a/hw/ppc405_boards.c
+++ b/hw/ppc405_boards.c
@@ -560,6 +560,7 @@ static void taihu_405ep_init(ram_addr_t ram_size,
 filename = qemu_find_file(QEMU_FILE_TYPE_BIOS, bios_name);
 if (filename) {
 bios_size = load_image(filename, qemu_get_ram_ptr(bios_offset));
+qemu_free(filename);
 } else {
 bios_size = -1;
 }
-- 
1.7.2.3

[Qemu-devel] [PATCH 1/3] pci: Fix memory leak

2011-01-22 Thread Stefan Weil

Signed-off-by: Stefan Weil 
---
 hw/pci.c |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/hw/pci.c b/hw/pci.c
index 63f476a..38a60ae 100644
--- a/hw/pci.c
+++ b/hw/pci.c
@@ -1877,6 +1877,7 @@ static int pci_add_option_rom(PCIDevice *pdev, bool 
is_default_rom)
 if (size < 0) {
 error_report("%s: failed to find romfile \"%s\"",
  __FUNCTION__, pdev->romfile);
+qemu_free(path);
 return -1;
 }
 if (size & (size - 1)) {
-- 
1.7.2.3

[Qemu-devel] [PATCH 3/3] s390: Fix memory leak

2011-01-22 Thread Stefan Weil

Signed-off-by: Stefan Weil 
---
 hw/s390-virtio.c |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/hw/s390-virtio.c b/hw/s390-virtio.c
index f29b624..850422f 100644
--- a/hw/s390-virtio.c
+++ b/hw/s390-virtio.c
@@ -204,6 +204,7 @@ static void s390_init(ram_addr_t ram_size,
 
 bios_filename = qemu_find_file(QEMU_FILE_TYPE_BIOS, bios_name);
 bios_size = load_image(bios_filename, 
qemu_get_ram_ptr(ZIPL_LOAD_ADDR));
+qemu_free(bios_filename);
 
 if ((long)bios_size < 0) {
 hw_error("could not load bootloader '%s'\n", bios_name);
-- 
1.7.2.3

[Qemu-devel] [PATCH KQEMU] Make kqemu compile with kernel 2.6.37

2011-01-22 Thread xtof pernod


Hello,

For those who still use kqemu (not all cpu's support kvm;), here is a
small patch that re-enable the init_MUTEX() macro, whose removal
was planed in 2009: see  https://lkml.org/lkml/2009/7/26/10

Signed-off-by: christophe pernod 
---
diff -up kqemu/kqemu-linux.c.ORG kqemu/kqemu-linux.c
--- kqemu/kqemu-linux.c.ORG 2009-05-31 00:34:10.0 +0200
+++ kqemu/kqemu-linux.c 2011-01-20 19:44:20.431688004 +0100
@@ -26,6 +27,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -60,6 +62,10 @@ int lock_count;
 int page_alloc_count;
 #endif

+#ifndef init_MUTEX
+#define init_MUTEX(sem) sema_init(sem, 1)
+#endif



--
XtoF.

[Qemu-devel] [RFC][PATCH 01/12] coroutine: Add gtk-vnc coroutines library

2011-01-22 Thread Stefan Hajnoczi

Asynchronous image format code is becoming very complex.  Let's try
using coroutines to write sequential code without callbacks but use
coroutines to switch stacks under the hood.

Signed-off-by: Stefan Hajnoczi 
---
 Makefile.objs|2 +-
 continuation.c   |   87 +
 continuation.h   |   58 ++
 coroutine.h  |   72 +++
 coroutine_ucontext.c |  131 ++
 5 files changed, 349 insertions(+), 1 deletions(-)
 create mode 100644 continuation.c
 create mode 100644 continuation.h
 create mode 100644 coroutine.h
 create mode 100644 coroutine_ucontext.c

diff --git a/Makefile.objs b/Makefile.objs
index c3e52c5..ae26396 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -15,7 +15,7 @@ oslib-obj-$(CONFIG_POSIX) += oslib-posix.o
 
 block-obj-y = cutils.o cache-utils.o qemu-malloc.o qemu-option.o module.o
 block-obj-y += nbd.o block.o aio.o aes.o qemu-config.o
-block-obj-$(CONFIG_POSIX) += posix-aio-compat.o
+block-obj-$(CONFIG_POSIX) += posix-aio-compat.o continuation.o 
coroutine_ucontext.o
 block-obj-$(CONFIG_LINUX_AIO) += linux-aio.o
 
 block-nested-y += raw.o cow.o qcow.o vdi.o vmdk.o cloop.o dmg.o bochs.o vpc.o 
vvfat.o
diff --git a/continuation.c b/continuation.c
new file mode 100644
index 000..f61aeb5
--- /dev/null
+++ b/continuation.c
@@ -0,0 +1,87 @@
+/*
+ * GTK VNC Widget
+ *
+ * Copyright (C) 2006  Anthony Liguori 
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.0 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301 
USA
+ */
+
+#include "continuation.h"
+
+/*
+ * va_args to makecontext() must be type 'int', so passing
+ * the pointer we need may require several int args. This
+ * union is a quick hack to let us do that
+ */
+union cc_arg {
+   void *p;
+   int i[2];
+};
+
+static void continuation_trampoline(int i0, int i1)
+{
+   union cc_arg arg;
+   struct continuation *cc;
+   arg.i[0] = i0;
+   arg.i[1] = i1;
+   cc = arg.p;
+
+   cc->entry(cc);
+}
+
+int cc_init(struct continuation *cc)
+{
+   volatile union cc_arg arg;
+   arg.p = cc;
+   if (getcontext(&cc->uc) == -1)
+   return -1;
+
+   cc->uc.uc_link = &cc->last;
+   cc->uc.uc_stack.ss_sp = cc->stack;
+   cc->uc.uc_stack.ss_size = cc->stack_size;
+   cc->uc.uc_stack.ss_flags = 0;
+
+   makecontext(&cc->uc, (void *)continuation_trampoline, 2, arg.i[0], 
arg.i[1]);
+
+   return 0;
+}
+
+int cc_release(struct continuation *cc)
+{
+   if (cc->release)
+   return cc->release(cc);
+
+   return 0;
+}
+
+int cc_swap(struct continuation *from, struct continuation *to)
+{
+   to->exited = 0;
+   if (getcontext(&to->last) == -1)
+   return -1;
+   else if (to->exited == 0)
+   to->exited = 1;
+   else if (to->exited == 1)
+   return 1;
+
+   return swapcontext(&from->uc, &to->uc);
+}
+/*
+ * Local variables:
+ *  c-indent-level: 8
+ *  c-basic-offset: 8
+ *  tab-width: 8
+ * End:
+ */
diff --git a/continuation.h b/continuation.h
new file mode 100644
index 000..585788e
--- /dev/null
+++ b/continuation.h
@@ -0,0 +1,58 @@
+/*
+ * GTK VNC Widget
+ *
+ * Copyright (C) 2006  Anthony Liguori 
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.0 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301 
USA
+ */
+
+#ifndef _CONTINUATION_H_
+#define _CONTINUATION_H_
+
+#include 
+
+struct continuation
+{
+   char *stack;
+   size_t stack_size;
+   void (*entry)(struct continuation *cc);
+   int (*release)(struct continuation *cc);
+
+   /* private */
+   ucontext_t uc;
+

[Qemu-devel] [RFC][PATCH 12/12] qcow2: Serialize all requests

2011-01-22 Thread Stefan Hajnoczi

QCOW2 with coroutines is not safe because synchronous code paths are no
longer guaranteed to execute without interference from pending requests.
A blocking call like bdrv_pread() causes the coroutine to yield and
another request can be processed during that time, causing to race
conditions or interference between pending requests.

The simple solution is to serialize all requests.  This is bad for
performance and a fine-grained solution needs to be implemented in
future patches.

Using this patch, QCOW2 with coroutines can reliably install a RHEL6 VM
with a virtio-blk disk.

Signed-off-by: Stefan Hajnoczi 
---
 block/qcow2.c |   21 -
 block/qcow2.h |5 +++--
 2 files changed, 23 insertions(+), 3 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index 4b33ef3..0cea0e8 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -231,6 +231,7 @@ static int qcow2_open(BlockDriverState *bs, int flags)
 }
 
 QLIST_INIT(&s->cluster_allocs);
+QTAILQ_INIT(&s->request_list);
 
 /* read qcow2 extensions */
 if (header.backing_file_offset) {
@@ -365,6 +366,7 @@ typedef struct QCowAIOCB {
 QEMUBH *bh;
 QCowL2Meta l2meta;
 QLIST_ENTRY(QCowAIOCB) next_depend;
+QTAILQ_ENTRY(QCowAIOCB) next_request;
 Coroutine *coroutine;
 int ret; /* return code for user callback */
 } QCowAIOCB;
@@ -385,11 +387,20 @@ static AIOPool qcow2_aio_pool = {
 static void qcow2_aio_bh(void *opaque)
 {
 QCowAIOCB *acb = opaque;
+BDRVQcowState *s = acb->common.bs->opaque;
+
 qemu_bh_delete(acb->bh);
 acb->bh = NULL;
 acb->common.cb(acb->common.opaque, acb->ret);
 qemu_iovec_destroy(&acb->hd_qiov);
+QTAILQ_REMOVE(&s->request_list, acb, next_request);
 qemu_aio_release(acb);
+
+/* Start next request */
+if (!QTAILQ_EMPTY(&s->request_list)) {
+acb = QTAILQ_FIRST(&s->request_list);
+qemu_coroutine_enter(acb->coroutine, acb);
+}
 }
 
 static int qcow2_schedule_bh(QEMUBHFunc *cb, QCowAIOCB *acb)
@@ -548,8 +559,10 @@ static BlockDriverAIOCB *qcow2_aio_setup(BlockDriverState 
*bs,
 int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
 BlockDriverCompletionFunc *cb, void *opaque, int is_write)
 {
+BDRVQcowState *s = bs->opaque;
 QCowAIOCB *acb;
 Coroutine *coroutine;
+int start_request;
 
 acb = qemu_aio_get(&qcow2_aio_pool, bs, cb, opaque);
 if (!acb)
@@ -569,7 +582,13 @@ static BlockDriverAIOCB *qcow2_aio_setup(BlockDriverState 
*bs,
 coroutine = qemu_coroutine_create(is_write ? qcow2_co_write
: qcow2_co_read);
 acb->coroutine = coroutine;
-qemu_coroutine_enter(coroutine, acb);
+
+/* Kick off the request if no others are currently executing */
+start_request = QTAILQ_EMPTY(&s->request_list);
+QTAILQ_INSERT_TAIL(&s->request_list, acb, next_request);
+if (start_request) {
+qemu_coroutine_enter(coroutine, acb);
+}
 return &acb->common;
 }
 
diff --git a/block/qcow2.h b/block/qcow2.h
index 5217bea..159f86b 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -78,6 +78,8 @@ typedef struct QCowSnapshot {
 uint64_t vm_clock_nsec;
 } QCowSnapshot;
 
+struct QCowAIOCB;
+
 typedef struct BDRVQcowState {
 int cluster_bits;
 int cluster_size;
@@ -98,6 +100,7 @@ typedef struct BDRVQcowState {
 uint8_t *cluster_data;
 uint64_t cluster_cache_offset;
 QLIST_HEAD(QCowClusterAlloc, QCowL2Meta) cluster_allocs;
+QTAILQ_HEAD(, QCowAIOCB) request_list;
 
 uint64_t *refcount_table;
 uint64_t refcount_table_offset;
@@ -128,8 +131,6 @@ typedef struct QCowCreateState {
 int64_t refcount_block_offset;
 } QCowCreateState;
 
-struct QCowAIOCB;
-
 /* XXX This could be private for qcow2-cluster.c */
 typedef struct QCowL2Meta
 {
-- 
1.7.2.3

[Qemu-devel] [RFC][PATCH 06/12] coroutine: Add qemu_coroutine_self()

2011-01-22 Thread Stefan Hajnoczi

Add a function to get the current coroutine.  There is always a current
coroutine, either the "leader" (default main coroutine) or a specific
coroutine created with qemu_coroutine_create().

Signed-off-by: Stefan Hajnoczi 
---
 qemu-coroutine.c |5 +
 qemu-coroutine.h |5 +
 2 files changed, 10 insertions(+), 0 deletions(-)

diff --git a/qemu-coroutine.c b/qemu-coroutine.c
index dd2cd8e..e55b7c6 100644
--- a/qemu-coroutine.c
+++ b/qemu-coroutine.c
@@ -38,3 +38,8 @@ void * coroutine_fn qemu_coroutine_yield(void *opaque)
 {
 return coroutine_yield(opaque);
 }
+
+Coroutine * coroutine_fn qemu_coroutine_self(void)
+{
+return (Coroutine*)coroutine_self();
+}
diff --git a/qemu-coroutine.h b/qemu-coroutine.h
index 22fe4ea..41f90bb 100644
--- a/qemu-coroutine.h
+++ b/qemu-coroutine.h
@@ -73,4 +73,9 @@ void *qemu_coroutine_enter(Coroutine *coroutine, void 
*opaque);
  */
 void * coroutine_fn qemu_coroutine_yield(void *opaque);
 
+/**
+ * Get the currently executing coroutine
+ */
+Coroutine * coroutine_fn qemu_coroutine_self(void);
+
 #endif /* QEMU_COROUTINE_H */
-- 
1.7.2.3

[Qemu-devel] [RFC][PATCH 10/12] block: Add coroutine support to synchronous I/O functions

2011-01-22 Thread Stefan Hajnoczi

Implement transparent support for the following synchronous I/O
functions to be called from coroutines:

bdrv_read()
bdrv_write()
bdrv_pread()
bdrv_pwrite()
bdrv_pwrite_sync()
bdrv_write_sync()
bdrv_flush()

These functions will use asynchronous I/O behind the scenes but
otherwise behave the same as the truly synchronous implementations.

This change allows synchronous I/O code in QEMU to just work in a
coroutine.

Signed-off-by: Stefan Hajnoczi 
---
 block.c |   42 ++
 block.h |1 +
 2 files changed, 43 insertions(+), 0 deletions(-)

diff --git a/block.c b/block.c
index ecb6538..f955ded 100644
--- a/block.c
+++ b/block.c
@@ -924,6 +924,17 @@ int bdrv_read(BlockDriverState *bs, int64_t sector_num,
 {
 BlockDriver *drv = bs->drv;
 
+if (qemu_in_coroutine()) {
+QEMUIOVector qiov;
+struct iovec iov = {
+.iov_base = buf,
+.iov_len = nb_sectors * BDRV_SECTOR_SIZE,
+};
+
+qemu_iovec_init_external(&qiov, &iov, 1);
+return bdrv_co_readv(bs, sector_num, &qiov, nb_sectors);
+}
+
 if (!drv)
 return -ENOMEDIUM;
 if (bdrv_check_request(bs, sector_num, nb_sectors))
@@ -970,6 +981,18 @@ int bdrv_write(BlockDriverState *bs, int64_t sector_num,
const uint8_t *buf, int nb_sectors)
 {
 BlockDriver *drv = bs->drv;
+
+if (qemu_in_coroutine()) {
+QEMUIOVector qiov;
+struct iovec iov = {
+.iov_base = (void *)buf,
+.iov_len = nb_sectors * BDRV_SECTOR_SIZE,
+};
+
+qemu_iovec_init_external(&qiov, &iov, 1);
+return bdrv_co_writev(bs, sector_num, &qiov, nb_sectors);
+}
+
 if (!bs->drv)
 return -ENOMEDIUM;
 if (bs->read_only)
@@ -1471,6 +1494,10 @@ const char *bdrv_get_device_name(BlockDriverState *bs)
 
 int bdrv_flush(BlockDriverState *bs)
 {
+if (qemu_in_coroutine()) {
+return bdrv_co_flush(bs);
+}
+
 if (bs->open_flags & BDRV_O_NO_FLUSH) {
 return 0;
 }
@@ -2664,6 +2691,21 @@ int coroutine_fn bdrv_co_writev(BlockDriverState *bs, 
int64_t sector_num,
 return bdrv_co_io(bs, sector_num, iov, nb_sectors, 1);
 }
 
+int coroutine_fn bdrv_co_flush(BlockDriverState *bs)
+{
+CoroutineIOCompletion co = {
+.coroutine = qemu_coroutine_self(),
+};
+BlockDriverAIOCB *acb;
+
+acb = bdrv_aio_flush(bs, bdrv_co_complete, &co);
+if (!acb) {
+return -EIO;
+}
+qemu_coroutine_yield(NULL);
+return co.ret;
+}
+
 /**/
 /* removable device support */
 
diff --git a/block.h b/block.h
index 472e3d4..bced0c5 100644
--- a/block.h
+++ b/block.h
@@ -126,6 +126,7 @@ int coroutine_fn bdrv_co_readv(BlockDriverState *bs, 
int64_t sector_num,
QEMUIOVector *iov, int nb_sectors);
 int coroutine_fn bdrv_co_writev(BlockDriverState *bs, int64_t sector_num,
 QEMUIOVector *iov, int nb_sectors);
+int coroutine_fn bdrv_co_flush(BlockDriverState *bs);
 
 typedef struct BlockRequest {
 /* Fields to be filled by multiwrite caller */
-- 
1.7.2.3

[Qemu-devel] [RFC][PATCH 11/12] qcow2: Convert qcow2 to use coroutines for async I/O

2011-01-22 Thread Stefan Hajnoczi

Converting qcow2 to use coroutines is fairly simple since most of qcow2
is synchronous.  The synchronous I/O functions likes bdrv_pread() now
transparently work when called from a coroutine, so all the synchronous
code just works.

The explicitly asynchronous code is adjusted to repeatedly call
qcow2_aio_read_cb() or qcow2_aio_write_cb() until the request completes.
At that point the coroutine will return from its entry function and its
resources are freed.

The bdrv_aio_readv() and bdrv_aio_writev() user callback is now invoked
from a BH.  This is necessary since the user callback code does not
expect to be executed from a coroutine.

This conversion is not completely correct because the safety the
synchronous code does not carry over to the coroutine version.
Previously, a synchronous code path could assume that it will never be
interleaved with another request executing.  This is no longer true
because bdrv_pread() and bdrv_pwrite() cause the coroutine to yield and
other requests can be processed during that time.

The solution is to carefully introduce checks so that pending requests
do not step on each other's toes.  That is left for a future patch...

Signed-off-by: Stefan Hajnoczi 
---
 block/qcow2.c |  160 ++---
 1 files changed, 85 insertions(+), 75 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index b6b094c..4b33ef3 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -361,19 +361,20 @@ typedef struct QCowAIOCB {
 uint64_t bytes_done;
 uint64_t cluster_offset;
 uint8_t *cluster_data;
-BlockDriverAIOCB *hd_aiocb;
 QEMUIOVector hd_qiov;
 QEMUBH *bh;
 QCowL2Meta l2meta;
 QLIST_ENTRY(QCowAIOCB) next_depend;
+Coroutine *coroutine;
+int ret; /* return code for user callback */
 } QCowAIOCB;
 
 static void qcow2_aio_cancel(BlockDriverAIOCB *blockacb)
 {
 QCowAIOCB *acb = container_of(blockacb, QCowAIOCB, common);
-if (acb->hd_aiocb)
-bdrv_aio_cancel(acb->hd_aiocb);
 qemu_aio_release(acb);
+/* XXX This function looks broken, we could be in the middle of a request
+ * and releasing the acb is not a good idea */
 }
 
 static AIOPool qcow2_aio_pool = {
@@ -381,13 +382,14 @@ static AIOPool qcow2_aio_pool = {
 .cancel = qcow2_aio_cancel,
 };
 
-static void qcow2_aio_read_cb(void *opaque, int ret);
-static void qcow2_aio_read_bh(void *opaque)
+static void qcow2_aio_bh(void *opaque)
 {
 QCowAIOCB *acb = opaque;
 qemu_bh_delete(acb->bh);
 acb->bh = NULL;
-qcow2_aio_read_cb(opaque, 0);
+acb->common.cb(acb->common.opaque, acb->ret);
+qemu_iovec_destroy(&acb->hd_qiov);
+qemu_aio_release(acb);
 }
 
 static int qcow2_schedule_bh(QEMUBHFunc *cb, QCowAIOCB *acb)
@@ -404,14 +406,13 @@ static int qcow2_schedule_bh(QEMUBHFunc *cb, QCowAIOCB 
*acb)
 return 0;
 }
 
-static void qcow2_aio_read_cb(void *opaque, int ret)
+static int coroutine_fn qcow2_aio_read_cb(void *opaque, int ret)
 {
 QCowAIOCB *acb = opaque;
 BlockDriverState *bs = acb->common.bs;
 BDRVQcowState *s = bs->opaque;
 int index_in_cluster, n1;
 
-acb->hd_aiocb = NULL;
 if (ret < 0)
 goto done;
 
@@ -469,22 +470,13 @@ static void qcow2_aio_read_cb(void *opaque, int ret)
 acb->sector_num, acb->cur_nr_sectors);
 if (n1 > 0) {
 BLKDBG_EVENT(bs->file, BLKDBG_READ_BACKING_AIO);
-acb->hd_aiocb = bdrv_aio_readv(bs->backing_hd, acb->sector_num,
-&acb->hd_qiov, acb->cur_nr_sectors,
-   qcow2_aio_read_cb, acb);
-if (acb->hd_aiocb == NULL)
-goto done;
-} else {
-ret = qcow2_schedule_bh(qcow2_aio_read_bh, acb);
-if (ret < 0)
+ret = bdrv_co_readv(bs->backing_hd, acb->sector_num, 
&acb->hd_qiov, acb->cur_nr_sectors);
+if (ret < 0) {
 goto done;
+}
 }
 } else {
-/* Note: in this case, no need to wait */
 qemu_iovec_memset(&acb->hd_qiov, 0, 512 * acb->cur_nr_sectors);
-ret = qcow2_schedule_bh(qcow2_aio_read_bh, acb);
-if (ret < 0)
-goto done;
 }
 } else if (acb->cluster_offset & QCOW_OFLAG_COMPRESSED) {
 /* add AIO support for compressed blocks ? */
@@ -494,10 +486,6 @@ static void qcow2_aio_read_cb(void *opaque, int ret)
 qemu_iovec_from_buffer(&acb->hd_qiov,
 s->cluster_cache + index_in_cluster * 512,
 512 * acb->cur_nr_sectors);
-
-ret = qcow2_schedule_bh(qcow2_aio_read_bh, acb);
-if (ret < 0)
-goto done;
 } else {
 if ((acb->cluster_offset & 511) != 0) {
 ret = -EIO;
@@ -522,34 +510,50 @@ static void qcow2_aio_read_cb(void *opaque, int ret)
 }
 
 BLKDBG_EVENT(bs->file, BLKDBG_READ_AIO);
-a

[Qemu-devel] [RFC][PATCH 08/12] coroutine: Add qemu_in_coroutine()

2011-01-22 Thread Stefan Hajnoczi

The qemu_in_coroutine() function checks whether or not we are currently
in a user coroutine.  This can be used to distinguish between executing
normal vcpu or iothread code from executing a coroutine.

Signed-off-by: Stefan Hajnoczi 
---
 qemu-coroutine.c |5 +
 qemu-coroutine.h |7 +++
 2 files changed, 12 insertions(+), 0 deletions(-)

diff --git a/qemu-coroutine.c b/qemu-coroutine.c
index e55b7c6..9318600 100644
--- a/qemu-coroutine.c
+++ b/qemu-coroutine.c
@@ -43,3 +43,8 @@ Coroutine * coroutine_fn qemu_coroutine_self(void)
 {
 return (Coroutine*)coroutine_self();
 }
+
+bool qemu_in_coroutine(void)
+{
+return !coroutine_is_leader(coroutine_self());
+}
diff --git a/qemu-coroutine.h b/qemu-coroutine.h
index 41f90bb..1fad3bf 100644
--- a/qemu-coroutine.h
+++ b/qemu-coroutine.h
@@ -14,6 +14,8 @@
 #ifndef QEMU_COROUTINE_H
 #define QEMU_COROUTINE_H
 
+#include 
+
 /**
  * Mark a function that executes in coroutine context
  *
@@ -78,4 +80,9 @@ void * coroutine_fn qemu_coroutine_yield(void *opaque);
  */
 Coroutine * coroutine_fn qemu_coroutine_self(void);
 
+/**
+ * Return whether or not currently inside a coroutine
+ */
+bool qemu_in_coroutine(void);
+
 #endif /* QEMU_COROUTINE_H */
-- 
1.7.2.3

[Qemu-devel] [RFC][PATCH 09/12] block: Add bdrv_co_readv() and bdrv_co_writev()

2011-01-22 Thread Stefan Hajnoczi

Add coroutine versions of the bdrv_aio_readv() and bdrv_aio_writev()
functions.  The coroutine version doesn't take a callback and simply
returns the error value.  Behind the scenes they are implemented using
bdrv_aio_readv() and bdrv_aio_writev().

Signed-off-by: Stefan Hajnoczi 
---
 block.c |   51 +++
 block.h |7 +++
 2 files changed, 58 insertions(+), 0 deletions(-)

diff --git a/block.c b/block.c
index ff2795b..ecb6538 100644
--- a/block.c
+++ b/block.c
@@ -2614,6 +2614,57 @@ void qemu_aio_release(void *p)
 }
 
 /**/
+/* I/O for coroutines */
+
+typedef struct CoroutineIOCompletion {
+Coroutine *coroutine;
+int ret;
+} CoroutineIOCompletion;
+
+static void bdrv_co_complete(void *opaque, int ret)
+{
+CoroutineIOCompletion *co = opaque;
+
+co->ret = ret;
+qemu_coroutine_enter(co->coroutine, NULL);
+}
+
+static int coroutine_fn bdrv_co_io(BlockDriverState *bs, int64_t sector_num,
+   QEMUIOVector *iov, int nb_sectors,
+   int is_write)
+{
+CoroutineIOCompletion co = {
+.coroutine = qemu_coroutine_self(),
+};
+BlockDriverAIOCB *acb;
+
+if (is_write) {
+acb = bdrv_aio_writev(bs, sector_num, iov, nb_sectors,
+  bdrv_co_complete, &co);
+} else {
+acb = bdrv_aio_readv(bs, sector_num, iov, nb_sectors,
+ bdrv_co_complete, &co);
+}
+if (!acb) {
+return -EIO;
+}
+qemu_coroutine_yield(NULL);
+return co.ret;
+}
+
+int coroutine_fn bdrv_co_readv(BlockDriverState *bs, int64_t sector_num,
+   QEMUIOVector *iov, int nb_sectors)
+{
+return bdrv_co_io(bs, sector_num, iov, nb_sectors, 0);
+}
+
+int coroutine_fn bdrv_co_writev(BlockDriverState *bs, int64_t sector_num,
+QEMUIOVector *iov, int nb_sectors)
+{
+return bdrv_co_io(bs, sector_num, iov, nb_sectors, 1);
+}
+
+/**/
 /* removable device support */
 
 /**
diff --git a/block.h b/block.h
index f923add..472e3d4 100644
--- a/block.h
+++ b/block.h
@@ -4,6 +4,7 @@
 #include "qemu-aio.h"
 #include "qemu-common.h"
 #include "qemu-option.h"
+#include "qemu-coroutine.h"
 #include "qobject.h"
 
 /* block.c */
@@ -120,6 +121,12 @@ BlockDriverAIOCB *bdrv_aio_flush(BlockDriverState *bs,
 BlockDriverCompletionFunc *cb, void *opaque);
 void bdrv_aio_cancel(BlockDriverAIOCB *acb);
 
+/* block I/O for coroutines */
+int coroutine_fn bdrv_co_readv(BlockDriverState *bs, int64_t sector_num,
+   QEMUIOVector *iov, int nb_sectors);
+int coroutine_fn bdrv_co_writev(BlockDriverState *bs, int64_t sector_num,
+QEMUIOVector *iov, int nb_sectors);
+
 typedef struct BlockRequest {
 /* Fields to be filled by multiwrite caller */
 int64_t sector;
-- 
1.7.2.3

[Qemu-devel] [RFC][PATCH 05/12] coroutine: Add coroutines

2011-01-22 Thread Stefan Hajnoczi

Add functions to create coroutines and transfer control into a coroutine
and back out again.

Signed-off-by: Stefan Hajnoczi 
---
 Makefile.objs|2 +-
 qemu-coroutine.c |   40 
 qemu-coroutine.h |   76 ++
 3 files changed, 117 insertions(+), 1 deletions(-)
 create mode 100644 qemu-coroutine.c
 create mode 100644 qemu-coroutine.h

diff --git a/Makefile.objs b/Makefile.objs
index ae26396..7933fc9 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -15,7 +15,7 @@ oslib-obj-$(CONFIG_POSIX) += oslib-posix.o
 
 block-obj-y = cutils.o cache-utils.o qemu-malloc.o qemu-option.o module.o
 block-obj-y += nbd.o block.o aio.o aes.o qemu-config.o
-block-obj-$(CONFIG_POSIX) += posix-aio-compat.o continuation.o 
coroutine_ucontext.o
+block-obj-$(CONFIG_POSIX) += posix-aio-compat.o continuation.o 
coroutine_ucontext.o qemu-coroutine.o
 block-obj-$(CONFIG_LINUX_AIO) += linux-aio.o
 
 block-nested-y += raw.o cow.o qcow.o vdi.o vmdk.o cloop.o dmg.o bochs.o vpc.o 
vvfat.o
diff --git a/qemu-coroutine.c b/qemu-coroutine.c
new file mode 100644
index 000..dd2cd8e
--- /dev/null
+++ b/qemu-coroutine.c
@@ -0,0 +1,40 @@
+/*
+ * QEMU coroutines
+ *
+ * Copyright IBM, Corp. 2011
+ *
+ * Authors:
+ *  Stefan Hajnoczi
+ *
+ * This work is licensed under the terms of the GNU LGPL, version 2 or later.
+ * See the COPYING.LIB file in the top-level directory.
+ *
+ */
+
+#include "coroutine.h"
+
+#include "qemu-common.h"
+#include "qemu-coroutine.h"
+
+struct Coroutine {
+struct coroutine co;
+};
+
+Coroutine *qemu_coroutine_create(CoroutineEntry *entry)
+{
+Coroutine *coroutine = qemu_mallocz(sizeof(*coroutine));
+
+coroutine->co.entry = entry;
+coroutine_init(&coroutine->co);
+return coroutine;
+}
+
+void *qemu_coroutine_enter(Coroutine *coroutine, void *opaque)
+{
+return coroutine_yieldto(&coroutine->co, opaque);
+}
+
+void * coroutine_fn qemu_coroutine_yield(void *opaque)
+{
+return coroutine_yield(opaque);
+}
diff --git a/qemu-coroutine.h b/qemu-coroutine.h
new file mode 100644
index 000..22fe4ea
--- /dev/null
+++ b/qemu-coroutine.h
@@ -0,0 +1,76 @@
+/*
+ * QEMU coroutine implementation
+ *
+ * Copyright IBM, Corp. 2011
+ *
+ * Authors:
+ *  Stefan Hajnoczi
+ *
+ * This work is licensed under the terms of the GNU LGPL, version 2 or later.
+ * See the COPYING.LIB file in the top-level directory.
+ *
+ */
+
+#ifndef QEMU_COROUTINE_H
+#define QEMU_COROUTINE_H
+
+/**
+ * Mark a function that executes in coroutine context
+ *
+ * Functions that execute in coroutine context cannot be called directly from
+ * normal functions.  In the future it would be nice to enable compiler or
+ * static checker support for catching such errors.  This annotation might make
+ * it possible and in the meantime it serves as documentation.
+ *
+ * For example:
+ *
+ *   static void coroutine_fn foo(void) {
+ *   
+ *   }
+ */
+#define coroutine_fn
+
+typedef struct Coroutine Coroutine;
+
+/**
+ * Coroutine entry point
+ *
+ * When the coroutine is entered for the first time, opaque is passed in as an
+ * argument.
+ *
+ * When this function returns, the coroutine is destroyed automatically and the
+ * return value is passed back to the caller who last entered the coroutine.
+ */
+typedef void * coroutine_fn CoroutineEntry(void *opaque);
+
+/**
+ * Create a new coroutine
+ *
+ * Use qemu_coroutine_enter() to actually transfer control to the coroutine.
+ */
+Coroutine *qemu_coroutine_create(CoroutineEntry *entry);
+
+/**
+ * Transfer control to a coroutine
+ *
+ * The opaque argument is made available to the coroutine either as the entry
+ * function argument if this is the first time a new coroutine is entered, or
+ * as the return value from qemu_coroutine_yield().
+ *
+ * The return value from this function is either an opaque value yielded by the
+ * coroutine or the coroutine entry function return value when the coroutine
+ * terminates.
+ */
+void *qemu_coroutine_enter(Coroutine *coroutine, void *opaque);
+
+/**
+ * Transfer control back to a coroutine's caller
+ *
+ * The opaque argument is returned from the calling qemu_coroutine_enter().
+ *
+ * The return value is the argument passed back in from the next
+ * qemu_coroutine_enter().
+ */
+void * coroutine_fn qemu_coroutine_yield(void *opaque);
+
+#endif /* QEMU_COROUTINE_H */
-- 
1.7.2.3

[Qemu-devel] [RFC][PATCH 00/12] qcow2: Convert qcow2 to use coroutines for async I/O

2011-01-22 Thread Stefan Hajnoczi

This patch series prototypes making QCOW2 fully asynchronous to eliminate the
timing jitter and poor performance that has been observed.  QCOW2 has
asynchronous I/O code paths for some of the read/write common cases but
metadata access is always synchronous.

One solution is to rewrite QCOW2 to be fully asynchronous by splitting all
functions that perform blocking I/O into a series of callbacks.  Due to the
complexity of QCOW2, this conversion and the maintenance prospects are
unattractive.

This patch series prototypes an alternative solution to make QCOW2
asynchronous.  It introduces coroutines, cooperative userspace threads of
control, so that each QCOW2 request has its own call stack.  To perform I/O,
the coroutine submits an asynchronous I/O request and then yields back to QEMU.
The coroutine stays suspended while the I/O operation is being processed by
lower layers of the stack.  When the asynchronous I/O completes, the coroutine
is resumed.

The upshot of this is that QCOW2 can be implemented in a sequential fashion
without explicit callbacks but all I/O actually happens asynchronously under
the covers.

This prototype implements reads, writes, and flushes.  Should install or boot
VMs successfully.  However, it has the following limitations:

1. QCOW2 requests are serialized because the code is not yet safe for
   concurrent requests.  See the last patch for details.

2. Coroutines are unoptimized.  We should pool coroutines (and their mmapped
   stacks) to avoid the cost of coroutine creation.

3. The qcow2_aio_read_cb() and qcow2_aoi_write_cb() functions should be
   refactored into sequential code now that callbacks are no longer needed.

I think this approach can solve the performance and functional problems of the
current QCOW2 implementation.  It does not require invasive changes, much of
QCOW2 works unmodified.

Kevin: Do you like this approach and do you want to develop it further?

Stefan

[Qemu-devel] [RFC][PATCH 04/12] coroutine: Use thread-local leader and current variables

2011-01-22 Thread Stefan Hajnoczi

Each pthread should have its own current coroutine.  This ensures
coroutines are thread-safe.

Signed-off-by: Stefan Hajnoczi 
---
 coroutine_ucontext.c |5 -
 1 files changed, 0 insertions(+), 5 deletions(-)

diff --git a/coroutine_ucontext.c b/coroutine_ucontext.c
index f76da94..289e5bd 100644
--- a/coroutine_ucontext.c
+++ b/coroutine_ucontext.c
@@ -70,13 +70,8 @@ int coroutine_init(struct coroutine *co)
return cc_init(&co->cc);
 }
 
-#if 0
 static __thread struct coroutine leader;
 static __thread struct coroutine *current;
-#else
-static struct coroutine leader;
-static struct coroutine *current;
-#endif
 
 struct coroutine *coroutine_self(void)
 {
-- 
1.7.2.3

[Qemu-devel] [RFC][PATCH 07/12] coroutine: Add coroutine_is_leader()

2011-01-22 Thread Stefan Hajnoczi

Make it possible to check whether a coroutine is the default main
coroutine (the "leader") or not.

Signed-off-by: Stefan Hajnoczi 
---
 coroutine.h  |2 ++
 coroutine_ucontext.c |5 +
 2 files changed, 7 insertions(+), 0 deletions(-)

diff --git a/coroutine.h b/coroutine.h
index 5316535..3aca19a 100644
--- a/coroutine.h
+++ b/coroutine.h
@@ -58,6 +58,8 @@ void *coroutine_swap(struct coroutine *from, struct coroutine 
*to, void *arg);
 
 struct coroutine *coroutine_self(void);
 
+int coroutine_is_leader(struct coroutine *co);
+
 void *coroutine_yieldto(struct coroutine *to, void *arg);
 
 void *coroutine_yield(void *arg);
diff --git a/coroutine_ucontext.c b/coroutine_ucontext.c
index 289e5bd..b90a2f6 100644
--- a/coroutine_ucontext.c
+++ b/coroutine_ucontext.c
@@ -80,6 +80,11 @@ struct coroutine *coroutine_self(void)
return current;
 }
 
+int coroutine_is_leader(struct coroutine *co)
+{
+return co == &leader;
+}
+
 void *coroutine_swap(struct coroutine *from, struct coroutine *to, void *arg)
 {
int ret;
-- 
1.7.2.3

Re: [Qemu-devel] [PATCH 1/5] Fast Virtual Disk (FVD) Proposal Part 1

2011-01-22 Thread Peter Maydell

On 20 January 2011 17:08, Stefan Weil  wrote:
> Yes, that's a problem with some parts of the old code.
> For files which you want to modify, you could remove
> the spaces with your script before applying your other
> modifications and create a separate patch which only
> removes the superfluous spaces.

(This kind of came up in the other thread about fixing
non-C89 comments. I don't have any particular interest in
this area of the qemu source so this is a general remark.)

I definitely dislike patches which change whitespace or
indentation for an entire file, even if they are standalone
"only fixing whitespace" patches; they make it much harder
to deal with forks and branches of qemu. I would prefer
it if we only fix whitespace, indent and bracing for lines
we're touching anyway.

-- PMM

37 matches

Mail list logo