Re: [kvm-devel] [PATCH][RFC] Use pipe() to simulate signalfd()

2008-04-29 Thread Marcelo Tosatti
On Tue, Apr 29, 2008 at 07:47:54PM -0500, Anthony Liguori wrote:
> Marcelo Tosatti wrote:
> >>>Moving the signal handling + pipe write to a separate thread should get
> >>>rid of it.
> >>> 
> >>>  
> >>Yeah, but then you just introduce buffering problems since if you're 
> >>getting that many signals, the pipe will get full.
> >>
> >
> >It is OK to lose signals if you have at least one queued in the pipe.
> >  
> 
> If you're getting so many signals that you can't make forward progress 
> on any system call, your application is not going to function 
> anymore.  A use of signals in this manner is broken by design.
> 
> >>No point in designing for something that isn't likely to happen in 
> >>practice.
> >>
> >
> >You should not design something making the assumption that this scenario
> >won't happen.
> >
> >For example, this could happen in high-throughput guests using POSIX AIO; it is
> >actually pretty likely to happen if data is cached in the host's pagecache.
> >  
> 
> We really just need to move away from signals as best as we can.  I've 
> got a patch started that implements a thread-pool based AIO mechanism 
> for QEMU.  Notifications are done over a pipe so we don't have to deal 
> with the unreliability of signals.
> 
> I can't imagine a guest trying to do so much IO though that this would 
> really ever happen.  POSIX AIO can only have one outstanding request 
> per-fd.  To complete the IO request, you would have to eventually go 
> back to the guest and during that time, the IO thread is going to be 
> able to make forward progress. 

No. POSIX AIO can have only one _in-flight_ AIO request per fd, but many
pending AIO requests queued per fd. You don't have to go back to guest mode
to get more AIO completions.
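
(For illustration, a minimal sketch of the "many pending requests per fd"
point using the plain POSIX AIO interface; the fd, request count, buffer size
and the choice of SIGUSR2 for completion notification are arbitrary here, not
taken from QEMU:)

#include <aio.h>
#include <signal.h>
#include <string.h>

#define NR_REQS 64

static struct aiocb reqs[NR_REQS];
static char bufs[NR_REQS][4096];

/* Queue many reads against a single fd; each completion raises SIGUSR2.
 * Nothing forces a return to guest mode between completions, so a burst
 * of completion signals can arrive back to back. */
static void submit_many(int fd)
{
    int i;

    for (i = 0; i < NR_REQS; i++) {
        memset(&reqs[i], 0, sizeof(reqs[i]));
        reqs[i].aio_fildes = fd;
        reqs[i].aio_buf = bufs[i];
        reqs[i].aio_nbytes = sizeof(bufs[i]);
        reqs[i].aio_offset = (off_t)i * sizeof(bufs[i]);
        reqs[i].aio_sigevent.sigev_notify = SIGEV_SIGNAL;
        reqs[i].aio_sigevent.sigev_signo = SIGUSR2;
        aio_read(&reqs[i]);
    }
}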

And then you have drivers arming timers.

I just don't like the idea.

> You won't get a signal again until a new IO request is submitted.


> 
> Regards,
> 
> Anthony Liguori
> 
> >It's somewhat similar to what happens with NAPI and interrupt mitigation.
> >
> >  



Re: [kvm-devel] [PATCH][RFC] Use pipe() to simulate signalfd()

2008-04-29 Thread Anthony Liguori
Marcelo Tosatti wrote:
>>> Moving the signal handling + pipe write to a separate thread should get
>>> rid of it.
>>>  
>>>   
>> Yeah, but then you just introduce buffering problems since if you're 
>> getting that many signals, the pipe will get full.
>> 
>
> It is OK to lose signals if you have at least one queued in the pipe.
>   

If you're getting so many signals that you can't make forward progress 
> on any system call, your application is not going to function 
anymore.  A use of signals in this manner is broken by design.

>> No point in designing for something that isn't likely to happen in practice.
>> 
>
> You should not design something making the assumption that this scenario
> won't happen.
>
> For example, this could happen in high-throughput guests using POSIX AIO; it is
> actually pretty likely to happen if data is cached in the host's pagecache.
>   

We really just need to move away from signals as best as we can.  I've 
got a patch started that implements a thread-pool based AIO mechanism 
for QEMU.  Notifications are done over a pipe so we don't have to deal 
with the unreliability of signals.
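
(The thread-pool patch itself isn't posted in this thread; the following is
only a rough sketch of the idea being described -- do the IO synchronously in
a pool thread and signal completion by writing to a pipe the main loop
selects on.  Every name below (aio_req, completion_fd, aio_worker) is
invented for the sketch, not taken from the actual patch:)

#include <sys/types.h>
#include <unistd.h>

struct aio_req {
    int fd;
    void *buf;
    size_t len;
    off_t offset;
    ssize_t ret;
};

static int completion_fd[2];    /* created once with pipe() at startup */

/* Runs in a worker thread: complete the request, then wake the select()
 * loop by writing a token to the pipe.  select() sees completion_fd[0]
 * become readable; no signal delivery is involved, so nothing is lost. */
static void *aio_worker(void *opaque)
{
    struct aio_req *req = opaque;

    req->ret = pread(req->fd, req->buf, req->len, req->offset);
    write(completion_fd[1], &req, sizeof(req));
    return NULL;
}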

I can't imagine a guest trying to do so much IO though that this would 
really ever happen.  POSIX AIO can only have one outstanding request 
per-fd.  To complete the IO request, you would have to eventually go 
back to the guest and during that time, the IO thread is going to be 
able to make forward progress.  You won't get a signal again until a new 
IO request is submitted.

Regards,

Anthony Liguori

> It's somewhat similar to what happens with NAPI and interrupt mitigation.
>
>   




Re: [kvm-devel] [PATCH][RFC] Use pipe() to simulate signalfd()

2008-04-29 Thread Marcelo Tosatti
On Tue, Apr 29, 2008 at 07:22:53PM -0500, Anthony Liguori wrote:
> Marcelo Tosatti wrote:
> >True.
> >
> >Either way, this starvation due to signals can't happen with the current
> >scheme because signals are blocked. That seems like a significant drawback.
> >  
> 
> Practically speaking, I don't see signal starvation being a problem.  Part 
> of the benefit of this approach is that we're reducing the overall 
> number of signals received.  With the in-kernel PIT, the number of 
> userspace signals is even further reduced.
> 
> >Moving the signal handling + pipe write to a separate thread should get
> >rid of it.
> >  
> 
> Yeah, but then you just introduce buffering problems since if you're 
> getting that many signals, the pipe will get full.

It is OK to lose signals if you have at least one queued in the pipe.
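
(A sketch of that property: with the pipe's write end assumed to be
O_NONBLOCK, the handler can simply drop the write when the pipe is full --
one queued entry is already enough to make select() return and drain
everything.  The names here are illustrative, not from the patch:)

#include <errno.h>
#include <unistd.h>

static int notify_fd[2];   /* self-pipe; write end set O_NONBLOCK via fcntl() */

static void sig_notify_handler(int signum)
{
    ssize_t r;

    do {
        r = write(notify_fd[1], &signum, sizeof(signum));
    } while (r == -1 && errno == EINTR);
    /* r == -1 && errno == EAGAIN: pipe full.  Dropping the write is fine,
     * since at least one entry is already queued to wake the reader. */
}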

> No point in designing for something that isn't likely to happen in practice.

You should not design something making the assumption that this scenario
won't happen.

For example, this could happen in high-throughput guests using POSIX AIO; it is
actually pretty likely to happen if data is cached in the host's pagecache.

It's somewhat similar to what happens with NAPI and interrupt mitigation.




Re: [kvm-devel] [PATCH][RFC] Use pipe() to simulate signalfd()

2008-04-29 Thread Anthony Liguori
Marcelo Tosatti wrote:
> True.
>
> Either way, this starvation due to signals can't happen with the current
> scheme because signals are blocked. That seems like a significant drawback.
>   

Practically speaking, I don't see signal starvation being a problem.  Part 
of the benefit of this approach is that we're reducing the overall 
number of signals received.  With the in-kernel PIT, the number of 
userspace signals is even further reduced.

> Moving the signal handling + pipe write to a separate thread should get
> rid of it.
>   

Yeah, but then you just introduce buffering problems since if you're 
getting that many signals, the pipe will get full.

No point in designing for something that isn't likely to happen in practice.

Regards,

Anthony Liguori

>> No, they're independent of the patch.  The symptom is that the guest 
>> tends to hang during boot for prolonged periods of time.  It tends to be 
>> random whether it will hang at all, how long it will hang, and at what 
>> point it will hang.  It tends to hang more often in particular places.  
>> In my ubuntu server guest, for instance, it tends to hang right after 
>> partition probing, right after "Loading hardware drivers", and right 
>> before spawning a getty on /dev/tty0.
>>
>> Normally, it will only hang for a few seconds.  However, with -smp N 
>> where N > number of cpus in the host, it tends to happen more 
>> frequently.  Using clock=jiffies in the guest makes the problem go away.
>> 
>
> I'll try to reproduce that, thanks.
>
>   
>> -no-kvm-pit doesn't make a difference (the guest is normally using the 
>> pm-timer).
>>
>> Regards,
>>
>> Anthony Liguori
>> 




Re: [kvm-devel] [PATCH][RFC] Use pipe() to simulate signalfd()

2008-04-29 Thread Marcelo Tosatti
On Tue, Apr 29, 2008 at 06:44:29PM -0500, Anthony Liguori wrote:
> Marcelo Tosatti wrote:
> >Why? If you leave data in the pipe the next select() won't block.
> >
> >Isn't there the possibility that this loop can be stuck for significant
> >amounts of time if you're receiving lots of notifications through
> >signals?
> >  
> 
> If you're getting that many signals, then select() is never going to 
> run anyway.

True.

Either way, this starvation due to signals can't happen with the current
scheme because signals are blocked. That seems like a significant drawback.

Moving the signal handling + pipe write to a separate thread should get
rid of it.

> No, they're independent of the patch.  The symptom is that the guest 
> tends to hang during boot for prolonged periods of time.  It tends to be 
> random whether it will hang at all, how long it will hang, and at what 
> point it will hang.  It tends to hang more often in particular places.  
> In my ubuntu server guest, for instance, it tends to hang right after 
> partition probing, right after "Loading hardware drivers", and right 
> before spawning a getty on /dev/tty0.
> 
> Normally, it will only hang for a few seconds.  However, with -smp N 
> where N > number of cpus in the host, it tends to happen more 
> frequently.  Using clock=jiffies in the guest makes the problem go away.

I'll try to reproduce that, thanks.

> -no-kvm-pit doesn't make a difference (the guest is normally using the 
> pm-timer).
> 
> Regards,
> 
> Anthony Liguori



Re: [kvm-devel] [PATCH][RFC] Use pipe() to simulate signalfd()

2008-04-29 Thread Anthony Liguori
Marcelo Tosatti wrote:
> Why? If you leave data in the pipe the next select() won't block.
>
> Isn't there the possibility that this loop can be stuck for significant
> amounts of time if you're receiving lots of notifications through
> signals?
>   

If you're getting that many signals, then select() is never going to 
run anyway.

>> -kvm_eat_signal(&io_signal_table, NULL, 1000);
>>pthread_mutex_lock(&qemu_mutex);
>> -cpu_single_env = NULL;
>> -main_loop_wait(0);
>> +main_loop_wait(10);
>>   
>>
>> 
> Increase that 1000 or something. Will make it easier to spot bugs.
>
>  
>   
 I have actually and it does introduce some bugs.  I'm not entirely clear 
 what is causing them though.

 
>>> Should indicate that some event previously delivered through signals and
>>> received by sigtimedwait is not waking up the IO thread.
>>>  
>>>   
>> I'll take a look and see.  I'm having timekeeping issues in the guest 
>> so it's hard to tell which problems are caused by the IO thread versus time.
>> 
>
> Time issues only with the patch? If not, please share details.
>   

No, they're independent of the patch.  The symptom is that the guest 
tends to hang during boot for prolonged periods of time.  It tends to be 
random whether it will hang at all, how long it will hang, and at what 
point it will hang.  It tends to hang more often in particular places.  
In my ubuntu server guest, for instance, it tends to hang right after 
partition probing, right after "Loading hardware drivers", and right 
before spawning a getty on /dev/tty0.

Normally, it will only hang for a few seconds.  However, with -smp N 
where N > number of cpus in the host, it tends to happen more 
frequently.  Using clock=jiffies in the guest makes the problem go away.

-no-kvm-pit doesn't make a difference (the guest is normally using the 
pm-timer).

Regards,

Anthony Liguori




Re: [kvm-devel] [PATCH][RFC] Use pipe() to simulate signalfd()

2008-04-29 Thread Marcelo Tosatti
On Tue, Apr 29, 2008 at 06:15:58PM -0500, Anthony Liguori wrote:
> Marcelo Tosatti wrote:
> >The problem is if the IO thread _receives_ SIG_IPI instead of some vcpu
> >thread.
> >
> >So there is potential harm in not blocking it.
> >  
> 
> Hrm, aren't SIG_IPIs delivered to a particular thread-id though?  When 
> would the IO thread receive a SIG_IPI?

Right, they are only delivered to specific threads. Still, it doesn't hurt
to make the blocking explicit.

> >>>What is the reason for this loop instead of a straight read? 
> >>>
> >>>It's alright to be interrupted by a signal.
> >>> 
> >>>  
> >>Just general habit with QEMU.
> >>
> >
> >Please don't :-)
> >  
> 
> I don't see the harm.  In fact, I think it's more correct.  Otherwise, 
> we have to wait for another invocation of the fd callback.

Why? If you leave data in the pipe the next select() won't block.

Isn't there the possibility that this loop can be stuck for significant
amounts of time if you're receiving lots of notifications through
signals?

> -kvm_eat_signal(&io_signal_table, NULL, 1000);
> pthread_mutex_lock(&qemu_mutex);
> -cpu_single_env = NULL;
> -main_loop_wait(0);
> + main_loop_wait(10);
>    
> 
> >>>Increase that 1000 or something. Will make it easier to spot bugs.
> >>> 
> >>>  
> >>I have actually and it does introduce some bugs.  I'm not entirely clear 
> >>what is causing them though.
> >>
> >
> >Should indicate that some event previously delivered through signals and
> >received by sigtimedwait is not waking up the IO thread.
> >  
> 
> I'll take a look and see.  I'm having timekeeping issues in the guest 
> so it's hard to tell which problems are caused by the IO thread versus time.

Time issues only with the patch? If not, please share details.




Re: [kvm-devel] [PATCH][RFC] Use pipe() to simulate signalfd()

2008-04-29 Thread Anthony Liguori
Marcelo Tosatti wrote:
> The problem is if the IO thread _receives_ SIG_IPI instead of some vcpu
> thread.
>
> So there is potential harm in not blocking it.
>   

Hrm, aren't SIG_IPIs delivered to a particular thread-id though?  When 
would the IO thread receive a SIG_IPI?

>>> What is the reason for this loop instead of a straight read? 
>>>
>>> It's alright to be interrupted by a signal.
>>>  
>>>   
>> Just general habit with QEMU.
>> 
>
> Please don't :-)
>   

I don't see the harm.  In fact, I think it's more correct.  Otherwise, 
we have to wait for another invocation of the fd callback.

 -kvm_eat_signal(&io_signal_table, NULL, 1000);
 pthread_mutex_lock(&qemu_mutex);
 -cpu_single_env = NULL;
 -main_loop_wait(0);
 +  main_loop_wait(10);

 
>>> Increase that 1000 or something. Will make it easier to spot bugs.
>>>  
>>>   
>> I have actually and it does introduce some bugs.  I'm not entirely clear 
>> what is causing them though.
>> 
>
> Should indicate that some event previously delivered through signals and
> received by sigtimedwait is not waking up the IO thread.
>   

I'll take a look and see.  I'm having timekeeping issues in the guest 
so it's hard to tell which problems are caused by the IO thread versus time.

Regards,

Anthony Liguori




Re: [kvm-devel] [PATCH][RFC] Use pipe() to simulate signalfd()

2008-04-29 Thread Marcelo Tosatti
On Tue, Apr 29, 2008 at 05:42:51PM -0500, Anthony Liguori wrote:
> Marcelo Tosatti wrote:
> >Hi Anthony,
> >
> >How is -no-kvm-irqchip working with the patch?
> >  
> 
> Seems to work fine.  What is your expectation?

Just wondering if vcpus are being properly woken up.

> >Make sure the IO thread has SIG_IPI blocked (those are for APIC vcpu
> >initialization only).
> >  
> 
> Just so I'm clear, there's really no harm in not blocking SIG_IPI 
> because it would just be ignored by the IO thread (since the SIG_IPI 
> handler is a nop).  But yeah, we should explicitly block it.

The problem is if the IO thread _receives_ SIG_IPI instead of some vcpu
thread.

So there is potential harm in not blocking it.
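
(A minimal sketch of making that blocking explicit in the IO thread.  SIG_IPI
is (SIGRTMIN+4) as in qemu-kvm.c; where exactly this would be called is an
assumption, not shown in the patch:)

#include <pthread.h>
#include <signal.h>

#define SIG_IPI (SIGRTMIN + 4)    /* as defined in qemu-kvm.c */

/* Call early in the IO thread: with SIG_IPI blocked there, the signal can
 * only ever be consumed by the vcpu threads that sigtimedwait() for it. */
static void io_thread_block_sig_ipi(void)
{
    sigset_t set;

    sigemptyset(&set);
    sigaddset(&set, SIG_IPI);
    pthread_sigmask(SIG_BLOCK, &set, NULL);
}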

> >What is the reason for this loop instead of a straight read? 
> >
> >It's alright to be interrupted by a signal.
> >  
> 
> Just general habit with QEMU.

Please don't :-)

> >>-kvm_eat_signal(&io_signal_table, NULL, 1000);
> >> pthread_mutex_lock(&qemu_mutex);
> >>-cpu_single_env = NULL;
> >>-main_loop_wait(0);
> >>+   main_loop_wait(10);
> >>
> >
> >Increase that 1000 or something. Will make it easier to spot bugs.
> >  
> 
> I have actually and it does introduce some bugs.  I'm not entirely clear 
> what is causing them though.

Should indicate that some event previously delivered through signals and
received by sigtimedwait is not waking up the IO thread.




Re: [kvm-devel] [PATCH][RFC] Use pipe() to simulate signalfd()

2008-04-29 Thread Anthony Liguori
Marcelo Tosatti wrote:
> Hi Anthony,
>
> How is -no-kvm-irqchip working with the patch?
>   

Seems to work fine.  What is your expectation?

> On Tue, Apr 29, 2008 at 09:28:14AM -0500, Anthony Liguori wrote:
>   
>> This patch eliminates the use of sigtimedwait() in the IO thread.  To avoid
>> the signal/select race condition, we use a pipe that we write to in the signal
>> handlers.  This was suggested by Rusty and seems to work well.
>>
>> +static int kvm_eat_signal(CPUState *env, int timeout)
>>  {
>>  struct timespec ts;
>>  int r, e, ret = 0;
>>  siginfo_t siginfo;
>> +sigset_t waitset;
>>  
>> +sigemptyset(&waitset);
>> +sigaddset(&waitset, SIG_IPI);
>>  ts.tv_sec = timeout / 1000;
>>  ts.tv_nsec = (timeout % 1000) * 100;
>> -r = sigtimedwait(&waitset->sigset, &siginfo, &ts);
>> +qemu_kvm_unlock();
>> +r = sigtimedwait(&waitset, &siginfo, &ts);
>> +qemu_kvm_lock(env);
>> +cpu_single_env = env;
>> 
>
> This assignment seems redundant now.
>   

Yeah, I have a bigger patch which eliminates all of the explicit 
assignments to cpu_single_env.

>>  
>> @@ -263,12 +238,8 @@ static void pause_all_threads(void)
>>  vcpu_info[i].stop = 1;
>>  pthread_kill(vcpu_info[i].thread, SIG_IPI);
>> 
>
> Make sure the IO thread has SIG_IPI blocked (those are for APIC vcpu
> initialization only).
>   

Just so I'm clear, there's really no harm in not blocking SIG_IPI 
because it would just be ignored by the IO thread (since the SIG_IPI 
handler is a nop).  But yeah, we should explicitly block it.

>> +static void sig_aio_fd_read(void *opaque)
>> +{
>> +int signum;
>> +ssize_t len;
>> +
>> +do { 
>> +len = read(kvm_sigfd[0], &signum, sizeof(signum));
>> +} while (len == -1 && errno == EINTR);
>> 
>
> What is the reason for this loop instead of a straight read? 
>
> It's alright to be interrupted by a signal.
>   

Just general habit with QEMU.

>> +signal(SIGUSR1, sig_aio_handler);
>> +signal(SIGUSR2, sig_aio_handler);
>> +signal(SIGALRM, sig_aio_handler);
>> +signal(SIGIO, sig_aio_handler);
>> +
>> +if (pipe(kvm_sigfd) == -1)
>> +abort();
>> 
>
> perror() would be nice.
>   

Yeah, everything needs proper error handling.

>> -kvm_eat_signal(&io_signal_table, NULL, 1000);
>>  pthread_mutex_lock(&qemu_mutex);
>> -cpu_single_env = NULL;
>> -main_loop_wait(0);
>> +main_loop_wait(10);
>> 
>
> Increase that 1000 or something. Will make it easier to spot bugs.
>   

I have actually and it does introduce some bugs.  I'm not entirely clear 
what is causing them though.

Regards,

Anthony Liguori

> Similarly in qemu_kvm_aio_wait().
>
>   




Re: [kvm-devel] [PATCH][RFC] Use pipe() to simulate signalfd()

2008-04-29 Thread Marcelo Tosatti
Hi Anthony,

How is -no-kvm-irqchip working with the patch?

On Tue, Apr 29, 2008 at 09:28:14AM -0500, Anthony Liguori wrote:
> This patch eliminates the use of sigtimedwait() in the IO thread.  To avoid
> the signal/select race condition, we use a pipe that we write to in the signal
> handlers.  This was suggested by Rusty and seems to work well.
> 
> +static int kvm_eat_signal(CPUState *env, int timeout)
>  {
>  struct timespec ts;
>  int r, e, ret = 0;
>  siginfo_t siginfo;
> +sigset_t waitset;
>  
> +sigemptyset(&waitset);
> +sigaddset(&waitset, SIG_IPI);
>  ts.tv_sec = timeout / 1000;
>  ts.tv_nsec = (timeout % 1000) * 100;
> -r = sigtimedwait(&waitset->sigset, &siginfo, &ts);
> +qemu_kvm_unlock();
> +r = sigtimedwait(&waitset, &siginfo, &ts);
> +qemu_kvm_lock(env);
> +cpu_single_env = env;

This assignment seems redundant now.

>  if (r == -1 && (errno == EAGAIN || errno == EINTR) && !timeout)
>   return 0;
>  e = errno;
> -pthread_mutex_lock(&qemu_mutex);
>  if (env && vcpu)
>  cpu_single_env = vcpu->env;

And this one too.

>  
> @@ -263,12 +238,8 @@ static void pause_all_threads(void)
>   vcpu_info[i].stop = 1;
>   pthread_kill(vcpu_info[i].thread, SIG_IPI);

Make sure the IO thread has SIG_IPI blocked (those are for APIC vcpu
initialization only).

> +static void sig_aio_fd_read(void *opaque)
> +{
> +int signum;
> +ssize_t len;
> +
> +do { 
> + len = read(kvm_sigfd[0], &signum, sizeof(signum));
> +} while (len == -1 && errno == EINTR);

What is the reason for this loop instead of a straight read? 

It's alright to be interrupted by a signal.

> +signal(SIGUSR1, sig_aio_handler);
> +signal(SIGUSR2, sig_aio_handler);
> +signal(SIGALRM, sig_aio_handler);
> +signal(SIGIO, sig_aio_handler);
> +
> +if (pipe(kvm_sigfd) == -1)
> + abort();

perror() would be nice.

> -kvm_eat_signal(&io_signal_table, NULL, 1000);
>  pthread_mutex_lock(&qemu_mutex);
> -cpu_single_env = NULL;
> -main_loop_wait(0);
> + main_loop_wait(10);

Increase that 1000 or something. Will make it easier to spot bugs.

Similarly in qemu_kvm_aio_wait().




[kvm-devel] [PATCH][RFC] Use pipe() to simulate signalfd()

2008-04-29 Thread Anthony Liguori
This patch eliminates the use of sigtimedwait() in the IO thread.  To avoid the
signal/select race condition, we use a pipe that we write to in the signal
handlers.  This was suggested by Rusty and seems to work well.
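
(For readers skimming the diff below, a minimal sketch of the shape of the
self-pipe approach.  Only the names kvm_sigfd, sig_aio_handler and
sig_aio_fd_read come from the patch; the handler body and error handling are
condensed assumptions:)

#include <errno.h>
#include <unistd.h>

static int kvm_sigfd[2];    /* created with pipe() during setup */

/* Installed via signal() for SIGUSR1, SIGUSR2, SIGALRM and SIGIO: forward
 * the signal number into the pipe, turning an unreliable asynchronous
 * signal into readability of kvm_sigfd[0]. */
static void sig_aio_handler(int signum)
{
    write(kvm_sigfd[1], &signum, sizeof(signum));
}

/* Read handler for kvm_sigfd[0], called from the normal select()-driven
 * main loop, i.e. outside signal context; the actual IO completion work
 * happens in the usual main_loop_wait() path once we return. */
static void sig_aio_fd_read(void *opaque)
{
    int signum;
    ssize_t len;

    do {
        len = read(kvm_sigfd[0], &signum, sizeof(signum));
    } while (len == -1 && errno == EINTR);
}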

There are a lot of cleanup/simplification opportunities with this, but I've
limited it to just the signal masking/eating routines.  We've got at least one
livelock left in the code that I haven't yet identified.  My goal is to make
this all a lot simpler so that it's easier to fix the remaining lock-ups.

I'm looking for feedback on whether this is a sane direction.  I haven't tested
this enough yet, so please don't apply it.

Signed-off-by: Anthony Liguori <[EMAIL PROTECTED]>

diff --git a/qemu/qemu-kvm.c b/qemu/qemu-kvm.c
index 9a9bf59..46d7425 100644
--- a/qemu/qemu-kvm.c
+++ b/qemu/qemu-kvm.c
@@ -7,6 +7,9 @@
  */
 #include "config.h"
 #include "config-host.h"
+#include "qemu-common.h"
+#include "block.h"
+#include "console.h"
 
 int kvm_allowed = 1;
 int kvm_irqchip = 1;
@@ -38,14 +41,6 @@ __thread struct vcpu_info *vcpu;
 
 static int qemu_system_ready;
 
-struct qemu_kvm_signal_table {
-sigset_t sigset;
-sigset_t negsigset;
-};
-
-static struct qemu_kvm_signal_table io_signal_table;
-static struct qemu_kvm_signal_table vcpu_signal_table;
-
 #define SIG_IPI (SIGRTMIN+4)
 
 struct vcpu_info {
@@ -169,53 +164,37 @@ static int has_work(CPUState *env)
 return kvm_arch_has_work(env);
 }
 
-static int kvm_process_signal(int si_signo)
-{
-struct sigaction sa;
-
-switch (si_signo) {
-case SIGUSR2:
-pthread_cond_signal(&qemu_aio_cond);
-break;
-case SIGALRM:
-case SIGIO:
-sigaction(si_signo, NULL, &sa);
-sa.sa_handler(si_signo);
-break;
-}
-
-return 1;
-}
-
-static int kvm_eat_signal(struct qemu_kvm_signal_table *waitset, CPUState *env,
-  int timeout)
+static int kvm_eat_signal(CPUState *env, int timeout)
 {
 struct timespec ts;
 int r, e, ret = 0;
 siginfo_t siginfo;
+sigset_t waitset;
 
+sigemptyset(&waitset);
+sigaddset(&waitset, SIG_IPI);
 ts.tv_sec = timeout / 1000;
 ts.tv_nsec = (timeout % 1000) * 100;
-r = sigtimedwait(&waitset->sigset, &siginfo, &ts);
+qemu_kvm_unlock();
+r = sigtimedwait(&waitset, &siginfo, &ts);
+qemu_kvm_lock(env);
+cpu_single_env = env;
 if (r == -1 && (errno == EAGAIN || errno == EINTR) && !timeout)
return 0;
 e = errno;
-pthread_mutex_lock(&qemu_mutex);
 if (env && vcpu)
 cpu_single_env = vcpu->env;
 if (r == -1 && !(errno == EAGAIN || errno == EINTR)) {
printf("sigtimedwait: %s\n", strerror(e));
exit(1);
 }
-if (r != -1)
-ret = kvm_process_signal(siginfo.si_signo);
+ret = 1;
 
 if (env && vcpu_info[env->cpu_index].stop) {
vcpu_info[env->cpu_index].stop = 0;
vcpu_info[env->cpu_index].stopped = 1;
pthread_kill(io_thread, SIGUSR1);
 }
-pthread_mutex_unlock(&qemu_mutex);
 
 return ret;
 }
@@ -224,24 +203,20 @@ static int kvm_eat_signal(struct qemu_kvm_signal_table *waitset, CPUState *env,
 static void kvm_eat_signals(CPUState *env, int timeout)
 {
 int r = 0;
-struct qemu_kvm_signal_table *waitset = &vcpu_signal_table;
 
-while (kvm_eat_signal(waitset, env, 0))
+while (kvm_eat_signal(env, 0))
r = 1;
 if (!r && timeout) {
-   r = kvm_eat_signal(waitset, env, timeout);
+   r = kvm_eat_signal(env, timeout);
if (r)
-   while (kvm_eat_signal(waitset, env, 0))
+   while (kvm_eat_signal(env, 0))
;
 }
 }
 
 static void kvm_main_loop_wait(CPUState *env, int timeout)
 {
-pthread_mutex_unlock(&qemu_mutex);
 kvm_eat_signals(env, timeout);
-pthread_mutex_lock(&qemu_mutex);
-cpu_single_env = env;
 vcpu_info[env->cpu_index].signalled = 0;
 }
 
@@ -263,12 +238,8 @@ static void pause_all_threads(void)
vcpu_info[i].stop = 1;
pthread_kill(vcpu_info[i].thread, SIG_IPI);
 }
-while (!all_threads_paused()) {
-   pthread_mutex_unlock(&qemu_mutex);
-   kvm_eat_signal(&io_signal_table, NULL, 1000);
-   pthread_mutex_lock(&qemu_mutex);
-   cpu_single_env = NULL;
-}
+while (!all_threads_paused())
+   main_loop_wait(10);
 }
 
 static void resume_all_threads(void)
@@ -391,18 +362,6 @@ static void *ap_main_loop(void *_env)
 return NULL;
 }
 
-static void qemu_kvm_init_signal_table(struct qemu_kvm_signal_table *sigtab)
-{
-sigemptyset(&sigtab->sigset);
-sigfillset(&sigtab->negsigset);
-}
-
-static void kvm_add_signal(struct qemu_kvm_signal_table *sigtab, int signum)
-{
-sigaddset(&sigtab->sigset, signum);
-sigdelset(&sigtab->negsigset, signum);
-}
-
 void kvm_init_new_ap(int cpu, CPUState *env)
 {
 pthread_create(&vcpu_info[cpu].thread, NULL, ap_main_loop, env);
@@ -411,28 +370,12 @@ void kvm_init_new_ap(int cpu, CPUState *env)
pthread_cond_wait