Re: [kvm-devel] [PATCH 5/9] KVM: Adds ability to signal userspace using a file-descriptor

2007-05-20 Thread Avi Kivity
Davide Libenzi wrote:
 On Thu, 17 May 2007, Avi Kivity wrote:

   
 Davide Libenzi wrote:
 
 On Wed, 16 May 2007, Avi Kivity wrote:

   
   
 IMO doing eventfd_fget() asap is best.  I much prefer refcounted pointers
 to
 handles in the kernel: it's easier to see what things point to, and there
 is
 to context needed for dereferencing.
 
 
 There are concerns (from Al and Christoph) about file lifetime tracking of
 an fd passed down to the kernel.
   
   
 What concerns are these?

 I have my own concerns about bare fds, which can change their backing file,
 and which are dependent on the context.  The kernel appears to be moving away
 from handles to pointers, as seen by the pid - task_struck/struct pid
 conversion.
 

 The concern is that by keeping a reference to the file* you may end up 
 with a file that has no more userspace links. Although the file_count()==1 
 will tell you that you've the only reference left, this relies on you 
 doing the check. I really don't know how this will be used in KVM, so I 
 can't really say how well these concerns applies.

   

That's perfectly fine for kvm (and as far as I can see, any other 
application: the user losing their eventfd reference is equivalent to 
the user ignoring any events, which should be survivable by the kernel).



   
 Avi, how about you go the other way around? I expose you something like:

 long eventfd_create(unsigned int count, void (*release)(void *),
 void *priv);

 This returns you an eventfd (or error), and lets you install a callback to
 be used when the file_operations-release is called (basically, userspace
 closed the last file instance).
 Can KVM pass back an fd to userspace instead of the other way around?
   
   
 That was my original thought, but when I saw your proposal I preferred that 
 as
 more flexible.

 If we go this way, I prefer to have a struct file * and do away with the
 callback, but that brings us back to the file lifetime concerns.
 

 Doing the above requires some care too. Access to your copy of the file* 
 must be protected by a lock (you can sleep from inside the callback, so a 
 mutex is fine too). I'll sketch you some code:


 /* Defined in linux/eventfd.h */
 struct eventfd_relcb {
   struct list_head lnk;
   void (*proc)(struct eventfd_relcb *);
 };

 /* Your context data */
 struct your_ctx {
   whatever_lock your_lock;
   ...
   struct eventfd_relcb rcb;
   struct file *evfile;
 };

 /* Your eventfd release callback */
 void rcb_callback(struct eventfd_relcb *rcb) {
   struct your_ctx *c = container_of(rcb, struct your_ctx, rcb);

   whatever_lock_lock(c-your_lock);
   ...
   c-evfile = NULL;
   whatever_lock_unlock(c-your_lock);
 }

 /* Your notify userspace function */
 void notify_userspace(struct your_ctx *c) {

   whatever_lock_lock(c-your_lock);
   if (c-evfile != NULL)
   eventfd_signal(c-evfile, 1);
   whatever_lock_unlock(c-your_lock);
 }

 /* Your eventfd create/setup function (modulo error checks) */
 void setup_eventfd(struct your_ctx *c) {
   int fd;

   c-rcb.proc = rcb_callback;
   fd = eventfd_create(0, c-rcb);
   c-evfile = eventfd_fget(fd);
   /* Then return fd to userspace in some way */
   ...
 }



 If, by any chance, you need to detach yourself from the eventfd:

 void detach_eventfd(struct your_ctx *c) {

   whatever_lock_lock(c-your_lock);
   if (c-evfile != NULL) {
   eventfd_del_release_cb(c-evfile, c-rcb);
   /* And, optionally ... */
   eventfd_set_shutdown(c-evfile);
   }
   whatever_lock_unlock(c-your_lock);
 }


 The eventfd_set_shutdown() function will make a POLLIN signaled to 
 userspace, and a read(2) on the eventfd will return 0 (like peer 
 disconnected sockets). Then userspace will know that no more eventfs will 
 show up in there.
 Tentative patch below (builds, not tested yet).


   

Looks like a lot of code for what was supposed to be a code reduction... 
I think I'll look at Gregory's notification-free suggestion instead.


-- 
error compiling committee.c: too many arguments to function


-
This SF.net email is sponsored by DB2 Express
Download DB2 Express C - the FREE version of DB2 express and take
control of your XML. No limits. Just data. Click to get it now.
http://sourceforge.net/powerbar/db2/
___
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel


Re: [kvm-devel] [PATCH 5/9] KVM: Adds ability to signal userspace using a file-descriptor

2007-05-17 Thread Avi Kivity
Davide Libenzi wrote:
 On Wed, 16 May 2007, Avi Kivity wrote:

   
 IMO doing eventfd_fget() asap is best.  I much prefer refcounted pointers to
 handles in the kernel: it's easier to see what things point to, and there is
 to context needed for dereferencing.
 

 There are concerns (from Al and Christoph) about file lifetime tracking of 
 an fd passed down to the kernel.
   

What concerns are these?

I have my own concerns about bare fds, which can change their backing 
file, and which are dependent on the context.  The kernel appears to be 
moving away from handles to pointers, as seen by the pid - 
task_struck/struct pid conversion.

 Avi, how about you go the other way around? I expose you something like:

 long eventfd_create(unsigned int count, void (*release)(void *),
 void *priv);

 This returns you an eventfd (or error), and lets you install a callback to 
 be used when the file_operations-release is called (basically, userspace 
 closed the last file instance).
 Can KVM pass back an fd to userspace instead of the other way around?
   

That was my original thought, but when I saw your proposal I preferred 
that as more flexible.

If we go this way, I prefer to have a struct file * and do away with the 
callback, but that brings us back to the file lifetime concerns.

-- 
error compiling committee.c: too many arguments to function


-
This SF.net email is sponsored by DB2 Express
Download DB2 Express C - the FREE version of DB2 express and take
control of your XML. No limits. Just data. Click to get it now.
http://sourceforge.net/powerbar/db2/
___
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel


Re: [kvm-devel] [PATCH 5/9] KVM: Adds ability to signal userspace using a file-descriptor

2007-05-17 Thread Davide Libenzi
On Thu, 17 May 2007, Avi Kivity wrote:

 Davide Libenzi wrote:
  On Wed, 16 May 2007, Avi Kivity wrote:
  

   IMO doing eventfd_fget() asap is best.  I much prefer refcounted pointers
   to
   handles in the kernel: it's easier to see what things point to, and there
   is
   to context needed for dereferencing.
   
  
  There are concerns (from Al and Christoph) about file lifetime tracking of
  an fd passed down to the kernel.

 
 What concerns are these?
 
 I have my own concerns about bare fds, which can change their backing file,
 and which are dependent on the context.  The kernel appears to be moving away
 from handles to pointers, as seen by the pid - task_struck/struct pid
 conversion.

The concern is that by keeping a reference to the file* you may end up 
with a file that has no more userspace links. Although the file_count()==1 
will tell you that you've the only reference left, this relies on you 
doing the check. I really don't know how this will be used in KVM, so I 
can't really say how well these concerns applies.




  Avi, how about you go the other way around? I expose you something like:
  
  long eventfd_create(unsigned int count, void (*release)(void *),
  void *priv);
  
  This returns you an eventfd (or error), and lets you install a callback to
  be used when the file_operations-release is called (basically, userspace
  closed the last file instance).
  Can KVM pass back an fd to userspace instead of the other way around?

 
 That was my original thought, but when I saw your proposal I preferred that as
 more flexible.
 
 If we go this way, I prefer to have a struct file * and do away with the
 callback, but that brings us back to the file lifetime concerns.

Doing the above requires some care too. Access to your copy of the file* 
must be protected by a lock (you can sleep from inside the callback, so a 
mutex is fine too). I'll sketch you some code:


/* Defined in linux/eventfd.h */
struct eventfd_relcb {
struct list_head lnk;
void (*proc)(struct eventfd_relcb *);
};

/* Your context data */
struct your_ctx {
whatever_lock your_lock;
...
struct eventfd_relcb rcb;
struct file *evfile;
};

/* Your eventfd release callback */
void rcb_callback(struct eventfd_relcb *rcb) {
struct your_ctx *c = container_of(rcb, struct your_ctx, rcb);

whatever_lock_lock(c-your_lock);
...
c-evfile = NULL;
whatever_lock_unlock(c-your_lock);
}

/* Your notify userspace function */
void notify_userspace(struct your_ctx *c) {

whatever_lock_lock(c-your_lock);
if (c-evfile != NULL)
eventfd_signal(c-evfile, 1);
whatever_lock_unlock(c-your_lock);
}

/* Your eventfd create/setup function (modulo error checks) */
void setup_eventfd(struct your_ctx *c) {
int fd;

c-rcb.proc = rcb_callback;
fd = eventfd_create(0, c-rcb);
c-evfile = eventfd_fget(fd);
/* Then return fd to userspace in some way */
...
}



If, by any chance, you need to detach yourself from the eventfd:

void detach_eventfd(struct your_ctx *c) {

whatever_lock_lock(c-your_lock);
if (c-evfile != NULL) {
eventfd_del_release_cb(c-evfile, c-rcb);
/* And, optionally ... */
eventfd_set_shutdown(c-evfile);
}
whatever_lock_unlock(c-your_lock);
}


The eventfd_set_shutdown() function will make a POLLIN signaled to 
userspace, and a read(2) on the eventfd will return 0 (like peer 
disconnected sockets). Then userspace will know that no more eventfs will 
show up in there.
Tentative patch below (builds, not tested yet).




- Davide



Index: linux-2.6.22-rc1-git1.mod/fs/eventfd.c
===
--- linux-2.6.22-rc1-git1.mod.orig/fs/eventfd.c 2007-05-16 15:10:26.0 
-0700
+++ linux-2.6.22-rc1-git1.mod/fs/eventfd.c  2007-05-17 10:53:38.0 
-0700
@@ -16,9 +16,15 @@
 #include linux/anon_inodes.h
 #include linux/eventfd.h
 
+#define EVENTFD_FLG_SHUTDOWN   (1  0)
+
+#define EVENTFD_FILE(f)((f)-f_op == eventfd_fops)
+
 struct eventfd_ctx {
spinlock_t lock;
wait_queue_head_t wqh;
+   unsigned long flags;
+
/*
 * Every time that a write(2) is performed on an eventfd, the
 * value of the __u64 being written is added to count and a
@@ -28,37 +34,72 @@
 * issue a wakeup.
 */
__u64 count;
+
+   /*
+* List of callbacks to be invoked when the eventfd file
+* in going away.
+*/
+   struct list_head rcb_list;
 };
 
-/*
- * Adds n to the eventfd counter count. Returns n in case of
- * success, or a value lower then n in case of coutner overflow.
- * This function is supposed to be called by the kernel in paths
- * that do not allow sleeping. In this function we allow the counter
- * to reach the 

Re: [kvm-devel] [PATCH 5/9] KVM: Adds ability to signal userspace using a file-descriptor

2007-05-17 Thread Davide Libenzi
On Thu, 17 May 2007, Davide Libenzi wrote:

 /* Your eventfd create/setup function (modulo error checks) */
 void setup_eventfd(struct your_ctx *c) {
   int fd;
 
   c-rcb.proc = rcb_callback;
   fd = eventfd_create(0, c-rcb);
   c-evfile = eventfd_fget(fd);

+   fput(c-evfile);

   /* Then return fd to userspace in some way */
   ...
 }

You need an fput() there, to bring f_count back to 1, of course ;)


- Davide



-
This SF.net email is sponsored by DB2 Express
Download DB2 Express C - the FREE version of DB2 express and take
control of your XML. No limits. Just data. Click to get it now.
http://sourceforge.net/powerbar/db2/
___
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel


Re: [kvm-devel] [PATCH 5/9] KVM: Adds ability to signal userspace using a file-descriptor

2007-05-16 Thread Avi Kivity
Gregory Haskins wrote:
 On Tue, May 15, 2007 at  3:45 AM, in message [EMAIL PROTECTED],
 
 Avi Kivity [EMAIL PROTECTED] wrote: 
   
 Gregory Haskins wrote:
 
 Signed- off- by: Gregory Haskins [EMAIL PROTECTED]
 ---

  drivers/kvm/kvm.h  |1 +
  drivers/kvm/kvm_main.c |   52 
 ++--
  include/linux/kvm.h|1 +
  3 files changed, 48 insertions(+), 6 deletions(- )

 diff -- git a/drivers/kvm/kvm.h b/drivers/kvm/kvm.h
 index 7b5d5e6..f5731c4 100644
 ---  a/drivers/kvm/kvm.h
 +++ b/drivers/kvm/kvm.h
 @@ - 333,6 +333,7 @@ struct kvm_vcpu_irq {
 int  deferred;
 struct task_struct  *task;
 int  guest_mode;
 +   int  eventfd;
   
   
 Best to convert the fd to a filp when you install it.  This avoids the
 conversion during runtime and allows you to do error checking earlier.
 


 That was my initial impression also, but then I realized there was a problem 
 with that:  Eventfd doesnt appear to have any way to notify other entities 
 when the fd is closed.  Therefore the filp could be left dangling in this 
 case.  By using the fd instead, I can validate the pointer each time I need 
 it.  Perhaps Davide will have a suggestion here.
   

eventfd_fget() will bump the reference count.

-- 
error compiling committee.c: too many arguments to function


-
This SF.net email is sponsored by DB2 Express
Download DB2 Express C - the FREE version of DB2 express and take
control of your XML. No limits. Just data. Click to get it now.
http://sourceforge.net/powerbar/db2/
___
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel


Re: [kvm-devel] [PATCH 5/9] KVM: Adds ability to signal userspace using a file-descriptor

2007-05-16 Thread Avi Kivity
Davide Libenzi wrote:
 On Tue, 15 May 2007, Gregory Haskins wrote:

   
 On Tue, May 15, 2007 at  3:45 AM, in message [EMAIL PROTECTED],
   
 Avi Kivity [EMAIL PROTECTED] wrote: 
 
 Gregory Haskins wrote:
   
 Signed- off- by: Gregory Haskins [EMAIL PROTECTED]
 ---

  drivers/kvm/kvm.h  |1 +
  drivers/kvm/kvm_main.c |   52 
 ++--
  include/linux/kvm.h|1 +
  3 files changed, 48 insertions(+), 6 deletions(- )

 diff -- git a/drivers/kvm/kvm.h b/drivers/kvm/kvm.h
 index 7b5d5e6..f5731c4 100644
 ---  a/drivers/kvm/kvm.h
 +++ b/drivers/kvm/kvm.h
 @@ - 333,6 +333,7 @@ struct kvm_vcpu_irq {
int  deferred;
struct task_struct  *task;
int  guest_mode;
 +  int  eventfd;
   
 
 Best to convert the fd to a filp when you install it.  This avoids the
 conversion during runtime and allows you to do error checking earlier.
   
 That was my initial impression also, but then I realized there was a 
 problem with that:  Eventfd doesnt appear to have any way to notify 
 other entities when the fd is closed.  Therefore the filp could be left 
 dangling in this case.  By using the fd instead, I can validate the 
 pointer each time I need it.  Perhaps Davide will have a suggestion 
 here.
 

 I don't know how critical is the path where you will be doing check. The 
 eventfd_fget() is pretty fast, so if you're not looking at a performance 
 critical path, I'd suggest that. Otherwise you can do an early 
 eventfd_get, and keep the file*. If you have no the ways to know if the 
 userspace disconnected, an atomic_read(file-f_count)==1 will tell you 
 that you're the only owner of the file* (that is, userspace closed the 
 eventfd descriptor). I'd give preference to the former option though.
   

We aren't really interested whether userspace is looking or not.  After 
all, it installed the eventfd, so it must be interested in the events.

 Avi, as you may have read from lkml, Andrew prefers the eventfd symbols 
 export to come through the kvm tree, since they do not want to export 
 symbols that so far has no module users.

   

Yes, I'll pick up the patch from lkml.


-- 
error compiling committee.c: too many arguments to function


-
This SF.net email is sponsored by DB2 Express
Download DB2 Express C - the FREE version of DB2 express and take
control of your XML. No limits. Just data. Click to get it now.
http://sourceforge.net/powerbar/db2/
___
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel


Re: [kvm-devel] [PATCH 5/9] KVM: Adds ability to signal userspace using a file-descriptor

2007-05-16 Thread Avi Kivity
Anthony Liguori wrote:
 Gregory Haskins wrote:
   
 While KVM will inevitably start requiring newer kernel versions, do we 
 really need to do it right now?  Perhaps we could add dummy eventfd_* 
 functions to the module compat header?  Then at least older kernels will 
 continue to work with in- kernel APIC disabled.

 
   
 The plan as Avi and I agreed to (IIUC) is to support eventfd via the 
 extern-compat header.  The same could be said for HRT.  I don't thing either 
 one will be particularly pleasant to support in this fashion, but this is 
 how he wants to do it.  (Recall that I had abstracted the HRT in earlier 
 versions which he requested to be removed, so I didnt bother with the 
 eventfd interface).  I agree that it will keep the kvm core code cleaner 
 instead of having all these custom abstractions, so I see his point there.  
 I'm just not looking forward to the compat work ;)

 That being said: Perhaps you just came up with a good compromise.  The code 
 is already structured to support both the old way and the new way.  We could 
 have the in-kernel modes disabled at compile time on older kernels.  Thats 
 probably a much better solution then trying to get HRT and eventfd emulated 
 on, say, 2.6.9 ;)

 The implication here is that I don't plan on supporting new features via the 
 old-way.  For instance, SMP features would only work with in-kernel 
 emulation.  If we go down this route, it automatically means that older 
 kernels will not have any other related advanced features, not just the 
 performance/features currently offered.  Perhaps this is acceptable also?  
 I'm not sure.
   
 

 I don't know.  I'm less concerned about long term implications than I am 
 with short term ones.  My primary concern was having new versions of KVM 
 stop working on older kernels.  If SMP requires in-kernel APIC, then 
 provided that UP kernels still work, if someone is sufficiently 
 motivated to get SMP working on older kernels, they can implement the 
 eventfd emulation.

   

I'm not worried about eventfd -- we can just ship a 
conditionally-compiled copy of the code with the external module (we 
also need a non-syscall interface, as modules can't provide syscalls, 
but that's easily workaroundable).  hrtimers is more difficult, but 
there's really no choice -- we can't #ifdef the core code for that.

I really would like to see smp working without kernel apic, as pure qemu 
supports it.  It would of course require newer modules.

-- 
error compiling committee.c: too many arguments to function


-
This SF.net email is sponsored by DB2 Express
Download DB2 Express C - the FREE version of DB2 express and take
control of your XML. No limits. Just data. Click to get it now.
http://sourceforge.net/powerbar/db2/
___
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel


Re: [kvm-devel] [PATCH 5/9] KVM: Adds ability to signal userspace using a file-descriptor

2007-05-16 Thread Avi Kivity
Davide Libenzi wrote:
 On Tue, 15 May 2007, Christoph Hellwig wrote:

   
 On Tue, May 15, 2007 at 12:18:17PM -0400, Gregory Haskins wrote:
 
 On Tue, May 15, 2007 at 11:40 AM, in message
 
 [EMAIL PROTECTED], Davide Libenzi
 [EMAIL PROTECTED] wrote: 
   
 I don't know how critical is the path where you will be doing check. The 
 eventfd_fget() is pretty fast, so if you're not looking at a performance 
 critical path, I'd suggest that. Otherwise you can do an early 
 eventfd_get, and keep the file*. If you have no the ways to know if the 
 userspace disconnected, an atomic_read(file- f_count)==1 will tell you 
 that you're the only owner of the file* (that is, userspace closed the 
 eventfd descriptor). I'd give preference to the former option though.
 
 Thanks for the insight, Davide.  It sounds like we can probably stay with 
 the way I have it for now, and keep the file-f_count idea in our back 
 pocket should a performance problem arise.
   
 accessing file-f_count is not allowed for drivers.  It took us quite a
 bit of effort to clean up all users a while ago, and it turned out most
 of them were rather buggy.
 

 Right, you could use the file_count() macro :)
 Seriuosly, if this (doing an eventfd_fget+fput) becomes a problem, I can
 have the check done in eventfd_signal() and return a proper error code.

   

IMO doing eventfd_fget() asap is best.  I much prefer refcounted 
pointers to handles in the kernel: it's easier to see what things point 
to, and there is to context needed for dereferencing.

-- 
error compiling committee.c: too many arguments to function


-
This SF.net email is sponsored by DB2 Express
Download DB2 Express C - the FREE version of DB2 express and take
control of your XML. No limits. Just data. Click to get it now.
http://sourceforge.net/powerbar/db2/
___
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel


Re: [kvm-devel] [PATCH 5/9] KVM: Adds ability to signal userspace using a file-descriptor

2007-05-16 Thread Davide Libenzi
On Wed, 16 May 2007, Avi Kivity wrote:

 IMO doing eventfd_fget() asap is best.  I much prefer refcounted pointers to
 handles in the kernel: it's easier to see what things point to, and there is
 to context needed for dereferencing.

There are concerns (from Al and Christoph) about file lifetime tracking of 
an fd passed down to the kernel.
Avi, how about you go the other way around? I expose you something like:

long eventfd_create(unsigned int count, void (*release)(void *),
void *priv);

This returns you an eventfd (or error), and lets you install a callback to 
be used when the file_operations-release is called (basically, userspace 
closed the last file instance).
Can KVM pass back an fd to userspace instead of the other way around?



- Davide



-
This SF.net email is sponsored by DB2 Express
Download DB2 Express C - the FREE version of DB2 express and take
control of your XML. No limits. Just data. Click to get it now.
http://sourceforge.net/powerbar/db2/
___
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel


[kvm-devel] [PATCH 5/9] KVM: Adds ability to signal userspace using a file-descriptor

2007-05-15 Thread Gregory Haskins
Signed-off-by: Gregory Haskins [EMAIL PROTECTED]
---

 drivers/kvm/kvm.h  |1 +
 drivers/kvm/kvm_main.c |   55 +++-
 include/linux/kvm.h|1 +
 3 files changed, 51 insertions(+), 6 deletions(-)

diff --git a/drivers/kvm/kvm.h b/drivers/kvm/kvm.h
index d5783dc..e1ba066 100644
--- a/drivers/kvm/kvm.h
+++ b/drivers/kvm/kvm.h
@@ -332,6 +332,7 @@ struct kvm_vcpu_irq {
int  pending;
int  deferred;
int  guest_cpu;
+   int  eventfd;
 };
 
 struct kvm_vcpu {
diff --git a/drivers/kvm/kvm_main.c b/drivers/kvm/kvm_main.c
index 0c6a62a..7109b66 100644
--- a/drivers/kvm/kvm_main.c
+++ b/drivers/kvm/kvm_main.c
@@ -40,6 +40,7 @@
 #include linux/file.h
 #include linux/fs.h
 #include linux/mount.h
+#include linux/eventfd.h
 
 #include x86_emulate.h
 #include segment_descriptor.h
@@ -326,6 +327,7 @@ static struct kvm *kvm_create_vm(void)
memset(vcpu-irq, 0, sizeof(vcpu-irq));
spin_lock_init(vcpu-irq.lock);
vcpu-irq.deferred = -1;
+   vcpu-irq.eventfd   = -1;
 
vcpu-cpu = -1;
vcpu-kvm = kvm;
@@ -2358,6 +2360,7 @@ static void kvm_vcpu_intr(struct kvm_irqsink *this,
 {
struct kvm_vcpu *vcpu = (struct kvm_vcpu*)this-private;
int direct_ipi = -1;
+   int eventfd = -1;
 
spin_lock_irq(vcpu-irq.lock);
 
@@ -2379,7 +2382,14 @@ static void kvm_vcpu_intr(struct kvm_irqsink *this,
 */
direct_ipi = vcpu-irq.guest_cpu;
BUG_ON(direct_ipi == smp_processor_id());
-   }
+   } else
+   /*
+* otherwise, we must assume that we could be
+* blocked anywhere, including userspace. Send
+* a signal to give everyone a chance to get
+* notification
+*/
+   eventfd = vcpu-irq.eventfd;
}
}
 
@@ -2401,6 +2411,12 @@ static void kvm_vcpu_intr(struct kvm_irqsink *this,
smp_call_function_single(direct_ipi,
 kvm_vcpu_guest_intr,
 vcpu, 0, 0);
+
+   if (eventfd != -1) {
+   struct file *filp = eventfd_fget(eventfd);
+   if (!IS_ERR(filp))
+   eventfd_signal(filp, 1);
+   }
 }
 
 static void kvm_vcpu_irqsink_init(struct kvm_vcpu *vcpu)
@@ -2584,6 +2600,17 @@ static int kvm_vcpu_ioctl_set_fpu(struct kvm_vcpu *vcpu, 
struct kvm_fpu *fpu)
return 0;
 }
 
+static int kvm_vcpu_ioctl_set_eventfd(struct kvm_vcpu *vcpu, int fd)
+{
+   if (IS_ERR(eventfd_fget(fd)))
+   return -EINVAL;
+
+   vcpu-irq.eventfd = fd;
+   smp_wmb();
+
+   return 0;
+}
+
 static long kvm_vcpu_ioctl(struct file *filp,
   unsigned int ioctl, unsigned long arg)
 {
@@ -2753,6 +2780,15 @@ static long kvm_vcpu_ioctl(struct file *filp,
r = 0;
break;
}
+   case KVM_SET_EVENTFD: {
+   int eventfd = (long)argp;
+
+   r = kvm_vcpu_ioctl_set_eventfd(vcpu, eventfd);
+   if (r)
+   goto out;
+   r = 0;
+   break;
+   }
default:
;
}
@@ -2937,12 +2973,19 @@ static long kvm_dev_ioctl(struct file *filp,
r = 0;
break;
}
-   case KVM_CHECK_EXTENSION:
-   /*
-* No extensions defined at present.
-*/
-   r = 0;
+   case KVM_CHECK_EXTENSION: {
+   int ext = (long)argp;
+
+   switch (ext) {
+   case KVM_SET_EVENTFD:
+   r = 1;
+   break;
+   default:
+   r = 0;
+   break;
+   }
break;
+   }
case KVM_GET_VCPU_MMAP_SIZE:
r = -EINVAL;
if (arg)
diff --git a/include/linux/kvm.h b/include/linux/kvm.h
index e6edca8..f13ec8c 100644
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
@@ -300,5 +300,6 @@ struct kvm_signal_mask {
 #define KVM_SET_SIGNAL_MASK   _IOW(KVMIO,  0x8b, struct kvm_signal_mask)
 #define KVM_GET_FPU   _IOR(KVMIO,  0x8c, struct kvm_fpu)
 #define KVM_SET_FPU   _IOW(KVMIO,  0x8d, struct kvm_fpu)
+#define KVM_SET_EVENTFD   _IO(KVMIO,   0x8e)
 
 #endif


-
This SF.net email is sponsored by DB2 Express
Download DB2 Express C - the FREE version of DB2 express and take
control of your XML. No limits. Just data. 

Re: [kvm-devel] [PATCH 5/9] KVM: Adds ability to signal userspace using a file-descriptor

2007-05-15 Thread Davide Libenzi
On Tue, 15 May 2007, Gregory Haskins wrote:

  On Tue, May 15, 2007 at  3:45 AM, in message [EMAIL PROTECTED],
 Avi Kivity [EMAIL PROTECTED] wrote: 
  Gregory Haskins wrote:
  Signed- off- by: Gregory Haskins [EMAIL PROTECTED]
  ---
 
   drivers/kvm/kvm.h  |1 +
   drivers/kvm/kvm_main.c |   52 
  ++--
   include/linux/kvm.h|1 +
   3 files changed, 48 insertions(+), 6 deletions(- )
 
  diff -- git a/drivers/kvm/kvm.h b/drivers/kvm/kvm.h
  index 7b5d5e6..f5731c4 100644
  ---  a/drivers/kvm/kvm.h
  +++ b/drivers/kvm/kvm.h
  @@ - 333,6 +333,7 @@ struct kvm_vcpu_irq {
 int  deferred;
 struct task_struct  *task;
 int  guest_mode;
  +  int  eventfd;

  
  Best to convert the fd to a filp when you install it.  This avoids the
  conversion during runtime and allows you to do error checking earlier.
 
 That was my initial impression also, but then I realized there was a 
 problem with that:  Eventfd doesnt appear to have any way to notify 
 other entities when the fd is closed.  Therefore the filp could be left 
 dangling in this case.  By using the fd instead, I can validate the 
 pointer each time I need it.  Perhaps Davide will have a suggestion 
 here.

I don't know how critical is the path where you will be doing check. The 
eventfd_fget() is pretty fast, so if you're not looking at a performance 
critical path, I'd suggest that. Otherwise you can do an early 
eventfd_get, and keep the file*. If you have no the ways to know if the 
userspace disconnected, an atomic_read(file-f_count)==1 will tell you 
that you're the only owner of the file* (that is, userspace closed the 
eventfd descriptor). I'd give preference to the former option though.
Avi, as you may have read from lkml, Andrew prefers the eventfd symbols 
export to come through the kvm tree, since they do not want to export 
symbols that so far has no module users.


- Davide



-
This SF.net email is sponsored by DB2 Express
Download DB2 Express C - the FREE version of DB2 express and take
control of your XML. No limits. Just data. Click to get it now.
http://sourceforge.net/powerbar/db2/
___
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel


Re: [kvm-devel] [PATCH 5/9] KVM: Adds ability to signal userspace using a file-descriptor

2007-05-15 Thread Gregory Haskins
 On Tue, May 15, 2007 at 11:40 AM, in message
[EMAIL PROTECTED], Davide Libenzi
[EMAIL PROTECTED] wrote: 

 I don't know how critical is the path where you will be doing check. The 
 eventfd_fget() is pretty fast, so if you're not looking at a performance 
 critical path, I'd suggest that. Otherwise you can do an early 
 eventfd_get, and keep the file*. If you have no the ways to know if the 
 userspace disconnected, an atomic_read(file- f_count)==1 will tell you 
 that you're the only owner of the file* (that is, userspace closed the 
 eventfd descriptor). I'd give preference to the former option though.

Thanks for the insight, Davide.  It sounds like we can probably stay with the 
way I have it for now, and keep the file-f_count idea in our back pocket 
should a performance problem arise.

 Avi, as you may have read from lkml, Andrew prefers the eventfd symbols 
 export to come through the kvm tree, since they do not want to export 
 symbols that so far has no module users.

I can create a patch and add it to my series for this work.

Thanks again,
-Greg


-
This SF.net email is sponsored by DB2 Express
Download DB2 Express C - the FREE version of DB2 express and take
control of your XML. No limits. Just data. Click to get it now.
http://sourceforge.net/powerbar/db2/
___
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel


Re: [kvm-devel] [PATCH 5/9] KVM: Adds ability to signal userspace using a file-descriptor

2007-05-15 Thread Christoph Hellwig
On Tue, May 15, 2007 at 12:18:17PM -0400, Gregory Haskins wrote:
  On Tue, May 15, 2007 at 11:40 AM, in message
 [EMAIL PROTECTED], Davide Libenzi
 [EMAIL PROTECTED] wrote: 
 
  I don't know how critical is the path where you will be doing check. The 
  eventfd_fget() is pretty fast, so if you're not looking at a performance 
  critical path, I'd suggest that. Otherwise you can do an early 
  eventfd_get, and keep the file*. If you have no the ways to know if the 
  userspace disconnected, an atomic_read(file- f_count)==1 will tell you 
  that you're the only owner of the file* (that is, userspace closed the 
  eventfd descriptor). I'd give preference to the former option though.
 
 Thanks for the insight, Davide.  It sounds like we can probably stay with the 
 way I have it for now, and keep the file-f_count idea in our back pocket 
 should a performance problem arise.

accessing file-f_count is not allowed for drivers.  It took us quite a
bit of effort to clean up all users a while ago, and it turned out most
of them were rather buggy.


-
This SF.net email is sponsored by DB2 Express
Download DB2 Express C - the FREE version of DB2 express and take
control of your XML. No limits. Just data. Click to get it now.
http://sourceforge.net/powerbar/db2/
___
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel


Re: [kvm-devel] [PATCH 5/9] KVM: Adds ability to signal userspace using a file-descriptor

2007-05-15 Thread Gregory Haskins
 On Tue, May 15, 2007 at 12:22 PM, in message
[EMAIL PROTECTED], Christoph Hellwig [EMAIL PROTECTED]
wrote: 
 On Tue, May 15, 2007 at 12:18:17PM - 0400, Gregory Haskins wrote:
  On Tue, May 15, 2007 at 11:40 AM, in message
 [EMAIL PROTECTED], Davide Libenzi
 [EMAIL PROTECTED] wrote: 
 
  I don't know how critical is the path where you will be doing check. The 
  eventfd_fget() is pretty fast, so if you're not looking at a performance 
  critical path, I'd suggest that. Otherwise you can do an early 
  eventfd_get, and keep the file*. If you have no the ways to know if the 
  userspace disconnected, an atomic_read(file-  f_count)==1 will tell you 
  that you're the only owner of the file* (that is, userspace closed the 
  eventfd descriptor). I'd give preference to the former option though.
 
 Thanks for the insight, Davide.  It sounds like we can probably stay with 
 the way I have it for now, and keep the file- f_count idea in our back 
 pocket 
 should a performance problem arise.
 
 accessing file- f_count is not allowed for drivers.  It took us quite a
 bit of effort to clean up all users a while ago, and it turned out most
 of them were rather buggy.

Ack.

-
This SF.net email is sponsored by DB2 Express
Download DB2 Express C - the FREE version of DB2 express and take
control of your XML. No limits. Just data. Click to get it now.
http://sourceforge.net/powerbar/db2/
___
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel


Re: [kvm-devel] [PATCH 5/9] KVM: Adds ability to signal userspace using a file-descriptor

2007-05-15 Thread Anthony Liguori
This patch series depends on eventfd which means that KVM now requires 
2.6.22-rc1.

While KVM will inevitably start requiring newer kernel versions, do we 
really need to do it right now?  Perhaps we could add dummy eventfd_* 
functions to the module compat header?  Then at least older kernels will 
continue to work with in-kernel APIC disabled.

Regards,

Anthony Liguori

Gregory Haskins wrote:
 Signed-off-by: Gregory Haskins [EMAIL PROTECTED]
 ---

  drivers/kvm/kvm.h  |1 +
  drivers/kvm/kvm_main.c |   55 
 +++-
  include/linux/kvm.h|1 +
  3 files changed, 51 insertions(+), 6 deletions(-)

 diff --git a/drivers/kvm/kvm.h b/drivers/kvm/kvm.h
 index d5783dc..e1ba066 100644
 --- a/drivers/kvm/kvm.h
 +++ b/drivers/kvm/kvm.h
 @@ -332,6 +332,7 @@ struct kvm_vcpu_irq {
   int  pending;
   int  deferred;
   int  guest_cpu;
 + int  eventfd;
  };
  
  struct kvm_vcpu {
 diff --git a/drivers/kvm/kvm_main.c b/drivers/kvm/kvm_main.c
 index 0c6a62a..7109b66 100644
 --- a/drivers/kvm/kvm_main.c
 +++ b/drivers/kvm/kvm_main.c
 @@ -40,6 +40,7 @@
  #include linux/file.h
  #include linux/fs.h
  #include linux/mount.h
 +#include linux/eventfd.h
  
  #include x86_emulate.h
  #include segment_descriptor.h
 @@ -326,6 +327,7 @@ static struct kvm *kvm_create_vm(void)
   memset(vcpu-irq, 0, sizeof(vcpu-irq));
   spin_lock_init(vcpu-irq.lock);
   vcpu-irq.deferred = -1;
 + vcpu-irq.eventfd   = -1;
  
   vcpu-cpu = -1;
   vcpu-kvm = kvm;
 @@ -2358,6 +2360,7 @@ static void kvm_vcpu_intr(struct kvm_irqsink *this,
  {
   struct kvm_vcpu *vcpu = (struct kvm_vcpu*)this-private;
   int direct_ipi = -1;
 + int eventfd = -1;
  
   spin_lock_irq(vcpu-irq.lock);
  
 @@ -2379,7 +2382,14 @@ static void kvm_vcpu_intr(struct kvm_irqsink *this,
*/
   direct_ipi = vcpu-irq.guest_cpu;
   BUG_ON(direct_ipi == smp_processor_id());
 - }
 + } else
 + /*
 +  * otherwise, we must assume that we could be
 +  * blocked anywhere, including userspace. Send
 +  * a signal to give everyone a chance to get
 +  * notification
 +  */
 + eventfd = vcpu-irq.eventfd;
   }
   }
  
 @@ -2401,6 +2411,12 @@ static void kvm_vcpu_intr(struct kvm_irqsink *this,
   smp_call_function_single(direct_ipi,
kvm_vcpu_guest_intr,
vcpu, 0, 0);
 +
 + if (eventfd != -1) {
 + struct file *filp = eventfd_fget(eventfd);
 + if (!IS_ERR(filp))
 + eventfd_signal(filp, 1);
 + }
  }
  
  static void kvm_vcpu_irqsink_init(struct kvm_vcpu *vcpu)
 @@ -2584,6 +2600,17 @@ static int kvm_vcpu_ioctl_set_fpu(struct kvm_vcpu 
 *vcpu, struct kvm_fpu *fpu)
   return 0;
  }
  
 +static int kvm_vcpu_ioctl_set_eventfd(struct kvm_vcpu *vcpu, int fd)
 +{
 + if (IS_ERR(eventfd_fget(fd)))
 + return -EINVAL;
 +
 + vcpu-irq.eventfd = fd;
 + smp_wmb();
 +
 + return 0;
 +}
 +
  static long kvm_vcpu_ioctl(struct file *filp,
  unsigned int ioctl, unsigned long arg)
  {
 @@ -2753,6 +2780,15 @@ static long kvm_vcpu_ioctl(struct file *filp,
   r = 0;
   break;
   }
 + case KVM_SET_EVENTFD: {
 + int eventfd = (long)argp;
 +
 + r = kvm_vcpu_ioctl_set_eventfd(vcpu, eventfd);
 + if (r)
 + goto out;
 + r = 0;
 + break;
 + }
   default:
   ;
   }
 @@ -2937,12 +2973,19 @@ static long kvm_dev_ioctl(struct file *filp,
   r = 0;
   break;
   }
 - case KVM_CHECK_EXTENSION:
 - /*
 -  * No extensions defined at present.
 -  */
 - r = 0;
 + case KVM_CHECK_EXTENSION: {
 + int ext = (long)argp;
 +
 + switch (ext) {
 + case KVM_SET_EVENTFD:
 + r = 1;
 + break;
 + default:
 + r = 0;
 + break;
 + }
   break;
 + }
   case KVM_GET_VCPU_MMAP_SIZE:
   r = -EINVAL;
   if (arg)
 diff --git a/include/linux/kvm.h b/include/linux/kvm.h
 index e6edca8..f13ec8c 100644
 --- a/include/linux/kvm.h
 +++ b/include/linux/kvm.h
 @@ -300,5 +300,6 @@ struct kvm_signal_mask {
  #define KVM_SET_SIGNAL_MASK   _IOW(KVMIO,  0x8b, struct kvm_signal_mask)
  #define KVM_GET_FPU   _IOR(KVMIO,  0x8c, 

Re: [kvm-devel] [PATCH 5/9] KVM: Adds ability to signal userspace using a file-descriptor

2007-05-15 Thread Gregory Haskins
 On Tue, May 15, 2007 at 12:39 PM, in message [EMAIL PROTECTED],
Anthony Liguori [EMAIL PROTECTED] wrote: 
 This patch series depends on eventfd which means that KVM now requires 
 2.6.22- rc1.

This is understood.

 
 While KVM will inevitably start requiring newer kernel versions, do we 
 really need to do it right now?  Perhaps we could add dummy eventfd_* 
 functions to the module compat header?  Then at least older kernels will 
 continue to work with in- kernel APIC disabled.


The plan as Avi and I agreed to (IIUC) is to support eventfd via the 
extern-compat header.  The same could be said for HRT.  I don't thing either 
one will be particularly pleasant to support in this fashion, but this is how 
he wants to do it.  (Recall that I had abstracted the HRT in earlier versions 
which he requested to be removed, so I didnt bother with the eventfd 
interface).  I agree that it will keep the kvm core code cleaner instead of 
having all these custom abstractions, so I see his point there.  I'm just not 
looking forward to the compat work ;)

That being said: Perhaps you just came up with a good compromise.  The code is 
already structured to support both the old way and the new way.  We could have 
the in-kernel modes disabled at compile time on older kernels.  Thats probably 
a much better solution then trying to get HRT and eventfd emulated on, say, 
2.6.9 ;)

The implication here is that I don't plan on supporting new features via the 
old-way.  For instance, SMP features would only work with in-kernel emulation.  
If we go down this route, it automatically means that older kernels will not 
have any other related advanced features, not just the performance/features 
currently offered.  Perhaps this is acceptable also?  I'm not sure.

Regards,
-Greg 



-
This SF.net email is sponsored by DB2 Express
Download DB2 Express C - the FREE version of DB2 express and take
control of your XML. No limits. Just data. Click to get it now.
http://sourceforge.net/powerbar/db2/
___
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel


Re: [kvm-devel] [PATCH 5/9] KVM: Adds ability to signal userspace using a file-descriptor

2007-05-15 Thread Davide Libenzi
On Tue, 15 May 2007, Christoph Hellwig wrote:

 On Tue, May 15, 2007 at 12:18:17PM -0400, Gregory Haskins wrote:
   On Tue, May 15, 2007 at 11:40 AM, in message
  [EMAIL PROTECTED], Davide Libenzi
  [EMAIL PROTECTED] wrote: 
  
   I don't know how critical is the path where you will be doing check. The 
   eventfd_fget() is pretty fast, so if you're not looking at a performance 
   critical path, I'd suggest that. Otherwise you can do an early 
   eventfd_get, and keep the file*. If you have no the ways to know if the 
   userspace disconnected, an atomic_read(file- f_count)==1 will tell you 
   that you're the only owner of the file* (that is, userspace closed the 
   eventfd descriptor). I'd give preference to the former option though.
  
  Thanks for the insight, Davide.  It sounds like we can probably stay with 
  the way I have it for now, and keep the file-f_count idea in our back 
  pocket should a performance problem arise.
 
 accessing file-f_count is not allowed for drivers.  It took us quite a
 bit of effort to clean up all users a while ago, and it turned out most
 of them were rather buggy.

Right, you could use the file_count() macro :)
Seriuosly, if this (doing an eventfd_fget+fput) becomes a problem, I can
have the check done in eventfd_signal() and return a proper error code.



- Davide



-
This SF.net email is sponsored by DB2 Express
Download DB2 Express C - the FREE version of DB2 express and take
control of your XML. No limits. Just data. Click to get it now.
http://sourceforge.net/powerbar/db2/
___
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel


Re: [kvm-devel] [PATCH 5/9] KVM: Adds ability to signal userspace using a file-descriptor

2007-05-15 Thread Anthony Liguori
Gregory Haskins wrote:
 While KVM will inevitably start requiring newer kernel versions, do we 
 really need to do it right now?  Perhaps we could add dummy eventfd_* 
 functions to the module compat header?  Then at least older kernels will 
 continue to work with in- kernel APIC disabled.

 

 The plan as Avi and I agreed to (IIUC) is to support eventfd via the 
 extern-compat header.  The same could be said for HRT.  I don't thing either 
 one will be particularly pleasant to support in this fashion, but this is how 
 he wants to do it.  (Recall that I had abstracted the HRT in earlier versions 
 which he requested to be removed, so I didnt bother with the eventfd 
 interface).  I agree that it will keep the kvm core code cleaner instead of 
 having all these custom abstractions, so I see his point there.  I'm just not 
 looking forward to the compat work ;)

 That being said: Perhaps you just came up with a good compromise.  The code 
 is already structured to support both the old way and the new way.  We could 
 have the in-kernel modes disabled at compile time on older kernels.  Thats 
 probably a much better solution then trying to get HRT and eventfd emulated 
 on, say, 2.6.9 ;)

 The implication here is that I don't plan on supporting new features via the 
 old-way.  For instance, SMP features would only work with in-kernel 
 emulation.  If we go down this route, it automatically means that older 
 kernels will not have any other related advanced features, not just the 
 performance/features currently offered.  Perhaps this is acceptable also?  
 I'm not sure.
   

I don't know.  I'm less concerned about long term implications than I am 
with short term ones.  My primary concern was having new versions of KVM 
stop working on older kernels.  If SMP requires in-kernel APIC, then 
provided that UP kernels still work, if someone is sufficiently 
motivated to get SMP working on older kernels, they can implement the 
eventfd emulation.

Regards,

Anthony Liguori

 Regards,
 -Greg 



 -
 This SF.net email is sponsored by DB2 Express
 Download DB2 Express C - the FREE version of DB2 express and take
 control of your XML. No limits. Just data. Click to get it now.
 http://sourceforge.net/powerbar/db2/
 ___
 kvm-devel mailing list
 kvm-devel@lists.sourceforge.net
 https://lists.sourceforge.net/lists/listinfo/kvm-devel

   


-
This SF.net email is sponsored by DB2 Express
Download DB2 Express C - the FREE version of DB2 express and take
control of your XML. No limits. Just data. Click to get it now.
http://sourceforge.net/powerbar/db2/
___
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel


[kvm-devel] [PATCH 5/9] KVM: Adds ability to signal userspace using a file-descriptor

2007-05-14 Thread Gregory Haskins
Signed-off-by: Gregory Haskins [EMAIL PROTECTED]
---

 drivers/kvm/kvm.h  |1 +
 drivers/kvm/kvm_main.c |   52 ++--
 include/linux/kvm.h|1 +
 3 files changed, 48 insertions(+), 6 deletions(-)

diff --git a/drivers/kvm/kvm.h b/drivers/kvm/kvm.h
index 7b5d5e6..f5731c4 100644
--- a/drivers/kvm/kvm.h
+++ b/drivers/kvm/kvm.h
@@ -333,6 +333,7 @@ struct kvm_vcpu_irq {
int  deferred;
struct task_struct  *task;
int  guest_mode;
+   int  eventfd;
 };
 
 struct kvm_vcpu {
diff --git a/drivers/kvm/kvm_main.c b/drivers/kvm/kvm_main.c
index cb73763..86e4262 100644
--- a/drivers/kvm/kvm_main.c
+++ b/drivers/kvm/kvm_main.c
@@ -40,6 +40,7 @@
 #include linux/file.h
 #include linux/fs.h
 #include linux/mount.h
+#include linux/eventfd.h
 
 #include x86_emulate.h
 #include segment_descriptor.h
@@ -326,6 +327,7 @@ static struct kvm *kvm_create_vm(void)
memset(vcpu-irq, 0, sizeof(vcpu-irq));
spin_lock_init(vcpu-irq.lock);
vcpu-irq.deferred = -1;
+   vcpu-irq.eventfd   = -1;
 
vcpu-cpu = -1;
vcpu-kvm = kvm;
@@ -2358,6 +2360,7 @@ static void kvm_vcpu_intr(struct kvm_irqsink *this,
 {
struct kvm_vcpu *vcpu = (struct kvm_vcpu*)this-private;
int direct_ipi = -1;
+   int eventfd = -1;
 
spin_lock_irq(vcpu-irq.lock);
 
@@ -2379,7 +2382,14 @@ static void kvm_vcpu_intr(struct kvm_irqsink *this,
 */
direct_ipi = task_cpu(vcpu-irq.task);
BUG_ON(direct_ipi == smp_processor_id());
-   }
+   } else
+   /*
+* otherwise, we must assume that we could be
+* blocked anywhere, including userspace. Send
+* a signal to give everyone a chance to get
+* notification
+*/
+   eventfd = vcpu-irq.eventfd;
}
}
 
@@ -2401,6 +2411,12 @@ static void kvm_vcpu_intr(struct kvm_irqsink *this,
smp_call_function_single(direct_ipi,
 kvm_vcpu_guest_intr,
 vcpu, 0, 0);
+
+   if (eventfd != -1) {
+   struct file *filp = eventfd_fget(eventfd);
+   if (!IS_ERR(filp))
+   eventfd_signal(filp, 1);
+   }
 }
 
 static void kvm_vcpu_irqsink_init(struct kvm_vcpu *vcpu)
@@ -2584,6 +2600,14 @@ static int kvm_vcpu_ioctl_set_fpu(struct kvm_vcpu *vcpu, 
struct kvm_fpu *fpu)
return 0;
 }
 
+static int kvm_vcpu_ioctl_set_eventfd(struct kvm_vcpu *vcpu, int fd)
+{
+   vcpu-irq.eventfd = fd;
+   smp_wmb();
+
+   return 0;
+}
+
 static long kvm_vcpu_ioctl(struct file *filp,
   unsigned int ioctl, unsigned long arg)
 {
@@ -2753,6 +2777,15 @@ static long kvm_vcpu_ioctl(struct file *filp,
r = 0;
break;
}
+   case KVM_SET_EVENTFD: {
+   int eventfd = (long)argp;
+
+   r = kvm_vcpu_ioctl_set_eventfd(vcpu, eventfd);
+   if (r)
+   goto out;
+   r = 0;
+   break;
+   }
default:
;
}
@@ -2937,12 +2970,19 @@ static long kvm_dev_ioctl(struct file *filp,
r = 0;
break;
}
-   case KVM_CHECK_EXTENSION:
-   /*
-* No extensions defined at present.
-*/
-   r = 0;
+   case KVM_CHECK_EXTENSION: {
+   int ext = (long)argp;
+
+   switch (ext) {
+   case KVM_SET_EVENTFD:
+   r = 1;
+   break;
+   default:
+   r = 0;
+   break;
+   }
break;
+   }
case KVM_GET_VCPU_MMAP_SIZE:
r = -EINVAL;
if (arg)
diff --git a/include/linux/kvm.h b/include/linux/kvm.h
index e6edca8..f13ec8c 100644
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
@@ -300,5 +300,6 @@ struct kvm_signal_mask {
 #define KVM_SET_SIGNAL_MASK   _IOW(KVMIO,  0x8b, struct kvm_signal_mask)
 #define KVM_GET_FPU   _IOR(KVMIO,  0x8c, struct kvm_fpu)
 #define KVM_SET_FPU   _IOW(KVMIO,  0x8d, struct kvm_fpu)
+#define KVM_SET_EVENTFD   _IO(KVMIO,   0x8e)
 
 #endif


-
This SF.net email is sponsored by DB2 Express
Download DB2 Express C - the FREE version of DB2 express and take
control of your XML. No limits. Just data. Click to get it now.
http://sourceforge.net/powerbar/db2/

[kvm-devel] [PATCH 5/9] KVM: Adds ability to signal userspace using a file-descriptor

2007-05-09 Thread Gregory Haskins
Signed-off-by: Gregory Haskins [EMAIL PROTECTED]
---

 drivers/kvm/kvm.h  |2 +
 drivers/kvm/kvm_main.c |   82 
 2 files changed, 84 insertions(+), 0 deletions(-)

diff --git a/drivers/kvm/kvm.h b/drivers/kvm/kvm.h
index 0f6cc32..b5bfc91 100644
--- a/drivers/kvm/kvm.h
+++ b/drivers/kvm/kvm.h
@@ -331,6 +331,8 @@ struct kvm_vcpu_irq {
int  deferred;
struct task_struct  *task;
int  guest_mode;
+   wait_queue_head_twq;
+   int  usignal;
 };
 
 struct kvm_vcpu {
diff --git a/drivers/kvm/kvm_main.c b/drivers/kvm/kvm_main.c
index a160638..6b40c18 100644
--- a/drivers/kvm/kvm_main.c
+++ b/drivers/kvm/kvm_main.c
@@ -40,6 +40,7 @@
 #include linux/file.h
 #include linux/fs.h
 #include linux/mount.h
+#include linux/poll.h
 
 #include x86_emulate.h
 #include segment_descriptor.h
@@ -304,6 +305,7 @@ static struct kvm *kvm_create_vm(void)
memset(vcpu-irq, 0, sizeof(vcpu-irq));
spin_lock_init(vcpu-irq.lock);
vcpu-irq.deferred = -1;
+   init_waitqueue_head(vcpu-irq.wq);
 
vcpu-cpu = -1;
vcpu-kvm = kvm;
@@ -2265,11 +2267,78 @@ static int kvm_vcpu_release(struct inode *inode, struct 
file *filp)
return 0;
 }
 
+static unsigned int kvm_vcpu_poll(struct file *filp, poll_table *wait)
+{
+   struct kvm_vcpu *vcpu = filp-private_data;
+   unsigned int events = 0;
+   unsigned long flags;
+
+   poll_wait(filp, vcpu-irq.wq, wait);
+
+   spin_lock_irqsave(vcpu-irq.lock, flags);
+   if (vcpu-irq.usignal)
+   events |= POLLIN;
+   spin_unlock_irqrestore(vcpu-irq.lock, flags);
+
+   return events;
+}
+
+static ssize_t kvm_vcpu_read(struct file *filp, char __user *buf, size_t count,
+loff_t *ppos)
+{
+   struct kvm_vcpu *vcpu = filp-private_data;
+   ssize_t res = -EAGAIN;
+   DECLARE_WAITQUEUE(wait, current);
+   unsigned long flags;
+   int val;
+
+   if (count  sizeof(vcpu-irq.usignal))
+   return -EINVAL;
+
+   spin_lock_irqsave(vcpu-irq.lock, flags);
+
+   val = vcpu-irq.usignal;
+
+   if (val  0)
+   res = sizeof(val);
+   else if (!(filp-f_flags  O_NONBLOCK)) {
+   __add_wait_queue(vcpu-irq.wq, wait);
+   for (res = 0;;) {
+   set_current_state(TASK_INTERRUPTIBLE);
+   if (val  0) {
+   res = sizeof(val);
+   break;
+   }
+   if (signal_pending(current)) {
+   res = -ERESTARTSYS;
+   break;
+   }
+   spin_unlock_irqrestore(vcpu-irq.lock, flags);
+   schedule();
+   spin_lock_irqsave(vcpu-irq.lock, flags);
+   }
+   __remove_wait_queue(vcpu-irq.wq, wait);
+   __set_current_state(TASK_RUNNING);
+   }
+
+   if (res  0)
+   vcpu-irq.usignal = 0;
+
+   spin_unlock_irqrestore(vcpu-irq.lock, flags);
+
+   if (res  0  put_user(val, (int __user *) buf))
+   return -EFAULT;
+
+   return res;
+}
+
 static struct file_operations kvm_vcpu_fops = {
.release= kvm_vcpu_release,
.unlocked_ioctl = kvm_vcpu_ioctl,
.compat_ioctl   = kvm_vcpu_ioctl,
.mmap   = kvm_vcpu_mmap,
+   .poll   = kvm_vcpu_poll,
+   .read   = kvm_vcpu_read,
 };
 
 /*
@@ -2336,6 +2405,7 @@ static void kvm_vcpu_intr(struct kvm_irqsink *this,
struct kvm_vcpu *vcpu = (struct kvm_vcpu*)this-private;
unsigned long flags;
int direct_ipi = -1;
+   int indirect_sig = 0;
 
spin_lock_irqsave(vcpu-irq.lock, flags);
 
@@ -2357,6 +2427,15 @@ static void kvm_vcpu_intr(struct kvm_irqsink *this,
 */
direct_ipi = task_cpu(vcpu-irq.task);
BUG_ON(direct_ipi == smp_processor_id());
+   } else {
+   /*
+* otherwise, we must assume that we could be
+* blocked anywhere, including userspace. Send
+* a signal to give everyone a chance to get
+* notification
+*/
+   vcpu-irq.usignal++;
+   indirect_sig = 1;
}
}
}
@@ -2379,6 +2458,9 @@ static void kvm_vcpu_intr(struct kvm_irqsink *this,
smp_call_function_single(direct_ipi,
 kvm_vcpu_guest_intr,
 vcpu, 0, 0);
+
+