Re: [kvm-devel] [PATCH 1/4] ACE documentation

2007-05-06 Thread Wink Saville
> >
> > Trusted code should only be allowed access to the feature, at the moment
> > it is enforced by requiring the applications to have root permissions to
> > open the character device driver.
>
> This is a serious problem. There is a reason why we normally do things
> with system calls. Unless you can come up with a safe and reasonably clean
> way for unprivileged applications to use your code, I don't see how you
> expect it to get merged in the kernel.

Yeah, maybe this won't be possible, but ya never know till you try. In any
case, I've posted to this list because I thought being able to efficiently
and symmetrically communicate between a guest thread and the host
kernel might be of interest. What I was thinking was that, using kshmem and
ACE, it should be possible to implement more efficient para-virtualized
devices.

So even though there is some loss of separation between kernel and user
space, it might be worth it in this case and possibly other cases as well.

> > > Can't you put this into the vdso? Calling into the right place sounds
> > > like a problem that is already solved.
> >
> > Possibly, but it isn't universally available, I hope to use this technique
> > on other architectures.
>
> It should be possible to implement vdso on any architecture that is still
> missing it. Not easy, but it's an established way of doing things and a lot
> cleaner than making up your own linkage model.
>

I will take a look at vdso.

Wink

-
This SF.net email is sponsored by DB2 Express
Download DB2 Express C - the FREE version of DB2 express and take
control of your XML. No limits. Just data. Click to get it now.
http://sourceforge.net/powerbar/db2/
___
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel


Re: [kvm-devel] [PATCH 1/4] ACE documentation

2007-05-06 Thread Arnd Bergmann
On Sunday 06 May 2007, Wink Saville wrote:
> >
> > > Thus code
> > > +executing within the ACE area can also be executed from user space or
> > > +kernel space. This is accomplished by using spin locks when executing
> > > +within the ACE area and changes to arch/x86_64/kernel/entry.S such that
> > > +when an interrupt occurs while executing code in the ACE area that code
> > > +will be completed before the interrupt is dispatched.
> >
> > I don't understand how you can write to the spinlock when coming from
> > user space. If the page is writable, how do you make sure the user can't
> > write malicious code or data into it?
> 
> Trusted code should only be allowed access to the feature, at the moment
> it is enforced by requiring the applications to have root permissions to
> open the character device driver.

This is a serious problem. There is a reason why we normally do things
with system calls. Unless you can come up with a safe and reasonably clean
way for unprivileged applications to use your code, I don't see how you
expect it to get merged in the kernel.

> > Can't you put this into the vdso? Calling into the right place sounds
> > like a problem that is already solved.
> 
> Possibly, but it isn't universally available, I hope to use this technique
> on other architectures.

It should be possible to implement vdso on any architecture that is still
missing it. Not easy, but it's an established way of doing things and a lot
cleaner than making up your own linkage model.
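
As a rough illustration of why the "calling into the right place" part is already solved with a vdso: on Linux the kernel passes the vdso base address to every process in the auxiliary vector, so user space does not need its own jump table to find it. The sketch below assumes glibc's getauxval(); the helper name vdso_present is made up for this example.

```c
#include <elf.h>
#include <string.h>
#include <sys/auxv.h>

/*
 * Returns 1 if the kernel mapped a vdso into this process and the
 * mapping starts with a valid ELF magic number, 0 otherwise.
 * vdso_present is a hypothetical helper name, not an existing API.
 */
static int vdso_present(void)
{
    /* AT_SYSINFO_EHDR holds the address of the vdso ELF header. */
    unsigned long base = getauxval(AT_SYSINFO_EHDR);

    if (base == 0)
        return 0; /* no vdso on this kernel/architecture */

    return memcmp((const void *)base, ELFMAG, SELFMAG) == 0;
}
```

From that ELF header the dynamic linker resolves the exported symbols, which is the established linkage model referred to above.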
 
Arnd <><



Re: [kvm-devel] [PATCH 1/4] ACE documentation

2007-05-06 Thread Wink Saville
On 5/6/07, Arnd Bergmann <[EMAIL PROTECTED]> wrote:
> On Sunday 06 May 2007, Wink Saville wrote:
> > +Atomic Code Execution (ACE) allows code to execute as if it was
> > +surrounded by spin_lock_irqsave and spin_unlock_irqrestore without
> > +requiring actual access to the processor flags register.
>
> I guess you mean spin_{un,}lock_irq here, right? The save/restore
> part had me confused for quite a while when trying to understand
> what you are doing.

Sorry, I should have said spin_unlock_irqrestore.

>
> > Thus code
> > +executing within the ACE area can also be executed from user space or
> > +kernel space. This is accomplished by using spin locks when executing
> > +within the ACE area and changes to arch/x86_64/kernel/entry.S such that
> > +when an interrupt occurs while executing code in the ACE area that code
> > +will be completed before the interrupt is dispatched.
>
> I don't understand how you can write to the spinlock when coming from
> user space. If the page is writable, how do you make sure the user can't
> write malicious code or data into it?

Trusted code should only be allowed access to the feature, at the moment
it is enforced by requiring the applications to have root permissions to
open the character device driver.


>
> > +The modifications to entry.S start at the label ace_common plus the
> > +macro HANDLE_ACE immediately following. HANDLE_ACE invokes ace_common
> > +at the beginning of each interrupt; if it finds that code in the ACE
> > +area was interrupted, it causes the ACE code to resume execution prior
> > +to handling the interrupt.
>
> You add eight instructions to every interrupt entry; have you measured
> whether that makes a difference in timing?

I haven't measured this, but assume the cost is negligible relative to
the shortest interrupt path. I welcome any suggestions on improvements,
which no doubt are possible.

> > +The code for the ACE area is in the file arch/x86_64/kernel/ace.S. This
> > +file contains all of the code that may be executed by the kernel or
> > +user-space. To be able to share this code with user space the code is
> > +also compiled by user space, but this is only to generate the jump
> > +table. Other techniques are possible, but this is the current
> > +implementation and can be seen in the directory test/ace.
>
> Can't you put this into the vdso? Calling into the right place sounds
> like a problem that is already solved.

Possibly, but it isn't universally available, I hope to use this technique
on other architectures.

> > +The ACE area is initialized by calling
> > +drivers/ace/ace_device.c/ace_init() from init/main.c/start_kernel().
>
> I don't understand what you even want the character device for,
> shouldn't the memory just always be there?

The initial reason was just as a place to test the ideas and the implementation.
But, as noted above, I also use it to enforce which applications can
access the feature. Some technique will probably be required to do this
and the driver can certainly be removed if it is unnecessary.
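
A minimal sketch of what this gating looks like from the application side, assuming the device is opened with a plain open() call and that access control comes only from the permissions on the device node. The helper names ace_gate_open and ace_gate_close and the O_RDWR flag are assumptions for illustration; only the /dev/ace name comes from the documentation.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Open the device node that gates access to the ACE area.
 * Returns a file descriptor, or -1 (with errno set) if the caller
 * lacks permission or the node does not exist.
 * Hypothetical helper; the open flags are an assumption.
 */
static int ace_gate_open(const char *path)
{
    return open(path, O_RDWR);
}

/* Release the descriptor when the application is done with ACE. */
static void ace_gate_close(int fd)
{
    if (fd >= 0)
        close(fd);
}
```

An application would call ace_gate_open("/dev/ace") before using any ACE routine and ace_gate_close() on exit; a non-root caller would simply get -1 back from the open.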

>
> > +4. ACE area API Reference
> > +
> > +ACE supports the following API:
> > +
> > +uint64_t ace_inc_counter(void)
> > +
> > +  Increment the counter
> > +
> > +
> > +uint64_t ace_inc_timer_counter(void)
> > +
> > +  Increment the timer counter
> > +
> > +
> > +void ace_get_counters(uint64_t *counter, uint64_t *timer_counter)
> > +
> > +  Return the counter and timer counter
>
> None of these functions sound like you need such a complex infrastructure
> to do them, setting single values can be done with atomic operations from
> user space. Is there anything more complex that you plan to add as the
> next step?
>

These were only used for testing, and in fact they are different now and I
should have updated the documentation.

Thank you for the feedback,

Wink



Re: [kvm-devel] [PATCH 1/4] ACE documentation

2007-05-06 Thread Arnd Bergmann
On Sunday 06 May 2007, Wink Saville wrote:
> +Atomic Code Execution (ACE) allows code to execute as if it was
> +surrounded by spin_lock_irqsave and spin_unlock_irqrestore without
> +requiring actual access to the processor flags register.

I guess you mean spin_{un,}lock_irq here, right? The save/restore
part had me confused for quite a while when trying to understand
what you are doing.

> Thus code 
> +executing within the ACE area can also be executed from user space or
> +kernel space. This is accomplished by using spin locks when executing
> +within the ACE area and changes to arch/x86_64/kernel/entry.S such that
> +when an interrupt occurs while executing code in the ACE area that code
> +will be completed before the interrupt is dispatched.

I don't understand how you can write to the spinlock when coming from
user space. If the page is writable, how do you make sure the user can't
write malicious code or data into it?

> +The modifications to entry.S start at the label ace_common plus the
> +macro HANDLE_ACE immediately following. HANDLE_ACE invokes ace_common
> +at the beginning of each interrupt; if it finds that code in the ACE
> +area was interrupted, it causes the ACE code to resume execution prior
> +to handling the interrupt.

You add eight instructions to every interrupt entry; have you measured
whether that makes a difference in timing?

> +The code for the ACE area is in the file arch/x86_64/kernel/ace.S. This
> +file contains all of the code that may be executed by the kernel or
> +user-space. To be able to share this code with user space the code is
> +also compiled by user space, but this is only to generate the jump
> +table. Other techniques are possible, but this is the current
> +implementation and can be seen in the directory test/ace.

Can't you put this into the vdso? Calling into the right place sounds
like a problem that is already solved.

> +The ACE area is initialized by calling
> +drivers/ace/ace_device.c/ace_init() from init/main.c/start_kernel().

I don't understand what you even want the character device for,
shouldn't the memory just always be there?

> +4. ACE area API Reference
> +
> +ACE supports the following API:
> +
> +uint64_t ace_inc_counter(void)
> +
> +  Increment the counter
> +
> +
> +uint64_t ace_inc_timer_counter(void)
> +
> +  Increment the timer counter
> +
> +
> +void ace_get_counters(uint64_t *counter, uint64_t *timer_counter)
> +
> +  Return the counter and timer counter

None of these functions sound like you need such a complex infrastructure
to do them, setting single values can be done with atomic operations from
user space. Is there anything more complex that you plan to add as the
next step?
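
To make the comparison concrete, here is a minimal user-space sketch of the three calls using C11 atomics instead of the ACE area. The counter variables and function names are stand-ins invented for this example, and returning the post-increment value is an assumption, since the documentation does not say what the increment functions return.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-ins for the two counters behind the ACE API. */
static _Atomic uint64_t counter;
static _Atomic uint64_t timer_counter;

/* Atomically increment the counter and return its new value. */
static uint64_t inc_counter(void)
{
    return atomic_fetch_add(&counter, 1) + 1;
}

/* Atomically increment the timer counter and return its new value. */
static uint64_t inc_timer_counter(void)
{
    return atomic_fetch_add(&timer_counter, 1) + 1;
}

/*
 * Read both counters. Each load is atomic on its own, but the pair
 * is not a consistent snapshot: an increment may land between them.
 */
static void get_counters(uint64_t *c, uint64_t *t)
{
    *c = atomic_load(&counter);
    *t = atomic_load(&timer_counter);
}
```

The only thing this does not give you is a consistent read of both values at once, so anything that justifies the extra infrastructure would have to update more than one word atomically.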

Arnd <><



[kvm-devel] [PATCH 1/4] ACE documentation

2007-05-05 Thread Wink Saville
ACE is an acronym for Atomic Code Execution and provides
spin_lock_irqsave and spin_unlock_irqrestore semantics in
user space.

Signed-off-by: Wink Saville <[EMAIL PROTECTED]>
---
  Documentation/ace.txt |  100 +
  1 files changed, 100 insertions(+), 0 deletions(-)
  create mode 100644 Documentation/ace.txt

Index: linux-2.6/Documentation/ace.txt
===
--- /dev/null   1970-01-01 00:00:00.0 +
+++ linux-2.6/Documentation/ace.txt 2007-04-29 21:52:04.0 -0700
@@ -0,0 +1,100 @@
+Title  : Atomic Code Execution (ACE)
+Authors: Wink Saville <[EMAIL PROTECTED]>
+
+CONTENTS
+
+1. Concepts
+2. Architectures Supported
+3. Configuring
+4. API Reference
+5. TODO
+
+
+1. Concepts
+
+Atomic Code Execution (ACE) allows code to execute as if it was
+surrounded by spin_lock_irqsave and spin_unlock_irqrestore without
+requiring actual access to the processor flags register. Thus code
+executing within the ACE area can also be executed from user space or
+kernel space. This is accomplished by using spin locks when executing
+within the ACE area and changes to arch/x86_64/kernel/entry.S such that
+when an interrupt occurs while executing code in the ACE area that code
+will be completed before the interrupt is dispatched.
+
+The modifications to entry.S start at the label ace_common plus the
+macro HANDLE_ACE immediately following. HANDLE_ACE invokes ace_common
+at the beginning of each interrupt; if it finds that code in the ACE
+area was interrupted, it causes the ACE code to resume execution prior
+to handling the interrupt.
+
+The code for the ACE area is in the file arch/x86_64/kernel/ace.S. This
+file contains all of the code that may be executed by the kernel or
+user-space. To be able to share this code with user space the code is
+also compiled by user space, but this is only to generate the jump
+table. Other techniques are possible, but this is the current
+implementation and can be seen in the directory test/ace.
+
+The implementation of the code in the ACE area must be coordinated with
+how the code in HANDLE_ACE is implemented and will be specific to each
+architecture. The primary requirement is that it must be easy for the
+ISR code to continue execution and it must be easy for both user and
+kernel code to execute it. To make it simple to continue execution the
+ISR code alters the ACE code return address.
+
+This is done by having the ACE code always return to the non-ACE code
+via register r11. This allows the HANDLE_ACE code to swap the contents
+of r11 with the interrupted address and thus complete the ACE code, which
+releases the spinlock; the original caller is returned to when the ISR
+completes. This gives us the semantics that ACE area code
+was executed as if it was surrounded by a spin_lock_irqsave and a
+spin_unlock_irqrestore.
+
+To allow the above algorithm to work under all circumstances the ACE
+area code must appear in every user and kernel thread; this is
+accomplished by allocating the ACE area using kshmem_alloc_at.
+
+The ACE area is initialized by calling
+drivers/ace/ace_device.c/ace_init() from init/main.c/start_kernel().
+
+
+2. Architectures Supported
+
+- X86_64
+
+
+3. Configuring
+
+In "Processor type and features" select "Atomic Code Execution (ACE)".
+
+
+4. ACE area API Reference
+
+ACE supports the following API:
+
+uint64_t ace_inc_counter(void)
+
+  Increment the counter
+
+
+uint64_t ace_inc_timer_counter(void)
+
+  Increment the timer counter
+
+
+void ace_get_counters(uint64_t *counter, uint64_t *timer_counter)
+
+  Return the counter and timer counter
+
+
+5. ACE device driver API Reference
+
+The ACE driver must be opened before accessing any of the ACE area API
+routines and closed upon exiting. The driver name is /dev/ace.
+Currently there is an ioctl routine but no operations are defined.
+
+
+5. TODO
+
+a. Decide how to handle page faults while executing ACE.
+b. Decide upon the actual code needed in ACE.
+
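
The r11 swap described in the Concepts section can be modeled in plain C as below. This is only a user-space sketch of one literal reading of the text, not the entry.S code: the struct irq_frame layout, the handle_ace name, and the explicit start/end bounds of the ACE area are all assumptions made for illustration. How the ISR regains control once the ACE code finishes is not described in the documentation and is not modeled here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical view of the state HANDLE_ACE inspects on interrupt entry. */
struct irq_frame {
    uint64_t saved_rip; /* address the interrupt will return to */
    uint64_t r11;       /* inside ACE code: the caller's return address */
};

/*
 * If the interrupt hit inside [ace_start, ace_end), swap r11 with the
 * interrupted address. Afterwards r11 holds the address at which the
 * ACE code must resume, and the interrupt returns to the original
 * caller. Returns true if the swap was performed.
 */
static bool handle_ace(struct irq_frame *f, uint64_t ace_start,
                       uint64_t ace_end)
{
    if (f->saved_rip < ace_start || f->saved_rip >= ace_end)
        return false; /* interrupted code was outside the ACE area */

    uint64_t interrupted = f->saved_rip;
    f->saved_rip = f->r11;
    f->r11 = interrupted;
    return true;
}
```

The range check is the part that runs on every interrupt entry; the swap itself only happens when ACE code was actually interrupted.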

