Hello Mr. Miller, and the others,
Years ago I followed the development of the GGI project, which attempted
to improve the Linux graphics subsystem to, say, IRIX levels. Although
they never succeeded, it turned out that graphics hardware is generally
designed in such a way that it is very hard, or impossible, for the OS to
provide direct access to graphics hardware for multiple programs
simultaneously in a fast, secure and stable manner. Apparently SGI did
make such secure but still fast hardware.
It is very well possible that you already took that into account in your
design, but still I want to bring a (very short) paper on this subject
to your attention.
I have included a webpage by Linas Vepstas about designing graphics
hardware in such a way that it is easier for the OS to make graphics
access crash-proof, secure, but still fast.
* Crash-proof as in not being able to crash the graphics hardware by
feeding it bogus commands;
* Secure as in providing protected graphics contexts for different
programs, which I think would be needed for integrating SELinux with the
X window system without making it slow;
* Fast as in low overhead for context switching when different programs
access the graphics hardware, because the hardware facilitates saving
and restoring the graphics context.
I hope it is of some use, and also that making secure and stable hardware
_can_ be combined with being fast.
I don't know Linas Vepstas at all, but from time to time I try to get
the right information to the right people.
thank you,
here it is:
------------------------------------------------------------------------
High Performance Graphics Hardware Design Requirements
This page attempts to spell out graphics hardware design requirements
needed to build high-performance graphics subsystems. This page is
intended for h/w graphics chip and board designers, as well as graphics
software sub-system designers and graphics device driver writers. Its
intent is to broaden the understanding of hardware design principles
needed to create high-performance graphics subsystems. These principles
are well known to high-end folks, but are sorely lacking in the Wintel
PC clone marketplace.
This page is motivated by discussions on the
comp.os.linux.development.system USENET group, and the efforts of the
Linux GGI group, where it has been discovered that most PC-class/MS
Windows graphics hardware is sorely lacking in important graphics
features. Current work on hardware-accelerated 3D centers around the
Mesa OpenGL implementation. The Graphics Advocacy page provides the
Linux background for accelerated 3D graphics.
Basic Principles
The single most fundamental concept of high-performance graphics
hardware design is that the graphics program must have direct access to
the hardware. Depending on your experience, this may sound either
obvious, or a damned-fool bad idea. To people writing computer games,
and to people building hardware, this is obvious. To people writing
operating systems and graphics applications, who are used to device
drivers, libraries and windowing systems, this sounds stupid. In fact,
both camps are correct: fast access is direct access, and yes, with
improperly designed hardware, it is dangerous.
The high-end Unix graphics hardware community has learned that both
worlds are possible: direct access from user-level programs (usually
through libraries) for performance, coupled to protected system modes
that prevent out-of-control or malicious programs from hanging the
system and locking up the hardware. However, to create such a system,
certain principles must be adhered to in the raster chip, bus interface
chip, and graphics card design. These principles are not terribly hard,
and in fact are sometimes deceptively simple and obvious. However, many
schedules have been slipped due to a misunderstanding of the required
functions. The repercussions of these principles affect the hardware,
the graphics system, the operating system, the window system, and the
graphics application. "Minor" hardware bugs in these areas are not
easily worked around in software; indeed, it may not be even possible to
work around them.
There are two basic principles: (1) a recognition that there is a
difference between a protected mode, to which only the operating system
has access, and user-level drawing commands, which any program can bang
on. (2) The concept of context switching, whereby one graphics
application can be stopped, and another re-started, all without hanging
the graphics adapter, or losing/scrambling the state of the hardware.
All of the other principles follow from the above.
Without further ado, the list:
Protected Mode
Certain graphics h/w registers/functions, such as cursor control
and colormap load, must be segregated into a distinct address
space from other functions, such as area clear and line drawing.
This allows the operating system to protect *privileged
functions*, such as cursor movement or colormap loading, from
*user space programs*, which want to have direct access to
hardware registers for line drawing and area clear for (obvious)
performance reasons. Such functions must be separated by at
least 4K bytes, since most CPU's do not allow fine-grained
memory protection (e.g. Intel x86, PowerPC, MIPS, Sparc only
allow protection for 1K-4K byte pages.)
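The page-separation rule above can be checked mechanically. What follows
is only a minimal sketch: the register names, offsets and the 4096-byte
page size are hypothetical, chosen for illustration, and do not describe
any real chip. It tests whether any page of the register aperture holds
both user-level and privileged registers.

```python
PAGE_SIZE = 4096  # assumed MMU page size in bytes

# Hypothetical register map: name -> (byte offset in the aperture, privileged?)
REGISTERS = {
    "LINE_START_XY": (0x0000, False),
    "LINE_END_XY":   (0x0004, False),
    "COMMAND":       (0x0008, False),
    "CURSOR_XY":     (0x1000, True),
    "COLORMAP_LOAD": (0x1004, True),
}


def pages_by_privilege(registers, page_size=PAGE_SIZE):
    """Return (user_pages, privileged_pages) as sets of page numbers."""
    user, priv = set(), set()
    for offset, privileged in registers.values():
        (priv if privileged else user).add(offset // page_size)
    return user, priv


def is_protectable(registers, page_size=PAGE_SIZE):
    """True if no page holds both user-level and privileged registers."""
    user, priv = pages_by_privilege(registers, page_size)
    return user.isdisjoint(priv)
```

With this layout the OS could map page 0 into a user process and keep
page 1 to itself. Moving CURSOR_XY to offset 0x0010 would put it on the
same page as the drawing registers, and the check would fail.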
Hardware Cursor
It is impossible to build a high-performance graphics subsystem
if the cursor needs to be drawn using software. This is not much
of an issue, since many DAC's today support hardware cursors,
and many/most graphics cards provide this function.
Atomic Operations
All drawing (i.e. non-protected) operations must be atomic. This
allows the operating system to suspend one program that is
drawing, and start up another program that is drawing, without
hanging the graphics hardware. For example, if it requires three
registers to be written to draw a line or clear an area
(start-xy, end-xy, and "command"), it must be possible for the
software to write the start/end points, and never get around to
writing the command, without hanging the hardware. (If the
command is never written, then the line is never drawn).
In particular, this requires that command words be written last,
and not first. For commands that require multiple registers to
be written, it must be possible to break off the command at any
point without hanging the hardware (i.e. it must be possible to
write some of the registers, without writing all of them,
without indefinitely hanging the hardware). If only a partial
command is written, then no operation is performed.
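The command-written-last rule can be illustrated with a toy software
model. This is a sketch under stated assumptions, not a description of
real hardware: the class RasterEngine, its methods and the "DRAW_LINE"
command name are all invented for this example. Operand writes only latch
values; nothing happens until the command register is written, and an
incomplete command is simply ignored.

```python
class RasterEngine:
    """Toy model: operands are latched, and only the command write acts."""

    def __init__(self):
        self.start_xy = None
        self.end_xy = None
        self.drawn = []  # lines the engine has actually drawn

    def write_start(self, xy):
        self.start_xy = xy  # latch only, no side effect

    def write_end(self, xy):
        self.end_xy = xy  # latch only, no side effect

    def write_command(self, command):
        # The command register is written last. If an operand is missing,
        # the command is dropped rather than leaving the engine waiting.
        complete = self.start_xy is not None and self.end_xy is not None
        if command == "DRAW_LINE" and complete:
            self.drawn.append((self.start_xy, self.end_xy))
```

A program that writes the endpoints and is then suspended forever leaves
the model in a perfectly usable state: no line has been drawn, and the
next writer can simply overwrite the latched operands.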
Interruptible Operations
All drawing (i.e. user-level) operations must be interruptible.
That is, if a command requires that multiple registers must be
written, it must be possible to start writing data for this
command, and then break this off and perform another command
instead. Thus, for example, it must be possible to specify the
line endpoints, then specify clear-area extents, then clear the
area, then move the cursor, and then ask for the line to be
drawn (software may have reloaded the line endpoints first).
Such interrupted operations must NOT leave the hardware in an
unknown or hung state.
This, together with the atomic-operations requirement above, and
the readable registers requirement below, allows a multi-tasking
operating system to stop a drawing process at any time (on an
instruction-by-instruction basis), put it to sleep, and then
allow another drawing process to run and do its drawing.
Non-atomic, non-interruptible drawing operations require the
drawing program to obtain a lock, do its stuff, then release
the lock when it's done. In general, locks are undesirable: they
are slow. Even if a lock was fast, just having to do one takes
CPU cycles away from what we really want to do: draw stuff.
Note that after the operating system has suspended one client,
it may do housekeeping functions, such as updating the cursor or
the colormap, before allowing other processes to run. Thus, it
must be possible to execute privileged commands that interrupt
user commands.
Readable Registers
All registers must be readable. This is vital for a
multi-tasking operating system. This allows the operating system
to stop a graphics process, and save its graphics hardware
context. It then allows the OS to restore a possibly different
context from a different graphics process, allowing it to run,
then stopping it, saving, etc.
The concept introduced here is of "context switching" or
"multi-tasking". Basically, a graphics program can be suspended
at any time, and another graphics program can be started exactly
where it last left off. In order to be able to restart another
process precisely where it left off, it must be possible to set
the graphics hardware into the exact same state where the last
program left off. To be able to get back to the exact same
state, it must be possible to somehow read and save this state.
Note that high-end hardware usually provides features that not
only make it possible to read and restore state information, but
also make this operation extremely fast. Hardware that does
support save/restore usually supports this at sub-millisecond
speeds, thus allowing hundreds of context switches per second,
while still leaving the CPU and graphics card 90% free so
that drawing can continue with hardly any slowdown.
Note that more modern high-end hardware allows multiple
graphics contexts: these can be saved to, and restored from
special RAM areas on the card, without having to move all of the
context information over the bus.
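The save/restore cycle described above can be sketched in a few lines.
Everything named here (the Adapter class, its three registers, and the
context_switch helper) is hypothetical and only models the idea: because
every register can be read back, the OS can preempt a process halfway
through setting up a command and later put the hardware back exactly
where that process left it.

```python
class Adapter:
    """Hypothetical adapter whose drawing registers can all be read back."""

    NAMES = ("xy0", "xy1", "color")

    def __init__(self):
        self.regs = dict.fromkeys(self.NAMES, 0)
        self.drawn = []  # record of executed commands

    def read(self, name):
        return self.regs[name]

    def write(self, name, value):
        self.regs[name] = value

    def command(self, op):
        # Executes using whatever operands are currently in the registers.
        self.drawn.append(
            (op, self.regs["xy0"], self.regs["xy1"], self.regs["color"])
        )


def context_switch(hw, saved, outgoing, incoming):
    """Save the outgoing process's registers, then load the incoming ones."""
    saved[outgoing] = {name: hw.read(name) for name in hw.NAMES}
    fresh = dict.fromkeys(hw.NAMES, 0)  # a new process starts from zeros
    for name, value in saved.get(incoming, fresh).items():
        hw.write(name, value)
```

If process A loads line endpoints, is switched out while process B clears
an area, and is then switched back in, A's command write still draws A's
line with A's color. As a rough check of the figures quoted above, at an
assumed 0.5 ms per switch, 200 switches per second would consume 100 ms
of every second, which is the 10% overhead implied by "90% free".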
Window Clipping Planes
Window clipping planes prevent a program from drawing outside of
its window boundaries. This function isn't absolutely required,
but is almost so. A graphics program can achieve much higher
performance by not worrying about whether it is drawing outside
of its window boundaries, or whether it is obscured by another
window. In addition, clipping planes provide an important
security function: they prevent errant or intentionally
malicious programs from drawing where they should not. Thus, an
out-of-control program will not scribble all over the screen.
The update of window clipping planes must be a reserved,
protected operation. That is, the control of window clipping
planes must be segregated into a different address space than
other user-mode drawing operations.
Note that some graphics hardware provides user-mode clipping
registers. These are NOT what we are talking about here. Yes, it
is nice to have user-mode clip registers, but these cannot be
used by the operating system to prevent out-of-control or
malicious programs from drawing where they shouldn't.
Note that hardware that supports directly-addressable frame
buffers should also support clip tests against data written to
the directly addressable areas.
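One way to picture protected clipping is a per-pixel window-ID plane that
only the OS may write. That realization is an assumption made for this
sketch, as are the Framebuffer class and its method names. A user-level
pixel write lands only where the plane says the pixel belongs to the
writer's window.

```python
class Framebuffer:
    """Toy frame buffer with a privileged per-pixel window-ID plane."""

    def __init__(self, width, height):
        self.width, self.height = width, height
        self.pixels = [[0] * width for _ in range(height)]
        self.wid = [[0] * width for _ in range(height)]  # 0 = background

    def set_window(self, wid, x0, y0, x1, y1):
        """Privileged: mark the rectangle [x0, x1) x [y0, y1) as owned by wid."""
        for y in range(max(y0, 0), min(y1, self.height)):
            for x in range(max(x0, 0), min(x1, self.width)):
                self.wid[y][x] = wid

    def draw_pixel(self, wid, x, y, color):
        """User-level: the write lands only where the ID plane matches."""
        inside = 0 <= x < self.width and 0 <= y < self.height
        if inside and self.wid[y][x] == wid:
            self.pixels[y][x] = color
            return True
        return False  # clipped: outside the window or obscured
```

Because set_window is the only way to change ownership, a program that
computes wild coordinates can damage nothing but its own window, and it
never needs to test for obscuring windows itself.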
Per-Window Double Buffering
This is not strictly a requirement, but frankly, for
high-performance, animated 3D hardware, full-screen double
buffering sucks. It is painful to support in the operating
system, in the graphics subsystem, and basically looks bad once
you have two or more windows animating at the same time.
Per-Window Multiple Colormaps
Again, not strictly a requirement, but if you want things to
look nice on the screen, you have got to allow applications to
set their own private colormaps, without ruining everything for
the other windows on the screen.
FIFO's
Another non-requirement, but the fact is that most high-end
graphics hardware employs FIFOs to buffer drawing commands
between the central CPU and the graphics hardware. These FIFO's
are typically anywhere from 64 Bytes to 64 KBytes long. This
allows the CPU to write commands to the graphics adapter without
having to wait for it to finish, and it allows the graphics
hardware to process drawing commands without having to wait for
the CPU to provide more commands. As long as the buffer never
accumulates more than one-tenth of a second worth of drawing
commands, any delays or lags become essentially un-noticeable to
the user.
Three common designs are seen: FIFO's in hardware (on the
graphics adapter), FIFO's in user-memory, and "ping-pong"
buffers. FIFO's on the graphics card can present a problem: when
a context switch occurs, the FIFO contents must be saved and
restored. They can be moved either to other memory on the
graphics card, or they can be sent across the bus, back to the
system. FIFO's in user memory present a problem: data and
pointers can be corrupted by the user program (accidentally or
maliciously). Of course, it must not be possible to hang the
hardware due to corrupt data in the FIFO.
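The buffering behaviour and the one-tenth-of-a-second rule of thumb can
be sketched as follows. The CommandFifo class, the helper function and
all numbers used with them are illustrative assumptions, not measurements
of any adapter.

```python
from collections import deque


class CommandFifo:
    """Bounded command queue between the CPU (writer) and the engine (reader)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.queue = deque()

    def try_write(self, command):
        """CPU side: returns False when full, so the writer must wait."""
        if len(self.queue) >= self.capacity:
            return False
        self.queue.append(command)
        return True

    def drain_one(self):
        """Engine side: take the oldest command, or None if the FIFO is empty."""
        return self.queue.popleft() if self.queue else None


def worst_case_lag_seconds(fifo_bytes, bytes_per_command, commands_per_second):
    """Time for the engine to work through a completely full FIFO."""
    return (fifo_bytes / bytes_per_command) / commands_per_second
```

For instance, assuming 16-byte commands and an engine that retires one
million commands per second, a full 64 KByte FIFO holds 4096 commands and
drains in roughly 4 milliseconds, comfortably under the tenth of a second
mentioned above.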
Hardware Contexts
Yet another non-requirement. However, almost all high-end
hardware keeps considerable graphics context information on the
hardware itself. Just as is the case with FIFO's, this context
information must be saved and restored when a context switch
occurs. Again, this context is moved either to another memory
location on the adapter, or is sent back across the bus to the
system for temporary storage in the kernel.
Well, that's all. There are in fact a large variety of more detailed
design issues, but these are too numerous to be discussed in this
overview. All of the principles discussed above are well-known and
understood in the high-end (UNIX) graphics hardware community. All of
these have been discussed and written about in public forums and
journals. However, many of these are rare, have low circulation, or are
out-of-print. This is the ultimate reason for the existence of this
page. See Bibliography below.
Kernel Considerations
The operating system kernel must address each of the hardware design
considerations expressed above. In particular, the kernel on SGI Irix
and IBM RS/6000 AIX systems supports the following functions:
Grant and Retract
A user application is granted direct access to the drawing
subsystem for the very first time by registering itself with the
kernel. The kernel returns addresses to the drawing subsystem
hardware.
Graphics Faults
Access control to the graphics hardware is governed by a
mechanism similar in many ways to the page-fault mechanism. Let
us review page-faulting: when the CPU attempts to touch a page
which is not in real memory (is in the swap space, for
instance), the CPU receives an interrupt. The interrupt handler
puts the process to sleep, and issues a read request to the
disk. When the disk has found the requested page, that page is
loaded into real memory, the virtual page tables are updated,
and the process is marked "ready-to-run". When a time slice is
available, the kernel will schedule the process and allow it to
run again.
A graphics fault proceeds in a similar manner: as long as there
are no other graphics processes that want to access the
hardware, the current process can bang away at it. Periodically,
however (typically, every 4 milliseconds), the graphics
time-slice expires. The kernel looks to see if there are any
other graphics processes that want to run. If so, then it
retracts write permission to the graphics hardware from the
first process, performs the graphics context switch, and then
grants address access to the second process. At this point, if
the first process attempts to touch the graphics i/o space, an
interrupt will be generated. The first process will be put to
sleep. The kernel will then schedule another process to run (not
necessarily another graphics process). Graphics time-slice
scheduling and regular process scheduling typically run
independently of each other.
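The grant/retract cycle above can be modelled as a small state machine.
This is a sketch only: the GraphicsScheduler class and its method names
are invented here and do not reflect the actual Irix or AIX kernel
interfaces. A touch by the current owner succeeds; a touch by anyone else
is a graphics fault that puts the caller to sleep until a time-slice
expiry grants it the hardware.

```python
class GraphicsScheduler:
    """Toy model of graphics-fault based access control."""

    def __init__(self):
        self.owner = None      # process currently granted write access
        self.waiting = []      # faulted processes, in arrival order
        self.sleeping = set()  # processes put to sleep by a graphics fault

    def touch(self, pid):
        """A process touches graphics I/O space. True if the access succeeds."""
        if self.owner is None:
            self.owner = pid  # first registrant is granted access
        if pid == self.owner:
            return True
        # Graphics fault: access was retracted or never granted.
        if pid not in self.waiting:
            self.waiting.append(pid)
        self.sleeping.add(pid)
        return False

    def timeslice_expired(self):
        """Retract from the owner and grant to the next waiter, if any."""
        if self.waiting:
            # A real kernel would save and restore the graphics context here.
            self.owner = self.waiting.pop(0)
            self.sleeping.discard(self.owner)
        return self.owner
```

Note that when nobody is waiting, the expiry changes nothing and the
owner keeps banging away at the hardware without ever entering the
kernel, which is the point of the scheme.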
Cursor
The kernel must provide interfaces to allow a special process
(typically, the X Server) to update the position of the cursor.
WID Management
Most high-end graphics hardware has window-id (WID) planes.
These planes control not only which hardware color palette is
used for pixel color lookup, but also typically provide hardware
clipping so that a process cannot draw outside of its window and
corrupt the screen.
The kernel must provide interfaces to manage these clipping
planes, and/or take over management itself. In particular, if a
window is moved (e.g. the user picks it up with the mouse and
moves it), the WID planes must be updated to reflect the new
window position. Window ID updates are by definition a
privileged operation: user processes must not be allowed to
twiddle with them, as this would allow them to corrupt window
contents accidentally or intentionally. If the corruption is
accidental, then it is merely ugly: the user sees crap drawn all
over the screen, where it shouldn't be. A malicious example
might be a rogue program running on a CIA/NSA machine attempting
to read confidential information from another window.
Context Management
If the graphics hardware has hardware contexts or hardware
FIFOs, then the kernel must shuffle this data around during a
context switch. If the adapter does not have a lot of memory on
it, then this data must be copied back across the bus, and
stored in some temporary location within the kernel. This memory
must, of course, be cleaned up if the graphics process exits.
Double Buffering
All high-end graphics hardware supports hardware double
buffering. Some support hardware quad-buffering (for
double-buffered stereo viewing). Buffer swaps need to be
synchronized with vertical retrace interrupts, so that image
tearing does not occur. The kernel is often involved with
synchronizing the swap with the retrace interrupt.
Furthermore, the kernel must count the number of pending buffer
swaps for a graphics process, and put it to sleep if there are
two. A graphics program is still typically allowed to write to a
FIFO or buffer while there is one pending, outstanding swap
request. But any more than that, and things get ugly. For
example, we once allowed a program to issue 600 buffer swaps
without putting it to sleep. It then proceeded to buffer swap 60
times a second for the next ten seconds, while everybody
wondered why it couldn't be control-C'd, and otherwise acted
unexpectedly! Never mind that what it was drawing was 10 seconds
out of date with respect to the current position of the mouse!
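The swap-counting rule can be sketched like this. The SwapThrottle class
and the limit of one outstanding swap are one reading of the paragraph
above, written as an assumption rather than as the kernel's actual code:
a swap request is accepted while none is pending, a further request tells
the caller to sleep, and each vertical retrace retires one pending swap.

```python
class SwapThrottle:
    """Per-process count of buffer swaps waiting for a vertical retrace."""

    def __init__(self, max_pending=1):
        self.max_pending = max_pending  # assumed limit: one outstanding swap
        self.pending = 0

    def request_swap(self):
        """True if the swap was queued; False if the process must sleep."""
        if self.pending >= self.max_pending:
            return False
        self.pending += 1
        return True

    def vertical_retrace(self):
        """Retrace interrupt: perform (retire) one pending swap, if any."""
        if self.pending > 0:
            self.pending -= 1


def drain_seconds(queued_swaps, retraces_per_second):
    """How long an unthrottled backlog of swaps takes to play out."""
    return queued_swaps / retraces_per_second
```

The arithmetic of the anecdote follows directly: 600 queued swaps at 60
retraces per second take 10 seconds to drain, which is exactly how far
behind the mouse that program was drawing.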
Bibliography
Many of the above principles are discussed in greater detail in the
following classical references. If my memory serves me correctly, the
papers by Voorhies and by Rhoden are particularly descriptive of the
issues and possible solutions. Yes, these would appear to be very old,
but, if anything, they illustrate how Unix and Unix workstations have at
times enjoyed a ten year lead in technology over PC's and PC operating
systems.
1. Akeley, Kurt and Tom Jermoluk, "High Performance Polygon
Rendering", Conference Proceedings, SIGGRAPH, 1988, vol 22 no.
4, pp 239-246.
2. Doyle, Brian, "All About Multi-Processing for Unix
Workstations", Conference Proceedings, NCGA 1990, pp 228-253.
(National Computer Graphics Association).
3. Haletky, Edward H. and Linas Vepstas, "Integration of GL with
the X Window System", Conference Proceedings, Xhibition 1991,
pp 105-113.
4. Norrod, Forest and Larry Thayer, "An Advanced VLSI Chip Set for
Very High Speed Graphics Rendering", Conference Proceedings,
NCGA 1991, pp 1-10.
5. Rhoden, Desi and Chris Wilcox. "Hardware Acceleration for Window
Systems", Conference Proceedings SIGGRAPH 1989 vol 23 no. 3 pp
61-67.
6. Stewart, Don. "VLSI: Key to Four Basic Strategies for Improving
Workstation Graphics", Conference Proceedings, NCGA 1990 pp
302-308.
7. Vepstas, Linas. "Porting OpenGL to New Hardware Platforms",
Course Notes, OpenGL, SIGGRAPH 1992.
8. Voorhies, Douglas, David Kirk and Olin Lathrop, "Virtual
Graphics", Conference Proceedings, SIGGRAPH 1988, vol 22 no. 4,
pp 247-253.
________________________________________________________________________
Last updated 18 February 1996 by Linas Vepstas.
Linas can be reached at [EMAIL PROTECTED]
See also Linas Web Page
--
Roland Nagtegaal <[EMAIL PROTECTED]>
Universiteit Leiden, Instituut Lorentz
