On Sat, 25 May 2002, Frank C. Earl wrote:

> On Saturday 25 May 2002 11:56 am, you wrote:
> 
> > What prevents a client from modifying the contents of a buffer after it's
> > been submitted?  Sure, you can't send new buffers without the lock, but
> > the client can still write to a buffer that's already been submitted and
> > dispatched without holding the lock.
> 
> Nothing.  If the chip had been as secure as we'd initially thought, it would 
> not have mattered because all they'd do is scribble all over the screen at 
> the worst.  
> 
> If you're unmapping on submission, you don't have to lock things on the 
> client end because they can't alter after the fact.  Then you only have to 
> worry about bad data.  In this case, what you're going to want to do is to 
> unmap, build the real structure by filling in the commands for the vertex 
> entries, and submit to the processing queue.   Multiple callers could then 
> still submit what they wanted to be DMAed without waiting (in the current 
> model, don't each of the clients have to wait if one's got the lock?) because 
> there's a piece of code multiplexing the DMA resource instead of a lock 
> managing it.

I'm using the same model you had set up.  When a client submits a buffer,
it's added to the queue (but not dispatched) and there's no blocking.
The DRM submits buffers in batches when the high water mark is reached or
when the flush ioctl is called (needed, for example, before reading from
or writing to the framebuffer).  Clients have to hold the lock to submit
a buffer, but the ioctl returns quickly.  The only place a client has to
wait is in freelist_get when the freelist is empty.  That's where buffer
aging or reading the ring head lets the call return as soon as a single
buffer is available, rather than waiting for the whole DMA pass to
complete.
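
To make that concrete, here's a rough sketch of the freelist_get idea.
All the names here (mach64_buf_t, read_ring_head() and so on) are made
up for illustration, not taken from the actual driver:

#include <stddef.h>

typedef struct mach64_buf {
    struct mach64_buf *next;
    unsigned int age;       /* engine position stamped at dispatch time */
} mach64_buf_t;

typedef struct {
    mach64_buf_t *free;     /* buffers ready to hand to clients */
    mach64_buf_t *pending;  /* buffers queued/dispatched for DMA */
} mach64_freelist_t;

/* Hypothetical: how far the engine has advanced (ring head, or a
 * scratch register the engine writes as it goes). */
extern unsigned int read_ring_head(void);

static mach64_buf_t *freelist_get(mach64_freelist_t *fl)
{
    /* Fast path: a buffer is already free. */
    if (fl->free) {
        mach64_buf_t *buf = fl->free;
        fl->free = buf->next;
        return buf;
    }

    /* Slow path: poll engine progress and return as soon as a
     * *single* pending buffer has aged out, instead of blocking
     * until the entire DMA pass completes.  (Wraparound handling
     * and sleeping instead of spinning are omitted here.) */
    for (;;) {
        unsigned int done = read_ring_head();
        mach64_buf_t **p;

        for (p = &fl->pending; *p; p = &(*p)->next) {
            if ((*p)->age <= done) {
                mach64_buf_t *buf = *p;
                *p = buf->next;
                return buf;
            }
        }
    }
}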
 
> > I don't see the interrupt method being that different from a security
> > perspective.  The DRM is "in the driver's seat" in either case, the method
> > without interrupts is essentially the same, but with the trigger for
> > starting a new pass in a different place.  The problem isn't just relying
> > on reading registers that can be modified by the client, but ensuring that
> > the client doesn't add commands to derail the DMA pass or lock the engine.
> > The only way to make sure this doesn't happen is by copying or
> > unmapping and verifying the buffers.  I think the i830 driver does this.
> > Yes, it will impact performance, but I don't see a way to get around it
> > and still make the driver secure.  At least this extra work can be done
> > while the card is busy with a DMA operation.
> 
> If it had been secure and you couldn't derail DMA, it wouldn't have pieces 
> that could be confused by malicious clients, meaning you wouldn't need to do 
> copying, etc. to secure the pathway, ensuring peak overall performance.  With 
> your latest test case, it's a moot point.
>   
> We're going to have to secure the stream proper in the form of code that has 
> inner loops, etc.  (The i830 does an unmap and a single append only; we've 
> got a lot more to do with the Mach64.  I've been thinking of ways around that 
> on the i830 and i810 that I'm going to be trying at some point.)  Your way 
> would be as secure in this environment.

For vertex data, we can add the register commands ourselves based on the
primitive type and buffer size.  By placing the commands, we can ensure
that anything the client put in the buffer is only ever interpreted as
data.  This requires an unmap and a loop through the buffer, but we
wouldn't have to copy all the data.  I'm going to try doing gui-master
blits using BM_HOSTDATA rather than BM_ADDR and HOST_DATA[0-15] to see
if we can eliminate the register commands in the buffer.  We could also
use system bus masters for blits, but that would require ending the
current DMA op and setting up a new one for each blit, since blits done
this way use BM_SYSTEM_TABLE instead of BM_GUI_TABLE.  With BM_HOSTDATA
it would just be a matter of changing the descriptors for blits, and
they could co-exist in the same stream as the vertex and state
gui-master ops.
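
Roughly what I have in mind for the vertex path.  The group size and
the command encoding below are placeholders, not the real Mach64
layout:

#include <stdint.h>
#include <stddef.h>

#define GROUP_DWORDS      16      /* data dwords per command (assumed) */
#define MACH64_VERTEX_REG 0x0100  /* placeholder register offset */

/* Hypothetical encoding: dword count in the high half, register
 * offset in the low half. */
static inline uint32_t mach64_cmd(uint32_t reg, uint32_t count)
{
    return (count << 16) | reg;
}

/* Walk the buffer (already unmapped from the client) and stamp our
 * own command into each reserved slot, so nothing the client wrote
 * can be decoded as a command.  The vertex data itself is never
 * copied. */
static void mach64_fixup_vertex_buffer(uint32_t *buf, size_t dwords)
{
    size_t i = 0;

    while (i < dwords) {
        size_t remain = dwords - i - 1;   /* data dwords after slot */
        size_t n = remain < GROUP_DWORDS ? remain : GROUP_DWORDS;

        if (n == 0)
            break;   /* trailing slot with no data to send */

        /* Overwrite whatever the client put in the command slot. */
        buf[i] = mach64_cmd(MACH64_VERTEX_REG, n);
        i += n + 1;  /* skip the command and its data dwords */
    }
}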
 
> Now, as to which is more efficient, that's still up for debate.  I can't say 
> which is going to be faster overall.  There's the aging in your design that 
> allows for buffers being released sooner than in mine.  There's the need for 
> serialization in your design that isn't required in mine.  Which causes the 
> worst bottlenecks in performance?

As I explained above, serialization isn't needed.  It's really a question
of which method of checking completion and dispatching buffers leaves the
least amount of idle time.  Buffer aging could still be used in the
interrupt-driven model, so that's not really a constraint of one approach
versus the other.  I don't think it would be difficult to test both
methods without much change to the basic code infrastructure.
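
The aging bookkeeping itself is the same either way; only the trigger
moves.  A sketch of the interrupt-driven variant, again with
hypothetical names:

extern unsigned int read_ring_head(void);   /* hypothetical, as above */
extern void mach64_dispatch_queued(void);   /* kick off the next pass */

static volatile unsigned int completed_age; /* consumed by freelist_get */

/* Hypothetical handler body: record engine progress for the aging
 * check, then start the next batch right away instead of waiting for
 * a client to poll or for the high water mark to be hit. */
static void mach64_dma_irq(void)
{
    completed_age = read_ring_head();
    mach64_dispatch_queued();
}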

-- 
Leif Delgass 
http://www.retinalburn.net

