Re: [Dri-devel] Re: Ring lockups on mach64

José Fonseca Tue, 11 Jun 2002 13:05:27 -0700

On 2002.06.11 20:31 Leif Delgass wrote:
> On Tue, 11 Jun 2002, José Fonseca wrote:
> 
> > On 2002.06.11 18:24 Leif Delgass wrote:
> > > On Tue, 11 Jun 2002, José Fonseca wrote:
> > >
> > > ...
> > In that case we just need to ensure that the head >= the offset of the
> > last ring submited.
> 
> But that means essentially waiting for idle.  The ring tail is moved
> every
> time we add a buffer to the ring, so the "last ring submitted" is always
> close to the new tail and could be separated from the head by many
> descriptors.  We just need to make sure that the card isn't going to
> stop,
> so the flush needs to restart DMA if the card is idle and ensure that it
> _won't_ stop if the card is active.


I see. But it's very complicated to know whether the card is gonna to stop 
or not. What we can do is have flag which can tell us if we are sure the 
card will succed or not. When we add a new buffer to the ring and the 
after the change the head is before the change then we binary "and" the 
previous flag value with TRUE.

When we want to do that kind of flush, we check the value of that flag. If 
it's true we can release immediately. If it's not then we must 
wait_for_idle because the card will probably stop, restart the DMA, and 
return.

> 
> > > A couple of other comments on your change:  I think we could just
> >
> > Although there are some things you can try simplify this is _very_
> trick
> > as you gonna see below...
> >
> > > unconditionally turn on busmastering in SRC_CNTL if the engine is
> idle in
> >
> > No. The engine could be active but not doing BM, e.g., if X submited
> > something to FIFO.
> 
> Notice that I said if the engine is _idle_.  The engine can't be active
> if
> it's idle, right?  ...

Right. Sorry. I didn't understood that the first I read this. I guess that 
it would be safe to do what you had (*) suggested but as, you said too 
below, using a variable it's probably a better idea - after all if it's in 
the cache, the speed of access if several orders faster than doing 
anything over the bus.

(*) Not true.. after seeing your diff I reminded of a problem with this: 
note that if the card is not idle then you'll gonna offset the -4 entries 
to correct it, but if the card was processing the X FIFO commands - that's 
totally wrong and it's gonna corrupt the head offset. So I reiterate what 
I said on my previous reply regarding this change. Nevertheless this is a 
non-issue as using a variable achieves the same or better without 
reordering the code.

> ...
> 
> Actually, I realized there are two writes (one to SRC_CNTL and one to
> BUS_CNTL), so reading SRC_CNTL could be better.  As you say, we could
> keep
> track of this with a variable, as long as we account for the possibility
> of the card disabling bus mastering in the case of a lockup.
> 
> ...
> >
> > What's the point in waiting for anything? I though we both had agree
> that
> > we shouldn't wait for anything when submiting a buffer... (after all we
> 
> > both drawn the same conclusions in that matter in two distinct times).
> 
> As I mentioned above, we just need to ensure that the card won't stop
> because it has already read an old end of list flag.  We could have added
> several descriptors to the ring while the card is still active on an
> earlier descriptor which the card sees as the end of the ring.  So
> restarting the card if it's idle doesn't ensure that it won't stop if

I'm not sure if I understood correctly what you're saying. Note that once 
we restart the card we can be sure that it won't stop until it finishes 
_all_ buffers we supplied until _that_ moment.

> it's active.  I was just thinking that waiting to see the card advance a
> couple
> of descriptors (but not waiting for idle) might be enough to ensure that
> it isn't going to stop.

I think that having a flag indicating whether the card can stop or not is 
more efficient. What do you think?

Another thing that we must address before this is using bigger buffers - 
most of the ring dumps I see the buffers span just one or two entries at 
maximum. There are two things: we are using a buffer for each DMA* macro 
sets (which I'm working on now); and the client submited buffers can be 
much bigger (they can even span several primitives). Until both this 
things are addressed the the card is in idle most of the time anyway (at 
least on my slow testbox).

> ...
> >
> > Yep. But after thinking better and it's probably better do all in
> > UPDATE_RING... there isn't much point in putting 2 output instructions
> in
> > a seperate function where the call overhead takes more instructions
> than
> > that, especially if they are called only from a single function.
> 
> But doesn't inlining the function remove the overhead?  I was suggesting

Yes. If you inline you'll get the same thing... (I was being naughty.. :P )

> this mainly for readability, it's not really necessary if dma_start isn't
> going to be used anywhere else.

> 
> I'm attaching a diff so you can see what I'm talking about.  Hopefully
> this will remove any confusion. :)
> 

Yes it helped!

Besides my above comment I noticed that you did something with 
RING_SPACE_TEST_WITH_RETURN, but it wasn't clear what from the diff. Is 
that code still needed or not? I've disabled and I've been doing fine 
without it.

José Fonseca

_______________________________________________________________

Multimillion Dollar Computer Inventory
Live Webcast Auctions Thru Aug. 2002 - http://www.cowanalexander.com/calendar



_______________________________________________
Dri-devel mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/dri-devel

Re: [Dri-devel] Re: Ring lockups on mach64

Reply via email to