Re: GARTSize option not documented on radeon and other problems

2007-05-04 Thread Zoltan Boszormenyi
Oliver McFadden wrote:
> On 5/3/07, Zoltan Boszormenyi <[EMAIL PROTECTED]> wrote:
>   
>> Hi,
>>
>> sorry for the crossposting, I don't know who to address.
>>
>> I am experimenting with the new CFS scheduler on Linux
>> and tried to start multiple glxgears to see whether
>> they really run smoothly and have an evenly
>> distributed framerate.
>>
>> At first I could only start two instances of glxgears
>> but the third didn't start, saying that it cannot allocate
>> GART memory and that I should try increasing GARTSize.
>>
>> First problem: man radeon doesn't say anything about
>> this option, although radeon_drv.so contains this keyword.
>> I guessed that the parameter is in MBs and
>> set it to 128, but that disabled DRI because of some
>> out-of-memory condition. Setting it to 64 gave me working
>> DRI and I am able to start up some more instances of
>> glxgears.
>>
>> Second problem: if I start 16 of them, the last 3
>> behave very strangely, i.e. instead of the spinning gears
>> I get chaotically flickering triangles. As soon as the number
>> of glxgears goes down to 13, every window behaves
>> normally.
>>
>> Third problem: starting up 32 glxgears locked up the
>> machine almost instantly, but having only 16 of them
>> also locks up the machine after some time has passed.
>>
>> The machine is x86-64, Athlon64 X2 4200,
>> PCIe Radeon X700 with 128MB onboard memory,
>> up-to-date Fedora Core 6.
>>
>> Best regards,
>> Zoltán Böszörményi
>> 
>
> This is interesting. I've occasionally seen my engine just display a mess of
> triangles, but if I was to kill it and start it again, it would work fine. This
> only happened very occasionally so I could never track down the problem.
>
> If running more than 13 glxgears shows the problem then maybe it would have a
> chance of getting fixed. :) You might want to open a bug report on the
> Freedesktop Bugzilla.
>   

BZ #10852

> The lockups are nothing new, unfortunately. I think there is a bug somewhere in
> the R300 driver. You can also get deadlocks which are a little different... I
> think that these might go away after R300 uses TTM and thus doesn't grab the
> hardware lock anymore for texture upload, etc.
>   

Does the lockup happen only with multiple GL[X] clients?
I can play Diablo 2 :-) in Wine for hours and it doesn't lock up.

Best regards,
Zoltán Böszörményi
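For anyone hitting the same undocumented option: the working setup
boils down to one line in the Device section of xorg.conf. A sketch of
it, on the assumption (confirmed only by the experiment above, since
man radeon is silent) that the value is in megabytes:

Section "Device"
    Identifier "Radeon X700"
    Driver     "radeon"
    Option     "GARTSize" "64"    # 128 disabled DRI on this 128MB card
EndSection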




Re: GARTSize option not documented on radeon and other problems

2007-05-04 Thread Jerome Glisse
On 5/4/07, Zoltan Boszormenyi <[EMAIL PROTECTED]> wrote:
> Oliver McFadden wrote:
> > On 5/3/07, Zoltan Boszormenyi <[EMAIL PROTECTED]> wrote:
> >
> >> Hi,
> >>
> >> sorry for the crossposting, I don't know who to address.
> >>
> >> I am experimenting with the new CFS scheduler on Linux
> >> and tried to start multiple glxgears to see whether
> >> they really run smoothly and have an evenly
> >> distributed framerate.
> >>
> >> At first I could only start two instances of glxgears
> >> but the third didn't start, saying that it cannot allocate
> >> GART memory and that I should try increasing GARTSize.
> >>
> >> First problem: man radeon doesn't say anything about
> >> this option, although radeon_drv.so contains this keyword.
> >> I guessed that the parameter is in MBs and
> >> set it to 128, but that disabled DRI because of some
> >> out-of-memory condition. Setting it to 64 gave me working
> >> DRI and I am able to start up some more instances of
> >> glxgears.
> >>
> >> Second problem: if I start 16 of them, the last 3
> >> behave very strangely, i.e. instead of the spinning gears
> >> I get chaotically flickering triangles. As soon as the number
> >> of glxgears goes down to 13, every window behaves
> >> normally.
> >>
> >> Third problem: starting up 32 glxgears locked up the
> >> machine almost instantly, but having only 16 of them
> >> also locks up the machine after some time has passed.
> >>
> >> The machine is x86-64, Athlon64 X2 4200,
> >> PCIe Radeon X700 with 128MB onboard memory,
> >> up-to-date Fedora Core 6.
> >>
> >> Best regards,
> >> Zoltán Böszörményi
> >>
> >
> > This is interesting. I've occasionally seen my engine just display a mess of
> > triangles, but if I was to kill it and start it again, it would work fine. This
> > only happened very occasionally so I could never track down the problem.
> >
> > If running more than 13 glxgears shows the problem then maybe it would have a
> > chance of getting fixed. :) You might want to open a bug report on the
> > Freedesktop Bugzilla.
> >
>
> BZ #10852
>
> > The lockups are nothing new, unfortunately. I think there is a bug somewhere in
> > the R300 driver. You can also get deadlocks which are a little different... I
> > think that these might go away after R300 uses TTM and thus doesn't grab the
> > hardware lock anymore for texture upload, etc.
> >
>
> Does the lockup happen only with multiple GL[X] clients?
> I can play Diablo 2 :-) in Wine for hours and it doesn't lock up.
>
> Best regards,
> Zoltán Böszörményi

We believe the lockups happen under high memory traffic (using
big textures or sending megabytes of data to the card). I don't
believe Diablo is an especially traffic-heavy app.

For your CFS scheduler test I don't think DRM is a good
thing to test with. From my understanding, any
app that gets the GPU lockup can basically starve the other
apps waiting for it, i.e. we don't have any scheduling or fair
sharing of the GPU. This isn't at all easy to solve, and I think
we somewhat don't care whether an app can DoS the GPU.

best,
Jerome Glisse
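To make the starvation point concrete, here is a minimal C sketch of
the pattern being described. The names are invented and this is not
the actual DRM code; it only illustrates a single per-device lock with
no fairness policy behind it:

#include <pthread.h>
#include <stddef.h>

/* Stand-in for the DRM hardware lock: one per device, so whoever
 * holds it owns the GPU for that long. */
static pthread_mutex_t hw_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical ring-buffer write, stubbed so the sketch compiles. */
static void emit_to_ring(const void *cmds, size_t len)
{
    (void)cmds;
    (void)len;
}

void client_submit(const void *cmds, size_t len)
{
    pthread_mutex_lock(&hw_lock);    /* no queue, no fairness policy */
    emit_to_ring(cmds, len);         /* a huge submission keeps the
                                      * lock busy for its whole run  */
    pthread_mutex_unlock(&hw_lock);  /* a client that wedges the GPU
                                      * while holding the lock starves
                                      * everyone else: the DoS above */
}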



Re: GARTSize option not documented on radeon and other problems

2007-05-04 Thread Zoltan Boszormenyi
Jerome Glisse wrote:
> On 5/4/07, Zoltan Boszormenyi <[EMAIL PROTECTED]> wrote:
>   
>> Oliver McFadden wrote:
>> 
>>> On 5/3/07, Zoltan Boszormenyi <[EMAIL PROTECTED]> wrote:
>>>
>>>   
>>>> Hi,
>>>>
>>>> sorry for the crossposting, I don't know who to address.
>>>>
>>>> I am experimenting with the new CFS scheduler on Linux
>>>> and tried to start multiple glxgears to see whether
>>>> they really run smoothly and have an evenly
>>>> distributed framerate.
>>>>
>>>> At first I could only start two instances of glxgears
>>>> but the third didn't start, saying that it cannot allocate
>>>> GART memory and that I should try increasing GARTSize.
>>>>
>>>> First problem: man radeon doesn't say anything about
>>>> this option, although radeon_drv.so contains this keyword.
>>>> I guessed that the parameter is in MBs and
>>>> set it to 128, but that disabled DRI because of some
>>>> out-of-memory condition. Setting it to 64 gave me working
>>>> DRI and I am able to start up some more instances of
>>>> glxgears.
>>>>
>>>> Second problem: if I start 16 of them, the last 3
>>>> behave very strangely, i.e. instead of the spinning gears
>>>> I get chaotically flickering triangles. As soon as the number
>>>> of glxgears goes down to 13, every window behaves
>>>> normally.
>>>>
>>>> Third problem: starting up 32 glxgears locked up the
>>>> machine almost instantly, but having only 16 of them
>>>> also locks up the machine after some time has passed.
>>>>
>>>> The machine is x86-64, Athlon64 X2 4200,
>>>> PCIe Radeon X700 with 128MB onboard memory,
>>>> up-to-date Fedora Core 6.
>>>>
>>>> Best regards,
>>>> Zoltán Böszörményi
>>>>
>>> This is interesting. I've occasionally seen my engine just display a mess of
>>> triangles, but if I was to kill it and start it again, it would work fine. This
>>> only happened very occasionally so I could never track down the problem.
>>>
>>> If running more than 13 glxgears shows the problem then maybe it would have a
>>> chance of getting fixed. :) You might want to open a bug report on the
>>> Freedesktop Bugzilla.
>>>
>>>   
>> BZ #10852
>>
>> 
>>> The lockups are nothing new, unfortunately. I think there is a bug somewhere in
>>> the R300 driver. You can also get deadlocks which are a little different... I
>>> think that these might go away after R300 uses TTM and thus doesn't grab the
>>> hardware lock anymore for texture upload, etc.
>>>
>>>   
>> Does the lockup happen only with multiple GL[X] clients?
>> I can play Diablo 2 :-) in Wine for hours and it doesn't lock up.
>>
>> Best regards,
>> Zoltán Böszörményi
>> 
>
> We believe the lockups happen under high memory traffic (using
> big textures or sending megabytes of data to the card). I don't
> believe Diablo is an especially traffic-heavy app.
>   

But neither is glxgears. If I have a small number of them, say 2-3,
I don't experience any lockup.

> For your CFS scheduler test I don't think DRM is a good
> thing to test with. From my understanding, any
>   

But it seems that the mainstream scheduler in Linux cannot keep
even a small number of GL clients running smoothly under load.
My test was to run make -j4 on the kernel source while
running 12 glxgears. All 12 gears were running smoothly.
And in the meantime, switching workspaces stayed quick,
i.e. repainting the large windows of Firefox and Thunderbird
was, although not instant, quite quick despite the high load.
This kind of load makes the mainstream scheduler stall for
several seconds, which cannot be observed under Ingo Molnar's
new scheduler.

> app that gets the GPU lockup can basically starve the other
> apps waiting for it, i.e. we don't have any scheduling or fair
> sharing of the GPU. This isn't at all easy to solve, and I think
> we somewhat don't care whether an app can DoS the GPU.
>
> best,
> Jerome Glisse
>   

The lockup I mentioned wasn't a complete machine lock-up.
Although NumLock didn't work and the mouse pointer disappeared
from the screen, the machine was still working, i.e. the kernel
responded to Alt+SysRq commands (sync, remount read-only, poweroff),
so it must be some software locking problem.

Best regards,
Zoltán Böszörményi




Re: [RFC] [PATCH] DRM TTM Memory Manager patch

2007-05-04 Thread Thomas Hellström
Keith Packard wrote:
> On Thu, 2007-05-03 at 01:01 +0200, Thomas Hellström wrote:
>
>   
>> It might be possible to find schemes that work around this. One way 
>> could possibly be to have a buffer mapping and validation order for 
>> shared buffers.
>> 
>
> If mapping never blocks on anything other than the fence, then there
> isn't any dead lock possibility. What this says is that ordering of
> rendering between clients is *not DRMs problem*. I think that's a good
> solution though; I want to let multiple apps work on DRM-able memory
> with their own CPU without contention.
>
> I don't recall if Eric laid out the proposed rules, but:
>
>  1) Map never blocks on map. Clients interested in dealing with this 
> are on their own.
>
>  2) Submit blocks on map. You must unmap all buffers before submitting
> them. Doing the relocations in the kernel makes this all possible.
>
>  3) Map blocks on the fence from submit. We can play with pending the
> flush until the app asks for the buffer back, or we can play with
> figuring out when flushes are useful automatically. Doesn't matter
> if the policy is in the kernel.
>
> I'm interested in making deadlock avoidance trivial and eliminating any
> map-map contention.
>
>   
It's rare to have two clients access the same buffer at the same time. 
In what situation will this occur?

If we think of map / unmap and validation / fence as taking a buffer
mutex either for the CPU or for the GPU, that's the way the
implementation works today. The CPU side of the mutex should IIRC be
per-client recursive. OTOH, the TTM implementation won't stop the CPU
from accessing the buffer when it is unmapped, but then you're on your own.
"Mutexes" need to be taken in the correct order, otherwise a deadlock 
will occur, and GL will, as outlined in Eric's illustration, more or 
less encourage us to take buffers in the "incorrect" order.

In essence what you propose is to eliminate the deadlock problem by just 
avoiding taking the buffer mutex unless we know the GPU has it. I see 
two problems with this:

* It will encourage different DRI clients to simultaneously access
  the same buffer.
* Inter-client and GPU data coherence can be guaranteed if we issue
  a mb() / write-combining flush with the unmap operation (which,
  BTW, I'm not sure is done today). Otherwise it is up to the
  clients, and very easy to forget.

I'm a bit afraid we might in the future regret taking the easy way out?

OTOH, letting DRM resolve the deadlock by unmapping and remapping shared 
buffers in the correct order might not be the best one either. It will 
certainly mean some CPU overhead and what if we have to do the same with 
buffer validation? (Yes for some operations with thousands and thousands 
of relocations, the user space validation might need to stay).

Personally, I'm slightly biased towards having DRM resolve the deadlock, 
but I think any solution will do as long as the implications and why we 
choose a certain solution are totally clear.

For item 3) above the kernel must have a way to issue a flush when 
needed for buffer eviction.
The current implementation also requires the buffer to be completely 
flushed before mapping.
Other than that the flushing policy is currently completely up to the 
DRM drivers.

/Thomas
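For concreteness, a rough C sketch of the three rules above. All names
are invented (fence_wait(), cpu_address() and gpu_emit() are assumed
helpers) and the structure is deliberately naive; it is only meant to
make the blocking relationships explicit, not to mirror the real TTM
interfaces:

#include <errno.h>

struct fence;                 /* opaque GPU progress marker */

struct buffer {
    int mapped;               /* outstanding CPU maps                */
    struct fence *fence;      /* fence from the last submit, or NULL */
};

extern void fence_wait(struct fence *f);
extern void *cpu_address(struct buffer *b);
extern struct fence *gpu_emit(struct buffer *b);

/* Rules 1 and 3: map blocks only on the fence from submit,
 * never on another map. */
void *buf_map(struct buffer *b)
{
    if (b->fence) {
        fence_wait(b->fence);
        b->fence = NULL;
    }
    b->mapped++;
    return cpu_address(b);
}

void buf_unmap(struct buffer *b)
{
    b->mapped--;
}

/* Rule 2: submit blocks on map. Sketched as a failure return; a real
 * version would more likely sleep until the maps go away. */
int submit(struct buffer **bufs, int n)
{
    for (int i = 0; i < n; i++)
        if (bufs[i]->mapped)
            return -EBUSY;
    for (int i = 0; i < n; i++)
        bufs[i]->fence = gpu_emit(bufs[i]);
    return 0;
}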


Re: GARTSize option not documented on radeon and other problems

2007-05-04 Thread Jerome Glisse
On 5/4/07, Zoltan Boszormenyi <[EMAIL PROTECTED]> wrote:
> Jerome Glisse wrote:
> > On 5/4/07, Zoltan Boszormenyi <[EMAIL PROTECTED]> wrote:
> >
> >> Oliver McFadden wrote:
> >>
> >>> On 5/3/07, Zoltan Boszormenyi <[EMAIL PROTECTED]> wrote:
> >>>
> >>>
> >>>> Hi,
> >>>>
> >>>> sorry for the crossposting, I don't know who to address.
> >>>>
> >>>> I am experimenting with the new CFS scheduler on Linux
> >>>> and tried to start multiple glxgears to see whether
> >>>> they really run smoothly and have an evenly
> >>>> distributed framerate.
> >>>>
> >>>> At first I could only start two instances of glxgears
> >>>> but the third didn't start, saying that it cannot allocate
> >>>> GART memory and that I should try increasing GARTSize.
> >>>>
> >>>> First problem: man radeon doesn't say anything about
> >>>> this option, although radeon_drv.so contains this keyword.
> >>>> I guessed that the parameter is in MBs and
> >>>> set it to 128, but that disabled DRI because of some
> >>>> out-of-memory condition. Setting it to 64 gave me working
> >>>> DRI and I am able to start up some more instances of
> >>>> glxgears.
> >>>>
> >>>> Second problem: if I start 16 of them, the last 3
> >>>> behave very strangely, i.e. instead of the spinning gears
> >>>> I get chaotically flickering triangles. As soon as the number
> >>>> of glxgears goes down to 13, every window behaves
> >>>> normally.
> >>>>
> >>>> Third problem: starting up 32 glxgears locked up the
> >>>> machine almost instantly, but having only 16 of them
> >>>> also locks up the machine after some time has passed.
> >>>>
> >>>> The machine is x86-64, Athlon64 X2 4200,
> >>>> PCIe Radeon X700 with 128MB onboard memory,
> >>>> up-to-date Fedora Core 6.
> >>>>
> >>>> Best regards,
> >>>> Zoltán Böszörményi
> >>>>
> 
> >>> This is interesting. I've occasionally seen my engine just display a mess of
> >>> triangles, but if I was to kill it and start it again, it would work fine. This
> >>> only happened very occasionally so I could never track down the problem.
> >>>
> >>> If running more than 13 glxgears shows the problem then maybe it would have a
> >>> chance of getting fixed. :) You might want to open a bug report on the
> >>> Freedesktop Bugzilla.
> >>>
> >>>
> >> BZ #10852
> >>
> >>
> >>> The lockups are nothing new, unfortunately. I think there is a bug somewhere in
> >>> the R300 driver. You can also get deadlocks which are a little different... I
> >>> think that these might go away after R300 uses TTM and thus doesn't grab the
> >>> hardware lock anymore for texture upload, etc.
> >>>
> >>>
> >> Does the lockup happen only with multiple GL[X] clients?
> >> I can play Diablo 2 :-) in Wine for hours and it doesn't lock up.
> >>
> >> Best regards,
> >> Zoltán Böszörményi
> >>
> >
> > We believe the lockups happen under high memory traffic (using
> > big textures or sending megabytes of data to the card). I don't
> > believe Diablo is an especially traffic-heavy app.
> >
>
> But neither is glxgears. If I have a small number of them, say 2-3,
> I don't experience any lockup.
>
> > For your CFS scheduler test I don't think DRM is a good
> > thing to test with. From my understanding, any
> >
>
> But it seems that the mainstream scheduler in Linux cannot keep
> even a small number of GL clients running smoothly under load.
> My test was to run make -j4 on the kernel source while
> running 12 glxgears. All 12 gears were running smoothly.
> And in the meantime, switching workspaces stayed quick,
> i.e. repainting the large windows of Firefox and Thunderbird
> was, although not instant, quite quick despite the high load.
> This kind of load makes the mainstream scheduler stall for
> several seconds, which cannot be observed under Ingo Molnar's
> new scheduler.
>
> > app that gets the GPU lockup can basically starve the other
> > apps waiting for it, i.e. we don't have any scheduling or fair
> > sharing of the GPU. This isn't at all easy to solve, and I think
> > we somewhat don't care whether an app can DoS the GPU.
> >
> > best,
> > Jerome Glisse
> >
>
> The lockup I mentioned wasn't a complete machine lock-up.
> Although NumLock didn't work and the mouse pointer disappeared
> from the screen, the machine was still working, i.e. the kernel
> responded to Alt+SysRq commands (sync, remount read-only, poweroff),
> so it must be some software locking problem.
>
> Best regards,
> Zoltán Böszörményi
>
>

There was a typo in my mail: I meant lock, not lockup,
when I was talking about apps sending data to the GPU.
And if multiple instances of glxgears succeed in locking
up the GPU, it is because you are then sending megabytes
of vertices to the card, and the resulting high memory
bandwidth usage triggers the lockup. At least, I believe
the lockups happen because of that. Sometimes it is only
a soft lockup where only the GPU dies, but sometimes the
GPU takes the PCI bus down with it and you get a hard
lockup.

best,
Jerome Glisse


Re: [RFC] [PATCH] DRM TTM Memory Manager patch

2007-05-04 Thread Jerome Glisse
On 5/4/07, Thomas Hellström <[EMAIL PROTECTED]> wrote:
> Keith Packard wrote:
> > On Thu, 2007-05-03 at 01:01 +0200, Thomas Hellström wrote:
> >
> >
> >> It might be possible to find schemes that work around this. One way
> >> could possibly be to have a buffer mapping and validation order for
> >> shared buffers.
> >>
> >
> > If mapping never blocks on anything other than the fence, then there
> > isn't any deadlock possibility. What this says is that ordering of
> > rendering between clients is *not DRMs problem*. I think that's a good
> > solution though; I want to let multiple apps work on DRM-able memory
> > with their own CPU without contention.
> >
> > I don't recall if Eric laid out the proposed rules, but:
> >
> >  1) Map never blocks on map. Clients interested in dealing with this
> > are on their own.
> >
> >  2) Submit blocks on map. You must unmap all buffers before submitting
> > them. Doing the relocations in the kernel makes this all possible.
> >
> >  3) Map blocks on the fence from submit. We can play with pending the
> > flush until the app asks for the buffer back, or we can play with
> > figuring out when flushes are useful automatically. Doesn't matter
> > if the policy is in the kernel.
> >
> > I'm interested in making deadlock avoidance trivial and eliminating any
> > map-map contention.
> >
> >
> It's rare to have two clients access the same buffer at the same time.
> In what situation will this occur?
>
> If we think of map / unmap and validation / fence as taking a buffer
> mutex either for the CPU or for the GPU, that's the way the
> implementation works today. The CPU side of the mutex should IIRC be
> per-client recursive. OTOH, the TTM implementation won't stop the CPU
> from accessing the buffer when it is unmapped, but then you're on your own.
> "Mutexes" need to be taken in the correct order, otherwise a deadlock
> will occur, and GL will, as outlined in Eric's illustration, more or
> less encourage us to take buffers in the "incorrect" order.
>
> In essence what you propose is to eliminate the deadlock problem by just
> avoiding taking the buffer mutex unless we know the GPU has it. I see
> two problems with this:
>
> * It will encourage different DRI clients to simultaneously access
>   the same buffer.
> * Inter-client and GPU data coherence can be guaranteed if we issue
>   a mb() / write-combining flush with the unmap operation (which,
>   BTW, I'm not sure is done today). Otherwise it is up to the
>   clients, and very easy to forget.
>
> I'm a bit afraid we might in the future regret taking the easy way out?
>
> OTOH, letting DRM resolve the deadlock by unmapping and remapping shared
> buffers in the correct order might not be the best one either. It will
> certainly mean some CPU overhead and what if we have to do the same with
> buffer validation? (Yes for some operations with thousands and thousands
> of relocations, the user space validation might need to stay).
>
> Personally, I'm slightly biased towards having DRM resolve the deadlock,
> but I think any solution will do as long as the implications and why we
> choose a certain solution are totally clear.
>
> For item 3) above the kernel must have a way to issue a flush when
> needed for buffer eviction.
> The current implementation also requires the buffer to be completely
> flushed before mapping.
> Other than that the flushing policy is currently completely up to the
> DRM drivers.
>
> /Thomas

I might be saying stupid things, as I don't think I fully understand
all the input to this problem. Anyway, here are my thoughts on all this:

1) A first client map never blocks (as in Keith's layout) except on
a fence from the DRM side (point 3 in Keith's layout)
2) Clients should always unmap buffers before submitting (as in Keith's layout)
3) On the DRM side you always acquire buffer locks in a given order; for
instance, each buffer gets an id and you lock from the smaller id to the
bigger one (with a clever implementation the cost of that will be small)
4) We have 2 GPU queues:
 - one pending queue in which we do all the work that must be
   done before submitting (locking buffers, validation, ...);
   for instance, we might wait here for each buffer that is still
   mapped by some other app in user space
 - one run queue to which we add each app request that is now
   ready to be submitted to the GPU

Of course in this scheme we keep the fencing stuff so user space can
know when it is safe to use a previously submitted buffer again. The
outcome of having two separate queues in DRM is that if two apps lock
each other up, other apps can still use the GPU, so only the apps
fighting over a buffer will suffer.

And for user space synchronization, I believe it is a user space
problem, i.e. it's up to user space to add proper synchronization. For
instance, as map doesn't block for any client, in user space two apps
can mess with the same buffer; it's up to the user to handle that.
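A sketch of points 3) and 4) in C, with invented names (buf_lock(),
validate() and run_queue_add() are assumed helpers): locking buffers in
increasing id order is the classic way to make per-buffer locks
deadlock-free, and a request only moves from the pending queue to the
run queue once everything it needs is locked and validated:

#include <stdlib.h>
#include <stddef.h>

struct buf {
    unsigned int id;          /* global lock-ordering key */
};

struct request {
    struct buf **bufs;
    size_t nbufs;
};

extern void buf_lock(struct buf *b);        /* may sleep while mapped */
extern void validate(struct request *rq);   /* placement, relocations */
extern void run_queue_add(struct request *rq);

static int by_id(const void *a, const void *b)
{
    const struct buf *x = *(struct buf *const *)a;
    const struct buf *y = *(struct buf *const *)b;
    return (x->id > y->id) - (x->id < y->id);
}

/* Called for each entry sitting on the pending queue. */
void prepare_request(struct request *rq)
{
    /* Point 3: always lock in the same global order (smaller id
     * first), so two requests can never each hold a buffer the other
     * one is waiting for; only clients fighting over the same
     * buffers ever wait. */
    qsort(rq->bufs, rq->nbufs, sizeof(rq->bufs[0]), by_id);
    for (size_t i = 0; i < rq->nbufs; i++)
        buf_lock(rq->bufs[i]);
    validate(rq);
    run_queue_add(rq);    /* point 4: now ready for the GPU */
}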

Re: [RFC] [PATCH] DRM TTM Memory Manager patch

2007-05-04 Thread Jerome Glisse
On 5/4/07, Jerome Glisse <[EMAIL PROTECTED]> wrote:
> On 5/4/07, Thomas Hellström <[EMAIL PROTECTED]> wrote:
> > Keith Packard wrote:
> > > On Thu, 2007-05-03 at 01:01 +0200, Thomas Hellström wrote:
> > >
> > >
> > >> It might be possible to find schemes that work around this. One way
> > >> could possibly be to have a buffer mapping and validation order for
> > >> shared buffers.
> > >>
> > >
> > > If mapping never blocks on anything other than the fence, then there
> > > isn't any deadlock possibility. What this says is that ordering of
> > > rendering between clients is *not DRMs problem*. I think that's a good
> > > solution though; I want to let multiple apps work on DRM-able memory
> > > with their own CPU without contention.
> > >
> > > I don't recall if Eric laid out the proposed rules, but:
> > >
> > >  1) Map never blocks on map. Clients interested in dealing with this
> > > are on their own.
> > >
> > >  2) Submit blocks on map. You must unmap all buffers before submitting
> > > them. Doing the relocations in the kernel makes this all possible.
> > >
> > >  3) Map blocks on the fence from submit. We can play with pending the
> > > flush until the app asks for the buffer back, or we can play with
> > > figuring out when flushes are useful automatically. Doesn't matter
> > > if the policy is in the kernel.
> > >
> > > I'm interested in making deadlock avoidance trivial and eliminating any
> > > map-map contention.
> > >
> > >
> > It's rare to have two clients access the same buffer at the same time.
> > In what situation will this occur?
> >
> > If we think of map / unmap and validation / fence as taking a buffer
> > mutex either for the CPU or for the GPU, that's the way the
> > implementation works today. The CPU side of the mutex should IIRC be
> > per-client recursive. OTOH, the TTM implementation won't stop the CPU
> > from accessing the buffer when it is unmapped, but then you're on your own.
> > "Mutexes" need to be taken in the correct order, otherwise a deadlock
> > will occur, and GL will, as outlined in Eric's illustration, more or
> > less encourage us to take buffers in the "incorrect" order.
> >
> > In essence what you propose is to eliminate the deadlock problem by just
> > avoiding taking the buffer mutex unless we know the GPU has it. I see
> > two problems with this:
> >
> > * It will encourage different DRI clients to simultaneously access
> >   the same buffer.
> > * Inter-client and GPU data coherence can be guaranteed if we issue
> >   a mb() / write-combining flush with the unmap operation (which,
> >   BTW, I'm not sure is done today). Otherwise it is up to the
> >   clients, and very easy to forget.
> >
> > I'm a bit afraid we might in the future regret taking the easy way out?
> >
> > OTOH, letting DRM resolve the deadlock by unmapping and remapping shared
> > buffers in the correct order might not be the best one either. It will
> > certainly mean some CPU overhead and what if we have to do the same with
> > buffer validation? (Yes for some operations with thousands and thousands
> > of relocations, the user space validation might need to stay).
> >
> > Personally, I'm slightly biased towards having DRM resolve the deadlock,
> > but I think any solution will do as long as the implications and why we
> > choose a certain solution are totally clear.
> >
> > For item 3) above the kernel must have a way to issue a flush when
> > needed for buffer eviction.
> > The current implementation also requires the buffer to be completely
> > flushed before mapping.
> > Other than that the flushing policy is currently completely up to the
> > DRM drivers.
> >
> > /Thomas
>
> I might be saying stupid things, as I don't think I fully understand
> all the input to this problem. Anyway, here are my thoughts on all this:
>
> 1) A first client map never blocks (as in Keith's layout) except on
> a fence from the DRM side (point 3 in Keith's layout)
> 2) Clients should always unmap buffers before submitting (as in Keith's layout)
> 3) On the DRM side you always acquire buffer locks in a given order; for
> instance, each buffer gets an id and you lock from the smaller id to the
> bigger one (with a clever implementation the cost of that will be small)
> 4) We have 2 GPU queues:
>  - one pending queue in which we do all the work that must be
>    done before submitting (locking buffers, validation, ...);
>    for instance, we might wait here for each buffer that is still
>    mapped by some other app in user space
>  - one run queue to which we add each app request that is now
>    ready to be submitted to the GPU
>
> Of course in this scheme we keep the fencing stuff so user space can
> know when it is safe to use a previously submitted buffer again. The
> outcome of having two separate queues in DRM is that if two apps lock
> each other up, other apps can still use the GPU, so only the apps
> fighting over a buffer will

Re: [RFC] [PATCH] DRM TTM Memory Manager patch

2007-05-04 Thread Thomas Hellström
Jerome Glisse wrote:
> On 5/4/07, Thomas Hellström <[EMAIL PROTECTED]> wrote:
>> Keith Packard wrote:
>> > On Thu, 2007-05-03 at 01:01 +0200, Thomas Hellström wrote:
>> >
>> >
>> >> It might be possible to find schemes that work around this. One way
>> >> could possibly be to have a buffer mapping and validation order for
>> >> shared buffers.
>> >>
>> >
>> > If mapping never blocks on anything other than the fence, then there
>> > isn't any deadlock possibility. What this says is that ordering of
>> > rendering between clients is *not DRMs problem*. I think that's a good
>> > solution though; I want to let multiple apps work on DRM-able memory
>> > with their own CPU without contention.
>> >
>> > I don't recall if Eric laid out the proposed rules, but:
>> >
>> >  1) Map never blocks on map. Clients interested in dealing with this
>> > are on their own.
>> >
>> >  2) Submit blocks on map. You must unmap all buffers before submitting
>> > them. Doing the relocations in the kernel makes this all possible.
>> >
>> >  3) Map blocks on the fence from submit. We can play with pending the
>> > flush until the app asks for the buffer back, or we can play with
>> > figuring out when flushes are useful automatically. Doesn't matter
>> > if the policy is in the kernel.
>> >
>> > I'm interested in making deadlock avoidance trivial and eliminating any
>> > map-map contention.
>> >
>> >
>> It's rare to have two clients access the same buffer at the same time.
>> In what situation will this occur?
>>
>> If we think of map / unmap and validation / fence as taking a buffer
>> mutex either for the CPU or for the GPU, that's the way the
>> implementation works today. The CPU side of the mutex should IIRC be
>> per-client recursive. OTOH, the TTM implementation won't stop the CPU
>> from accessing the buffer when it is unmapped, but then you're on your own.
>> "Mutexes" need to be taken in the correct order, otherwise a deadlock
>> will occur, and GL will, as outlined in Eric's illustration, more or
>> less encourage us to take buffers in the "incorrect" order.
>>
>> In essence what you propose is to eliminate the deadlock problem by just
>> avoiding taking the buffer mutex unless we know the GPU has it. I see
>> two problems with this:
>>
>> * It will encourage different DRI clients to simultaneously access
>>   the same buffer.
>> * Inter-client and GPU data coherence can be guaranteed if we issue
>>   a mb() / write-combining flush with the unmap operation (which,
>>   BTW, I'm not sure is done today). Otherwise it is up to the
>>   clients, and very easy to forget.
>>
>> I'm a bit afraid we might in the future regret taking the easy way out?
>>
>> OTOH, letting DRM resolve the deadlock by unmapping and remapping shared
>> buffers in the correct order might not be the best one either. It will
>> certainly mean some CPU overhead and what if we have to do the same with
>> buffer validation? (Yes for some operations with thousands and thousands
>> of relocations, the user space validation might need to stay).
>>
>> Personally, I'm slightly biased towards having DRM resolve the deadlock,
>> but I think any solution will do as long as the implications and why we
>> choose a certain solution are totally clear.
>>
>> For item 3) above the kernel must have a way to issue a flush when
>> needed for buffer eviction.
>> The current implementation also requires the buffer to be completely
>> flushed before mapping.
>> Other than that the flushing policy is currently completely up to the
>> DRM drivers.
>>
>> /Thomas
>
> I might be saying stupid things, as I don't think I fully understand
> all the input to this problem. Anyway, here are my thoughts on all this:
>
> 1) A first client map never blocks (as in Keith's layout) except on
>    a fence from the DRM side (point 3 in Keith's layout)
>
But is there really a need for this except to avoid the above-mentioned
deadlock?
As I'm not too up to date with all the ways the servers and GL
clients may be using shared buffers,
I need some enlightenment :). Could we have an example, please?

> 4) We have 2 GPU queues:
>  - one pending queue in which we do all the work that must be
>    done before submitting (locking buffers, validation, ...);
>    for instance, we might wait here for each buffer that is still
>    mapped by some other app in user space
>  - one run queue to which we add each app request that is now
>    ready to be submitted to the GPU

This is getting closer and closer to a GPU scheduler, an interesting
topic indeed.
Perhaps we should have a separate discussion on the needs and
requirements for such a thing?

Regards,
/Thomas





Re: [RFC] [PATCH] DRM TTM Memory Manager patch

2007-05-04 Thread Jerome Glisse
On 5/4/07, Thomas Hellström <[EMAIL PROTECTED]> wrote:
> Jerome Glisse wrote:
> > On 5/4/07, Thomas Hellström <[EMAIL PROTECTED]> wrote:
> >> Keith Packard wrote:
> >> > On Thu, 2007-05-03 at 01:01 +0200, Thomas Hellström wrote:
> >> >
> >> >
> >> >> It might be possible to find schemes that work around this. One way
> >> >> could possibly be to have a buffer mapping and validation order for
> >> >> shared buffers.
> >> >>
> >> >
> >> > If mapping never blocks on anything other than the fence, then there
> >> > isn't any deadlock possibility. What this says is that ordering of
> >> > rendering between clients is *not DRMs problem*. I think that's a good
> >> > solution though; I want to let multiple apps work on DRM-able memory
> >> > with their own CPU without contention.
> >> >
> >> > I don't recall if Eric laid out the proposed rules, but:
> >> >
> >> >  1) Map never blocks on map. Clients interested in dealing with this
> >> > are on their own.
> >> >
> >> >  2) Submit blocks on map. You must unmap all buffers before submitting
> >> > them. Doing the relocations in the kernel makes this all possible.
> >> >
> >> >  3) Map blocks on the fence from submit. We can play with pending the
> >> > flush until the app asks for the buffer back, or we can play with
> >> > figuring out when flushes are useful automatically. Doesn't matter
> >> > if the policy is in the kernel.
> >> >
> >> > I'm interested in making deadlock avoidance trivial and eliminating any
> >> > map-map contention.
> >> >
> >> >
> >> It's rare to have two clients access the same buffer at the same time.
> >> In what situation will this occur?
> >>
> >> If we think of map / unmap and validation / fence as taking a buffer
> >> mutex either for the CPU or for the GPU, that's the way the
> >> implementation works today. The CPU side of the mutex should IIRC be
> >> per-client recursive. OTOH, the TTM implementation won't stop the CPU
> >> from accessing the buffer when it is unmapped, but then you're on your own.
> >> "Mutexes" need to be taken in the correct order, otherwise a deadlock
> >> will occur, and GL will, as outlined in Eric's illustration, more or
> >> less encourage us to take buffers in the "incorrect" order.
> >>
> >> In essence what you propose is to eliminate the deadlock problem by just
> >> avoiding taking the buffer mutex unless we know the GPU has it. I see
> >> two problems with this:
> >>
> >> * It will encourage different DRI clients to simultaneously access
> >>   the same buffer.
> >> * Inter-client and GPU data coherence can be guaranteed if we issue
> >>   a mb() / write-combining flush with the unmap operation (which,
> >>   BTW, I'm not sure is done today). Otherwise it is up to the
> >>   clients, and very easy to forget.
> >>
> >> I'm a bit afraid we might in the future regret taking the easy way out?
> >>
> >> OTOH, letting DRM resolve the deadlock by unmapping and remapping shared
> >> buffers in the correct order might not be the best one either. It will
> >> certainly mean some CPU overhead and what if we have to do the same with
> >> buffer validation? (Yes for some operations with thousands and thousands
> >> of relocations, the user space validation might need to stay).
> >>
> >> Personally, I'm slightly biased towards having DRM resolve the deadlock,
> >> but I think any solution will do as long as the implications and why we
> >> choose a certain solution are totally clear.
> >>
> >> For item 3) above the kernel must have a way to issue a flush when
> >> needed for buffer eviction.
> >> The current implementation also requires the buffer to be completely
> >> flushed before mapping.
> >> Other than that the flushing policy is currently completely up to the
> >> DRM drivers.
> >>
> >> /Thomas
> >
> > I might be saying stupid things, as I don't think I fully understand
> > all the input to this problem. Anyway, here are my thoughts on all this:
> >
> > 1) A first client map never blocks (as in Keith's layout) except on
> >    a fence from the DRM side (point 3 in Keith's layout)
> >
> But is there really a need for this except to avoid the above-mentioned
> deadlock?
> As I'm not too up to date with all the ways the servers and GL
> clients may be using shared buffers,
> I need some enlightenment :). Could we have an example, please?

I think the current main consumer would be compiz or any other
compositor which uses TextureFromPixmap. I really think we might
see further sharing of graphical data among applications; I have
an example of such a use case here at my work, even though it
doesn't use GL at all but another in-house protocol. Another
possible case where such buffer sharing might occur is inside the
same application with two or more GL contexts (I am ready to bet
that we already have examples of such applications somewhere).

> > 4) We have 2 GPU queues:
> >  - one pending queue in which we do all the work that must be
> >    done befor

Re: [RFC] [PATCH] DRM TTM Memory Manager patch

2007-05-04 Thread Thomas Hellström
Jerome Glisse wrote:
> On 5/4/07, Thomas Hellström <[EMAIL PROTECTED]> wrote:
>> Jerome Glisse wrote:
>> > On 5/4/07, Thomas Hellström <[EMAIL PROTECTED]> wrote:
>> >> Keith Packard wrote:
>> >> > On Thu, 2007-05-03 at 01:01 +0200, Thomas Hellström wrote:
>> >> >
>> >> >
>> >> >> It might be possible to find schemes that work around this. One way
>> >> >> could possibly be to have a buffer mapping and validation order for
>> >> >> shared buffers.
>> >> >>
>> >> >
>> >> > If mapping never blocks on anything other than the fence, then there
>> >> > isn't any deadlock possibility. What this says is that ordering of
>> >> > rendering between clients is *not DRMs problem*. I think that's a good
>> >> > solution though; I want to let multiple apps work on DRM-able memory
>> >> > with their own CPU without contention.
>> >> >
>> >> > I don't recall if Eric laid out the proposed rules, but:
>> >> >
>> >> >  1) Map never blocks on map. Clients interested in dealing with this
>> >> > are on their own.
>> >> >
>> >> >  2) Submit blocks on map. You must unmap all buffers before submitting
>> >> > them. Doing the relocations in the kernel makes this all possible.
>> >> >
>> >> >  3) Map blocks on the fence from submit. We can play with pending the
>> >> > flush until the app asks for the buffer back, or we can play with
>> >> > figuring out when flushes are useful automatically. Doesn't matter
>> >> > if the policy is in the kernel.
>> >> >
>> >> > I'm interested in making deadlock avoidance trivial and eliminating any
>> >> > map-map contention.
>> >> >
>> >> >
>> >> It's rare to have two clients access the same buffer at the same time.
>> >> In what situation will this occur?
>> >>
>> >> If we think of map / unmap and validation / fence as taking a buffer
>> >> mutex either for the CPU or for the GPU, that's the way the
>> >> implementation works today. The CPU side of the mutex should IIRC be
>> >> per-client recursive. OTOH, the TTM implementation won't stop the CPU
>> >> from accessing the buffer when it is unmapped, but then you're on your
>> >> own. "Mutexes" need to be taken in the correct order, otherwise a
>> >> deadlock will occur, and GL will, as outlined in Eric's illustration,
>> >> more or less encourage us to take buffers in the "incorrect" order.
>> >>
>> >> In essence what you propose is to eliminate the deadlock problem by just
>> >> avoiding taking the buffer mutex unless we know the GPU has it. I see
>> >> two problems with this:
>> >>
>> >> * It will encourage different DRI clients to simultaneously access
>> >>   the same buffer.
>> >> * Inter-client and GPU data coherence can be guaranteed if we issue
>> >>   a mb() / write-combining flush with the unmap operation (which,
>> >>   BTW, I'm not sure is done today). Otherwise it is up to the
>> >>   clients, and very easy to forget.
>> >>
>> >> I'm a bit afraid we might in the future regret taking the easy way out?
>> >>
>> >> OTOH, letting DRM resolve the deadlock by unmapping and remapping shared
>> >> buffers in the correct order might not be the best one either. It will
>> >> certainly mean some CPU overhead and what if we have to do the same with
>> >> buffer validation? (Yes for some operations with thousands and thousands
>> >> of relocations, the user space validation might need to stay).
>> >>
>> >> Personally, I'm slightly biased towards having DRM resolve the deadlock,
>> >> but I think any solution will do as long as the implications and why we
>> >> choose a certain solution are totally clear.
>> >>
>> >> For item 3) above the kernel must have a way to issue a flush when
>> >> needed for buffer eviction.
>> >> The current implementation also requires the buffer to be completely
>> >> flushed before mapping.
>> >> Other than that the flushing policy is currently completely up to the
>> >> DRM drivers.
>> >>
>> >> /Thomas
>> >
>> > I might be saying stupid things, as I don't think I fully understand
>> > all the input to this problem. Anyway, here are my thoughts on all this:
>> >
>> > 1) A first client map never blocks (as in Keith's layout) except on
>> >    a fence from the DRM side (point 3 in Keith's layout)
>> >
>> But is there really a need for this except to avoid the above-mentioned
>> deadlock?
>> As I'm not too up to date with all the ways the servers and GL
>> clients may be using shared buffers,
>> I need some enlightenment :). Could we have an example, please?
>
> I think the current main consumer would be compiz or any other
> compositor which uses TextureFromPixmap. I really think we might
> see further sharing of graphical data among applications; I have
> an example of such a use case here at my work, even though it
> doesn't use GL at all but another in-house protocol. Another
> possible case where such buffer sharing might occur is inside the
> same application with two or more GL contexts (I am

Re: [RFC] [PATCH] DRM TTM Memory Manager patch

2007-05-04 Thread Jerome Glisse
On 5/4/07, Thomas Hellström <[EMAIL PROTECTED]> wrote:
> I was actually referring to an example where two clients need to have a
> buffer mapped and access it at exactly the same time.
> If there is such a situation, we have no other choice than to drop the
> buffer locking on map. If there isn't we can at least consider other
> alternatives that resolve the deadlock issue but that also will help
> clients synchronize and keep data coherent.
>
> /Thomas

One might be a texture where a portion is updated by one thread
and another portion updated by another one; I believe the application
will know better than us whether such concurrent access will conflict
or not. If the two threads access different pixels, it makes sense to
let them work on the texture together at the same time. If they are
writing to the same pixels, then they will have to sync between
themselves so they don't do something stupid.

My point is that user space will know better whether sync is needed
and how to sync access to the same buffer. Moreover, we can still
add a locking mechanism in user space (in libdrm for instance).

There are very likely other use cases for such concurrent access
which I can't think of right now.

best,
Jerome Glisse
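What such a user-space mechanism could look like, as a minimal sketch
with hypothetical names (nothing like this is claimed to exist in
libdrm today):

#include <pthread.h>

/* User-space wrapper around a shared buffer. The kernel map itself
 * stays non-blocking; this lock is purely application policy, used
 * only when two writers might touch the same pixels. */
struct shared_buf {
    pthread_mutex_t lock;
    void *cpu_ptr;            /* from the (non-blocking) kernel map */
};

void *shared_buf_lock(struct shared_buf *b)
{
    pthread_mutex_lock(&b->lock);
    return b->cpu_ptr;
}

void shared_buf_unlock(struct shared_buf *b)
{
    pthread_mutex_unlock(&b->lock);
}

Two threads updating disjoint portions of a texture would simply skip
these calls and coordinate only at submission time, which is exactly
the flexibility argued for above.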



[Bug 10855] New: on Intel 945G, (beryl or compiz) + glxgears = DRM_I830_CMDBUFFER: -22

2007-05-04 Thread bugzilla-daemon
http://bugs.freedesktop.org/show_bug.cgi?id=10855

       Summary: on Intel 945G, (beryl or compiz) + glxgears =
                DRM_I830_CMDBUFFER: -22
       Product: Mesa
       Version: 6.5
      Platform: x86 (IA32)
    OS/Version: Linux (All)
        Status: NEW
      Severity: minor
      Priority: medium
     Component: Drivers/DRI/i915
    AssignedTo: dri-devel@lists.sourceforge.net
    ReportedBy: [EMAIL PROTECTED]


Hello,

with beryl or compiz activated, moving the glxgears window to the
next desktop with the mouse crashes glxgears with the error
DRM_I830_CMDBUFFER: -22. I get this error with blender too, when I
rotate the desktop cube. Another way to get this error is launching
"wine blender_for_windows".

"dmesg | tail -2" gives this:
[ 9031.808089] [drm:i915_emit_box] *ERROR* Bad box 65407,301..171,601
[ 9031.808099] [drm:i915_cmdbuffer] *ERROR* i915_dispatch_cmdbuffer failed

I've seen a (maybe) similar bug for the 965G chipset, fixed in mesa 6.5.2 
(http://lists.freedesktop.org/archives/xorg/2006-November/019943.html ; 
http://lists.freedesktop.org/archives/xorg/2007-January/020826.html ;  
http://bugs.debian.org/394311).

I have an Intel 945G chipset and I am on Ubuntu Feisty, with mesa
6.5.2-3ubuntu7, xserver-xorg-video-intel 2.0.0-1feisty1 and xserver-xorg-core
1.3.0.0.dfsg-1feisty1 (Ross Burton packages).
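-22 is -EINVAL, and a box of "65407,301..171,601" looks like x1 > x2:
65407 is 0xFF7F, i.e. a small negative window coordinate (-129, a
window partly dragged off-screen) stored as unsigned. A sketch of the
kind of sanity check that produces this error (illustrative only, not
the exact driver code):

#include <errno.h>

struct drm_clip_rect {
    unsigned short x1, y1, x2, y2;
};

static int check_box(const struct drm_clip_rect *box)
{
    /* -129 stored as unsigned short becomes 65407, so x2 <= x1 here
     * and the whole command buffer is refused. */
    if (box->x2 <= box->x1 || box->y2 <= box->y1)
        return -EINVAL;    /* the -22 seen in the error message */
    return 0;
}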





Re: GARTSize option not documented on radeon and other problems

2007-05-04 Thread Oliver McFadden
On 5/4/07, Jerome Glisse <[EMAIL PROTECTED]> wrote:
> There was a typo in my mail: I meant lock, not lockup,
> when I was talking about apps sending data to the GPU.
> And if multiple instances of glxgears succeed in locking
> up the GPU, it is because you are then sending megabytes
> of vertices to the card, and the resulting high memory
> bandwidth usage triggers the lockup. At least, I believe
> the lockups happen because of that. Sometimes it is only
> a soft lockup where only the GPU dies, but sometimes the
> GPU takes the PCI bus down with it and you get a hard
> lockup.
>
> best,
> Jerome Glisse

I'm wondering if this is some timing initialization issue. I might try to check
this out with airlied's radeontool.

I think this is the current theory as to why it happens.



Re: [RFC] [PATCH] DRM TTM Memory Manager patch

2007-05-04 Thread Keith Packard
On Fri, 2007-05-04 at 10:07 +0200, Thomas Hellström wrote:
>  
> It's rare to have two clients access the same buffer at the same time. 
> In what situation will this occur?

Right, what I'm trying to avoid is having any contention for
applications *not* sharing the same objects. 

If there is any locking for mapping, we can either attempt to define a
locking order, or we can have a single global lock. The former leaves us
prone to deadlocks, the latter eliminates the ability for uncontended
parallel access.

> * It will encourage different DRI clients to simultaneously access
>   the same buffer.

Sure. Separate 'DRI' from 'GL' and this may be a sensible plan. If you
want to prevent this *that's not DRI's problem*.

> * Inter-client and GPU data coherence can be guaranteed if we issue
>   a mb() / write-combining flush with the unmap operation (which,
>   BTW, I'm not sure is done today). Otherwise it is up to the
>   clients, and very easy to forget.

CPU-GPU coherence is ensured by the mutual exclusion between mapping and
submitting. You may either have data available to the CPU or to the GPU.
I think that's a basic requirement for any solution in this space.
Keying the submit and map as to whether writing will occur means that
appropriate flushing and fencing can be automatically applied within the
kernel.

> OTOH, letting DRM resolve the deadlock by unmapping and remapping shared 
> buffers in the correct order might not be the best one either. It will 
> certainly mean some CPU overhead and what if we have to do the same with 
> buffer validation? (Yes for some operations with thousands and thousands 
> of relocations, the user space validation might need to stay).

I do not want to do relocations in user space. I don't see why doing
thousands of these requires moving this operation out of the kernel.

-- 
[EMAIL PROTECTED]
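A sketch of that last point with invented names (every helper here is
assumed): if both map and submit carry a read/write flag, the kernel
has enough information to insert flushes and fence waits only where
they actually matter:

#include <stdbool.h>

struct fence;

struct buffer {
    struct fence *fence;      /* pending GPU work, NULL if idle */
    bool cpu_dirty;           /* CPU wrote since last flush     */
    bool gpu_dirty;           /* GPU wrote since last wait      */
};

extern void fence_wait(struct fence *f);
extern void flush_cpu_writes(struct buffer *b);   /* mb()/WC flush */
extern struct fence *gpu_emit(struct buffer *b);
extern void *cpu_address(struct buffer *b);

enum access { ACC_READ = 1, ACC_WRITE = 2 };

void *map(struct buffer *b, enum access mode)
{
    /* Only block when it matters: reading GPU output, or writing
     * while the GPU may still be using the buffer. */
    if (b->fence && (b->gpu_dirty || (mode & ACC_WRITE)))
        fence_wait(b->fence);
    if (mode & ACC_WRITE)
        b->cpu_dirty = true;
    return cpu_address(b);
}

void submit(struct buffer *b, enum access mode)
{
    if (b->cpu_dirty) {
        flush_cpu_writes(b);  /* applied automatically in the kernel */
        b->cpu_dirty = false;
    }
    b->gpu_dirty = (mode & ACC_WRITE) != 0;
    b->fence = gpu_emit(b);
}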




[Bug 6664] Blank screen using with 945GM chipset

2007-05-04 Thread bugzilla-daemon
http://bugs.freedesktop.org/show_bug.cgi?id=6664


[EMAIL PROTECTED] changed:

           What        |Removed     |Added
----------------------------------------------------------------
             CC        |            |[EMAIL PROTECTED]




--- Comment #3 from [EMAIL PROTECTED]  2007-05-04 08:23 PST ---
In my system it happens only with the AMD64 port; before, I was using
the x86 port and all worked great. I have an EM64T-capable processor
and Debian runs great; all services work fine except xserver-xorg with
the i810 driver. Let me say that the error occurs just when unloading
X: before that I can reach my desktop and use Render, 3D accel, the
Composite extension, Compiz etc., but when I shut down the computer I
get a kernel panic. I set the driver to VESA and it works, but then I
have no DRI, 3D accel, Composite etc.

My Hardware is:

- Celeron D 3.6 GHz EM64T
- ASRock 775i65G
- Intel® 865G Chipset
- Integrated Intel® Extreme Graphics 2 (intel 82865G Graphics)


The error screen message read something like:
CODE**
[] thread_return+0x0/0xe7
[] vsf_write+0xe1/0x174
[] sys_write+0x45/0x6e
[] system_call+0x7e/0x83

Code: 4c 8b 20 48 89 c7 c4 89 63 40 ff 50 08 48 89 e8 89 ea 48 ff
RIP [8028e424] __rcu_process_callbacks+0x115/0x1a8
RSP 
CR2: 80a324c0
<0> Kernel Panic - not syncing: Aiee, killing interrupt handler!
END OF CODE**

I have put the one screenshoot here
http://bp0.blogger.com/_w833IbeIZhI/RjpWdCwgHwI/AA4/h4-Lba1UnCU/s1600-h/i810+crash.jpg


My xorg.conf is:

# /etc/X11/xorg.conf (xorg X Window System server configuration file)
#
# This file was generated by dexconf, the Debian X Configuration tool, using
# values from the debconf database.
#
# Edit this file with caution, and see the /etc/X11/xorg.conf manual page.
# (Type "man /etc/X11/xorg.conf" at the shell prompt.)
#
# This file is automatically updated on xserver-xorg package upgrades *only*
# if it has not been modified since the last upgrade of the xserver-xorg
# package.
#
# If you have edited this file but would like it to be automatically updated
# again, run the following command:
#   sudo dpkg-reconfigure -phigh xserver-xorg

Section "Files"
    FontPath "/usr/share/fonts/X11/misc"
    FontPath "/usr/X11R6/lib/X11/fonts/misc"
    FontPath "/usr/share/fonts/X11/cyrillic"
    FontPath "/usr/X11R6/lib/X11/fonts/cyrillic"
    FontPath "/usr/share/fonts/X11/100dpi/:unscaled"
    FontPath "/usr/X11R6/lib/X11/fonts/100dpi/:unscaled"
    FontPath "/usr/share/fonts/X11/75dpi/:unscaled"
    FontPath "/usr/X11R6/lib/X11/fonts/75dpi/:unscaled"
    FontPath "/usr/share/fonts/X11/Type1"
    FontPath "/usr/X11R6/lib/X11/fonts/Type1"
    FontPath "/usr/share/fonts/X11/100dpi"
    FontPath "/usr/X11R6/lib/X11/fonts/100dpi"
    FontPath "/usr/share/fonts/X11/75dpi"
    FontPath "/usr/X11R6/lib/X11/fonts/75dpi"
    # path to defoma fonts
    FontPath "/var/lib/defoma/x-ttcidfont-conf.d/dirs/TrueType"
EndSection

Section "Module"
    Load "i2c"
    Load "bitmap"
    Load "ddc"
    Load "dri"
    Load "extmod"
    Load "freetype"
    Load "glx"
    Load "int10"
    Load "vbe"
EndSection

Section "InputDevice"
    Identifier "Generic Keyboard"
    Driver     "kbd"
    Option     "CoreKeyboard"
    Option     "XkbRules"  "xorg"
    Option     "XkbModel"  "pc104"
    Option     "XkbLayout" "us"
EndSection

Section "InputDevice"
    Identifier "Configured Mouse"
    Driver     "mouse"
    Option     "CorePointer"
    Option     "Device"   "/dev/input/mice"
    Option     "Protocol" "ImPS/2"
    Option     "Emulate3Buttons" "true"
EndSection

Section "Device"
    Identifier "Intel Corporation 82945G/GZ Integrated Graphics Controller"
    Driver     "vesa" #"i810"  # I have to comment i810 to avoid crash...
    BusID      "PCI:0:2:0"
EndSection

Section "Monitor"
    Identifier "DELL E176FP"
    Option     "DPMS"
EndSection

Section "Screen"
    Identifier   "Default Screen"
    Device       "Intel Corporation 82945G/GZ Integrated Graphics Controller"
    Monitor      "DELL E176FP"
    DefaultDepth 24
    SubSection "Display"
        Depth 1
        Modes "1280x1024" "1152x864" "1024x768" "800x600" "720x400" "640x480"
    EndSubSection
    SubSection "Display"
        Depth 4
        Modes "1280x1024" "1152x864" "1024x768" "800x600" "720x400" "640x480"
    EndSubSection
    SubSection "Display"
        Depth 8
        Modes "1280x1024" "1152x864" "1024x768" "800x600" "720x400" "640x480"
    EndSubSection
    SubSection "Display"
        Depth 15
        Modes "1280x1024" "1152x864" "1024x768" "800x600" "720x400" "640x480"
    EndSubSection
    SubSection "Display"
        Depth 16
        Modes "1280x1024" "1152x864" "1024x768"

Re: [RFC] [PATCH] DRM TTM Memory Manager patch

2007-05-04 Thread Keith Packard
On Fri, 2007-05-04 at 11:40 +0200, Jerome Glisse wrote:

> On a side note, I think this scheme also fits well with GPUs that
> have several contexts and don't need big validation (read:
> nv GPUs).

Yeah, I want to make sure we have a simple model that supports
multi-context hardware while also avoiding failing over badly when we
have more users than hardware contexts.

-- 
[EMAIL PROTECTED]




Re: [RFC] [PATCH] DRM TTM Memory Manager patch

2007-05-04 Thread Keith Packard
On Fri, 2007-05-04 at 14:32 +0200, Thomas Hellström wrote:
>  If there isn't we can at least consider other 
> alternatives that resolve the deadlock issue but that also will help 
> clients synchronize and keep data coherent.

If clients want coherence, they're welcome to implement their own
locking. Let's make sure we separate the semantics required for GPU
operation from semantics required by DRM users.

-- 
[EMAIL PROTECTED]




Re: [RFC] [PATCH] DRM TTM Memory Manager patch

2007-05-04 Thread Keith Whitwell
Keith Packard wrote:

>> OTOH, letting DRM resolve the deadlock by unmapping and remapping shared 
>> buffers in the correct order might not be the best one either. It will 
>> certainly mean some CPU overhead and what if we have to do the same with 
>> buffer validation? (Yes for some operations with thousands and thousands 
>> of relocations, the user space validation might need to stay).
> 
> I do not want to do relocations in user space. I don't see why doing
> thousands of these requires moving this operation out of the kernel.

Agreed.  The original conception for this was to have validation plus 
relocations be a single operation, and by implication done in the 
kernel.  Although the code as it stands doesn't do this, I think that 
should still be the approach.

The issue with thousands of relocations from my point of view isn't a 
problem - that's just a matter of getting appropriate data structures in 
place.

Where things get a bit more interesting is with hardware where you are 
required to submit a whole scene's worth of rendering before the 
hardware will kick off, and with the expectation that the texture 
placement will remain unchanged throughout the scene.  This is a very 
easy way to hit any upper limit on texture memory - the AGP aperture 
size in the case of integrated chipsets.

That's a special case of the general problem of what to do when a 
client submits a validation list that can't be satisfied.  Failing to 
render isn't really an option; either the client or the memory manager 
has to prevent that from happening in the first place, or there must be 
some mechanism for chopping up the dma buffer into segments which are 
satisfiable...  and I can't see an absolutely reliable way to do 
either.

I think that any memory manager we can propose will have flaws of some 
sort: either it is prone to failures that aren't really allowed by the 
API, is excessively complex, or is somewhat pessimistic.  We've chosen a 
design that is simple and optimistic, but that can potentially say "no" 
unexpectedly.  It would then be up to the client to somehow pick up the 
pieces and potentially submit a smaller list.  So far we just haven't 
touched on how that might work.
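
To make that concrete, a client-side fallback could look roughly like
the sketch below; struct bo and submit_validation_list() are stand-ins
for whatever the final interface provides, not existing DRM calls, and
it assumes the command stream can actually be cut at segment
boundaries:

#include <errno.h>

struct bo;                                /* opaque buffer object */
extern int submit_validation_list(struct bo **list, int n);

/* If the kernel says "no" to a validation list, split the batch in
 * half and retry each part; once a single buffer still won't fit,
 * give up and let the caller fall back to software. */
static int submit_with_split(struct bo **list, int n)
{
        if (submit_validation_list(list, n) == 0)
                return 0;
        if (n <= 1)
                return -ENOSPC;
        if (submit_with_split(list, n / 2) != 0)
                return -ENOSPC;
        return submit_with_split(list + n / 2, n - n / 2);
}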

The way to get around this is to mandate that hardware supports paged 
virtual memory...  But that seems to be a difficult trick.

Keith



Re: [RFC] [PATCH] DRM TTM Memory Manager patch

2007-05-04 Thread Keith Packard
On Fri, 2007-05-04 at 16:57 +0100, Keith Whitwell wrote:

> That's a special case of the general problem of what to do when a 
> client submits a validation list that can't be satisfied.  Failing to 
> render isn't really an option; either the client or the memory manager 
> has to prevent that from happening in the first place, or there must be 
> some mechanism for chopping up the dma buffer into segments which are 
> satisfiable...  and I can't see an absolutely reliable way to do 
> either.

I think we must return an error from the kernel and let user mode sort
it out; potentially by breaking up the operation into smaller pieces, or
(ick), simply falling back to software. Eliminating per-submit flushing
will even make this reasonably efficient as we remap the GTT as objects
are used. I don't think we want to support automatic partitioning of the
operation in the kernel; punting that step to user mode seems like a
sensible option.
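
As a sketch of the kernel side of that contract (structure and field
names illustrative, not the actual TTM interfaces), the up-front check
can be as dumb as summing the list and refusing cleanly, leaving the
recovery policy entirely to user space:

#include <linux/errno.h>

struct drm_bo {
        unsigned long size;               /* backing size in bytes */
        /* ... */
};

/* Refuse a validation list outright if its total size can never fit
 * in the aperture; the submitting client then splits or falls back. */
static int validate_list_fits(struct drm_bo **list, int count,
                              unsigned long aperture_size)
{
        unsigned long total = 0;
        int i;

        for (i = 0; i < count; i++)
                total += list[i]->size;

        return total <= aperture_size ? 0 : -ENOSPC;
}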

Certainly presenting all of the objects to the kernel atomically will
permit it to succeed if the device can possibly perform the operation;
ejecting all existing objects and reloading with precisely the objects
proposed by the application can be done, and is even inexpensive on UMA
hardware.

> The way to get around this is to mandate that hardware supports paged 
> virtual memory...  But that seems to be a difficult trick.

Yeah, especially as we don't currently have any examples in our
environment.

-- 
[EMAIL PROTECTED]




[Bug 8427] New: Kernel Panic on shutting down with Xserver using i810 driver

2007-05-04 Thread bugme-daemon
http://bugzilla.kernel.org/show_bug.cgi?id=8427

   Summary: Kernel Panic on shutting down with Xserver using i810 driver
Kernel Version: 2.6.18-4-amd64
Status: NEW
  Severity: normal
 Owner: [EMAIL PROTECTED]
 Submitter: [EMAIL PROTECTED]


Distribution: Debian Etch and Fedora Core Test 3

Hardware Environment:
- ASRock 775i65G
- Celeron D 3.6 GHz EM64T
- Intel® 865G Chipset
- Integrated Intel® Extreme Graphics 2 (intel 82865G Graphics)

Software Environment: Xserver 1:7.1.0-16

Problem Description:
It happens only with AMD64 port, before I was using x86 port and all work great,
I have an EMT64 capable processor and runs debian great, all services work fine
except xserver-xorg with driver i810, let me say that the error is just when
unloading X, before that I can reach my desktop, use the Render, 3D accel,
Composite Extension, Compiz etc but when Shuts down the Computer I get a Kernel
Panic, I think to set the driver to VESA and work but haven´t DRI, 3D Accel,
Composite etc.

Steps to reproduce:
In Fedora Core: start the Linux box with Red Hat Graphical Boot enabled and the
system crashes before reaching the GNOME or KDE desktop; if Red Hat Graphical
Boot is not enabled, it reaches the desktop, but the system crashes on shutdown.

In Debian Etch the problem occurs when shutting down the system.

A screenshot can be viewed here:

http://bp0.blogger.com/_w833IbeIZhI/RjpWdCwgHwI/AA4/h4-Lba1UnCU/s1600-h/i810+crash.jpg

--- You are receiving this mail because: ---
You are the assignee for the bug, or are watching the assignee.



[PATCH] make radeons fire fewer vblank interrupts

2007-05-04 Thread Jesse Barnes
In playing around yesterday, we found that some drivers will 
unnecessarily enable interrupts for vblank events.  Since these tend to 
happen frequently (60+ Hz), they'll cause your CPU to wake up a lot, 
which will waste power if they're not really in use.

This patch hacks the radeon driver to only enable vblank interrupts when 
the user is waiting for one, rather than at IRQ setup time.  I couldn't 
find any code in the DDX that wanted vblank support, so I suppose the 
real users are in the Mesa driver somewhere; I haven't tested it beyond 
seeing that my interrupt frequency really does decrease.
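
For reference, user space reaches this wait path through libdrm's
drmWaitVBlank(); here is a minimal sketch, with error handling omitted
and the device node assumed:

#include <fcntl.h>
#include <xf86drm.h>

int main(void)
{
        int fd = open("/dev/dri/card0", O_RDWR);
        drmVBlank vbl;

        vbl.request.type = DRM_VBLANK_RELATIVE; /* relative to now... */
        vbl.request.sequence = 1;               /* ...wait one vblank */
        drmWaitVBlank(fd, &vbl);  /* with this patch, enables the IRQ */
        return 0;
}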

Comments?

Thanks,
Jesse
diff --git a/drivers/char/drm/radeon_irq.c b/drivers/char/drm/radeon_irq.c
index 3ff0baa..71f1919 100644
--- a/drivers/char/drm/radeon_irq.c
+++ b/drivers/char/drm/radeon_irq.c
@@ -147,9 +147,14 @@ int radeon_driver_vblank_wait(drm_device_t * dev, unsigned int *sequence)
 	 * by about a day rather than she wants to wait for years
 	 * using vertical blanks...
 	 */
+	/* Turn on VBL ints */
+	RADEON_WRITE(RADEON_GEN_INT_CNTL,
+		 RADEON_CRTC_VBLANK_MASK | RADEON_SW_INT_ENABLE);
 	DRM_WAIT_ON(ret, dev->vbl_queue, 3 * DRM_HZ,
 		(((cur_vblank = atomic_read(&dev->vbl_received))
 		  - *sequence) <= (1 << 23)));
+	/* Go back to just SW interrupts */
+	RADEON_WRITE(RADEON_GEN_INT_CNTL, RADEON_SW_INT_ENABLE);
 
 	*sequence = cur_vblank;
 
@@ -227,9 +232,8 @@ void radeon_driver_irq_postinstall(drm_device_t * dev)
 	atomic_set(&dev_priv->swi_emitted, 0);
 	DRM_INIT_WAITQUEUE(&dev_priv->swi_queue);
 
-	/* Turn on SW and VBL ints */
-	RADEON_WRITE(RADEON_GEN_INT_CNTL,
-		 RADEON_CRTC_VBLANK_MASK | RADEON_SW_INT_ENABLE);
+	/* Enable SW interrupts */
+	RADEON_WRITE(RADEON_GEN_INT_CNTL, RADEON_SW_INT_ENABLE);
 }
 
 void radeon_driver_irq_uninstall(drm_device_t * dev)


[Fwd: [PATCH -mm] working 3D/DRI intel-agp.ko resume for i815 chip; Intel chipset testers wanted! (was: Re: intel-agp PM experiences ...)]

2007-05-04 Thread Sergio Monteiro Basto
Hi, I'm forwarding this message to the dri-devel mailing list, where you
might find more testers of the i815 DRI driver.
I hope I haven't made a loop :)

 Forwarded Message 
From: Andreas Mohr <[EMAIL PROTECTED]>
To: Pavel Machek <[EMAIL PROTECTED]>
Cc: Andrew Morton <[EMAIL PROTECTED]>, [EMAIL PROTECTED],
[EMAIL PROTECTED], Matthew Garrett <[EMAIL PROTECTED]>,
kernel list <[EMAIL PROTECTED]>, [EMAIL PROTECTED]
Subject: [PATCH -mm] working 3D/DRI intel-agp.ko resume for i815 chip;
Intel chipset testers wanted! (was: Re: intel-agp PM experiences ...)
Date:   Tue, 1 May 2007 16:59:47 +0200

Hi,

On Thu, Jan 18, 2007 at 11:16:51PM +, Pavel Machek wrote:
> Hi!
> 
> > > > Especially the PCI video_state trick finally got me a working resume on
> > > > 2.6.19-ck2 r128 Rage Mobility M4 AGP *WITH*(!) fully enabled and working
> > > > (and keeping working!) DRI (3D).
> > > 
> > > Can we get whitelist entry for suspend.sf.net? s2ram from there can do
> > > all the tricks you described, one letter per trick :-). We even got
> > > PCI saving lately.
> > 
> > Whitelist? Let me blacklist it all the way to Timbuktu instead!
> 
> > I've been doing more testing, and X never managed to come back to working
> > state without some of my couple intel-agp changes:
> > - a proper suspend method, doing a proper pci_save_state()
> >   or improved equivalent
> > - a missing resume check for my i815 chipset
> > - global cache flush in _configure
> > - restoring AGP bridge PCI config space
> 
> Can you post a patch?

Took way longer than I'd have wanted it to (nice daughter and stuff ;),
but here it is.

- add a .suspend handler and pci_set_power_state() calls
- add an i815-specific function agp_i815_remember_state() to remember important
  i815 register values
- add a generic DEBUG_AGP_PM framework which will allow people to resume properly
  and help identify which registers need attention
- add an obvious and detailed log message to make people sit up and take notice
  of long-standing AGP resume issues
- spelling fixes

Patch against 2.6.21-rc7-mm2; my Inspiron 8000 (i815 with Radeon AGP card,
internal Intel VGA unit NOT active) resumes fine with either the
i815-specific register saving or the generic DEBUG_AGP_PM mechanism enabled.
(Of course my notebook needs a vbetool post and manual saving of the video
card's PCI space, too, but even when doing all this I still had X.org lockups
before, whenever DRI/3D was enabled.)

After resume I'm now still able to run both glxgears and glxinfo without
anomalies.

Right now all I care about is that this gets into mainline relatively soon,
since I'm rather certain that many other machines suffer from similar
AGP resume lockup issues that could be debugged this way (e.g. some Thinkpads,
as witnessed accidentally via IRC chats, and from the well-known "don't enable
DRI, that will lock up on resume!" chants).
Yes, this code is a kludge and somewhat far from a nicely generic
extended PCI space resume framework, but we've spent almost 10 (TEN!) years
without anything even remotely resembling a working kludge for
AGP suspend/resume in combination with DRI, so...

Feel free to offer thoughts on how this missing generic extended PCI space
restore functionality could be implemented, to be used by intel-agp and
various other drivers. No promise that it will be me who implements that,
though ;)
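
One possible shape for such a generic helper, just to seed the
discussion (a sketch only, not part of the patch): pci_save_state() in
mainline covers essentially only the standard 64-byte header, so this
simply walks all 256 bytes of config space.

#include <linux/pci.h>

static u32 ext_cfg_space[256 / 4];

/* Save/restore the full 256-byte PCI config space instead of just the
 * standard header that pci_save_state() handles. */
static void agp_save_ext_config(struct pci_dev *pdev)
{
        int i;

        for (i = 0; i < 256; i += 4)
                pci_read_config_dword(pdev, i, &ext_cfg_space[i / 4]);
}

static void agp_restore_ext_config(struct pci_dev *pdev)
{
        int i;

        for (i = 0; i < 256; i += 4)
                pci_write_config_dword(pdev, i, ext_cfg_space[i / 4]);
}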

> Whitelist entry would still be welcome.

OK, I'll work on this next.


Thanks!

Signed-off-by: Andreas Mohr <[EMAIL PROTECTED]>


--- linux-2.6.21-rc7-mm2.orig/drivers/char/agp/intel-agp.c  2007-05-10 14:52:26.0 +0200
+++ linux-2.6.21-rc7-mm2/drivers/char/agp/intel-agp.c   2007-05-10 14:31:48.0 +0200
@@ -31,9 +31,16 @@
 extern int agp_memory_reserved;
 

-/* Intel 815 register */
-#define INTEL_815_APCONT   0x51
-#define INTEL_815_ATTBASE_MASK ~0x1FFF
+/* Intel i815 registers, see Intel spec #29068801 */
+#define I815_GMCHCFG       0x50
+#define I815_APCONT        0x51
+#define I815_UNKNOWN_80    0x80
+#define I815_ATTBASE_MASK  ~0x1FFF
+#define I815_SM_RCOMP      0x98 /* Sys Mem R Compensation Ctrl */
+#define I815_SM            0x9c /* System Memory Control Reg */
+#define I815_AGPCMD        0xa8 /* AGP Command Register */
+#define I815_ERRSTS        0xc8 /* undocumented in i815 spec; since this one is modified on resume and many other related chipsets have it, I assume it *is* ERRSTS */
+#define I815_UNKNOWN_E8    0xe8
 
 /* Intel i820 registers */
 #define INTEL_I820_RDCR0x51
@@ -664,7 +671,7 @@
if ((pg_start + mem->page_count) > num_entries)
goto out_err;
 
-   /* The i830 can't check the GTT for entries since its read only,
+   /* The i830 can't check the GTT for entries since it's read-only,
 * depend on the caller to make the correct offset decisions.
 */
 
@@ -787,7 +794,7 @@
if ((pg_start + mem->page_count) > num_entries)
goto out_