Re: Radeon DRM GART mapping bogosity

2005-05-03 Thread Benjamin Herrenschmidt
On Tue, 2005-05-03 at 15:24 +1000, Benjamin Herrenschmidt wrote: 
> Hi !
> 
> The radeon DRM has some interesting bug that paul and I discovered to
> cause all sort of problems like crashing the machine on suspend/resume
> (go figure ...) etc...
> 
>   dev_priv->gart_vm_start = dev_priv->fb_location
>   + RADEON_READ( RADEON_CONFIG_APER_SIZE );
> 
> The "aim" of this code is to setup the card memory map so that the GART
> sits just after the framebuffer. However, CONFIG_APER_SIZE is _not_ a
> good indication of the framebuffer size.
>
> CONFIG_APER_SIZE is only the size of the visible aperture on the PCI
> bus. Some setups (like some Macs for example) can use the dual split
> aperture mechanism, in which case CONFIG_APER_SIZE is only half of the
> VRAM. I can imagine cards overloaded with memory to have more vram than
> is directly accessible from PCI in other circumstances too (though the
> split aperture case is a real world scenario we encountered on paul's
> laptop at least).
> 
> The result is we end up putting the GART right in the middle of VRAM in
> card's space. The card's memory controller at best does nothing of it,
> at worst blows up in funny way when the engine is reset, or in some
> case, when re-initializing from suspend/resume cycle.

   .../...

Ok, I'm cross posting here because X.org is doing it wrong too. On R300,
for some reason I don't fully understand, it just goes back to the "old"
way of putting the FB at 0 (though it does properly use CONFIG_MEMSIZE
to set the size part of MC_FB_LOCATION), but for non R300, it does try
to put the framebuffer at the same address as the BAR ... and then tries
to use CONFIG_APER_SIZE for the size part of MC_FB_LOCATION, which is
incorrect.

The result is that on !r300, it won't crash since X.Org and DRI won't
create an overlapping mapping, but they won't be able to use all of the VRAM
on cards where CONFIG_APER_SIZE < CONFIG_MEMSIZE, and on r300, it
will possibly create overlapping mappings and cause all sorts of
trouble if you hit the above case.
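
To make the overlap concrete, here is a minimal C sketch. It is not driver code: the helper names `ranges_overlap` and `old_gart_overlaps_vram` are made up for illustration. It only mirrors the old `fb_location + CONFIG_APER_SIZE` computation and checks whether the resulting GART range lands inside VRAM as sized by CONFIG_MEMSIZE.

```c
#include <stdint.h>

/* Hypothetical helper: do [a_start, a_start+a_size) and
 * [b_start, b_start+b_size) share at least one address?
 * Sums are done in 64 bits so they cannot wrap. */
static int ranges_overlap(uint64_t a_start, uint64_t a_size,
                          uint64_t b_start, uint64_t b_size)
{
    return a_start < b_start + b_size && b_start < a_start + a_size;
}

/* Hypothetical helper: place the GART the way the old code did
 * (right after the PCI-visible aperture) and report whether it
 * collides with the real VRAM range in card space. */
static int old_gart_overlaps_vram(uint32_t fb_location, uint32_t aper_size,
                                  uint32_t mem_size, uint32_t gart_size)
{
    uint64_t gart_start = (uint64_t)fb_location + aper_size;

    return ranges_overlap(fb_location, mem_size, gart_start, gart_size);
}
```

With a 64MB aperture on a 128MB card the old placement puts the GART at 64MB, in the middle of VRAM; when the aperture equals the VRAM size there is no collision.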

Another problem I haven't checked is that we should make sure, before
changing MC_FB_LOCATION, to actually disable scanning of memory by both
CRTCs (and possibly disable tiling, I had some problems in radeonfb due
to mac firmware enabling tiling, that would explode when playing with
MC_FB_LOCATION).

In the meantime, here's a patch against current Linus "git" that I'm
tempted to push asap so that at least 2.6.12 avoids the overlap
problem, which causes random misbehaviour and lockups. The
"issue" here is that even if you don't have an r300-friendly DRM, a new
enough X.org will still try to initialize those things, even if it
ultimately fails, and thus will screw up the mapping.

Index: linux-work/drivers/char/drm/radeon_drv.h
===
--- linux-work.orig/drivers/char/drm/radeon_drv.h	2005-05-02 10:48:09.0 +1000
+++ linux-work/drivers/char/drm/radeon_drv.h	2005-05-03 17:51:55.0 +1000
@@ -346,6 +346,7 @@
 #define RADEON_CLOCK_CNTL_DATA          0x000c
 #       define RADEON_PLL_WR_EN                 (1 << 7)
 #define RADEON_CLOCK_CNTL_INDEX         0x0008
+#define RADEON_CONFIG_MEMSIZE           0x00f8
 #define RADEON_CONFIG_APER_SIZE         0x0108
 #define RADEON_CRTC_OFFSET              0x0224
 #define RADEON_CRTC_OFFSET_CNTL         0x0228
Index: linux-work/drivers/char/drm/radeon_cp.c
===
--- linux-work.orig/drivers/char/drm/radeon_cp.c	2005-05-02 10:48:09.0 +1000
+++ linux-work/drivers/char/drm/radeon_cp.c	2005-05-03 17:49:25.0 +1000
@@ -1269,6 +1269,7 @@
 {
 	drm_radeon_private_t *dev_priv = dev->dev_private;;
 	DRM_DEBUG( "\n" );
+	u32 gart_loc;
 
 	dev_priv->is_pci = init->is_pci;
 
@@ -1476,8 +1477,12 @@
 
 
 	dev_priv->gart_size = init->gart_size;
-	dev_priv->gart_vm_start = dev_priv->fb_location
-				+ RADEON_READ( RADEON_CONFIG_APER_SIZE );
+	gart_loc = dev_priv->fb_location + RADEON_READ( RADEON_CONFIG_MEMSIZE );
+	/* overflow ? */
+	if ((gart_loc + dev_priv->gart_size) < dev_priv->fb_location)
+		gart_loc = dev_priv->fb_location - dev_priv->gart_size;
+
+	dev_priv->gart_vm_start = gart_loc;
 
 #if __OS_HAS_AGP
 	if ( !dev_priv->is_pci )
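
The intent of the second hunk can be restated as a standalone sketch. `pick_gart_start` is a hypothetical helper, not a DRM function. Unlike the patch, it does the arithmetic in 64 bits so the overflow test does not depend on 32-bit wraparound, and it assumes `fb_location >= gart_size` whenever the fallback is taken.

```c
#include <stdint.h>

/* Hypothetical helper: choose the GART start in card address space.
 * Preferred spot is right after VRAM (fb_location + vram_size). If the
 * GART would then run past the top of the 32-bit space, fall back to
 * placing it just before the framebuffer. */
static uint32_t pick_gart_start(uint32_t fb_location, uint32_t vram_size,
                                uint32_t gart_size)
{
    uint64_t gart_end = (uint64_t)fb_location + vram_size + gart_size;

    if (gart_end <= (1ULL << 32))
        return fb_location + vram_size;

    /* Assumption: fb_location >= gart_size, so this cannot underflow. */
    return fb_location - gart_size;
}
```

For example, a 128MB card at 0xD0000000 with a 64MB GART gets the GART at 0xD8000000, while a 256MB card at 0xF0000000 has no room above it and gets the GART at 0xEC000000.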




---
This SF.Net email is sponsored by: NEC IT Guy Games.
Get your fingers limbered up and give it your best shot. 4 great events, 4
opportunities to win big! Highest score wins.NEC IT Guy Games. Play to
win an NEC 61 plasma display. Visit http://www.necitguy.com/?r=20
--
___
Dri-devel mailing list
Dri-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dri-devel


Re: Radeon DRM GART mapping bogosity

2005-05-03 Thread Benjamin Herrenschmidt
On Tue, 2005-05-03 at 10:09 +0200, Jerome Glisse wrote:

> Has i still doesn't understand the big pictures of video drivers, i
> was wondering if this could have an impact on bytes swapping on r300.
> Thus the problem of bit blit we have on r300 & X driver. I don't think
> so but as i don't well understand all this...

Well, it depends which byteswapping :) (Though paulus did some hacks for
that recently iirc).

Here is a short explanation of the memory mapping of a radeon:

 - Card view. This is the view of memory from the GPU's point of view
(things you put in OFFSET registers, etc.). The video RAM is mapped by
MC_FB_LOCATION, which indicates location and size, and the AGP space by
MC_AGP_LOCATION. I'm deliberately ignoring the case of "PCI GART", which
can be considered similar to AGP in this discussion. These affect the
MC (Memory Controller), so they define where an access from the engine
or the CRTCs ends up. I'm fairly sure that if you provide an address
outside of those 2 ranges, the card does a normal PCI bus master cycle
to that address. CONFIG_MEMSIZE is usually the total VRAM size. I'm not
100% sure how much of that register is actually _used_ by the HW, but
it's generally safe to assume it contains a correct value, except on
some M6 chips where we need to fix it up when it's 0 (oops, the patch I
posted didn't do that !)

 - Bus view (that is, the view from outside of the card, like the CPU). PCI
BAR 0 exposes a PCI region of size CONFIG_APER_SIZE * 2 (that is, 2
"apertures"). The actual value of CONFIG_APER_SIZE is defined by straps
on the video board or motherboard. Each aperture is 1/2 of the PCI
space. The way they actually map to video memory, though, depends on the
setting of the bit HOST_PATH_CNTL::HDP_APER_CNTL:

   0 : Both apertures map to the same area of video memory, which starts
from the beginning of video RAM
   1 : Apertures map contiguous portions of memory. That is, if
CONFIG_APER_SIZE is 64MB, then an access to aperture 0 will access vram
from 0 to 64MB-1 while an access to aperture 1 will access vram from 64MB
to 128MB-1.

Each aperture can have a different swapper setting. This swapper
setting, though, afaik, only has an effect on data written by the host
(but then, I'm not totally sure what engine host data blits are
supposed to do).

Now, the setting above has to be chosen as intelligently as you can,
based on 1) whether you need 2 apertures with different swapper settings
(typical of BE machines) or not, 2) what your CONFIG_APER_SIZE
strapping is vs. how much VRAM you have ... You can see both kinds of
setups on Mac cards.
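
The two HDP_APER_CNTL cases above can be modelled with a few lines of C. This is only a sketch of the mapping as described in this mail; `bar0_to_vram_offset` is a made-up name, and it assumes the offset passed in lies inside BAR 0, i.e. below 2 * CONFIG_APER_SIZE.

```c
#include <stdint.h>

/* Hypothetical model: translate an offset inside PCI BAR 0 into an
 * offset inside video RAM.
 *   hdp_aper_cntl == 0: both apertures alias the start of VRAM.
 *   hdp_aper_cntl == 1: aperture 1 continues where aperture 0 ends.
 * Assumes bar_offset < 2 * aper_size and aper_size != 0. */
static uint64_t bar0_to_vram_offset(uint64_t bar_offset, uint64_t aper_size,
                                    int hdp_aper_cntl)
{
    uint64_t aperture = bar_offset / aper_size;  /* 0 or 1 */
    uint64_t within   = bar_offset % aper_size;  /* offset inside it */

    if (hdp_aper_cntl)
        return aperture * aper_size + within;    /* contiguous mapping */

    return within;                               /* aliased mapping */
}
```

With a 64MB aperture, an access 0x1000 bytes into aperture 1 hits VRAM offset 0x1000 in the aliased mode and 64MB + 0x1000 in the contiguous mode.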

Ben.






Re: r300 fixed pipeline design

2005-05-03 Thread Keith Whitwell
Aapo Tahkola wrote:
> On Thu, 21 Apr 2005 09:57:48 -0400 (EDT)
> Vladimir Dergachev <[EMAIL PROTECTED]> wrote:
>
>>On Thu, 21 Apr 2005, Aapo Tahkola wrote:
>>
>>>On Wed, 23 Feb 2005 15:03:38 -0500 (EST)
>>>Vladimir Dergachev <[EMAIL PROTECTED]> wrote:
>>>
>>>>  With regard to state switching, it might be worth it to simply hash
>>>>various configuration (fog on /fog off, etc) and just upload state
>>>>difference on such changes.
>>>
>>>Could work reasonably well. Problem with hashing all programs is that we 
>>>would most likely have so many different programs that it would be undesirable 
>>>to keep them in memory. Take for example omiting tex coord transforms, 
>>>rescaling of normals, normalization of normals..
>>>Sure we could just start dropping them but that might lead to instable 
>>>framerates if we constantly translate new programs.
>>>I cant say I knew any really good way to handle this at the moment so its 
>>>probably best to try something and see what problems arise.
>>
>>Well, we know that the register space we are interested in is less than 4K.
>>A megabyte would hold 256 such configurations - should be plenty, no ?
>
> Maybe for average case but not for worst.
>
> Heres a list of problems that prevent r300 driver from using Keith's ffp program generator:
> 1. _TnlProgram is of fixed size type and smaller than r300_vertex_program 

What's the actual issue here?  In what circumstances does this cause a 
problem?

> 2. Programs generated are incomplete in sense that they dont move input color to output(also applies to texture coords)

The programs share the semantics of regular vertex programs - which 
don't do this copying either.  So, if you need to add this sort of 
copying when you encode a regular vertex program for the r300, you'll 
need to do the same thing with the generated programs.  If not, I don't 
understand what's going on.

> 3. Number of temps exceeds 32 in some cases.

Can you give some details?  I'm sure this can be pared down a little.

> Attached patch temporarily fixes first two issues.
>
> Problems on r300 side(that im aware of):
> 1. Multitexturing is broken on r300 side as texcoords regs arent properly 
> asigned in r300_setup_rs_unit
> 2. Problems with colors applied to textures(see dinoshade).
>
> Ill add something that allows to switch between hw and sw tnl on the fly using 
> magic keys later today.

Keith


Re: Radeon DRM GART mapping bogosity

2005-05-03 Thread Michel Dänzer
On Tue, 2005-05-03 at 15:24 +1000, Benjamin Herrenschmidt wrote:
> 
> The radeon DRM has some interesting bug that paul and I discovered to
> cause all sort of problems like crashing the machine on suspend/resume
> (go figure ...) etc...
> 
>   dev_priv->gart_vm_start = dev_priv->fb_location
>   + RADEON_READ( RADEON_CONFIG_APER_SIZE );
> 
> The "aim" of this code is to setup the card memory map so that the GART
> sits just after the framebuffer. However, CONFIG_APER_SIZE is _not_ a
> good indication of the framebuffer size.

Indeed, I apologize for this 'bogosity'. Seemed like a good idea at the
time...

> CONFIG_APER_SIZE is only the size of the visible aperture on the PCI
> bus. Some setups (like some Macs for example) can use the dual split
> aperture mechanism, in which case CONFIG_APER_SIZE is only half of the
> VRAM. I can imagine cards overloaded with memory to have more vram than
> is directly accessible from PCI in other circumstances too (though the
> split aperture case is a real world scenario we encountered on paul's
> laptop at least).

Yeah, cards with 256 MB of VRAM or more only have a 128 MB PCI aperture.
So this fix might also help with the issues people have with such cards.


> In practice, you are setting up the card's memory map, so
> CONFIG_APER_SIZE should be totally irrelevant anyway since it only
> affects the PCI window to the vram. What is relevant here is
> CONFIG_MEMSIZE I would say...
> 
>   dev_priv->gart_vm_start = dev_priv->fb_location
>   + RADEON_READ( RADEON_CONFIG_MEMSIZE );
> 
> If we want to be totally paranoid, we may want to use the max of
> CONFIG_MEMSIZE and CONFIG_APER_SIZE (to avoid leaving part of the GART
> mapped through the PCI aperture).

Makes sense.

> Note that with huge VRAM sizes appearing, we also want to make sure that
> wherever we put it won't overlap the 32 bits space since CONFIG_MEMSIZE
> can be huge nowadays... and if it does, put the GART just _before_ the
> framebuffer instead. Again, this is all cards space, not bus view, so that
> shouldn't matter where we put these things.

Another constraint is that the GART doesn't overlap with the bus address
range of system RAM.
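
Both points can be written down as a small C sketch. The function names are hypothetical, and the second check assumes, purely for illustration, that system RAM occupies bus addresses from 0 up to `ram_top`; real machines can have holes or RAM elsewhere.

```c
#include <stdint.h>

/* Hypothetical helper: the "paranoid" distance from the framebuffer
 * start to the GART start, i.e. the larger of CONFIG_MEMSIZE and
 * CONFIG_APER_SIZE. */
static uint32_t gart_offset_from_fb(uint32_t config_memsize,
                                    uint32_t config_aper_size)
{
    return config_memsize > config_aper_size ? config_memsize
                                             : config_aper_size;
}

/* Hypothetical check: with system RAM assumed to span [0, ram_top),
 * a non-empty GART range starting at gart_start overlaps it exactly
 * when it starts below ram_top. */
static int gart_hits_system_ram(uint32_t gart_start, uint64_t ram_top)
{
    return gart_start < ram_top;
}
```

On a 256MB card with a 128MB aperture the offset is 256MB; on a card where the strap exceeds the VRAM size, the aperture size wins.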


-- 
Earthling Michel Dänzer  | Debian (powerpc), X and DRI developer
Libre software enthusiast|   http://svcs.affero.net/rm.php?r=daenzer





Re: Radeon DRM GART mapping bogosity

2005-05-03 Thread Michel Dänzer
On Tue, 2005-05-03 at 10:09 +0200, Jerome Glisse wrote:
> On 5/3/05, Benjamin Herrenschmidt <[EMAIL PROTECTED]> wrote:
> > Ok, I'm cross posting here because X.org is doing it wrong too. On R300,
> > for some reason I don't fully understand, it just goes back to the "old"
> > way of putting the FB at 0 

I think that should be fixed as well, BTW.

> > (though it does properly use CONFIG_MEMSIZE
> > to set the size part of MC_FB_LOCATION), but for non R300, it does try
> > to put the framebuffer at the same address as the BAR ... and then tries
> > to use CONFIG_APER_SIZE for the size part of MC_FB_LOCATION, which is
> > incorrect.
> > 
> > The result is that on !r300, it won't crash since X.Org and DRI won't
> > create an overlapping mapping, but they won't be able to use all of VRAM
> > of some cards where CONFIG_APER_SIZE < CONFIG_MEMSIZE, and on r300, it
> > will possibly create overlapping mappings and will cause all sorts of
> > troubles if you get the above case.
> 
> Has i still doesn't understand the big pictures of video drivers, i
> was wondering if this could have an impact on bytes swapping on r300. 
> Thus the problem of bit blit we have on r300 & X driver. I don't think 
> so but as i don't well understand all this... 

No, the big endian hostdata blit issues with R300 class cards are
strictly between the CP and the hostdata registers (as hostdata blits
work as expected with MMIO) and not related to the framebuffer or the
GART.


-- 
Earthling Michel Dänzer  | Debian (powerpc), X and DRI developer
Libre software enthusiast|   http://svcs.affero.net/rm.php?r=daenzer





Re: Radeon DRM GART mapping bogosity

2005-05-03 Thread Benjamin Herrenschmidt
On Tue, 2005-05-03 at 10:41 +0200, Jerome Glisse wrote:
> > Now, the setting above has to be done the most intelligently you can
> > based on 1) do you need 2 apertures with different swapper settings
> > (typical of BE machines) or not, 2) what is your CONFIG_APER_SIZE
> > strapping vs. how much VRAM you have ... You can see both kind of setups
> > on Mac cards.
> 
> Thx a lot for your explanation. A little question why 2 apertures are more
> typical on BE devices ? Having 2 apertures is the best way to handle
> dual head, isn't it (one for each head) ?

That allows you to have a different swapper setting for each aperture,
which is handy when your host is directly accessing the frame
buffer. A BE host needs a different setting most of the time for
different bit depths.
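
Why the bit depth matters can be seen from a software model of the swaps. This is only an illustration in plain C, not a description of the hardware swapper modes: the function names are made up, and the point is simply that a 32-bit word holding one 32bpp pixel and the same word holding two 16bpp pixels need different byte shuffles.

```c
#include <stdint.h>

/* Swap the two bytes of one 16-bit pixel. */
static uint16_t swap16(uint16_t v)
{
    return (uint16_t)((v >> 8) | (v << 8));
}

/* Reverse all four bytes of one 32-bit pixel. */
static uint32_t swap32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
           ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* Swap bytes inside each 16-bit half of a 32-bit word, which is what
 * two packed 16bpp pixels need. */
static uint32_t swap16_pairs(uint32_t v)
{
    return ((v >> 8) & 0x00ff00ffu) | ((v << 8) & 0xff00ff00u);
}
```

Applying the 32-bit swap to a word of two 16bpp pixels would also exchange the two pixels, which is why one swapper setting cannot serve both depths.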

> If my memory isn't too corrupted you said that stupid pc card
> haven't CONFIG_APER_SIZE set to half of memory, does this
> really matter (I guess that this push you to map aperture to same
> area) ?

In fact, stupid Mac cards don't either in some cases. If it is == to vram or
== to vram/2, it's fine. Anything else can be a source of trouble. Though
in the case of little endian peecees, it may not be -that- bad.
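
As a sketch, the rule of thumb above can be written as a small classifier. The enum and function names are hypothetical and exist only for this example.

```c
#include <stdint.h>

/* Hypothetical classification of the CONFIG_APER_SIZE strap relative
 * to the amount of VRAM. */
enum aper_layout {
    APER_COVERS_VRAM,   /* aperture size == VRAM size: fine        */
    APER_HALF_VRAM,     /* aperture size == VRAM size / 2: fine    */
    APER_ODD            /* anything else: possible source of trouble */
};

static enum aper_layout classify_aper(uint32_t aper_size, uint32_t vram_size)
{
    if (aper_size == vram_size)
        return APER_COVERS_VRAM;
    if (aper_size == vram_size / 2)
        return APER_HALF_VRAM;
    return APER_ODD;
}
```

A driver could use a check like this to decide whether the split-aperture setting is usable or whether to warn.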

Ben.






Re: Radeon DRM GART mapping bogosity

2005-05-03 Thread Benjamin Herrenschmidt

> > Note that with huge VRAM sizes appearing, we also want to make sure that
> > wherever we put it won't overlap the 32 bits space since CONFIG_MEMSIZE
> > can be huge nowadays... and if it does, put the GART just _before_ the
> > framebuffer instead. Again, this is all cards space, not bus view, so that
> > shouldn't matter where we put these things.
> 
> Another constraint is that the GART doesn't overlap with the bus address
> range of system RAM.

Do we still care about that ? Do we ever do DMA from the card to system
RAM outside of the AGP context ?

I think a good strategy is to try to put the AGP aperture after the
video RAM, and if that doesn't fit, just before. That would keep us
"high enough" in most cases to avoid system RAM, but we can't guarantee
100% here.

Ben.






Re: Radeon DRM GART mapping bogosity

2005-05-03 Thread Dave Airlie
> In the meantime, here's a patch against current Linus "git" that I'm
> tempted to push asap so that at least 2.6.12 avoids the problem of
> overlapping which causes random stuffs to happen with lockups. The
> "issue" here is even if you don't have an r300-friendly DRM, it will
> still try to initialize those things, even if it ultimately fails,
> provided you have a new enough X.org, and thus will screw up the
> mapping.

If Michel Daenzer thinks this is okay, he was the last person to dig
around in that area with the changes for the IGPs, I'm away from my
setup so I can't look at this for another while, but if you can get
consensus quickly send it to Linus...

Dave.

> 
> Index: linux-work/drivers/char/drm/radeon_drv.h
> ===
> --- linux-work.orig/drivers/char/drm/radeon_drv.h	2005-05-02 10:48:09.0 +1000
> +++ linux-work/drivers/char/drm/radeon_drv.h	2005-05-03 17:51:55.0 +1000
> @@ -346,6 +346,7 @@
>  #define RADEON_CLOCK_CNTL_DATA          0x000c
>  #       define RADEON_PLL_WR_EN                 (1 << 7)
>  #define RADEON_CLOCK_CNTL_INDEX         0x0008
> +#define RADEON_CONFIG_MEMSIZE           0x00f8
>  #define RADEON_CONFIG_APER_SIZE         0x0108
>  #define RADEON_CRTC_OFFSET              0x0224
>  #define RADEON_CRTC_OFFSET_CNTL         0x0228
> Index: linux-work/drivers/char/drm/radeon_cp.c
> ===
> --- linux-work.orig/drivers/char/drm/radeon_cp.c	2005-05-02 10:48:09.0 +1000
> +++ linux-work/drivers/char/drm/radeon_cp.c	2005-05-03 17:49:25.0 +1000
> @@ -1269,6 +1269,7 @@
>  {
>  	drm_radeon_private_t *dev_priv = dev->dev_private;;
>  	DRM_DEBUG( "\n" );
> +	u32 gart_loc;
>
>  	dev_priv->is_pci = init->is_pci;
>
> @@ -1476,8 +1477,12 @@
>
>  	dev_priv->gart_size = init->gart_size;
> -	dev_priv->gart_vm_start = dev_priv->fb_location
> -				+ RADEON_READ( RADEON_CONFIG_APER_SIZE );
> +	gart_loc = dev_priv->fb_location + RADEON_READ( RADEON_CONFIG_MEMSIZE );
> +	/* overflow ? */
> +	if ((gart_loc + dev_priv->gart_size) < dev_priv->fb_location)
> +		gart_loc = dev_priv->fb_location - dev_priv->gart_size;
> +
> +	dev_priv->gart_vm_start = gart_loc;
>
>  #if __OS_HAS_AGP
>  	if ( !dev_priv->is_pci )
> 
>




Re: Radeon DRM GART mapping bogosity

2005-05-03 Thread Michel Dänzer
On Wed, 2005-05-04 at 00:39 +1000, Benjamin Herrenschmidt wrote:
> > > Note that with huge VRAM sizes appearing, we also want to make sure that
> > > wherever we put it won't overlap the 32 bits space since CONFIG_MEMSIZE
> > > can be huge nowadays... and if it does, put the GART just _before_ the
> > > framebuffer instead. Again, this is all cards space, not bus view, so that
> > > shouldn't matter where we put these things.
> > 
> > Another constraint is that the GART doesn't overlap with the bus address
> > range of system RAM.
> 
> Do we still care about that ? Do we ever do DMA from the card to system
> RAM outside of the AGP context ?

Yes, e.g. for video capture (hence it's doubly surprising that the
framebuffer location would be hardcoded to 0 for r300 ;).

We should also use non-GART for the ring read pointer and scratch
register writeback.

> I think a good strategy is to try to put the AGP aperture after the
> video RAM, and if that doesn't fit, just before. That would keep us
> "high enough" in most cases to avoid system RAM, but we can't guarantee
> 100% here.

If a conflict can't be avoided, we could fail gracefully upfront
(suggesting to make the GART aperture smaller, ...) instead of risking
subtle breakage?
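
Put together, the "after VRAM, else before, else fail" idea could look like the sketch below. `place_gart` is a hypothetical function, not existing DRM code, and it assumes for simplicity that system RAM occupies bus addresses from 0 up to `ram_top`.

```c
#include <stdint.h>

/* Hypothetical placement routine, all addresses in card space.
 * Returns 0 and stores the chosen start in *gart_start, or returns -1
 * if neither placement avoids both the 4GB limit and system RAM
 * (assumed to span [0, ram_top)). */
static int place_gart(uint32_t fb_location, uint32_t vram_size,
                      uint32_t gart_size, uint64_t ram_top,
                      uint32_t *gart_start)
{
    uint64_t fb_end = (uint64_t)fb_location + vram_size;

    if (gart_size == 0)
        return -1;

    /* First choice: right after video RAM. */
    if (fb_end >= ram_top && fb_end + gart_size <= (1ULL << 32)) {
        *gart_start = (uint32_t)fb_end;
        return 0;
    }

    /* Second choice: right before the framebuffer. */
    if (fb_location >= gart_size &&
        (uint64_t)fb_location - gart_size >= ram_top) {
        *gart_start = fb_location - gart_size;
        return 0;
    }

    /* No conflict-free spot: let the caller report it up front. */
    return -1;
}
```

A caller that gets -1 back could then print a hint such as suggesting a smaller GART aperture, rather than continuing with an overlapping map.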


-- 
Earthling Michel Dänzer  | Debian (powerpc), X and DRI developer
Libre software enthusiast|   http://svcs.affero.net/rm.php?r=daenzer





Re: Radeon DRM GART mapping bogosity

2005-05-03 Thread Vladimir Dergachev

On Wed, 4 May 2005, Benjamin Herrenschmidt wrote:

>>> Note that with huge VRAM sizes appearing, we also want to make sure that
>>> wherever we put it won't overlap the 32 bits space since CONFIG_MEMSIZE
>>> can be huge nowadays... and if it does, put the GART just _before_ the
>>> framebuffer instead. Again, this is all cards space, not bus view, so that
>>> shouldn't matter where we put these things.
>>
>> Another constraint is that the GART doesn't overlap with the bus address
>> range of system RAM.
>
> Do we still care about that ? Do we ever do DMA from the card to system
> RAM outside of the AGP context ?

This is very useful for traffic from video memory to system memory - 
for example for video capture.

 best

Vladimir Dergachev

> I think a good strategy is to try to put the AGP aperture after the
> video RAM, and if that doesn't fit, just before. That would keep us
> "high enough" in most cases to avoid system RAM, but we can't guarantee
> 100% here.
>
> Ben.



Re: r300 fixed pipeline design

2005-05-03 Thread Aapo Tahkola
On Tue, 03 May 2005 14:59:53 +0100
Keith Whitwell <[EMAIL PROTECTED]> wrote:

> Aapo Tahkola wrote:
> > On Thu, 21 Apr 2005 09:57:48 -0400 (EDT)
> > Vladimir Dergachev <[EMAIL PROTECTED]> wrote:
> > 
> > 
> >>
> >>On Thu, 21 Apr 2005, Aapo Tahkola wrote:
> >>
> >>
> >>>On Wed, 23 Feb 2005 15:03:38 -0500 (EST)
> >>>Vladimir Dergachev <[EMAIL PROTECTED]> wrote:
> >>>
> >>>
> >>>>  With regard to state switching, it might be worth it to simply hash
> >>>>various configuration (fog on /fog off, etc) and just upload state
> >>>>difference on such changes.
> >>>
> >>>Could work reasonably well. Problem with hashing all programs is that we 
> >>>would most likely have so many different programs that it would be 
> >>>undesirable to keep them in memory. Take for example omiting tex coord 
> >>>transforms, rescaling of normals, normalization of normals..
> >>>Sure we could just start dropping them but that might lead to instable 
> >>>framerates if we constantly translate new programs.
> >>>I cant say I knew any really good way to handle this at the moment so its 
> >>>probably best to try something and see what problems arise.
> >>
> >>Well, we know that the register space we are interested in is less than 4K.
> >>A megabyte would hold 256 such configurations - should be plenty, no ?
> > 
> > 
> > Maybe for average case but not for worst.
> > 
> > Heres a list of problems that prevent r300 driver from using Keith's ffp 
> > program generator:
> > 1. _TnlProgram is of fixed size type and smaller than r300_vertex_program 
> 
> What's the actual issue here?  In what circumstances does this cause a 
> problem?

Mesa holds the driver's private data bound to programs in containers, just like 
in i915NewProgram.
I suggest sorting this out by adding a PrivatePtr to the vertex and fragment 
program structures in Mesa.
This way drivers can allocate their private structures at the translation stage and 
better estimate the memory needed.
This also fits well into the hashing scheme, since ARB programs generated by 
t_vp_build.c could be destroyed once they are no longer needed.
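
A rough sketch of the private-pointer idea follows. None of these types are Mesa's: `vertex_program_sketch`, `r300_private_sketch` and the two functions are invented for illustration only. The point is that the driver allocates its own structure at translation time and hangs it off the generic program, so the generic structure no longer has to be as large as the driver's.

```c
#include <stdlib.h>

/* Stand-in for a generic program structure with one opaque slot
 * reserved for the driver (hypothetical layout). */
struct vertex_program_sketch {
    unsigned num_instructions;
    void *driver_private;
};

/* Stand-in for whatever the driver keeps per translated program. */
struct r300_private_sketch {
    unsigned num_hw_instructions;
};

/* Called at translation time: allocate driver data sized for this
 * particular program. Returns 0 on success, -1 on allocation failure. */
static int attach_private(struct vertex_program_sketch *prog)
{
    struct r300_private_sketch *priv = calloc(1, sizeof *priv);

    if (priv == NULL)
        return -1;
    priv->num_hw_instructions = prog->num_instructions;
    prog->driver_private = priv;
    return 0;
}

/* Called when a cached program is dropped: release the driver data. */
static void destroy_program(struct vertex_program_sketch *prog)
{
    free(prog->driver_private);
    prog->driver_private = NULL;
}
```

With this shape, evicting a generated program from a hash of configurations also frees the driver-side translation in one place.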

> 
> > 2. Programs generated are incomplete in sense that they dont move input 
> > color to output(also applies to texture coords)
> 
> The programs share the semantics of regular vertex programs - which 
> don't do this copying either.  So, if you need to add this sort of 
> copying when you encode a regular vertex program for the r300, you'll 
> need to do the same thing with the generated programs.  If not, I don't 
> understand what's going on.

According to issue 16 of the vertex_program spec, initial values of temps and 
results are undefined unless written to.

> > 3. Number of temps exceeds 32 in some cases.
> 
> Can you give some details?  I'm sure this can be pared down a little.

Happened with ut2k4. I'll look into this at a better time.

-- 
Aapo Tahkola




[Bug 3195] New: Setting GL_TEXTURE_LOD_BIAS_EXT can cause a segfault.

2005-05-03 Thread bugzilla-daemon
Please do not reply to this email: if you want to comment on the bug, go to
the URL shown below and enter your comments there.

https://bugs.freedesktop.org/show_bug.cgi?id=3195

   Summary: Setting GL_TEXTURE_LOD_BIAS_EXT can cause a segfault.
   Product: Mesa
   Version: CVS
  Platform: PC
       URL: http://sourceforge.net/tracker/index.php?func=detail&aid=1194546&group_id=31763&atid=403301
OS/Version: Linux
Status: NEW
  Severity: critical
  Priority: P3
 Component: Drivers/DRI/i810
AssignedTo: dri-devel@lists.sourceforge.net
ReportedBy: [EMAIL PROTECTED]


When I try to set the GL_TEXTURE_LOD_BIAS_EXT of my OpenGL environment with the
following line, the Intel 810 driver
(extras/Mesa/src/mesa/drivers/dri/i810/i810tex.c) sometimes suffers a
segmentation fault:
   glTexEnvf(GL_TEXTURE_FILTER_CONTROL_EXT, GL_TEXTURE_LOD_BIAS_EXT, -1.2);

Sadly, I can't seem to manufacture a simple test case.

This may *well* be due to pilot error; if so, I'd appreciate any advice on
correctly initializing the texture environment.  Still, I expect that OpenGL
drivers should never cause a segmentation fault when setting a parameter value,
so I'm submitting this report.

I can work around the problem by adding the following code:
+  if (!glIsEnabled(GL_TEXTURE_2D))
+  {
+ glEnable(GL_TEXTURE_2D);
+ glClear(0);
+ glDisable(GL_TEXTURE_2D);
+  }
   glTexEnvf(GL_TEXTURE_FILTER_CONTROL_EXT, GL_TEXTURE_LOD_BIAS_EXT, -1.2);


The following patch should avoid the segmentation fault altogether:
---cut here---
cvs diff -u ./xorg/xc/extras/Mesa/src/mesa/drivers/dri/i810/i810tex.c
Index: ./xorg/xc/extras/Mesa/src/mesa/drivers/dri/i810/i810tex.c
===
RCS file: /cvs/xorg/xc/extras/Mesa/src/mesa/drivers/dri/i810/i810tex.c,v
retrieving revision 1.1.1.1
diff -u -r1.1.1.1 i810tex.c
--- ./xorg/xc/extras/Mesa/src/mesa/drivers/dri/i810/i810tex.c   16 Jun 2004 09:18:05 -  1.1.1.1
+++ ./xorg/xc/extras/Mesa/src/mesa/drivers/dri/i810/i810tex.c   3 May 2005 20:14:32 -
@@ -319,9 +319,11 @@
case GL_TEXTURE_LOD_BIAS_EXT:
   {
  struct gl_texture_object *tObj = ctx->Texture.Unit[unit]._Current;
- i810TextureObjectPtr t = (i810TextureObjectPtr) tObj->DriverData;
- t->Setup[I810_TEXREG_MLC] &= ~(MLC_LOD_BIAS_MASK);
- t->Setup[I810_TEXREG_MLC] |= i810ComputeLodBias(*param);
+ if (tObj) {
+i810TextureObjectPtr t = (i810TextureObjectPtr) tObj->DriverData;
+t->Setup[I810_TEXREG_MLC] &= ~(MLC_LOD_BIAS_MASK);
+t->Setup[I810_TEXREG_MLC] |= i810ComputeLodBias(*param);
+ }
   }
   break;

---cut here---  
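
The guard in the patch can be modelled outside the driver as below. The structures and `set_lod_bias` are invented stand-ins, not the i810 or Mesa types. The sketch also checks the driver data pointer; whether that pointer can actually be NULL at this point in the real driver is an assumption that has not been verified here.

```c
#include <stddef.h>

/* Stand-ins for the texture object and its driver-private data
 * (hypothetical, much simpler than the real structures). */
struct tex_driver_data_sketch {
    unsigned lod_bias_bits;
};

struct tex_object_sketch {
    struct tex_driver_data_sketch *driver_data;
};

/* Store the LOD bias only when there is a current texture object with
 * driver data attached. Returns 1 if the value was stored, 0 if the
 * call was ignored because nothing is bound yet. */
static int set_lod_bias(struct tex_object_sketch *tobj, unsigned bias_bits)
{
    if (tobj == NULL || tobj->driver_data == NULL)
        return 0;

    tobj->driver_data->lod_bias_bits = bias_bits;
    return 1;
}
```

Ignoring the call when nothing is bound avoids the crash, though it also means the bias is not remembered for a texture bound later.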
 
 
--   
Configure bugmail: https://bugs.freedesktop.org/userprefs.cgi?tab=email 
 
--- You are receiving this mail because: ---
You are the assignee for the bug, or are watching the assignee.




Re: More on Longhorn graphics architecture

2005-05-03 Thread Joseph Pingenot
From Jon Smirl on Tuesday, 03 May, 2005:
>http://www.extremetech.com/article2/0,1558,1791681,00.asp

Ars Technica's Tiger article (specifically on the evolution of the OSX
  graphics system) was very informative as well.
http://arstechnica.com/reviews/os/macosx-10.4.ars

-Joseph

-- 
[EMAIL PROTECTED]
  Graduate student in physics, Free Software developer.




Re: More on Longhorn graphics architecture

2005-05-03 Thread Diego Calleja
On Tue, 3 May 2005 18:09:41 -0400,
Jon Smirl <[EMAIL PROTECTED]> wrote:

> "What does this actually mean? 3D surfaces can be paged out to virtual
> memory as needed. This is critical in the Longhorn user interface,
> where every window will be a 3D surface. Applications can now be
> bigger than graphics card memory currently allows. Of course, this

There is a great review of Mac OS X 10.4 on Ars Technica (which I assume
everybody here has probably read already, but...) which talks about
something similar:

http://arstechnica.com/reviews/os/macosx-10.4.ars/14

"As it turns out, VRAM has been "virtualized" by Mac OS X since Quartz Extreme
debuted in Jaguar. Although the Jaguar Quartz diagram shows the backing store in
RAM, the Quartz Compositor is smart enough to cache those backing stores in VRAM
as well. The biggest limitation of Jaguar's Quartz implementation is that the
actual drawing is still done into the backing store in RAM, so the diagram
accurately reflects the sequence of events during an actual drawing operation.
But as long as a window's contents don't change, the Quartz Compositor can
continue to use its VRAM cache of the backing store instead of reading it from
RAM every single time.

Implementing even this limited form of VRAM caching required facing up to the
reality that VRAM won't always be able to hold cached copies of all of the
backing stores. Worse, the amount of VRAM varies depending on the video card
being used. To simplify the Quartz implementation, Jaguar needed some way to
make VRAM look "limitless" even though it clearly isn't.

This problem has been solved before. The virtual memory system in a modern
OS makes RAM look "limitless." Well, okay, it makes it appear as if it is 2^32
or 2^64 bits long, for 32-bit and 64-bit CPUs, respectively. But that's almost
certainly larger than the amount of physical RAM installed (particularly in
the 64-bit case).

Although the details are different, this is essentially what Jaguar did with
VRAM. To the operating system, VRAM looks a lot larger than it actually is.
Quartz handles the details of swapping data in and out of VRAM as needed,
using a replacement algorithm tuned to keep the most frequently used pieces of
data in VRAM as much as possible.

In Jaguar's Quartz implementation, any backing stores cached in VRAM are simply
redundant copies of the backing stores in RAM. All backing stores must exist in
RAM in Jaguar because that's where drawing actually takes place. Quartz drawing
commands cause the backing stores in RAM to be modified. The completed backing
store is then (DMA) transferred to the video card where the Quartz Compositor
blends it into the scene and (perhaps) caches it in VRAM, just in case it needs
to be used again at some point before it's modified (in RAM, remember) by the
application and needs to be re-imported into VRAM.

In Tiger with Quartz 2D Extreme, Quartz 2D drawing commands now modify the
backing store in VRAM. The Quartz Compositor, also running on the video card,
reads from the very same backing store in VRAM. The backing store in RAM is no
longer needed at all.

Well, theoretically, anyway. Again remember that VRAM is finite. What doesn't
fit in VRAM has to be stored in RAM instead. Once VRAM is full, there is a
constant dance of data moving between RAM and VRAM as needed to (ideally) keep
the most frequently used data in VRAM. There's also another important reason a
backing store might be in RAM instead of VRAM."

[...]
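
To make the "replacement algorithm" in the quoted text concrete, here is a
toy sketch in C. It is only an illustration: the article does not say which
policy Quartz actually uses, so the least-frequently-used eviction below is
an assumption, and every name (vram_cache, vram_cache_use, VRAM_SLOTS) is
made up for the example.

```c
#include <stddef.h>
#include <stdbool.h>

#define VRAM_SLOTS 2	/* pretend VRAM can hold only two backing stores */

struct slot {
	int store_id;		/* -1 when the slot is empty */
	unsigned long uses;	/* how many times this store was composited */
};

struct vram_cache {
	struct slot slots[VRAM_SLOTS];
};

static void vram_cache_init(struct vram_cache *c)
{
	for (size_t i = 0; i < VRAM_SLOTS; i++) {
		c->slots[i].store_id = -1;
		c->slots[i].uses = 0;
	}
}

/*
 * Composite one backing store. Returns true on a VRAM hit, false when the
 * store had to be uploaded from RAM, which fills an empty slot or evicts
 * the least-used resident store (assumed policy, see above).
 */
static bool vram_cache_use(struct vram_cache *c, int store_id)
{
	size_t victim = 0;

	for (size_t i = 0; i < VRAM_SLOTS; i++) {
		if (c->slots[i].store_id == store_id) {
			c->slots[i].uses++;
			return true;
		}
		/* Prefer an empty slot, otherwise the least-used one. */
		if (c->slots[i].store_id == -1 ||
		    (c->slots[victim].store_id != -1 &&
		     c->slots[i].uses < c->slots[victim].uses))
			victim = i;
	}

	c->slots[victim].store_id = store_id;
	c->slots[victim].uses = 1;
	return false;
}
```

A window that is redrawn often keeps hitting in the cache, while rarely used
ones bounce between RAM and VRAM, which is the "constant dance" the article
describes once VRAM is full.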






Re: Radeon DRM GART mapping bogosity

2005-05-03 Thread Benjamin Herrenschmidt

> If a conflict can't be avoided, we could fail gracefully upfront
> (suggesting to make the GART aperture smaller, ...) instead of risking
> subtle breakage?

Well, I don't know of any clean, platform-independent way to know whether
there is a conflict, that is, to know where RAM is in bus space.
Especially from the PCI view, it can be anywhere. In some cases you
have to go through an IOMMU, as on most ppc64 machines (and, I suppose,
sparc64 and x86_64). So the space you are looking to "preserve" is
the IOMMU virtual space, which may even differ from the real RAM
space.

Something that would probably work as well as we can manage for now is to
assume RAM at 0, check for overlap above the framebuffer, and if necessary
move the GART below the fb. Then again, maybe Michel is right and we should
just ask the user to reduce the GART when it doesn't fit above the fb? I
hope firmware will usually be smart enough not to map the fb at the top
of the address space...
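
As a rough illustration of that fallback logic, here is a standalone sketch
in C. It is not driver code: place_gart() and ranges_overlap() are
hypothetical names, the card address space is taken to be 32 bits wide, and
RAM is simply assumed to sit at bus address 0, which (as said above) is not
something we can actually know in a platform-independent way.

```c
#include <stdint.h>
#include <stdbool.h>

#define CARD_SPACE_END (1ULL << 32)	/* assumed 32-bit card address space */

/* Do the half-open ranges [a, a + a_len) and [b, b + b_len) intersect? */
static bool ranges_overlap(uint64_t a, uint64_t a_len,
			   uint64_t b, uint64_t b_len)
{
	return a < b + b_len && b < a + a_len;
}

/*
 * Pick a GART base in card space. Returns true and stores the base in
 * *gart_base on success; returns false when neither spot is free, in
 * which case the caller could ask the user for a smaller GART.
 * ram_size is the assumed extent of main memory starting at address 0.
 */
static bool place_gart(uint32_t fb_base, uint32_t fb_size,
		       uint32_t gart_size, uint64_t ram_size,
		       uint32_t *gart_base)
{
	uint64_t above = (uint64_t)fb_base + fb_size;

	/* First choice: directly above the framebuffer. */
	if (above + gart_size <= CARD_SPACE_END &&
	    !ranges_overlap(above, gart_size, 0, ram_size)) {
		*gart_base = (uint32_t)above;
		return true;
	}

	/* Fallback: directly below the framebuffer, if RAM leaves room. */
	if (fb_base >= gart_size &&
	    !ranges_overlap((uint64_t)fb_base - gart_size, gart_size,
			    0, ram_size)) {
		*gart_base = fb_base - gart_size;
		return true;
	}

	return false;
}
```

The arithmetic is done in 64 bits so that a framebuffer mapped near the top
of the 32-bit space cannot wrap around silently.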

Ben.






[Bug 3198] New: Savage crashes with bus type PCI

2005-05-03 Thread bugzilla-daemon
Please do not reply to this email: if you want to comment on the bug, go to
the URL shown below and enter your comments there.

https://bugs.freedesktop.org/show_bug.cgi?id=3198
 
           Summary: Savage crashes with bus type PCI
           Product: DRI
           Version: unspecified
          Platform: PC
        OS/Version: Linux
            Status: NEW
          Severity: normal
          Priority: P2
         Component: DDX drivers
        AssignedTo: dri-devel@lists.sourceforge.net
        ReportedBy: [EMAIL PROTECTED]


On the savage list, Felix Kühling recently suggested
(http://probo.probo.com/pipermail/savage40/2005-March/000331.html) using 'Option
"BusType"  "PCI"' as a workaround for a bug in the kernel agpgart code that
causes crashes on resume. 

However, with this option I get crashes and graphics corruption on my Acer
Aspire with a Twister chip, using the binary DRI driver snapshots. A good
program for triggering the crashes is the Helios screensaver from rss-glx
(http://rss-glx.sourceforge.net); it locks the system hard within a few
minutes.

Without the BusType option everything (except resuming from suspend) works fine.

The relevant portion of my xorg.conf is

Section "Device"
    Identifier  "S3 Inc. VT8636A [ProSavage KN133] AGP4X VGA Controller (TwisterK)"
    Driver      "savage"
    BusID       "PCI:1:0:0"
    Option      "HWCursor" "on"
    Option      "AGPMode" "4"
    Option      "BusType" "PCI"  # This option causes crashes
EndSection

I'm using Ubuntu xorg version 6.8.2-10 and I've tried different snapshots from
the past few months, all of them eventually crash after the screensaver has
kicked in (ordinary 2D graphics works fine).  
 
 


Re: Radeon DRM GART mapping bogosity

2005-05-03 Thread Benjamin Herrenschmidt
On Wed, 2005-05-04 at 09:48 +1000, Benjamin Herrenschmidt wrote:
> > If a conflict can't be avoided, we could fail gracefully upfront
> > (suggesting to make the GART aperture smaller, ...) instead of risking
> > subtle breakage?
> 
> Well, I don't know of any clean, platform-independent way to know whether
> there is a conflict, that is, to know where RAM is in bus space.
> Especially from the PCI view, it can be anywhere. In some cases you
> have to go through an IOMMU, as on most ppc64 machines (and, I suppose,
> sparc64 and x86_64). So the space you are looking to "preserve" is
> the IOMMU virtual space, which may even differ from the real RAM
> space.
> 
> Something that would probably work as well as we can manage for now is to
> assume RAM at 0, check for overlap above the framebuffer, and if necessary
> move the GART below the fb. Then again, maybe Michel is right and we should
> just ask the user to reduce the GART when it doesn't fit above the fb? I
> hope firmware will usually be smart enough not to map the fb at the top
> of the address space...

Ok, here's a new patch that I'll send to Linus if you (Michel) ack it.

I use CONFIG_MEMSIZE. I don't try to max it out with CONFIG_APER_SIZE since
I believe we just don't care, and that avoids putting pressure on the
GART location on configs that have a large aperture size.

If the GART doesn't fit, I move it below the framebuffer and print a
warning.

The only catch is that the patch relies on CONFIG_MEMSIZE being a power of
2, I suppose. Is that always true? If not, we'll need some hackery to
get to the nearest power of 2.
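
For reference, the kind of hackery meant here could look like the sketch
below. It is not part of the patch: is_pow2() and round_up_pow2() are
hypothetical helper names, and rounding up (rather than down) is an
assumption, on the grounds that the rounded size should still cover all of
VRAM.

```c
#include <stdint.h>
#include <stdbool.h>

/* True when v has exactly one bit set. */
static bool is_pow2(uint32_t v)
{
	return v != 0 && (v & (v - 1)) == 0;
}

/*
 * Round v up to the next power of two. The result is 64 bits wide so
 * that inputs above 2^31 do not overflow; an input of 0 yields 1.
 */
static uint64_t round_up_pow2(uint32_t v)
{
	uint64_t p = 1;

	while (p < v)
		p <<= 1;
	return p;
}
```

A 192 MB size (0x0c000000) would fail is_pow2() and round up to 256 MB
(0x10000000), while a 128 MB size passes through unchanged.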

Ben.

Index: linux-work/drivers/char/drm/radeon_drv.h
===
--- linux-work.orig/drivers/char/drm/radeon_drv.h   2005-05-02 10:48:09.0 +1000
+++ linux-work/drivers/char/drm/radeon_drv.h   2005-05-03 17:51:55.0 +1000
@@ -346,6 +346,7 @@
 #define RADEON_CLOCK_CNTL_DATA         0x000c
 #  define RADEON_PLL_WR_EN             (1 << 7)
 #define RADEON_CLOCK_CNTL_INDEX        0x0008
+#define RADEON_CONFIG_MEMSIZE          0x00f8
 #define RADEON_CONFIG_APER_SIZE        0x0108
 #define RADEON_CRTC_OFFSET             0x0224
 #define RADEON_CRTC_OFFSET_CNTL        0x0228
Index: linux-work/drivers/char/drm/radeon_cp.c
===
--- linux-work.orig/drivers/char/drm/radeon_cp.c   2005-05-02 10:48:09.0 +1000
+++ linux-work/drivers/char/drm/radeon_cp.c   2005-05-04 11:36:49.0 +1000
@@ -1269,6 +1269,7 @@
 {
 	drm_radeon_private_t *dev_priv = dev->dev_private;;
+	u32 gart_loc;
 	DRM_DEBUG( "\n" );
 
 	dev_priv->is_pci = init->is_pci;
 
@@ -1476,8 +1477,16 @@
 
 
 	dev_priv->gart_size = init->gart_size;
-	dev_priv->gart_vm_start = dev_priv->fb_location
-				+ RADEON_READ( RADEON_CONFIG_APER_SIZE );
+	gart_loc = dev_priv->fb_location + RADEON_READ(RADEON_CONFIG_MEMSIZE);
+	/* overflow ? */
+	if ((gart_loc + dev_priv->gart_size) < dev_priv->fb_location) {
+		DRM_INFO("Warning ! Gart does not fit above framebuffer in "
+			 "card space, moving it below. Risks collision with "
+			 "main memory !\n");
+		gart_loc = dev_priv->fb_location - dev_priv->gart_size;
+	}
+
+	dev_priv->gart_vm_start = gart_loc;
 
 #if __OS_HAS_AGP
 	if ( !dev_priv->is_pci )






[Bug 2596] Disabling DRI. [drm] failed to load kernel module "i915" (EE) I810(0): [dri] DRIScreenInit failed.

2005-05-03 Thread bugzilla-daemon
Please do not reply to this email: if you want to comment on the bug, go to
the URL shown below and enter your comments there.

https://bugs.freedesktop.org/show_bug.cgi?id=2596

--- Additional Comments From [EMAIL PROTECTED]  2005-05-03 19:58 ---
The DRI is not supported for i915 on FreeBSD, because the kernel module hasn't
been ported.  Is the DRI initialization the problem you're asking about?  It's
not clear what your issue is.  
 
 


Re: Radeon DRM GART mapping bogosity

2005-05-03 Thread Michel Dänzer
On Wed, 2005-05-04 at 11:41 +1000, Benjamin Herrenschmidt wrote:
> 
> Ok, here's a new patch that I'll send to Linus if you (Michel) ack it.
> 
> I use CONFIG_MEMSIZE. I don't try to max it out with CONFIG_APER_SIZE since
> I believe we just don't care, and that avoids putting pressure on the
> GART location on configs that have a large aperture size.
> 
> If the GART doesn't fit, I move it below the framebuffer and print a
> warning.

This is fine with me (what's the tag line for that again? :).

> The only catch is that the patch relies on CONFIG_MEMSIZE being a power of
> 2, I suppose. Is that always true?

It is for all the Radeons I know of, but maybe Hui knows of exceptions.


-- 
Earthling Michel Dänzer      |     Debian (powerpc), X and DRI developer
Libre software enthusiast    |   http://svcs.affero.net/rm.php?r=daenzer

