from:"Ville Syrjälä"

Re: drm: Branch 'master' - 5 commits

2012-04-05 Thread Ville Syrjälä

On Tue, Apr 03, 2012 at 02:50:01PM -0700, Rob Clark wrote:
 +    /* just testing a limited # of formats to test single
 +     * and multi-planar path.. would be nice to add more..
 +     */
 +    if (!strcmp(p->format_str, "YUYV")) {
 +        pitches[0] = p->w * 2;
 +        offsets[0] = 0;
 +        kms_bo_get_prop(plane_bo, KMS_HANDLE, &handles[0]);
 +
 +        fill422(virtual, 0, p->w, p->h, pitches[0]);
 +
 +        format = DRM_FORMAT_YUYV;
 +    } else if (!strcmp(p->format_str, "NV12")) {
 +        pitches[0] = p->w;
 +        offsets[0] = 0;
 +        kms_bo_get_prop(plane_bo, KMS_HANDLE, &handles[0]);
 +        pitches[1] = p->w;
 +        offsets[1] = p->w * p->h;
 +        kms_bo_get_prop(plane_bo, KMS_HANDLE, &handles[1]);
 +
 +        fill420(virtual, virtual+offsets[1], virtual+offsets[1]+1,
 +                2, 0, p->w, p->h, pitches[0]);
 +
 +        format = DRM_FORMAT_NV12;
 +    } else if (!strcmp(p->format_str, "YV12")) {
 +        pitches[0] = p->w;
 +        offsets[0] = 0;
 +        kms_bo_get_prop(plane_bo, KMS_HANDLE, &handles[0]);
 +        pitches[1] = p->w / 2;
 +        offsets[1] = p->w * p->h;
 +        kms_bo_get_prop(plane_bo, KMS_HANDLE, &handles[1]);
 +        pitches[2] = p->w / 2;
 +        offsets[2] = offsets[1] + (p->w * p->h) / 4;
 +        kms_bo_get_prop(plane_bo, KMS_HANDLE, &handles[1]);
                                                         ^
Should be '2'. The kernel patch I just posted might have caught this.
OTOH it might not have in case handles[2] contains uninitialized data.
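
To make the copy-paste hazard concrete, here is a minimal sketch (not taken from modetest) that derives the per-plane pitches and offsets from the format string and then assigns the buffer handle in a loop, so there is no plane index to mistype. The helper name plane_layout(), its signature, and the choice to zero the unused entries are assumptions made for illustration only; whether 0 is actually an invalid handle is driver specific.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not part of modetest: fills pitches/offsets for
 * a single buffer object and points every used plane at the same handle.
 * Returns the number of planes, or 0 for an unsupported format string. */
static int plane_layout(const char *fmt, uint32_t w, uint32_t h,
                        uint32_t bo_handle, uint32_t handles[4],
                        uint32_t pitches[4], uint32_t offsets[4])
{
    int planes, i;

    /* Assumption: unused planes are left as 0. */
    memset(handles, 0, 4 * sizeof(handles[0]));
    memset(pitches, 0, 4 * sizeof(pitches[0]));
    memset(offsets, 0, 4 * sizeof(offsets[0]));

    if (!strcmp(fmt, "YUYV")) {
        planes = 1;                     /* packed 4:2:2, 2 bytes per pixel */
        pitches[0] = w * 2;
    } else if (!strcmp(fmt, "NV12")) {
        planes = 2;                     /* Y plane + interleaved CbCr plane */
        pitches[0] = w;
        pitches[1] = w;
        offsets[1] = w * h;
    } else if (!strcmp(fmt, "YV12")) {
        planes = 3;                     /* Y plane + two quarter-size chroma planes */
        pitches[0] = w;
        pitches[1] = w / 2;
        pitches[2] = w / 2;
        offsets[1] = w * h;
        offsets[2] = offsets[1] + (w * h) / 4;
    } else {
        return 0;
    }

    /* One loop instead of one hand-written assignment per plane. */
    for (i = 0; i < planes; i++)
        handles[i] = bo_handle;

    return planes;
}
```

With this shape the YV12 case cannot end up writing handles[1] twice, because the index comes from the loop rather than from a pasted line.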

We should add a test that would make sure that passing an invalid bo
handle for any plane would return with an error. The problem is knowing
what exactly is an invalid handle since it's all driver specific.
Perhaps 0x would be a reasonably safe choice. Hmm, now I wonder
if 0 might actually be a valid handle for some of the current drivers...

There should obviously also be a test that does use separate bos
for each plane. That should either succeed and produce the correct
result, or the driver should respond with an error at least to the
setplane ioctl. Whether it would allow addfb2 to succeed is a slightly
more complicated matter. I suppose it could be possible that on some
odd hardware some planes would support multiple bos and some would
not. In which case the driver would need to allow the addfb2.

-- 
Ville Syrjälä
syrj...@sci.fi
http://www.sci.fi/~syrjala/

--
Better than sec? Nothing is better than sec when it comes to
monitoring Big Data applications. Try Boundary one-second 
resolution app monitoring today. Free.
http://p.sf.net/sfu/Boundary-dev2dev
--
___
Dri-devel mailing list
Dri-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dri-devel

Re: [PATCH] drm: Allow platform devices to register as DRM devices

2010-03-15 Thread Ville Syrjälä

On Mon, Mar 01, 2010 at 09:00:09AM -0700, Jordan Crouse wrote:
 
 Allow platform devices without PCI resources to be DRM devices.

I really dislike the fact that drm has bus specific junk all over the
generic code. Some ideas how to clean that up:

Add 'struct device *dev' into drm_device so you don't have to go through
the pdev/platformdev to get it every time. Also use dev_name() instead
of pci_name() and whatever you used for the platform device case.

Add 'int irq' into struct drm_device instead of adding more bus specific
hoops to get at the irq. Not sure I like the generic code mucking about
with the irq directly though but baby steps are easier to handle.

Get rid of the drm_resource_start/len wrappers. AFAICS they're all called
from the low level driver code anyway and the driver knows the bus type
so there's no need for the wrappers.

It would be nice to get the struct pci_driver out of the drm_driver
structure. Since you now have a new pci specific drm_get_dev() thing
could you also pass the pci_driver as a function parameter instead
of having it live inside the drm_driver?

Also all cases where there's some PCI specific stuff (the busid stuff
mostly) you could just check the drm_device.pdev pointer instead of
having to add another driver flag to identify non-PCI devices. Although
I don't really like having the pdev/platformdev pointers in there at all.

That's sort of my secret drm TODO list, but so far I haven't had the time
to actually do the coding part.

-- 
Ville Syrjälä
syrj...@sci.fi
http://www.sci.fi/~syrjala/


Re: [PATCH] drm modes to fbdev mode patch

2010-03-13 Thread Ville Syrjälä

On Sat, Mar 13, 2010 at 02:47:52PM +, James Simmons wrote:
 
 For the fbdev layer you have struct fb_var_screeninfo and also
 struct fb_videomode. The struct fb_videomode was developed for the modes
 database we have. Struct fb_var_screeninfo is more than just resolution
 data, which is why we created struct fb_videomode. The really nice thing
 is that the conversion from fb_var to fb_videomode always fixes the
 pixclock to the proper values, so you don't need the pixclock = 0
 workaround. I tested this patch with the intelfb driver and had no problems.
 I have used it in the past with a KMS-enabled tdfx driver I wrote. In the
 future this function can be used for fbdev-level mode setting. Please try
 it out; I hope it can be merged. Thanks.
 
 diff --git a/drivers/gpu/drm/drm_fb_helper.c b/drivers/gpu/drm/drm_fb_helper.c
 index 5054970..467ac68 100644
 --- a/drivers/gpu/drm/drm_fb_helper.c
 +++ b/drivers/gpu/drm/drm_fb_helper.c
 @@ -581,6 +581,60 @@ int drm_fb_helper_setcolreg(unsigned regno,
  }
  EXPORT_SYMBOL(drm_fb_helper_setcolreg);
  
 +void drm_display_mode_to_fbmode(struct drm_display_mode *mode,
 +                                struct fb_videomode *fbmode)
 +{
 +    fbmode->xres = mode->hdisplay;
 +    fbmode->yres = mode->vdisplay;
 +    fbmode->right_margin = mode->hsync_start - mode->hdisplay;
 +    fbmode->lower_margin = mode->vsync_start - mode->vdisplay;
 +    fbmode->hsync_len = mode->hsync_end - mode->hsync_start;
 +    fbmode->vsync_len = mode->vsync_end - mode->vsync_start;
 +    fbmode->left_margin = mode->htotal - mode->hsync_end;
 +    fbmode->upper_margin = mode->vtotal - mode->vsync_end;
 +    fbmode->refresh = mode->vrefresh;
 +    fbmode->name = mode->name;

Is there some guarantee that mode won't be freed before fbmode? That
would leave fbmode->name pointing to invalid memory.

 +
 +    if (mode->flags & DRM_MODE_FLAG_INTERLACE)
 +        fbmode->vmode |= FB_VMODE_INTERLACED;
 +
 +    if (mode->flags & DRM_MODE_FLAG_DBLSCAN)
 +        fbmode->vmode |= FB_VMODE_DOUBLE;
 +

What about the sync flags?

 +    /* Doing a var to fb_videomode always create a proper pixclock
 +     * we can trust, but the reverse is not true. So we create
 +     * a proper pixclock from the refresh rate wanted. */
 +    fbmode->pixclock = mode->vrefresh * mode->vtotal;
 +    fbmode->pixclock *= mode->htotal;
 +    fbmode->pixclock /= 1000;
 +    fbmode->pixclock = KHZ2PICOS(fbmode->pixclock);
 +}
 +EXPORT_SYMBOL(drm_display_mode_to_fbmode);
 +
 +void fbmode_to_drm_display_mode(struct fb_videomode *fbmode,
 +                                struct drm_display_mode *mode)
 +{
 +    mode->hdisplay = fbmode->xres;
 +    mode->vdisplay = fbmode->yres;
 +    mode->hsync_start = mode->hdisplay + fbmode->right_margin;
 +    mode->vsync_start = mode->vdisplay + fbmode->lower_margin;
 +    mode->hsync_end = mode->hsync_start + fbmode->hsync_len;
 +    mode->vsync_end = mode->vsync_start + fbmode->vsync_len;
 +    mode->htotal = mode->hsync_end + fbmode->left_margin;
 +    mode->vtotal = mode->vsync_end + fbmode->upper_margin;
 +    mode->vrefresh = fbmode->refresh;
 +    mode->clock = PICOS2KHZ(fbmode->pixclock);
 +
 +    if ((fbmode->vmode & FB_VMODE_MASK) == FB_VMODE_INTERLACED)
 +        mode->flags |= DRM_MODE_FLAG_INTERLACE;
 +
 +    if ((fbmode->vmode & FB_VMODE_MASK) == FB_VMODE_DOUBLE)
 +        mode->flags |= DRM_MODE_FLAG_DBLSCAN;

Is interlaced+dblscan considered an invalid combination? The conversion
in the other direction didn't make that assumption.
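
The asymmetry can be shown with a small sketch. The constants below are local stand-ins that I believe match the values in <linux/fb.h> and the DRM mode flags, and the three helper functions are hypothetical; they only mirror the flag handling in the quoted patch, plus a bitwise variant for comparison.

```c
#include <stdint.h>

/* Local stand-ins, assumed to match <linux/fb.h> and the DRM mode flags. */
#define FB_VMODE_INTERLACED      1
#define FB_VMODE_DOUBLE          2
#define FB_VMODE_MASK            255
#define DRM_MODE_FLAG_INTERLACE  (1 << 4)
#define DRM_MODE_FLAG_DBLSCAN    (1 << 5)

/* drm -> fbdev, as in the patch: each flag is tested independently. */
static uint32_t drm_flags_to_vmode(uint32_t flags)
{
    uint32_t vmode = 0;

    if (flags & DRM_MODE_FLAG_INTERLACE)
        vmode |= FB_VMODE_INTERLACED;
    if (flags & DRM_MODE_FLAG_DBLSCAN)
        vmode |= FB_VMODE_DOUBLE;
    return vmode;
}

/* fbdev -> drm, as in the patch: equality against the masked value,
 * so a vmode with both bits set matches neither branch. */
static uint32_t vmode_to_drm_flags_eq(uint32_t vmode)
{
    uint32_t flags = 0;

    if ((vmode & FB_VMODE_MASK) == FB_VMODE_INTERLACED)
        flags |= DRM_MODE_FLAG_INTERLACE;
    if ((vmode & FB_VMODE_MASK) == FB_VMODE_DOUBLE)
        flags |= DRM_MODE_FLAG_DBLSCAN;
    return flags;
}

/* fbdev -> drm with per-bit tests, symmetric with drm_flags_to_vmode(). */
static uint32_t vmode_to_drm_flags_bits(uint32_t vmode)
{
    uint32_t flags = 0;

    if (vmode & FB_VMODE_INTERLACED)
        flags |= DRM_MODE_FLAG_INTERLACE;
    if (vmode & FB_VMODE_DOUBLE)
        flags |= DRM_MODE_FLAG_DBLSCAN;
    return flags;
}
```

A mode with both flags set survives drm_flags_to_vmode() but comes back from vmode_to_drm_flags_eq() with neither flag, which is the inconsistency the question is about. Whether the combination should be allowed at all is a separate decision.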

-- 
Ville Syrjälä
syrj...@sci.fi
http://www.sci.fi/~syrjala/


Re: [Linux-fbdev-devel] drm_fb_helper: Impossible to change video mode

2010-03-10 Thread Ville Syrjälä

On Wed, Mar 10, 2010 at 06:11:29PM +, James Simmons wrote:
 
  I don't think so. There is another driver which does this -
  vesa/uvesa. For these it is not possible to change the resolution from
  fbdev, it just provides some framebuffer on top of which fb
  applications or fbcons run.
 
 Only because that is the only way to do it. The other option was to have
 x86emul in the kernel. That was not going to happen.
  
  I guess equivalent of xrandr would be what people would want but the
  current fbdev capabilities are far from that.
  Since KMS provides these capabilities already I would think adding a
  tool that manipulates KMS directly (kmset?) is the simplest way.
 
 Still would have to deal with the issue of keeping the graphical console 
 in sync with the changes.
  
  There are other drivers that support multihead already (matroxfb, any
  other?) and have their own driver-specific interface.
 
 Each crtc is treated as a separate fbdev device. I don't recall any
 special ioctls. Maybe for mirroring, which was never standardized.

matroxfb does have a bunch of custom ioctls to change the crtc-output
mapping. omapfb is another multihead fb driver and it's more complex
than matroxfb. Trying to make it perform various tricks through the
fbdev API (and a bunch of custom ioctls, and a bunch of sysfs knobs)
is something I've been doing but I would not recommend it for anyone
who has the option of using a better API.

I don't think the CRTC=fb_info makes much sense if the main use
case is fbcon. fbcon will use a single fb device and so you can't see
the console on multiple heads anyway which makes the whole thing
somewhat pointless. And if you're trying to do something more complex
you will be a lot better off bypassing fbdev altogether.

-- 
Ville Syrjälä
syrj...@sci.fi
http://www.sci.fi/~syrjala/


Re: [Linux-fbdev-devel] drm_fb_helper: Impossible to change video mode

2010-03-10 Thread Ville Syrjälä

On Thu, Mar 11, 2010 at 02:22:14AM +, James Simmons wrote:
 
There are other drivers that support multihead already (matroxfb, any
other?) and have their own driver-specific interface.
   
   Each crtc is treated as a separate fbdev device. I don't recall any
   special ioctls. Maybe for mirroring, which was never standardized.
  
  matroxfb does have a bunch of custom ioctls to change the crtc-output
  mapping. 
 
 Yes, the mapping issues were never addressed.
 
  I don't think the CRTC=fb_info makes much sense if the main use
  case is fbcon.
 
 Actually that is what I had in mind when I reworked the fbdev api. Plus 
 with the linux console project I got multiple VTs working at the same 
 time.
 
  fbcon will use a single fb device and so you can't see
  the console on multiple heads anyway which makes the whole thing
  somewhat pointless. And if you're trying to do something more complex
  you will be a lot better off bypassing fbdev altogether.
 
 Not true. You can map different displays to different vcs. Give con2fb a 
 try some time :-)

I know about it but only one VT can be active at any given time.

 Now there is the issue of more than one keyboard being
 mapped at a time. BTW I did get multiple VTs working at the same
 time in the past. It requires some cleanup in the console layer.

Well if you think that cleanup is possible and worth the effort then it
might make sense. The crtc-output mapping is still unresolved though
and it depends on hardware which combinations are supported. If for
example the hardware can't drive multiple outputs with the same CRTC
or if the outputs require totally different timings then you can't
clone the same VT to multiple outputs.

-- 
Ville Syrjälä
syrj...@sci.fi
http://www.sci.fi/~syrjala/


Re: [PATCH][RFC] time: add wait_interruptible_timeout macro to sleep (w. timeout) until wake_up

2010-02-26 Thread Ville Syrjälä

On Fri, Feb 26, 2010 at 06:33:57PM +0100, Rafał Miłecki wrote:
 On 26 February 2010 at 17:14, Andrew Morton
 a...@linux-foundation.org wrote:
  On Fri, 26 Feb 2010 11:38:59 +0100 Rafał Miłecki zaj...@gmail.com wrote:
 
  +#define wait_interruptible_timeout(wq, timeout)          \
  +({                                                       \
  +    long ret = timeout;                                  \
  +                                                         \
  +    DEFINE_WAIT(wait);                                   \
  +    prepare_to_wait(&wq, &wait, TASK_INTERRUPTIBLE);     \
  +    if (!signal_pending(current))                        \
  +        ret = schedule_timeout(ret);                     \
  +    finish_wait(&wq, &wait);                             \
  +                                                         \
  +    ret;                                                 \
  +})
 
  It's often a mistake to use signals in-kernel.  Signals are more a
  userspace thing and it's better to use the lower-level kernel-specific
  messaging tools in-kernel.  Bear in mind that userspace can
  independently and asynchronously send, accept and block signals.
 
 Can you point me to something kernel-level please?
 
 
  Can KMS use wait_event_interruptible_timeout()?
 
 No. Please check definition of this:
 
 #define wait_event_interruptible_timeout(wq, condition, timeout)     \
 ({                                                                   \
     long __ret = timeout;                                            \
     if (!(condition))                                                \
         __wait_event_interruptible_timeout(wq, condition, __ret);    \
     __ret;                                                           \
 })
 
 It uses condition there, but that's not a big issue. We just need to
 pass 0 (false) there and it will work so far.

Disabling the condition check doesn't make sense.

You could use a completion.

init_completion(&vbl_irq);
enable_vbl_irq();
wait_for_completion(&vbl_irq);
disable_vbl_irq();
and call complete(&vbl_irq) in the interrupt handler.

The same would of course work with just some flag or counter
and a wait queue. Isn't there already a vbl counter that you could
compare in the condition?
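
As a rough illustration of the counter variant, the sketch below shows what such a condition could look like. It is plain C rather than kernel code, and struct fake_crtc, fake_vbl_irq() and vblank_passed() are hypothetical names; in a driver the predicate would be the condition argument of wait_event_interruptible_timeout(), and the real vblank counter may be stored and updated differently.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-CRTC state; a real driver's counter may differ. */
struct fake_crtc {
    uint32_t vbl_count;   /* incremented once per vblank interrupt */
};

/* What the interrupt handler would do before waking the wait queue. */
static void fake_vbl_irq(struct fake_crtc *crtc)
{
    crtc->vbl_count++;
}

/* Condition for the waiter: has at least one vblank happened since
 * 'start' was sampled? The signed difference keeps the comparison
 * correct when the 32-bit counter wraps around. */
static bool vblank_passed(const struct fake_crtc *crtc, uint32_t start)
{
    return (int32_t)(crtc->vbl_count - start) > 0;
}
```

Because the waiter samples the counter before sleeping and re-checks it on every wake-up, an interrupt that fires between enabling the IRQ and going to sleep is not lost, which is what passing a constant 0 as the condition would give up.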

-- 
Ville Syrjälä
syrj...@sci.fi
http://www.sci.fi/~syrjala/


Re: Removal of mach64

2010-02-09 Thread Ville Syrjälä

On Tue, Feb 09, 2010 at 03:40:34AM -0500, Catalin Patulea wrote:
 Hmm, I was able to get the driver working, but I have some more
 questions; let me first give you some background.
 
 My box is a Dell PowerEdge 1600SC server with an integrated ATI Rage XL:
 $ sudo lspci -vs 0e
 00:0e.0 VGA compatible controller: ATI Technologies Inc Rage XL (rev 27)
 Subsystem: Dell Device 0135
 Flags: bus master, VGA palette snoop, stepping, medium devsel,
 latency 32
 Memory at fd00 (32-bit, non-prefetchable) [size=16M]
 I/O ports at e800 [size=256]
 Memory at fe121000 (32-bit, non-prefetchable) [size=4K]
 [virtual] Expansion ROM at c000 [disabled] [size=128K]
 Capabilities: [5c] Power Management version 2
 
 I'm fairly sure it has 8M of video RAM -- not sure why that says 16M.
 Perhaps that's just the size of the window but only 8M is physically
 present.

The 16MB is split into two apertures. They both point to the same memory
area but the byte swapping for each can be controlled independently.
You can actually have 16MB of memory on a Rage Pro but the CPU can't
directly access all of it.

 I have compiled the kernel mach64 DRM driver from Archlinux, posted by
 Alexander Lam (many thanks for that), and it seems I can enable DRI
 with 1024x768x16, but I get the following (EE):
 (II) MACH64(0): [DRI] installation complete
 (II) MACH64(0): [drm] Added 128 16384 byte DMA buffers
 (II) MACH64(0): [drm] Mapped 128 DMA buffers at 0xb66db000
 (EE) MACH64(0): [drm] Couldn't find IRQ for bus id 0:14:0
 (II) MACH64(0): [drm] Falling back to irq-free operation
 (II) MACH64(0): Direct rendering enabled
 
 Given the (II) following it, this seems to be more of a warning.
 Indeed, the device doesn't have an IRQ in lspci or /proc/interrupts.
 Is there anything I can do about this? Any particular performance
 issues I should see due to lack of an IRQ?

There may be a jumper on the card to enable/disable the IRQ. At least
I'm pretty sure older mach64 cards had them. I don't know if integrated
mach64s generally had such jumpers or not. Perhaps the BIOS has an
option whether to assign an IRQ to the graphics card.

 The other question is regarding running DRI with a higher resolution,
 1280x1024x16 (since that's my LCD's native resolution ;-) ). I get the
 following:
 (II) MACH64(0): [drm] Will request asynchronous DMA mode
 (==) MACH64(0): [drm] Using 2 MB for DMA buffers
 (II) MACH64(0): [pci] ring handle = 0x36224000
 (II) MACH64(0): [pci] Ring mapped at 0xb699d000
 (II) MACH64(0): [drm] register handle = 0xfe121000
 (II) MACH64(0): [dri] Visual configs initialized
 (II) MACH64(0): [dri] Block 0 base at 0xfe121400
 (WW) MACH64(0): Not enough memory for local textures, disabling DRI
 (II) MACH64(0): [drm] removed 1 reserved context for kernel
 (II) MACH64(0): [drm] unmapping 8192 bytes of SAREA 0xf8035000 at 0xb69a1000
 (II) MACH64(0): [drm] Closed DRM master.
 (II) MACH64(0): Using XFree86 Acceleration Architecture (XAA)
 Screen to screen bit blits
 Solid filled rectangles
 8x8 mono pattern filled rectangles
 Indirect CPU to Screen color expansion
 Solid Lines
 [...]
 (II) MACH64(0): Direct rendering disabled
 
 Is there any way to get DRI with this higher resolution? Perhaps by
 reducing the (2 MB) DMA allocation?

That 2MB is in system RAM so it would not help.

 It seems to me like 1280x1024x16 / 8 = 2.5 MBytes should fit pretty
 easily.. why do I seem to need a lot more memory for this resolution?

Back buffer and depth buffer make that 7.5 MB.
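
The arithmetic can be written out as a small sketch. It assumes the front, back and depth buffers all use the same bits per pixel and ignores any pitch alignment or other reservations the driver may make; the function names are made up for this example.

```c
#include <stdint.h>

/* Size in bytes of one full-screen buffer. */
static uint64_t buffer_bytes(uint32_t w, uint32_t h, uint32_t bpp)
{
    return (uint64_t)w * h * bpp / 8;
}

/* Video memory left for textures after the front, back and depth
 * buffers (assumed to share the same bpp). Negative if they do not fit. */
static int64_t texture_bytes_left(uint64_t vram, uint32_t w, uint32_t h,
                                  uint32_t bpp)
{
    return (int64_t)vram - 3 * (int64_t)buffer_bytes(w, h, bpp);
}
```

For 1280x1024 at 16 bpp each buffer is 2.5 MB, so three of them take 7.5 MB and an 8 MB card has 0.5 MB left. At 1024x768 the same three buffers take 4.5 MB, leaving 3.5 MB.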

You seem to have a PCI mach64. If you had an AGP version it could use
AGP memory for texturing, or at least the hardware would allow it, I
don't know about the driver though.

-- 
Ville Syrjälä
syrj...@sci.fi
http://www.sci.fi/~syrjala/


Re: [PATCH] drm: EDID accept separate sync video mode

2010-01-14 Thread Ville Syrjälä

On Thu, Jan 14, 2010 at 07:02:20PM +0100, Jerome Glisse wrote:
 X is accepting such video mode, do the same. Pointed out by Joshua Roys
 on IRC. Fix https://bugzilla.redhat.com/show_bug.cgi?id=540024
 
 Signed-off-by: Jerome Glisse jgli...@redhat.com
 ---
  drivers/gpu/drm/drm_edid.c |1 -
  1 files changed, 0 insertions(+), 1 deletions(-)
 
 diff --git a/drivers/gpu/drm/drm_edid.c b/drivers/gpu/drm/drm_edid.c
 index 5c9f798..6d66383 100644
 --- a/drivers/gpu/drm/drm_edid.c
 +++ b/drivers/gpu/drm/drm_edid.c
 @@ -634,7 +634,6 @@ static struct drm_display_mode *drm_mode_detailed(struct drm_device *dev,
      }
      if (!(pt->misc & DRM_EDID_PT_SEPARATE_SYNC)) {
          printk(KERN_WARNING "integrated sync not supported\n");
 -        return NULL;
      }

I suppose the patch title should be 'accept composite sync'. Perhaps the
error message could say composite sync too since then people would know
what it's trying to say. At least I've never heard the term integrated
sync before.

-- 
Ville Syrjälä
syrj...@sci.fi
http://www.sci.fi/~syrjala/


Re: [PATCH 6/6] [drm/i915] implement drmmode overlay support v2

2009-09-01 Thread Ville Syrjälä

On Tue, Sep 01, 2009 at 12:10:20PM +0200, Stephane Marchesin wrote:
 As I said, if my hw overlay only does YUY2 and I want to expose
 YV12/I420 (because that's what everyone wants), I get to do the
 conversion myself. Now in the old case I could do it in the driver,
 but now you can either:
 - remove it and players stop using the overlay altogether (because few
 players will convert YV12 to YUY2 themselves)

What kind of crappy players are you using?

-- 
Ville Syrjälä
syrj...@sci.fi
http://www.sci.fi/~syrjala/


Re: [PATCH 6/6] [drm/i915] implement drmmode overlay support v2

2009-08-31 Thread Ville Syrjälä

On Mon, Aug 31, 2009 at 01:57:55PM +0200, Thomas Hellström wrote:
 Daniel Vetter wrote:
 
 ...
  In conclusion I don't think a common ioctl is worth it. But sharing some
  code and infrastructure on the kernel side is certainly possible, if
  someone implements overlay support for another chipset. But I don't really
  count on that, because at least radeon has textured video for all its
  chips.

 I understand your concerns about the new X architecture where everything 
 is composited, and I admit I haven't looked through your patch in detail.
 
 However,
 we'll _probably_ need to add overlay support to the Xorg gallium + KMS 
 state-tracker shortly, and if so, with that a generic KMS interface that 
 is sufficient to implement a simple Xv overlay adaptor with KMS.
 
 Given the fact that Xv and various virtual device overlay support 
 implementations exist, I assume there *must* be a way to do this 
 generically. Perhaps not in the interest of sharing kernel code, but in 
 the interest of a single user-space interface and a single user-space 
 implementation supporting multiple hardware drivers.

DirectFB has a fairly nice abstraction for this stuff (layers). You can
look there for inspiration. BTW DirectFB uses the same abstraction even
for CRTCs which is nice. I haven't really looked at kms yet but it
seems to me that if it is too heavily based on the CRTC concept it
might not be able to expose the full capabilities of hardware that
doesn't really have CRTCs (eg. TI OMAP2/3).

-- 
Ville Syrjälä
syrj...@sci.fi
http://www.sci.fi/~syrjala/


Re: [Intel-gfx] [RFC] DRI2 swapbuffers (yes yet again)

2009-05-01 Thread Ville Syrjälä

On Thu, Apr 30, 2009 at 02:39:57PM -0700, Jesse Barnes wrote:
 On Fri, 1 May 2009 00:25:54 +0300
 Ville Syrjälä syrj...@sci.fi wrote:
   The completion won't happen until at least 'interval' frames have
   passed since the flip was queued, so I think the semantics match?
  
  Well I guess it satisfies the requirement that flips will never happen
  less than interval frames apart but if the application is flipping at
  a slower rate anyway you still delay each flip by interval frames even
  though there is no real need to do so. So it increases the latency a
  bit. Also if/when you add support for queueing multiple flips the code
  needs to be changed anyway to use the previous flip rather than when
  the current flip was queued as the reference.
 
 Ah yeah I see what you mean, so if the app renders a frame and then
 queues a flip to happen in two refreshes, but doesn't queue its next
 frame until one refresh after the last one, you'll get a stutter, with 2
 refreshes between the first two frames, and 3 between the next two.

Yeah.

 If the app checks the frame count though, it could compensate and lower
 its frame frequency to something it can render at a fixed rate, as well
 as sending in a proper interval value.
 
If you mean that it should set the interval based on some long-term
observation of its rendering speed then I agree. If say half of the
frames took 0.9 frame counts to render and half of them took 1.2 frame
counts to render then the application could just set interval to 2 and
get smoother animation than it would get with interval 1.

But if you mean that it should check the current frame count on each
flip and base the interval value on that then I disagree because getting
the frame count and queueing the flip would not be atomic so the
calculated interval value might be incorrect by the time the flip is
queued. This sort of thing would only work if the flip ioctl would take
an absolute frame count value for the flip, and then it would also need
to return the current frame count to the application so that the
application could calculate the next flip's absolute value correctly (in
case the previous flip actually happened after the specific absolute
value).

And as a final missing piece I would mention interlaced output
with proper field parity, but I'm not sure if you're interested
in such things for this API.
   
   We could treat the 'interval' as meaning odd/even in that case.
   I.e. an interval of 1 would mean 'next field' and 2 would mean
   'start of next frame', but yeah there's not much support for
   interlacing in the kernel atm.
  
  What's needed is rather next top field or next bottom field. If you
  combine that with supporting interval (should be useful when queueing
  up several frames using 3:2 pulldown sequence) then there seems to be
  a need for something more than just a single number.
 
 Ok.  I'd better add whatever's needed to the ioctl now so we don't have
 to make a new one later.  You think just an odd/even field flag would be
 sufficient?

I think it needs three modes: top field first, bottom field first, or
either field first. The 'either field first' option can be used when
the source material is progressive. It doesn't really make sense to
demand flips to happen on any specific field in that case.

Also I suggest using the top/bottom terms rather than the odd/even
terms because odd/even are somewhat ambiguous.

So I suppose adding one flag for top field first, and a second flag for
bottom field first would be sufficient. Setting both flags could be
considered an error, and setting neither flag would allow the flip to
happen on either field. Does that sound reasonable?
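
A minimal sketch of that flag scheme, for illustration only. The flag names, the enum and the helper are hypothetical, since the ioctl under discussion does not define any of them yet.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical flag names; nothing like this exists in the ioctl yet. */
#define FLIP_TOP_FIELD_FIRST     (1u << 0)
#define FLIP_BOTTOM_FIELD_FIRST  (1u << 1)

enum flip_field {
    FLIP_FIELD_ANY,      /* progressive source: flip on whichever field is next */
    FLIP_FIELD_TOP,
    FLIP_FIELD_BOTTOM,
};

/* Decode the two flags. Returns 0 on success or -EINVAL when both
 * flags are set, which is treated as an error. */
static int flip_field_from_flags(uint32_t flags, enum flip_field *out)
{
    uint32_t f = flags & (FLIP_TOP_FIELD_FIRST | FLIP_BOTTOM_FIELD_FIRST);

    if (f == (FLIP_TOP_FIELD_FIRST | FLIP_BOTTOM_FIELD_FIRST))
        return -EINVAL;

    if (f == FLIP_TOP_FIELD_FIRST)
        *out = FLIP_FIELD_TOP;
    else if (f == FLIP_BOTTOM_FIELD_FIRST)
        *out = FLIP_FIELD_BOTTOM;
    else
        *out = FLIP_FIELD_ANY;

    return 0;
}
```

Keeping "neither flag" as the default means existing callers that pass 0 keep working and get the progressive behaviour.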

-- 
Ville Syrjälä
syrj...@sci.fi
http://www.sci.fi/~syrjala/


Re: [Intel-gfx] [RFC] DRI2 swapbuffers (yes yet again)

2009-04-30 Thread Ville Syrjälä

On Wed, Apr 29, 2009 at 06:02:59PM -0700, Jesse Barnes wrote:
 On Wed, 29 Apr 2009 15:09:33 -0700
 Jesse Barnes jbar...@virtuousgeek.org wrote:
  I'm still working through multihead issues on the kernel side; the
  flip waits should wait for *both* vblank events before completing the
  flip.  But other than that, I'm pretty happy with things.
 
 This incremental set fixes up the multihead handling and adds swap
 interval support as a bonus.  It's nice to see flipping & no tearing on
 two heads at once!

Your interval handling seems to be too harsh. In case you wanted
something that can implement GLX_SGI_swap_control then AFAICS the
interval should only specify the minimum number of frames that must
pass between two swaps. Your code appears to always delay the flip by
exactly interval frames.

Also it seems that the cases where you have more than one back buffer
were not yet fully considered. I can see two use cases for this:

1) Triple buffering with the purpose of going faster than the monitor
refresh rate w/o tearing. Here you would never wait for any flips and
if a new flip is scheduled before the previous was completed then the
previous flip should be considered immediately complete so the buffer
can be reused.

2) Scheduling multiple flips in advance while maintaining the swap
interval. This way the application could render several frames in
advance and just queue the flips in the driver to gain a little more
breathing room for the rendering.

And as a final missing piece I would mention interlaced output with
proper field parity, but I'm not sure if you're interested in such
things for this API.

-- 
Ville Syrjälä
syrj...@sci.fi
http://www.sci.fi/~syrjala/


Re: [Intel-gfx] [RFC] DRI2 swapbuffers (yes yet again)

2009-04-30 Thread Ville Syrjälä

On Thu, Apr 30, 2009 at 08:49:12AM -0700, Jesse Barnes wrote:
 On Thu, 30 Apr 2009 11:36:55 +0300
 Ville Syrjälä syrj...@sci.fi wrote:
 
  On Wed, Apr 29, 2009 at 06:02:59PM -0700, Jesse Barnes wrote:
   On Wed, 29 Apr 2009 15:09:33 -0700
   Jesse Barnes jbar...@virtuousgeek.org wrote:
I'm still working through multihead issues on the kernel side; the
flip waits should wait for *both* vblank events before completing
the flip.  But other than that, I'm pretty happy with things.
   
   This incremental set fixes up the multihead handling and adds swap
   interval support as a bonus.  It's nice to see flipping & no
   tearing on two heads at once!
  
  Your interval handling seems to be too harsh. In case you wanted
  something that can implement GLX_SGI_swap_control then AFAICS the
  interval should only specify the minimum number of frames that must
  pass between two swaps. Your code appears to always delay the flip by
  exactly interval frames.
 
 The completion won't happen until at least 'interval' frames have
 passed since the flip was queued, so I think the semantics match?

Well I guess it satisfies the requirement that flips will never happen
less than interval frames apart but if the application is flipping at a
slower rate anyway you still delay each flip by interval frames even
though there is no real need to do so. So it increases the latency a
bit. Also if/when you add support for queueing multiple flips the code
needs to be changed anyway to use the previous flip rather than when the
current flip was queued as the reference.
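
The difference can be sketched as follows. This is only an illustration
under stated assumptions: flip_target() is a hypothetical helper, not code
from the patch, and it assumes a free-running 32-bit vblank counter.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not taken from the patch under review.
 * Returns the earliest vblank count at which a newly queued flip may
 * complete, using the previous flip (not the queue time) as reference.
 */
static uint32_t flip_target(uint32_t now, uint32_t last_flip, uint32_t interval)
{
	uint32_t earliest = last_flip + interval;

	/* Signed difference keeps the comparison valid across counter wrap. */
	if ((int32_t)(earliest - (now + 1)) > 0)
		return earliest;

	/* The application is already slower than the interval: no extra delay. */
	return now + 1;
}
```

With now = 100, last_flip = 90 and interval = 2 this gives 101, whereas
counting from the time the flip was queued would give 102.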

  Also it seems that the cases where you have more than one back buffer
  were not yet fully considered. I can see two use cases for this:
  
  1) Triple buffering with the purpose of going faster than the monitor
  refresh rate w/o tearing. Here you would never wait for any flips and
  if a new flip is scheduled before the previous was completed then the
  previous flip should be considered immediately complete so the buffer
  can be reused.
  
  2) Scheduling multiple flips in advance while maintaining the swap
  interval. This way the application could render several frames in
  advance and just queue the flips in the driver to gain a little more
  breathing room for the rendering.
 
 Yeah, I intended for the DRI2 protocol I added to handle this case, but
 there's no code for it yet.  I think it could be done purely in the 2D
 driver though.  I think case (2) is probably the most important here,
 but (1) is pretty easy to do as well.

Well, I consider 1 more important since it makes tear-free rendering
w/o additional delays possible. But anyways both seem to have some
value so ideally both should be supported.

  And as a final missing piece I would mention interlaced output with
  proper field parity, but I'm not sure if you're interested in such
  things for this API.
 
 We could treat the 'interval' as meaning odd/even in that case.  I.e.
 an interval of 1 would mean 'next field' and 2 would mean 'start of
 next frame', but yeah there's not much support for interlacing in the
 kernel atm.

What's needed is rather next top field or next bottom field. If you
combine that with supporting interval (should be useful when queueing
up several frames using 3:2 pulldown sequence) then there seems to be a
need for something more than just a single number.
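
A rough sketch of what "more than a single number" could look like. All
names here are hypothetical, and the sketch assumes fields are counted
continuously with even counts being top fields.

```c
#include <stdint.h>

enum field_parity { FIELD_ANY, FIELD_TOP, FIELD_BOTTOM };

/* Hypothetical request: interval and field parity are separate values. */
struct flip_request {
	uint32_t interval;        /* minimum fields since the previous flip */
	enum field_parity parity; /* which field the flip must land on */
};

/*
 * Assumption for this sketch: even field counts are top fields and
 * odd field counts are bottom fields.
 */
static uint64_t flip_target_field(uint64_t now, uint64_t last_flip,
				  const struct flip_request *req)
{
	uint64_t target = last_flip + req->interval;

	if (target < now + 1)
		target = now + 1;

	/* Push the target by one field if it has the wrong parity. */
	if (req->parity == FIELD_TOP && (target & 1))
		target++;
	else if (req->parity == FIELD_BOTTOM && !(target & 1))
		target++;

	return target;
}
```

For a 3:2 pulldown sequence the queued requests would alternate between
intervals of 3 and 2 fields, which is where both values are needed at once.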

-- 
Ville Syrjälä
syrj...@sci.fi
http://www.sci.fi/~syrjala/


Re: [PATCH 1/3] mga: Use request_firmware() to load microcode

2009-02-23 Thread Ville Syrjälä

On Sun, Feb 22, 2009 at 11:45:21PM +, Ben Hutchings wrote:
 On Mon, 2009-02-23 at 00:06 +0100, Stephane Marchesin wrote:
  Hi,
  
  This mga patch replaces a firmware that was split in pieces by
  functionality and that had comments with a single blob.
 
 Each pipe's code was converted to a separate line of the ihex file.
 
  So IMO it's actually decreasing the quality of the code.
 
 You can read the microcode?!

Each blob matches a specific set of pipe flags. The pipe flags are part
of the userspace API. So the correct order of the blobs is very important.

 I believe it's possible to include comments in ihex files, so the pipe
 names could be added as comments.  I don't really see the point though -
 who's going to be editing them?

AFAICS your code doesn't convey the MGA_WARP_FOO -> specific ucode
blob mapping in any way. The old code made that part very clear. I
suppose sufficient comments would be enough to fix the problem. You
should also do 'where = MGA_WARP_TGZ' instead of 'where = 0' when copying
the ucode. Or perhaps you should even unroll the loop and use the
WARP_UCODE_INSTALL() approach to keep the intention crystal clear.

-- 
Ville Syrjälä
syrj...@sci.fi
http://www.sci.fi/~syrjala/


Re: drm: Branch 'modesetting-gem'

2008-08-26 Thread Ville Syrjälä

On Tue, Aug 26, 2008 at 06:20:43PM -0700, Dave Airlie wrote:
  linux-core/radeon_encoders.c |   15 +++
  1 file changed, 11 insertions(+), 4 deletions(-)
 
 New commits:
 commit a4167e7b572859a998710ee599298e5131f51620
 Author: Dave Airlie [EMAIL PROTECTED]
 Date:   Wed Aug 27 11:12:19 2008 +1000
 
 radeon: avoid oops on encoders with no crtc set
 
 diff --git a/linux-core/radeon_encoders.c b/linux-core/radeon_encoders.c
 index ec36e43..f662872 100644
 --- a/linux-core/radeon_encoders.c
 +++ b/linux-core/radeon_encoders.c
 @@ -863,11 +863,18 @@ static void radeon_atom_tmds_dpms(struct drm_encoder *encoder, int mode)
   struct drm_device *dev = encoder->dev;
   struct drm_radeon_private *dev_priv = dev->dev_private;
   struct radeon_encoder *radeon_encoder = to_radeon_encoder(encoder);
 - struct radeon_crtc *radeon_crtc = to_radeon_crtc(encoder->crtc);
 + struct radeon_crtc *radeon_crtc = NULL;
 + int crtc_id = 0;
   int atom_type = -1;
   int index = -1;
   uint32_t bios_2_scratch, bios_3_scratch;
  
 + if (radeon_crtc) {
 +     radeon_crtc = to_radeon_crtc(encoder->crtc);
 +     crtc_id = radeon_crtc->crtc_id;
 + } else if (mode == DRM_MODE_DPMS_ON)
 +     return;
 +

That doesn't look right. Now it will never get the crtc_id.
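
The local radeon_crtc is initialized to NULL and then tested, so the first
branch can never run. Presumably the test was meant to be on the encoder's
crtc pointer. A self-contained sketch of that, using simplified stand-ins
for the kernel structures (the real code goes through to_radeon_crtc(),
and encoder_crtc_id() is a made-up helper):

```c
#include <stddef.h>

/* Stand-ins for the kernel structures; only the fields used here. */
struct radeon_crtc { int crtc_id; };
struct drm_encoder { struct radeon_crtc *crtc; };

#define DPMS_ON 0 /* stand-in for DRM_MODE_DPMS_ON */

/*
 * Hypothetical helper: returns 0 and stores the crtc id, or -1 when the
 * encoder is being turned on without a crtc attached.
 */
static int encoder_crtc_id(const struct drm_encoder *encoder, int mode,
			   int *crtc_id)
{
	*crtc_id = 0;

	if (encoder->crtc)	/* test the encoder's pointer, not a local NULL */
		*crtc_id = encoder->crtc->crtc_id;
	else if (mode == DPMS_ON)
		return -1;

	return 0;
}
```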

-- 
Ville Syrjälä
[EMAIL PROTECTED]
http://www.sci.fi/~syrjala/


Re: dri2-without-sarea branches for review

2008-08-19 Thread Ville Syrjälä

On Tue, Aug 19, 2008 at 12:50:18PM -0400, Kristian Høgsberg wrote:
 On Tue, Aug 19, 2008 at 6:57 AM, Michel Dänzer
 [EMAIL PROTECTED] wrote:
  Have you considered any other schemes, e.g. some kind of event triggered
  when a buffer swap actually takes effect, and which includes information
  about the new mapping from API buffers to memory buffers? Or is the idea
  to just leave any advanced SwapBuffers schemes to the drivers?
 
 Right, the problem with triple buffering is that once we schedule a
 swap, we don't know when the previous swap is finished and we can
 start rendering again.  Is it actually different from the regular
 double buffer case though?  You still need to block the client, which
 we can just do by delaying the reply from DRI2SwapBuffers.  In the
 triple buffering case you just have an extra buffer and you're
 blocking on the previous buffer swap instead of the current.

The idea with triple buffering is that you never have to wait.
When you do a flip all you need to know is whether the previously
scheduled flip has actually happened or not. If it has you rotate the
buffers so that the current scanout buffer is left alone. If the flip
hasn't happened yet you just swap the back and front buffers.
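
As a sketch of that logic (struct and function names are made up for
illustration; buffers are just indices here):

```c
#include <stdbool.h>

/* Hypothetical bookkeeping for three buffers. */
struct tb_state {
	int front;         /* most recently scheduled flip target */
	int back;          /* buffer the client renders into */
	int third;         /* the remaining buffer */
	bool flip_pending; /* scheduled flip has not happened yet */
};

/* Schedule a flip without ever waiting; returns the next render target. */
static int tb_swap(struct tb_state *s)
{
	int old_front = s->front;

	if (s->flip_pending) {
		/*
		 * The hardware still scans out 'third'. The old front was
		 * never shown, so just swap front and back.
		 */
		s->front = s->back;
		s->back = old_front;
	} else {
		/*
		 * The hardware scans out the old front. Rotate so that it
		 * ends up as 'third' and is left alone.
		 */
		s->front = s->back;
		s->back = s->third;
		s->third = old_front;
	}

	s->flip_pending = true;
	return s->back;
}

/* Called when the hardware has actually performed the flip. */
static void tb_flip_done(struct tb_state *s)
{
	s->flip_pending = false;
}
```

In both branches the buffer being scanned out is never handed back to the
client, which is why no blocking is needed.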

-- 
Ville Syrjälä
[EMAIL PROTECTED]
http://www.sci.fi/~syrjala/


Re: [RFC] update DRM core vblank code to be more power friendly

2007-06-12 Thread Ville Syrjälä

On Tue, Jun 12, 2007 at 08:59:23AM +0200, Michel Dänzer wrote:
 On Mon, 2007-06-11 at 15:20 -0700, Jesse Barnes wrote:
  On Monday, June 11, 2007 11:36:10 Keith Packard wrote:
   ick. just read the registers and return the value here. We should place
   wrap-detection in the core code by reporting the range of the register
   values; with the offset suggested above, that would result in a single
   addition to convert from raw to cooked frame counts.
  
  Ok, here's an updated version:
  - move wraparound logic to the core
  - add pre/post modeset ioctls (per-driver right now, making them core
    would mean lots more DDX changes I think),
 
 Shouldn't really matter, DDX drivers can call driver independent ioctls.
 
    hope I got this right
  - add vblank_get/put calls so interrupts are enabled at the right time
  
  I haven't implemented Ville's suggestion of adding a short timer before
  disabling interrupts again, but it should be easy now that the get/put
  routines are in place and we think it's worth it (might make vblank
  calls a little cheaper, but it would probably be hard to detect).
 
 Yeah, I'm doubtful. Ville, can you explain some use cases you're
 thinking of?

Was this discussion only for wait for vblank? I was mainly thinking
about triple buffering w/ page flipping. Now that I think about it the
interrupt would only need to be enabled until the flip is actually
completed by the hardware. With interlaced displays that event might in
reality be two vblank interrupts away (assuming the API allows one to
sync to a specific field).

-- 
Ville Syrjälä
[EMAIL PROTECTED]
http://www.sci.fi/~syrjala/


Re: [RFC] update DRM core vblank code to be more power friendly

2007-06-11 Thread Ville Syrjälä

On Mon, Jun 11, 2007 at 11:34:00AM -0700, Jesse Barnes wrote:
 On Monday, June 11, 2007 11:14:45 Ian Romanick wrote:
  Jesse Barnes wrote:
   We've had some IRC and off-list discussions about how to improve the
   DRM's vblank code to be a bit more power friendly.  The core requirement
   is that we only enable vblank interrupts when a client is actually
   waiting for a vblank event (be it a signal or a wakeup).
  
   This patch updates the DRM core, requiring drivers to provide vblank
   enable and disable hooks, as well as a counter updater, and adds some
   i915 code to use it.
  
   When the DRM vblank code is called, the core will update the counter,
   add the desired sequence value to it, and either setup a signal or
   wait for the desired sequence number to be hit, enabling vblanks around
   the operation.  Once complete, vblank interrupts will again be disabled
   to save power.
  
   The patch doesn't yet deal with two obvious cases (and probably more
   that I'm missing, it's untested yet):
 - the hardware counter resets on mode switch, we need to rebase
   the appropriate last_counter at that point so it's not treated as
   a counter wrap
 - a client interested in signals but also blocked on a vblank event
   may cause vblanks to be disabled if it received signal at the wrong
   time
  
   I'll be happy to fix it up and/or restructure as requested.  I think the
   basic approach should be fairly sound (even devices that don't support a
   counter register could fake it using total time/vrefresh or similar), but
   if not I'd love to hear about it. :)
 
  The problem is that a few of the GLX extensions (e.g.,
  GLX_SGI_video_sync and GLX_OML_sync_control) allow applications to query
  the vblank counter directly.  I don't know of other hardware that
  maintains an actual counter.  I know that MGA doesn't, and I'm pretty
  sure that Via doesn't either.
 
 Right, we still have to expose the counter.  But that just means calling the 
 update_vblank_counter hook before returning it to userspace.
 
 And of course, another option for devices that don't have vblank count 
 registers (aside from the 'fake it based on time' mentioned above) would be 
 to just leave interrupts enabled and do the counting there as usual.  That 
 would make the enable/disable hooks no-ops, and the update_vblank_counter 
 into a simple return of the latest value.

Rather than immediately disabling the interrupt what about keeping it
enabled for a few seconds. In that case it would be never disabled if
an application should need it constantly. That should provide accurate
results for short intervals. If the interrupt has been disabled before
the counter is queried again the driver could resort to approximation.
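
Something along these lines, as a sketch only. The structure, the function
names and the 5 second grace period are all assumptions made for the
example, and the interrupt handler that increments the count while the
interrupt is enabled is not shown.

```c
#include <stdbool.h>
#include <stdint.h>

#define VBL_OFF_DELAY_MS 5000 /* hypothetical grace period */

struct vbl_state {
	bool irq_enabled;
	uint32_t count;       /* maintained by the interrupt handler */
	uint64_t last_use_ms; /* last time a client needed the counter */
	uint64_t count_ms;    /* time at which 'count' was last exact */
	uint32_t refresh_mhz; /* refresh rate in millihertz, e.g. 60000 */
};

/* Periodic check: disable the interrupt only after the grace period. */
static void vbl_idle_check(struct vbl_state *s, uint64_t now_ms)
{
	if (s->irq_enabled && now_ms - s->last_use_ms >= VBL_OFF_DELAY_MS) {
		s->irq_enabled = false;
		s->count_ms = now_ms;
	}
}

/* Query: exact while the interrupt is on, approximated otherwise. */
static uint32_t vbl_query(struct vbl_state *s, uint64_t now_ms)
{
	s->last_use_ms = now_ms;

	if (!s->irq_enabled) {
		/* frames = elapsed seconds * Hz = ms * mHz / 1000000 */
		uint64_t missed = (now_ms - s->count_ms) * s->refresh_mhz / 1000000;

		s->count += (uint32_t)missed;
		s->count_ms = now_ms;
		s->irq_enabled = true;
	}

	return s->count;
}
```

An application that queries constantly keeps pushing last_use_ms forward,
so the interrupt never gets disabled and the count stays exact.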

-- 
Ville Syrjälä
[EMAIL PROTECTED]
http://www.sci.fi/~syrjala/


Re: drm: Branch 'master'

2006-09-22 Thread Ville Syrjälä

On Thu, Sep 21, 2006 at 08:44:47PM -0700, Dave Airlie wrote:
  shared-core/drm_pciids.txt |1 +
  1 files changed, 1 insertion(+)
 
 New commits:
 diff-tree 255f3e6f76dfd267a14765dd1293229184298d89 (from 
 1f71b8d7a456fe3ec4bfc2fed70b7420cdd0d55a)
 Author: Anish Mistry [EMAIL PROTECTED]
 Date:   Fri Sep 22 03:43:34 2006 +1000
 
 bug 7092 : add pci ids for mach64 in Dell poweredge 4200
 
 diff --git a/shared-core/drm_pciids.txt b/shared-core/drm_pciids.txt
 index 9e0c099..c597708 100644
 --- a/shared-core/drm_pciids.txt
 +++ b/shared-core/drm_pciids.txt
 @@ -186,6 +186,7 @@
  0x1002 0x4c51 0 3D Rage LT Pro
  0x1002 0x4c42 0 3D Rage LT Pro AGP-133
  0x1002 0x4c44 0 3D Rage LT Pro AGP-66
 +0x1002 0x4759 0 Rage 3D IICATI 3D RAGE IIC AGP(A12/A13)

The formatting looks really strange.

Also Rage IIC doesn't have a setup engine so AFAIK it should not be 
listed in the drm.

-- 
Ville Syrjälä
[EMAIL PROTECTED]
http://www.sci.fi/~syrjala/


Re: DRM memory manager on cards with hardware contexts

2006-09-21 Thread Ville Syrjälä

On Thu, Sep 21, 2006 at 07:18:07PM +1000, Benjamin Herrenschmidt wrote:
 
  I'm finding this an interesting discussion.  If it shifts to lkml, for 
  instance, is there a way to follow *and post* on the thread without 
  either subscribing to lkml or requiring myself to be on the CC list?
 
 I don't know if lkml allows non-subscriber posted, I think it does tho.

It does. And you can post via Gmane too.

 So you can follow from an archive, though that sucks.

nntp://news.gmane.org is quite nice.

-- 
Ville Syrjälä
[EMAIL PROTECTED]
http://www.sci.fi/~syrjala/


Re: Kernel / user interface for new memory manager

2005-08-23 Thread Ville Syrjälä

On Tue, Aug 23, 2005 at 01:31:43PM -0400, Michel Dänzer wrote:
 On Tue, 2005-08-23 at 10:00 -0700, Ian Romanick wrote:
  
  Keith Packard wrote:
   On Tue, 2005-08-23 at 16:22 +0200, Stephane Marchesin wrote:
   
  Ok, here is what came out of the irc meeting :
  - we don't need to enforce video memory ownership, but the drm needs to
  be able to track allocation owners anyway, for example if a process dies
  unexpectedly.
   
   How expensive would it be to protect one process's video memory from
   another? I would like to be able to run applications for different users
   on the screen at the same time and prevent both reading and writing of
   the images. If not possible on current hardware, what would it take from
   new hardware to make this possible?
  
  You'd need the same stuff that you need to protect system memory.  You'd
  need a hardware MMU that could block the accesses.  It might be possible
  to do it in software by looking at the command stream, but I suspect
  that would be pretty expensive.  It would be worth a try, I suppose.
 
 Yeah, I don't expect it to be prohibitive; we're basically doing just
 that for Radeons already.
 
 Another part would be to only allow mapping owned parts of the
 framebuffer.

Is there any way to make that work without going to the kernel for each 
allocation? Personally I'd like to have the protection even if it 
degrades performance slightly.

-- 
Ville Syrjälä
[EMAIL PROTECTED]
http://www.sci.fi/~syrjala/



Re: Kernel / user interface for new memory manager

2005-08-23 Thread Ville Syrjälä

On Tue, Aug 23, 2005 at 11:04:22AM -0700, Ian Romanick wrote:
 
 Stephane Marchesin wrote:
 
  Also, with the current log design for the memory manager, it is possible
  for a rogue process to make the log wrap and not call the
  force_log_update ioctl, thus being able to create some kind of race
  condition where the drm believes it still owns the memory but another
  process has allocated it.
 
 The log design presents numerous opportunities for rogue processes to do
 bad things.  At some level, that's inherent in the nature of direct
 rendering.  If you don't trust the processes, don't enable direct rendering.

Considering one of the major uses for direct rendering is games I don't 
think that idea will work.

-- 
Ville Syrjälä
[EMAIL PROTECTED]
http://www.sci.fi/~syrjala/



Re: Kernel / user interface for new memory manager

2005-08-23 Thread Ville Syrjälä

On Tue, Aug 23, 2005 at 10:49:28PM -0400, Michel Dänzer wrote:
 On Wed, 2005-08-24 at 00:40 +0200, Stephane Marchesin wrote:
  Alan Cox wrote:
   It's critical that the kernel knows what memory on the video space is
   being used for command queue and protects it. From the description of
   the SiS turboqueue I suspect you may be able to root a sis video box
   that way but without full docs I can't tell.
  
  Protecting a statically assigned command queue is one thing (something 
  similar to what's currently done on radeon would be sufficient), 
  protecting dynamically allocated video memory is another.
 
 If the DRM operated on memory objects instead of with offsets directly,
 it should be trivial: It only has to check that the caller has
 permission to access the memory objects involved.

To make this bulletproof it would also have to make sure the operation is 
clipped so that it doesn't extend beyond the allocated memory.
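
A minimal sketch of both checks, assuming a hypothetical memory object with
an owner and a size (none of these names come from the DRM):

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical memory object as the DRM might track it. */
struct mem_object {
	uint32_t offset; /* start of the object in video memory */
	uint32_t size;   /* size of the object in bytes */
	int owner;       /* id of the process that owns it */
};

/* Permission check: the caller must own the object. */
static bool op_permitted(const struct mem_object *obj, int caller)
{
	return obj->owner == caller;
}

/*
 * Clip an operation that starts 'start' bytes into the object and covers
 * 'len' bytes. Returns the number of bytes that stay inside the object.
 * Written without start + len so it cannot overflow.
 */
static uint32_t op_clip(const struct mem_object *obj, uint32_t start,
			uint32_t len)
{
	uint32_t room;

	if (start >= obj->size)
		return 0;

	room = obj->size - start;
	return len < room ? len : room;
}
```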

-- 
Ville Syrjälä
[EMAIL PROTECTED]
http://www.sci.fi/~syrjala/



Re: Getting DRI working on PCI MGA cards

2005-05-11 Thread Ville Syrjälä

On Tue, May 10, 2005 at 02:59:49PM -0700, Ian Romanick wrote:
 I've started working to get PCI MGA cards, the PCI G450 specifically, 
 working with DRI.  My initial goal is to just get it working with crummy 
 performance, then improve it by adding support for IOMMUs (to simulate 
 AGP texturing) on systems like pSeries and AMD64 that have them.
 
 I've started by digging through the DRI init process in the X.org MGA 
 DDX.  As near as I can tell, the driver uses AGP memory for four things. 
  To make the PCI cards work, I'll need to make it do without AGP for 
 these things.
 
 1. WARP microcode.  This seems *really* odd to me.  The DDX carves off a 
 32KiB chunk of AGP space and gives it to the kernel to use to store the 
 WARP microcode.  Why is the DDX involved in this *at all*?  The 
 microcode exists only in the kernel module.  It seems that the DRM could 
 just as easily drm_pci_alloc a chunk of memory large enough to hold the 
 microcode for the card (which is different for G400-class cards and 
 G200-class cards).

You don't even have to allocate a single 32KiB chunk. Instead you could 
allocate smaller chunks for each of the microcode images but 32KiB should 
be small enough for a single allocation, right?

 2. Primary DMA buffer.  The DDX carves off 1MB for the primary DMA 
 buffer.  I don't think that's outside the reasonable realm for 
 drm_pci_alloc.  If it is, can this work with a smaller buffer?

I haven't measured how much of the buffer is actually used under normal 
circumstances.

Currently the blit, swap, etc. ioctls directly write to the primary 
buffer. You could get some small gains from using secondary buffers for 
those as well but I'm not sure if it's really worth it.

 3. Secondary DMA buffers.  The DDX carves off room for 128 64KiB DMA 
 buffers.  I haven't dug that deeply, but I seem to recall that the DRI 
 driver uses these buffers as non-contiguous.  That is, it treats them as 
 128 separate buffers and not a big 8MB buffer that it carves 64KiB chunks 
 from.  If that's the case, then it should be easy enough to modify the 
 driver to drm_pci_alloc (up to) 128 64KiB chunks for PCI cards.  Is 
 there any actual performance benefit to having this be in AGP space at 

AGP reads are faster than PCI reads. I haven't actually measured if there 
is any real world difference.

 all or do they just have to be in the same address space as the 
 primary DMA buffer?

If by address space you mean AGP aperture vs. other memory then no they 
don't have to be in the same address space. You can choose to use PCI or AGP 
transfers every time you submit a new buffer to the hardware.

 4. AGP textures.  Without an IOMMU, we pretty much have to punt here. 
 Performance will be bad, but I can live with that.
 
 
 If these assumptions are at least /mostly/ correct, I think I have a 
 pretty good idea how I'll change the init process around.  I'd like to, 
 basically, pull most of MGADRIAgpInit into the kernel.  There will be a 
 single device-specific command called something like 
 DRM_MGA_DMA_BOOTSTRAP.  The DDX will pass in the desired AGP mode and 
 size.  The DRM will do some magic and fill in the rest of the structure. 
  The structure used will probably look something like below.  Notice 
 that the DDX *never* needs to know anything about the WARP microcode in 
 this arrangement.

Why would the DDX need to know anything about the DMA buffers or AGP mode?

 struct drm_mga_dma_bootstrap {
   /**
    * 1MB region of primary DMA space.  This is AGP space if
    * \c agp_mode is non-zero and PCI space otherwise.
    */
   drmRegion   primary_dma;
 
   /**
    * Region for holding textures.  If \c agp_mode is zero and
    * there is no IOMMU available, this will be zero size.
    */
   drmRegion   textures;
 
   /**
    * Up to 128 secondary DMA buffers.  Each region will be a
    * multiple of 64KiB.  If \c agp_mode is non-zero, typically
    * only the first region will be configured.  Otherwise,
    * each region will be used and allocated for 64KiB.
    */

Why make this behave differently for AGP and PCI?

   drmRegion   secondary_dma[128];
 
   u8  agp_size;   /** Size of AGP region in MB. */
   u8  agp_mode;   /** Set AGP mode.  0 for PCI. */
 };
 
 Does this look good, or should I try to get more sleep before designing 
 interfaces like this? ;)

-- 
Ville Syrjälä
[EMAIL PROTECTED]
http://www.sci.fi/~syrjala/



Re: Getting DRI working on PCI MGA cards

2005-05-11 Thread Ville Syrjälä

On Wed, May 11, 2005 at 09:59:25AM -0700, Ian Romanick wrote:
 Ville Syrjälä wrote:
 On Tue, May 10, 2005 at 02:59:49PM -0700, Ian Romanick wrote:
 
 all or do they just have to be in the same address space as the 
 primary DMA buffer?
 
 If by address space you mean AGP aperture vs. other memory then no they 
 don't have to be in the same address space. You can choose to use PCI or AGP 
 transfers every time you submit a new buffer to the hardware.
 
 Yeah, that's what I meant.  The selection is made by setting bit one of 
 the address to 0 for PCI or 1 for AGP, right?

Yep.
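
In code that would be roughly the following. The macro and function names
are made up for this sketch, and it assumes buffer addresses are at least
4-byte aligned so the low bits are free for flags.

```c
#include <stdint.h>

#define XFER_AGP (1u << 1) /* hypothetical name for "bit one" of the address */

/*
 * Build the address value submitted to the hardware for one buffer.
 * The transfer type is chosen per submission, not per address space.
 */
static uint32_t dma_submit_address(uint32_t addr, int use_agp)
{
	/* Assumption: low two bits carry flags, not address bits. */
	addr &= ~3u;

	return use_agp ? (addr | XFER_AGP) : addr;
}
```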

 4. AGP textures.  Without an IOMMU, we pretty much have to punt here. 
 Performance will be bad, but I can live with that.
 
 If these assumptions are at least /mostly/ correct, I think I have a 
 pretty good idea how I'll change the init process around.  I'd like to, 
 basically, pull most of MGADRIAgpInit into the kernel.  There will be a 
 single device-specific command called something like 
 DRM_MGA_DMA_BOOTSTRAP.  The DDX will pass in the desired AGP mode and 
 size.  The DRM will do some magic and fill in the rest of the structure. 
 The structure used will probably look something like below.  Notice 
 that the DDX *never* needs to know anything about the WARP microcode in 
 this arrangement.
 
 Why would the DDX need to know anything about the DMA buffers or AGP mode?
 
 Two reasons, I think.  The DDX tells the DRI driver where this stuff is. 

Ok. I forgot how weird the current system is :(

  Doesn't the DDX also use the DMA buffers for 2D drawing commands?

Last time I looked the DDX only did MMIO. That was quite a long time ago 
though so maybe things have changed.

-- 
Ville Syrjälä
[EMAIL PROTECTED]
http://www.sci.fi/~syrjala/



Re: [Linux-fbdev-devel] Re: FB model basic issues (WAS: radeon, apertures memory mapping)

2005-03-17 Thread Ville Syrjälä

On Thu, Mar 17, 2005 at 10:08:15AM +0100, Geert Uytterhoeven wrote:
 On Thu, 17 Mar 2005, Ville Syrjälä wrote:
  On Thu, Mar 17, 2005 at 12:12:58PM +1100, Benjamin Herrenschmidt wrote:
    I understand you can't have userspace program the accelerator while
    someone else is doing the same thing. Oh and I now understand that the
    same really applies to direct framebuffer access due to the swapper.
   
   And you can't have someone program the accelerator while somebody does
   direct access neither. It's basically all exclusive.
  
  I haven't seen that happen on any hardware I own. Matrox specs explicitly 
  mention that there is no need to synchronize accelerator and direct 
  framebuffer access.
 
 Really?

Really.

 I was always given Matrox as an example of a card that would lock up if
 you access the frame buffer while the accelerator is busy...

Apparently they didn't know what they were talking about.

-- 
Ville Syrjälä
[EMAIL PROTECTED]
http://www.sci.fi/~syrjala/



Re: [Linux-fbdev-devel] Re: FB model basic issues (WAS: radeon, apertures memory mapping)

2005-03-16 Thread Ville Syrjälä

On Wed, Mar 16, 2005 at 05:31:16PM +1100, Benjamin Herrenschmidt wrote:
 On Wed, 2005-03-16 at 00:21 -0500, Michel Dänzer wrote:
 

    Actually people do use it on big-endian systems but since neither the
    mach64, ati128 or radeon drivers play with the swapper settings I can
    only assume that they haven't been tested very extensively.
   
   You are wrong here, all of those 3 drivers do play with the swapper
   setting, [...]
  
  I think he was referring to the DirectFB drivers.

Exactly. I probably should have mentioned that explicitly.

 Well, do they revert it after atyfb/aty128fb/radeonfb set it then ?
 it's definitely set on mode switch.

The point is that there can be offscreen surfaces with different depth 
than the fbdev surface.

-- 
Ville Syrjälä
[EMAIL PROTECTED]
http://www.sci.fi/~syrjala/



Re: [Linux-fbdev-devel] Re: FB model basic issues (WAS: radeon, apertures memory mapping)

2005-03-16 Thread Ville Syrjälä

On Wed, Mar 16, 2005 at 12:51:27PM +1100, Benjamin Herrenschmidt wrote:
 On Wed, 2005-03-16 at 03:47 +0200, Ville Syrjälä wrote:
 
  There's also the case with Matrox Millennium I/II cards. They must place 
  the visible frame buffer so that no line crosses the boundary of memory 
  banks. matroxfb deals with that by moving the buffer and changing 
  smem_start and smem_len appropriately. But that is really bad for 
  DirectFB's offscreen memory management. After a mode switch the memory 
  manager would need to know what kind of initial byte offset was used. Of 
  course it would be possible to determine that from smem_start by knowing 
  how the aperture must be aligned. Actually we already do that sort of 
  thing to allow hw accelerated rendering when used on matroxfb controlled 
  G400/G450/G550 CRTC2. But in that case the offset won't change on mode 
  switch.
 
 So it all ends up to - mode switch has to bust memory layout, and any
 assumptions that DirectFB tries to do are incorrect.

I don't think so. Due to fbdev API limitations DirectFB just can't 
accurately determine how much memory will be used by the fbdev buffer. It 
can make an educated guess though. Just as long as you don't change the 
fact that the fbdev buffer will be located at the beginning of the memory 
that is.
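
The guess could look something like this. It is only a sketch: the helper
names are invented, and the inputs are meant to be the line_length and
yres_virtual values the fbdev API already reports.

```c
#include <stdint.h>

/*
 * Educated guess of the memory used by the fbdev buffer, assuming it
 * starts at the beginning of video memory.
 */
static uint64_t fbdev_buffer_guess(uint32_t line_length, uint32_t yres_virtual)
{
	return (uint64_t)line_length * yres_virtual;
}

/* First offset usable for offscreen surfaces, rounded up to 'align'. */
static uint64_t offscreen_start(uint64_t fbdev_bytes, uint64_t align)
{
	return (fbdev_bytes + align - 1) / align * align;
}
```

If the driver moves the visible buffer away from offset 0 the guess no
longer holds, which is exactly the Millennium I/II problem.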

   and because
   it seems that directFB has only been tested on little endian machines
   (damn them !) and thus doesn't understand the problem with swapper on
   framebuffer access).
  
  Actually people do use it on big-endian systems but since none of the
  mach64, ati128 or radeon drivers play with the swapper settings I can only
  assume that they haven't been tested very extensively.
 
 You are wrong here, all of those 3 drivers do play with the swapper
 setting, they all 3 set the swapper based on the bit depth of the
 screen, so writing an image to offscreen memory with a different bit
 depth will be broken. See usage of SURFACE_CNTL in radeonfb for example.

This was about the DirectFB drivers.

One thing just popped into my head though. If in the future we are going to
allow graphics cards to render to system memory, using the swapper will no
longer work. I don't see any other solution than having the CPU perform
the byte swapping.

-- 
Ville Syrjälä
[EMAIL PROTECTED]
http://www.sci.fi/~syrjala/



Re: FB model basic issues (WAS: radeon, apertures memory mapping)

2005-03-16 Thread Ville Syrjälä

On Wed, Mar 16, 2005 at 02:09:18PM +1100, Benjamin Herrenschmidt wrote:
 
  It's ugly, but that's not the point. The point is that all deployed
  versions of X (and even current X.org CVS head still, in fact) make this
  assumption.
 
 Oh, that's fine and that's not a problem. I will only repaint the
 framebuffer on bit depth or line length changes. I'm trying to talk
 about the _future_ here. That is support for dual head at the fbdev
 level and other niceties.

I don't see the current system slowly evolving into some superb future 
system with an in kernel memory manager. The current APIs just have too 
many limitations. I think the memory manager must be the foundation of 
everything and after it's in place the fbdev API should be able to use it. 
The only change to simple fbdev apps would be that they can't get access 
to any offscreen memory as they do now. Something like DirectFB would need 
to change to accommodate the new system but I don't see that as a problem.

I think the best short term option for radeonfb is to simply follow 
matroxfb's example and cut the memory into two parts. The cutoff point 
should probably be configurable via a module option.

-- 
Ville Syrjälä
[EMAIL PROTECTED]
http://www.sci.fi/~syrjala/



Re: FB model basic issues (WAS: radeon, apertures memory mapping)

2005-03-16 Thread Ville Syrjälä

On Wed, Mar 16, 2005 at 10:49:11AM +1100, Benjamin Herrenschmidt wrote:
 In the meantime, can you tell me more about your arbitration scheme ?

There is a lock associated with the graphics card. The lock is always 
taken before programming the hardware. Other things wanting access to the 
hardware wait until the lock is released.

-- 
Ville Syrjälä
[EMAIL PROTECTED]
http://www.sci.fi/~syrjala/



Re: FB model basic issues (WAS: radeon, apertures memory mapping)

2005-03-16 Thread Ville Syrjälä

On Wed, Mar 16, 2005 at 04:00:26PM -0500, Michel Dänzer wrote:
 On Wed, 2005-03-16 at 21:51 +0200, Ville Syrjälä wrote:
  
  One thing just popped into my head though. If in the future we are going to
  allow graphics cards to render to system memory, using the swapper will no
  longer work. I don't see any other solution than having the CPU perform
  the byte swapping.
 
 Sane hardware should have a way to deal with this as well.

In that case I'm not familiar with any sane hardware.

-- 
Ville Syrjälä
[EMAIL PROTECTED]
http://www.sci.fi/~syrjala/



Re: FB model basic issues (WAS: radeon, apertures memory mapping)

2005-03-16 Thread Ville Syrjälä

On Wed, Mar 16, 2005 at 03:58:07PM -0500, Alex Deucher wrote:
 On Wed, 16 Mar 2005 22:08:08 +0200, Ville Syrjälä [EMAIL PROTECTED] wrote:
  On Wed, Mar 16, 2005 at 02:09:18PM +1100, Benjamin Herrenschmidt wrote:
  
It's ugly, but that's not the point. The point is that all deployed
versions of X (and even current X.org CVS head still, in fact) make this
assumption.
  
   Oh, that's fine and that's not a problem. I will only repaint the
   framebuffer on bit depth or line length changes. I'm trying to talk
   about the _future_ here. That is support for dual head at the fbdev
   level and other niceties.
  
  I don't see the current system slowly evolving into some superb future
  system with an in kernel memory manager. The current APIs just have too
  many limitations. I think the memory manager must be the foundation of
  everything and after it's in place the fbdev API should be able to use it.
  The only change to simple fbdev apps would be that they can't get access
  to any offscreen memory as they do now. Something like DirectFB would need
  to change to accommodate the new system but I don't see that as a problem.
  
  I think the best short term option for radeonfb is to simply follow
  matroxfb's example and cut the memory into two parts. The cutoff point
  should probably be configurable via a module option.
  
 
 if we are going to go through the trouble to do it at all why not do
 it the right way?

I haven't seen anyone coming forward with a design/code for the memory 
manager.

In the meantime I'm assuming that people might want to make some use of 
their dualhead cards...

-- 
Ville Syrjälä
[EMAIL PROTECTED]
http://www.sci.fi/~syrjala/



Re: FB model basic issues (WAS: radeon, apertures memory mapping)

2005-03-16 Thread Ville Syrjälä

On Thu, Mar 17, 2005 at 10:27:56AM +1100, Benjamin Herrenschmidt wrote:
 On Wed, 2005-03-16 at 22:55 +0200, Ville Syrjälä wrote:
  On Wed, Mar 16, 2005 at 10:49:11AM +1100, Benjamin Herrenschmidt wrote:
   In the meantime, can you tell me more about your arbitration scheme ?
  
  There is a lock associated with the graphics card. The lock is always 
  taken before programming the hardware. Other things wanting access to the 
  hardware wait until the lock is released.
 
 Ok, so it would be easy to have directFB use an external arbiter without
 breaking existing clients ? It will need at least to use the vga arbiter
 that I'm about to finish, that should allow at least to have X on one
 card and directFB on another without conflict.

Is the vga arbiter required for something else besides access to some 
legacy ports? DirectFB only uses legacy ports to wait for vsync if a 
better method is not available.

-- 
Ville Syrjälä
[EMAIL PROTECTED]
http://www.sci.fi/~syrjala/



Re: FB model basic issues (WAS: radeon, apertures memory mapping)

2005-03-16 Thread Ville Syrjälä

On Thu, Mar 17, 2005 at 10:30:01AM +1100, Benjamin Herrenschmidt wrote:
 On Wed, 2005-03-16 at 23:08 +0200, Ville Syrjälä wrote:
  On Wed, Mar 16, 2005 at 03:58:07PM -0500, Alex Deucher wrote:
 
  I haven't seen anyone coming forward with a design/code for the memory 
  manager.
  
  In the meantime I'm assuming that people might want to make some use of 
  their dualhead cards...
 
 Are you aware that with the current fbdev API, there will simply be no
 working use of dual head ? As soon as somebody will try to do 2
 different things on the 2 heads, it will either lockup due to lack of
 engine arbitration, or have wrong endianness, or whatever ...

I understand you can't have userspace program the accelerator while 
someone else is doing the same thing. Oh and I now understand that the 
same really applies to direct framebuffer access due to the swapper. I 
hadn't really thought about that issue before since I don't own a 
big-endian system. I really must try to get one...

So basically to fix both issues we need some locks everyone must acquire 
before accessing the hardware.

With the current "mmap() the registers and go" interface the accelerator lock
wouldn't actually guarantee anything but it would allow well behaving 
applications to share the accelerator. Good behaviour is already expected 
from the applications anyway due to the direct access to hardware 
registers.

-- 
Ville Syrjälä
[EMAIL PROTECTED]
http://www.sci.fi/~syrjala/



Re: FB model basic issues (WAS: radeon, apertures memory mapping)

2005-03-16 Thread Ville Syrjälä

On Thu, Mar 17, 2005 at 12:12:58PM +1100, Benjamin Herrenschmidt wrote:
 
  I understand you can't have userspace program the accelerator while 
  someone else is doing the same thing. Oh and I now understand that the 
  same really applies to direct framebuffer access due to the swapper.  
 
 And you can't have someone program the accelerator while somebody does
 direct access either. It's basically all exclusive.

I haven't seen that happen on any hardware I own. Matrox specs explicitly 
mention that there is no need to synchronize accelerator and direct 
framebuffer access.

-- 
Ville Syrjälä
[EMAIL PROTECTED]
http://www.sci.fi/~syrjala/



Re: FB model basic issues (WAS: radeon, apertures memory mapping)

2005-03-15 Thread Ville Syrjälä

On Tue, Mar 15, 2005 at 12:30:56PM +0100, Roland Scheidegger wrote:
 Ville Syrjälä wrote:
   I think that making the assumption that all memory is preserved when 
 the
 memory layout (virtual resolution and depth) doesn't change is perfectly 
 valid too. That would allow X to do it's Ctrl-Alt-+ and - things without 
 repainting the whole screen.
 I'm not sure I agree here, as it's not always true. For instance, the 
 radeon has some restrictions whether it can use tiling or not with a 
 certain mode (interlace/double scan) thus you need to redraw everything 
 anyway

I didn't know about that. My first thought would be to disallow such modes 
but knowing that X lacks a proper fullscreen API that might not be a 
realistic option.

 (which is exactly why I implemented a driver workaround to 
 repaint everything when that happens - in fact the workaround also gets 
 rid of the offscreen contents, which is not necessary, but was much 
 easier to implement, since I couldn't find an easy way to invalidate 
 the framebuffer). What's the big deal with repainting everything? It's 
 not like you would do 100 mode changes per second so it would be 
 performance-critical...

I don't really have a big deal with invalidating the visible part of the 
memory.

-- 
Ville Syrjälä
[EMAIL PROTECTED]
http://www.sci.fi/~syrjala/



Re: [Linux-fbdev-devel] Re: FB model basic issues (WAS: radeon, apertures memory mapping)

2005-03-15 Thread Ville Syrjälä

On Tue, Mar 15, 2005 at 09:51:40AM +0100, Geert Uytterhoeven wrote:
 On Tue, 15 Mar 2005, Ville Syrjälä wrote:
  If radeonfb will allocate the buffer for the second head from the top of
  the memory users would basically have to guess its location. matroxfb
  simply cuts the memory in two pieces and allocates the buffers from the 
  start of each piece. I don't really like that approach. Adding a simple 
  byte_offset field to fb_var_screeninfo would solve the problem quite 
  nicely but I don't know if such API changes are acceptable at this stage.
 
 You wouldn't have to guess its location, look at fix.smem_start.

But how would someone mmap() the whole memory then? matroxfb already plays 
tricks on fix.smem_start on Millennium I/II cards and it really confuses 
DirectFB's memory allocator.

 I once did a similar thing for an embedded prototype: take a fixed amount of
 memory for both frame buffers (this was a UMA system), fb0 starts from the top,
 fb1 starts from the bottom. You can enlarge each frame buffer, until you reach
 the memory of the other. Each fix.smem_{start,len} corresponds exactly to the
 memory allocated to each frame buffer.
 
 Of course, if you also want off-screen memory (i.e. memory beyond
 xres_virtual*yres_virtual*bpp/8), things get more complicated, since currently
 there's no way for the application to ask for a minimum amount of off-screen
 memory. Perhaps a new field in fb_var_screeninfo (and zero means `I don't
 care', for backwards compatibility).

Offscreen memory is pretty much essential for DirectFB.

-- 
Ville Syrjälä
[EMAIL PROTECTED]
http://www.sci.fi/~syrjala/



Re: FB model basic issues (WAS: radeon, apertures memory mapping)

2005-03-15 Thread Ville Syrjälä

On Tue, Mar 15, 2005 at 05:59:42PM +1100, Benjamin Herrenschmidt wrote:
 On Tue, 2005-03-15 at 08:01 +0200, Ville Syrjälä wrote:
 
  If radeonfb will allocate the buffer for the second head from the top of 
  the memory users would basically have to guess its location. matroxfb
  simply cuts the memory in two pieces and allocates the buffers from the 
  start of each piece. I don't really like that approach. Adding a simple 
  byte_offset field to fb_var_screeninfo would solve the problem quite 
  nicely but I don't know if such API changes are acceptable at this stage.
 
 And we don't know if all HW would support it anyway.

Such hardware would be free to ignore any user supplied byte_offset and 
place the buffer anywhere it wants. Even a read-only byte_offset field 
would help. But using the second head would require all apps to be updated 
to be aware of byte_offset :( Maybe some kind of API version thing could 
help here, i.e. the user sets flag X somewhere indicating byte_offset should be
used instead of changing smem_start.
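
A minimal sketch of how an application might consume such a field, assuming it existed. Note that `byte_offset` is NOT a member of the real `fb_var_screeninfo`; the `var_with_offset` struct and `visible_buffer()` below are hypothetical stand-ins for the proposal.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical subset of a screeninfo struct extended with byte_offset. */
struct var_with_offset {
    uint32_t line_length;  /* bytes per scanline */
    uint32_t yres_virtual; /* number of scanlines in the buffer */
    uint32_t byte_offset;  /* proposed: start of this head's buffer */
};

/*
 * Locate the visible buffer inside a mapping of the whole video memory.
 * Returns NULL if the buffer would not fit inside the mapping.
 */
static uint8_t *visible_buffer(uint8_t *aperture, size_t aperture_len,
                               const struct var_with_offset *var)
{
    size_t size;

    assert(aperture != NULL && var != NULL);
    size = (size_t)var->line_length * var->yres_virtual;

    if (var->byte_offset > aperture_len ||
        size > aperture_len - var->byte_offset)
        return NULL;
    return aperture + var->byte_offset;
}
```

With this shape smem_start could stay fixed at the start of video memory, so one mmap() of the whole memory would keep working for every head.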

    We are thinking with the new model in mind, and so far, a mode setting
    is under control of the framebuffer. Content of video memory (framebuffer,
    textures, overlay, whatever) simply cannot be considered as preserved
    across mode switches.

    We can't also block all evolutions just because we have to support a
    broken model.
   
   I'm not suggesting that, but I do think that tying together mode
   switching and memory allocation would be a big mistake.
  
  Indeed.
 
 The main issue, however, is access arbitration. I'd appreciate your
 DirectFB point of view on these things.

DirectFB has its own arbitration mechanism. It doesn't support using
multiple framebuffer devices at the same time. For that to work DirectFB 
would just have to know if some of the framebuffer devices are actually 
different outputs of the same card so that it could associate both with 
the same lock and accelerator state.

With the current system I don't see much chance of using accelerated fbcon 
on one head and accelerated DirectFB (or something else) on the other.

-- 
Ville Syrjälä
[EMAIL PROTECTED]
http://www.sci.fi/~syrjala/



Re: FB model basic issues (WAS: radeon, apertures memory mapping)

2005-03-15 Thread Ville Syrjälä

On Wed, Mar 16, 2005 at 09:44:19AM +1100, Benjamin Herrenschmidt wrote:
 
  DirectFB assumes all memory outside var.yres_virtual * fix.line_length is 
  preserved. A totally valid assumption in my opinion. 
 
 Except that you can't know in advance how much fix.line_length will be.
 The fix isn't really fixed. Different cards will have different
 requirements depending on the bit depth for example. On radeonfb, the
 line_length will vary due to alignment constraints related to the
 engine, or due to tiling, etc etc...
 
 So you basically don't know in advance what will be preserved... (And
 you can't, unless you start having all sort of card specific knowledge).

True. Currently DirectFB doesn't handle this correctly. But that could be 
easily fixed if only line_length wasn't totally misplaced. It really 
belongs to fb_var_screeninfo. We could first test the mode with 
FB_ACTIVATE_TEST and actually see how much memory it needs and could 
evict enough offscreen surfaces to make room before actually setting the 
mode. Currently it would need some guesswork.
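
The guesswork could look something like the sketch below. The pitch alignment is an assumption that differs per driver and per mode (which is exactly the problem being discussed), and both function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Guess the line length of a mode: the bytes needed for one line, rounded
 * up to `align`, which must be a power of two. The real value is only
 * known once the driver reports fix.line_length after the mode is set.
 */
static uint32_t guess_line_length(uint32_t xres_virtual, uint32_t bpp,
                                  uint32_t align)
{
    uint32_t bytes = (xres_virtual * bpp + 7) / 8;

    assert(align != 0 && (align & (align - 1)) == 0);
    return (bytes + align - 1) & ~(align - 1);
}

/* Guess how many bytes at the start of video memory the mode will use. */
static uint64_t guess_mode_bytes(uint32_t xres_virtual, uint32_t yres_virtual,
                                 uint32_t bpp, uint32_t align)
{
    return (uint64_t)guess_line_length(xres_virtual, bpp, align) * yres_virtual;
}
```

If line_length were reported through fb_var_screeninfo, a FB_ACTIVATE_TEST call could return the real figure and this estimate would be unnecessary.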

-- 
Ville Syrjälä
[EMAIL PROTECTED]
http://www.sci.fi/~syrjala/



Re: [Linux-fbdev-devel] Re: FB model basic issues (WAS: radeon, apertures memory mapping)

2005-03-15 Thread Ville Syrjälä

On Wed, Mar 16, 2005 at 10:50:52AM +1100, Benjamin Herrenschmidt wrote:
 On Wed, 2005-03-16 at 07:37 +0800, Antonino A. Daplas wrote:
  On Tuesday 15 March 2005 21:36, Ville Syrjälä wrote:
   On Tue, Mar 15, 2005 at 09:51:40AM +0100, Geert Uytterhoeven wrote:
    On Tue, 15 Mar 2005, Ville Syrjälä wrote:
     If radeonfb will allocate the buffer for the second head from the top
     of the memory users would basically have to guess its location.
     matroxfb simply cuts the memory in two pieces and allocates the buffers
     from the start of each piece. I don't really like that approach. Adding
     a simple byte_offset field to fb_var_screeninfo would solve the problem
     quite nicely but I don't know if such API changes are acceptable at
     this stage.
   
    You wouldn't have to guess its location, look at fix.smem_start.
  
   But how would someone mmap() the whole memory then? matroxfb already plays
  
  This is multi-head, right?  That implies one fb per head.  So,  can't you do
  separate mmaps?  fb0->fix.smem_start|len and fb1->fix.smem_start|len.
 
 Sure, re-read the thread :) Also, he's worried about management of
 offscreen memory. (which is an issue too because of possible problems
 with the setup of the apertures - start of the discussion,

There's also the case with Matrox Millennium I/II cards. They must place 
the visible frame buffer so that no line crosses the boundary of memory 
banks. matroxfb deals with that by moving the buffer and changing 
smem_start and smem_len appropriately. But that is really bad for 
DirectFB's offscreen memory management. After a mode switch the memory 
manager would need to know what kind of initial byte offset was used. Of 
course it would be possible to determine that from smem_start by knowing
how the aperture must be aligned. Actually we already do that sort of
thing to allow hw accelerated rendering when used on matroxfb controlled
G400/G450/G550 CRTC2. But in that case the offset won't change on mode
switch.
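
Recovering the initial byte offset from smem_start can be sketched as below, under the assumption that the aperture itself is aligned to a known power of two. The function name and the alignment value used by a caller are illustrative only, not taken from DirectFB or matroxfb.

```c
#include <assert.h>
#include <stdint.h>

/*
 * If the aperture is known to start on an `aperture_align` boundary, any
 * low bits set in smem_start must be the offset the driver added when it
 * moved the visible buffer.
 */
static uint64_t initial_byte_offset(uint64_t smem_start, uint64_t aperture_align)
{
    /* The mask trick only works for power-of-two alignments. */
    assert(aperture_align != 0 &&
           (aperture_align & (aperture_align - 1)) == 0);
    return smem_start & (aperture_align - 1);
}
```

For example, with an 8 MB aligned aperture a reported smem_start of 0xE0001000 would imply an offset of 0x1000 bytes.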

 and because
 it seems that directFB has only been tested on little endian machines
 (damn them !) and thus doesn't understand the problem with swapper on
 framebuffer access).

Actually people do use it on big-endian systems but since none of the
mach64, ati128 or radeon drivers play with the swapper settings I can only
assume that they haven't been tested very extensively.

-- 
Ville Syrjälä
[EMAIL PROTECTED]
http://www.sci.fi/~syrjala/



Re: radeon, apertures memory mapping

2005-03-14 Thread Ville Syrjälä

On Sun, Mar 13, 2005 at 11:04:59PM +1100, Benjamin Herrenschmidt wrote:
  
  I must be missing something obvious because I don't quite
  understand what major drawbacks there are with the non-overlapping mode. 
  As I see it you get at least the same amount of CPU accessible memory as 
  you get in the overlapping mode.
 
 Yes, you do, but that means that if the apertures are configured such
 that the entire VRAM fits in a single aperture, then you just can't use
 the second aperture at all. Which means you can't have separate swapper
 setting for both apertures, and thus, can't let two independent
 processes access the video memory with different bit depth, at least on
 big endian machines unless you do trickery, and play with the swapper
 before each access.

Ok so the problem is byte swapping. Looking at atyfb for example it uses 
the big-endian aperture on big-endian systems and selects the byte 
swapping method according to the bit depth. If that really means that all 
host access to the aperture gets byte swapped then I don't see how the 
current situation can work correctly for DirectFB. Offscreen surfaces can 
use any bit depth and so their bytes could be swapped incorrectly. Makes 
me wish I had a PPC box alongside the x86 one.
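
The suspected failure can be illustrated with a simplified model in which the swapper reverses the bytes inside each fixed-size unit. This is only a sketch of the idea, not a description of how any particular chip implements its swapper; the helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Model of a swapper: reverse the bytes inside each `unit`-byte group. */
static void swap_units(uint8_t *buf, size_t len, size_t unit)
{
    assert(unit > 0);
    for (size_t i = 0; i + unit <= len; i += unit) {
        for (size_t a = 0, b = unit - 1; a < b; a++, b--) {
            uint8_t t = buf[i + a];
            buf[i + a] = buf[i + b];
            buf[i + b] = t;
        }
    }
}

/* Read 16 bpp pixel `n` the way little-endian hardware would. */
static uint16_t read_le16(const uint8_t *buf, size_t n)
{
    return (uint16_t)(buf[2 * n] | (buf[2 * n + 1] << 8));
}
```

Take two 16 bpp pixels 0x1234 and 0x5678 written by a big-endian CPU as the bytes 12 34 56 78. A 16-bit swap yields 34 12 78 56, which the hardware reads back correctly. A 32-bit swap, as would be selected for a 32 bpp screen, yields 78 56 34 12, so the hardware sees the two pixels in the wrong order.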

-- 
Ville Syrjälä
[EMAIL PROTECTED]
http://www.sci.fi/~syrjala/



Re: [Linux-fbdev-devel] Re: radeon, apertures memory mapping

2005-03-14 Thread Ville Syrjälä

On Mon, Mar 14, 2005 at 05:30:04PM +0100, Soeren Sandmann wrote:
 Benjamin Herrenschmidt [EMAIL PROTECTED] writes:
 
  In an ideal world ... However, since we are planning to move the memory
  manager to the kernel, that would mean a kernel access (syscall, ioctl,
  whatever...) twice per access to AGP memory. Not realistic.
 
 Could the user space driver batch many such accesses together and use
 a lock_many()/unlock_many() API?

Naturally it should try to do as much as possible during the
lock()/unlock() sequence.

-- 
Ville Syrjälä
[EMAIL PROTECTED]
http://www.sci.fi/~syrjala/



Re: FB model basic issues (WAS: radeon, apertures memory mapping)

2005-03-14 Thread Ville Syrjälä

On Mon, Mar 14, 2005 at 11:59:37PM -0500, Michel Dänzer wrote:
 On Tue, 2005-03-15 at 09:37 +1100, Benjamin Herrenschmidt wrote:
   Be that as it may, it remains a fact that such a change would break
   existing installations...
   
   I think that mode setting and memory allocation should be separated. X
   will always reserve enough video RAM for the largest resolution it uses
   for the screen contents.
  
  But X has no control on where fbdev will allocate memory. 
 
 My understanding is that so far, the fbdev API has pretty much implied
 that any mode scans out the beginning of the memory accessed via the
 framebuffer device, unless the panning ioctl is used. IIRC at least
 DirectFB makes basically the same assumptions as X there.

DirectFB assumes all memory outside var.yres_virtual * fix.line_length is 
preserved. A totally valid assumption in my opinion. It allocates all 
offscreen memory starting from the top of the memory so overlaps with 
fbdev are as rare as possible. Currently it doesn't handle multi head 
except for Matrox G400/G450/G550 TV-out but that is handled without fbdev 
so no API limitations get in the way.

I think that making the assumption that all memory is preserved when the 
memory layout (virtual resolution and depth) doesn't change is perfectly 
valid too. That would allow X to do its Ctrl-Alt-+ and - things without
repainting the whole screen.

If radeonfb will allocate the buffer for the second head from the top of 
the memory users would basically have to guess its location. matroxfb
simply cuts the memory in two pieces and allocates the buffers from the 
start of each piece. I don't really like that approach. Adding a simple 
byte_offset field to fb_var_screeninfo would solve the problem quite 
nicely but I don't know if such API changes are acceptable at this stage.

  We are thinking with the new model in mind, and so far, a mode setting 
  is under control of the framebuffer. Content of video memory (framebuffer,
  textures, overlay, whatever) simply cannot be considered as preserved
  across mode switches.
  
  We can't also block all evolutions just because we have to support a
  broken model. 
 
 I'm not suggesting that, but I do think that tying together mode
 switching and memory allocation would be a big mistake.

Indeed.

-- 
Ville Syrjälä
[EMAIL PROTECTED]
http://www.sci.fi/~syrjala/



Re: radeon, apertures memory mapping

2005-03-13 Thread Ville Syrjälä

On Sun, Mar 13, 2005 at 12:35:43PM +1100, Benjamin Herrenschmidt wrote:
 Hi !
 
 I'm currently rewriting radeonfb to implement support for dual head, and
 ultimately, to make it more friendly to be hooked on DRM for mesa-solo
 style setups.
 
 I have some issues however related to the way memory is mapped and
 dealing with apertures. Here is the story, suggestions welcome:
 
 The radeon card exposes to the system 2 separate apertures. That is, the
 PCI region is actually cut by the hardware in two halves, each of them
 being an aperture. Each aperture can have different configuration for
 the endian swappers (and possibly the surface tiling registers).
 
 I can configure the apertures to both map to the same bit of video
 memory (both covering the framebuffer from 0), or to be split, that is
 aperture 0 covering the framebuffer from 0 to CONFIG_APER_SIZE (size of
 an aperture, that is half of the PCI BAR allocation), and aperture 1
 covering the framebuffer from CONFIG_APER_SIZE to CONFIG_APER_SIZE*2.
 
 However, I can't change anything to CONFIG_APER_SIZE itself, it's
 decided by straps, either HW or in the ROM. So we end up with different
 setups depending on how the BIOS has configured things. I know that
 Apple chips are usually wired so that CONFIG_APER_SIZE is half the video
 memory, so if I use the first mode, I can only access half of the video
 RAM from PCI, if I use the second, each aperture maps a different half
 of video memory with possibly different endian swapping.
 
 But I think the setups in real life are more diverse and some BIOSes
 will have CONFIG_APER_SIZE at least as big as the entire video memory,
 thus forcing me to use the overlapped setup. In fact, CONFIG_APER_SIZE
 may even be smaller than half of the vram and thus limiting the CPU to
 part of the VRAM anyway.
 
 I have toyed with all sort of setups, and I have +/- decided to not
 bother, and always do this, please tell me what you think:
 
 Always setup HOST_PATH_CNTL.HDP_APER_CNTL to 0. That is, both apertures
 are always overlapping. On Macs, or other machines that strap
 CONFIG_APER_SIZE to half of VRAM, that means only half of vram can be
 directly accessed by the CPU. I think this is fine because of these:
 
  - We only really need to bother about CPU access for the framebuffer
 itself (and possibly the cursor). That is normal non-accelerated fbdev
 operations and mmap'ing of the framebuffer in user space. This is not
 really a problem if that is limited to some part of vram. It puts a
 small constraint on the allocation of video memory: the framebuffer has
 to be near the beginning.

It will limit DirectFB to access only CONFIG_APER_SIZE. DirectFB needs CPU 
access to offscreen memory for software fallbacks and explicit user 
access. Any other compositing window system would need to do the same. If 
the video memory manager ever gets done then it shouldn't be a major 
problem because the kernel could blit the data to/from the inaccessible
part without the application even realizing it. Although direct access 
might be useful in that case also since it could reduce pressure on the 
GART address space.
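
How much video memory the CPU can reach in each aperture configuration can be worked out with a small sketch. It follows the description quoted above (two apertures of CONFIG_APER_SIZE each, either both starting at 0 or laid end to end); the enum and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

enum aper_mode {
    APER_OVERLAPPED, /* both apertures map video memory from offset 0 */
    APER_SPLIT       /* aperture 1 continues where aperture 0 ends */
};

/* Bytes of video memory directly reachable by the CPU. */
static uint64_t cpu_visible_bytes(uint64_t vram, uint64_t aper_size,
                                  enum aper_mode mode)
{
    uint64_t window;

    assert(aper_size > 0);
    window = (mode == APER_SPLIT) ? 2 * aper_size : aper_size;
    return vram < window ? vram : window;
}
```

With 64 MB of VRAM and a 32 MB aperture size, the overlapped setup exposes 32 MB and the split setup all 64 MB. When the aperture size is at least as large as VRAM both setups expose everything.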

 But my opinion is that a mode switch will
 pretty much always invalidate everything that is cached in video memory,

I don't see any reason for such a sledgehammer approach. If the new mode 
doesn't overlap with any offscreen data then there is no need to 
invalidate anything.

-- 
Ville Syrjälä
[EMAIL PROTECTED]
http://www.sci.fi/~syrjala/



Re: radeon, apertures memory mapping

2005-03-13 Thread Ville Syrjälä

On Sun, Mar 13, 2005 at 08:20:45PM +1100, Benjamin Herrenschmidt wrote:
 On Sun, 2005-03-13 at 10:22 +0200, Ville Syrjälä wrote:
 
   
    - We only really need to bother about CPU access for the framebuffer
   itself (and possibly the cursor). That is normal non-accelerated fbdev
   operations and mmap'ing of the framebuffer in user space. This is not
   really a problem if that is limited to some part of vram. It puts a
   small constraint on the allocation of video memory: the framebuffer has
   to be near the beginning.
  
  It will limit DirectFB to access only CONFIG_APER_SIZE. DirectFB needs CPU 
  access to offscreen memory for software fallbacks and explicit user 
  access. Any other compositing window system would need to do the same. If 
  the video memory manager ever gets done then it shouldn't be a major 
  problem because the kernel could blit the data to/from the inaccessible 
  part without the application even realizing it. Although direct access 
  might be useful in that case also since it could reduce pressure on the 
  GART address space.
 
 Yes, that means direct access will be limited to half of the vram on
 some setups and not on others, depending on how the BIOS sets up the
 card. Is this a real issue ? I don't think so personally. Especially
 since directfb could make use of DRM to do DMA blits either from main
 memory or from AGP space...

AGP as it's currently used is pretty much pointless for software fallbacks 
since reading from AGP memory is nearly as slow as reading from video 
memory.

 Or things can be put in accessible space,
 and blitted elsewhere using the accelerator.

This could work (and it would avoid DRM which in my book is a plus) but 
it's not very nice to have to copy the data twice.

 That half of vram is plenty enough for a framebuffer (and more). it's
 only an issue when you start doing very large offscreen surfaces. Do you
 have much usage of those without DMA ?

I have about 26MB of video memory used when running XDirectFB with GNOME, 
epiphany and 4 gnome-terminals, and I also have some videos playing on the 
TV at the same time. That's on a 32MB G400 BTW.


I must be missing something obvious because I don't quite 
understand what major drawbacks there are with the non-overlapping mode. 
As I see it you get at least the same amount of CPU accessible memory as 
you get in the overlapping mode.

-- 
Ville Syrjälä
[EMAIL PROTECTED]
http://www.sci.fi/~syrjala/



Re: [Linux-fbdev-devel] Re: radeon, apertures memory mapping

2005-03-13 Thread Ville Syrjälä

On Sun, Mar 13, 2005 at 11:19:35AM -0500, Jon Smirl wrote:
 On Sun, 13 Mar 2005 23:04:59 +1100, Benjamin Herrenschmidt
 [EMAIL PROTECTED] wrote:
  
   AGP as it's currently used is pretty much pointless for software fallbacks
   since reading from AGP memory is nearly as slow as reading from video
   memory.
  
  Hrm.. I wouldn't expect _that_ slow. It's uncacheable, right, but still
  on a faster bus. Especially if we use it the way we do on ppc where we
  actually map the RAM pages directly instead of having processes go
  through the GART.
 
 I asked at the Xdev conference if there were page table tricks that
 would work for accessing GART memory.  Everybody said no but I'm still
 wondering if there are any.
 
 For example the ppc has an instruction for flushing specific pages
 from cache, unlike the x86 where you can only flush everything.
 
 So on the ppc you could leave the GART memory mapped normally and
 cached. Do all of your fallback calculations, then flush the address
 range from cache. Now tell the GPU to go use it.
 
 Can't GART memory be normally cached RAM as long as we flush the cache
 before telling the GPU to use it?
 
 If you are doing fallback calculations in a 6MB buffer that is 1,500
 pages. Accessing all of this effectively flushes the data cache. Once
 you are done with it you probably don't want those pages in the cache
 anyway.

I don't understand why we have GART memory anyway. It's just main memory 
and I don't see any point going through the GART to access it with the 
CPU. Only the graphics card needs to use the GART.

-- 
Ville Syrjälä
[EMAIL PROTECTED]
http://www.sci.fi/~syrjala/



Re: radeon, apertures memory mapping

2005-03-13 Thread Ville Syrjälä

On Sun, Mar 13, 2005 at 06:00:01PM -0500, Jon Smirl wrote:
 On Mon, 14 Mar 2005 09:20:01 +1100, Benjamin Herrenschmidt
 [EMAIL PROTECTED] wrote:
  Though the flushes may be fast if there is no actual hit in the cache, I
  agree. Again, that should be benched.
  
  In fact, i would _love_ to be able to mark AGP memory as cacheable on
  ppc, even if there is no performance benefit in the end. The issue is
  that currently, we end up having both a cacheable and a non-cacheable
  mapping for those pages (the kernel linear mapping still maps those
  pages cacheable, and it's almost impossible to get rid of that unless
  you are prepared to disable the large pages mapping of kernel space or
  the BATs on ppc32, which would harm kernel performances significantly).
  
  It works, but it's illegal. That means that the CPU might well speculate
  a load from one of these pages in kernel-land just because it happens to
  be next to a page where you are iterating an array, and may then bring a
  bit in the cache from that page.
 
 That shouldn't matter the page brought in would be for a speculative
 read and never accessed. It should just fall out of the cache and not
 be written back. There is only one cachable mapping. In this model
 writes are always followed by a flush before telling the GPU to access
 the memory that has just been written.

What about this scenario?

Speculative read - AGP master writes new data - CPU has invalid data in 
cache :(

-- 
Ville Syrjälä
[EMAIL PROTECTED]
http://www.sci.fi/~syrjala/



Re: radeon, apertures memory mapping

2005-03-13 Thread Ville Syrjälä

On Mon, Mar 14, 2005 at 10:48:19AM +1100, Benjamin Herrenschmidt wrote:
 
   That shouldn't matter the page brought in would be for a speculative
   read and never accessed. It should just fall out of the cache and not
   be written back. There is only one cachable mapping. In this model
   writes are always followed by a flush before telling the GPU to access
   the memory that has just been written.
  
  What about this scenario?
  
  Speculative read - AGP master writes new data - CPU has invalid data in 
  cache :(
 
 First, we must be very careful with AGP master writes. I don't know if
 we do a lot of them currently, but I know a collection of north bridges
 that do not support them.

I don't think normal drivers do them at all. I did experiment with 
DirectFB at one point and had it place all offscreen surfaces to AGP 
memory. It worked really well on my hardware (G400 + VIA KT133 
northbridge). I also tried it with PCI transfers and that too worked but 
was naturally slower. I'd like to make DirectFB use AGP again since 32MB 
of video memory isn't always enough.

 (Which is interesting, that means that if we want to copy something out
 of video memory, we can't write it to AGP memory and then read it, we
 need to actually do the blit from the CPU, good to know for our memory
 manager. That also means that we have a problem if the video memory
 isn't entirely accessible by the CPU ...)

What about PCI master writes? Are there bridges that don't support even 
those?

 That's something we should probably think about doing properly: Have a
 list of AGP issues (errata ?) bits that are communicated by the AGP
 host driver to the DRM.
 
 At least all the early Apple AGP bridges don't do writes, and I remember
 we have trouble with a few x86 ones as well. There are also issues when
 a single AGP burst crosses a page boundary, and other things like that.

:(

-- 
Ville Syrjälä
[EMAIL PROTECTED]
http://www.sci.fi/~syrjala/



Re: radeon, apertures memory mapping

2005-03-13 Thread Ville Syrjälä

On Sun, Mar 13, 2005 at 07:25:15PM -0500, Jon Smirl wrote:
 On Mon, 14 Mar 2005 10:48:19 +1100, Benjamin Herrenschmidt
 [EMAIL PROTECTED] wrote:
  
    That shouldn't matter the page brought in would be for a speculative
    read and never accessed. It should just fall out of the cache and not
    be written back. There is only one cachable mapping. In this model
    writes are always followed by a flush before telling the GPU to access
    the memory that has just been written.
  
   What about this scenario?
  
   Speculative read - AGP master writes new data - CPU has invalid data in
   cache :(
  
 
 You need to reverse the cache flush process if you are going to read
 data written by the GPU.
 
 1) Make sure GPU is finished writing
 2) flush your cache
 3) read AGP memory like normal RAM.

Oh right. The CPU shouldn't write back the cached data since it hasn't 
changed.

I think you'd also need the GPU to issue an AGP flush command between 
steps 1 and 2.
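
The read-back ordering discussed above, including the extra AGP flush, could be sketched like this. Every function here is a stand-in that only records the order of the steps: `gpu_wait_idle`, `gpu_agp_flush` and `cpu_flush_range` are hypothetical names, and real code would wait on the engine, emit a chip-specific flush and use an architecture-specific cache primitive instead.

```c
#include <stddef.h>
#include <string.h>

/* Step log so the ordering can be inspected; purely illustrative. */
static char step_log[8];
static size_t step_count;

static void record(char step)
{
    if (step_count < sizeof(step_log) - 1)
        step_log[step_count++] = step;
}

/* Hypothetical hooks, stubbed out. */
static void gpu_wait_idle(void) { record('W'); }
static void gpu_agp_flush(void) { record('A'); }

static void cpu_flush_range(const void *p, size_t n)
{
    (void)p;
    (void)n;
    record('C');
}

/* Read data that the GPU wrote into cacheable AGP memory. */
static void read_back(void *dst, const void *agp_src, size_t len)
{
    gpu_wait_idle();               /* 1) GPU has finished writing       */
    gpu_agp_flush();               /*    make sure the writes hit RAM   */
    cpu_flush_range(agp_src, len); /* 2) drop possibly stale cache lines */
    memcpy(dst, agp_src, len);     /* 3) read it like normal RAM        */
    record('R');
}
```

The point of the sketch is only the order: wait, AGP flush, CPU cache flush, then read.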

-- 
Ville Syrjälä
[EMAIL PROTECTED]
http://www.sci.fi/~syrjala/



Re: radeon, apertures memory mapping

2005-03-13 Thread Ville Syrjälä

On Mon, Mar 14, 2005 at 11:41:23AM +1100, Paul Mackerras wrote:
 Jon Smirl writes:
 
   It works, but it's illegal. That means that the CPU might well speculate
   a load from one of these pages in kernel-land just because it happens to
   be next to a page where you are iterating an array, and may then bring a
   bit in the cache from that page.
  
  That shouldn't matter the page brought in would be for a speculative
  read and never accessed. It should just fall out of the cache and not
  be written back. There is only one cachable mapping. In this model
  writes are always followed by a flush before telling the GPU to access
  the memory that has just been written.
 
 That would be fine, but it would mean making sure that every time any
 code in the DRI, DRM or X server writes to the AGP memory, it does the
 flush as well.  Sounds like a maintenance nightmare to me...

It should be the responsibility of the memory manager. If anything wants 
to access the memory it would call lock() and when it's done with the 
memory it calls unlock(). That's exactly how DirectFB's memory manager 
works.
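
A rough sketch of that idea, with the cache maintenance hidden inside lock() and unlock() so that callers cannot forget it. This is not DirectFB's real API: `struct surface`, `surface_lock`, `surface_unlock` and `cpu_flush_range` are hypothetical, and the flush is a stub that just counts calls. A real implementation would presumably also wait for the GPU before handing out the pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical surface living in cacheable AGP memory. */
struct surface {
    void  *cpu_addr; /* cacheable CPU mapping of the pages      */
    size_t size;
    int    locked;   /* non-zero while the CPU owns the memory  */
    int    flushes;  /* number of cache flushes, for inspection */
};

/* Stand-in for an architecture-specific flush over the surface. */
static void cpu_flush_range(struct surface *s)
{
    s->flushes++;
}

/* Hand the memory to the CPU. Flushing here gets rid of lines that
 * may have been loaded while the GPU was still writing. */
static void *surface_lock(struct surface *s)
{
    assert(!s->locked);
    cpu_flush_range(s);
    s->locked = 1;
    return s->cpu_addr;
}

/* Hand the memory back to the GPU. Flushing here writes back
 * whatever the CPU changed while it held the lock. */
static void surface_unlock(struct surface *s)
{
    assert(s->locked);
    cpu_flush_range(s);
    s->locked = 0;
}
```

Code in the DRI, DRM or X server would then only ever see lock()/unlock() pairs, and the flush policy lives in one place.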

-- 
Ville Syrjälä
[EMAIL PROTECTED]
http://www.sci.fi/~syrjala/



Re: dri span patches...

2005-03-04 Thread Ville Syrjälä

On Thu, Mar 03, 2005 at 06:49:26PM +0100, Roland Scheidegger wrote:
 Ville Syrjälä wrote:
 On Thu, Mar 03, 2005 at 05:45:15PM +0100, Roland Scheidegger wrote:
 
 Brian Paul wrote:
 
 Roland Scheidegger wrote:
 
 
 here's a patch which mainly does 3 things: - convert sis, mach64,
 and radeon to spantmp2. The sis and mach64 drivers got a slight
 change, previously you could not read back alpha values (always
 0xff) and I don't think there was a good reason for that?
 
 
 IIRC, the mach64 doesn't support destination alpha planes.  OpenGL 
 requires that reads of absent alpha planes returns 1.0.  I don't know
 if the SiS chip is the same.
 
 Are you sure? At least the driver handles things like GL_DST_ALPHA blend
 factors, wouldn't you get awfully bogus results if it wouldn't support 
 destination alpha planes in that case?
 
 
 Like I said before only the RGB components are blended. You can choose to 
 write 0, 1, As, 1-As, Ad or 1-Ad to the destination alpha 
 ([EMAIL PROTECTED] register). Currently the driver seems to 
 write 0. It would probably be a better idea to write 1 instead.
 Sorry, but I just can't see that in the driver. And there's no 
 ALPHA_DST_SEL bit, at least not in the mach64 reg file I have...

I was just looking at the specs :) They are named MACH64_ALPHA_DST_* in 
mach64_reg.h. The driver doesn't explicitly specify any value which 
means 0 gets written.

I actually just stumbled on this issue a few days ago with the mach64 
DirectFB driver. My plan for the DirectFB driver is simply to allow ZERO + 
ZERO/ONE/SRCALPHASAT blend functions for destinations with alpha. It's a 
rather serious limitation but I think it's better than incorrect 
rendering. The issue is even worse on older Rage chips since they can only 
write 0 to the destination alpha. But that is not an issue for the DRI 
driver since those chips aren't supported.

 Regardless if it can actually blend alpha values or not, there would be 
 some half-way useful alpha values probably in the buffer.

Either you get the correct results or the wrong results. Not sure if 
there are any really useful things you can do with incorrect values.

-- 
Ville Syrjälä
[EMAIL PROTECTED]
http://www.sci.fi/~syrjala/



Re: dri span patches...

2005-03-04 Thread Ville Syrjälä

On Thu, Mar 03, 2005 at 08:10:33PM +0100, Roland Scheidegger wrote:
 Ville Syrjälä wrote:
 Like I said before only the RGB components are blended. You can
 choose to write 0, 1, As, 1-As, Ad or 1-Ad to the destination
 alpha ([EMAIL PROTECTED] register). Currently the
 driver seems to write 0. It would probably be a better idea to
 write 1 instead.
 
 Sorry, but I just can't see that in the driver. And there's no 
 ALPHA_DST_SEL bit, at least not in the mach64 reg file I have...
 
 
 I was just looking at the specs :) They are named MACH64_ALPHA_DST_*
 in mach64_reg.h. The driver doesn't explicitly specify any value
 which means 0 gets written.
 
 I actually just stumbled on this issue a few days ago with the mach64
  DirectFB driver. My plan for the DirectFB driver is simply to allow
 ZERO + ZERO/ONE/SRCALPHASAT blend functions for destinations with
 alpha. It's a rather serious limitation but I think it's better than
 incorrect rendering. The issue is even worse on older Rage chips
 since they can only write 0 to the destination alpha. But that is not
 an issue for the DRI driver since those chips aren't supported.
 I think now I understand. It has alpha channel and all, but it simply
 won't perform the blending equation on the alpha channel, instead simply 
 writing zero, one, source alpha, 1 minus source alpha, dst alpha, or 1 
 - dst alpha.

Exactly.

 Actually the driver does not write 0, it writes the source 
 alpha value (MACH64_ALPHA_DST_SRCALPHA) as far as I can tell.

Ok I missed that with my grepping. I was probably looking for 
ALPHA_DST_SEL myself too :)

 Actually, this design means it would have some very limited support for 
 blend_func_separate :-).

I had the same thought :)

 Looks like a stupid design limitation to me (what would it cost to 
 implement that additional blend adder to the 3 you need anyway?), but ah 
 well. Maybe this wasn't required by DirectX 1.0 ;-).

Mach64 chips have quite a few stupid alpha limitations. Most likely the 
nastiest one being that texture alpha can't be modulated. And then there's 
the whole fog vs. alpha thing.

 In practice though, this might just work quite often, the alpha-blended 
 alpha values are probably not required a lot?

Not with X. With DirectFB they are needed every time you render to an ARGB 
window and then expect to display the window alpha blended on screen.

 Regardless if it can actually blend alpha values or not, there
 would be some half-way useful alpha values probably in the buffer.
 
 
 Either you get the correct results or the wrong results. Not sure if
  there are any really useful things you can do with incorrect values.
 I meant that you might just get the correct alpha values sometimes 
 (depending on the blend func that should be true I guess).

Right. Though there are only a few combinations that give correct results.
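
To make the "few combinations" concrete, here is a small sketch that compares what a real blender would store in destination alpha with what a select-only design can store. The names are illustrative and are not the MACH64_ALPHA_DST_* values. It looks at alpha in isolation with ZERO/ONE factors only, and ignores that the same blend functions also drive the RGB channels, so treat it as an assumption-laden toy rather than a description of the hardware.

```c
#include <stdbool.h>
#include <stddef.h>

/* The six values the chip can select for destination alpha. */
enum alpha_sel {
    SEL_ZERO, SEL_ONE, SEL_SRC, SEL_INV_SRC, SEL_DST, SEL_INV_DST,
    SEL_COUNT
};

enum factor { F_ZERO, F_ONE };

/* What a select-only design stores (8-bit alpha values). */
static unsigned hw_alpha(enum alpha_sel sel, unsigned as, unsigned ad)
{
    switch (sel) {
    case SEL_ZERO:    return 0;
    case SEL_ONE:     return 255;
    case SEL_SRC:     return as;
    case SEL_INV_SRC: return 255 - as;
    case SEL_DST:     return ad;
    case SEL_INV_DST: return 255 - ad;
    default:          return 0;
    }
}

/* What a real blender stores: As*Fs + Ad*Fd, clamped to 255. */
static unsigned blended_alpha(enum factor fs, enum factor fd,
                              unsigned as, unsigned ad)
{
    unsigned a = (fs == F_ONE ? as : 0) + (fd == F_ONE ? ad : 0);
    return a > 255 ? 255 : a;
}

/* Look for a selector that agrees with the blender on a handful of
 * sample (As, Ad) pairs. Returns false if none does. */
static bool find_selector(enum factor fs, enum factor fd,
                          enum alpha_sel *out)
{
    static const unsigned samples[][2] = {
        {0, 0}, {64, 200}, {200, 64}, {255, 255}, {30, 30}
    };
    for (int sel = 0; sel < SEL_COUNT; sel++) {
        bool ok = true;
        for (size_t i = 0; i < sizeof samples / sizeof samples[0]; i++)
            if (hw_alpha((enum alpha_sel)sel, samples[i][0], samples[i][1])
                != blended_alpha(fs, fd, samples[i][0], samples[i][1]))
                ok = false;
        if (ok) {
            *out = (enum alpha_sel)sel;
            return true;
        }
    }
    return false;
}
```

In this toy model ZERO/ZERO, ONE/ZERO and ZERO/ONE each map to a selector, while ONE/ONE (As + Ad) has no match, since no selector can add two alphas.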

-- 
Ville Syrjälä
[EMAIL PROTECTED]
http://www.sci.fi/~syrjala/



Re: dri span patches...

2005-03-03 Thread Ville Syrjälä

On Thu, Mar 03, 2005 at 04:48:58AM +0100, Roland Scheidegger wrote:
 here's a patch which mainly does 3 things:
 - convert sis, mach64, and radeon to spantmp2.
 The sis and mach64 drivers got a slight change, previously you could not 
 read back alpha values (always 0xff) and I don't think there was a good 
 reason for that?

AFAICS mach64 doesn't apply either the blend equation or the blend 
functions to the alpha values. You can choose to write either 0, 1, As, 
1-As, Ad or 1-Ad to the destination :(

-- 
Ville Syrjälä
[EMAIL PROTECTED]
http://www.sci.fi/~syrjala/



Re: dri span patches...

2005-03-03 Thread Ville Syrjälä

On Thu, Mar 03, 2005 at 10:04:02AM +0200, Ville Syrjälä wrote:
 On Thu, Mar 03, 2005 at 04:48:58AM +0100, Roland Scheidegger wrote:
  here's a patch which mainly does 3 things:
  - convert sis, mach64, and radeon to spantmp2.
  The sis and mach64 drivers got a slight change, previously you could not 
  read back alpha values (always 0xff) and I don't think there was a good 
  reason for that?
 
 AFAICS mach64 doesn't apply either the blend equation or the blend 
 functions to the alpha values. You can choose to write either 0, 1, As, 
 1-As, Ad or 1-Ad to the destination :(

Oh and a quick look at the mach64 driver indicates that it always chooses 
to write 0.

-- 
Ville Syrjälä
[EMAIL PROTECTED]
http://www.sci.fi/~syrjala/



Re: dri span patches...

2005-03-03 Thread Ville Syrjälä

On Thu, Mar 03, 2005 at 05:45:15PM +0100, Roland Scheidegger wrote:
 Brian Paul wrote:
 Roland Scheidegger wrote:
 
 here's a patch which mainly does 3 things: - convert sis, mach64,
 and radeon to spantmp2. The sis and mach64 drivers got a slight
 change, previously you could not read back alpha values (always
 0xff) and I don't think there was a good reason for that?
 
 
 IIRC, the mach64 doesn't support destination alpha planes.  OpenGL 
 requires that reads of absent alpha planes returns 1.0.  I don't know
 if the SiS chip is the same.
 Are you sure? At least the driver handles things like GL_DST_ALPHA blend
 factors, wouldn't you get awfully bogus results if it wouldn't support 
 destination alpha planes in that case?

Like I said before only the RGB components are blended. You can choose to 
write 0, 1, As, 1-As, Ad or 1-Ad to the destination alpha 
([EMAIL PROTECTED] register). Currently the driver seems to 
write 0. It would probably be a better idea to write 1 instead.

-- 
Ville Syrjälä
[EMAIL PROTECTED]
http://www.sci.fi/~syrjala/



Re: waitforVBlank, how does this even work?

2005-03-02 Thread Ville Syrjälä

On Wed, Mar 02, 2005 at 10:01:00PM -0500, Vladimir Dergachev wrote:
 
 
 On Thu, 3 Mar 2005, Benjamin Herrenschmidt wrote:
 
 On Wed, 2005-03-02 at 20:03 -0500, Vladimir Dergachev wrote:
 
 On Thu, 3 Mar 2005, Benjamin Herrenschmidt wrote:
 
 
 
 What about isolating interrupt-handling code into a small driver ?
 Something simple to respond to interrupts and call all handlers with a
 certain mask.
 
 This would be useful not only for drm and fbdev but also for km
 (v4l capture module) and stereo-glasses code.
 
 
 Nope, I don't agree.
 
 With which part ? ;)
 
 On having a small stub module that does just IRQs ... I think the base
 module should be the fbdev (mode setting etc...)
 
 Oh, but I was not suggesting that. I just meant that interrupt handling 
 code is self-contained and can easily serve several consumers.

I'm with you here. And the same should IMHO hold for DMA handling. And for 
memory management of course.
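
A minimal sketch of what such a shared interrupt layer might look like: one place reads the status once and calls every registered consumer whose mask matches. All names (`irq_register`, `irq_dispatch`, the fixed-size table, the bit values) are made up for illustration and do not correspond to any existing kernel interface.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_IRQ_CLIENTS 4

typedef void (*irq_fn)(uint32_t status, void *data);

struct irq_client {
    uint32_t mask; /* status bits this consumer cares about */
    irq_fn   fn;
    void    *data;
};

static struct irq_client clients[MAX_IRQ_CLIENTS];
static size_t n_clients;

/* Register a consumer. Returns 0 on success, -1 if the table is full. */
static int irq_register(uint32_t mask, irq_fn fn, void *data)
{
    if (n_clients == MAX_IRQ_CLIENTS)
        return -1;
    clients[n_clients++] = (struct irq_client){ mask, fn, data };
    return 0;
}

/* Called with the status bits read from the chip exactly once.
 * Returns the number of handlers that were called. */
static int irq_dispatch(uint32_t status)
{
    int called = 0;
    for (size_t i = 0; i < n_clients; i++) {
        if (clients[i].mask & status) {
            clients[i].fn(status & clients[i].mask, clients[i].data);
            called++;
        }
    }
    return called;
}

/* Example consumer: just counts the events it sees. */
static void count_event(uint32_t status, void *data)
{
    (void)status;
    ++*(int *)data;
}
```

fbdev, drm, a v4l capture module and so on would each register with their own mask, and none of them would touch the interrupt registers directly.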

-- 
Ville Syrjälä
[EMAIL PROTECTED]
http://www.sci.fi/~syrjala/



Re: [Linux-fbdev-devel] waitforVBlank, how does this even work?

2005-03-01 Thread Ville Syrjälä

On Wed, Mar 02, 2005 at 06:10:14PM +1100, Benjamin Herrenschmidt wrote:
 On Wed, 2005-03-02 at 00:50 -0500, Jon Smirl wrote:
  For the r128 driver both the fbdev and drm drivers have implemented
  waitforVBlank and they both play with the interrupt registers. I can
  only assume that no one has ever tried to use them at the same time.
  In the radeon case the DRM driver has implemented waitforVBlank and
  the fbdev driver has not.
  
  This is a mess and it is yet another reason for merging DRM and fbdev
  into a sane, combined driver.
 
 I'd say nobody ever used both :)

I (and others) have with mga. The easiest solution was to disable the irq 
code in the drm. That was for running OpenGL on DirectFB btw.

-- 
Ville Syrjälä
[EMAIL PROTECTED]
http://www.sci.fi/~syrjala/



Re: SW fallback: clipping bug [patch]

2004-10-22 Thread Ville Syrjälä

On Thu, Oct 21, 2004 at 11:49:20PM +0200, Dieter Nützel wrote:
 Am Freitag, 15. Oktober 2004 22:51 schrieb Nicolai Haehnle:
  Hi,
 
  There is disagreement about the meaning of the CLIPSPAN _n parameter in
  CVS.
 
  The drivers I have looked at and drivers/dri/common/spantmp.h treat _n as
  the number of pixels in the span after clipping.
  depthtmp.h and stenciltmp.h treat _n as the end+1 x coordinate of the span.
 
  This inconsistency leads to artifacts when software fallbacks are hit while
  clipping is used, especially with partially obscured clients. The attached
  patch should fix these artifacts by changing depthtmp.h and stenciltmp.h
  appropriately.
 
 What about this?
 
 Needed?

AFAICS the code is clearly broken and the fix is correct. The explanation 
was a bit misleading though as it's the _n1 parameter and not _n that is 
used incorrectly.

-- 
Ville Syrjälä
[EMAIL PROTECTED]
http://www.sci.fi/~syrjala/


---
This SF.net email is sponsored by: IT Product Guide on ITManagersJournal
Use IT products in your business? Tell us what you think of them. Give us
Your Opinions, Get Free ThinkGeek Gift Certificates! Click to find out more
http://productguide.itmanagersjournal.com/guidepromo.tmpl
--
___
Dri-devel mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/dri-devel

Re: SW fallback: clipping bug [patch]

2004-10-21 Thread Ville Syrjälä

On Thu, Oct 21, 2004 at 03:35:21PM -0700, Ian Romanick wrote:
 Dieter Nützel wrote:
 Am Freitag, 15. Oktober 2004 22:51 schrieb Nicolai Haehnle:
 
 There is disagreement about the meaning of the CLIPSPAN _n parameter in
 CVS.
 
 The drivers I have looked at and drivers/dri/common/spantmp.h treat _n as
 the number of pixels in the span after clipping.
 depthtmp.h and stenciltmp.h treat _n as the end+1 x coordinate of the 
 span.
 
 This inconsistency leads to artifacts when software fallbacks are hit 
 while
 clipping is used, especially with partially obscured clients. The attached
 patch should fix these artifacts by changing depthtmp.h and stenciltmp.h
 appropriately.
 
 What about this?
 
 Needed?
 
 I could have sworn this patch already got committed.  In any case, it 
 looks good to me.  I didn't realize the templates in depthtmp.h 
 treated the _n parameter differently than the ones in spantmp.h.  The 
 fix, as in this patch, of making depthtmp.h work like spantmp.h seems to 
 be the right one.  The other option would be to fix each driver that 
 uses depthtmp.h.

I think this fix is better. _n is the original number of pixels and _n1 is 
the clipped number of pixels. It feels like the least confusing way to me.
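
For illustration, the convention can be written as a plain function. This is a sketch in function form, not the actual CLIPSPAN macro from Mesa, and `clip_span` and `write_depth_span` are hypothetical names. The important part is that n1 is a count, so the loop runs n1 times starting at x1, rather than running up to n1 as if it were an end coordinate.

```c
/* Result of clipping a span against [minx, maxx). */
struct clipped_span {
    int x1; /* first visible x coordinate                       */
    int n1; /* number of visible pixels (a count, not end + 1)  */
    int i;  /* index of the first visible pixel in the caller's
             * per-span array                                   */
};

static struct clipped_span clip_span(int x, int n, int minx, int maxx)
{
    struct clipped_span s = { x, n, 0 };

    if (s.x1 < minx) {          /* trim on the left */
        s.i   = minx - s.x1;
        s.n1 -= s.i;
        s.x1  = minx;
    }
    if (s.x1 + s.n1 > maxx)     /* trim on the right */
        s.n1 = maxx - s.x1;
    if (s.n1 < 0)               /* completely clipped */
        s.n1 = 0;
    return s;
}

/* Write the visible part of a depth span into one row.
 * Returns the number of pixels written. */
static int write_depth_span(unsigned short *row,
                            const unsigned short *depth,
                            int x, int n, int minx, int maxx)
{
    struct clipped_span s = clip_span(x, n, minx, maxx);

    for (int k = 0; k < s.n1; k++)
        row[s.x1 + k] = depth[s.i + k];
    return s.n1;
}
```

A span of 10 pixels starting at x = 5, clipped to [8, 12), ends up with x1 = 8, n1 = 4 and i = 3.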

-- 
Ville Syrjälä
[EMAIL PROTECTED]
http://www.sci.fi/~syrjala/



Re: R200 ReadPixels optimization

2004-10-07 Thread Ville Syrjälä

On Thu, Oct 07, 2004 at 02:02:38PM +0100, Alan Cox wrote:
  Note that there's some code in there already which uses the blitter to copy 
  from framebuffer to agp memory, though it tries to implement the entire 
  readpixels() operation rather than being a useful low-level operation.
 
 AGP memory is hostside uncached (CPU limitations on x86 for one) which
 means it is better (swap PCI for DDR ram bus latencies is good) but
 still benefits from the treatment.

Why can't we make AGP memory cached? Wouldn't it be enough to flush the 
caches at some critical points?

I was playing around with DirectFB and AGP some years ago and enabling 
write-back caching didn't seem to have any side effects. Without caching 
AGP is almost as bad as video memory for sw fallbacks.

-- 
Ville Syrjälä
[EMAIL PROTECTED]
http://www.sci.fi/~syrjala/



Re: Code status (Was: New DRM driver model - gets rid of DRM() macros!)

2004-10-06 Thread Ville Syrjälä

On Wed, Oct 06, 2004 at 09:53:51AM -0400, Vladimir Dergachev wrote:
 
 
 On Wed, 6 Oct 2004, [iso-8859-15] José Fonseca wrote:
 
 Jon,
 
 I was trying to build and test the linux-core - I really like what
 you've been doing there - but I get endless kernel oops after insmod'ing
 any of the driver modules (not the common drm one though), regardless
 the specific chip is present or not.
 
 Before I search deeper into this could you let me know whether the code
 in *-core is not runnable yet, or if there must be something wrong on my
 side?
 
 Just as a data point I was able to insert radeon module just fine on 
 2.6.9rc3. This was yesterday. Maybe a fresh checkout will help ?

Inserting mga worked on 2.6.7 for me.

Unfortunately actually using the module failed. The problem is with 
get_order() in drm_bufs.c. After I replaced get_order() with the original 
drm order function things started to work.
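
The message does not say why get_order() broke things. One likely explanation, offered here as an assumption, is a difference in units: the kernel's get_order() returns an order in pages, while the old drm helper returned a plain base-2 logarithm of the byte size. The two functions below are userspace re-implementations written for this sketch (hypothetical names, 4 KiB pages assumed, meant for small sizes only), not the kernel or drm code.

```c
#define SKETCH_PAGE_SHIFT 12 /* assumes 4 KiB pages */

/* Smallest order with (1 << order) >= size, i.e. ceil(log2(size)).
 * Expects size >= 1. */
static int log2_order(unsigned long size)
{
    int order = 0;

    while ((1UL << order) < size)
        order++;
    return order;
}

/* Smallest order with (page size << order) >= size, which is the
 * convention page allocators use. */
static int page_order(unsigned long size)
{
    int order = 0;

    while ((1UL << (SKETCH_PAGE_SHIFT + order)) < size)
        order++;
    return order;
}
```

For a 64KB buffer the first gives 16 and the second gives 4, so code that expects one convention and gets the other will compute sizes that are off by a factor of the page size.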

-- 
Ville Syrjälä
[EMAIL PROTECTED]
http://www.sci.fi/~syrjala/



Re: Fwd: dri/mga issues with libGL

2004-10-06 Thread Ville Syrjälä

On Wed, Oct 06, 2004 at 10:32:57AM -0700, Ian Romanick wrote:
 Ville Syrjälä wrote:
 On Mon, Oct 04, 2004 at 04:24:19PM -0700, Ian Romanick wrote:
 
 I think the right answer is to apply the fix for reading alpha from the 
 framebuffer and ignore the 888 modes.  Since the hardware is operating 
 in  mode, pretending to be 888 is just wrong.  We'd have to go 
 through and make sure that 0xff is *always* written as the output from 
 the alpha blend stage in 888 mode.  Yuck.
 
 The 888 mode would be useful for the 24+8 overlay case if we change it to 
 not touch the alpha bits. But the drm would require a small change since 
 the swap ioctl always sets PLNWT to 0x.
 
 Ah.  Okay.  Other than the changes to the swap routine, how much effort 
 is it to make the hardware not draw to the alpha bits?

Apart from the swap case I think it should be as simple as making sure 
mgaDDColorMask() sets the mask properly. That should take care of drawing 
and clearing I think. I have that in code but never actually tested it 
since DirectFB doesn't support overlay modes.
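
Roughly what "sets the mask properly" could mean, as a sketch: build the 32 bit plane write mask from the GL color mask and never enable the top byte when the visual has no alpha. `plane_write_mask` is a hypothetical helper, not the code in mgaDDColorMask(), and the ARGB layout with alpha in bits 31..24 is an assumption.

```c
#include <stdbool.h>
#include <stdint.h>

/* Build a plane write mask for a 32 bpp buffer.
 * has_alpha is false for the 888 case, where the top byte would
 * belong to the overlay and must be left alone. */
static uint32_t plane_write_mask(bool r, bool g, bool b, bool a,
                                 bool has_alpha)
{
    uint32_t mask = 0;

    if (r)
        mask |= 0x00ff0000u;
    if (g)
        mask |= 0x0000ff00u;
    if (b)
        mask |= 0x000000ffu;
    if (a && has_alpha)
        mask |= 0xff000000u;
    return mask;
}
```

With such a mask in place for drawing and clearing, the swap path would be the remaining spot that writes all 32 bits.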

 I actually have (in my Mesa 5 for DirectFBGL tree) span functions for 
 (a)rgb 332, 555, 1555, 565, 888, , and depth/stencil 16/0, 32/0, 15/1, 
 24/8. The 555 and 888 functions don't touch the alpha bit(s).
 
 Some of those modes will be useful when pbuffers are supported.  Does 
 the MGA actually support a 15/1 depth/stencil mode?

It does. Well G400/G450/G550 do. G200 doesn't support stencil at all.

-- 
Ville Syrjälä
[EMAIL PROTECTED]
http://www.sci.fi/~syrjala/



Re: Fwd: dri/mga issues with libGL

2004-10-05 Thread Ville Syrjälä

On Mon, Oct 04, 2004 at 04:24:19PM -0700, Ian Romanick wrote:
 Ian Romanick wrote:
 
 It looks like destination alpha was disabled in the DDX at some point. I 
 seem to remember some discussion about this a long time ago.  Do any of 
 the DRI developers remember why this was done?
 
 So, I found this in the archives:
 
 http://marc.theaimsgroup.com/?l=dri-devel&m=104456147219701&w=4
 
 It looks like, basically, there was a bug in the span read function. 
 Rather than fixing it, destination alpha was just disabled.
 
 If the DDX is modified to export visuals with an alphaSize of 8 and a 
 bufferSize of 32, the attached patch should fix things.  As noted in the 
 comment in the patch, this will break RGB888 visuals (e.g., using this 
 patch with an unmodified DDX).
 
 I think the right answer is to apply the fix for reading alpha from the 
 framebuffer and ignore the 888 modes.  Since the hardware is operating 
 in 8888 mode, pretending to be 888 is just wrong.  We'd have to go 
 through and make sure that 0xff is *always* written as the output from 
 the alpha blend stage in 888 mode.  Yuck.

The 888 mode would be useful for the 24+8 overlay case if we change it to 
not touch the alpha bits. But the drm would require a small change since 
the swap ioctl always sets PLNWT to 0x.

I actually have (in my Mesa 5 for DirectFBGL tree) span functions for 
(a)rgb 332, 555, 1555, 565, 888, 8888, and depth/stencil 16/0, 32/0, 15/1, 
24/8. The 555 and 888 functions don't touch the alpha bit(s).
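
To illustrate what "don't touch the alpha bit(s)" means for a span function, here is a rough sketch of a 555 span writer that preserves bit 15 of every destination pixel. The function name and signature are made up for this example and are not taken from the Mesa tree mentioned above.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical 555 span writer for a 16-bit buffer: packs 8-bit RGB into
 * bits 14..0 and keeps whatever was in bit 15 (the alpha bit in 1555).
 */
static void write_rgb555_span(uint16_t *dst, size_t n,
                              const uint8_t (*rgb)[3])
{
    for (size_t i = 0; i < n; i++) {
        uint16_t color = (uint16_t)(((rgb[i][0] >> 3) << 10) |
                                    ((rgb[i][1] >> 3) << 5) |
                                    (rgb[i][2] >> 3));

        dst[i] = (uint16_t)((dst[i] & 0x8000u) | color);
    }
}
```

A 1555 variant would instead build bit 15 from the incoming alpha value; the 888 and 8888 pair differ in the same way, with the top byte in place of the top bit.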

I have quite a lot of changes in my Mesa 5 tree actually. Guess I really 
should bite the bullet and flip over to the XFree86/XOrg (ie. dark) side 
for a while...

 I guess this is a case where the DDX version should get a bump and the 
 DRI driver should check for the new version?

 ? src/mesa/drivers/dri/mga/depend
 Index: src/mesa/drivers/dri/mga/mga_xmesa.c
 ===
 RCS file: /cvs/mesa/Mesa/src/mesa/drivers/dri/mga/mga_xmesa.c,v
 retrieving revision 1.32
 diff -u -d -r1.32 mga_xmesa.c
 --- src/mesa/drivers/dri/mga/mga_xmesa.c  4 Oct 2004 22:58:39 -   1.32
 +++ src/mesa/drivers/dri/mga/mga_xmesa.c  4 Oct 2004 23:14:57 -

snip
I dislike this fill-in-modes code. My preference would be that the 3D 
driver always fills in all the modes it can support. The matching code 
should select the correct one anyway, right?

-- 
Ville Syrjälä
[EMAIL PROTECTED]
http://www.sci.fi/~syrjala/



Re: New proposed DRM interface design

2004-09-08 Thread Ville Syrjälä

On Wed, Sep 08, 2004 at 01:09:11PM +0200, Helge Hafting wrote:
 Jon Smirl wrote:
 
 On Tue, 07 Sep 2004 10:43:17 +0200, Helge Hafting [EMAIL PROTECTED] 
 wrote:
  
 
 Jon Smirl wrote:

 
 I would also like to fix things so that we can have two logged in
 users, one on each head. This isn't going to work if one them uses
 fbdev and keeps switching the chip to 2D mode while the other user is
 in 3D mode. The chip needs to stay in 3D mode with the CP running.
 
  
 
 Yes!  I use the ruby patch and have two users logged in on the
 two heads of a G550.  It works fine - as long as no mode
 change is attempted.  And only one user can use 3D (or even 2D),
 the other is stuck with an unaccelerated framebuffer.

 
 
 There is nothing in the hardware preventing both users from having 3D
 displays. This is a problem in the way fbdev and DRM are designed. I
 would like to work towards fixing this.
  
 
 I have heard of someone using two 3D displays, but he
 used separate cards.  Can you get this on a single G550, which
 supports two monitors but doesn't duplicate all hardware?

Like Jon said, the hardware can do it, but the XFree86 driver doesn't allow 
it. AFAIK it doesn't even allow XAA acceleration on the secondary head.

I can run multiple 3D apps simultaneously on both heads of my G400 with 
DirectFB. But DirectFB's multihead capabilities are currently limited, so 
it only works with a TV as the secondary display.

-- 
Ville Syrjälä
[EMAIL PROTECTED]
http://www.sci.fi/~syrjala/



Re: First DRI uber-benchmark

2004-08-22 Thread Ville Syrjälä

On Sun, Aug 22, 2004 at 01:16:18AM -0500, John Lightsey wrote:
 
 Matrox G400 32MB (mga)
 glxgears - 1000.2
 q2 640x480 - 62.9
 q2 800x600 - 52.3
 q2 1024x768 - 40.2
 q3 640x480 - 65.9
 q3 800x600 - 51.4
 q3 1024x768 - 36.4
 rtcw 640x480 - 42.3
 rtcw 800x600 - 33.5
 rtcw 1024x768 - 24.7
 ut 640x480 - 35.32
 ut 800x600 - 30.98
 ut 1024x768 - 26.7

I'm aware of two performance bottlenecks in the driver.

Number one is that it always uses synchronous DMA. I have asynchronous
DMA working just fine under DirectFB but it should probably be tested
more with XFree86 before going to cvs.

Number two is the TC2_MAGIC bit. It really hurts the single texturing
case and even dual texturing gains ~1 fps (in q3) with that bit turned
off.

Even with those two changes the Windows drivers are still quite a bit
faster :(

 Notes: Reliable, looks great.  UT suffered from lots of software fallback.

Any idea what fallbacks? The ut2k4 demo was actually playable on my G400
if I disabled the texenv extensions. We probably need a config option
for that. The ut2k3 demo always used projective multitexturing no matter
which settings I tweaked, so I couldn't get decent performance out of it.

-- 
Ville Syrjälä
[EMAIL PROTECTED]
http://www.sci.fi/~syrjala/



Re: [Xorg] DRI merging

2004-06-15 Thread Ville Syrjälä

On Mon, Jun 14, 2004 at 09:02:46PM -0700, Mike Mestnik wrote:
 
 --- Alan Cox [EMAIL PROTECTED] wrote:
  On Sul, 2004-06-13 at 20:47, Matt Sealey wrote:
   Linux basically falls behind on two simple fronts at the moment:
   it has no simple 2D or 3D framework capable of much more than
  
  I deal with embedded Linux people on a daily basis. I think they would
  disagree. For 2D it has several in heavy use
  -   Keith's tiny X server work
  -   Nanogui (2D down to about 50K RAM)
  -   DirectFB (particularly strong at multimedia)
  
 I looked at DirectFB and found it not maintained,

DirectFB is maintained.

 I could try to get it
 working with mesa-solo cause the old branch(used in the install docs) is
 not used.

Do you mean using DRI with DirectFBGL? See my website at 
http://www.sci.fi/~syrjala/gl/ where there's a Mesa tarball (5.x based) 
that works with DirectFBGL.

FWIW I run DirectFB (exclusively) on all of my systems (desktop w/ matrox 
+ 2 laptops w/ mach64).

-- 
Ville Syrjälä
[EMAIL PROTECTED]
http://www.sci.fi/~syrjala/



Re: [Xorg] DRI merging

2004-06-14 Thread Ville Syrjälä

On Sun, Jun 13, 2004 at 09:24:24PM +0100, Matt Sealey wrote:
 
  -Original Message-
  From: Alan Cox [mailto:[EMAIL PROTECTED]
  Sent: 13 June 2004 20:04
  To: [EMAIL PROTECTED]
  Cc: Jon Smirl; Eric Anholt; Alex Deucher; DRI Devel;
  [EMAIL PROTECTED]
  Subject: RE: [Xorg] DRI merging
  
  
  On Sul, 2004-06-13 at 20:47, Matt Sealey wrote:
   Linux basically falls behind on two simple fronts at the moment:
   it has no simple 2D or 3D framework capable of much more than
  
  I deal with embedded Linux people on a daily basis. I think they would
  disagree. For 2D it has several in heavy use
  -   Keith's tiny X server work
  -   Nanogui (2D down to about 50K RAM)
  -   DirectFB (particularly strong at multimedia)
  
  For 3D you end up looking back at the mesa-solo work and it shares that
  same interest with the X over mesa people.
 
 Agreed, but DirectFB doesn't work with Qt, and the company I work for
 has a perfectly good OS for multimedia work (http://www.morphos.net :)

There was an effort to port Qt to DirectFB. I'm not sure what happened to 
it though.

-- 
Ville Syrjälä
[EMAIL PROTECTED]
http://www.sci.fi/~syrjala/



Re: Radeon 7200 problems

2004-06-05 Thread Ville Syrjälä

On Sat, Jun 05, 2004 at 03:09:54AM -0400, Patrick McFarland wrote:

 Which brings me to mention something else: I fully believe that the
 kernel should be completely managing all aspects of memory and state
 management of both 2D and 3D hardware. The kernel's portion of DRI
 should be providing methods to allow multiple DRI using apps (such as
 multiple xservers running at once) and multiple opengl apps within a
 single DRI context to work flawlessly with each other.

 Currently, projects such as DirectFB suffer because there is really no
 unified method to do this. Neither DirectFB nor xservers should ever be
 managing memory on their own, nor managing parts of the DRI context on
 their own. It becomes very easy to get different pieces of software to
 break each other, or simply prevent each other from working at the same
 time.

 This, however, requires a more unified driver system (on platforms that
 support it) between DRI and fbcon. (Does BSD have an equivalent to this?)
 This new hybrid system would do all memory management,

I agree, the kernel should manage video and AGP memory.

 do the actual
 resolution and depth changes,

I agree with this one too. But the interface should be more flexible than
what fbdev provides. And it should handle overlays as well.

 expose 2D and 3D hardware acceleration
 functions, allow applications (DirectFB, xservers) to query the
 available acceleration methods,

I disagree.

This part of the kernel should be as dumb as possible. I think the best
interface would be simply one accepting almost complete DMA buffers. The
only thing missing from these buffers would be real memory addresses. The
client should just use a surface id (handed out by the memory allocator)
instead of a real address. The kernel would then check if the client is
allowed to use those surfaces and replace the ids with real addresses. The
kernel should also check the buffers for other dangerous stuff.
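
A rough sketch of the surface id idea, to make the flow concrete. Every name here (the structs, relocate(), the table size) is hypothetical and invented for this example; it only shows the check-then-patch step, not a real DRM interface.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_SURFACES 16

/* One entry per surface handed out by the (hypothetical) allocator. */
struct surface {
    int in_use;
    int owner;        /* client the surface was handed to */
    uint32_t address; /* real address, never shown to clients */
};

/* One register write in an almost complete DMA buffer. */
struct dma_cmd {
    uint32_t reg;
    uint32_t value;   /* register value, or a surface id if is_surface */
    int is_surface;
};

/*
 * Validate every surface id first, then patch in the real addresses.
 * Returns 0 on success and -1 if any id is unknown or belongs to another
 * client; on failure the buffer is left unmodified.
 */
static int relocate(struct dma_cmd *cmds, size_t n, int client,
                    const struct surface *table)
{
    size_t i;

    for (i = 0; i < n; i++) {
        uint32_t id = cmds[i].value;

        if (!cmds[i].is_surface)
            continue;
        if (id >= SKETCH_MAX_SURFACES || !table[id].in_use ||
            table[id].owner != client)
            return -1;
    }
    for (i = 0; i < n; i++) {
        if (cmds[i].is_surface) {
            cmds[i].value = table[cmds[i].value].address;
            cmds[i].is_surface = 0;
        }
    }
    return 0;
}
```

The client never learns a real address, and a buffer naming somebody else's surface is rejected before anything reaches the hardware.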

For what it's worth Microsoft seems to have a quite similar system in
mind.
http://download.microsoft.com/download/1/8/f/18f8cee2-0b64-41f2-893d-a6f2295b40c8/DW04018_WINHEC2004.ppt

One clever thing they are doing is using the GART dynamically for swapping
out video memory.

 provide DRI contexts and help manage
 them so multiple GL apps would work on all drivers (which, afaik, few if
 any correctly support), and probably increase the overall quality of
 all software.

? You can run multiple GL apps just fine. If it doesn't work it's a driver
bug.

--
Ville Syrjälä
[EMAIL PROTECTED]
http://www.sci.fi/~syrjala/


Re: Radeon 7200 problems

2004-06-05 Thread Ville Syrjälä

On Sat, Jun 05, 2004 at 12:41:33PM +0200, Michel Dänzer wrote:
 On Sat, 2004-06-05 at 12:21 +0300, Ville Syrjälä wrote:
  On Sat, Jun 05, 2004 at 03:09:54AM -0400, Patrick McFarland wrote:
   
   expose 2D and 3D hardware acceleration
   functions, allow applications (DirectFB, xservers) to query the
   available acceleration methods,
  
  I disagree.
  
  This part of the kernel should be as dumb as possible. I think the best 
  interface would be simply one accepting almost complete DMA buffers. The 
  only thing missing from these buffers would be real memory addresses. 
 
 I'm not sure about that; pseudo-command buffers that the DRM parses and
 generates the actual DMA buffers from on the fly might be better for
 security and/or performance reasons.

Quite possible. Though I'm not totally convinced by the performance 
argument, since the kernel would then have to build all of the buffers from 
scratch. With real buffers we could just check which register the value is 
for, and if there are no problems with that register the value could be 
passed as is.
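
As a sketch of how cheap that check could be: one table lookup per register write, with the value itself never touched. The register offsets, names and the table are invented for this example and do not describe any real chip.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register offsets, not taken from any real chip. */
enum {
    SKETCH_REG_FG_COLOR  = 0x00,
    SKETCH_REG_BLEND     = 0x04,
    SKETCH_REG_DMA_START = 0x08, /* would let the card read any memory */
    SKETCH_REG_COUNT     = 3
};

struct reg_write {
    uint32_t reg;   /* byte offset of the target register */
    uint32_t value; /* passed through untouched when the register is safe */
};

/* One flag per register: 1 = any value is harmless, 0 = reject. */
static const unsigned char reg_safe[SKETCH_REG_COUNT] = { 1, 1, 0 };

/* Returns 0 if every write targets a safe register, -1 otherwise. */
static int check_buffer(const struct reg_write *buf, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        uint32_t idx = buf[i].reg / 4;

        if ((buf[i].reg & 3u) || idx >= SKETCH_REG_COUNT || !reg_safe[idx])
            return -1;
    }
    return 0;
}
```

Registers that carry addresses would need more than a yes/no flag, which is where the surface id replacement discussed in this thread would come in.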

And one major problem with pseudo buffers is that they should not impose 
any nasty limits on what we can do. So the design would have to be good 
from the start.

  The client should just use a surface id (handed out by the memory allocator) 
  instead of a real address. The kernel would then check if the client is 
  allowed to use those surfaces and replace the ids with real addresses. The 
  kernel should also check the buffers for other dangerous stuff.
 
 Seconded.
 
 I wonder if we can reasonably get there in a backwards compatible way...

I think the current DRM interface could be moved on top of the new one. 
Maybe as a separate compatibility module...

-- 
Ville Syrjälä
[EMAIL PROTECTED]
http://www.sci.fi/~syrjala/



Re: Mode manager / Framebuffer management

2004-05-16 Thread Ville Syrjälä

On Sun, May 16, 2004 at 03:49:49AM -0700, Mike Mestnik wrote:
 
 A few other questions, directed at the memory management people.  What about
 2D-only page flipping, ModeX and the like?  Were these ever supported outside
 of libsvga?  Are they relics from the past that can die?

Page flipping is important. DirectFB uses it a lot. fbdev is almost enough 
to implement proper page flipping. The only thing missing is a way to get the 
current vblank counter from the pan ioctl(). Without that, good triple 
buffering is impossible. Then there is the problem that fbdev doesn't 
handle overlays, so good triple buffering for overlays is impossible right 
now. That is why I'm quite fond of the "feed the kernel complete register 
values" interface. It would work for CRTCs and overlays.
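
A small sketch of why the counter matters for triple buffering: a pan only takes effect at the next vblank, so the client has to know whether that vblank has happened before it can reuse the old front buffer. Everything here is hypothetical bookkeeping written for this example; fbdev's pan ioctl does not return such a counter, which is exactly the complaint above.

```c
#include <stdint.h>

/* Client-side bookkeeping for three buffers, numbered 0..2. */
struct flipper {
    int front;           /* buffer currently being scanned out */
    int pending;         /* buffer queued by the last pan, or -1 */
    uint32_t pan_vblank; /* vblank count reported when pending was queued */
};

/* Record a pan request together with the vblank count it was issued at. */
static void flipper_queue(struct flipper *f, int buf, uint32_t vblank_at_pan)
{
    f->pending = buf;
    f->pan_vblank = vblank_at_pan;
}

/* Retire the pending flip once at least one vblank has passed since it. */
static void flipper_update(struct flipper *f, uint32_t now)
{
    if (f->pending >= 0 && (uint32_t)(now - f->pan_vblank) >= 1u) {
        f->front = f->pending;
        f->pending = -1;
    }
}

/* Pick a buffer that is neither on screen nor queued for display. */
static int flipper_free_buffer(const struct flipper *f)
{
    for (int i = 0; i < 3; i++)
        if (i != f->front && i != f->pending)
            return i;
    return -1; /* not reachable with three buffers */
}
```

Without the count taken at pan time the client can only guess whether the flip has already happened, so it either waits for a vblank it may not need or risks drawing into the buffer still on screen.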

Oh and there is no 2D vs. 3D page flipping.

-- 
Ville Syrjälä
[EMAIL PROTECTED]
http://www.sci.fi/~syrjala/



Re: Mode manager / Framebuffer management

2004-05-16 Thread Ville Syrjälä

On Sun, May 16, 2004 at 09:24:37AM -0700, Mike Mestnik wrote:
 Can 3D acceleration work in [1]ModeX? 2D?
 
 1.  Every n+1 pixel is for the next screen (FB).  This format makes SW
 blitting of a pix to more than one screen faster, as you only have to
 iterate the source (and destination) once.

ModeX stuff is pretty foreign to me. I think some 2D accel might be 
possible depending on how many pages you have. 3D would not work on any 
chip I know of.

-- 
Ville Syrjälä
[EMAIL PROTECTED]
http://www.sci.fi/~syrjala/



Re: Mode manager / Framebuffer management

2004-05-16 Thread Ville Syrjälä

On Sun, May 16, 2004 at 08:08:10PM -0700, David Bronaugh wrote:
 Vladimir Dergachev wrote:
 
 This brings out an interesting memory management point: swappable versus 
 non-swappable graphics objects.
 
 A framebuffer is obviously non-swappable, while textures are swappable.

Not just textures but other offscreen buffers as well. With a compositing 
window system all windows have offscreen or system memory buffers. And 
then there are font glyphs, pixmaps etc.

The problem here becomes how to determine what to throw out and what to do 
when things get kicked out. Also memory fragmentation can be a problem. If 
we want to handle all of this the system will start to look like 
DirectFB's memory allocator.

Currently DirectFB manages all of the offscreen memory and all memory 
chunks have a usage counter so it can decide what to throw out. If the 
surface in question has a system memory buffer as well we make sure that 
it's updated before the video memory copy is thrown out.
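
A very reduced sketch of that policy, to make the question concrete. The struct and function names are invented for this example and are not DirectFB's actual allocator code; the copy from video memory back to system memory is only counted, not performed.

```c
#include <stddef.h>

struct chunk {
    int in_video_memory;   /* currently occupies offscreen video memory */
    unsigned usage;        /* bumped whenever the surface is used */
    int has_system_copy;   /* surface also has a system memory buffer */
    int system_copy_stale; /* video memory holds newer contents */
    unsigned copy_backs;   /* counts simulated video -> system copies */
};

/* Pick the resident chunk with the lowest usage counter, or -1 if none. */
static int pick_victim(const struct chunk *c, size_t n)
{
    int victim = -1;

    for (size_t i = 0; i < n; i++) {
        if (!c[i].in_video_memory)
            continue;
        if (victim < 0 || c[i].usage < c[victim].usage)
            victim = (int)i;
    }
    return victim;
}

/* Throw a chunk out, bringing its system memory copy up to date first. */
static void evict(struct chunk *c)
{
    if (c->has_system_copy && c->system_copy_stale) {
        c->copy_backs++; /* stand-in for the real copy */
        c->system_copy_stale = 0;
    }
    c->in_video_memory = 0;
}
```

In user space this is easy because the allocator can simply do the copy; in the kernel the same step would have to wait for the hardware to be idle, or use the card's own DMA, before the memory could be handed to someone else.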

How is that sort of thing going to work inside the kernel?

-- 
Ville Syrjälä
[EMAIL PROTECTED]
http://www.sci.fi/~syrjala/



Re: [Mesa3d-dev] Re: [Linux-fbdev-devel] Redesign of kernel graphics interface

2004-05-15 Thread Ville Syrjälä

On Sat, May 15, 2004 at 09:27:34AM +0200, Holger Waechtler wrote:
 Ville Syrjälä wrote:
 On Fri, May 14, 2004 at 11:40:04AM -0700, Jon Smirl wrote:
 
 Does DirectFB work on anything beside Matrox now?
 
 
 It works on any card with a working fbdev driver (vga16fb excluded). 
 Hardware acceleration is available on quite a few chips these days.
 
 ati128  cyber5k  mach64  neomagic  nvidia  savage  tdfx
 cle266  i810 matrox  nsc   radeon  sis315
 
 Keep in mind that beside the matrox driver almost none of them 
 implements the full accelerated 2D API

cle266 is about the same level as matrox. I could get mach64 into this 
group with some modifications to the memory allocator. I think for some of 
the other drivers the biggest problem is lack of developers.

 and that most are misusing the 3D 
 engine to implement 2D functionality. Alpha blended stretch blits are 
 almost always implemented using the 3D texture engines on PC graphics cards.

I don't consider that mis-use, simply use.

 About one third of these drivers have been written using specs and 
 documentation files that have never been officially released by the 
 hardware vendors, so I'm not really sure whether they are much better 
 from a security point of view than a closed source driver -- what's the 
 point of a open source hardware driver without hardware specs? - you're 
 not really able to review it seriously.

I don't think the specs would help in that regard much. For all you know 
the specs could be all wrong. If you have the code and can see the 
expected results that usually means the code works correctly. And closed 
source kernel drivers could do a lot of other nasty stuff even without 
touching the hardware.

 The specs for the remaining ones usually showed up as soon as the hardware 
 was getting outdated. Basically the same situation as the one you see 
 with DRI drivers.

Except now the specs are getting harder to get even for old hardware. The 
vendors seem to be returning to the old ways :(

 I use matrox and mach64 drivers daily. It's been a few years since I 
 seriously used XFree86.
 
 Have you ever thought about the inherent security risks of memory mapped 
 i/o registers when executing non-trusted applications? Imagine what 
 happens if every single application is allowed to program the blitter 
 and texture engines to copy host memory from anywhere in the system to 
 graphics memory and back - a single misbehaving application can damage 
 your entire system.

I am aware of the risks. Currently it's not an issue for me. And if I 
limit myself to running only XDirectFB the risk is equal to running 
accelerated XFree86. Of course I would be glad to make it all secure but 
only without losing any of the nice features.

 And do you really have the time to review every line of code you execute 
 on your system?

Clearly not. There is some stuff you just have to trust (or not care).

 2) mesa supports a broad number of cards, basically everything there is 
 doc for
 
 
 Just about as broad as DirectFB.
 
 be honest.

I am. This is the list of drivers in Mesa cvs:
i810   mga     radeon  tdfx
ffb    i830    r128    savage  unichrome
gamma  mach64  r200    sis

The list is almost the same as the DirectFB driver list. Granted that some 
of the Mesa drivers use more of the hardware's capabilities, but that's 
only because I don't have the hardware ;)

 I'm not suggesting that everyone start using DirectFB. Everyone should be 
 able to use any API they want. The kernel should provide just enough to 
 allow these APIs to be implemented.
 
 that would be always possible, don't worry.

I do worry a bit because of this "OpenGL as the one and only API" talk.

 Please keep in mind that we developed DirectFB at Convergence as API to 
 access SettopBox and Game Console functionality in a convenient way, it 
 was never intended and has not been designed for use in 
 security-critical desktop or workstation environments.

I am aware of that and that's why I don't recommend it to everyone. 
Personally I just find it to my liking. Even the code itself makes me 
a happy camper whereas XFree86 code gives me a headache...

-- 
Ville Syrjälä
[EMAIL PROTECTED]
http://www.sci.fi/~syrjala/



Re: [Mesa3d-dev] Re: [Linux-fbdev-devel] Redesign of kernel graphics interface

2004-05-14 Thread Ville Syrjälä

On Fri, May 14, 2004 at 10:51:35AM -0700, Jon Smirl wrote:
 Just look at this picture and you can see the trend of 2D vs 3D (coprocessor
 based) graphics.
 http://www.de.tomshardware.com/graphic/20040504/images/architecture.gif
 Within one or two generations the 2D box is going to be gone.
 
 If Linux wants to stay current with technology we have to start using the
 coprocessor features of the GPU. Most of the benchmarks I have seen show
 coprocessor vs programmed at 100:1 speed differential. This is also a
 competitive problem, Microsoft and Apple have both decided to go with the GPU
 coprocessor this year. 

I don't understand your GPU vs. PIO comparisons. You can use the 2D engine 
with DMA as well. And at least with older cards the 2D engine is clearly 
faster than the 3D engine (~100% faster for blits on my G400), so trying to 
bypass it is just stupid.

 I said OpenGL is the only accelerated API available on Linux. Can you name
 another?

DirectFB.

 There is a little acceleration in framebuffer, but I don't know of any
 others. Also, software mesa works just fine to provide OpenGL on dumb 2D cards.

Using unaccelerated OpenGL for 2D rendering doesn't sound exactly useful.

-- 
Ville Syrjälä
[EMAIL PROTECTED]
http://www.sci.fi/~syrjala/



Re: [Dri-devel] Re: [Linux-fbdev-devel] Redesign of kernel graphics interface

2004-05-10 Thread Ville Syrjälä

On Mon, May 10, 2004 at 05:14:04PM +0100, James Simmons wrote:
 
  You are mixing things. Mode setting has nothing to do with rendering. OpenGL
  is a rendering client. It produces commands sent to the low level kernel
  driver and provides a 3D API, but it's not the only one. In this regard,
  fbcon is a client too and XFree 2D accel is another one.
 
 But we are rendering to draw fonts, clearing an area of the screen, copyarea.
 If we are to have a universal solution it needs to be OpenGL. Either that or 
 mode switching stays in the kernel.  

Rendering and mode switching are completely separate issues.

-- 
Ville Syrjälä
[EMAIL PROTECTED]
http://www.sci.fi/~syrjala/



Re: t_vertex conversion for R200 (was Re: [Dri-devel] [r200] Viewperf-7.1.1 numbers - Not so good.)

2004-04-07 Thread Ville Syrjälä

On Wed, Apr 07, 2004 at 12:12:23PM +0200, Felix Kühling wrote:
 
 The t_vertex stuff replaces t_dd_vbtmp stuff. I've done this for the
 savage driver and it was pretty straight forward.

I have mga converted. The conversion was quite easy to do. I haven't 
committed it because I wanted to see if I could do projective texturing 
without ptex vertices. Unfortunately it doesn't seem to be exactly easy to 
add that capability, because the emit functions deal with one attribute at 
a time and projective texturing would require access to two attributes at 
a time.
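
To show where the two-attribute dependency comes from, here is a sketch of one familiar way to emulate projective texturing on hardware that only takes (u, v) plus a reciprocal w: divide the texture coordinates by q and fold q into the position's rhw. The vertex layout and function name are made up for this example, and whether this is the approach intended in the message is an assumption.

```c
/* Hypothetical hardware vertex: position with reciprocal w, one texcoord. */
struct hw_vertex {
    float x, y, z, rhw;
    float u, v;
};

/*
 * Emit a projective texcoord (s, t, r, q) without a ptex vertex format.
 * The hardware interpolates u * rhw and rhw and divides per pixel, so
 * storing u = s / q and rhw = rhw * q yields (s * rhw) / (q * rhw).
 * Note that this needs the position's rhw and the texcoord together.
 */
static void emit_projective(struct hw_vertex *out, float pos_rhw,
                            const float tex[4])
{
    float inv_q = 1.0f / tex[3];

    out->rhw = pos_rhw * tex[3];
    out->u = tex[0] * inv_q;
    out->v = tex[1] * inv_q;
}
```

An emit function that only ever sees the texcoord, or only the position, cannot do this, since the texcoord's q has to modify the rhw that was written by the position's emit step.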

I also saw a really strange segfault somewhere. It looked like a for loop 
was going out of bounds, which made me suspicious of my compiler. If I 
remember correctly it happened with the cva test program.

-- 
Ville Syrjälä
[EMAIL PROTECTED]
http://www.sci.fi/~syrjala/



[Dri-devel] Re: [Mesa3d-dev] Three concepts for changing the way video works in Linux

2004-03-14 Thread Ville Syrjälä

On Sat, Mar 13, 2004 at 06:50:07PM -0800, Jon Smirl wrote:
 A fourth concept which has already been heavily discussed is making OpenGL the
 primary graphics API. OpenGL can be quite small, OpenGL-ES is already shipping
 on some phones. Just to recap OpenGL/xserver is a response to Windows Longhorn.
 It is also a high level API that allows graphics hardware to grow more
 intelligent without disrupting the top level API.

I still don't understand this point. Why do you need to define a primary 
API? The DRM isn't really tied to OpenGL in any way so the common API is 
the DRM API. People should be able to implement any API on top of DRM.

And speaking of the DRM API I feel it is too heavy currently. I would 
really like something that didn't need much (if any) device specific init 
code in user space.

-- 
Ville Syrjälä
[EMAIL PROTECTED]
http://www.sci.fi/~syrjala/



Re: [Dri-devel] Re: [Mesa3d-dev] Three concepts for changing the way video works in Linux

2004-03-14 Thread Ville Syrjälä

On Sun, Mar 14, 2004 at 08:29:18AM -0800, Jon Smirl wrote:
 
 You're right that DRM is not tied to the OpenGL API. But what about things like
 memory management of the video RAM and AGP space? Right now that is happening
 inside of the OpenGL libraries.

Well, memory management can be a bit of a problem. Currently with DirectFB 
we split the video RAM into a DirectFB part and an OpenGL texture part. It 
would be nice to be able to use DirectFB's memory manager to allocate 
textures.

Then there are issues like multi-head cards and video capture that would 
work best if the memory allocator were moved into the kernel.

 Also note that the DRM interface is different for each board. There is no 'draw
 circle' entry point to DRM. So it is possible to implement another API on top of
 DRM but you will need to coordinate memory management and do a different version
 for each DRM driver.

Of course. It's no different than the situation with MMIO accel.

-- 
Ville Syrjälä
[EMAIL PROTECTED]
http://www.sci.fi/~syrjala/



Re: PCI MGA support (was Re: [Dri-devel] mga cvs changes)

2004-03-13 Thread Ville Syrjälä

On Fri, Mar 12, 2004 at 01:40:30PM -0800, Ian Romanick wrote:
 
 I want to initially support this by just disabling AGP texturing.  I 
 looked in the DRI driver, and it looks like this can be done with some 
 trivial changes in mga_xmesa.c.  Basically, the code just needs to 
 handle the case where serverInfo->agpTextureSize is 0.  The DDX driver 
 needs to be modified to detect PCI cards and do something smart there. 

I think the chip detection for the DRI driver should be in the DRM. I have 
this in my local Mesa 5 tree. I just added a new getparam ioctl that 
returns the PCI id. But since G450 PCI doesn't have a unique PCI id we'd
need something slightly different.

 I don't want to exclude the 
 possibility of sourcing vertex data from on-card memory in the future.

The cards can't do this.

 MGADRIPciInit wouldn't be a complete duplication of MGADRIAgpInit 
 because I don't intend to (initially) support PCIGART.  Even when 
 PCIGART is supported, not all chips in the MGA family support it.  Is 
 the PCI G450 the only one?

I think so. G200 was the last chip to have a real PCI variant but none
of the chips can do scatter gather. I've never heard of PCI G400 or G550. 
Of course even AGP chips can use PCI transfers but that might only make  
sense for something like video capture.

Currently the DMA buffers are 64 kB but if we reduced them to 
PAGE_SIZE we could perhaps support all PCI cards. The only problem is the 
primary buffer since it's 1 MB currently. And switching to PAGE_SIZE 
buffers might require even more space in the primary buffer. I haven't 
actually measured how full the buffers are typically...
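
To put rough numbers on the buffer-size trade-off, here is a small arithmetic
sketch in C. The 64 kB and 1 MB sizes come from the text above; the 4 kB page
size is an assumption (typical for x86, not stated here), and the per-dispatch
cost passed to primary_bytes_used() is a made-up parameter, since the real
cost of referencing a secondary buffer from the primary buffer was not
measured.

```c
#include <stddef.h>

/* Sizes taken from the mail: 64 kB secondary DMA buffers, 1 MB primary. */
#define DMA_BUF_SIZE      (64u * 1024u)
#define PRIMARY_BUF_SIZE  (1024u * 1024u)

/* Assumption: 4 kB pages, as on x86. */
#define ASSUMED_PAGE_SIZE 4096u

/* Number of page-sized buffers needed to carry `bytes` of commands. */
static unsigned page_bufs_needed(size_t bytes)
{
    return (unsigned)((bytes + ASSUMED_PAGE_SIZE - 1) / ASSUMED_PAGE_SIZE);
}

/*
 * Primary buffer bytes consumed to dispatch `bytes` of secondary data,
 * given a hypothetical fixed cost per dispatched buffer.
 */
static size_t primary_bytes_used(size_t bytes, size_t buf_size,
                                 size_t cost_per_dispatch)
{
    size_t nbufs = (bytes + buf_size - 1) / buf_size;
    return nbufs * cost_per_dispatch;
}
```

With 4 kB pages, one 64 kB buffer's worth of commands needs 16 dispatches
instead of 1, so whatever each dispatch costs in the primary buffer is
multiplied by 16.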

-- 
Ville Syrjälä
[EMAIL PROTECTED]
http://www.sci.fi/~syrjala/



Re: PCI MGA support (was Re: [Dri-devel] mga cvs changes)

2004-03-13 Thread Ville Syrjälä

On Fri, Mar 12, 2004 at 03:39:50PM -0800, Alex Deucher wrote:
   As I recall the G450-PCI cards were just AGP chips with an agp to pci
   bridge.  Perhaps you need to hack up an agpgart driver for the bridge?
   Also, for the pci g450, matrox only supported 3D on motherboards with
   intel chipsets for whatever reason.
  
  If possible, I'd like to leave that until later.  Esp. since I don't 
  have any docs for that. :(
  
 
 You do now!  looks like PLX owns HiNT now and they have the databooks
 on their website.  according to this thread:
 http://marc.theaimsgroup.com/?l=dri-devel&m=102373024625910&w=2
 the pci g450 uses the Hint HB1-SE33 bridge.  PLX bought HiNT and
 changed a few names:
 http://www.plxtech.com/products/hint/naming.htm
 but makes the specs available here:
 http://www.plxtech.com/products/hint/6152.asp

I just had a look at the docs and didn't see any mention of an address 
translation table :( Is the bridge only used because they needed to 
connect a 66MHz AGP device to a 33MHz PCI bus?

-- 
Ville Syrjälä
[EMAIL PROTECTED]
http://www.sci.fi/~syrjala/



Re: [Dri-devel] savage-2-0-0 test notes

2004-02-23 Thread Ville Syrjälä

On Mon, Feb 23, 2004 at 11:58:25AM +0100, Felix Kühling wrote:
 Hmm, the driver lacks the *pixel.[ch] files that for instance the mga
 driver has. The r128 driver seems to have some pixel drawing stuff in
 r128_span.[ch].

span files have the software stuff and pixel files have some AGP 
glReadPixels() stuff but I think it's disabled (maybe broken?). The span 
stuff is straightforward to add because there are some templates.

I'm not sure if the AGP glReadPixels() stuff is actually very useful since 
reading from AGP aperture with the CPU is also very slow. I've been 
wondering why we can't enable caching on the AGP aperture. Write combining 
already needs to do some sort of flush after writing so I don't understand 
why we can't use write-back caching instead and just make sure to flush 
the cache before reading stuff. Someone care to enlighten me?

-- 
Ville Syrjälä
[EMAIL PROTECTED]
http://www.sci.fi/~syrjala/



Re: [Dri-devel] Hardware TexSubImage

2004-02-23 Thread Ville Syrjälä

On Mon, Feb 23, 2004 at 10:42:04AM -0700, Brian Paul wrote:
 Chris Ison wrote:
 What do you mean by hardware TexSubImage?
 
 
 Maybe I'm reading it wrong but looking at these texsubimage functions, they
 all copy out the part of the texture to use, what I'm wondering is if
 there is a way to tell the card to do that.
 
 I believe a DMA transfer is used to move texture data from main memory 
 to the card.  What else do you have in mind?

Maybe this isn't what the question was about, but we could try to 
upload only the changed parts of the texture. We'd just have to maintain 
dirty regions in addition to the simple dirty bits we now have.
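
A minimal sketch of what such dirty-region tracking could look like, assuming
a single bounding rectangle per texture image. The names struct dirty_rect,
dirty_rect_add() and dirty_rect_bytes() are hypothetical and not taken from
any driver; a real implementation might prefer several rectangles or
per-row tracking.

```c
#include <stdbool.h>

/* Hypothetical per-texture dirty rectangle (bounding box of all changes). */
struct dirty_rect {
    bool dirty;
    int x1, y1;   /* inclusive top-left */
    int x2, y2;   /* exclusive bottom-right */
};

static void dirty_rect_clear(struct dirty_rect *r)
{
    r->dirty = false;
    r->x1 = r->y1 = r->x2 = r->y2 = 0;
}

/* Grow the dirty rectangle to cover a TexSubImage-style update. */
static void dirty_rect_add(struct dirty_rect *r, int x, int y, int w, int h)
{
    if (w <= 0 || h <= 0)
        return;
    if (!r->dirty) {
        r->x1 = x;
        r->y1 = y;
        r->x2 = x + w;
        r->y2 = y + h;
        r->dirty = true;
        return;
    }
    if (x < r->x1) r->x1 = x;
    if (y < r->y1) r->y1 = y;
    if (x + w > r->x2) r->x2 = x + w;
    if (y + h > r->y2) r->y2 = y + h;
}

/* Bytes that would have to be uploaded for the dirty area. */
static long dirty_rect_bytes(const struct dirty_rect *r, int bytes_per_texel)
{
    if (!r->dirty)
        return 0;
    return (long)(r->x2 - r->x1) * (r->y2 - r->y1) * bytes_per_texel;
}
```

The upload path would then transfer only the rows and columns inside the
rectangle and call dirty_rect_clear() afterwards. A single bounding box is
the simplest choice; two small updates in opposite corners would still mark
almost the whole image dirty.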

-- 
Ville Syrjälä
[EMAIL PROTECTED]
http://www.sci.fi/~syrjala/



Re: [Dri-devel] savage-2-0-0 test notes

2004-02-23 Thread Ville Syrjälä

On Mon, Feb 23, 2004 at 01:38:34PM +0100, Felix Kühling wrote:
  span files have the software stuff and pixel files have some AGP 
  glReadPixels() stuff but I think it's disabled (maybe broken?). The span 
  stuff is straightforward to add because there are some templates.
 
 If the span stuff is all that's needed then it should be working.

I think so. The SetBuffer() function in there should take care of buffer 
selection for all software reading and writing operations.

-- 
Ville Syrjälä
[EMAIL PROTECTED]
http://www.sci.fi/~syrjala/



Re: [Dri-devel] Re: [Dri-users] TORCS initialization problems with Matrox g550

2004-02-17 Thread Ville Syrjälä

On Tue, Feb 17, 2004 at 09:00:12PM +0100, Bernhard Wymann wrote:
 Hi All
 
 The microcode in the MGA driver can't handle multi texturing with 
 projective textures. See Ville's comments in the thread "Slow Mtex in
 DRI".
 
 But in this case I think it should still work, just dead slow (with a 
 software fallback). If I understand the torcs thread correctly, it just 
 crashed instead.
 
 Yes, that is the reason why we introduced the -m switch. It also crashes 
 without -m on i810 and ATI Rage hardware, if I remember correctly.

You're probably testing with old drivers. I fixed the ptex fallback in all 
affected drivers some six months ago.

-- 
Ville Syrjälä
[EMAIL PROTECTED]
http://www.sci.fi/~syrjala/



Re: [Dri-devel] New DRM design (was ATI I2C, MONID)

2004-02-16 Thread Ville Syrjälä

On Mon, Feb 16, 2004 at 01:30:33PM +0100, Helge Hafting wrote:
 Ville Syrjälä wrote:
 [...]
 
 I want to bypass the drm to do accel from user space. Doing an ioctl() for 
 each blit feels very expensive. Rather than do an ioctl() for each blit 
 the drm could check the commands in the DMA buffer for bad stuff. But that 
 doesn't feel very efficient either. 
 If an ioctl per blit is expensive, why don't you make an ioctl
 that accepts an arbitrarily long list of blits?
 You keep a secure interface, and keep performance up by handing over
 a lot of blits in one go.

And what if the app does a few blits and then draws a few lines and then 
some rectangle fills and then... If all of that is folded into one ioctl() 
it pretty much becomes identical to checking the raw DMA buffers for 
bad stuff.
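
To illustrate the point, here is a sketch of what validating such a mixed
batch might look like. Everything in it is hypothetical: the command layout
(struct cmd), the opcodes and batch_is_safe() are invented for illustration
and do not correspond to any real hardware format or DRM interface. The loop
has the same shape whether the commands arrive as an ioctl argument list or
are parsed out of a DMA buffer.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical command encoding; no real hardware uses this layout. */
enum cmd_op { CMD_BLIT, CMD_LINE, CMD_FILL };

struct cmd {
    enum cmd_op op;
    uint32_t dst_offset;   /* byte offset into video memory */
    uint32_t length;       /* bytes touched starting at dst_offset */
};

/*
 * Walk a mixed batch and reject it if any command writes outside the
 * range [base, base + size) that the client is allowed to touch.
 */
static bool batch_is_safe(const struct cmd *cmds, size_t n,
                          uint32_t base, uint32_t size)
{
    for (size_t i = 0; i < n; i++) {
        const struct cmd *c = &cmds[i];

        switch (c->op) {
        case CMD_BLIT:
        case CMD_LINE:
        case CMD_FILL:
            break;
        default:
            return false;          /* unknown opcode */
        }
        if (c->dst_offset < base)
            return false;
        /* Written this way to avoid overflow in dst_offset + length. */
        if (c->length > size || c->dst_offset - base > size - c->length)
            return false;
    }
    return true;
}
```

The per-command work is the same as a raw buffer checker would do, which is
why folding everything into one ioctl() does not obviously save anything.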

 If you don't care about the security I 
 People care about security.  A server shouldn't be able to fall over just
 because someone plays quake on the console. And a home machine shouldn't
 be less stable either, that isn't necessary. You can have both performance
 and security.

I didn't say everyone should not care. People can care if they need to.

 think you should be allowed to bypass it to gain some speed. 
 And you can gain quite a bit more by writing your high-performance
 program for one particular card, one mode and one resolution.  No 
 interfaces at all, only hardware accesses.  That gave us nice performance 
 on the 1MHz machines of the
 eighties.  Nobody does it any more though.

I'm not doing that.

 And finally I find the current situation with multi-head cards quite 
 bad. I'd like the ability for a user space app to open the whole card 
 as one entity. That includes all CRTCs, outputs and the whole memory 
 (minus whatever is in use by other stuff like DMA stuff and video 
 capture). If the app doesn't want to handle such details it would just get 
 whatever is used by the current VT.
 That might be useful, but it is also useful to be able to deal with only one
 head at a time, so that another head may be used by another user.

Yes I know. Actually if the video memory manager is moved to the kernel 
accessing all heads becomes as simple as opening all of them individually. 

-- 
Ville Syrjälä
[EMAIL PROTECTED]
http://www.sci.fi/~syrjala/



Re: [Dri-devel] Slow Mtex in DRI

2004-02-15 Thread Ville Syrjälä

On Sun, Feb 15, 2004 at 01:34:14PM +0100, Felix Kühling wrote:
 Which hardware/driver are you using under Linux?
 
 I just became aware of a hardware limitation in several drivers (mga,
 r128, savage). They can do multi texturing, but if multi texturing and
 projective textures are used at the same time they fall back to software
 rendering.

A comment in the r128 driver sources makes me think the hardware can 
support this but someone would need to add support for a new vertex format 
into the templates.

For mga we would need better microcode to support this. The hardware could 
do it but the microcode is holding us back :(

-- 
Ville Syrjälä
[EMAIL PROTECTED]
http://www.sci.fi/~syrjala/



Re: [Dri-devel] Slow Mtex in DRI

2004-02-15 Thread Ville Syrjälä

On Sun, Feb 15, 2004 at 08:32:15AM -0600, Ryan Underwood wrote:
 
 On Sun, Feb 15, 2004 at 03:50:54PM +0200, Ville Syrjälä wrote:
  
  A comment in the r128 driver sources makes me think the hardware can 
  support this but someone would need to add support for a new vertex format 
  into the templates.
  
  For mga we would need better microcode to support this. The hardware could 
  do it but the microcode is holding us back :(
 
 Is it possible to extract a newer microcode from the windows drivers?

Maybe if someone figures out which microcode is which. And if there's 
projective multitexturing microcode in the drivers someone has to figure 
out how they want the vertex data. I had a look at a hexdump of the 
windows drivers once and it looked like the microcode may be slightly 
different than what we have.

 But I wonder if mga can even do this under
 windows.

I have no idea. It would be easy to test with projtex.

 I guess it would only be G400+, since G200 only has one WARP
 so it seems that it wouldn't be able to do hardware multitexture.

That's right. G200 can't do multitexturing at all. That's the reason why 
it has only one WARP, it doesn't really need two WARPs.

-- 
Ville Syrjälä
[EMAIL PROTECTED]
http://www.sci.fi/~syrjala/



Re: [Dri-devel] Fwd: [Dri-users] glPushAttrib()

2004-02-12 Thread Ville Syrjälä

On Thu, Feb 12, 2004 at 03:43:09PM -0700, Brian Paul wrote:
 
 My suspicion is that this is a state tracking bug in the mga driver. 
 I looked things over a bit and there is a little overlap between 
 lighting state and texture state in the mga driver (separate specular 
 color).

mga? The poster said it was observed on ATI cards.

The older bug mentioned in the link could very likely have been a bug in 
the mga driver. Secondary color handling was broken until I fixed it 
sometime last year.

-- 
Ville Syrjälä
[EMAIL PROTECTED]
http://www.sci.fi/~syrjala/



Re: [Dri-devel] New DRM design (was ATI I2C, MONID)

2004-02-09 Thread Ville Syrjälä

On Sun, Feb 08, 2004 at 08:27:17PM -0800, Jon Smirl wrote:
 I have a new version of DRM at bk://mesa3d.bkbits.net/drm 
 In its current form it's 2.6 kernel only. Some support is generic but I've
 mainly been working on an R200. It is still under development with lots of work to
 do.

Ugh. I find the idea of OpenGL as the one and only API really disturbing. 
I don't want to layer another high level API over OpenGL.


I'll just write some of my ideas here. Naturally I don't have any code to 
actually implement any of this so feel free to ignore it...

Here's a quick and dirty chart of how I think things should be organised.

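----------------------------------------
               user space
   -----------------------
   | fbdev + fbcon | drm |
----------------------------------------
   memory manager/arbitration/DMA/irq
----------------------------------------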

I like the idea of reducing fbdev to something that just writes the 
registers and they could be calculated in user space by a trusted 
application (fbset, xserver...). And fbdev would just notify fbcon 
whenever this happens and fbcon would feed the correct values back to 
fbdev on a VT switch. Fbdev would use the low-level module for irq support 
but besides that I don't think there's much more that it would need. Maybe 
overlays and such could also be handled through fbdev.

I want to bypass the drm to do accel from user space. Doing an ioctl() for 
each blit feels very expensive. Rather than do an ioctl() for each blit 
the drm could check the commands in the DMA buffer for bad stuff. But that 
doesn't feel very efficient either. If you don't care about the security I 
think you should be allowed to bypass it to gain some speed. And of course 
there may be cards without DMA so you may need the ability to do MMIO 
stuff directly.

The memory manager component could probably take care of agpgart usage as 
well as video memory. Drm could use this to check blits, clears and 
whatnot. And yes I want to allow access to the framebuffer memory from 
user space.

And finally I find the current situation with multi-head cards quite 
bad. I'd like the ability for a user space app to open the whole card 
as one entity. That includes all CRTCs, outputs and the whole memory 
(minus whatever is in use by other stuff like DMA stuff and video 
capture). If the app doesn't want to handle such details it would just get 
whatever is used by the current VT.

-- 
Ville Syrjälä
[EMAIL PROTECTED]
http://www.sci.fi/~syrjala/



[Dri-devel] Re: [Mesa3d-dev] Question about texmem.c

2004-01-30 Thread Ville Syrjälä

On Thu, Jan 29, 2004 at 01:07:30PM +0000, Keith Whitwell wrote:
 In driCalculateTextureFirstLastLevel, there's this bit of code:
 
 
   if (tObj->MinFilter == GL_NEAREST || tObj->MinFilter == GL_LINEAR) {
      /* GL_NEAREST and GL_LINEAR only care about GL_TEXTURE_BASE_LEVEL.
       */
 
      firstLevel = lastLevel = tObj->BaseLevel;
   }
   else {
      firstLevel = tObj->BaseLevel + (GLint)(tObj->MinLod + 0.5);
      firstLevel = MAX2(firstLevel, tObj->BaseLevel);
      lastLevel = tObj->BaseLevel + (GLint)(tObj->MaxLod + 0.5);
      lastLevel = MAX2(lastLevel, t->tObj->BaseLevel);
      lastLevel = MIN2(lastLevel, t->tObj->BaseLevel + 
                       baseImage->MaxLog2);
      lastLevel = MIN2(lastLevel, t->tObj->MaxLevel);
      lastLevel = MAX2(firstLevel, lastLevel);
   }
 
 
 I'm wondering if this has been thought through.  For the test to work, this 
 code fragment will have to be re-evaluated whenever tObj->MinFilter 
 changes, or at least whenever it changes to/from NEAREST or LINEAR.
 
 I don't think the drivers do this at the moment.  Correct?

mga does the right thing, i.e. when MinFilter changes we call
driSwapOutTextureObject() which causes mgaSetTexImages() to be called and 
thus driCalculateTextureFirstLastLevel().
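
The dependency on MinFilter can be seen in a standalone version of the quoted
logic. This is a simplified sketch, not the Mesa code: struct tex_params,
enum min_filter and calc_first_last_level() are hypothetical stand-ins for the
texture object fields and driCalculateTextureFirstLastLevel().

```c
/* Stand-ins for the GL enums and Mesa fields used by the quoted code. */
enum min_filter { FILTER_NEAREST, FILTER_LINEAR, FILTER_MIPMAPPED };

struct tex_params {
    enum min_filter min_filter;
    int base_level, max_level;
    float min_lod, max_lod;
    int max_log2;              /* log2 of the base image's largest side */
};

static int imax(int a, int b) { return a > b ? a : b; }
static int imin(int a, int b) { return a < b ? a : b; }

/* Same clamping sequence as the quoted fragment, on the stand-in struct. */
static void calc_first_last_level(const struct tex_params *t,
                                  int *first, int *last)
{
    if (t->min_filter == FILTER_NEAREST || t->min_filter == FILTER_LINEAR) {
        /* Non-mipmapped filters only look at the base level. */
        *first = *last = t->base_level;
    } else {
        *first = t->base_level + (int)(t->min_lod + 0.5f);
        *first = imax(*first, t->base_level);
        *last = t->base_level + (int)(t->max_lod + 0.5f);
        *last = imax(*last, t->base_level);
        *last = imin(*last, t->base_level + t->max_log2);
        *last = imin(*last, t->max_level);
        *last = imax(*first, *last);
    }
}
```

For a 256x256 texture (max_log2 of 8) with default LOD limits, the result is
levels 0..0 with a LINEAR filter and 0..8 with a mipmapped one, so a cached
result goes stale as soon as the filter switches between the two groups.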

-- 
Ville Syrjälä
[EMAIL PROTECTED]
http://www.sci.fi/~syrjala/



Re: [Dri-devel] MGA font corruption revisited - now reproducible

2004-01-20 Thread Ville Syrjälä

On Tue, Jan 20, 2004 at 03:52:37AM -0600, Ryan Underwood wrote:

 Ville,
 
 On Thu, Dec 11, 2003 at 02:02:36PM +0200, Ville Syrjälä wrote:
 
  I can't reproduce any font corruption with crack-attack (or any other gl
  app) and quake2 just segfaults when I try to run it. Maybe it doesn't
  like to run with the demo pak file...
  
  But running quake3 and crack-attack at the same time does cause some
  really nasty texture corruption. They appear to step on each others'
  textures...
 
 Just to let you know, it appears the RENDER bug has been solved. I
 think I didn't properly replace the driver before. :) However, I was
 doing my own driver hacking, so I was forced to replace it correctly
 this time.

Ah good.

 The only problem I have with the mga driver right now is lack of mouse
 cursor in UT, though there is a claim in bugzilla that you fixed it. Do
 you have any details on the fix?

http://freedesktop.org/cgi-bin/viewcvs.cgi/dri/xc/xc/lib/GL/mesa/src/drv/mga/Attic/mga_xmesa.c.diff?r1=1.66&r2=1.67&hideattic=0
http://freedesktop.org/cgi-bin/viewcvs.cgi/dri/xc/xc/lib/GL/mesa/src/drv/mga/Attic/mgaioctl.c.diff?r1=1.37&r2=1.38&hideattic=0

The _mesa_notifySwapBuffers() call is the important bit. Without that the
pipeline wasn't flushed properly.

--
Ville Syrjälä
[EMAIL PROTECTED]
http://www.sci.fi/~syrjala/


Re: [Dri-devel] GL_ATI_envmap_bumpmap

2004-01-20 Thread Ville Syrjälä

On Tue, Jan 20, 2004 at 03:47:49PM -0600, Ryan Underwood wrote:
 
 Hi,
 
 GL_ATI_envmap_bumpmap seems to describe identical functionality to
 DirectX6 EMBM.  ATI's drivers support this extension and it is
 implemented in Mesa apparently.  Does anyone know of a demo or sample
 code that utilizes this extension?

Last time I looked Mesa didn't support this extension. My plan was to add 
bumpmap support to the mga driver if/when someone added the relevant Mesa 
bits...

-- 
Ville Syrjälä
[EMAIL PROTECTED]
http://www.sci.fi/~syrjala/



Re: [Dri-devel] Re: CVS Update: xc (branch: savage-2-0-0-branch)

2003-12-20 Thread Ville Syrjälä

On Sat, Dec 20, 2003 at 01:47:24PM +0100, Felix Kühling wrote:
 I believe this is also missing in the MGA driver. The symptoms can be
 seen for instance in the gflux demo. Part of the waving square is not
 drawn.

I fixed the mga swapbuffers code a few weeks ago.

 On Sat, 20 Dec 2003 04:44:31 -0800
 Felix Kuehling [EMAIL PROTECTED] wrote:
 
  CVSROOT:     /cvs/dri
  Module name: xc
  Repository:  xc/xc/lib/GL/mesa/src/drv/savage/
  Changes by:  [EMAIL PROTECTED]   03/12/20 04:44:31
  
  Log message:
    Call _mesa_notifySwapBuffers before buffer swapping.
  

-- 
Ville Syrjälä
[EMAIL PROTECTED]
http://www.sci.fi/~syrjala/



Re: [Dri-devel] MGA font corruption revisited - now reproducible

2003-12-11 Thread Ville Syrjälä

On Wed, Dec 10, 2003 at 06:32:18AM -0600, Ryan Underwood wrote:
 
 On Wed, Dec 10, 2003 at 12:53:53PM +0200, Ville Syrjälä wrote:
  
  If you remove/comment out the following line
  $(MAKE_CMD) $(MFLAGS) $(WORLDOPTS) World
  in the top Makefile make World shouldn't actually build anything.
  After that you should be able to build only the driver.
  
  I uploaded my patched mga_drv.o to my website
  http://www.sci.fi/~syrjala/dri/ so you might want to try that before
  building yourself...
 
 Here is a long story, prepare yourself.
 
 I ran crack-attack and the corruption didn't occur this time.

I can't reproduce any font corruption with crack-attack (or any other gl
app) and quake2 just segfaults when I try to run it. Maybe it doesn't
like to run with the demo pak file...

But running quake3 and crack-attack at the same time does cause some
really nasty texture corruption. They appear to step on each others'
textures...

I've only tried with gtk apps because I don't have qt installed. 

 Something else, 4 out of 5 times, quake2 does not cleanly exit, strace
 ends here (10 is the fd of /dev/dri/card0):
 
 [pid 12025] ioctl(10, 0x400c6445, 0xbfffe4f0) = 0
 [pid 12025] ioctl(10, 0xc0286429, 0xbfffe330) = 0
 [pid 12025] ioctl(10, 0x400c6445, 0xbfffe4f0) = 0
 [pid 12025] ioctl(10, 0xc0286429, 0xbfffe330) = 0
 [pid 12025] ioctl(10, 0x400c6445, 0xbfffe4f0) = 0
 [pid 12025] ioctl(10, 0xc0286429, 0xbfffe0b0) = 0
 [pid 12025] ioctl(10, 0x400c6445, 0xbfffe4f0) = 0
 [pid 12025] ioctl(10, 0xc0286429, 0xbfffe330) = 0
 [pid 12025] ioctl(10, 0x400c6445, 0xbfffe4f0) = 0
 [pid 12025] ioctl(10, 0xc0286429, 0xbfffe330) = 0
 [pid 12025] ioctl(10, 0x400c6445, 0xbfffe4f0) = 0
 [pid 12025] ioctl(10, 0x4008642b, 0xbfffe510) = 0
 [pid 12025] ioctl(10, 0x4008642a, 0xbfffe138) = ? ERESTARTSYS (To be restarted)
 [pid 12025] +++ killed by SIGKILL +++
 
 It hangs on that last ioctl which is where I kill the application.  How do
 I translate those ioctls into hardware commands

You need to match the 8 lsbs of that middle number to the ioctl numbers
specified in drm.h and mga_drm.h.
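
As a sketch of that decoding, the helpers below split a request number from
the strace output into its fields, assuming the standard Linux ioctl encoding
on x86 (8 bits command number, 8 bits type, 14 bits argument size, 2 bits
direction). The helper names are made up for this example; the low byte
returned by ioctl_nr() is the value to look up in drm.h and mga_drm.h.

```c
#include <stdint.h>

/*
 * Field layout of a Linux ioctl request number on x86:
 * bits 0-7 command number, 8-15 type, 16-29 argument size,
 * 30-31 direction (1 = write, 2 = read, 3 = both).
 */
static unsigned ioctl_nr(uint32_t req)   { return req & 0xff; }
static unsigned ioctl_type(uint32_t req) { return (req >> 8) & 0xff; }
static unsigned ioctl_size(uint32_t req) { return (req >> 16) & 0x3fff; }
static unsigned ioctl_dir(uint32_t req)  { return (req >> 30) & 0x3; }
```

For the final call in the trace, 0x4008642a, this gives command number 0x2a,
type 0x64 ('d'), an 8-byte argument and direction 1 (write).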

 so I know where to look
 for the problem?

That last one is the DRM_LOCK ioctl. It returns -ERESTARTSYS when there's
a signal pending. I suppose the app just sits there and doesn't
handle the signal for some reason. Some SDL magic?

It's not a driver problem AFAICS.

 On another topic, do you use a dualhead G400?  If so, are you able to
 properly use DPMS on the second head?

I don't run XFree86 except when trying to hunt DRI related bugs. It's
been well over a year since I really used XFree86 and I honestly don't
remember if DPMS ever worked with the second head. I don't have a second
monitor to test right now.

-- 
Ville Syrjälä
[EMAIL PROTECTED]
http://www.sci.fi/~syrjala/



Re: [Dri-devel] MGA font corruption revisited - now reproducible

2003-12-10 Thread Ville Syrjälä

On Tue, Dec 09, 2003 at 01:24:16PM -0600, Ryan Underwood wrote:
 
 Thanks for the insight.  Is this already something that has been
 extensively looked at without success, or would it be worth my time to
 dig into the code and try to find the cause?  I've thought about it, but
 afraid that I will just hit a brick wall someone else already ran into
 with it. ;)

I've attached a patch that should hopefully fix this problem. The render
code just forgot to reset the multi texturing registers. I've not
actually tested the patch but I don't see anything else wrong with the
code...

 Is there anywhere I can get a G400 databook for reference, or is that
 not publicly available?

They're not available anymore :( It's a real shame since they seemed to be
quite friendly towards open source developers at one point. I can almost
understand that they don't want to release any parhelia docs but I don't
understand why they stopped giving out the older docs...

-- 
Ville Syrjälä
[EMAIL PROTECTED]
http://www.sci.fi/~syrjala/
Index: mga_reg.h
===
RCS file: /cvs/dri/xc/xc/programs/Xserver/hw/xfree86/drivers/mga/mga_reg.h,v
retrieving revision 1.8
diff -u -r1.8 mga_reg.h
--- mga_reg.h   27 Jan 2002 20:05:35 -0000  1.8
+++ mga_reg.h   10 Dec 2003 06:27:05 -0000
@@ -475,6 +475,9 @@
 #define MGAREG_ALPHACTRL   0x2c7c
 #define MGAREG_DWGSYNC 0x2c4c
 
+#define MGAREG_TDUALSTAGE0 0x2cf8
+#define MGAREG_TDUALSTAGE1 0x2cfc
+
 #define MGAREG_AGP_PLL 0x1e4c
 #define MGA_AGP2XPLL_ENABLE    0x1
 #define MGA_AGP2XPLL_DISABLE   0x0
Index: mga_storm.c
===
RCS file: /cvs/dri/xc/xc/programs/Xserver/hw/xfree86/drivers/mga/mga_storm.c,v
retrieving revision 1.20
diff -u -r1.20 mga_storm.c
--- mga_storm.c 25 Mar 2003 11:21:06 -0000  1.20
+++ mga_storm.c 10 Dec 2003 06:27:07 -0000
@@ -341,6 +341,10 @@
 tex_padw = 1 << log2w;
 tex_padh = 1 << log2h;
 
+WAITFIFO(2);
+OUTREG(MGAREG_TDUALSTAGE0, 0);
+OUTREG(MGAREG_TDUALSTAGE1, 0);
+
 WAITFIFO(15);
 OUTREG(MGAREG_TMR0, (1 << 20) / tex_padw);  /* sx inc */
 OUTREG(MGAREG_TMR1, 0);  /* sy inc */
@@ -425,6 +429,9 @@
 tex_padw = 1 << log2w;
 tex_padh = 1 << log2h;
 
+WAITFIFO(2);
+OUTREG(MGAREG_TDUALSTAGE0, 0);
+OUTREG(MGAREG_TDUALSTAGE1, 0);
 
 WAITFIFO(12);
 OUTREG(MGAREG_DR4, red << 7);  /* red start */
@@ -522,6 +529,10 @@
 tex_padw = 1 << log2w;
 tex_padh = 1 << log2h;
 
+WAITFIFO(2);
+OUTREG(MGAREG_TDUALSTAGE0, 0);
+OUTREG(MGAREG_TDUALSTAGE1, 0);
+
 WAITFIFO(15);
 OUTREG(MGAREG_TMR0, (1 << 20) / tex_padw);  /* sx inc */
 OUTREG(MGAREG_TMR1, 0);  /* sy inc */

Re: [Dri-devel] Polygon offsets with r128 and mga

2003-12-04 Thread Ville Syrjälä

On Fri, Dec 05, 2003 at 12:14:18AM +0100, Felix Kühling wrote:
 Hi,
 
 as I'm trying to port the savage driver I stumbled over this. The mga
 and r128 drivers define different values for DEPTH_SCALE in xxx_tris.c.
 This is a parameter of t_dd_tritmp.h and specifies the minimum
 resolvable unit of depth coordinates for computing polygon offsets. In
 mgatris.c it is defined as mmesa->depth_scale, while in r128_tris.c it
 is defined as 1.0. I couldn't find any difference in the way the two
 drivers setup vertices (including the hw_viewport matrix **) that would
 explain the difference. I'm wondering which one is correct. My guess
 based on my vague memory of how the projection transformation works is
 mga, but I may be wrong. Any ideas?

At least on mga glean's polygonOffset test fails with 1.0. It passes with
depth_scale and 16bit depth buffer. The test fails with 15bit, 24bit and
32bit depth buffers regardless of this value. I can make 15bit pass with a
small adjustment to depth_scale...
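
For reference, a small sketch of why the value matters. It uses the polygon
offset formula from the OpenGL specification, o = m * factor + r * units,
where r is the minimum resolvable depth difference. The assumption that r is
1 / (2^n - 1) for an n-bit depth buffer with depth normalised to [0, 1] is
mine, and depth_unit() and polygon_offset() are hypothetical helpers, not
functions from either driver.

```c
/*
 * Assumption: the minimum resolvable depth step of an n-bit integer depth
 * buffer, with depth normalised to [0, 1], is 1 / (2^n - 1).
 */
static double depth_unit(unsigned depth_bits)
{
    double max_value = 1.0;
    for (unsigned i = 0; i < depth_bits; i++)
        max_value *= 2.0;
    return 1.0 / (max_value - 1.0);
}

/*
 * OpenGL polygon offset: o = m * factor + r * units, where m is the
 * polygon's maximum depth slope and r the minimum resolvable unit.
 */
static double polygon_offset(double max_slope, double factor,
                             double units, double r)
{
    return max_slope * factor + r * units;
}
```

Under that assumption, with a 16bit depth buffer one offset unit is about
0.000015 of the depth range, whereas using 1.0 as the unit would push a
flat polygon across the entire range.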

-- 
Ville Syrjälä
[EMAIL PROTECTED]
http://www.sci.fi/~syrjala/



Re: [Dri-devel] Incomplete scene with OpenGL + direct rendering + mga

2003-11-26 Thread Ville Syrjälä

On Wed, Nov 26, 2003 at 09:30:57AM +0100, Jan Gukelberger wrote:
  You should try the daily snapshots available from the DRI website... 
  
 I did so yesterday. 
 I also had to install the XFree86 binary from the 'extra' section and
 ln -s libexpat.so.0 libexpat.so.1 
 
 However, it didn't solve any problem. The examples of my first message still
 produce the same results as before. :-( 

Ok I tried this myself in XFree86 and you're absolutely correct. Only the
first triangle is visible.

It looked like a clipping problem. So I added some debug printks into the
drm module but the clipping coordinates are in fact correct.

Then I tried drawing another triangle in the middle and behold the right
hand triangle became visible.

Doing a glFlush() before glutSwapBuffers() makes the last triangle
visible.

This also explains why I didn't see it with DirectFB since it always does
a glFlush() when the context is unlocked.

I think there's something wrong with tracking DMA buffer ages...

 Moreover, bzflag does now often freeze. This happens reproducibly when
 enabling the option Smoothing.

Indeed it does. Also many of the mesa demos now segfault in the spanline
functions. I think this might be what happens to bzflag too since
smoothing uses a software fallback. Although bzflag doesn't leave a
coredump for some reason...

 Any further advice? 

I'll see if I can solve these issues and get back to you...

-- 
Ville Syrjälä
[EMAIL PROTECTED]
http://www.sci.fi/~syrjala/



Re: [Dri-devel] Incomplete scene with OpenGL + direct rendering + mga

2003-11-26 Thread Ville Syrjälä

On Wed, Nov 26, 2003 at 03:18:08PM +0100, Jan Gukelberger wrote:
  Doing a glFlush() before glutSwapBuffers() makes the last triangle 
  visible. 
  
 That's right! At least, this is a temporary workaround so I can go on
 experimenting with OpenGL programming. 
  
 But I really hate not being able to play bzflag during lunch time ;-(

Both problems are now fixed.

It's a good thing you reported this because your missing triangle was the
same bug that caused the mouse cursor to disappear in ut2k3 and some other
games :) A simple example with source was extremely helpful.

I've uploaded a new mga_dri.so to my website http://www.sci.fi/~syrjala/dri/
so you don't have to wait for another daily snapshot.

-- 
Ville Syrjälä
[EMAIL PROTECTED]
http://www.sci.fi/~syrjala/



Re: [Dri-devel] Incomplete scene with OpenGL + direct rendering + mga

2003-11-25 Thread Ville Syrjälä

On Tue, Nov 25, 2003 at 10:40:27AM +0100, Jan Gukelberger wrote:
 I have tried to compile and run some simple OpenGL programs. But lots 
 of them don't display anything or only part of the scene if direct 
 rendering is enabled. 
 However, with LIBGL_ALWAYS_INDIRECT=y the scene looks like it should 
 do - but animation is extremely slow. 
  
 My system is a standard SuSE 8.2 installation with XFree86 4.3.0 and 
 the standard 2.4.20 kernel. 
 Graphics adapter: Matrox Millennium G550 (only first head configured)
 I would have already tried to upgrade some packages if I had known
 where the problem is.

The mga dri driver had a lot of bugs since no one was actively maintaining
it. I took that job a few months ago and have since been fixing bugs and
adding new features. What we have in cvs should be quite good.

 A program demonstrating this is the following. 
 As this is extracted from example source code of the NeHe OpenGL tutorial 
 (nehe.gamedev.net) there shouldn't be any coding mistakes in it.

This example works fine on my system. I'm running DirectFB instead of
XFree86 but that should not make any difference.

You should try the daily snapshots available from the DRI website...

-- 
Ville Syrjälä
[EMAIL PROTECTED]
http://www.sci.fi/~syrjala/



Re: libexpat foo, was: Re: [Dri-devel] ATI radeon 9200 DRI

2003-11-25 Thread Ville Syrjälä

On Tue, Nov 25, 2003 at 12:28:29PM -0800, Ian Romanick wrote:
 The problem seems to be with older distros that have libexpat.so.0 
 instead of libexpat.so.1.  People have found that it works if they 
 symlink libexpat.so.0 to libexpat.so.1.

It's not just older distros. Gentoo has libexpat.so.0 also. libexpat is
version 1.95.7 btw.

Why does the build system link with .1 anyway? Does it have two versions
of libexpat installed and the linker does this automagically or does the
distro package add some patch to expat forcing the name to .1? Gentoo's
expat doesn't seem to have any patches related to this.
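
As a rough illustration of why the suffix matters: the dynamic linker looks
for a file named exactly like the soname recorded in the binary, so a driver
that asks for libexpat.so.1 is not satisfied by libexpat.so.0 unless a
symlink provides the other name. The sketch below only models that name
comparison. The helper names soname_major() and soname_satisfies() are
hypothetical and are not part of expat or of any loader API.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return the major version that follows ".so." in a
 * soname such as "libexpat.so.1", or -1 if there is no numeric suffix. */
long soname_major(const char *soname)
{
    const char *p = strstr(soname, ".so.");

    if (!p)
        return -1;
    p += 4; /* skip past ".so." */
    if (*p < '0' || *p > '9')
        return -1;
    return strtol(p, NULL, 10);
}

/* Hypothetical helper: a needed soname is only satisfied by a file with
 * exactly the same name; "close" version numbers do not count. */
int soname_satisfies(const char *needed, const char *available)
{
    return strcmp(needed, available) == 0;
}
```

With these, soname_satisfies("libexpat.so.1", "libexpat.so.0") is false,
which matches the failure people see when only the .0 name exists.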

 Can we just link with libexpat.so instead?

So someone should just relink the build system's libexpat with -soname
libexpat.so?

-- 
Ville Syrjälä
[EMAIL PROTECTED]
http://www.sci.fi/~syrjala/


