On 5/7/07, Thomas Hellström <[EMAIL PROTECTED]> wrote:
> Dave Airlie wrote:
> >> I've started a cleanup branch,
> >>
> >> I've just checked in the mm_ioctl split out into separate parts, I'll
> >> try and get fence and buffer done as well..
> >>
> >> This will break compatibility but to be honest, anyone who has deployed
> >> this beyond embedded system work (i.e. TG and me), deserves what they
> >> get for integrating unreleased code :-)
> >
> > Of course, I'm not sure quite how to fix up the buffer object ioctl
> > chain scheme. I'm not 100% sure it is really a win in most situations.
> > Do we have any numbers?
> >
> > Dave.
> Dave,
> It's good that you've found time to do this.
>
> The IOCTL chaining is presently only used to submit a list of buffers
> for validation. Any other use can go away, and if we can find a better
> way of submitting a buffer list (be it a linked list or array) for a
> validatebufferlist, we can and probably should use that.

Okay, I've cleaned up this stuff as best I can. It builds, but I haven't
tested it, as my crash box is hooked up doing some other testing at the
moment. I kept the list for the fence/validate operations.

I've also removed the typedefs from most of the code, and I don't see a
major impact on readability, so I may proceed to do this in more
places.

>
> If there is a consensus to skip the "validatebufferlist" functionality
> altogether, we can skip that as well, but we still need a way to submit a
> buffer list with the device-specific "submit" ioctl, so it is my gut
> feeling that we should have a "validatebufferlist" reference
> implementation. I'm also not sure what the Nouveau guys think of
> removing this functionality. I like the linked list for the following
> two reasons:
>
>    1. When you start a command buffer sequence with BEGIN_BATCH(_dwords,
>       _relocs, _flags), you know the number of relocs and the space
>       needed in the batch buffer, but you don't know for sure the number
>       of buffers on the validate list.
>    2. It's been tested, debugged and it works ok.
>
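The first reason above can be illustrated with a minimal sketch of a linked validate list, where buffers are appended one at a time while the batch is built, so no count is needed up front. The names `validate_req`, `validate_list` and `validate_list_add` are hypothetical and are not the actual DRM structures or ioctl arguments; only the "flags plus mask" pair is taken from the discussion.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical request node; NOT the real DRM ioctl argument layout. */
struct validate_req {
    uint32_t handle;            /* buffer object handle */
    uint64_t flags;             /* new flag values */
    uint64_t mask;              /* which flag bits the request may change */
    struct validate_req *next;  /* next request in the chain, NULL ends it */
};

/* Hypothetical list head kept by user space while a batch is built. */
struct validate_list {
    struct validate_req *head;
    struct validate_req *tail;
    size_t count;
};

/* Append one buffer; returns 0 on success, -1 if allocation fails. */
static int validate_list_add(struct validate_list *list, uint32_t handle,
                             uint64_t flags, uint64_t mask)
{
    struct validate_req *req;

    assert(list != NULL);
    req = calloc(1, sizeof(*req));
    if (req == NULL)
        return -1;
    req->handle = handle;
    req->flags = flags;
    req->mask = mask;
    if (list->tail != NULL)
        list->tail->next = req;   /* chain onto the previous request */
    else
        list->head = req;         /* first request in the list */
    list->tail = req;
    list->count++;
    return 0;
}

/* Release every node and reset the list to empty. */
static void validate_list_free(struct validate_list *list)
{
    struct validate_req *req = list->head;

    while (req != NULL) {
        struct validate_req *next = req->next;
        free(req);
        req = next;
    }
    list->head = NULL;
    list->tail = NULL;
    list->count = 0;
}
```

An array would need either a worst-case allocation or a resize when the batch references more buffers than expected; the chain avoids both at the cost of one pointer per request.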
> Another important consideration is "what buffer attributes should we
> allow the validate call to change?".
> Currently it can only change flags (with a mask) and give a hint as to
> how a certain operation should be performed.
> I'd like to see the following attribute as well:
>
>     * The fence class, that is, "what command submission mechanism are
>       we validating for?". If the fence class does change compared to
>       the current fence (if any), we need to expire the current fence
>       before validating. This is needed for hardware with multiple
>       command submission mechanisms.
>
>
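The fence class rule described above might look roughly like the following sketch. All types and functions here (`fence`, `buffer`, `fence_wait`, `prepare_for_class`) are hypothetical stand-ins, not the DRM fence API, and "expiring" the fence is modelled as simply marking it signaled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fence: records which submission mechanism it belongs to. */
struct fence {
    uint32_t fence_class;
    bool signaled;
};

/* Hypothetical buffer object holding its current fence, if any. */
struct buffer {
    struct fence *fence;
    uint32_t fence_class;
};

/* Stand-in for waiting on the hardware; here the fence just signals. */
static void fence_wait(struct fence *f)
{
    f->signaled = true;
}

/*
 * Prepare a buffer for validation against new_class. If the buffer has
 * a fence from a different class, expire it first. Returns true when an
 * old fence had to be expired.
 */
static bool prepare_for_class(struct buffer *bo, uint32_t new_class)
{
    bool expired = false;

    assert(bo != NULL);
    if (bo->fence != NULL && bo->fence->fence_class != new_class) {
        fence_wait(bo->fence);  /* old mechanism must finish with the buffer */
        bo->fence = NULL;
        expired = true;
    }
    bo->fence_class = new_class;
    return expired;
}
```

Validating for the same class leaves the existing fence in place, so the wait is only paid when a buffer actually moves between submission mechanisms.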
> Then there's tiling. I think the kernel needs to know about tiling to
> efficiently set up tiling (AKA "fence") registers in the limited GPU
> virtual address space (if possible on a "per buffer" basis), and then it
> needs to know also about the desired tile stride and a device specific
> tile-type. User space gets back the actual set tile-type and the
> underlying GPU virtual space tile-stride (which may exceed the desired
> tile stride due to hardware limitations).  I think setting the tile
> attributes can be done using a separate IOCTL, but the actual GPU
> virtual space tile stride needs to be reported back with the validate IOCTL.
>
> So a typical command sequence for a tiled buffer would be:
>
> drmBOSetTiled(bo, 3 /* Tile stride */, DRM_INTEL_FLAG_TILED_XMAJOR);
>
> BEGIN_BATCH(5 /*dwords*/, 2 /*relocs*/, myFlags);
> OUT_BATCH(...)
> OUT_BATCH(...)
> OUT_RELOC_OFFSET(buffer);
> OUT_RELOC_STRIDE(buffer);
> OUT_BATCH(FIRE_CMD);
> ADVANCE_BATCH();
>
> To summarize:
>
>     * Tile parameters can be set using a separate ioctl (tile stride and
>       tile type)
>     * An extra validate "TILED" flag can make DRM set up the buffer for
>       tiling.
>     * Validate needs to return the GPU virtual space tile-stride, and
>       perhaps the actual tile type.
>     * We can use pte tricks to always have the _desired_ tile stride for
>       CPU mappings. (This means we might have different strides in GPU-
>       and CPU space).
>     * Only the pages actually _used_ by the buffer are inserted into GPU
>       virtual space. This may not save GPU virtual space, but it saves a
>       lot of memory for buffers with odd strides. (For example 1025).
>
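To make the stride points in the summary concrete, here is a small sketch of the arithmetic. It assumes, purely for illustration, that the hardware requires a power-of-two tile stride in GPU virtual space; the real constraint is device specific and is not stated in this thread. Strides and sizes are in abstract units, and `gpu_tile_stride`, `tile_layout` and `tile_layout_for` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Assumed hardware rule (illustration only): the GPU-space tile stride
 * must be a power of two, so the desired stride is rounded up.
 */
static uint32_t gpu_tile_stride(uint32_t desired)
{
    uint32_t stride = 1;

    assert(desired > 0 && desired <= 0x80000000u);
    while (stride < desired)
        stride <<= 1;
    return stride;
}

/* Hypothetical result of laying out a tiled buffer. */
struct tile_layout {
    uint32_t gpu_stride;      /* actual stride reported back by validate */
    uint64_t reserved_units;  /* GPU virtual space covered by the buffer */
    uint64_t backed_units;    /* units that actually need backing pages */
};

/* Compute how much GPU space is reserved versus actually backed. */
static struct tile_layout tile_layout_for(uint32_t desired_stride,
                                          uint32_t rows)
{
    struct tile_layout layout;

    layout.gpu_stride = gpu_tile_stride(desired_stride);
    layout.reserved_units = (uint64_t)layout.gpu_stride * rows;
    layout.backed_units = (uint64_t)desired_stride * rows;
    return layout;
}
```

Under this assumption a desired stride of 1025 becomes a GPU stride of 2048, so nearly half of the reserved range is gap that never needs memory behind it, which is the saving the last bullet describes.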

This interface sounds okay for tiling, I'd need to play around with it
to be sure it would work on radeon, which I probably won't get to
before I go on holidays... but I was looking at doing this for my own
project as it might save me 2-3MBs of memory I'm needlessly mapping
into the GTT at the moment to fill in the tile gaps..

Dave.

--
_______________________________________________
Dri-devel mailing list
Dri-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dri-devel