> To me this speaks to another aspect of the gallium interface which is
> a bit odd -- in particular the way several of our interfaces basically
> copy their inputs into a structure and pass that back to the state
> tracker.  Why are we doing that?  The state tracker already knows what
> it asked us to do, and there is no reason to assume that it needs us
> to re-present that information back to it.

Yes; only the CSOs avoid this form of copying: all the other
structures keep their input parameters embedded in them.

As a random example, pipe_sampler_view has lots of parameters that
the driver will have converted into the hardware format; they are
thus redundant, and unlikely to be read by the state tracker.
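
For reference, the struct currently looks roughly like this (abridged
from p_state.h; bitfield widths elided, so treat this as approximate):

struct pipe_sampler_view
{
   struct pipe_reference reference;
   enum pipe_format format;        /* typed PIPE_FORMAT_x view of the data */
   struct pipe_resource *texture;  /* resource this is a view into */
   unsigned first_level, last_level;                    /* mipmap range */
   unsigned swizzle_r, swizzle_g, swizzle_b, swizzle_a; /* PIPE_SWIZZLE_x */
};

Everything after "reference" is input that the driver bakes into a
hardware texture descriptor at creation time.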

Textures and buffers also have many visible data members that the
state tracker may or may not read.
In particular, the Mesa state tracker already keeps everything in
Mesa's internal structures, and so benefits little from such data.

We may want to consider going toward making _all_ Gallium structures
opaque (and, by the way, using declared-only structs instead of the
void* we use for CSOs, since void* handles cannot be checked by the
compiler).
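
A minimal sketch of the difference; today's CSO handles are void*, so
the compiler cannot catch a blend CSO being bound where a sampler CSO
is expected ("struct cso_blend" is a made-up name for illustration):

/* Current style: opaque, but unchecked. */
void *(*create_blend_state)(struct pipe_context *,
                            const struct pipe_blend_state *);
void (*bind_blend_state)(struct pipe_context *, void *);

/* Declared-only style: the struct body never appears in the public
 * headers, so the handle stays opaque but is type-checked. */
struct cso_blend;
struct cso_blend *(*create_blend_state)(struct pipe_context *,
                                        const struct pipe_blend_state *);
void (*bind_blend_state)(struct pipe_context *, struct cso_blend *);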


Another serious data duplication issue comes from drivers that just
copy the input state into internal structures and return, then
process everything at draw-call time.

This usually results in state being duplicated (and copied) three
times: in Mesa's internal structures, in the state tracker's
structures, and then in the driver.
The draw module may also keep a fourth copy of the state.
Note that when reference counting is involved, copies are even more
expensive since they now need atomic operations.
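
For illustration, a simplified version of what each extra copy costs
(Gallium's real helper is pipe_reference(); this sketch uses C11
atomics and made-up names):

#include <stdatomic.h>
#include <stdlib.h>

struct refcounted_state {
   atomic_int refcount;
   /* ...the actual state data... */
};

/* Retargeting a pointer to shared state costs an atomic increment
 * plus an atomic decrement; multiplied by three or four copies per
 * state change, this shows up on the hot path. */
static void state_reference(struct refcounted_state **dst,
                            struct refcounted_state *src)
{
   if (src)
      atomic_fetch_add(&src->refcount, 1);
   if (*dst && atomic_fetch_sub(&(*dst)->refcount, 1) == 1)
      free(*dst);   /* dropped the last reference */
   *dst = src;
}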

Usually drivers do this because:
1. They need to pass data to the draw module in case of fallbacks, and
thus cannot send it to hardware and forget about it
2. They need to recreate the whole hardware context state in some cases
3. They multiplex multiple pipe_contexts on a single screen
4. They need a global view of state, rather than a single state change
at a time, to decide what to do

A possible solution is to remove all the set_* and bind_* calls and
replace them with data members of pipe_context, which the state
tracker would use instead of its own internal structures.

In addition, a new "what's new" bitmask would be added, and the
driver would check it on draw calls.

Performance-wise, this replaces num_state_changes dynamic function
calls into the driver with (log2(total_states) + num_state_changes)
branches to check the "what's new" bitmask.
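
To make this concrete, a rough sketch of the scheme (all names here
are made up for illustration; the placeholder state structs stand in
for the real ones from p_state.h):

struct pipe_viewport_state { float scale[4], translate[4]; };
struct pipe_blend_state    { unsigned blend_enable; /* ... */ };

enum {
   NEW_BLEND    = 1 << 0,
   NEW_VIEWPORT = 1 << 1,
   /* ...one bit (or group of bits) per state atom... */
};

struct pipe_context {
   unsigned dirty;                      /* the "what's new" bitmask */
   struct pipe_blend_state blend;
   struct pipe_viewport_state viewport;
   /* ...all the other state atoms... */
};

/* State tracker side: construct the state in place and flag it;
 * no call into the driver. */
static void st_set_viewport(struct pipe_context *pipe,
                            const struct pipe_viewport_state *vp)
{
   pipe->viewport = *vp;
   pipe->dirty |= NEW_VIEWPORT;
}

/* Driver side, on draw calls: a branch per dirty group instead of a
 * dynamic function call per state change. */
static void drv_validate_state(struct pipe_context *pipe)
{
   if (pipe->dirty & NEW_VIEWPORT)
      ;  /* emit viewport registers */
   if (pipe->dirty & NEW_BLEND)
      ;  /* emit blend registers */
   pipe->dirty = 0;
}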

Furthermore:
1. State is never copied, since the state tracker constructs it in place
2. There is no longer any need for the "state save" helpers in the
blitter module and similar code
3. The draw module can potentially read state directly from
pipe_context instead of duplicating it yet again
4. Drivers no longer need all the functions that merely store the
parameters, set a dirty flag and return

Note that the Direct3D DDI does not do this, but they have to keep
binary compatibility, which is easier with Set* calls than this
scheme.

softpipe, nvfx, nv50, r300 and probably others already do this
internally, and having the state tracker itself construct the data
would remove a lot of redundant copying code and increase performance.

Having drivers capable of doing "send-to-hardware-and-forget-about-it"
on arbitrary state setting could be a nice thing instead, but
unfortunately a lot of hardware fundamentally can't do this, since for
instance:
1. All the shaders need to be seen together to be linked, possibly
modifying the shaders themselves (nv30)
2. Constants need to be written directly into the fragment program (nv30-nv40)
3. Fragment programs depend on the viewport to implement
fragment.position (r300)
4. Fragment programs depend on the bound textures to specify the
normalization type and to emulate NPOT (r300, r600?, nv30)
5. Sometimes sampler state and textures must be seen together, since
the hardware mixes them
and so on...


> The only really new information provided by the driver to the state
> tracker by transfer_create + transfer_map is:
> - the pointer to the data
> - stride
> - slice stride

There is also the 3D box, unless transfers start covering the whole
resource, which seems really suboptimal for stuff like glTexSubImage.

This needs to be provided to the driver unless a buffer-specialized
interface is made (then a 1D box is enough).
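
Schematically, the state tracker hands the driver a box and gets back
the mapping information; pipe_box is the existing Gallium struct, and
the result struct below is just for illustration (in the real
interface these fields live in pipe_transfer):

/* State tracker -> driver: which sub-region to map. */
struct pipe_box {
   unsigned x, y, z;
   unsigned width, height, depth;
};

/* Driver -> state tracker: the only genuinely new information. */
struct transfer_info {
   void *data;             /* pointer to the mapped data */
   unsigned stride;        /* bytes between rows */
   unsigned slice_stride;  /* bytes between 2D slices */
};

For example, glTexSubImage2D(..., xoffset, yoffset, width, height, ...)
would map just box = { xoffset, yoffset, 0, width, height, 1 }.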


> Thanks for the summary.  I'd add that there is also some information
> available publicly about the D3D10 DDI, which follows a slightly
> different interface to the API.  In that world, there is a single
> create resource function:

It is indeed extremely interesting, and it looks like it should be
the first place to look for inspiration for the Gallium interface.

I added a comparison of the D3D11 DDI and Gallium to src/gallium/docs.

> There is however clearly concern about the possible need for
> specialized transfer mechanisms for particular buffer types.  It seems
> like they've taken an approach that leaves the choice to the driver
> whether to specialize or not -- basically there are a number of
> specialized map/unmap entrypoints, but all with the same function
> prototype so that a driver could if it wanted to point them all to a
> single generic implementation, or if it preferred, provide specialized
> implementations for some of them.  There is some discussion of these
> choices in the page below:

Yes, indeed; I'm not sure why they do it, though, since the
user-visible API appears to be unified.

> I'm really keen to get gallium-resources merged - probably combined
> with the buffer_usage_cleanup branch.  I suspect there are some
> lingering bugs in -resources that are addressed by the cleanup branch.
>  Have you had a chance to do any testing of the changes I made on
> -resources or -cleanup?

I added a commit to fix nvfx and make nv50 compile on
gallium-resources; some of the remaining issues seem to be addressed
by buffer-usage-cleanup.
It may be useful to merge gallium-resources into -cleanup again so the
latest fixes there are picked up.
