On 07.01.2013 17:03, Marek Olšák wrote:
On Mon, Jan 7, 2013 at 3:45 PM, Christian König <deathsim...@vodafone.de> wrote:
On 07.01.2013 01:24, Marek Olšák wrote:
On Sun, Jan 6, 2013 at 11:58 PM, Jerome Glisse <j.gli...@gmail.com> wrote:
On Sun, Jan 6, 2013 at 4:00 PM, Marek Olšák <mar...@gmail.com> wrote:
I agree with Christian. You can use a separate instance of
radeon_winsys_cs for the DMA CS. The winsys exposes all the functions
you need (except one) for you to coordinate work between 2 command
streams in the pipe driver. You may only need to expose one additional
winsys function to the driver for synchronization, it's called
"radeon_drm_cs_sync_flush". I'm confident that this can be implemented
and layered on top of the winsys, presumably with fewer lines of code
and cleaner.
The relocation add function need to access both the dma ring and the
cs ring no matter on which ring the relocation is added. Doing the
sync in the pipe driver would increase the code, each call site of
add_reloc would need to check if the bo is referenced by the other
ring and flush the other ring if so. Which also means that there is a
higher likelyhood that someone adding an add reloc forget about the
flushing.
Well, in that case, you can define a new set of functions in the pipe
driver, which are layered on top of radeon_winsys_cs and the existing
interface radeon_winsys::cs_*.

If you want to be super clean, you can add a new module that defines
this command stream pair:

struct r600_cs_with_dma {
     struct radeon_winsys_cs *cs_main, *cs_dma;
};

And define a set of functions which work with that, reimplementing all
the cs_* functions by calling the existing functions of radeon_winsys.
The pipe driver would then use the new CS functions everywhere instead
of radeon_winsys.

To me, the best design decision here is not to try to *hack* the
existing winsys code to make it do what you want without giving it
another thought. Adding another layer is preferable, because it keeps
both parts simple and separated.

Well thinking about it more and more I don't think add_reloc is the right
place to do the sync anyway.

Imagine a loop that wants to handle a bunch of buffers, first they are zero
cleared and then rendered to. Those buffers are unique, so we can zero clear
them all at once. In an ideal world they should all end up in the same DMA
command stream.

Now comes a buffer that is first rendered to and then copied around (for
example), in this moment the DMA command stream needs to be flushed, cause
now a new DMA command stream starts that actually needs to run after the
rendering command stream.

So instead of flushing when we see that a buffer gets added to a command
stream we need to remember in which oder the command stream needs to get
submitted and only flush when this order is going to change.
I agree with all your points. add_reloc is a bad place for
synchronization for yet another reason: you don't really know anything
about what the driver is trying to do and what commands and
relocations are likely to come next, as opposed to e.g. a write
transfer where you are 100% sure that:
- the source resource isn't referenced by a CS nor is it busy
- the destination resource will likely be used pretty soon as a source
for rendering

In addition to that, I believe that using the async DMA is useless for
anything but write-only transfers with a staging resource. Every other
case is always synchronous with rendering and therefore defeats the
purpose of the *async* DMA. I therefore propose:

1) Let's not use the async DMA in resource_copy_region *at all*.

2) Let's replace all the resource_copy_region and copy_buffer
occurencies in transfer_unmap with async DMA copies. The function
implementing the copy should do all the necessary synchronization
between command streams by itself.

Yeah, that pretty much matches with what I had in mind.

Additional to that we should consider removing the duplicate check from add_reloc. It only makes sense for the GFX command stream, only limited sense for compute and UVD command streams, and really hurts us for DMA command streams.

Feel free to submit code for this. I would volunteer to code it myself, but unfortunately don't have time for it at the moment.

Christian.


Marek


_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Reply via email to