Re: [RFC] Host1x/TegraDRM UAPI (sync points)

Mikko Perttunen Tue, 30 Jun 2020 03:26:59 -0700

On 6/29/20 10:42 PM, Dmitry Osipenko wrote:


> Secondly, I suppose neither GPU, nor DLA could wait on a host1x sync
> point, correct? Or are they integrated with Host1x HW?

They can access syncpoints directly. (That's what I alluded to in the
"Introduction to the hardware" section :) all those things have hardware
access to syncpoints)


>
> .. rest ..
>

Let me try to summarize once more for my own understanding:

* When submitting a job, you would allocate new syncpoints for the job
* After submitting the job, those syncpoints are not usable anymore
* Postfences of that job would keep references to those syncpoints so
  they aren't freed and cleared before the fences have been released
* Once postfences have been released, syncpoints would be returned to
  the pool and reset to zero

The advantage of this would be that at any point in time, there would be
a 1:1 correspondence between allocated syncpoints and jobs; so you could
shuffle the jobs around channels or reorder them.
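
To make the lifecycle summarized above concrete, here is a minimal
userspace sketch of a refcounted syncpoint pool. It is not Host1x driver
code: the names (struct syncpt, syncpt_alloc, syncpt_get, syncpt_put,
pool_free_count) and the pool size are hypothetical, and the job and its
postfences are modeled simply as holders of references.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define POOL_SIZE 4 /* hypothetical; chosen only to keep the sketch small */

struct syncpt {
	uint32_t value; /* current counter value */
	unsigned refs;  /* 0 means the syncpoint is free in the pool */
};

static struct syncpt pool[POOL_SIZE];

/* Allocate a fresh syncpoint for one job; NULL if the pool is exhausted. */
static struct syncpt *syncpt_alloc(void)
{
	for (size_t i = 0; i < POOL_SIZE; i++) {
		if (pool[i].refs == 0) {
			pool[i].refs = 1; /* reference held by the job */
			return &pool[i];
		}
	}
	return NULL;
}

/* A postfence takes its own reference on the job's syncpoint. */
static void syncpt_get(struct syncpt *sp)
{
	assert(sp->refs > 0);
	sp->refs++;
}

/* Drop a reference; the last one resets the value and frees the slot. */
static void syncpt_put(struct syncpt *sp)
{
	assert(sp->refs > 0);
	if (--sp->refs == 0)
		sp->value = 0;
}

/* Number of syncpoints currently available in the pool. */
static size_t pool_free_count(void)
{
	size_t n = 0;

	for (size_t i = 0; i < POOL_SIZE; i++)
		n += (pool[i].refs == 0);
	return n;
}
```

In this model the job drops its reference when it completes, but the
syncpoint only goes back to the pool (and back to zero) once the last
postfence has dropped its reference too.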


Please correct me if I got that wrong :)

---

I have two concerns:

* A lot of churn on syncpoints - any time you submit a job you might not
  get a syncpoint for an indefinite time. If we allocate syncpoints
  up-front at least you know beforehand, and then you have the syncpoint
  as long as you need it.
* Plumbing the dma-fence/sync_file everywhere, and keeping it alive
  until waits on it have completed, is more work than just having the
  ID/threshold. This is probably mainly a problem for downstream, where
  updating code for this would be difficult. I know that's not a proper
  argument but I hope we can reach something that works for both worlds.
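
For comparison, the "ID/threshold" form of a fence mentioned in the
second concern can be sketched as a plain pair of integers plus a
wraparound-safe comparison. This is an illustrative sketch only: struct
syncpt_fence and syncpt_fence_expired are hypothetical names, and the
array of current values stands in for reading the hardware counters.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* A fence that is nothing more than a syncpoint ID and a threshold. */
struct syncpt_fence {
	uint32_t id;
	uint32_t threshold;
};

/*
 * The fence has expired once the syncpoint's counter has reached the
 * threshold. Syncpoint counters are 32-bit and wrap, so the difference
 * is interpreted as a signed value instead of comparing directly.
 * (The unsigned-to-signed conversion is implementation-defined in C,
 * but behaves as two's complement on common compilers.)
 */
static bool syncpt_fence_expired(const uint32_t *values,
				 struct syncpt_fence f)
{
	return (int32_t)(values[f.id] - f.threshold) >= 0;
}
```

Nothing has to be kept alive for such a wait beyond the two integers,
which is the convenience being weighed against dma-fence/sync_file
plumbing.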


Here's a proposal in between:

* Keep syncpoint allocation and submission in jobs as in my original
  proposal

* Don't attempt to recover user channel contexts. What this means:

  * If we have a hardware channel per context (MLOCKing), just tear
    down the channel
  * Otherwise, we can just remove (either by patching or by full
    teardown/resubmit of the channel) all jobs submitted by the user
    channel context that submitted the hanging job. Jobs of other
    contexts would be undisturbed (though potentially delayed, which
    could be taken into account and timeouts adjusted)
  * If this happens, we can set removed jobs' post-fences to error
    status and the user will have to resubmit them.

* We should be able to keep the syncpoint refcounting based on fences.

This can be made more fine-grained by not caring about the user channel
context, but tearing down all jobs with the same syncpoint. I think the
result would be that we can get either what you described (or how I
understood it in the summary in the beginning of the message), or a more
traditional syncpoint-per-userctx workflow, depending on how the
userspace decides to allocate syncpoints.
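
The two teardown granularities can be sketched as follows. This is a
simplified userspace model, not driver code: struct job,
cancel_after_hang, the match modes and FENCE_ERR_CANCELED are all
hypothetical names, and the post-fence is reduced to an error field. In
the kernel, setting the error status would presumably go through
something like dma_fence_set_error().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define FENCE_ERR_CANCELED (-1) /* hypothetical error status */

enum job_state { JOB_QUEUED, JOB_REMOVED };

struct job {
	int ctx_id;     /* user channel context that submitted the job */
	int syncpt_id;  /* syncpoint the job increments */
	enum job_state state;
	int fence_error; /* 0 = no error on the post-fence */
};

enum match_mode {
	MATCH_CONTEXT, /* remove all jobs of the hanging job's context */
	MATCH_SYNCPT,  /* remove only jobs sharing its syncpoint */
};

/*
 * Remove every still-queued job that matches the hung job under the
 * chosen mode and flag its post-fence with an error, so that userspace
 * knows it has to resubmit. Returns the number of jobs removed.
 */
static size_t cancel_after_hang(struct job *jobs, size_t n,
				const struct job *hung,
				enum match_mode mode)
{
	/* Copy the keys first; 'hung' may point into 'jobs'. */
	const int ctx = hung->ctx_id;
	const int sp = hung->syncpt_id;
	size_t removed = 0;

	for (size_t i = 0; i < n; i++) {
		bool match = (mode == MATCH_CONTEXT)
				? jobs[i].ctx_id == ctx
				: jobs[i].syncpt_id == sp;

		if (!match || jobs[i].state == JOB_REMOVED)
			continue;
		jobs[i].state = JOB_REMOVED;
		jobs[i].fence_error = FENCE_ERR_CANCELED;
		removed++;
	}
	return removed;
}
```

If userspace allocates one syncpoint per job, MATCH_SYNCPT removes only
the hanging job; if it uses one syncpoint per context, both modes remove
the same set of jobs.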

If needed, the kernel can still do e.g. reordering (you mentioned job
priorities) at syncpoint granularity, which, if the userspace followed
the model you described, would be the same thing as job granularity.

(Maybe it would be more difficult with current drm_scheduler, sorry,
haven't had the time yet to read up on that. Dealing with clearing work
stuff up before summer vacation)


Mikko
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel
