On Wed, Feb 27, 2013 at 6:14 PM, John Stultz <john.stu...@linaro.org> wrote:
> Also note: I've done this so far without any feedback from the Android devs
> (despite my reaching out to Erik a few times recently), so if they object to
> pushing it to staging, in deference to it being their code I'll back off,
> even though I do think it would be good to have the code get more visibility
> upstream in staging. I don't mean to step on anyone's toes. :)

Yeah, sorry about that.  I kept meaning to get back to you but kept
getting distracted.  A little background on the patches:

In Honeycomb we introduced the Hardware Composer HAL.  This is a
userspace layer that allows composition acceleration on a per platform
basis.  Different SoC vendors have implemented this using overlays, 2d
blitters, a combination of both, or other clever/disgusting means.
Along with the HWC we consolidated a lot of our camera and media
pipeline to allow their input to be fed into the GPU or
display (overlay).  In order to exploit parallelism in the graphics
pipeline, this introduced lots of implicit synchronization
dependencies.  After a couple years of working with many different SoC
vendors, we found that it was really difficult to communicate our
system's expectations of the implicit contract and it was difficult
for the SoC vendors to properly implement the implicit contract in
each of their IP blocks (display, gpu, camera, video codecs).  It was
also incredibly difficult to debug when problems/deadlocks arose.

In an effort to clean up the situation we decided to create a set of
simple synchronization primitives and have our compositor
(SurfaceFlinger) manage the synchronization contract explicitly.  We
designed these primitives so that they can be passed across processes
(much like ion/dma_buf handles), can be backed by hardware
synchronization primitives, and can be combined with other sync
dependencies in a heterogeneous manner.  We also added enough
debugging information to make pinpointing a synchronization deadlock
bug easier.  There are also OpenGL extensions added (which I believe
have been ratified by Khronos) to convert a "native" sync object to a
gl fence object and vice versa.
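
To make the "combined in a heterogeneous manner" part concrete, here is
a rough userspace sketch of merging two sets of sync points that come
from different timelines.  All names here (demo_pt, demo_fence,
demo_fence_merge) are made up for illustration and are not the
structures in the patches.  Keeping only the later point when both
inputs reference the same timeline is an assumption of this sketch, and
counter wraparound is ignored.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_PTS 8

/* Hypothetical types for illustration; not the kernel's structures. */
struct demo_pt {
    int timeline_id;      /* which device/timeline this point belongs to */
    unsigned int value;   /* counter value at which the point signals */
};

struct demo_fence {
    struct demo_pt pts[DEMO_MAX_PTS];
    size_t count;
};

/*
 * Add one point to a fence.  If the fence already holds a point on the
 * same timeline, keep whichever one signals later (assumption of this
 * sketch; wraparound is not handled).  Returns 0 on success, -1 if full.
 */
int demo_fence_add(struct demo_fence *f, struct demo_pt pt)
{
    for (size_t i = 0; i < f->count; i++) {
        if (f->pts[i].timeline_id == pt.timeline_id) {
            if (pt.value > f->pts[i].value)
                f->pts[i].value = pt.value;
            return 0;
        }
    }
    if (f->count == DEMO_MAX_PTS)
        return -1;
    f->pts[f->count++] = pt;
    return 0;
}

/*
 * Build "out" as the combination of "a" and "b".  "out" must not alias
 * either input.  Returns 0 on success, -1 if "out" runs out of room.
 */
int demo_fence_merge(struct demo_fence *out,
                     const struct demo_fence *a,
                     const struct demo_fence *b)
{
    assert(out != a && out != b);
    out->count = 0;
    for (size_t i = 0; i < a->count; i++)
        if (demo_fence_add(out, a->pts[i]))
            return -1;
    for (size_t i = 0; i < b->count; i++)
        if (demo_fence_add(out, b->pts[i]))
            return -1;
    return 0;
}
```

The merged result depends on every timeline that either input depended
on, so a consumer only has to wait on one object regardless of which
IP blocks produced the original points.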

So far we have shipped this system on two products (the Nexus 10 and 4)
with two different SoCs (Samsung Exynos5250 and Qualcomm MSM8064).  On
these two projects it was much easier to work out the kinks in the
graphics/compositing pipelines.  In addition we were able to use the
telemetry and tracing features to track down the causes of dropped
frames aka "jank."

As for the implementation, I started with having the main driver op
primitive be a wait() op.  I quickly noticed that most of the tricky,
race-condition-prone code was ending up in the driver's wait() op.  It
also made handling asynchronous waits of more than one type of sync_pt
difficult to manage.  In the end I opted for something roughly like
poll() where all the heavy lifting is done at the high level and the
drivers only need to implement a simple check function.
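
As a rough illustration of that split, here is a userspace toy in C
where the driver side supplies nothing but a check callback and the
generic side walks a mixed set of sync points.  The names
(toy_timeline, toy_sync_pt, toy_count_pending and so on) are
hypothetical and chosen for this sketch; they are not the interfaces in
the patches.  A real implementation would also need locking,
wakeups and wraparound handling, all of which are left out here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names throughout; a userspace toy, not the kernel API. */
struct toy_timeline;

struct toy_timeline_ops {
    /* The only thing a driver supplies: has this point been reached? */
    bool (*has_signaled)(const struct toy_timeline *tl, unsigned int target);
};

struct toy_timeline {
    const struct toy_timeline_ops *ops;
    unsigned int value;   /* driver's progress counter */
};

struct toy_sync_pt {
    struct toy_timeline *tl;
    unsigned int target;  /* counter value this point waits for */
};

/*
 * Generic layer: count how many points are still pending.  It works on
 * any mix of timelines because it only ever calls the check callback.
 */
size_t toy_count_pending(const struct toy_sync_pt *pts, size_t n)
{
    size_t pending = 0;

    assert(pts != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (!pts[i].tl->ops->has_signaled(pts[i].tl, pts[i].target))
            pending++;
    return pending;
}

/* Example "driver": a plain counter (wraparound ignored in this toy). */
bool toy_counter_has_signaled(const struct toy_timeline *tl,
                              unsigned int target)
{
    return tl->value >= target;
}

const struct toy_timeline_ops toy_counter_ops = {
    .has_signaled = toy_counter_has_signaled,
};
```

The generic layer would re-run the check whenever a driver reports
progress and wake waiters once the pending count reaches zero, which
keeps the race-prone waiting logic in one place instead of in every
driver.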

Happy to hear feedback and (especially) bug reports/fixes.

Cheers,
    Erik