On Sat, 28 Mar 2009 05:57:35 +0100, Benjamin M. Schwartz <bmsch...@fas.harvard.edu> wrote:

Dear What,

Short: <video> won't work on slow devices.  Help!

Long:
The <video> tag has great potential to be useful on low-powered computers
and computing devices, where current internet video streaming solutions
(such as Adobe's Flash) are too computationally expensive.  My personal
experience is with OLPC XO-1*, on which Flash (and Gnash) are terribly
slow for any purpose, but Theora+Vorbis playback is quite smooth at
reasonable resolutions and bitrates.

The <video> standard allows arbitrary manipulations of the video stream
within the HTML renderer.  To permit this, the initial implementations
(such as the one in Firefox 3.5) will perform all video decoding
operations on the CPU, including the tremendously expensive YUV->RGB
conversion and scaling.  This is viable only for moderate resolutions and
extremely fast processors.
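
To give a rough sense of why the conversion is costly in software, here is a minimal Python sketch of the per-pixel arithmetic. It assumes full-range BT.601 coefficients (as used in JFIF); a real decoder may use video-range values and fixed-point math instead. The function names and the 640x480 at 30 fps figures are illustrative assumptions, not measurements from any device.

```python
def clamp8(x: float) -> int:
    """Round and clamp a value to the 0..255 range of an 8-bit channel."""
    return max(0, min(255, int(round(x))))


def yuv_to_rgb(y: int, cb: int, cr: int) -> tuple[int, int, int]:
    """Convert one full-range BT.601 YCbCr pixel to RGB (assumed coefficients)."""
    r = y + 1.402 * (cr - 128)
    g = y - 0.344136 * (cb - 128) - 0.714136 * (cr - 128)
    b = y + 1.772 * (cb - 128)
    return clamp8(r), clamp8(g), clamp8(b)


def conversions_per_second(width: int, height: int, fps: int) -> int:
    """Number of per-pixel conversions a software path performs each second."""
    return width * height * fps


# Hypothetical stream: 640x480 at 30 fps, before any scaling work is counted.
print(conversions_per_second(640, 480, 30))
```

Each pixel costs several multiplications plus clamping, and scaling adds further work on top, which is the load that overlay hardware takes off the CPU.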

Recognizing this, the Firefox developers expect that the decoding process
will eventually be accelerated. However, an accelerated implementation of
the <video> spec inevitably requires a 3D GPU, in order to permit
transparent video, blended overlays, and arbitrary rotations.

Pure software playback of video looks like a slideshow on the XO, or any
device with similar CPU power, achieving 1 or 2 fps.  However, these
devices typically have a 2D graphics chip that provides "video overlay"
acceleration: 1-bit alpha, YUV->RGB, and simple scaling, all in
special-purpose hardware.**  Using the overlay (via XVideo on Linux)
allows smooth, full-speed playback.

THE QUESTION:
What is the recommended way to handle the <video> tag on such hardware?

There are three obvious solutions:
0. Implement the spec, and just let it be really slow.
1. Attempt to approximate the correct behavior, given the limitations of
the hardware.  Make the video appear where it's supposed to appear, and
use the 1-bit alpha (dithered?) to blend static items over it.  Ignore
transparency of the video.  Ignore rotations, etc.
2. Ignore the HTML context.  Show the video "in manners more suitable to
the user (e.g. full-screen or in an independent resizable window)".

Which is preferable?  Is it worth specifying a preferred behavior?

In the typical case, a simple, correctly positioned hardware overlay could be used, but a software fallback will always be needed when rotation, filters, etc. are used. As Robert O'Callahan said, a user agent would need to detect when hardware acceleration is safe and use it only then.
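
As a rough illustration of that detection step, here is a minimal Python sketch. Everything in it is hypothetical: the VideoStyle fields stand in for whatever a layout engine actually computes, and the rule itself is an assumption drawn from the overlay limits described above (1-bit alpha, YUV->RGB, simple scaling), not any browser's real logic.

```python
from dataclasses import dataclass


@dataclass
class VideoStyle:
    """Hypothetical summary of how the page presents a <video> element."""
    opacity: float = 1.0                 # 1.0 means fully opaque video
    rotation_deg: float = 0.0            # rotation applied by the page
    has_filter: bool = False             # any filter effect applied to the video
    translucent_content_on_top: bool = False  # needs more than 1-bit alpha


def can_use_overlay(style: VideoStyle) -> bool:
    """Return True only when a simple positioned overlay renders correctly."""
    return (
        style.opacity >= 1.0
        and style.rotation_deg % 360 == 0
        and not style.has_filter
        and not style.translucent_content_on_top
    )


def choose_path(style: VideoStyle) -> str:
    """Pick the hardware overlay when safe, else fall back to software."""
    return "overlay" if can_use_overlay(style) else "software"


print(choose_path(VideoStyle()))                 # plain, unrotated video
print(choose_path(VideoStyle(rotation_deg=90)))  # rotated video
```

A check like this would have to be re-run whenever the page's styling changes, since a script can add a rotation or a filter while the video is playing.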

If anything in the spec could be changed to make this a bit easier for user agents, it might be an overlay attribute like the one SVG has: http://www.w3.org/TR/SVGTiny12/multimedia.html#compositingBehaviorAttribute

I'm not convinced such an attribute would help; I'm just pointing it out here.

--
Philip Jägenstedt
Opera Software
