On Mar 21, 2007, at 6:16 PM, Ian Hickson wrote:

> On Wed, 21 Mar 2007, Maciej Stachowiak wrote:
>> With the recent discussions about the <video> element, we've decided
>> to post our own proposal in this area. This proposal is a joint
>> effort from the Safari/WebKit team and some of Apple's top timed
>> media experts, who have experience with QuickTime and other media
>> technologies.
>
> Great!
>
>> http://webkit.org/specs/HTML_Timed_Media_Elements.html
>
> Looking at this in the context of the current spec:
> * The <audio>, "controller", playback rate, start and end times,
>   step(), and looping features were left out of the current version
>   of the spec in the interests of simplicity. I understand that Apple
>   wishes to implement a larger set of features in one go than the
>   spec currently describes; naturally I am interested in making sure
>   the spec covers your needs as well. My concern is that we make the
>   spec too complicated for other browser vendors to implement
>   interoperably in one go. Biting off more than we can chew is a
>   common mistake in Web specification development. Starting with
>   simple features, and adding features based on demand rather than
>   just checking off features for parity with other development
>   environments leads to a more streamlined API that is easier to use.
>   How should we approach this? I'd like to hear from other browser
>   vendors how they feel about this.
We think many of the new features are needed to be able to fully
replace plugin-based solutions. I think it would be reasonable to
agree on a "please implement this first" subset, but I'd like to hear
especially from Mozilla and Opera reps on this.
> Regarding specific features: what are the use cases for start/end and
> looping? People keep telling me they're important but the use cases
> I've seen either don't seem like they would work well with a
> declarative mechanism (being better implemented directly using cue
> marks and JS), or are things that you wouldn't do using HTML anyway
> (like a user wanting to bookmark into a video -- they're not going to
> be changing the markup themselves, so this doesn't solve their use
> case).
Looping is useful for more presentational uses of video. Start and
end time are useful in case you want to package a bunch of small bits
of video in one file and just play different segments, similar to the
way content authors sometimes have one big image and use different
subregions. Or consider looping audio, or a single audio file with
multiple sound effects. These are two examples.
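To make the "one file, many segments" idea concrete, here is a minimal sketch in plain JavaScript. The segment table and names are hypothetical, and the element attributes assumed (start/end, as in the Apple proposal) are modeled as plain data rather than DOM properties:

```javascript
// Toy model of an "audio sprite": one media file packaged as several
// named segments, where the page selects a segment instead of loading
// separate files. Times are in seconds; these entries are examples.
const segments = {
  click:   { start: 0.0, end: 0.4 },
  whoosh:  { start: 0.5, end: 1.6 },
  fanfare: { start: 2.0, end: 5.0 },
};

// Return the values an author would assign to the proposed attributes,
// e.g. media.start = b.start; media.end = b.end.
function segmentBounds(name) {
  const s = segments[name];
  if (!s) throw new Error("unknown segment: " + name);
  return { start: s.start, end: s.end, duration: s.end - s.start };
}
```

This mirrors the image-subregion pattern mentioned above: the asset is fetched once and different subranges are presented.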
> For <audio> in general, there's been very little demand for <audio>
> other than from people suggesting that it makes abstract logical
> sense to introduce <audio> and <video> at the same time. But there is
> clearly demand for something like this on the Web, e.g. internet
> radio, Amazon track sampling, etc. I'm not sure how similar the APIs
> should be.
I think <audio> can use almost the exact same APIs for most things as
<video>. This has the nice side benefit that new Audio() can just
make an <audio> element and provide all the relevant useful API.
> * I'm concerned about the "type" attribute for content negotiation.
[... snip ...] I'll respond in a separate reply about this and broader
codec issues.
> * The "mute" feature is IMHO better left at the UI level, with the
>   API only having a single volume attribute. This is because there
>   are multiple ways to implement muting, and it seems better to not
>   bias the API towards a particular method.
>
>   (I've seen three major muting interfaces: a mute button that sets a
>   temporary override which is independent of volume, a mute button
>   that simply sets the volume to zero, and a -20dB button that you
>   hit two or three times to get to 0.)
>
>   Having said that, without a mute API, the UA and the author UI
>   can't stay synchronised with respect to mute state.
As discussed on IRC, I think all three models can be implemented well
with a mute API, and I don't think the mute independent of volume can
be implemented quite right if multiple things can be controlling the
video and you don't have a mute API.
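A sketch of the claim, not from either proposal: all three muting UIs Ian lists can be expressed against a single {volume, muted} state, while only the first actually needs the separate mute flag. Function names are illustrative:

```javascript
// The effective output level, given a state with a distinct mute flag.
function effectiveVolume(state) {
  return state.muted ? 0 : state.volume;
}

// 1. Mute as a temporary override, independent of volume: toggling
//    mute preserves the volume setting for when mute is lifted.
function toggleMuteOverride(state) {
  return { volume: state.volume, muted: !state.muted };
}

// 2. Mute button that simply sets the volume to zero.
function muteByZeroingVolume(state) {
  return { volume: 0, muted: false };
}

// 3. A -20 dB button: each press scales amplitude by 10^(-20/20) = 0.1,
//    so two or three presses are effectively silence.
function stepDown20dB(state) {
  return { volume: state.volume * 0.1, muted: state.muted };
}
```

Without the muted flag, model 1 would have to stash the pre-mute volume somewhere private, which is exactly what goes wrong when several scripts (or the UA's own controls) manipulate the same element.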
> * What's the use case for hasAudio or hasVideo? Wouldn't the author
>   know ahead of time whether the content has audio or video?
That depends. If you are displaying one fixed piece of media, then
sure. If you are displaying general user-selectable content, then not
necessarily. You might want to hide or disable volume controls for a
video with no soundtrack for instance. Or you might want to show some
filler content for content with a video/* MIME type that does not in
fact have a video track (which is valid per the relevant RFCs - video
MIME types say video may be present, but do not promise it).
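For illustration only: how a custom controller for user-selected media might branch on these flags. The flag names follow the proposal; the UI decisions are hypothetical:

```javascript
// Decide controller layout from the proposed hasAudio/hasVideo flags,
// for media whose tracks aren't known until it loads.
function controllerLayout(media) {
  return {
    // Hide the volume slider for a video with no soundtrack.
    showVolumeControls: media.hasAudio,
    // Show filler art when a video/* resource has no video track.
    showFillerArt: !media.hasVideo,
  };
}
```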
> * The states in this proposal are orthogonal to the states in the
>   current spec; both look useful, though, and maybe we should have
>   both. Anybody have any opinions on this?
I'll have to read over both sets of states more closely.
Regarding your states: In our proposal, we don't distinguish stopped
and paused. A stop operation would just be "pause(); currentTime = 0;
currentLoop = 0;". "AUTOPAUSED" would be the condition where you
return to "PRESENTABLE" or "UNDERSTANDABLE" state from "PLAYABLE" or
"PLAYTHROUGHOK" when isPaused is false. "PLAYING" would be the case
where you are in "PLAYABLE", "PLAYTHROUGHOK" or "LOADED" state and
isPaused is false.
So at first glance, I think our proposed states plus the isPaused
boolean subsume yours, and are more immediately useful for a custom
controller UI.
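The subsumption argument above can be sketched as a pure function. This is my reading of both proposals, not code from either; the state names follow the text, and the exact membership of the "can play" set is an assumption:

```javascript
// Derive the spec's playback-oriented states from the proposal's
// readiness state plus the isPaused boolean.
function derivedPlaybackState(readiness, isPaused) {
  if (isPaused) return "PAUSED";
  const canPlay = ["PLAYABLE", "PLAYTHROUGHOK", "LOADED"];
  // Not paused by the author, yet not ready enough to play: the UA is
  // holding playback, i.e. the AUTOPAUSED condition described above.
  return canPlay.includes(readiness) ? "PLAYING" : "AUTOPAUSED";
}
```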
> * Time triggers, or cue marks, are a useful feature that has
>   currently been left in the v2 list; I've heard some demand for this
>   though and I would not be opposed to putting this in v1 if people
>   think we should.
I think it's pretty useful, since a lot of edge-case features (like
triggering a URL navigation at a particular time in the video) can be
handled by this.
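Not part of either proposal's API, but a minimal sketch of how cue-mark dispatch could work: fire each callback whose cue time falls inside a timeupdate interval, so an author could, say, trigger a URL navigation at a given point in the video:

```javascript
// cues: [{ time, fire }] sorted by time. lastTime tracks the previous
// update so each cue fires exactly once as playback passes it.
function makeCueDispatcher(cues) {
  let lastTime = 0;
  return function onTimeUpdate(currentTime) {
    for (const cue of cues) {
      if (cue.time > lastTime && cue.time <= currentTime) cue.fire();
    }
    lastTime = currentTime;
  };
}
```

In a page, the returned function would be wired to the element's periodic time-update event; here it is exercised directly with sample times.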
> * I have no objection to adding more events. Once we have a better
>   idea what should happen here I'll add the relevant events.
Sounds good.
Regards,
Maciej