[whatwg] Cue points in media elements

Brian Campbell Sun, 29 Apr 2007 00:14:33 -0700

I'm a developer of a custom engine for interactive multimedia, and I've recently noticed the work WHATWG has been doing on adding <video> and <audio> elements to HTML. I'm very glad to see these being proposed for addition to HTML, because if they (and several other features) are done right, it means that there may be a chance for us to stop using a custom engine, and use an off-the-shelf HTML engine, putting our development focus on our authoring tools instead. My hope is that eventually, if these features get enough penetration, we can put our content up on the web directly, rather than having to distribute the runtime software with it.

I've taken a look at the current specification for media elements, and on the whole, it looks like it would meet our needs. We are currently using VP3, and a combination of MP3 and Vorbis audio, for our codecs, so having Ogg Theora (based on VP3) and Ogg Vorbis as a baseline would be completely fine with us, and much preferable to the patent issues and licensing fees we'd need to deal with if we used MPEG4.

For the sort of content that we produce, cue points are incredibly important. Most of our content consists of a video or voiceover playing while bullet points appear, animations play, and graphics are revealed, all in sync with the video. We have a very simple system for doing cue points, that is extremely easy for the content authors to write and is robust for paused media, media that is skipped to the end, etc. We simply have a blocking call, WAIT, that waits until a specific point or the end of a specified media element. For instance, in our language, you might see something like this:


  (movie "Foo.mov" :name 'movie)
  (wait @movie (tc 2 3))
  (show @bullet-1)
  (wait @movie)
  (show @bullet-2)

If the user skips to the end of the media clip, that simply causes all WAITs on that media clip to return instantly. If they skip forward in the media clip, without ending it, all WAITs before that point will return instantly. If the user pauses the media clip, all WAITs on the media clip will block until it is playing again.
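
To make those rules concrete, here is a rough sketch of them in TypeScript. It is only an illustration, not our engine's implementation: the names `ClipState` and `waitSatisfied` are made up for this example, and I'm assuming times are plain seconds.

```typescript
// Hypothetical model of a media clip's playback state (illustrative only).
interface ClipState {
  position: number; // current playback position, in seconds
  ended: boolean;   // true once the clip has played or been skipped to its end
}

// Returns true if a WAIT on this clip should return now.
// `target` is the cue time in seconds; omit it to wait for the end of the clip.
function waitSatisfied(clip: ClipState, target?: number): boolean {
  if (clip.ended) return true;            // skipping to the end releases every WAIT
  if (target === undefined) return false; // waiting for the end, and not there yet
  return clip.position >= target;         // released once playback has passed the cue
}
```

Pausing needs no special case in this model: while the clip is paused its position does not advance, so a pending WAIT simply stays unsatisfied until playback resumes.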

This is a nice system, but I can't see how even as simple a system as this could be implemented given the current specification of cue points. The problem is that the callbacks execute "when the current playback position of a media element reaches" the cue point. It seems unclear to me what "reaching" a particular time means. If video playback freezes for a second, and so misses a cue point, is that considered to have been "reached"? Is there any way that you can guarantee that a cue point will be executed as long as video has passed a particular cue point? With a lot of bookkeeping and the "timeupdate" event along with the cue points, you may be able to keep track of the current time in the movie well enough to deal with the user skipping forward, pausing, and the video stalling and restarting due to running out of buffer. This doesn't address, as far as I can tell, issues like the thread displaying the video pausing for whatever reason and so skipping forward after it resumes, which may cause cue points to be lost, and which isn't specified to send a "timeupdate" event.

Basically, what is necessary is a way to specify that a cue point should always be fired as long as playback has passed a certain time, not just if it "reaches" a particular time. This would prevent us from having to do a lot of bookkeeping to make sure that cue points haven't been missed, and make everything simpler and less fragile.
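
As a sketch of the behavior I mean, here is a small TypeScript model of "fire once playback has passed the cue". It is not the spec's cue point API and not a proposal for exact names: `PassedCueTracker`, `addCue` and `update` are hypothetical, and I'm assuming something reports the latest playback position to `update`, however that position was reached.

```typescript
// Hypothetical cue tracker: fires each cue once playback has passed its time,
// even if the position jumped over the exact cue time.
type CueCallback = () => void;

interface Cue {
  time: number; // cue time, in seconds
  callback: CueCallback;
  fired: boolean;
}

class PassedCueTracker {
  private cues: Cue[] = [];

  addCue(time: number, callback: CueCallback): void {
    this.cues.push({ time: time, callback: callback, fired: false });
    // Keep cues sorted by time so that they fire in time order.
    this.cues.sort((a, b) => a.time - b.time);
  }

  // Call with the latest known playback position, whether it was reached by
  // normal playback, a seek forward, or a stall followed by a jump.
  update(position: number): void {
    for (const cue of this.cues) {
      if (cue.time > position) break; // later cues have not been passed yet
      if (!cue.fired) {
        cue.fired = true;             // each cue fires at most once
        cue.callback();
      }
    }
  }
}
```

With cues at 2 and 5 seconds, a position report that jumps straight from 1 second to 6 seconds fires both cues, in order, and later reports do not fire them again. This sketch deliberately leaves out seeking backward; whether a cue should fire again after that is a separate question.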

We're also greatly interested in making our content accessible, to meet Section 508 requirements. For now, we are focusing on captioning for the deaf. We have voiceovers on some screens with no associated video, video that appears in various places on the screen, and the occasional sound effects. Because there is not a consistent video location, nor is there even a frame for voiceovers to appear in, we don't display the captions directly over the video, but instead send events to the current screen, which is responsible for catching the events and displaying them in a location appropriate for that screen, usually a standard location. In the current spec, all that is provided for is controls to turn closed captions on or off. What would be much better is a way to enable the video element to send caption events, which include the text of the current caption, and can be used to display those captions in a way that fits the design of the content better.
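
Roughly, the shape I have in mind looks like the following TypeScript sketch. Nothing here exists in the current spec: `CaptionEvent`, `CaptionSource` and `onCaption` are hypothetical names, and I'm assuming the caption list and the playback position are supplied from elsewhere.

```typescript
// Hypothetical caption event: carries the caption text instead of rendering it.
interface CaptionEvent {
  text: string;  // text of the caption
  start: number; // time the caption becomes active, in seconds
  end: number;   // time the caption stops being active, in seconds
}

// Listeners receive the active caption, or null when no caption is active.
type CaptionListener = (caption: CaptionEvent | null) => void;

class CaptionSource {
  private captions: CaptionEvent[];
  private listeners: CaptionListener[] = [];
  private current: CaptionEvent | null = null;

  constructor(captions: CaptionEvent[]) {
    this.captions = captions;
  }

  onCaption(listener: CaptionListener): void {
    this.listeners.push(listener);
  }

  // Call with the current playback position; listeners are notified only
  // when the active caption changes.
  update(position: number): void {
    let active: CaptionEvent | null = null;
    for (const caption of this.captions) {
      if (position >= caption.start && position < caption.end) {
        active = caption;
        break;
      }
    }
    if (active === this.current) return; // no change, nothing to send
    this.current = active;
    for (const listener of this.listeners) {
      listener(active);
    }
  }
}
```

The screen would register a listener and put the text wherever suits its layout, which is what our current engine does with its own caption events.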

I hope these comments make sense; let me know if you have any questions or suggestions.


Thanks,
Brian Campbell
Interactive Media Lab, Dartmouth College
http://iml.dartmouth.edu
