Re: [whatwg] cue points in media elements
On Wed, 24 Oct 2007, Dave Singer wrote:
> Caution: cross-posted to whatwg and htmlwg; be careful with follow-ups!

Actually, please don't cross-post new threads to both groups. As mentioned earlier this week, I only cross-post when the messages I'm replying to were sent to both groups, as a convenience to both groups so they can see what progress is being made on issues that were discussed there; as a general rule it's better to not cross-post. Thanks!

> We've been looking into both semantic and implementation considerations of cue points. We wonder whether cue ranges might not make more sense.

Done. I also changed the way that cue points (er, ranges) are removed, which I think will make it easier to handle swapping in sets of subtitles or the like. Comments welcome.

--
Ian Hickson
http://ln.hixie.ch/
Things that are impossible just take longer.
[whatwg] cue points in media elements
Caution: cross-posted to whatwg and htmlwg; be careful with follow-ups!

* * * * *

We've been looking into both semantic and implementation considerations of cue points. We wonder whether cue ranges might not make more sense.

Cues might often be used to establish appropriate parallel state. For example, cues could be used to show 'chapter names', or to provide commentary in an HTML pane on the display. Under these circumstances, the question arises as to what the right behavior is when seeking. Should any of the cue-points preceding the seek point be activated (in order to establish the right context), and if so, how many? Should any of the cue-points after the previous play point be activated to tear-down any state at that point?

There is also an implementation question. What should happen if cue-points are more dense than the playback software can process in real-time? In video, this would cause catch-up techniques (e.g. frame-dropping). But dropping cue-points is problematic. If it's permitted, any cue-points that depend on previous ones having also fired (when playing linearly) cannot assume that they have, in fact, fired. They have to re-establish state without any regard for context, which may complicate them. (Though it's true that to an extent they have to do this anyway, if seeking can happen). Worse, if the event to set a parallel state (e.g. a parental warning on a blue passage) is executed, and the event to remove it is not, the resulting display may be misleading or semantically incorrect or inconsistent.

These questions seem to resolve much better with cue ranges. For a cue range, events are executed on either both entry and exit, or neither, much in the way that mouse events are generated for cursor movement, giving either both mouseEnter and mouseExit or neither. Similarly, fast mouse movements might tunnel right across a region with neither an entry nor exit event.
Formally, the logical definition of a cue range event would be that the time is periodically sampled (as densely as possible). At each sampling instant, a cue event is dispatched:

* for every range for which the previous sampling instant was in that range, and the current sampling instant is not;
* for every range for which the previous sampling instant was not in that range, and the current sampling instant is.

Note that

* this formal definition is amenable to optimization, by looking ahead to the 'next interesting time' when a cue range starts or ends, when playing.
* for any range for which you get an entry, you are assured you will get an exit eventually.
* you are not guaranteed to get the events *at* their defined times; they might be 'late', though the system should be implemented in such a way as to minimize lateness.
* short ranges might experience no sampling instant within them, and might be skipped, posting no events, though this also should be avoided if possible by implementations.
* on a seek, you will get exit events for the time seeked from, if appropriate, as well as entry events for the time seeked to.

I would suggest that the cue-range interval includes its start time but excludes its end-time. Therefore seeking to the exact start time of a cue-range, even before playback is started, fires its entry event (if we were previously outside the range), whereas seeking to the end-time of a range, even before playback started, fires its exit event (if we were previously inside it). If reverse playback is started after such seeks, then you immediately get another event (exit or entry), but I think that's OK as reverse playback is unusual. I guess the algorithm could be sensitive to the sign of the default playback rate, but that seems excessively complicated, and also raises questions of what happens if the sign is changed while paused.
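The sampling definition above can be sketched in a few lines of Python. This is only an illustration of the rule as worded in this message, not spec text or any browser's implementation: the names `CueRange` and `cue_events` are made up for the example, the range names are arbitrary, and the choice to report exits before entries within one step is an assumption the message does not address.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CueRange:
    """A half-open interval [start, end) on the media timeline (hypothetical type)."""
    name: str
    start: float
    end: float

    def contains(self, t: float) -> bool:
        # Start time is included, end time is excluded, as suggested above.
        return self.start <= t < self.end


def cue_events(ranges, previous, current):
    """Return the (kind, name) events for one step between two sampling instants.

    Listing exits before entries is a choice made for this sketch only.
    """
    exits = [("exit", r.name) for r in ranges
             if r.contains(previous) and not r.contains(current)]
    enters = [("enter", r.name) for r in ranges
              if not r.contains(previous) and r.contains(current)]
    return exits + enters


ranges = [CueRange("chapter-1", 0.0, 10.0), CueRange("warning", 4.0, 4.1)]

# Two samples on either side of the short "warning" range: it is tunneled over.
print(cue_events(ranges, 3.9, 4.2))
# Seeking to the exact start of "warning" from outside both ranges.
print(cue_events(ranges, 12.0, 4.0))
```

Because each step compares only the previous and current instants, a range can never report an entry without later reporting an exit, and a seek is handled by the same comparison as ordinary playback. Under this literal half-open test, a range whose start equals its end contains no instant at all, so it would need separate handling.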
If a cue-range end time is the same as its start time, one merely gets two events (enter and exit) dispatched at the same time, or nothing at all (if it gets 'tunneled over').

Does this ease both the semantic and implementation considerations?

--
David Singer
Apple/QuickTime
Re: [whatwg] Cue points in media elements
On Sun, 29 Apr 2007, Brian Campbell wrote:
> The problem is that the callbacks execute when the current playback position of a media element reaches the cue point. It seems unclear to me what reaching a particular time means. If video playback freezes for a second, and so misses a cue point, is that considered to have been reached? Is there any way that you can guarantee that a cue point will be executed as long as video has passed a particular cue point? With a lot of bookkeeping and the timeupdate event along with the cue points, you may be able to keep track of the current time in the movie well enough to deal with the user skipping forward, pausing, and the video stalling and restarting due to running out of buffer. This doesn't address, as far as I can tell, issues like the thread displaying the video pausing for whatever reason and so skipping forward after it resumes, which may cause cue points to be lost, and which isn't specified to send a timeupdate event.

I've defined what reaching a particular time means. I have explicitly made it invoke the times that might get skipped due to missing frames during normal playback. I have also made it _not_ fire the callbacks for times in between the old and new positions when seeking.

> Basically, what is necessary is a way to specify that a cue point should always be fired as long as playback has passed a certain time, not just if it reaches a particular time. This would prevent us from having to do a lot of bookkeeping to make sure that cue points haven't been missed, and make everything simpler and less fragile.

You can use the timeupdate event for this -- it fires whenever a cue point is hit, and whenever the timeline is seeked (even implicitly by the looping algorithm).

> For now, we are focusing on captioning for the deaf. We have voiceovers on some screens with no associated video, video that appears in various places on the screen, and the occasional sound effects. Because there is not a consistent video location, nor is there even a frame for voiceovers to appear in, we don't display the captions directly over the video, but instead send events to the current screen, which is responsible for catching the events and displaying them in a location appropriate for that screen, usually a standard location.
>
> In the current spec, all that is provided for is controls to turn closed captions on or off. What would be much better is a way to enable the video element to send caption events, which include the text of the current caption, and can be used to display those captions in a way that fits the design of the content better.

I've added this to the list for version 2 features. I'm interested in seeing what the requirements are for captions before we go ahead and spec them in too much detail. Implementation feedback will be helpful here.

Thanks for your feedback!

On Mon, 30 Apr 2007, Ralph Giles wrote:
> I'd be more in favor of triggering any cue point callbacks that lie between the current playback position and the current playback position of the next frame (audio frame for audio/ and video frame for video/ I guess). That means more bookkeeping to implement your system, but is less surprising in other cases.

Could you elaborate on this? Right now the system triggers cue points up to the current displayed frame, and some cue points between the current frame and the next frame, if the gap between the frames is long enough that the time updates more often than the framerate.

> As I read it, cue points are relative to the current playback position, which does not advance if the stream buffer underruns, but it would if playback restarts after a gap, as might happen if the connection drops, or in an RTP stream. My proposal above would need to be amended to handle that case, and the decoder dropping frames...finding the right language here is hard.

Does the new text work for this?

> A more abstract interface is necessary than just 'caption events'. Here are some use cases worth considering:
>
> * A media file has embedded textual metadata like title, author, copyright license, that the designer would like to access for associated display elsewhere in the page, or to alter the displayed user interface based on the metadata. This is pretty essential for parity with flash-based internet radio players.
> * The designer wants to access closed captioned or subtitle text through the DOM as it becomes available for display elsewhere in the page.
> * There are points in the media file where the embedded metadata changes. These points cannot be retrieved without scanning the file, which is expensive over the network, and may not be possible in general if the stream is a live feed. Nevertheless, the designer wants to be notified when the associated metadata changes so other elements can be updated. This is in fact the normal case for http streaming
Re: [whatwg] Cue points in media elements
At 17:04 -0400 1/05/07, Brian Campbell wrote:
> On May 1, 2007, at 1:05 PM, Kevin Calhoun wrote:
>> I believe that a cue point is reached if its time is traversed during playback.
>
> What does traversed mean in terms of (a) seeking across the cue point (b) playing in reverse (rewinding) and (c) the media stalling and restarting at a later point in the stream?

I would say that playing (at any rate and in any direction) is a continuous function, and therefore cue points are triggered, when playing, whenever two samples of the time straddle the cue point (where straddle includes one of the samples being at the cue point). Seeking is discontinuous, and therefore cue points are triggered only if a seek results in landing on the cue point, if not playing. If playing, then the usual rules apply.

Frame dropping, stalling, and so on, are aspects of the playback behavior and have nothing to do with the logical model of cues laid on a time axis.

--
David Singer
Apple Computer/QuickTime
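As a rough illustration of the model in this message, the Python sketch below treats playback as pairs of consecutive time samples and a seek as a single landing position. The function names are invented for the example, and the firing order (following the direction of travel) is an assumption; the message only says which cue points are triggered.

```python
def cues_for_playback_step(cues, previous, current):
    """Cue points straddled by two consecutive time samples while playing.

    A sample sitting exactly on a cue point counts as straddling it. Results
    are ordered in the direction of travel, which is an assumption of this
    sketch rather than something the message states.
    """
    lo, hi = min(previous, current), max(previous, current)
    hit = [c for c in cues if lo <= c <= hi]
    return sorted(hit, reverse=current < previous)


def cues_for_seek(cues, target):
    """Cue points triggered by a seek while not playing: only an exact landing."""
    return [c for c in cues if c == target]


cues = [2.0, 5.0, 8.0]

print(cues_for_playback_step(cues, 1.5, 5.0))   # forward playback
print(cues_for_playback_step(cues, 8.5, 4.0))   # reverse playback
print(cues_for_seek(cues, 6.0))                 # seek across 2.0 and 5.0, landing on neither
```

One consequence of the inclusive test is that a cue lying exactly on a sample would match both the step that ends there and the step that starts there, so a real implementation would need some way to avoid firing it twice; the message does not say how.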
Re: [whatwg] Cue points in media elements
On 4/29/07, Brian Campbell [EMAIL PROTECTED] wrote:
> For the sort of content that we produce, cue points are incredibly important. Most of our content consists of a video or voiceover playing while bullet points appear, animations play, and graphics are revealed, all in sync with the video.
>
> We have a very simple system for doing cue points, that is extremely easy for the content authors to write and is robust for paused media, media that is skipped to the end, etc. We simply have a blocking call, WAIT, that waits until a specific point or the end of a specified media element. For instance, in our language, you might see something like this:
>
>     (movie Foo.mov :name 'movie)
>     (wait @movie (tc 2 3))
>     (show @bullet-1)
>     (wait @movie)
>     (show @bullet-2)
>
> If the user skips to the end of the media clip, that simply causes all WAITs on that media clip to return instantly. If they skip forward in the media clip, without ending it, all WAITs before that point will return instantly. If the user pauses the media clip, all WAITs on the media clip will block until it is playing again.
>
> This is a nice system, but I can't see how even as simple a system as this could be implemented given the current specification of cue points. The problem is that the callbacks execute when the current playback position of a media element reaches the cue point. It seems unclear to me what reaching a particular time means. If video playback freezes for a second, and so misses a cue point, is that considered to have been reached? Is there any way that you can guarantee that a cue point will be executed as long as video has passed a particular cue point? With a lot of bookkeeping and the timeupdate event along with the cue points, you may be able to keep track of the current time in the movie well enough to deal with the user skipping forward, pausing, and the video stalling and restarting due to running out of buffer. This doesn't address, as far as I can tell, issues like the thread displaying the video pausing for whatever reason and so skipping forward after it resumes, which may cause cue points to be lost, and which isn't specified to send a timeupdate event.
>
> Basically, what is necessary is a way to specify that a cue point should always be fired as long as playback has passed a certain time, not just if it reaches a particular time. This would prevent us from having to do a lot of bookkeeping to make sure that cue points haven't been missed, and make everything simpler and less fragile.

To capture this kind of situation, with flexibility in mind, I think the concept of cue points could be changed to cue periods...

Method names:

    addEnterCuePeriod(time1, time2, callback)
    removeEnterCuePeriod(time1, time2, callback)
    addLeaveCuePeriod(time1, time2, callback)
    removeLeaveCuePeriod(time1, time2, callback)

The callback function passed to addEnterCuePeriod will be invoked once when the video enters the period of time bounded by time1 and time2. How the video gets to a frame between time1 and time2 doesn't matter; i.e. the callback may be invoked by a normally playing video reaching time1, by a video being fast-forwarded or rewound into the period between time1 and time2, or by a direct seek to a particular time between time1 and time2.

The mechanism of LeaveCuePeriod is similar, except that the callback is invoked when the video leaves the specified cue period. (Or should this pair of methods be left out?)

With these four methods, one can achieve not only the bullet point effect, but also the appearance and disappearance of video captions.

Hope this helps.

郁
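To make the four proposed methods concrete, here is a minimal Python sketch of a registry that mirrors them. It is not an existing API: the class `CuePeriods`, the snake_case method names and the `position_changed` hook that a player would call are all hypothetical, and treating a period as the closed interval [time1, time2] is an assumption, since the proposal does not say whether the bounds are inclusive.

```python
class CuePeriods:
    """Hypothetical registry mirroring the four proposed cue-period methods."""

    def __init__(self):
        self._enter = []   # (time1, time2, callback) triples
        self._leave = []

    def add_enter_cue_period(self, time1, time2, callback):
        self._enter.append((time1, time2, callback))

    def remove_enter_cue_period(self, time1, time2, callback):
        # Raises ValueError if no matching registration exists.
        self._enter.remove((time1, time2, callback))

    def add_leave_cue_period(self, time1, time2, callback):
        self._leave.append((time1, time2, callback))

    def remove_leave_cue_period(self, time1, time2, callback):
        self._leave.remove((time1, time2, callback))

    def position_changed(self, previous, current):
        """Called by the (hypothetical) player for playback steps and seeks alike."""
        for t1, t2, callback in list(self._enter):
            if not (t1 <= previous <= t2) and (t1 <= current <= t2):
                callback()
        for t1, t2, callback in list(self._leave):
            if (t1 <= previous <= t2) and not (t1 <= current <= t2):
                callback()


# A caption that appears on entering the period and disappears on leaving it.
shown = []
periods = CuePeriods()
periods.add_enter_cue_period(2.0, 5.0, lambda: shown.append("show caption"))
periods.add_leave_cue_period(2.0, 5.0, lambda: shown.append("hide caption"))

periods.position_changed(0.0, 3.0)   # played or seeked into the period
periods.position_changed(3.0, 9.0)   # jumped out of it
print(shown)
```

Because only the previous and current positions are compared, it does not matter whether the period was reached by normal playback, fast-forward, rewind or a direct seek, which is the property the proposal asks for.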
Re: [whatwg] Cue points in media elements
On Apr 30, 2007, at 4:15 PM, Ralph Giles wrote:
> On Apr 29, 2007, at 12:14 AM, Brian Campbell wrote:
>> If video playback freezes for a second, and so misses a cue point, is that considered to have been reached?
>
> As I read it, cue points are relative to the current playback position, which does not advance if the stream buffer underruns, but it would if playback restarts after a gap, as might happen if the connection drops, or in an RTP stream. My proposal above would need to be amended to handle that case, and the decoder dropping frames...finding the right language here is hard.

I believe that a cue point is reached if its time is traversed during playback.

- Kevin
Re: [whatwg] Cue points in media elements
On Apr 30, 2007, at 7:15 PM, Ralph Giles wrote:
> Thanks for adding to the discussion. We're very interested in implementing support for presentations as well, so it's good to hear from someone with experience.

Thanks for responding, I'm glad to hear your input.

> On Sun, Apr 29, 2007 at 03:14:27AM -0400, Brian Campbell wrote:
>> in our language, you might see something like this:
>>
>>     (movie Foo.mov :name 'movie)
>>     (wait @movie (tc 2 3))
>>     (show @bullet-1)
>>     (wait @movie)
>>     (show @bullet-2)
>>
>> If the user skips to the end of the media clip, that simply causes all WAITs on that media clip to return instantly. If they skip forward in the media clip, without ending it, all WAITs before that point will return instantly.
>
> How does this work if, for example, the user seeks forward, and then back to an earlier position? Would some of the 'show's be undone, or do they not seek backward with the media playback?

We don't expose arbitrary seeking controls to our users; just play/pause, skip forward/back one card (which resets all state to a known value) and skip past the current video/audio (which just causes all waits on that media element to return instantly).

> Is the essential component of your system that all the shows be called in sequence to build up a display state, or that the last state trigger before the current playback point have been triggered?

The former.

> Isn't this slow if a bunch of intermediate animations are triggered by a seek?

Yes, though this is more a bug in our animation API (which could be taught to skip directly to the end of an animation when associated video/audio ends, but that just hasn't been done yet).

Actually, that brings up another point, which is a bit more speculative. It may be nice to have a way to register a callback that will be called at animation rates (at least 15 frames/second or so) that is called with the current play time of a media element. This would allow you to keep animations in sync with video, even if the video might stall briefly, or seek forward or backward for whatever reason. We haven't implemented this in our current system (as I said, it still has the bug that animations still take their full time to play even when you skip video), but it may be helpful for this sort of thing.

> Does your system support live streaming as well? That complicates the design some when the presentation media updates appear dynamically.

No, we only support progressive download.

> Anyway I think you could implement your system with the currently proposed interface by checking the current playback position and clearing a separate list of waits inside your timeupdate callback.

I agree, it would be possible, but from my current reading of the spec it sounds like some cue points might be missed until quite a bit later (since timeupdate isn't guaranteed to be called every time anything discontinuous happens with the media). In general, having to do extra bookkeeping to keep track of the state of the media may be fragile, so stronger guarantees about when cue points are fired are better than trying to keep track of what's going on with timeupdate events.

> I agree this should be clarified. The appropriate interpretation should be when the current playback position reaches the frame corresponding to the cue point, but digital media has quantized frames, while the cue points are floating point numbers. Triggering all cue point callbacks between the last current playback position and the current one (including during seeks) would be one option, and do what you want as long as you aren't seeking backward. I'd be more in favor of triggering any cue point callbacks that lie between the current playback position and the current playback position of the next frame (audio frame for audio/ and video frame for video/ I guess). That means more bookkeeping to implement your system, but is less surprising in other cases.

Sure, that would probably work. As I said, bookkeeping is generally a problem because it might get out of sync, but with stronger guarantees about when cue points are triggered, I think it could work.

>> If video playback freezes for a second, and so misses a cue point, is that considered to have been reached?
>
> As I read it, cue points are relative to the current playback position, which does not advance if the stream buffer underruns, but it would if playback restarts after a gap, as might happen if the connection drops, or in an RTP stream. My proposal above would need to be amended to handle that case, and the decoder dropping frames...finding the right language here is hard.

Yes, it's a tricky little problem. Our current system stays out of trouble because it makes quite a few simplifying assumptions (video is played forward only, progressive download, not streaming, etc). Obviously, in order to support a more general API, you're
Re: [whatwg] Cue points in media elements
On May 1, 2007, at 1:05 PM, Kevin Calhoun wrote:
> I believe that a cue point is reached if its time is traversed during playback.

What does traversed mean in terms of (a) seeking across the cue point (b) playing in reverse (rewinding) and (c) the media stalling and restarting at a later point in the stream?
Re: [whatwg] Cue points in media elements
Hearing about cue points in media elements. Just sorta reminds me of keyTimes in SMIL.

I know SMIL seems funky to some people, but I do really love it! It is so way cool! So far as I know it doesn't do quite what you're talking about here, but it does similar stuff including non-linear distortions of timing elements and the like. It's declarative (though I don't think it's Turing complete -- wager of virtual beans proposed) and its syntax is worthy of emulation in that classical ontology recapitulates philology sort of sense. It is so much a W3C standard that it has six or eight or twelve standards devoted to it.

David Dailey
(who is trying to learn how not to re-invent wheels)
http://srufaculty.sru.edu/david.dailey/copyright/dailey_on_copyright.htm

Damn bastard mutant wheels keep popping up around me like unwanted copyrighted utterances in a world where intellectual landfills are charged by the bit! -- anonymous

----- Original Message -----
From: Brian Campbell [EMAIL PROTECTED]
To: Ralph Giles [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Sent: Tuesday, May 01, 2007 4:57 PM
Subject: Re: [whatwg] Cue points in media elements
Re: [whatwg] Cue points in media elements
Thanks for adding to the discussion. We're very interested in implementing support for presentations as well, so it's good to hear from someone with experience. Since we work on streaming media formats, I always assumed things would have to be broken up by the server and the various components streamed separately to a browser, and I hadn't noticed the cue point support until you pointed it out.

Some comments and questions below...

On Sun, Apr 29, 2007 at 03:14:27AM -0400, Brian Campbell wrote:
> in our language, you might see something like this:
>
>     (movie Foo.mov :name 'movie)
>     (wait @movie (tc 2 3))
>     (show @bullet-1)
>     (wait @movie)
>     (show @bullet-2)
>
> If the user skips to the end of the media clip, that simply causes all WAITs on that media clip to return instantly. If they skip forward in the media clip, without ending it, all WAITs before that point will return instantly.

How does this work if, for example, the user seeks forward, and then back to an earlier position? Would some of the 'show's be undone, or do they not seek backward with the media playback? Is the essential component of your system that all the shows be called in sequence to build up a display state, or that the last state trigger before the current playback point have been triggered? Isn't this slow if a bunch of intermediate animations are triggered by a seek?

Does your system support live streaming as well? That complicates the design some when the presentation media updates appear dynamically.

Anyway I think you could implement your system with the currently proposed interface by checking the current playback position and clearing a separate list of waits inside your timeupdate callback.

> This is a nice system, but I can't see how even as simple a system as this could be implemented given the current specification of cue points. The problem is that the callbacks execute when the current playback position of a media element reaches the cue point. It seems unclear to me what reaching a particular time means.

I agree this should be clarified. The appropriate interpretation should be when the current playback position reaches the frame corresponding to the cue point, but digital media has quantized frames, while the cue points are floating point numbers. Triggering all cue point callbacks between the last current playback position and the current one (including during seeks) would be one option, and do what you want as long as you aren't seeking backward. I'd be more in favor of triggering any cue point callbacks that lie between the current playback position and the current playback position of the next frame (audio frame for audio/ and video frame for video/ I guess). That means more bookkeeping to implement your system, but is less surprising in other cases.

> If video playback freezes for a second, and so misses a cue point, is that considered to have been reached?

As I read it, cue points are relative to the current playback position, which does not advance if the stream buffer underruns, but it would if playback restarts after a gap, as might happen if the connection drops, or in an RTP stream. My proposal above would need to be amended to handle that case, and the decoder dropping frames...finding the right language here is hard.

> In the current spec, all that is provided for is controls to turn closed captions on or off. What would be much better is a way to enable the video element to send caption events, which include the text of the current caption, and can be used to display those captions in a way that fits the design of the content better.

I really like this idea. It would also be nice if, for example, the closed caption text were available through the DOM so it could be presented elsewhere, searched locally, and so on. But what about things like album art, which might be embedded in an audio stream? Should that be accessible? Should a video element expose a set of known cue points embedded in the file?

A more abstract interface is necessary than just 'caption events'. Here are some use cases worth considering:

* A media file has embedded textual metadata like title, author, copyright license, that the designer would like to access for associated display elsewhere in the page, or to alter the displayed user interface based on the metadata. This is pretty essential for parity with flash-based internet radio players.
* A media file has embedded non-textual metadata like an album cover image, that the designer would like to access for display elsewhere in the page.
* The designer wants to access closed captioned or subtitle text through the DOM as it becomes available for display elsewhere in the page.
* There are points in the media file where the embedded metadata changes. These points cannot be retrieved without scanning the file, which is expensive over
[whatwg] Cue points in media elements
I'm a developer of a custom engine for interactive multimedia, and I've recently noticed the work WHATWG has been doing on adding video and audio elements to HTML. I'm very glad to see these being proposed for addition to HTML, because if they (and several other features) are done right, it means that there may be a chance for us to stop using a custom engine, and use an off-the-shelf HTML engine, putting our development focus on our authoring tools instead. My hope is that eventually, if these features get enough penetration, we can put our content up on the web directly, rather than having to distribute the runtime software with it.

I've taken a look at the current specification for media elements, and on the whole, it looks like it would meet our needs. We are currently using VP3, and a combination of MP3 and Vorbis audio, for our codecs, so having Ogg Theora (based on VP3) and Ogg Vorbis as a baseline would be completely fine with us, and much preferable to the patent issues and licensing fees we'd need to deal with if we used MPEG4.

For the sort of content that we produce, cue points are incredibly important. Most of our content consists of a video or voiceover playing while bullet points appear, animations play, and graphics are revealed, all in sync with the video.

We have a very simple system for doing cue points, that is extremely easy for the content authors to write and is robust for paused media, media that is skipped to the end, etc. We simply have a blocking call, WAIT, that waits until a specific point or the end of a specified media element. For instance, in our language, you might see something like this:

    (movie Foo.mov :name 'movie)
    (wait @movie (tc 2 3))
    (show @bullet-1)
    (wait @movie)
    (show @bullet-2)

If the user skips to the end of the media clip, that simply causes all WAITs on that media clip to return instantly. If they skip forward in the media clip, without ending it, all WAITs before that point will return instantly. If the user pauses the media clip, all WAITs on the media clip will block until it is playing again.

This is a nice system, but I can't see how even as simple a system as this could be implemented given the current specification of cue points. The problem is that the callbacks execute when the current playback position of a media element reaches the cue point. It seems unclear to me what reaching a particular time means. If video playback freezes for a second, and so misses a cue point, is that considered to have been reached? Is there any way that you can guarantee that a cue point will be executed as long as video has passed a particular cue point? With a lot of bookkeeping and the timeupdate event along with the cue points, you may be able to keep track of the current time in the movie well enough to deal with the user skipping forward, pausing, and the video stalling and restarting due to running out of buffer. This doesn't address, as far as I can tell, issues like the thread displaying the video pausing for whatever reason and so skipping forward after it resumes, which may cause cue points to be lost, and which isn't specified to send a timeupdate event.

Basically, what is necessary is a way to specify that a cue point should always be fired as long as playback has passed a certain time, not just if it reaches a particular time. This would prevent us from having to do a lot of bookkeeping to make sure that cue points haven't been missed, and make everything simpler and less fragile.

We're also greatly interested in making our content accessible, to meet Section 508 requirements. For now, we are focusing on captioning for the deaf. We have voiceovers on some screens with no associated video, video that appears in various places on the screen, and the occasional sound effects. Because there is not a consistent video location, nor is there even a frame for voiceovers to appear in, we don't display the captions directly over the video, but instead send events to the current screen, which is responsible for catching the events and displaying them in a location appropriate for that screen, usually a standard location.

In the current spec, all that is provided for is controls to turn closed captions on or off. What would be much better is a way to enable the video element to send caption events, which include the text of the current caption, and can be used to display those captions in a way that fits the design of the content better.

I hope these comments make sense; let me know if you have any questions or suggestions.

Thanks,
Brian Campbell
Interactive Media Lab, Dartmouth College
http://iml.dartmouth.edu
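The WAIT behavior described in this message ("fire once playback has passed a time", with everything released when the clip ends) can be approximated with a small amount of bookkeeping driven by position updates. The Python sketch below is only an illustration under stated assumptions: `WaitList` and `on_timeupdate` are hypothetical names, the player is assumed to report its position (and whether the clip has ended) through that one method, callbacks stand in for the blocking call, and 2.1 is a stand-in for the `(tc 2 3)` timecode, whose units the message does not define.

```python
import heapq
import itertools


class WaitList:
    """Hypothetical bookkeeping for WAIT-style cues driven by position updates."""

    def __init__(self):
        self._pending = []                 # heap of (time, sequence, callback)
        self._sequence = itertools.count() # tie-breaker so callbacks are never compared

    def wait(self, time, callback):
        """Register a wait; time=None means 'until the end of the clip'."""
        key = float("inf") if time is None else time
        heapq.heappush(self._pending, (key, next(self._sequence), callback))

    def on_timeupdate(self, current_time, ended=False):
        """Release every wait at or before the current position, in time order.

        While the clip is paused the position does not advance, so nothing
        new is released; when the clip ends, all remaining waits return.
        """
        limit = float("inf") if ended else current_time
        while self._pending and self._pending[0][0] <= limit:
            _, _, callback = heapq.heappop(self._pending)
            callback()


shown = []
waits = WaitList()
waits.wait(2.1, lambda: shown.append("bullet-1"))    # (wait @movie (tc 2 3))
waits.wait(None, lambda: shown.append("bullet-2"))   # (wait @movie)

waits.on_timeupdate(1.0)               # nothing is due yet
waits.on_timeupdate(30.0)              # user skipped forward past the first cue
waits.on_timeupdate(60.0, ended=True)  # clip finished
print(shown)
```

This only works as well as the position updates it is fed: if the player fails to report a discontinuity, the waits are released late, which is the fragility the message is concerned about.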