Re: [whatwg] WebVTT feedback (was Re: Video feedback)
Hi Philip, all,

On Tue, Jun 7, 2011 at 8:12 PM, Philip Jägenstedt wrote:
> On Sat, 04 Jun 2011 17:05:55 +0200, Silvia Pfeiffer wrote:
>
>>> On Mon, 3 Jan 2011, Philip J盲genstedt wrote:
>
> Silvia, is your mail client a bit funny with character encodings? (The UTF-8 representation of U+00E4 is the same as the GBK representation of U+76F2.)

I'm using Gmail, so if there is anything wrong, you'll have to report it to Google. ;-) Checking back, I actually received your name in Ian's email with that funny encoding. I'm not sure whether it's Gmail's fault for interpreting it this way, whether some information in the email headers was lost during delivery, or something else.

> > * The "bad cue" handling is stricter than it should be. After collecting an id, the next line must be a timestamp line. Otherwise, we skip everything until a blank line, so in the following the parser would jump to "bad cue" on line "2" and skip the whole cue.
> >
> > 1
> > 2
> > 00:00:00.000 --> 00:00:01.000
> > Bla
> >
> > This doesn't match what most existing SRT parsers do, as they simply look for timing lines and ignore everything else. If we really need to collect the id instead of ignoring it like everyone else, this should be more robust, so that a valid timing line always begins a new cue. Personally, I'd prefer if it is simply ignored and that we use some form of in-cue markup for styling hooks.
>
> The IDs are useful for referencing cues from script, so I haven't removed them. I've also left the parsing as is for when neither the first nor second line is a timing line, since that gives us a lot of headroom for future extensions (we can do anything so long as the second line doesn't start with a timestamp and "-->" and another timestamp).

>>>> In the case of feeding future extensions to current parsers, it's way better fallback behavior to simply ignore the unrecognized second line than to discard the entire cue. The current behavior seems unnecessarily strict and makes the parser more complicated than it needs to be. My preference is just ignore anything preceding the timing line, but even if we must have IDs it can still be made simpler and more robust than what is currently spec'ed.
>>>
>>> If we just ignore content until we hit a line that happens to look like a timing line, then we are much more constrained in what we can do in the future. For example, we couldn't introduce a "comment block" syntax, since any comment containing a timing line wouldn't be ignored. On the other hand if we keep the syntax as it is now, we can introduce a comment block just by having its first line include a "-->" but not have it match the timestamp syntax, e.g. by having it be "--> COMMENT" or some such.
>>>
>>> Looking at the parser more closely, I don't really see how doing anything more complex than skipping the block entirely would be simpler than what we have now, anyway.
>>
>> Yes, I think that can work. The pattern of a line with "-->" without time markers is currently ignored, so we can introduce something with it for special content like comments, style and default.
>
> This seems to have been Ian's assumption, but it's not what the spec says. Follow the steps in
> http://www.whatwg.org/specs/web-apps/current-work/multipage/the-iframe-element.html#parsing-0
>
> 32. If line contains the three-character substring "-->" (U+002D HYPHEN-MINUS, U+002D HYPHEN-MINUS, U+003E GREATER-THAN SIGN), then jump to the step labeled timings below.
>
> 40. Timings: Collect WebVTT cue timings and settings from line, using cue for the results. If that fails, jump to the step labeled bad cue.
>
> 54. Bad cue: Discard cue.
>
> (Followed by a loop to skip until the next empty line.)
>
> The effect is that any line containing "-->" that is not a timing line causes everything up to the next empty line to be ignored.

Yes, that's what I expect. Therefore we can create such cues in the file format right now and the browsers as they currently work will ignore such content. In future, they can be extended to actually do something sensible with it. Isn't that what "is currently ignored" means? It doesn't break the parser - the parser just skips over it. Am I missing something?

(And yes: I'd actually like to include these specs now rather than later, so we can extend the parsing algo right now. But I am not fussed about timing. It's good to understand how we will extend the format.)

>>>> * Voice synthesis of e.g. mixed English/French captions. Given that this would only be useful to people who know both languages, it seems not worth complicating the format for.
>>>
>>> Agreed on all fronts.
>>
>> I disagree with the third case. Many people speak more than one language and even if they don't speak the langu
Re: [whatwg] Video feedback
On Tue, Jun 7, 2011 at 7:04 PM, Philip Jägenstedt wrote:
> On Sat, 04 Jun 2011 03:39:58 +0200, Silvia Pfeiffer wrote:
>
>> On Fri, Jun 3, 2011 at 9:28 AM, Ian Hickson wrote:
>>>
>>> On Thu, 16 Dec 2010, Silvia Pfeiffer wrote:
>>>> I do not know how technically the change of stream composition works in MPEG, but in Ogg we have to end a current stream and start a new one to switch compositions. This has been called "sequential multiplexing" or "chaining". In this case, stream setup information is repeated, which would probably lead to creating a new stream handler and possibly a new firing of "loadedmetadata". I am not sure how chaining is implemented in browsers.
>>>
>>> Per spec, chaining isn't currently supported. The closest thing I can find in the spec to this situation is handling a non-fatal error, which causes the unexpected content to be ignored.
>>>
>>> On Fri, 17 Dec 2010, Eric Winkelman wrote:
>>>> The short answer for changing stream composition is that there is a Program Map Table (PMT) that is repeated every 100 milliseconds and describes the content of the stream. Depending on the programming, the stream's composition could change entering/exiting every advertisement.
>>>
>>> If this is something that browser vendors want to support, I can specify how to handle it. Anyone?
>>
>> Icecast streams have chained files, so streaming Ogg to an audio element would hit this problem. There is a bug in FF for this: https://bugzilla.mozilla.org/show_bug.cgi?id=455165 (and a duplicate bug at https://bugzilla.mozilla.org/show_bug.cgi?id=611519). There's also a webkit bug for icecast streaming, which is probably related https://bugs.webkit.org/show_bug.cgi?id=42750 . I'm not sure how Opera is able to deal with icecast streams, but it seems to deal with it.
>>
>> The thing is: you can implement playback and seeking without any further changes to the spec. But then the browser-internal metadata states will change depending on the chunk you're on. Should that also update the exposed metadata in the API then? Probably yes, because otherwise the JS developer may deal with contradictory information. Maybe we need a "metadatachange" event for this?
>
> An Icecast stream is conceptually just one infinite audio stream, even though at the container level it is several chained Ogg streams. duration will be Infinity and currentTime will be constantly increasing. This doesn't seem to be a case where any spec change is needed. Am I missing something?

That is all correct. However, because it is a sequence of Ogg streams, there are new Ogg headers in the middle. These new Ogg headers will lead to new metadata being loaded in the media framework, e.g. because the new Ogg stream is encoded with a different audio sampling rate and a different video width/height. So the metadata in the media framework changes. However, what the browser reports to the JS developer doesn't change. Or if it does change, the JS developer is not informed of it, because it is a single infinite audio (or video) stream.

Thus the question of whether we need a new "metadatachange" event to expose this to the JS developer. It would then also signify that the number of available tracks, and other such information, may have changed.

Hope that clarifies it.

Cheers,
Silvia.
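To make the question concrete, here is a minimal sketch (in Python rather than browser code) of the behaviour being asked about: a player walks a chain of segments, fires "loadedmetadata" for the first one, and fires the proposed "metadatachange" only when a later segment's setup information differs. The `StreamInfo` fields and the `play_chain` helper are assumptions made up for illustration; nothing here is specified anywhere, and "metadatachange" is only the event name floated in this thread.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass(frozen=True)
class StreamInfo:
    """Setup information carried by the headers of one chained segment (hypothetical)."""
    sample_rate: int
    channels: int
    width: int = 0
    height: int = 0


def play_chain(segments: Iterable[StreamInfo],
               on_event: Callable[[str, StreamInfo], None]) -> None:
    """Report metadata for a chained stream as one continuous resource."""
    current: Optional[StreamInfo] = None
    for info in segments:
        if current is None:
            # First segment: the ordinary, already-specified event.
            on_event("loadedmetadata", info)
        elif info != current:
            # Later segment with different headers: the proposed event.
            on_event("metadatachange", info)
        current = info


if __name__ == "__main__":
    first = StreamInfo(sample_rate=44100, channels=2)
    second = StreamInfo(sample_rate=48000, channels=2)
    play_chain([first, first, second], lambda name, info: print(name, info))
```

A segment whose headers repeat the previous setup fires nothing, which matches the "one infinite stream" view; only a real change is surfaced to script.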
Re: [whatwg] Support for page transitions
On Tue, Jun 7, 2011 at 7:45 AM, Mikko Rantalainen wrote:
> The things I don't want to have in this specification (page author control):
>
> - actual transition animation ("slide the next page from the left")
> - transition duration
> - ability to specify easing for transition movement
>
> Instead there should be a method for defining that submitting a form with a given button, the UA should use transition to "next page". Hitting another button on the same form should use transition to "previous page" and hitting some link should use "closing" transition.

It would make sense for the author to be able to control this too. You can already do in-page transitions using CSS, and the same syntax could be reused for page transitions:

http://www.w3.org/TR/css3-transitions/

> Note that the "next page" button may or may not match with rel="next" and as such, I think that there should be additional method for specifying this kind of relation.

What are cases where it wouldn't match?

> I think that it would make sense to use "next page" transition for rel="next" by default, but there's a need to attach "next page" transition to interactive elements other than rel="next".

What need?

> I think that this could be sensible to have in HTML instead of just in the CSS (or some other method) because it's possible that software other than just the styling system could use the information about target type for links and buttons.

Offhand, it seems sensible to reuse rel; let each platform work out the default transition animation for each link type (perhaps none in most cases); and allow authors to override the transition animation on a per-link basis. Selectors like a[rel~=next] would be useful here for authors.

On the desktop, you don't usually have this sort of next-page animation, so it would be weird if pages exhibited that behavior unless the author specifically requested it. On Android or other particular platforms, it might make sense as the default. But it definitely makes sense to me to put this in CSS.
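As a rough sketch of the resolution order suggested above (author override first, then a per-platform default keyed on the link type, otherwise no animation), something like the following would do. The platform names, animation names, and the `resolve_animation` helper are all invented for illustration; no such table exists in any spec or browser.

```python
from typing import Dict, Optional

# Hypothetical per-platform defaults: desktop has none, a phone-style platform slides.
PLATFORM_DEFAULTS: Dict[str, Dict[str, str]] = {
    "desktop": {},
    "android": {"next": "slide-from-right", "prev": "slide-from-left"},
}


def resolve_animation(rel: str, platform: str,
                      author_override: Optional[str] = None) -> Optional[str]:
    """Pick the page-transition animation for a link with the given rel value."""
    if author_override is not None:
        # The author asked for something specific on this link (e.g. via CSS).
        return author_override
    defaults = PLATFORM_DEFAULTS.get(platform, {})
    for token in rel.lower().split():
        # rel is a space-separated token list, like the a[rel~=next] selector assumes.
        if token in defaults:
            return defaults[token]
    return None  # no animation unless someone asked for one


if __name__ == "__main__":
    print(resolve_animation("next", "desktop"))
    print(resolve_animation("next", "android"))
    print(resolve_animation("next", "desktop", author_override="fade"))
```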
Re: [whatwg] Support for page transitions
On 6/7/11, Mikko Rantalainen wrote:
>
> Note that the "next page" button may or may not match with rel="next"
> and as such, I think that there should be additional method for

Elaborate; they both refer to the next resource in a sequence of documents. Note that a document may be an element in multiple sequences of documents.

If I understand correctly, the feature you want that's not supported by rel=next and rel=prev is sending a draft to a server when switching forms. Even better than that would be saving a draft for every input filled by a user. This draft can be written to a local disk, or stored at networked servers for global access, for the user agent to refill when the user revisits the form. I think some user agents already implement the latter, so that leaves bells, whistles and transitions.
[whatwg] Support for page transitions
I'm pretty sure that most people on this list have seen page transitions in Internet Explorer 5.5+ (http://msdn.microsoft.com/en-us/library/ms532847%28v=vs.85%29.aspx#Interpage_Transition). I think that web application user experience could be improved if transitions between pages were supported. However, I'm also pretty sure that the implementation in IE 5.5+ is not a good one because it gives too much power to the page author and too little control to the user.

I'll start with a use case:

I have a web service/application that has a wizard for registration. The wizard consists of multiple forms (HTML pages) that must be filled in a sequence. Each form has "Next" and "Previous" submit buttons. I think that the user experience would improve if I could attach transitions to these buttons. I'd like to have "slide the next page from right" for the "Next" button and "slide the next page from the left" for the "Previous" button. (If you have seen Android Phone OS and its system menus, this is a very similar setup - it slides the next screen from right when selecting a submenu item and it slides previous screen from the left if one presses "back" button. Android also uses zoom in and zoom out to represent opening an application and returning to the home screen.)

The things I don't want to have in this specification (page author control):

- actual transition animation ("slide the next page from the left")
- transition duration
- ability to specify easing for transition movement

Instead there should be a method for defining that submitting a form with a given button, the UA should use transition to "next page". Hitting another button on the same form should use transition to "previous page" and hitting some link should use "closing" transition.

The transition to use for "next page" is up to UA and user preferences. I'd prefer that UA started the animation immediately instead of waiting until the next page is ready. If the animation were slow, it could re-render the next page on the fly during the animation as pieces of next page come ready.

Note that the "next page" button may or may not match with rel="next" and as such, I think that there should be additional method for specifying this kind of relation. Perhaps the attribute should be called "transition" with possible values such as:

* next (advance to next part of the sequence/drill down the menu system, possible transition could be sliding the current page towards left and the next page sliding in view from right)
* prev (go back to previous part/return upwards menu, possible transition could be reverse of next)
* open (open a document e.g. open a google docs document from the list of possible documents, possible animation could be zoom to the next page)
* close (close current document e.g. close currently edited document and return to the list of possible documents, possible animation could be reverse of open)
* swap (replace the current view with another view with some transition that gives a hint that the previous view was not destroyed e.g. select another "open" google document from some kind of quick menu, perhaps some kind of 3d animation where page rotates around vertical axis and another page is behind it)

I think that it would make sense to use "next page" transition for rel="next" by default, but there's a need to attach "next page" transition to interactive elements other than rel="next".

I think that this could be sensible to have in HTML instead of just in the CSS (or some other method) because it's possible that software other than just the styling system could use the information about target type for links and buttons.

This needs to be implemented by the UA because transitions between different URLs cannot be implemented with JavaScript unlike in-page transition effects and animations.

--
Mikko
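For what it's worth, the lookup a UA would need for this proposal is small. The sketch below (Python, purely illustrative) reads the proposed "transition" attribute from an element's attributes and falls back to "next" when rel="next" is present, as suggested above. The attribute name and its five values come from this message; the `transition_type` helper and the use of a plain dict for attributes are assumptions, and the choice of actual animation is deliberately left out because the proposal leaves it to the UA and the user.

```python
from typing import Dict, Optional

# The five values proposed in this message for the "transition" attribute.
TRANSITION_TYPES = {"next", "prev", "open", "close", "swap"}


def transition_type(attributes: Dict[str, str]) -> Optional[str]:
    """Return the transition type for a link or button, or None if there is none."""
    value = attributes.get("transition", "").strip().lower()
    if value in TRANSITION_TYPES:
        return value
    # Proposed default: rel="next" implies the "next page" transition.
    if "next" in attributes.get("rel", "").lower().split():
        return "next"
    return None


if __name__ == "__main__":
    print(transition_type({"type": "submit", "transition": "prev"}))
    print(transition_type({"href": "/step2", "rel": "next"}))
    print(transition_type({"href": "/about"}))
```

Unknown values are ignored rather than treated as errors, so a page using a value a given UA doesn't know simply gets no transition.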
Re: [whatwg] WebVTT feedback (was Re: Video feedback)
On Sat, 04 Jun 2011 17:05:55 +0200, Silvia Pfeiffer wrote:

On Mon, 3 Jan 2011, Philip J盲genstedt wrote:

Silvia, is your mail client a bit funny with character encodings? (The UTF-8 representation of U+00E4 is the same as the GBK representation of U+76F2.)

> > * The "bad cue" handling is stricter than it should be. After collecting an id, the next line must be a timestamp line. Otherwise, we skip everything until a blank line, so in the following the parser would jump to "bad cue" on line "2" and skip the whole cue.
> >
> > 1
> > 2
> > 00:00:00.000 --> 00:00:01.000
> > Bla
> >
> > This doesn't match what most existing SRT parsers do, as they simply look for timing lines and ignore everything else. If we really need to collect the id instead of ignoring it like everyone else, this should be more robust, so that a valid timing line always begins a new cue. Personally, I'd prefer if it is simply ignored and that we use some form of in-cue markup for styling hooks.
>
> The IDs are useful for referencing cues from script, so I haven't removed them. I've also left the parsing as is for when neither the first nor second line is a timing line, since that gives us a lot of headroom for future extensions (we can do anything so long as the second line doesn't start with a timestamp and "-->" and another timestamp).

In the case of feeding future extensions to current parsers, it's way better fallback behavior to simply ignore the unrecognized second line than to discard the entire cue. The current behavior seems unnecessarily strict and makes the parser more complicated than it needs to be. My preference is just ignore anything preceding the timing line, but even if we must have IDs it can still be made simpler and more robust than what is currently spec'ed.

If we just ignore content until we hit a line that happens to look like a timing line, then we are much more constrained in what we can do in the future. For example, we couldn't introduce a "comment block" syntax, since any comment containing a timing line wouldn't be ignored. On the other hand if we keep the syntax as it is now, we can introduce a comment block just by having its first line include a "-->" but not have it match the timestamp syntax, e.g. by having it be "--> COMMENT" or some such.

Looking at the parser more closely, I don't really see how doing anything more complex than skipping the block entirely would be simpler than what we have now, anyway.

Yes, I think that can work. The pattern of a line with "-->" without time markers is currently ignored, so we can introduce something with it for special content like comments, style and default.

This seems to have been Ian's assumption, but it's not what the spec says. Follow the steps in http://www.whatwg.org/specs/web-apps/current-work/multipage/the-iframe-element.html#parsing-0

32. If line contains the three-character substring "-->" (U+002D HYPHEN-MINUS, U+002D HYPHEN-MINUS, U+003E GREATER-THAN SIGN), then jump to the step labeled timings below.

40. Timings: Collect WebVTT cue timings and settings from line, using cue for the results. If that fails, jump to the step labeled bad cue.

54. Bad cue: Discard cue.

(Followed by a loop to skip until the next empty line.)

The effect is that any line containing "-->" that is not a timing line causes everything up to the next empty line to be ignored.

* underline: EBU STL, CEA-608 and CEA-708 support underlining of characters.

I've added support for 'text-decoration'.

And for <u>. I am happy now, thanks. :-)

Huh. For those who are surprised, this was added in http://html5.org/r/6004 at the same time as <u> was made conforming for HTML. See http://www.w3.org/Bugs/Public/show_bug.cgi?id=10838

* Voice synthesis of e.g. mixed English/French captions. Given that this would only be useful to people who know both languages, it seems not worth complicating the format for.

Agreed on all fronts.

I disagree with the third case. Many people speak more than one language and even if they don't speak the language that is in use in a cue, it is still bad to render it using the wrong language model, in particular if it is rendered by a screen reader. We really need a mechanism to attach a language marker to a cue segment.

It's not needed for the rendering of French vs English, is it? It is theoretically useful for CJK, but as I've said before it seems to be more common to transliterate the foreign script in these cases.

Do you have any examples of real-world subtitles/captions that would benefit from more fine-grained language information?

This kind of information would indeed be useful. Note that I'm not so much worried about captions and subtitles here, but rather worried about audio descriptions as rendered from cue text descriptions.

When would one want these descriptions to be multi-language?

--
Philip Jägenstedt
Core Develope
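To illustrate the "bad cue" behaviour walked through in steps 32, 40 and 54 above, here is a much-simplified sketch in Python. It is not the spec algorithm: it works on whole blank-line-separated blocks instead of line by line, ignores cue settings, and the names `parse_cues`, `TIMING` and `SAMPLE` are made up for this example. It only shows the two outcomes under discussion: an id followed by a non-timing line discards the block, and so does a "-->" line that is not a valid timing line.

```python
import re

# "hh:mm:ss.ttt" with the hours part optional; cue settings are not handled here.
TIMESTAMP = r"(?:\d{2,}:)?\d{2}:\d{2}\.\d{3}"
TIMING = re.compile(rf"^\s*({TIMESTAMP})\s+-->\s+({TIMESTAMP})(?:\s|$)")


def parse_cues(text):
    """Return (id, start, end, payload) tuples; bad blocks are dropped whole."""
    cues = []
    for block in re.split(r"\n{2,}", text.strip("\n")):
        lines = block.split("\n")
        cue_id = None
        if "-->" not in lines[0]:
            # A first line without "-->" is collected as the cue identifier.
            cue_id = lines[0]
            lines = lines[1:]
        if not lines:
            continue  # e.g. the "WEBVTT" header block
        match = TIMING.match(lines[0])
        if match is None:
            # "Bad cue": discard everything up to the next empty line.
            continue
        cues.append((cue_id, match.group(1), match.group(2), "\n".join(lines[1:])))
    return cues


SAMPLE = """WEBVTT

1
2
00:00:00.000 --> 00:00:01.000
Bla

--> COMMENT
00:00:01.000 --> 00:00:02.000
not a cue

intro
00:00:02.000 --> 00:00:03.000
Hello
"""

if __name__ == "__main__":
    for cue in parse_cues(SAMPLE):
        print(cue)
```

With this sample only the "intro" cue survives: the "1 / 2" block is dropped because the line after the id is not a timing line, and the "--> COMMENT" block is dropped even though its second line looks like a timing line, which is the property the comment-block idea relies on.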
Re: [whatwg] Video feedback
On Sat, 04 Jun 2011 03:39:58 +0200, Silvia Pfeiffer wrote:

On Fri, Jun 3, 2011 at 9:28 AM, Ian Hickson wrote:

On Thu, 16 Dec 2010, Silvia Pfeiffer wrote:

I do not know how technically the change of stream composition works in MPEG, but in Ogg we have to end a current stream and start a new one to switch compositions. This has been called "sequential multiplexing" or "chaining". In this case, stream setup information is repeated, which would probably lead to creating a new stream handler and possibly a new firing of "loadedmetadata". I am not sure how chaining is implemented in browsers.

Per spec, chaining isn't currently supported. The closest thing I can find in the spec to this situation is handling a non-fatal error, which causes the unexpected content to be ignored.

On Fri, 17 Dec 2010, Eric Winkelman wrote:

The short answer for changing stream composition is that there is a Program Map Table (PMT) that is repeated every 100 milliseconds and describes the content of the stream. Depending on the programming, the stream's composition could change entering/exiting every advertisement.

If this is something that browser vendors want to support, I can specify how to handle it. Anyone?

Icecast streams have chained files, so streaming Ogg to an audio element would hit this problem. There is a bug in FF for this: https://bugzilla.mozilla.org/show_bug.cgi?id=455165 (and a duplicate bug at https://bugzilla.mozilla.org/show_bug.cgi?id=611519). There's also a webkit bug for icecast streaming, which is probably related https://bugs.webkit.org/show_bug.cgi?id=42750 . I'm not sure how Opera is able to deal with icecast streams, but it seems to deal with it.

The thing is: you can implement playback and seeking without any further changes to the spec. But then the browser-internal metadata states will change depending on the chunk you're on. Should that also update the exposed metadata in the API then? Probably yes, because otherwise the JS developer may deal with contradictory information. Maybe we need a "metadatachange" event for this?

An Icecast stream is conceptually just one infinite audio stream, even though at the container level it is several chained Ogg streams. duration will be Infinity and currentTime will be constantly increasing. This doesn't seem to be a case where any spec change is needed. Am I missing something?

--
Philip Jägenstedt
Core Developer
Opera Software