Re: [whatwg] WebVTT feedback (was Re: Video feedback)

2011-06-07 Thread Silvia Pfeiffer
Hi Philip, all,

On Tue, Jun 7, 2011 at 8:12 PM, Philip Jägenstedt  wrote:
> On Sat, 04 Jun 2011 17:05:55 +0200, Silvia Pfeiffer
>  wrote:
>
>>> On Mon, 3 Jan 2011, Philip J盲genstedt wrote:
>
> Silvia, is your mail client a bit funny with character encodings? (The UTF-8
> representation of U+00E4 is the same as the GBK representation of U+76F2.)

I'm using GMAIL, so if there is anything wrong, you'll have to report
it to Google. ;-)
Checking back, I actually received your name in Ian's email with that
funny encoding. I'm not sure whether it's Gmail's fault for interpreting
it this way, whether some information in the email headers was lost
during delivery, or something else.


 > > * The "bad cue" handling is stricter than it should be. After
 > > collecting an id, the next line must be a timestamp line. Otherwise,
 > > we skip everything until a blank line, so in the following the
 > > parser would jump to "bad cue" on line "2" and skip the whole cue.
 > >
 > > 1
 > > 2
 > > 00:00:00.000 --> 00:00:01.000
 > > Bla
 > >
 > > This doesn't match what most existing SRT parsers do, as they simply
 > > look for timing lines and ignore everything else. If we really need
 > > to collect the id instead of ignoring it like everyone else, this
 > > should be more robust, so that a valid timing line always begins a
 > > new cue. Personally, I'd prefer if it is simply ignored and that we
 > > use some form of in-cue markup for styling hooks.
 >
 > The IDs are useful for referencing cues from script, so I haven't
 > removed them. I've also left the parsing as is for when neither the
 > first nor second line is a timing line, since that gives us a lot of
 > headroom for future extensions (we can do anything so long as the
 > second line doesn't start with a timestamp and "-->" and another
 > timestamp).

 In the case of feeding future extensions to current parsers, it's way
 better fallback behavior to simply ignore the unrecognized second line
 than to discard the entire cue. The current behavior seems unnecessarily
 strict and makes the parser more complicated than it needs to be. My
 preference is just ignore anything preceding the timing line, but even
 if we must have IDs it can still be made simpler and more robust than
 what is currently spec'ed.
>>>
>>> If we just ignore content until we hit a line that happens to look like a
>>> timing line, then we are much more constrained in what we can do in the
>>> future. For example, we couldn't introduce a "comment block" syntax,
>>> since
>>> any comment containing a timing line wouldn't be ignored. On the other
>>> hand if we keep the syntax as it is now, we can introduce a comment block
>>> just by having its first line include a "-->" but not have it match the
>>> timestamp syntax, e.g. by having it be "--> COMMENT" or some such.
>>>
>>> Looking at the parser more closely, I don't really see how doing anything
>>> more complex than skipping the block entirely would be simpler than what
>>> we have now, anyway.
>>
>> Yes, I think that can work. The pattern of a line with "-->" without
>> time markers is currently ignored, so we can introduce something with
>> it for special content like comments, style and default.
>
> This seems to have been Ian's assumption, but it's not what the spec says.
> Follow the steps in
> http://www.whatwg.org/specs/web-apps/current-work/multipage/the-iframe-element.html#parsing-0
>
> 32. If line contains the three-character substring "-->" (U+002D
> HYPHEN-MINUS, U+002D HYPHEN-MINUS, U+003E GREATER-THAN SIGN), then jump to
> the step labeled timings below.
>
> 40. Timings: Collect WebVTT cue timings and settings from line, using cue
> for the results. If that fails, jump to the step labeled bad cue.
>
> 54. Bad cue: Discard cue.
>
> (Followed by a loop to skip until the next empty line.)
>
> The effect is that any line containing "-->" that is not a timing line
> causes everything up to the next newline to be ignored.


Yes, that's what I expect. Therefore we can create such cues in the
file format right now and the browsers as they currently work will
ignore such content. In future, they can be extended to actually do
something sensible with it. Isn't that what "is currently ignored"
means? It doesn't break the parser - the parser just skips over it. Am
I missing something?

(And yes: I'd actually like to include these specs now rather than
later, so we can extend the parsing algo right now. But I am not
fussed about timing. It's good to understand how we will extend the
format.)
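To make the skip-and-extend behaviour discussed above concrete, here is a rough sketch in Python. This is a hypothetical illustration, not the spec's parsing algorithm: any line containing "-->" is treated as a candidate timing line, a failed parse discards the block up to the next blank line (the "bad cue" path), and that is exactly what leaves room for a future "--> COMMENT" block. The regex and function names are my own.

```python
import re

# Illustrative sketch of the spec steps quoted above (32/40/54):
# "-->" jumps to timings; a failed timing parse skips to the next
# blank line, so a "--> COMMENT" block is silently ignored today.
TIMESTAMP = r"(?:\d{2,}:)?\d{2}:\d{2}\.\d{3}"
TIMING_LINE = re.compile(rf"^\s*({TIMESTAMP})\s+-->\s+({TIMESTAMP})")

def parse_blocks(lines):
    cues, skipped = [], []
    i = 0
    while i < len(lines):
        line = lines[i]
        if "-->" in line:                      # step 32: jump to "timings"
            m = TIMING_LINE.match(line)
            if m:                              # step 40: timings collected
                start, end = m.group(1), m.group(2)
                i += 1
                text = []
                while i < len(lines) and lines[i].strip():
                    text.append(lines[i])
                    i += 1
                cues.append((start, end, "\n".join(text)))
            else:                              # step 54: bad cue -- skip
                while i < len(lines) and lines[i].strip():
                    skipped.append(lines[i])
                    i += 1
        else:
            i += 1
    return cues, skipped

src = """\
--> COMMENT this block is skipped by today's parser
some ignored text

1
00:00:00.000 --> 00:00:01.000
Bla
""".splitlines()

cues, skipped = parse_blocks(src)
```

Under this sketch the "--> COMMENT" block is discarded without affecting the following cue, which is the forward-compatibility property being argued for.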


 * Voice synthesis of e.g. mixed English/French captions. Given that this
 would only be useful to people who know both languages, it seems not
 worth complicating the format for.
>>>
>>> Agreed on all fronts.
>>
>> I disagree with the third case. Many people speak more than one
>> language and even if they don't speak the language that is in use in a
>> cue, it is still bad to render it using the wrong language model, in
>> particular if it is rendered by a screen reader. We really need a
>> mechanism to attach a language marker to a cue segment.

Re: [whatwg] Video feedback

2011-06-07 Thread Silvia Pfeiffer
On Tue, Jun 7, 2011 at 7:04 PM, Philip Jägenstedt  wrote:
> On Sat, 04 Jun 2011 03:39:58 +0200, Silvia Pfeiffer
>  wrote:
>
>
>> On Fri, Jun 3, 2011 at 9:28 AM, Ian Hickson  wrote:
>>>
>>> On Thu, 16 Dec 2010, Silvia Pfeiffer wrote:

 I do not know how technically the change of stream composition works in
 MPEG, but in Ogg we have to end a current stream and start a new one to
 switch compositions. This has been called "sequential multiplexing" or
 "chaining". In this case, stream setup information is repeated, which
 would probably lead to creating a new stream handler and possibly a new
 firing of "loadedmetadata". I am not sure how chaining is implemented in
 browsers.
>>>
>>> Per spec, chaining isn't currently supported. The closest thing I can
>>> find
>>> in the spec to this situation is handling a non-fatal error, which causes
>>> the unexpected content to be ignored.
>>>
>>>
>>> On Fri, 17 Dec 2010, Eric Winkelman wrote:

 The short answer for changing stream composition is that there is a
 Program Map Table (PMT) that is repeated every 100 milliseconds and
 describes the content of the stream.  Depending on the programming, the
 stream's composition could change entering/exiting every advertisement.
>>>
>>> If this is something that browser vendors want to support, I can specify
>>> how to handle it. Anyone?
>>
>> Icecast streams have chained files, so streaming Ogg to an audio
>> element would hit this problem. There is a bug in FF for this:
>> https://bugzilla.mozilla.org/show_bug.cgi?id=455165 (and a duplicate
>> bug at https://bugzilla.mozilla.org/show_bug.cgi?id=611519). There's
>> also a webkit bug for icecast streaming, which is probably related
>> https://bugs.webkit.org/show_bug.cgi?id=42750. I'm not sure how Opera
>> handles Icecast streams, but it seems to manage.
>>
>> The thing is: you can implement playback and seeking without any
>> further changes to the spec. But then the browser-internal metadata
>> states will change depending on the chunk you're on. Should that also
>> update the exposed metadata in the API then? Probably yes, because
>> otherwise the JS developer may deal with contradictory information.
>> Maybe we need a "metadatachange" event for this?
>
> An Icecast stream is conceptually just one infinite audio stream, even
> though at the container level it is several chained Ogg streams. duration
> will be Infinity and currentTime will be constantly increasing. This doesn't
> seem to be a case where any spec change is needed. Am I missing something?


That is all correct. However, because it is a sequence of Ogg streams,
there are new Ogg headers in the middle. These new Ogg headers will
lead to new metadata loaded in the media framework - e.g. because the
new Ogg stream is encoded with a different audio sampling rate and a
different video width/height etc. Therefore, the metadata in the
media framework changes. However, what the browser reports to the JS
developer doesn't change; or if it does change, the JS developer is
not informed of it, because it is a single infinite audio (or video)
stream. Hence the question of whether we need a new "metadatachange" event
to expose this to the JS developer. It would then also signify that
potentially the number of tracks that are available may have changed
and other such information.
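As a rough illustration of what such an event might look like, here is a sketch in Python. Everything here is hypothetical: "metadatachange" is only a proposed event name, and the class and method names are invented for illustration. The idea is simply that when a new chained segment's headers arrive, the framework compares them with the last exposed metadata and notifies listeners only if something observable changed.

```python
# Hypothetical sketch of the proposed "metadatachange" behaviour for
# chained Ogg/Icecast streams. Neither the event nor this API exists
# in any spec or browser; names are illustrative only.
class MediaElement:
    def __init__(self):
        self._metadata = {}
        self._listeners = {"metadatachange": []}

    def add_event_listener(self, name, cb):
        self._listeners[name].append(cb)

    def on_new_chain_segment(self, headers):
        # headers: metadata parsed from the new segment's stream headers,
        # e.g. {"sample_rate": 44100, "channels": 2}
        if headers != self._metadata:
            self._metadata = dict(headers)
            for cb in self._listeners["metadatachange"]:
                cb(self._metadata)

el = MediaElement()
changes = []
el.add_event_listener("metadatachange", changes.append)
el.on_new_chain_segment({"sample_rate": 44100, "channels": 2})
el.on_new_chain_segment({"sample_rate": 44100, "channels": 2})  # identical: no event
el.on_new_chain_segment({"sample_rate": 48000, "channels": 2})  # changed: event fires
```

With this shape, an unchanged chained segment stays invisible to script, while a genuine change in sampling rate, track count, etc. becomes observable without breaking the "one infinite stream" model.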

Hope that clarifies it.

Cheers,
Silvia.


Re: [whatwg] Support for page transitions

2011-06-07 Thread Aryeh Gregor
On Tue, Jun 7, 2011 at 7:45 AM, Mikko Rantalainen
 wrote:
> The things I don't want to have in this specification (page author control):
>
> - actual transition animation ("slide the next page from the left")
> - transition duration
> - ability to specify easing for transition movement
>
> Instead there should be a method for defining that, when a form is
> submitted with a given button, the UA should use the "next page"
> transition. Hitting another button on the same form should use the
> "previous page" transition, and hitting some link should use the
> "closing" transition.

It would make sense for the author to be able to control this too.
You can already do in-page transitions using CSS, and the same syntax
could be reused for page transitions:

http://www.w3.org/TR/css3-transitions/

> Note that the "next page" button may or may not match with rel="next"
> and as such, I think that there should be an additional method for
> specifying this kind of relation.

What are cases where it wouldn't match?

> I think that it would make sense to use "next page" transition for
> rel="next" by default, but there's a need to attach "next page"
> transition to interactive elements other than rel="next".

What need?

> I think that this could be sensible to have in HTML instead of just in
> the CSS (or some other method) because it's possible that software other
> than just the styling system could use the information about target type
> for links and buttons.

Offhand, it seems sensible to reuse rel; let each platform work out
the default transition animation for each link type (perhaps none in
most cases); and allow authors to override the transition animation on
a per-link basis.  Selectors like a[rel~=next] would be useful here
for authors.  On the desktop, you don't usually have this sort of
next-page animation, so it would be weird if pages exhibited that
behavior unless the author specifically requested it.  On Android or
other particular platforms, it might make sense as the default.  But
it definitely makes sense to me to put this in CSS.


Re: [whatwg] Support for page transitions

2011-06-07 Thread Bjartur Thorlacius
On 6/7/11, Mikko Rantalainen  wrote:
> Note that the "next page" button may or may not match with rel="next"
> and as such, I think that there should be an additional method for
Elaborate; they both refer to the next resource in a sequence of
documents. Note that a document may be an element in multiple
sequences of documents.

If I understand correctly, the feature you want that's not supported
by rel=next and rel=prev is sending a draft to a server when switching
forms. Even better than that would be saving a draft for every input
filled by a user. This draft can be written to a local disk, or stored
at networked servers for global access, for the user agent to refill
when the user revisits the form. I think some user agents already
implement the latter, so that leaves bells, whistles and transitions.


[whatwg] Support for page transitions

2011-06-07 Thread Mikko Rantalainen
I'm pretty sure that most people on this list have seen page transitions
in Internet Explorer 5.5+
(http://msdn.microsoft.com/en-us/library/ms532847%28v=vs.85%29.aspx#Interpage_Transition).

I think that web application user experience could be improved if
transitions between pages were supported. However, I'm also pretty sure
that the implementation in IE 5.5+ is not a good one because it gives
too much power to the page author and too little control for the user.

I'll start with a use case: I have a web service/application that has a
wizard for registration. The wizard consists of multiple forms (HTML
pages) that must be filled in a sequence. Each form has "Next" and
"Previous" submit buttons. I think that the user experience would
improve if I could attach transitions to these buttons. I'd like to have
"slide the next page from right" for the "Next" button and "slide the
next page from the left" for the "Previous" button. (If you have seen
Android Phone OS and its system menus, this is a very similar setup - it
slides the next screen in from the right when selecting a submenu item and
slides the previous screen in from the left if one presses the "back" button.
Android also uses zoom in and zoom out to represent opening an
application and returning to the home screen.)

The things I don't want to have in this specification (page author control):

- actual transition animation ("slide the next page from the left")
- transition duration
- ability to specify easing for transition movement

Instead there should be a method for defining that, when a form is
submitted with a given button, the UA should use the "next page"
transition. Hitting another button on the same form should use the
"previous page" transition, and hitting some link should use the
"closing" transition.

The transition to use for "next page" is up to UA and user preferences.
I'd prefer that the UA started the animation immediately instead of waiting
until the next page is ready. If the animation were slow, it could
re-render the next page on the fly during the animation as pieces of
next page come ready.

Note that the "next page" button may or may not match with rel="next"
and as such, I think that there should be an additional method for
specifying this kind of relation. Perhaps the attribute should be called
"transition" with possible values such as:

* next (advance to next part of the sequence/drill down the menu system,
possible transition could be sliding the current page towards left and
the next page sliding in view from right)
* prev (go back to previous part/return upwards menu, possible
transition could be reverse of next)
* open (open a document e.g. open a google docs document from the list
of possible documents, possible animation could be zoom to the next page)
* close (close current document e.g. close currently edited document and
return to the list of possible documents, possible animation could be
reverse of open)
* swap (replace the current view with another view with some transition
that gives a hint that the previous view was not destroyed e.g. select
another "open" google document from some kind of quick menu, perhaps
some kind of 3d animation where page rotates around vertical axis and
another page is behind it)

I think that it would make sense to use "next page" transition for
rel="next" by default, but there's a need to attach "next page"
transition to interactive elements other than rel="next".

I think that this could be sensible to have in HTML instead of just in
the CSS (or some other method) because it's possible that software other
than just the styling system could use the information about target type
for links and buttons.

This needs to be implemented by the UA because transitions between
different URLs cannot be implemented with JavaScript unlike in-page
transition effects and animations.

-- 
Mikko


Re: [whatwg] WebVTT feedback (was Re: Video feedback)

2011-06-07 Thread Philip Jägenstedt
On Sat, 04 Jun 2011 17:05:55 +0200, Silvia Pfeiffer wrote:



On Mon, 3 Jan 2011, Philip J盲genstedt wrote:


Silvia, is your mail client a bit funny with character encodings? (The  
UTF-8 representation of U+00E4 is the same as the GBK representation of  
U+76F2.)



> > * The "bad cue" handling is stricter than it should be. After
> > collecting an id, the next line must be a timestamp line. Otherwise,
> > we skip everything until a blank line, so in the following the
> > parser would jump to "bad cue" on line "2" and skip the whole cue.
> >
> > 1
> > 2
> > 00:00:00.000 --> 00:00:01.000
> > Bla
> >
> > This doesn't match what most existing SRT parsers do, as they simply
> > look for timing lines and ignore everything else. If we really need
> > to collect the id instead of ignoring it like everyone else, this
> > should be more robust, so that a valid timing line always begins a
> > new cue. Personally, I'd prefer if it is simply ignored and that we
> > use some form of in-cue markup for styling hooks.
>
> The IDs are useful for referencing cues from script, so I haven't
> removed them. I've also left the parsing as is for when neither the
> first nor second line is a timing line, since that gives us a lot of
> headroom for future extensions (we can do anything so long as the
> second line doesn't start with a timestamp and "-->" and another
> timestamp).

In the case of feeding future extensions to current parsers, it's way
better fallback behavior to simply ignore the unrecognized second line
than to discard the entire cue. The current behavior seems unnecessarily
strict and makes the parser more complicated than it needs to be. My
preference is just ignore anything preceding the timing line, but even
if we must have IDs it can still be made simpler and more robust than
what is currently spec'ed.


If we just ignore content until we hit a line that happens to look like a
timing line, then we are much more constrained in what we can do in the
future. For example, we couldn't introduce a "comment block" syntax, since
any comment containing a timing line wouldn't be ignored. On the other
hand if we keep the syntax as it is now, we can introduce a comment block
just by having its first line include a "-->" but not have it match the
timestamp syntax, e.g. by having it be "--> COMMENT" or some such.

Looking at the parser more closely, I don't really see how doing anything
more complex than skipping the block entirely would be simpler than what
we have now, anyway.


Yes, I think that can work. The pattern of a line with "-->" without
time markers is currently ignored, so we can introduce something with
it for special content like comments, style and default.


This seems to have been Ian's assumption, but it's not what the spec says.  
Follow the steps in  
http://www.whatwg.org/specs/web-apps/current-work/multipage/the-iframe-element.html#parsing-0


32. If line contains the three-character substring "-->" (U+002D  
HYPHEN-MINUS, U+002D HYPHEN-MINUS, U+003E GREATER-THAN SIGN), then jump to  
the step labeled timings below.


40. Timings: Collect WebVTT cue timings and settings from line, using cue  
for the results. If that fails, jump to the step labeled bad cue.


54. Bad cue: Discard cue.

(Followed by a loop to skip until the next empty line.)

The effect is that any line containing "-->" that is not a timing
line causes everything up to the next newline to be ignored.





* underline: EBU STL, CEA-608 and CEA-708 support underlining of
characters.


I've added support for 'text-decoration'.


And for <u>. I am happy now, thanks. :-)


Huh. For those who are surprised, this was added in
http://html5.org/r/6004 at the same time as <u> was made conforming for
HTML. See http://www.w3.org/Bugs/Public/show_bug.cgi?id=10838




* Voice synthesis of e.g. mixed English/French captions. Given that this
would only be useful to people who know both languages, it seems not
worth complicating the format for.


Agreed on all fronts.


I disagree with the third case. Many people speak more than one
language and even if they don't speak the language that is in use in a
cue, it is still bad to render it using the wrong language model,
in particular if it is rendered by a screen reader. We really need a
mechanism to attach a language marker to a cue segment.


It's not needed for the rendering of French vs English, is it? It is  
theoretically useful for CJK, but as I've said before it seems to be more  
common to transliterate the foreign script in these cases.



Do you have any examples of real-world subtitles/captions that would
benefit from more fine-grained language information?


This kind of information would indeed be useful.


Note that I'm not so much worried about captions and subtitles here,
but rather worried about audio descriptions as rendered from cue text
descriptions.


When would one want these descriptions to be multi-language?

--
Philip Jägenstedt
Core Developer

Re: [whatwg] Video feedback

2011-06-07 Thread Philip Jägenstedt
On Sat, 04 Jun 2011 03:39:58 +0200, Silvia Pfeiffer wrote:




On Fri, Jun 3, 2011 at 9:28 AM, Ian Hickson  wrote:

On Thu, 16 Dec 2010, Silvia Pfeiffer wrote:


I do not know how technically the change of stream composition works in
MPEG, but in Ogg we have to end a current stream and start a new one to
switch compositions. This has been called "sequential multiplexing" or
"chaining". In this case, stream setup information is repeated, which
would probably lead to creating a new stream handler and possibly a new
firing of "loadedmetadata". I am not sure how chaining is implemented in
browsers.


Per spec, chaining isn't currently supported. The closest thing I can find
in the spec to this situation is handling a non-fatal error, which causes
the unexpected content to be ignored.


On Fri, 17 Dec 2010, Eric Winkelman wrote:


The short answer for changing stream composition is that there is a
Program Map Table (PMT) that is repeated every 100 milliseconds and
describes the content of the stream.  Depending on the programming, the
stream's composition could change entering/exiting every advertisement.


If this is something that browser vendors want to support, I can specify
how to handle it. Anyone?


Icecast streams have chained files, so streaming Ogg to an audio
element would hit this problem. There is a bug in FF for this:
https://bugzilla.mozilla.org/show_bug.cgi?id=455165 (and a duplicate
bug at https://bugzilla.mozilla.org/show_bug.cgi?id=611519). There's
also a webkit bug for icecast streaming, which is probably related
https://bugs.webkit.org/show_bug.cgi?id=42750. I'm not sure how Opera
handles Icecast streams, but it seems to manage.

The thing is: you can implement playback and seeking without any
further changes to the spec. But then the browser-internal metadata
states will change depending on the chunk you're on. Should that also
update the exposed metadata in the API then? Probably yes, because
otherwise the JS developer may deal with contradictory information.
Maybe we need a "metadatachange" event for this?


An Icecast stream is conceptually just one infinite audio stream, even  
though at the container level it is several chained Ogg streams. duration  
will be Infinity and currentTime will be constantly increasing. This  
doesn't seem to be a case where any spec change is needed. Am I missing  
something?


--
Philip Jägenstedt
Core Developer
Opera Software