Re: [whatwg] Video feedback

2011-07-08 Thread Ian Hickson
On Thu, 7 Jul 2011, Eric Winkelman wrote:
> On Thursday, June 02 Ian Hickson wrote:
> > On Fri, 18 Mar 2011, Eric Winkelman wrote:
> > >
> > > For in-band metadata tracks, there is neither a standard way to 
> > > represent the type of metadata in the HTMLTrackElement interface nor 
> > > is there a standard way to represent multiple different types of 
> > > metadata tracks.
> > 
> > There can be a standard way. The idea is that all the types of 
> > metadata tracks that browsers will support should be specified so that 
> > all browsers can map them the same way. I'm happy to work with anyone 
> > interested in writing such a mapping spec, just let me know.
> 
> I would be very interested in working on this spec.

It would be several specs, probably, each focusing on a particular set of 
metadata in a particular format (e.g. advertising timings in an MPEG 
wrapper, or whatever).


> What's the next step?

First, research: what formats and metadata streams are you interested in? 
Who uses them? How are they implemented in producers and (more 
importantly) consumers today? What are the use cases?

Second, describe the problem: make a clear statement of purpose that 
scopes the effort and provides guidelines to prevent feature creep.

Third, listen to implementors: find those that are interested in 
implementing this particular mapping of metadata to the DOM API, get their 
input, see what they want.

Fourth, implement: make or have someone else make an experimental 
implementation of a mapping that addresses the problem described in the 
earlier steps.

Fifth, specify: write a specification for the mapping that addresses the 
problem described in step two, based on the research from step one and the 
feedback from steps three and four.

Sixth, test: update the experimental implementation to fit the spec, get other 
implementations to implement the spec. Have real users play with it.

Seventh, simplify: remove what you don't need.

Finally, iterate: repeat all these steps for as long as there's any 
interest in this mapping, fixing problems, adding new features if they're 
needed, removing old features that didn't get used or implemented, etc.

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'


Re: [whatwg] Video feedback

2011-07-07 Thread Eric Winkelman
On Thursday, June 02 Ian Hickson wrote:
>
> On Fri, 18 Mar 2011, Eric Winkelman wrote:
> >
> > For in-band metadata tracks, there is neither a standard way to 
> > represent the type of metadata in the HTMLTrackElement interface nor 
> > is there a standard way to represent multiple different types of 
> > metadata tracks.
> 
> There can be a standard way. The idea is that all the types of 
> metadata tracks that browsers will support should be specified so that 
> all browsers can map them the same way. I'm happy to work with anyone 
> interested in writing such a mapping spec, just let me know.

I would be very interested in working on this spec.  

CableLabs works with numerous groups delivering content containing a variety of 
metadata, so we have a good idea of what is currently used.  We're also working 
with the groups defining adaptive bit rate delivery protocols on how 
metadata might be carried.

What's the next step?

Eric


Re: [whatwg] Video feedback

2011-07-07 Thread Bob Lund

> -----Original Message-----
> From: whatwg-boun...@lists.whatwg.org [mailto:whatwg-
> boun...@lists.whatwg.org] On Behalf Of Mark Watson
> Sent: Monday, June 20, 2011 2:29 AM
> To: Eric Carlson
> Cc: Silvia Pfeiffer; whatwg Group; Simon Pieters
> Subject: Re: [whatwg] Video feedback
> 
> 
> On Jun 9, 2011, at 4:32 PM, Eric Carlson wrote:
> 
> >
> > On Jun 9, 2011, at 12:02 AM, Silvia Pfeiffer wrote:
> >
> >> On Thu, Jun 9, 2011 at 4:34 PM, Simon Pieters 
> wrote:
> >>> On Thu, 09 Jun 2011 03:47:49 +0200, Silvia Pfeiffer
> >>>  wrote:
> >>>
> >>>>> For commercial video providers, the tracks in a live stream change
> >>>>> all the time; this is not limited to audio and video tracks but
> >>>>> would include text tracks as well.
> >>>>
> >>>> OK, all this indicates to me that we probably want a
> "metadatachanged"
> >>>> event to indicate there has been a change and that JS may need to
> >>>> check some of its assumptions.
> >>>
> >>> We already have durationchange. Duration is metadata. If we want to
> >>> support changes to width/height, and the script is interested in
> >>> when that happens, maybe there should be a dimensionchange event
> >>> (but what's the use case for changing width/height mid-stream?).
> >>> Does the spec support changes to text tracks mid-stream?
> >>
> >> It's not about what the spec supports, but what real-world streams
> provide.
> >>
> >> I don't think it makes sense to put an event on every single type of
> >> metadata that can change. Most of the time, when you have a stream
> >> change, many variables will change together, so a single event is a
> >> lot less events to raise. It's an event that signifies that the media
> >> framework has reset the video/audio decoding pipeline and loaded a
> >> whole bunch of new stuff. You should imagine it as a concatenation of
> >> different media resources. And yes, they can have different track
> >> constitution and different audio sampling rate (which the audio API
> >> will care about) etc etc.
> >>
> >  In addition, it is possible for a stream to lose or gain an audio
> track. In this case the dimensions won't change but a script may want to
> react to the change in audioTracks.
> 
> The TrackList object has an onchanged event, which I assumed would fire
> when any of the information in the TrackList changes (e.g. tracks added
> or removed). But actually the spec doesn't state when this event fires
> (as far as I could tell - unless it is implied by some general
> definition of events called onchanged).
> 
> Should there be some clarification here ?
> 
> >
> >  I agree with Silvia, a more generic "metadata changed" event makes
> more sense.
> 
> Yes, and it should support the case in which text tracks are
> added/removed too.

Has there been a bug submitted to add a metadata changed event when video, 
audio or text tracks are added or deleted from a media resource?
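
For concreteness, a page might use such an event roughly like this (a
sketch only; "metadatachanged" is just the name being floated in this
thread, and audioTracks is still a draft API):

  var video = document.querySelector('video');
  video.addEventListener('metadatachanged', function () {
    // Re-check any assumptions cached at loadedmetadata time.
    console.log('dimensions:', video.videoWidth, 'x', video.videoHeight);
    console.log('text tracks:', video.textTracks.length);
    if (video.audioTracks)
      console.log('audio tracks:', video.audioTracks.length);
  });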

Thanks,
Bob Lund

> 
> Also, as Eric (C) pointed out, one of the things which can change is
> which of several available versions of the content is being rendered
> (for adaptive bitrate cases). This doesn't necessarily change any of the
> metadata currently exposed on the video element, but nevertheless it's
> information that the application may need. It would be nice to expose
> some kind of identifier for the currently rendered stream and have an
> event when this changes. I think that a stream-format-supplied
> identifier would be sufficient.
> 
> ...Mark
> 
> >
> > eric
> >
> >



Re: [whatwg] Video feedback

2011-06-20 Thread Mark Watson

On Jun 20, 2011, at 5:28 PM, Silvia Pfeiffer wrote:

> On Tue, Jun 21, 2011 at 12:07 AM, Mark Watson  wrote:
>> 
>> On Jun 20, 2011, at 11:52 AM, Silvia Pfeiffer wrote:
>> 
>>> On Mon, Jun 20, 2011 at 7:31 PM, Mark Watson  wrote:
> 
>> The TrackList object has an onchanged event, which I assumed would fire 
>> when
>> any of the information in the TrackList changes (e.g. tracks added or
>> removed). But actually the spec doesn't state when this event fires (as 
>> far
>> as I could tell - unless it is implied by some general definition of 
>> events
>> called onchanged).
>> 
>> Should there be some clarification here ?
> 
> I understood that to relate to a change of cues only, since it is on
> the tracklist. I.e. it's an aggregate event from the oncuechange event
> of a cue inside the track. I didn't think it would relate to a change
> of existence of that track.
> 
> Note that the event is attached to the TrackList, not the TrackList[],
> so it cannot be raised when a track is added or removed, only when
> something inside the TrackList changes.
 
 Are we talking about the same thing ? There is no TrackList array and
 TrackList is only used for audio/video, not text, so I don't understand the
 comment about cues.
 I'm talking
 about 
 http://www.whatwg.org/specs/web-apps/current-work/multipage/the-iframe-element.html#tracklist
  which
 is the base class for MultipleTrackList and ExclusiveTrackList used to
 represent all the audio and video tracks (respectively). One instance of 
 the
 object represents all the tracks, so I would assume that a change in the
 number of tracks is a change to this object.
>>> 
>>> Ah yes, you're right: I got confused.
>>> 
>>> It says "Whenever the selected track is changed, the user agent must
>>> queue a task to fire a simple event named change at the
>>> MultipleTrackList object." This means it fires when the selectedIndex
>>> is changed, i.e. the user chooses a different track for rendering. I
>>> still don't think it relates to changes in the composition of tracks
>>> of a resource. That should be something different and should probably
>>> be on the MediaElement and not on the track list to also cover changes
>>> in text tracks.
>> 
>> Fair enough.
>> 
>>> 
>>> 
>> Also, as Eric (C) pointed out, one of the things which can change is 
>> which
>> of several available versions of the content is being rendered (for 
>> adaptive
>> bitrate cases). This doesn't necessarily change any of the metadata
>> currently exposed on the video element, but nevertheless it's information
>> that the application may need. It would be nice to expose some kind of
>> identifier for the currently rendered stream and have an event when this
>> changes. I think that a stream-format-supplied identifier would be
>> sufficient.
> 
> I don't know about the adaptive streaming situation. I think that is
> more about statistics/metrics rather than about change of resource.
> All the alternatives in an adaptive streaming "resource" should
> provide the same number of tracks and the same video dimensions, just
> at different bitrate/quality, no?
 
 I think of the different adaptive versions on a per-track basis (i.e. the
 alternatives are *within* each track), not a bunch of alternatives each of
 which contains several tracks. Both are possible, of course.
 
 It's certainly possible (indeed common) for different bitrate video
 encodings to have different resolutions - there are video encoding reasons
 to do this. Of course the aspect ratio should not change and nor should the
 dimensions on the screen (both would be a little peculiar for the user).
 
 Now, the videoWidth and videoHeight attributes of HTMLVideoElement are not
 the same as the resolution (for a start, they are in CSS pixels, which are
 square), but I think it quite likely that if the resolution of the video
 changes then the videoWidth and videoHeight might change. I'd be interested
 to hear how existing implementations relate resolution to videoWidth and
 videoHeight.
>>> 
>>> Well, if videoWidth and videoHeight change and no dimensions on the
>>> video are provided through CSS, then surely the video will change size
>>> and the display will shrink. That would be a terrible user experience.
>>> For that reason I would suggest that such a change not be made in
>>> alternative adaptive streams.
>> 
>> That seems backwards to me! I would say "For that reason I would suggest 
>> that dimensions are provided through CSS or through the width and height 
>> attributes."
>> 
>> Alternatively, we change the specification of the video element to 
>> accommodate this aspect of adaptive streaming (for example, the videoWidth 
>> and videoHeight could be defined to be based on the highest resolution 
>> bitrate being considered.)

Re: [whatwg] Video feedback

2011-06-20 Thread Silvia Pfeiffer
On Tue, Jun 21, 2011 at 12:07 AM, Mark Watson  wrote:
>
> On Jun 20, 2011, at 11:52 AM, Silvia Pfeiffer wrote:
>
>> On Mon, Jun 20, 2011 at 7:31 PM, Mark Watson  wrote:

> The TrackList object has an onchanged event, which I assumed would fire 
> when
> any of the information in the TrackList changes (e.g. tracks added or
> removed). But actually the spec doesn't state when this event fires (as 
> far
> as I could tell - unless it is implied by some general definition of 
> events
> called onchanged).
>
> Should there be some clarification here ?

 I understood that to relate to a change of cues only, since it is on
 the tracklist. I.e. it's an aggregate event from the oncuechange event
 of a cue inside the track. I didn't think it would relate to a change
 of existence of that track.

 Note that the event is attached to the TrackList, not the TrackList[],
 so it cannot be raised when a track is added or removed, only when
 something inside the TrackList changes.
>>>
>>> Are we talking about the same thing ? There is no TrackList array and
>>> TrackList is only used for audio/video, not text, so I don't understand the
>>> comment about cues.
>>> I'm talking
>>> about 
>>> http://www.whatwg.org/specs/web-apps/current-work/multipage/the-iframe-element.html#tracklist
>>>  which
>>> is the base class for MultipleTrackList and ExclusiveTrackList used to
>>> represent all the audio and video tracks (respectively). One instance of the
>>> object represents all the tracks, so I would assume that a change in the
>>> number of tracks is a change to this object.
>>
>> Ah yes, you're right: I got confused.
>>
>> It says "Whenever the selected track is changed, the user agent must
>> queue a task to fire a simple event named change at the
>> MultipleTrackList object." This means it fires when the selectedIndex
>> is changed, i.e. the user chooses a different track for rendering. I
>> still don't think it relates to changes in the composition of tracks
>> of a resource. That should be something different and should probably
>> be on the MediaElement and not on the track list to also cover changes
>> in text tracks.
>
> Fair enough.
>
>>
>>
> Also, as Eric (C) pointed out, one of the things which can change is which
> of several available versions of the content is being rendered (for 
> adaptive
> bitrate cases). This doesn't necessarily change any of the metadata
> currently exposed on the video element, but nevertheless it's information
> that the application may need. It would be nice to expose some kind of
> identifier for the currently rendered stream and have an event when this
> changes. I think that a stream-format-supplied identifier would be
> sufficient.

 I don't know about the adaptive streaming situation. I think that is
 more about statistics/metrics rather than about change of resource.
 All the alternatives in an adaptive streaming "resource" should
 provide the same number of tracks and the same video dimensions, just
 at different bitrate/quality, no?
>>>
>>> I think of the different adaptive versions on a per-track basis (i.e. the
>>> alternatives are *within* each track), not a bunch of alternatives each of
>>> which contains several tracks. Both are possible, of course.
>>>
>>> It's certainly possible (indeed common) for different bitrate video
>>> encodings to have different resolutions - there are video encoding reasons
>>> to do this. Of course the aspect ratio should not change and nor should the
>>> dimensions on the screen (both would be a little peculiar for the user).
>>>
>>> Now, the videoWidth and videoHeight attributes of HTMLVideoElement are not
>>> the same as the resolution (for a start, they are in CSS pixels, which are
>>> square), but I think it quite likely that if the resolution of the video
>>> changes then the videoWidth and videoHeight might change. I'd be interested
>>> to hear how existing implementations relate resolution to videoWidth and
>>> videoHeight.
>>
>> Well, if videoWidth and videoHeight change and no dimensions on the
>> video are provided through CSS, then surely the video will change size
>> and the display will shrink. That would be a terrible user experience.
>> For that reason I would suggest that such a change not be made in
>> alternative adaptive streams.
>
> That seems backwards to me! I would say "For that reason I would suggest that 
> dimensions are provided through CSS or through the width and height 
> attributes."
>
> Alternatively, we change the specification of the video element to 
> accommodate this aspect of adaptive streaming (for example, the videoWidth 
> and videoHeight could be defined to be based on the highest resolution 
> bitrate being considered.)
>
> There are good video encoding reasons for different bitrates to be encoded at 
> different resolutions which are far more important than any reasons not to 
> do either of the above.

Re: [whatwg] Video feedback

2011-06-20 Thread Mark Watson

On Jun 20, 2011, at 11:52 AM, Silvia Pfeiffer wrote:

> On Mon, Jun 20, 2011 at 7:31 PM, Mark Watson  wrote:
>>> 
 The TrackList object has an onchanged event, which I assumed would fire 
 when
 any of the information in the TrackList changes (e.g. tracks added or
 removed). But actually the spec doesn't state when this event fires (as far
 as I could tell - unless it is implied by some general definition of events
 called onchanged).
 
 Should there be some clarification here ?
>>> 
>>> I understood that to relate to a change of cues only, since it is on
>>> the tracklist. I.e. it's an aggregate event from the oncuechange event
>>> of a cue inside the track. I didn't think it would relate to a change
>>> of existence of that track.
>>> 
>>> Note that the event is attached to the TrackList, not the TrackList[],
>>> so it cannot be raised when a track is added or removed, only when
>>> something inside the TrackList changes.
>> 
>> Are we talking about the same thing ? There is no TrackList array and
>> TrackList is only used for audio/video, not text, so I don't understand the
>> comment about cues.
>> I'm talking
>> about 
>> http://www.whatwg.org/specs/web-apps/current-work/multipage/the-iframe-element.html#tracklist
>>  which
>> is the base class for MultipleTrackList and ExclusiveTrackList used to
>> represent all the audio and video tracks (respectively). One instance of the
>> object represents all the tracks, so I would assume that a change in the
>> number of tracks is a change to this object.
> 
> Ah yes, you're right: I got confused.
> 
> It says "Whenever the selected track is changed, the user agent must
> queue a task to fire a simple event named change at the
> MultipleTrackList object." This means it fires when the selectedIndex
> is changed, i.e. the user chooses a different track for rendering. I
> still don't think it relates to changes in the composition of tracks
> of a resource. That should be something different and should probably
> be on the MediaElement and not on the track list to also cover changes
> in text tracks.

Fair enough.

> 
> 
 Also, as Eric (C) pointed out, one of the things which can change is which
 of several available versions of the content is being rendered (for 
 adaptive
 bitrate cases). This doesn't necessarily change any of the metadata
 currently exposed on the video element, but nevertheless it's information
 that the application may need. It would be nice to expose some kind of
 identifier for the currently rendered stream and have an event when this
 changes. I think that a stream-format-supplied identifier would be
 sufficient.
>>> 
>>> I don't know about the adaptive streaming situation. I think that is
>>> more about statistics/metrics rather than about change of resource.
>>> All the alternatives in an adaptive streaming "resource" should
>>> provide the same number of tracks and the same video dimensions, just
>>> at different bitrate/quality, no?
>> 
>> I think of the different adaptive versions on a per-track basis (i.e. the
>> alternatives are *within* each track), not a bunch of alternatives each of
>> which contains several tracks. Both are possible, of course.
>> 
>> It's certainly possible (indeed common) for different bitrate video
>> encodings to have different resolutions - there are video encoding reasons
>> to do this. Of course the aspect ratio should not change and nor should the
>> dimensions on the screen (both would be a little peculiar for the user).
>> 
>> Now, the videoWidth and videoHeight attributes of HTMLVideoElement are not
>> the same as the resolution (for a start, they are in CSS pixels, which are
>> square), but I think it quite likely that if the resolution of the video
>> changes than the videoWidth and videoHeight might change. I'd be interested
>> to hear how existing implementations relate resolution to videoWidth and
>> videoHeight.
> 
> Well, if videoWidth and videoHeight change and no dimensions on the
> video are provided through CSS, then surely the video will change size
> and the display will shrink. That would be a terrible user experience.
> For that reason I would suggest that such a change not be made in
> alternative adaptive streams.

That seems backwards to me! I would say "For that reason I would suggest that 
dimensions are provided through CSS or through the width and height attributes."
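
A minimal sketch of that approach (the sizes are placeholders):

  var video = document.querySelector('video');
  // Pin the rendered size up front, so a mid-stream change in the
  // intrinsic resolution (videoWidth/videoHeight) cannot resize the page.
  video.style.width = '640px';
  video.style.height = '360px';
  video.addEventListener('loadedmetadata', function () {
    console.log('intrinsic size:', video.videoWidth, 'x', video.videoHeight);
  });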

Alternatively, we change the specification of the video element to accommodate 
this aspect of adaptive streaming (for example, the videoWidth and videoHeight 
could be defined to be based on the highest resolution bitrate being 
considered.)

There are good video encoding reasons for different bitrates to be encoded at 
different resolutions which are far more important than any reasons not to do 
either of the above.

> 
> 
>>> Different video dimensions should be
>>> provided through the <source> element and @media attribute, but within
>>> an adaptive stream, the alternatives should be consistent because the
>>> target device won't change. I guess this is a discussion for another
>>> thread... :-)

Re: [whatwg] Video feedback

2011-06-20 Thread Silvia Pfeiffer
On Mon, Jun 20, 2011 at 7:31 PM, Mark Watson  wrote:
>>
>>> The TrackList object has an onchanged event, which I assumed would fire when
>>> any of the information in the TrackList changes (e.g. tracks added or
>>> removed). But actually the spec doesn't state when this event fires (as far
>>> as I could tell - unless it is implied by some general definition of events
>>> called onchanged).
>>>
>>> Should there be some clarification here ?
>>
>> I understood that to relate to a change of cues only, since it is on
>> the tracklist. I.e. it's an aggregate event from the oncuechange event
>> of a cue inside the track. I didn't think it would relate to a change
>> of existence of that track.
>>
>> Note that the event is attached to the TrackList, not the TrackList[],
>> so it cannot be raised when a track is added or removed, only when
>> something inside the TrackList changes.
>
> Are we talking about the same thing ? There is no TrackList array and
> TrackList is only used for audio/video, not text, so I don't understand the
> comment about cues.
> I'm talking
> about http://www.whatwg.org/specs/web-apps/current-work/multipage/the-iframe-element.html#tracklist which
> is the base class for MultipleTrackList and ExclusiveTrackList used to
> represent all the audio and video tracks (respectively). One instance of the
> object represents all the tracks, so I would assume that a change in the
> number of tracks is a change to this object.

Ah yes, you're right: I got confused.

It says "Whenever the selected track is changed, the user agent must
queue a task to fire a simple event named change at the
MultipleTrackList object." This means it fires when the selectedIndex
is changed, i.e. the user chooses a different track for rendering. I
still don't think it relates to changes in the composition of tracks
of a resource. That should be something different and should probably
be on the MediaElement and not on the track list to also cover changes
in text tracks.
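
In code, the behaviour that quoted sentence does define would look roughly
like this (a sketch; the TrackList interfaces are still draft):

  var video = document.querySelector('video');
  // Per the quoted draft text, 'change' fires when the selection
  // changes, not when tracks are added to or removed from the resource.
  video.audioTracks.addEventListener('change', function () {
    console.log('selected track index:', video.audioTracks.selectedIndex);
  });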


>>> Also, as Eric (C) pointed out, one of the things which can change is which
>>> of several available versions of the content is being rendered (for adaptive
>>> bitrate cases). This doesn't necessarily change any of the metadata
>>> currently exposed on the video element, but nevertheless it's information
>>> that the application may need. It would be nice to expose some kind of
>>> identifier for the currently rendered stream and have an event when this
>>> changes. I think that a stream-format-supplied identifier would be
>>> sufficient.
>>
>> I don't know about the adaptive streaming situation. I think that is
>> more about statistics/metrics rather than about change of resource.
>> All the alternatives in an adaptive streaming "resource" should
>> provide the same number of tracks and the same video dimensions, just
>> at different bitrate/quality, no?
>
> I think of the different adaptive versions on a per-track basis (i.e. the
> alternatives are *within* each track), not a bunch of alternatives each of
> which contains several tracks. Both are possible, of course.
>
> It's certainly possible (indeed common) for different bitrate video
> encodings to have different resolutions - there are video encoding reasons
> to do this. Of course the aspect ratio should not change and nor should the
> dimensions on the screen (both would be a little peculiar for the user).
>
> Now, the videoWidth and videoHeight attributes of HTMLVideoElement are not
> the same as the resolution (for a start, they are in CSS pixels, which are
> square), but I think it quite likely that if the resolution of the video
> changes then the videoWidth and videoHeight might change. I'd be interested
> to hear how existing implementations relate resolution to videoWidth and
> videoHeight.

Well, if videoWidth and videoHeight change and no dimensions on the
video are provided through CSS, then surely the video will change size
and the display will shrink. That would be a terrible user experience.
For that reason I would suggest that such a change not be made in
alternative adaptive streams.


>> Different video dimensions should be
>> provided through the <source> element and @media attribute, but within
>> an adaptive stream, the alternatives should be consistent because the
>> target device won't change. I guess this is a discussion for another
>> thread... :-)
>
> Possibly ;-) The device knows much better than the page author what
> capabilities it has and so what resolutions are suitable for the device. So
> it is better to provide all the alternatives as a single resource and have
> the device work out which subset it can support. Or at least, the list
> should be provided all at the same level - there is no rationale for a
> hierarchy of alternatives.

The way in which HTML deals with different devices and their different
capabilities is through media queries. As a author you provide your
content with different versions of media-dependent style sheets and
content, so that when you view th

Re: [whatwg] Video feedback

2011-06-20 Thread Mark Watson

On Jun 20, 2011, at 10:42 AM, Silvia Pfeiffer wrote:

On Mon, Jun 20, 2011 at 6:29 PM, Mark Watson  wrote:

On Jun 9, 2011, at 4:32 PM, Eric Carlson wrote:


On Jun 9, 2011, at 12:02 AM, Silvia Pfeiffer wrote:

On Thu, Jun 9, 2011 at 4:34 PM, Simon Pieters  wrote:
On Thu, 09 Jun 2011 03:47:49 +0200, Silvia Pfeiffer  wrote:

For commercial video providers, the tracks in a live stream change all
the time; this is not limited to audio and video tracks but would include
text tracks as well.

OK, all this indicates to me that we probably want a "metadatachanged"
event to indicate there has been a change and that JS may need to
check some of its assumptions.

We already have durationchange. Duration is metadata. If we want to support
changes to width/height, and the script is interested in when that happens,
maybe there should be a dimensionchange event (but what's the use case for
changing width/height mid-stream?). Does the spec support changes to text
tracks mid-stream?

It's not about what the spec supports, but what real-world streams provide.

I don't think it makes sense to put an event on every single type of
metadata that can change. Most of the time, when you have a stream
change, many variables will change together, so a single event is a
lot less events to raise. It's an event that signifies that the media
framework has reset the video/audio decoding pipeline and loaded a
whole bunch of new stuff. You should imagine it as a concatenation of
different media resources. And yes, they can have different track
constitution and different audio sampling rate (which the audio API
will care about) etc etc.

 In addition, it is possible for a stream to lose or gain an audio track. In 
this case the dimensions won't change but a script may want to react to the 
change in audioTracks.

The TrackList object has an onchanged event, which I assumed would fire when 
any of the information in the TrackList changes (e.g. tracks added or removed). 
But actually the spec doesn't state when this event fires (as far as I could 
tell - unless it is implied by some general definition of events called 
onchanged).

Should there be some clarification here ?

I understood that to relate to a change of cues only, since it is on
the tracklist. I.e. it's an aggregate event from the oncuechange event
of a cue inside the track. I didn't think it would relate to a change
of existence of that track.

Note that the event is attached to the TrackList, not the TrackList[],
so it cannot be raised when a track is added or removed, only when
something inside the TrackList changes.

Are we talking about the same thing ? There is no TrackList array and TrackList 
is only used for audio/video, not text, so I don't understand the comment about 
cues.

I'm talking about 
http://www.whatwg.org/specs/web-apps/current-work/multipage/the-iframe-element.html#tracklist
 which is the base class for MultipleTrackList and ExclusiveTrackList used to 
represent all the audio and video tracks (respectively). One instance of the 
object represents all the tracks, so I would assume that a change in the number 
of tracks is a change to this object.




 I agree with Silvia, a more generic "metadata changed" event makes more sense.

Yes, and it should support the case in which text tracks are added/removed too.

Yes, it needs to be an event on the MediaElement.


Also, as Eric (C) pointed out, one of the things which can change is which of 
several available versions of the content is being rendered (for adaptive bitrate 
cases). This doesn't necessarily change any of the metadata currently exposed 
on the video element, but nevertheless it's information that the application 
may need. It would be nice to expose some kind of identifier for the currently 
rendered stream and have an event when this changes. I think that a 
stream-format-supplied identifier would be sufficient.


I don't know about the adaptive streaming situation. I think that is
more about statistics/metrics rather than about change of resource.
All the alternatives in an adaptive streaming "resource" should
provide the same number of tracks and the same video dimensions, just
at different bitrate/quality, no?

I think of the different adaptive versions on a per-track basis (i.e. the 
alternatives are *within* each track), not a bunch of alternatives each of 
which contains several tracks. Both are possible, of course.

It's certainly possible (indeed common) for different bitrate video encodings 
to have different resolutions - there are video encoding reasons to do this. Of 
course the aspect ratio should not change and nor should the dimensions on the 
screen (both would be a little peculiar for the user).

Now, the videoWidth and videoHeight attributes of HTMLVideoElement are not the 
same as the resolution (for a start, they are in CSS pixels, which are square), 
but I think it quite likely that if the resolution of the video changes then 
the videoWidth and videoHeight might change. I'd be interested to hear how 
existing implementations relate resolution to videoWidth and videoHeight.

Re: [whatwg] Video feedback

2011-06-20 Thread Silvia Pfeiffer
On Mon, Jun 20, 2011 at 6:29 PM, Mark Watson  wrote:
>
> On Jun 9, 2011, at 4:32 PM, Eric Carlson wrote:
>
>>
>> On Jun 9, 2011, at 12:02 AM, Silvia Pfeiffer wrote:
>>
>>> On Thu, Jun 9, 2011 at 4:34 PM, Simon Pieters  wrote:
 On Thu, 09 Jun 2011 03:47:49 +0200, Silvia Pfeiffer
  wrote:

>> For commercial video providers, the tracks in a live stream change all
>> the time; this is not limited to audio and video tracks but would include
>> text tracks as well.
>
> OK, all this indicates to me that we probably want a "metadatachanged"
> event to indicate there has been a change and that JS may need to
> check some of its assumptions.

 We already have durationchange. Duration is metadata. If we want to support
 changes to width/height, and the script is interested in when that happens,
 maybe there should be a dimensionchange event (but what's the use case for
 changing width/height mid-stream?). Does the spec support changes to text
 tracks mid-stream?
>>>
>>> It's not about what the spec supports, but what real-world streams provide.
>>>
>>> I don't think it makes sense to put an event on every single type of
>>> metadata that can change. Most of the time, when you have a stream
>>> change, many variables will change together, so a single event is a
>>> lot less events to raise. It's an event that signifies that the media
>>> framework has reset the video/audio decoding pipeline and loaded a
>>> whole bunch of new stuff. You should imagine it as a concatenation of
>>> different media resources. And yes, they can have different track
>>> constitution and different audio sampling rate (which the audio API
>>> will care about) etc etc.
>>>
>>  In addition, it is possible for a stream to lose or gain an audio track. In 
>> this case the dimensions won't change but a script may want to react to the 
>> change in audioTracks.
>
> The TrackList object has an onchanged event, which I assumed would fire when 
> any of the information in the TrackList changes (e.g. tracks added or 
> removed). But actually the spec doesn't state when this event fires (as far 
> as I could tell - unless it is implied by some general definition of events 
> called onchanged).
>
> Should there be some clarification here ?

I understood that to relate to a change of cues only, since it is on
the tracklist. I.e. it's an aggregate event from the oncuechange event
of a cue inside the track. I didn't think it would relate to a change
of existence of that track.

Note that the event is attached to the TrackList, not the TrackList[],
so it cannot be raised when a track is added or removed, only when
something inside the TrackList changes.


>>  I agree with Silvia, a more generic "metadata changed" event makes more 
>> sense.
>
> Yes, and it should support the case in which text tracks are added/removed 
> too.

Yes, it needs to be an event on the MediaElement.


> Also, as Eric (C) pointed out, one of the things which can change is which of 
> several available versions of the content is being rendered (for adaptive 
> bitrate cases). This doesn't necessarily change any of the metadata currently 
> exposed on the video element, but nevertheless it's information that the 
> application may need. It would be nice to expose some kind of identifier for 
> the currently rendered stream and have an event when this changes. I think 
> that a stream-format-supplied identifier would be sufficient.


I don't know about the adaptive streaming situation. I think that is
more about statistics/metrics rather than about change of resource.
All the alternatives in an adaptive streaming "resource" should
provide the same number of tracks and the same video dimensions, just
at different bitrate/quality, no? Different video dimensions should be
> provided through the <source> element and @media attribute, but within
an adaptive stream, the alternatives should be consistent because the
target device won't change. I guess this is a discussion for another
thread... :-)
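
For instance, the same selection can be made from script with media
queries (a sketch; the URLs are placeholders):

  var video = document.querySelector('video');
  // Pick an alternative suited to the device, in the spirit of
  // <source media="...">; the adaptive alternatives *within* the
  // chosen resource should then stay consistent.
  video.src = window.matchMedia('(max-device-width: 480px)').matches
      ? 'movie-small.webm'
      : 'movie-large.webm';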

Cheers,
Silvia.


Re: [whatwg] Video feedback

2011-06-20 Thread Mark Watson

On Jun 9, 2011, at 4:32 PM, Eric Carlson wrote:

> 
> On Jun 9, 2011, at 12:02 AM, Silvia Pfeiffer wrote:
> 
>> On Thu, Jun 9, 2011 at 4:34 PM, Simon Pieters  wrote:
>>> On Thu, 09 Jun 2011 03:47:49 +0200, Silvia Pfeiffer
>>>  wrote:
>>> 
> For commercial video providers, the tracks in a live stream change all
> the time; this is not limited to audio and video tracks but would include
> text tracks as well.
 
 OK, all this indicates to me that we probably want a "metadatachanged"
 event to indicate there has been a change and that JS may need to
 check some of its assumptions.
>>> 
>>> We already have durationchange. Duration is metadata. If we want to support
>>> changes to width/height, and the script is interested in when that happens,
>>> maybe there should be a dimensionchange event (but what's the use case for
>>> changing width/height mid-stream?). Does the spec support changes to text
>>> tracks mid-stream?
>> 
>> It's not about what the spec supports, but what real-world streams provide.
>> 
>> I don't think it makes sense to put an event on every single type of
>> metadata that can change. Most of the time, when you have a stream
>> change, many variables will change together, so a single event is a
>> lot less events to raise. It's an event that signifies that the media
>> framework has reset the video/audio decoding pipeline and loaded a
>> whole bunch of new stuff. You should imagine it as a concatenation of
>> different media resources. And yes, they can have different track
>> constitution and different audio sampling rate (which the audio API
>> will care about) etc etc.
>> 
>  In addition, it is possible for a stream to lose or gain an audio track. In 
> this case the dimensions won't change but a script may want to react to the 
> change in audioTracks. 

The TrackList object has an onchanged event, which I assumed would fire when 
any of the information in the TrackList changes (e.g. tracks added or removed). 
But actually the spec doesn't state when this event fires (as far as I could 
tell - unless it is implied by some general definition of events called 
onchanged).

Should there be some clarification here ?

> 
>  I agree with Silvia, a more generic "metadata changed" event makes more 
> sense. 

Yes, and it should support the case in which text tracks are added/removed too.

Also, as Eric (C) pointed out, one of the things which can change is which of 
several available versions of the content is being rendered (for adaptive 
bitrate cases). This doesn't necessarily change any of the metadata currently 
exposed on the video element, but nevertheless it's information that the 
application may need. It would be nice to expose some kind of identifier for 
the currently rendered stream and have an event when this changes. I think that 
a stream-format-supplied identifier would be sufficient.
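
As a sketch of the idea (both the attribute and the event name here are
made up for illustration; nothing like them exists in the spec yet):

  var video = document.querySelector('video');
  // Hypothetical: fires when the rendered variant changes, exposing a
  // stream-format-supplied identifier for the current variant.
  video.addEventListener('streamchange', function () {
    console.log('now rendering variant:', video.currentStreamId);
  });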

...Mark

> 
> eric
> 
> 



Re: [whatwg] Video feedback

2011-06-09 Thread Eric Carlson

On Jun 9, 2011, at 12:02 AM, Silvia Pfeiffer wrote:

> On Thu, Jun 9, 2011 at 4:34 PM, Simon Pieters  wrote:
>> On Thu, 09 Jun 2011 03:47:49 +0200, Silvia Pfeiffer
>>  wrote:
>> 
 For commercial video providers, the tracks in a live stream change all
 the time; this is not limited to audio and video tracks but would include
 text tracks as well.
>>> 
>>> OK, all this indicates to me that we probably want a "metadatachanged"
>>> event to indicate there has been a change and that JS may need to
>>> check some of its assumptions.
>> 
>> We already have durationchange. Duration is metadata. If we want to support
>> changes to width/height, and the script is interested in when that happens,
>> maybe there should be a dimensionchange event (but what's the use case for
>> changing width/height mid-stream?). Does the spec support changes to text
>> tracks mid-stream?
> 
> It's not about what the spec supports, but what real-world streams provide.
> 
> I don't think it makes sense to put an event on every single type of
> metadata that can change. Most of the time, when you have a stream
> change, many variables will change together, so a single event is a
> lot less events to raise. It's an event that signifies that the media
> framework has reset the video/audio decoding pipeline and loaded a
> whole bunch of new stuff. You should imagine it as a concatenation of
> different media resources. And yes, they can have different track
> constitution and different audio sampling rate (which the audio API
> will care about) etc etc.
> 
  In addition, it is possible for a stream to lose or gain an audio track. In 
this case the dimensions won't change but a script may want to react to the 
change in audioTracks. 
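
For example, a script might want to do something like the following (a
sketch; whether the track list fires addtrack/removetrack events, and
where, is exactly what is unsettled):

  var tracks = document.querySelector('video').audioTracks;
  function logTracks() {
    console.log('audio tracks now:', tracks.length);
  }
  tracks.addEventListener('addtrack', logTracks);
  tracks.addEventListener('removetrack', logTracks);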

  I agree with Silvia, a more generic "metadata changed" event makes more 
sense. 

eric



Re: [whatwg] Video feedback

2011-06-09 Thread Silvia Pfeiffer
On Thu, Jun 9, 2011 at 4:34 PM, Simon Pieters  wrote:
> On Thu, 09 Jun 2011 03:47:49 +0200, Silvia Pfeiffer
>  wrote:
>
>>> For commercial video providers, the tracks in a live stream change all
>>> the time; this is not limited to audio and video tracks but would include
>>> text tracks as well.
>>
>> OK, all this indicates to me that we probably want a "metadatachanged"
>> event to indicate there has been a change and that JS may need to
>> check some of its assumptions.
>
> We already have durationchange. Duration is metadata. If we want to support
> changes to width/height, and the script is interested in when that happens,
> maybe there should be a dimensionchange event (but what's the use case for
> changing width/height mid-stream?). Does the spec support changes to text
> tracks mid-stream?

It's not about what the spec supports, but what real-world streams provide.

I don't think it makes sense to put an event on every single type of
metadata that can change. Most of the time, when you have a stream
change, many variables will change together, so a single event is a
lot less events to raise. It's an event that signifies that the media
framework has reset the video/audio decoding pipeline and loaded a
whole bunch of new stuff. You should imagine it as a concatenation of
different media resources. And yes, they can have different track
constitution and different audio sampling rate (which the audio API
will care about) etc etc.

The durationchange event is a different type of event. It has little to
do with a change of media format; rather, it signals that more data is
available than previously expected. It is what allows streaming of long
video resources, even if they use just a single encoding setting. In
contrast, what we are talking about here is the encoding settings
changing mid-stream.
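
For comparison, durationchange is already straightforward to consume
today (this much is plain, specified behaviour):

  var video = document.querySelector('video');
  video.addEventListener('durationchange', function () {
    // Fires when the duration is first known or is re-estimated,
    // e.g. while streaming a long resource.
    console.log('duration is now', video.duration);
  });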

Cheers,
Silvia.


Re: [whatwg] Video feedback

2011-06-08 Thread Simon Pieters
On Thu, 09 Jun 2011 03:47:49 +0200, Silvia Pfeiffer  wrote:


For commercial video providers, the tracks in a live stream change all  
the time; this is not limited to audio and video tracks but would  
include text tracks as well.


OK, all this indicates to me that we probably want a "metadatachanged"
event to indicate there has been a change and that JS may need to
check some of its assumptions.


We already have durationchange. Duration is metadata. If we want to  
support changes to width/height, and the script is interested in when that  
happens, maybe there should be a dimensionchange event (but what's the use  
case for changing width/height mid-stream?). Does the spec support changes  
to text tracks mid-stream?


--
Simon Pieters
Opera Software


Re: [whatwg] Video feedback

2011-06-08 Thread Silvia Pfeiffer
On Thu, Jun 9, 2011 at 1:57 AM, Bob Lund  wrote:
>
>
>> -----Original Message-----
>> From: whatwg-boun...@lists.whatwg.org [mailto:whatwg-
>> boun...@lists.whatwg.org] On Behalf Of Eric Carlson
>> Sent: Wednesday, June 08, 2011 9:34 AM
>> To: Silvia Pfeiffer; Philip Jägenstedt
>> Cc: whatwg@lists.whatwg.org
>> Subject: Re: [whatwg] Video feedback
>>
>>
>> On Jun 8, 2011, at 3:35 AM, Silvia Pfeiffer wrote:
>>
>> >> Nothing exposed via the current API would change, AFAICT.
>> >
>> > Thus, after a change mid-stream to, say,  a smaller video width and
>> > height, would the video.videoWidth and video.videoHeight attributes
>> > represent the width and height of the previous stream or the current
>> > one?
>> >
>> >
>> >> I agree that if we
>> >> start exposing things like sampling rate or want to support arbitrary
>> >> chained Ogg, then there is a problem.
>> >
>> > I think we already have a problem with width and height for chained
>> > Ogg and we cannot stop people from putting chained Ogg into the @src.
>> >
> > > I actually took this discussion away from MPEG PMT, which is where
>> > Eric's question came from, because I don't understand how it works
>> > with MPEG. But I can see that it's not just a problem of MPEG, but
>> > also of Ogg (and possibly of WebM which can have multiple Segments).
>> > So, I think we need a generic solution for it.
>> >
>>   The characteristics of an Apple HTTP live stream can change on the
>> fly. For example if the user's bandwidth to the streaming server
>> changes, the video width and height can change as the stream resolution
>> is switched up or down, or the number of tracks can change when a stream
>> switches from video+audio to audio only. In addition, a server can
>> insert segments with different characteristics into a stream on the fly,
>> e.g. inserting an ad or emergency announcement.
>>
>>   It is not possible to predict these changes before they occur.
>>
>> eric
>
> For commercial video providers, the tracks in a live stream change all the 
> time; this is not limited to audio and video tracks but would include text 
> tracks as well.

OK, all this indicates to me that we probably want a "metadatachanged"
event to indicate there has been a change and that JS may need to
check some of its assumptions.

Silvia.


Re: [whatwg] Video feedback

2011-06-08 Thread Bob Lund


> -----Original Message-----
> From: whatwg-boun...@lists.whatwg.org [mailto:whatwg-
> boun...@lists.whatwg.org] On Behalf Of Eric Carlson
> Sent: Wednesday, June 08, 2011 9:34 AM
> To: Silvia Pfeiffer; Philip Jägenstedt
> Cc: whatwg@lists.whatwg.org
> Subject: Re: [whatwg] Video feedback
> 
> 
> On Jun 8, 2011, at 3:35 AM, Silvia Pfeiffer wrote:
> 
> >> Nothing exposed via the current API would change, AFAICT.
> >
> > Thus, after a change mid-stream to, say,  a smaller video width and
> > height, would the video.videoWidth and video.videoHeight attributes
> > represent the width and height of the previous stream or the current
> > one?
> >
> >
> >> I agree that if we
> >> start exposing things like sampling rate or want to support arbitrary
> >> chained Ogg, then there is a problem.
> >
> > I think we already have a problem with width and height for chained
> > Ogg and we cannot stop people from putting chained Ogg into the @src.
> >
> > I actually took this discussion away from MPEG PMT, which is where
> > Eric's question came from, because I don't understand how it works
> > with MPEG. But I can see that it's not just a problem of MPEG, but
> > also of Ogg (and possibly of WebM which can have multiple Segments).
> > So, I think we need a generic solution for it.
> >
>   The characteristics of an Apple HTTP live stream can change on the
> fly. For example if the user's bandwidth to the streaming server
> changes, the video width and height can change as the stream resolution
> is switched up or down, or the number of tracks can change when a stream
> switches from video+audio to audio only. In addition, a server can
> insert segments with different characteristics into a stream on the fly,
> e.g. inserting an ad or emergency announcement.
> 
>   It is not possible to predict these changes before they occur.
> 
> eric

For commercial video providers, the tracks in a live stream change all the 
time; this is not limited to audio and video tracks but would include text 
tracks as well. 

Bob Lund



Re: [whatwg] Video feedback

2011-06-08 Thread Eric Carlson

On Jun 8, 2011, at 3:35 AM, Silvia Pfeiffer wrote:

>> Nothing exposed via the current API would change, AFAICT.
> 
> Thus, after a change mid-stream to, say,  a smaller video width and
> height, would the video.videoWidth and video.videoHeight attributes
> represent the width and height of the previous stream or the current
> one?
> 
> 
>> I agree that if we
>> start exposing things like sampling rate or want to support arbitrary
>> chained Ogg, then there is a problem.
> 
> I think we already have a problem with width and height for chained
> Ogg and we cannot stop people from putting chained Ogg into the @src.
> 
> I actually took this discussion away from MPEG PMT, which is where
> Eric's question came from, because I don't understand how it works
> with MPEG. But I can see that it's not just a problem of MPEG, but
> also of Ogg (and possibly of WebM which can have multiple Segments).
> So, I think we need a generic solution for it.
> 
  The characteristics of an Apple HTTP live stream can change on the fly. For 
example if the user's bandwidth to the streaming server changes, the video 
width and height can change as the stream resolution is switched up or down, or 
the number of tracks can change when a stream switches from video+audio to 
audio only. In addition, a server can insert segments with different 
characteristics into a stream on the fly, e.g. inserting an ad or emergency 
announcement.

  It is not possible to predict these changes before they occur.
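
Which is why, without a dedicated event, a script today would be reduced
to polling (a sketch of the workaround this thread is trying to make
unnecessary):

  var video = document.querySelector('video');
  var lastW = 0, lastH = 0;
  setInterval(function () {
    if (video.videoWidth !== lastW || video.videoHeight !== lastH) {
      lastW = video.videoWidth;
      lastH = video.videoHeight;
      console.log('intrinsic size changed to', lastW + 'x' + lastH);
    }
  }, 500);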

eric



Re: [whatwg] Video feedback

2011-06-08 Thread Philip Jägenstedt
On Wed, 08 Jun 2011 13:38:18 +0200, Silvia Pfeiffer  wrote:


On Wed, Jun 8, 2011 at 9:18 PM, Philip Jägenstedt   
wrote:

On Wed, 08 Jun 2011 12:35:24 +0200, Silvia Pfeiffer
 wrote:


On Wed, Jun 8, 2011 at 6:14 PM, Philip Jägenstedt 
wrote:


On Wed, 08 Jun 2011 02:46:15 +0200, Silvia Pfeiffer
 wrote:

That is all correct. However, because it is a sequence of Ogg  
streams,

there are new Ogg headers in the middle. These new Ogg headers will
lead to new metadata loaded in the media framework - e.g. because the
new Ogg stream is encoded with a different audio sampling rate and a
different video width/height etc. So, therefore, the metadata in the
media framework changes. However, what the browser reports to the JS
developer doesn't change. Or if it does change, the JS developer is
not informed of it because it is a single infinite audio (or video)
stream. Thus the question whether we need a new "metadatachange"  
event

to expose this to the JS developer. It would then also signify that
potentially the number of tracks that are available may have changed
and other such information.


Nothing exposed via the current API would change, AFAICT.


Thus, after a change mid-stream to, say,  a smaller video width and
height, would the video.videoWidth and video.videoHeight attributes
represent the width and height of the previous stream or the current
one?



I agree that if we
start exposing things like sampling rate or want to support arbitrary
chained Ogg, then there is a problem.


I think we already have a problem with width and height for chained
Ogg and we cannot stop people from putting chained Ogg into the @src.

I actually took this discussion away from MPEG PMT, which is where
Eric's question came from, because I don't understand how it works
with MPEG. But I can see that it's not just a problem of MPEG, but
also of Ogg (and possibly of WebM which can have multiple Segments).
So, I think we need a generic solution for it.


OK, I don't think we disagree. I'm just saying that for Icecast audio
streams, there is no problem.


Hmm.. because there is nothing in the API that actually exposes audio  
metadata?


Yes.


As for Ogg and WebM, I'm inclined to say that we just shouldn't support
that, unless there's some compelling use case for it.


You know that you can also transmit video with icecast...?


Nope :) I guess that invalidates everything I've said about Icecast.  
Practically, though, no one is using Icecast to mix audio tracks with  
audio+video tracks and getting upset that it doesn't work in browsers,  
right?


--
Philip Jägenstedt
Core Developer
Opera Software


Re: [whatwg] Video feedback

2011-06-08 Thread Silvia Pfeiffer
On Wed, Jun 8, 2011 at 9:18 PM, Philip Jägenstedt  wrote:
> On Wed, 08 Jun 2011 12:35:24 +0200, Silvia Pfeiffer
>  wrote:
>
>> On Wed, Jun 8, 2011 at 6:14 PM, Philip Jägenstedt 
>> wrote:
>>>
>>> On Wed, 08 Jun 2011 02:46:15 +0200, Silvia Pfeiffer
>>>  wrote:
>>>
 That is all correct. However, because it is a sequence of Ogg streams,
 there are new Ogg headers in the middle. These new Ogg headers will
 lead to new metadata loaded in the media framework - e.g. because the
 new Ogg stream is encoded with a different audio sampling rate and a
 different video width/height etc. So, therefore, the metadata in the
 media framework changes. However, what the browser reports to the JS
 developer doesn't change. Or if it does change, the JS developer is
 not informed of it because it is a single infinite audio (or video)
 stream. Thus the question whether we need a new "metadatachange" event
 to expose this to the JS developer. It would then also signify that
 potentially the number of tracks that are available may have changed
 and other such information.
>>>
>>> Nothing exposed via the current API would change, AFAICT.
>>
>> Thus, after a change mid-stream to, say,  a smaller video width and
>> height, would the video.videoWidth and video.videoHeight attributes
>> represent the width and height of the previous stream or the current
>> one?
>>
>>
>>> I agree that if we
>>> start exposing things like sampling rate or want to support arbitrary
>>> chained Ogg, then there is a problem.
>>
>> I think we already have a problem with width and height for chained
>> Ogg and we cannot stop people from putting chained Ogg into the @src.
>>
>> I actually took this discussion away from MPEG PMT, which is where
>> Eric's question came from, because I don't understand how it works
>> with MPEG. But I can see that it's not just a problem of MPEG, but
>> also of Ogg (and possibly of WebM which can have multiple Segments).
>> So, I think we need a generic solution for it.
>
> OK, I don't think we disagree. I'm just saying that for Icecast audio
> streams, there is no problem.

Hmm.. because there is nothing in the API that actually exposes audio metadata?


> As for Ogg and WebM, I'm inclined to say that we just shouldn't support
> that, unless there's some compelling use case for it.

You know that you can also transmit video with icecast...?

Silvia.

> There's also the
> option of tweaking the muxers so that all the streams are known up-front,
> even if there won't be any data arriving for them until half-way through the
> file.
>
> I also know nothing about MPEG or the use cases involved, so no opinions
> there.
>
> --
> Philip Jägenstedt
> Core Developer
> Opera Software
>


Re: [whatwg] Video feedback

2011-06-08 Thread Philip Jägenstedt
On Wed, 08 Jun 2011 12:35:24 +0200, Silvia Pfeiffer  wrote:


On Wed, Jun 8, 2011 at 6:14 PM, Philip Jägenstedt   
wrote:

On Wed, 08 Jun 2011 02:46:15 +0200, Silvia Pfeiffer
 wrote:


That is all correct. However, because it is a sequence of Ogg streams,
there are new Ogg headers in the middle. These new Ogg headers will
lead to new metadata loaded in the media framework - e.g. because the
new Ogg stream is encoded with a different audio sampling rate and a
different video width/height etc. So, therefore, the metadata in the
media framework changes. However, what the browser reports to the JS
developer doesn't change. Or if it does change, the JS developer is
not informed of it because it is a single infinite audio (or video)
stream. Thus the question whether we need a new "metadatachange" event
to expose this to the JS developer. It would then also signify that
potentially the number of tracks that are available may have changed
and other such information.


Nothing exposed via the current API would change, AFAICT.


Thus, after a change mid-stream to, say,  a smaller video width and
height, would the video.videoWidth and video.videoHeight attributes
represent the width and height of the previous stream or the current
one?



I agree that if we
start exposing things like sampling rate or want to support arbitrary
chained Ogg, then there is a problem.


I think we already have a problem with width and height for chained
Ogg and we cannot stop people from putting chained Ogg into the @src.

I actually took this discussion away from MPEG PMT, which is where
Eric's question came from, because I don't understand how it works
with MPEG. But I can see that it's not just a problem of MPEG, but
also of Ogg (and possibly of WebM which can have multiple Segments).
So, I think we need a generic solution for it.


OK, I don't think we disagree. I'm just saying that for Icecast audio  
streams, there is no problem.


As for Ogg and WebM, I'm inclined to say that we just shouldn't support  
that, unless there's some compelling use case for it. There's also the  
option of tweaking the muxers so that all the streams are known up-front,  
even if there won't be any data arriving for them until half-way through  
the file.


I also know nothing about MPEG or the use cases involved, so no opinions  
there.


--
Philip Jägenstedt
Core Developer
Opera Software


Re: [whatwg] Video feedback

2011-06-08 Thread Silvia Pfeiffer
On Wed, Jun 8, 2011 at 6:14 PM, Philip Jägenstedt  wrote:
> On Wed, 08 Jun 2011 02:46:15 +0200, Silvia Pfeiffer wrote:
>
>> On Tue, Jun 7, 2011 at 7:04 PM, Philip Jägenstedt wrote:
>>>
>>> On Sat, 04 Jun 2011 03:39:58 +0200, Silvia Pfeiffer wrote:
>>>
>>>
 On Fri, Jun 3, 2011 at 9:28 AM, Ian Hickson  wrote:
>
> On Thu, 16 Dec 2010, Silvia Pfeiffer wrote:
>>
>> I do not know how technically the change of stream composition works in
>> MPEG, but in Ogg we have to end a current stream and start a new one to
>> switch compositions. This has been called "sequential multiplexing" or
>> "chaining". In this case, stream setup information is repeated, which
>> would probably lead to creating a new stream handler and possibly a new
>> firing of "loadedmetadata". I am not sure how chaining is implemented in
>> browsers.
>
> Per spec, chaining isn't currently supported. The closest thing I can find
> in the spec to this situation is handling a non-fatal error, which causes
> the unexpected content to be ignored.
>
>
> On Fri, 17 Dec 2010, Eric Winkelman wrote:
>>
>> The short answer for changing stream composition is that there is a
>> Program Map Table (PMT) that is repeated every 100 milliseconds and
>> describes the content of the stream.  Depending on the programming, the
>> stream's composition could change entering/exiting every advertisement.
>
> If this is something that browser vendors want to support, I can specify
> how to handle it. Anyone?

 Icecast streams have chained files, so streaming Ogg to an audio
 element would hit this problem. There is a bug in FF for this:
 https://bugzilla.mozilla.org/show_bug.cgi?id=455165 (and a duplicate
 bug at https://bugzilla.mozilla.org/show_bug.cgi?id=611519). There's
 also a webkit bug for icecast streaming, which is probably related
 https://bugs.webkit.org/show_bug.cgi?id=42750 . I'm not sure how Opera
 is able to deal with icecast streams, but it seems to deal with it.

 The thing is: you can implement playback and seeking without any
 further changes to the spec. But then the browser-internal metadata
 states will change depending on the chunk you're on. Should that also
 update the exposed metadata in the API then? Probably yes, because
 otherwise the JS developer may deal with contradictory information.
 Maybe we need a "metadatachange" event for this?
>>>
>>> An Icecast stream is conceptually just one infinite audio stream, even
>>> though at the container level it is several chained Ogg streams. duration
>>> will be Infinity and currentTime will be constantly increasing. This doesn't
>>> seem to be a case where any spec change is needed. Am I missing something?
>>
>>
>> That is all correct. However, because it is a sequence of Ogg streams,
>> there are new Ogg headers in the middle. These new Ogg headers will
>> lead to new metadata loaded in the media framework - e.g. because the
>> new Ogg stream is encoded with a different audio sampling rate and a
>> different video width/height etc. So, therefore, the metadata in the
>> media framework changes. However, what the browser reports to the JS
>> developer doesn't change. Or if it does change, the JS developer is
>> not informed of it because it is a single infinite audio (or video)
>> stream. Thus the question whether we need a new "metadatachange" event
>> to expose this to the JS developer. It would then also signify that
>> potentially the number of tracks that are available may have changed
>> and other such information.
>
> Nothing exposed via the current API would change, AFAICT.

Thus, after a change mid-stream to, say,  a smaller video width and
height, would the video.videoWidth and video.videoHeight attributes
represent the width and height of the previous stream or the current
one?


> I agree that if we
> start exposing things like sampling rate or want to support arbitrary
> chained Ogg, then there is a problem.

I think we already have a problem with width and height for chained
Ogg and we cannot stop people from putting chained Ogg into the @src.

I actually took this discussion away from MPEG PMT, which is where
Eric's question came from, because I don't understand how it works
with MPEG. But I can see that it's not just a problem of MPEG, but
also of Ogg (and possibly of WebM which can have multiple Segments).
So, I think we need a generic solution for it.
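
For illustration, a minimal sketch of how script might consume such an
event; the "metadatachange" event is hypothetical, everything else is
existing API:

  var video = document.querySelector('video');
  video.addEventListener('loadedmetadata', function () {
    console.log('initial: ' + video.videoWidth + 'x' + video.videoHeight);
  });
  // Hypothetical event, fired when a chained segment replaces the
  // stream's setup information mid-playback:
  video.addEventListener('metadatachange', function () {
    console.log('now: ' + video.videoWidth + 'x' + video.videoHeight);
    // Track lists may have changed too; re-inspect them here.
  });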

Cheers,
Silvia.


Re: [whatwg] Video feedback

2011-06-08 Thread Philip Jägenstedt
On Wed, 08 Jun 2011 02:46:15 +0200, Silvia Pfeiffer wrote:


On Tue, Jun 7, 2011 at 7:04 PM, Philip Jägenstedt wrote:

On Sat, 04 Jun 2011 03:39:58 +0200, Silvia Pfeiffer wrote:



On Fri, Jun 3, 2011 at 9:28 AM, Ian Hickson  wrote:


On Thu, 16 Dec 2010, Silvia Pfeiffer wrote:


I do not know how technically the change of stream composition works in
MPEG, but in Ogg we have to end a current stream and start a new one to
switch compositions. This has been called "sequential multiplexing" or
"chaining". In this case, stream setup information is repeated, which
would probably lead to creating a new stream handler and possibly a new
firing of "loadedmetadata". I am not sure how chaining is implemented in
browsers.


Per spec, chaining isn't currently supported. The closest thing I can find
in the spec to this situation is handling a non-fatal error, which causes
the unexpected content to be ignored.


On Fri, 17 Dec 2010, Eric Winkelman wrote:


The short answer for changing stream composition is that there is a
Program Map Table (PMT) that is repeated every 100 milliseconds and
describes the content of the stream.  Depending on the programming, the
stream's composition could change entering/exiting every advertisement.


If this is something that browser vendors want to support, I can specify
how to handle it. Anyone?


Icecast streams have chained files, so streaming Ogg to an audio
element would hit this problem. There is a bug in FF for this:
https://bugzilla.mozilla.org/show_bug.cgi?id=455165 (and a duplicate
bug at https://bugzilla.mozilla.org/show_bug.cgi?id=611519). There's
also a webkit bug for icecast streaming, which is probably related
https://bugs.webkit.org/show_bug.cgi?id=42750 . I'm not sure how Opera
is able to deal with icecast streams, but it seems to deal with it.

The thing is: you can implement playback and seeking without any
further changes to the spec. But then the browser-internal metadata
states will change depending on the chunk you're on. Should that also
update the exposed metadata in the API then? Probably yes, because
otherwise the JS developer may deal with contradictory information.
Maybe we need a "metadatachange" event for this?


An Icecast stream is conceptually just one infinite audio stream, even
though at the container level it is several chained Ogg streams. duration
will be Infinity and currentTime will be constantly increasing. This doesn't
seem to be a case where any spec change is needed. Am I missing something?



That is all correct. However, because it is a sequence of Ogg streams,
there are new Ogg headers in the middle. These new Ogg headers will
lead to new metadata loaded in the media framework - e.g. because the
new Ogg stream is encoded with a different audio sampling rate and a
different video width/height etc. So, therefore, the metadata in the
media framework changes. However, what the browser reports to the JS
developer doesn't change. Or if it does change, the JS developer is
not informed of it because it is a single infinite audio (or video)
stream. Thus the question whether we need a new "metadatachange" event
to expose this to the JS developer. It would then also signify that
potentially the number of tracks that are available may have changed
and other such information.


Nothing exposed via the current API would change, AFAICT. I agree that if  
we start exposing things like sampling rate or want to support arbitrary  
chained Ogg, then there is a problem.


--
Philip Jägenstedt
Core Developer
Opera Software


Re: [whatwg] Video feedback

2011-06-07 Thread Silvia Pfeiffer
On Tue, Jun 7, 2011 at 7:04 PM, Philip Jägenstedt  wrote:
> On Sat, 04 Jun 2011 03:39:58 +0200, Silvia Pfeiffer wrote:
>
>
>> On Fri, Jun 3, 2011 at 9:28 AM, Ian Hickson  wrote:
>>>
>>> On Thu, 16 Dec 2010, Silvia Pfeiffer wrote:

 I do not know how technically the change of stream composition works in
 MPEG, but in Ogg we have to end a current stream and start a new one to
 switch compositions. This has been called "sequential multiplexing" or
 "chaining". In this case, stream setup information is repeated, which
 would probably lead to creating a new stream handler and possibly a new
 firing of "loadedmetadata". I am not sure how chaining is implemented in
 browsers.
>>>
>>> Per spec, chaining isn't currently supported. The closest thing I can
>>> find
>>> in the spec to this situation is handling a non-fatal error, which causes
>>> the unexpected content to be ignored.
>>>
>>>
>>> On Fri, 17 Dec 2010, Eric Winkelman wrote:

 The short answer for changing stream composition is that there is a
 Program Map Table (PMT) that is repeated every 100 milliseconds and
 describes the content of the stream.  Depending on the programming, the
 stream's composition could change entering/exiting every advertisement.
>>>
>>> If this is something that browser vendors want to support, I can specify
>>> how to handle it. Anyone?
>>
>> Icecast streams have chained files, so streaming Ogg to an audio
>> element would hit this problem. There is a bug in FF for this:
>> https://bugzilla.mozilla.org/show_bug.cgi?id=455165 (and a duplicate
>> bug at https://bugzilla.mozilla.org/show_bug.cgi?id=611519). There's
>> also a webkit bug for icecast streaming, which is probably related
>> https://bugs.webkit.org/show_bug.cgi?id=42750 . I'm not sure how Opera
>> is able to deal with icecast streams, but it seems to deal with it.
>>
>> The thing is: you can implement playback and seeking without any
>> further changes to the spec. But then the browser-internal metadata
>> states will change depending on the chunk you're on. Should that also
>> update the exposed metadata in the API then? Probably yes, because
>> otherwise the JS developer may deal with contradictory information.
>> Maybe we need a "metadatachange" event for this?
>
> An Icecast stream is conceptually just one infinite audio stream, even
> though at the container level it is several chained Ogg streams. duration
> will be Infinity and currentTime will be constantly increasing. This doesn't
> seem to be a case where any spec change is needed. Am I missing something?


That is all correct. However, because it is a sequence of Ogg streams,
there are new Ogg headers in the middle. These new Ogg headers will
lead to new metadata loaded in the media framework - e.g. because the
new Ogg stream is encoded with a different audio sampling rate and a
different video width/height etc. So, therefore, the metadata in the
media framework changes. However, what the browser reports to the JS
developer doesn't change. Or if it does change, the JS developer is
not informed of it because it is a single infinite audio (or video)
stream. Thus the question whether we need a new "metadatachange" event
to expose this to the JS developer. It would then also signify that
potentially the number of tracks that are available may have changed
and other such information.

Hope that clarifies it.

Cheers,
Silvia.


Re: [whatwg] Video feedback

2011-06-07 Thread Philip Jägenstedt
On Sat, 04 Jun 2011 03:39:58 +0200, Silvia Pfeiffer wrote:




On Fri, Jun 3, 2011 at 9:28 AM, Ian Hickson  wrote:

On Thu, 16 Dec 2010, Silvia Pfeiffer wrote:


I do not know how technically the change of stream composition works in
MPEG, but in Ogg we have to end a current stream and start a new one to
switch compositions. This has been called "sequential multiplexing" or
"chaining". In this case, stream setup information is repeated, which
would probably lead to creating a new stream handler and possibly a new
firing of "loadedmetadata". I am not sure how chaining is implemented in
browsers.


Per spec, chaining isn't currently supported. The closest thing I can find
in the spec to this situation is handling a non-fatal error, which causes
the unexpected content to be ignored.


On Fri, 17 Dec 2010, Eric Winkelman wrote:


The short answer for changing stream composition is that there is a
Program Map Table (PMT) that is repeated every 100 milliseconds and
describes the content of the stream.  Depending on the programming, the
stream's composition could change entering/exiting every advertisement.


If this is something that browser vendors want to support, I can specify
how to handle it. Anyone?


Icecast streams have chained files, so streaming Ogg to an audio
element would hit this problem. There is a bug in FF for this:
https://bugzilla.mozilla.org/show_bug.cgi?id=455165 (and a duplicate
bug at https://bugzilla.mozilla.org/show_bug.cgi?id=611519). There's
also a webkit bug for icecast streaming, which is probably related
https://bugs.webkit.org/show_bug.cgi?id=42750 . I'm not sure how Opera
is able to deal with icecast streams, but it seems to deal with it.

The thing is: you can implement playback and seeking without any
further changes to the spec. But then the browser-internal metadata
states will change depending on the chunk you're on. Should that also
update the exposed metadata in the API then? Probably yes, because
otherwise the JS developer may deal with contradictory information.
Maybe we need a "metadatachange" event for this?


An Icecast stream is conceptually just one infinite audio stream, even  
though at the container level it is several chained Ogg streams. duration  
will be Infinity and currentTime will be constantly increasing. This  
doesn't seem to be a case where any spec change is needed. Am I missing  
something?
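
In script terms, using only what is already specified, an unbounded
stream is recognisable like this; the seek-bar reaction is just an
example, and '#seekbar' is an assumed page element:

  var audio = document.querySelector('audio');
  audio.addEventListener('loadedmetadata', function () {
    if (audio.duration === Infinity) {
      // Unbounded (e.g. Icecast) stream: currentTime keeps growing,
      // so hide the seeking UI rather than showing a broken bar.
      document.querySelector('#seekbar').hidden = true;
    }
  });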


--
Philip Jägenstedt
Core Developer
Opera Software


Re: [whatwg] Video feedback

2011-06-03 Thread Silvia Pfeiffer
I'll be replying to WebVTT related stuff in a separate thread. Here
just feedback on the other stuff.

(Incidentally: why is there  element feedback in here with
video? I don't really understand the connection.)



On Fri, Jun 3, 2011 at 9:28 AM, Ian Hickson  wrote:
> On Thu, 16 Dec 2010, Silvia Pfeiffer wrote:
>>
>> I do not know how technically the change of stream composition works in
>> MPEG, but in Ogg we have to end a current stream and start a new one to
>> switch compositions. This has been called "sequential multiplexing" or
>> "chaining". In this case, stream setup information is repeated, which
>> would probably lead to creating a new stream handler and possibly a new
>> firing of "loadedmetadata". I am not sure how chaining is implemented in
>> browsers.
>
> Per spec, chaining isn't currently supported. The closest thing I can find
> in the spec to this situation is handling a non-fatal error, which causes
> the unexpected content to be ignored.
>
>
> On Fri, 17 Dec 2010, Eric Winkelman wrote:
>>
>> The short answer for changing stream composition is that there is a
>> Program Map Table (PMT) that is repeated every 100 milliseconds and
>> describes the content of the stream.  Depending on the programming, the
>> stream's composition could change entering/exiting every advertisement.
>
> If this is something that browser vendors want to support, I can specify
> how to handle it. Anyone?

Icecast streams have chained files, so streaming Ogg to an audio
element would hit this problem. There is a bug in FF for this:
https://bugzilla.mozilla.org/show_bug.cgi?id=455165 (and a duplicate
bug at https://bugzilla.mozilla.org/show_bug.cgi?id=611519). There's
also a webkit bug for icecast streaming, which is probably related
https://bugs.webkit.org/show_bug.cgi?id=42750 . I'm not sure how Opera
is able to deal with icecast streams, but it seems to deal with it.

The thing is: you can implement playback and seeking without any
further changes to the spec. But then the browser-internal metadata
states will change depending on the chunk you're on. Should that also
update the exposed metadata in the API then? Probably yes, because
otherwise the JS developer may deal with contradictory information.
Maybe we need a "metadatachange" event for this?
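
One possible shape for this, sketched on the assumption that composition
changes (e.g. a new PMT, or a new chained Ogg header) would be mapped
onto the spec's audioTracks/videoTracks lists; the onaddtrack and
onremovetrack handlers are per the multi-track API in the spec, while
the in-band mapping is exactly what is not specified yet:

  var video = document.querySelector('video');
  video.audioTracks.onaddtrack = function (e) {
    console.log('in-band audio track appeared: ' + e.track.id);
  };
  video.videoTracks.onremovetrack = function (e) {
    console.log('in-band video track went away: ' + e.track.id);
  };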



> On Tue, 24 May 2011, Silvia Pfeiffer wrote:
>>
>> Ian and I had a brief conversation recently where I mentioned a problem
>> with extended text descriptions with screen readers (and worse still
>> with braille devices) and the suggestion was that the "paused for user
>> interaction" state of a media element may be the solution. I would like
>> to pick this up and discuss in detail how that would work to confirm my
>> sketchy understanding.
>>
>> *The use case:*
>>
>> In the specification for media elements we have a <track> kind of
>> "descriptions", which are:
>> "Textual descriptions of the video component of the media resource,
>> intended for audio synthesis when the visual component is unavailable
>> (e.g. because the user is interacting with the application without a
>> screen while driving, or because the user is blind). Synthesized as a
>> separate audio track."
>>
>> I'm for now assuming that the synthesis will be done through a screen
>> reader and not through the browser itself, thus making the
>> descriptions available to users as synthesized audio or as braille if
>> the screen reader is set up for a braille device.
>>
>> The textual descriptions are provided as chunks of text with a start
>> and a end time (so-called "cues"). The cues are processed during video
>> playback as the video's playback time starts to fall within the time
>> frame of the cue. Thus, it is expected the that cues are consumed
>> during the cue's time frame and are not present any more when the end
>> time of the cue is reached, so they don't conflict with the video's
>> normal audio.
>>
>> However, on many occasions, it is not possible to consume the cue text
>> in the given time frame. In particular not in the following
>> situations:
>>
>> 1. The screen reader takes longer to read out the cue text than the
>> cue's time frame provides for. This is particularly the case with long
>> cue text, but also when the screen reader's reading rate is slower
>> than what the author of the cue text expected.
>>
>> 2. The braille device is used for reading. Since reading braille is
>> much slower than listening to read-out text, the cue time frame will
>> invariably be too short.
>>
>> 3. The user seeked right into the middle of a cue and thus the time
>> frame that is available for reading out the cue text is shorter than
>> the cue author calculated with.
>>
>> Correct me if I'm wrong, but it seems that what we need is a way for
>> the screen reader to pause the video element from continuing to play
>> while the screen reader is still busy delivering the cue text. (In
>> a11y talk: what is required is a means to deal with "extended
>> descriptions", which extend the timeline of the video.) Onc

Re: [whatwg] Video feedback

2011-06-03 Thread Philip Jägenstedt

On Fri, 03 Jun 2011 01:28:45 +0200, Ian Hickson  wrote:


> On Fri, 22 Oct 2010, Simon Pieters wrote:


Actually it was me, but that's OK :)


> > There was also some discussion about metadata. Language is sometimes
> > necessary for the font engine to pick the right glyph.
>
> Could you elaborate on this? My assumption was that we'd just use CSS,
> which doesn't rely on language for this.

It's not in any spec that I'm aware of, but some browsers (including
Opera) pick different glyphs depending on the language of the text,
which really helps when rendering CJK when you have several CJK fonts on
the system. Browsers will already know the language from <track srclang>, so this would be for external players.


How is this problem solved in SRT players today?


Not at all, it seems. Both VLC and Totem allow setting the character  
encoding and font used for subtitles in the (global) preferences menu, so  
presumably you would change that if the default doesn't work. Font  
switching seems to mainly be an issue when your system has other default  
fonts than the text you're reading, and it appears that is rare enough  
that very little software does anything about it, browsers perhaps being  
an exception.





On Mon, 3 Jan 2011, Philip Jägenstedt wrote:


> > * The "bad cue" handling is stricter than it should be. After
> > collecting an id, the next line must be a timestamp line. Otherwise,
> > we skip everything until a blank line, so in the following the
> > parser would jump to "bad cue" on line "2" and skip the whole cue.
> >
> > 1
> > 2
> > 00:00:00.000 --> 00:00:01.000
> > Bla
> >
> > This doesn't match what most existing SRT parsers do, as they simply
> > look for timing lines and ignore everything else. If we really need
> > to collect the id instead of ignoring it like everyone else, this
> > should be more robust, so that a valid timing line always begins a
> > new cue. Personally, I'd prefer if it is simply ignored and that we
> > use some form of in-cue markup for styling hooks.
>
> The IDs are useful for referencing cues from script, so I haven't
> removed them. I've also left the parsing as is for when neither the
> first nor second line is a timing line, since that gives us a lot of
> headroom for future extensions (we can do anything so long as the
> second line doesn't start with a timestamp and "-->" and another
> timestamp).
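
For concreteness: that script-side referencing is the getCueById()
lookup, sketched here assuming a loaded <track> whose first cue carries
the id line "1":

  var track = document.querySelector('track').track;
  var cue = track.cues.getCueById('1');
  if (cue) console.log(cue.startTime, cue.text);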

In the case of feeding future extensions to current parsers, it's way
better fallback behavior to simply ignore the unrecognized second line
than to discard the entire cue. The current behavior seems unnecessarily
strict and makes the parser more complicated than it needs to be. My
preference is just ignore anything preceding the timing line, but even
if we must have IDs it can still be made simpler and more robust than
what is currently spec'ed.


If we just ignore content until we hit a line that happens to look like a
timing line, then we are much more constrained in what we can do in the
future. For example, we couldn't introduce a "comment block" syntax, since
any comment containing a timing line wouldn't be ignored. On the other
hand if we keep the syntax as it is now, we can introduce a comment block
just by having its first line include a "-->" but not have it match the
timestamp syntax, e.g. by having it be "--> COMMENT" or some such.


One of us must be confused, do you mean something like this?

1
--> COMMENT
00:00.000 --> 00:01.000
Cue text

Adding this syntax would break the *current* parser, as it would fail in  
step 39 (Collect WebVTT cue timings and settings) and then skip the rest  
of the cue. If we want any room for extensions along these lines, then  
multiple lines preceding the timing line must be handled gracefully.



Looking at the parser more closely, I don't really see how doing anything
more complex than skipping the block entirely would be simpler than what
we have now, anyway.


I suggest:

 * Step 31: Try to "collect WebVTT cue timings and settings" instead of  
checking for the substring "-->". If it succeeds, jump to what is now step  
40. If it fails, continue at what is now step 32. (This allows adding any  
syntax as long as it doesn't exactly match a timing line, including "-->  
COMMENT". As a bonus, one can fail faster when trying to parse an entire  
timing line rather than doing a substring search for "-->".)


 * Step 32: Only set the id line if it's not already set. (Assuming we  
want the first line to be the id line in future extensions.)


 * Step 39: Jump to the new step 31.

In case not every detail is correct, the idea is to first try to match a  
timing line and to take the first line that is not a timing line (if any)  
as the id, leaving everything in between open for future syntax changes,  
even if they use "-->".
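
A sketch of that order of operations; collectTimings() stands in for the
spec's "collect WebVTT cue timings and settings" algorithm and is assumed
to return null on failure:

  // A line that parses as a timing line always starts the cue proper;
  // the first non-timing line (if any) becomes the id; anything between
  // the id and the timing line is ignored, leaving room for future
  // syntax, even syntax containing "-->".
  function parseCueHeader(lines) {
    var id = null;
    for (var i = 0; i < lines.length; i++) {
      var timings = collectTimings(lines[i]);
      if (timings) return { id: id, timings: timings, body: i + 1 };
      if (id === null) id = lines[i];  // only set the id once
    }
    return null;  // no timing line at all: drop the block
  }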


I think it's fairly important that we handle this. A double id line is an
easy mistake to make when copying things around. Silently dropping those  
cues would be worse than what many existing (line-based, id-

Re: [whatwg] Video feedback

2011-06-02 Thread Tab Atkins Jr.
On Thu, Jun 2, 2011 at 7:58 PM, Glenn Maynard  wrote:
> The most straightforward solution would seem to be having @lang be a
> CSS property; I don't know the rationale for this being done by HTML
> instead.

The language of a block of text is a property of the content, not a
styling attribute.  It must be carried by the content itself.

As an interesting aside, the direction of a block of text is a
property of the content as well, but CSS has a 'direction' property.
We only added that because XML didn't define a generic @dir attribute,
so we needed *some* way for generic XML languages to specify the text
direction (in this case, by specifying their own direction-specifying
attribute and then providing a default stylesheet that sets
'direction' based on that).  If XML had specified xml:dir like they
did xml:lang, 'direction' wouldn't exist.  Similarly, if XML hadn't
specified xml:lang, we'd probably have a 'language' property.
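
Concretely, the split looks like this: the markup carries the language,
and any stylesheet (the page's or the user's) can key font choice off it
with the :lang() selector; the fonts are just the ones mentioned
elsewhere in this thread:

  /* markup: <p lang="en">Hello</p>  <p lang="ja">こんにちは</p> */
  :lang(en) { font-family: Tahoma, sans-serif; }
  :lang(ja) { font-family: "MS Gothic", sans-serif; }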

~TJ


Re: [whatwg] Video feedback

2011-06-02 Thread Glenn Maynard
On Thu, Jun 2, 2011 at 7:28 PM, Ian Hickson  wrote:
> We can add comments pretty easily (e.g. we could say that "<" starts a
> comment and ">" ends it -- that's already being ignored by the current
> parser), if people really need them. But are comments really that useful?
> Did SRT have problems due to not supporting inline comments? (Or did it
> support inline comments?)

I've only worked with SSA subtitles (fansubbing), where {text in
braces} effectively worked as a comment.  We used them a lot to
communicate between editors on a phrase-by-phrase basis.

But for that use case, using hidden spans makes more sense, since you
can toggle them on and off to view them inline, etc.

Given that, I'd be fine with a comment format that doesn't allow
mid-cue comments, if it makes the format simpler.

>> The text on the left is a transcription, the top is a transliteration,
>> and the bottom is a translation.
>
> Aren't these three separate text tracks?

They're all in the same track, in practice, since media players don't
play multiple subtitle tracks.

It's true that having them in separate tracks would be better, so they
can be disabled individually.  This is probably rare enough that it
should just be sorted out with scripts, at least to start.
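
A minimal sketch of that script-side sorting-out, assuming the page
ships the three layers as separate <track> elements; TextTrack.mode is
the specified switch:

  var tracks = document.querySelector('video').textTracks;
  for (var i = 0; i < tracks.length; i++) {
    // Unlike single-track media players, a page can show all layers at
    // once and let the user toggle each one individually:
    tracks[i].mode = 'showing';  // 'disabled' turns a layer off again
  }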

> It's not clear to me that we need language information to apply proper
> font selection and word wrapping, since CSS doesn't do it.

But it doesn't have to, since HTML does this with @lang.

> Mixing one CJK language with one non-CJK language seems fine. That should
> always work, assuming you specify good fonts in the CSS.

The font is ultimately in the user's control.  I tell Firefox to
always use Tahoma for Western text and MS Gothic for Japanese text,
ignoring the often ugly site-specified fonts.  The only control sites
have over my fonts is the language they say the text is (or which the
whole page is detected as).  The same principle seems to apply for
captions.

(That's not to say that it's important enough to add yet and I'm fine
with punting on this, at least for now.  I just don't think specifying
fonts is the right solution.)

The most straightforward solution would seem to be having @lang be a
CSS property; I don't know the rationale for this being done by HTML
instead.

> I don't understand why we can't have good typography for CJK and non-CJK
> together. Surely there are fonts that get both right?

I've never seen a Japanese font that didn't look terrible for English
text.  Also, I don't want my font selection to be severely limited due
to the need to use a single font for both languages, instead of using
the right font for the right text.

>> One example of how this can be tricky: at 0:17, a caption on the bottom
>> wraps and takes two lines, which then pushes the line at 0:19 upward
>> (that part's simple enough).  If instead the top part had appeared
>> first, the renderer would need to figure out in advance to push it
>> upwards, to make space for the two-line caption underneith it.
>> Otherwise, the captions would be forced to switch places.
>
> Right, without lookahead I don't know how you'd solve it. With lookahead
> things get pretty dicey pretty quickly.

The problem is that, at least here, the whole scene is nearly
incomprehensible if the top/bottom arrangement isn't maintained.
Lacking anything better, I suspect authors would use similar brittle
hacks with WebVTT.

Anyway, I don't have a simple solution either.

>> I think that, no matter what you do, people will insert line breaks in
>> cues.  I'd follow the HTML model here: convert newlines to spaces and
>> have a separate, explicit line break like <br> if needed, so people
>> don't manually line-break unless they actually mean to.
>
> The line-breaks-are-line-breaks feature is one of the features that
> originally made SRT seem like a good idea. It still seems like the neatest
> way of having a line break.

But does this matter?  Line breaks within a cue are relatively
uncommon in my experience (perhaps it's different for other
languages), compared to how many people will insert line breaks in a
text editor simply to break lines while authoring.  If you do this
while testing on a large monitor, it's likely to look reasonable when
rendered; the brokenness won't show up until it's played in a smaller
window.  Anyone using a non-programmer's text editor that doesn't
handle long lines cleanly is likely to do this.

Wrapping lines manually in SRTs also appears to be common (even
standard) practice, perhaps due to inadequate line wrapping in SRT
renderers.  Making line breaks explicit should help keep people from
translating this habit to WebVTT.
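
To illustrate the proposed model (not current WebVTT behaviour, in which
a raw newline already is a line break): under this suggestion a cue
written as

  00:00.000 --> 00:03.000
  This cue was wrapped in the author's
  text editor<br>but only breaks here.

would collapse the editor's newline to a space and break only at the
explicit <br>.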

>> Related to line breaking, should there be an &nbsp; escape?  Inserting
>> nbsp literally into files is somewhat annoying for authoring, since
>> they're indistinguishable from regular spaces.
>
> How common would &nbsp; be?

I guess the main cases I've used nbsp for don't apply so much to
captions, eg. © 2011 (likely to come at the start of a caption,
so not likely to be wrapped)