Re: [whatwg] Video feedback
On Thu, 7 Jul 2011, Eric Winkelman wrote:
> On Thursday, June 02 Ian Hickson wrote:
> > On Fri, 18 Mar 2011, Eric Winkelman wrote:
> > >
> > > For in-band metadata tracks, there is neither a standard way to represent the type of metadata in the HTMLTrackElement interface nor is there a standard way to represent multiple different types of metadata tracks.
> >
> > There can be a standard way. The idea is that all the types of metadata tracks that browsers will support should be specified so that all browsers can map them the same way. I'm happy to work with anyone interested in writing such a mapping spec, just let me know.
>
> I would be very interested in working on this spec.

It would probably be several specs, each focusing on a particular set of metadata in a particular format (e.g. advertising timings in an MPEG wrapper, or whatever).

> What's the next step?

First, research: what formats and metadata streams are you interested in? Who uses them? How are they implemented in producers and (more importantly) consumers today? What are the use cases?

Second, describe the problem: make a clear statement of purpose that scopes the effort, to provide guidelines and prevent feature creep.

Third, listen to implementors: find those that are interested in implementing this particular mapping of metadata to the DOM API, get their input, see what they want.

Fourth, implement: make, or have someone else make, an experimental implementation of a mapping that addresses the problem described in the earlier steps.

Fifth, specify: write a specification that describes the mapping scoped in step two, based on what you've researched in step one and on the feedback from steps three and four.

Sixth, test: update the experimental implementation to fit the spec, and get other implementations to implement the spec. Have real users play with it.

Seventh, simplify: remove what you don't need.
Finally, iterate: repeat all these steps for as long as there's any interest in this mapping, fixing problems, adding new features if they're needed, removing old features that didn't get used or implemented, etc.

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
Re: [whatwg] Video feedback
On Thursday, June 02 Ian Hickson wrote:
> On Fri, 18 Mar 2011, Eric Winkelman wrote:
> >
> > For in-band metadata tracks, there is neither a standard way to represent the type of metadata in the HTMLTrackElement interface nor is there a standard way to represent multiple different types of metadata tracks.
>
> There can be a standard way. The idea is that all the types of metadata tracks that browsers will support should be specified so that all browsers can map them the same way. I'm happy to work with anyone interested in writing such a mapping spec, just let me know.

I would be very interested in working on this spec. CableLabs works with numerous groups delivering content containing a variety of metadata, so we have a good idea what is currently used. We're also working with the groups defining adaptive bit rate delivery protocols about how metadata might be carried.

What's the next step?

Eric
Re: [whatwg] Video feedback
> -----Original Message-----
> From: whatwg-boun...@lists.whatwg.org [mailto:whatwg-boun...@lists.whatwg.org] On Behalf Of Mark Watson
> Sent: Monday, June 20, 2011 2:29 AM
> To: Eric Carlson
> Cc: Silvia Pfeiffer; whatwg Group; Simon Pieters
> Subject: Re: [whatwg] Video feedback
>
> On Jun 9, 2011, at 4:32 PM, Eric Carlson wrote:
>
> > On Jun 9, 2011, at 12:02 AM, Silvia Pfeiffer wrote:
> >
> >> On Thu, Jun 9, 2011 at 4:34 PM, Simon Pieters wrote:
> >>> On Thu, 09 Jun 2011 03:47:49 +0200, Silvia Pfeiffer wrote:
> >>>
> >>>>> For commercial video providers, the tracks in a live stream change all the time; this is not limited to audio and video tracks but would include text tracks as well.
> >>>>
> >>>> OK, all this indicates to me that we probably want a "metadatachanged" event to indicate there has been a change and that JS may need to check some of its assumptions.
> >>>
> >>> We already have durationchange. Duration is metadata. If we want to support changes to width/height, and the script is interested in when that happens, maybe there should be a dimensionchange event (but what's the use case for changing width/height mid-stream?). Does the spec support changes to text tracks mid-stream?
> >>
> >> It's not about what the spec supports, but what real-world streams provide.
> >>
> >> I don't think it makes sense to put an event on every single type of metadata that can change. Most of the time, when you have a stream change, many variables will change together, so a single event is a lot less events to raise. It's an event that signifies that the media framework has reset the video/audio decoding pipeline and loaded a whole bunch of new stuff. You should imagine it as a concatenation of different media resources. And yes, they can have different track constitution and different audio sampling rate (which the audio API will care about) etc etc.
> >>
> > In addition, it is possible for a stream to lose or gain an audio track. In this case the dimensions won't change but a script may want to react to the change in audioTracks.
>
> The TrackList object has an onchanged event, which I assumed would fire when any of the information in the TrackList changes (e.g. tracks added or removed). But actually the spec doesn't state when this event fires (as far as I could tell - unless it is implied by some general definition of events called onchanged).
>
> Should there be some clarification here ?
>
> > I agree with Silvia, a more generic "metadata changed" event makes more sense.
>
> Yes, and it should support the case in which text tracks are added/removed too.

Has there been a bug submitted to add a "metadata changed" event when video, audio or text tracks are added or deleted from a media resource?

Thanks,
Bob Lund

> Also, as Eric (C) pointed out, one of the things which can change is which of several available versions of the content is being rendered (for adaptive bitrate cases). This doesn't necessarily change any of the metadata currently exposed on the video element, but nevertheless it's information that the application may need. It would be nice to expose some kind of identifier for the currently rendered stream and have an event when this changes. I think that a stream-format-supplied identifier would be sufficient.
>
> ...Mark
>
> > eric
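The aggregate event discussed here could be handled on the script side with a snapshot-and-diff approach, so one event reports everything that changed instead of one event per field. A minimal sketch: the "metadatachanged" event name is the proposal under discussion, not a shipped API, and `snapshot`/`diffMetadata` are hypothetical helpers, not part of any spec.

```javascript
// Capture the metadata a script typically cares about from a media element
// (or any object with the same properties).
function snapshot(v) {
  return {
    duration: v.duration,
    videoWidth: v.videoWidth,
    videoHeight: v.videoHeight,
    audioTracks: v.audioTracks ? v.audioTracks.length : 0,
    textTracks: v.textTracks ? v.textTracks.length : 0
  };
}

// Return the list of keys whose values differ between two snapshots,
// so a single event can report "what changed" in one pass.
function diffMetadata(before, after) {
  var changed = [];
  for (var key in before) {
    if (before[key] !== after[key]) changed.push(key);
  }
  return changed;
}

// Wiring sketch (browser only; "metadatachanged" is hypothetical):
// var video = document.querySelector('video');
// var last = snapshot(video);
// video.addEventListener('metadatachanged', function () {
//   var now = snapshot(video);
//   console.log('changed:', diffMetadata(last, now));
//   last = now;
// });
```

The diff logic is pure, so it works unchanged whether the trigger is the proposed "metadatachanged" event or the existing per-field events such as durationchange.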
Re: [whatwg] Video feedback
On Jun 20, 2011, at 5:28 PM, Silvia Pfeiffer wrote:

> On Tue, Jun 21, 2011 at 12:07 AM, Mark Watson wrote:
>>
>> On Jun 20, 2011, at 11:52 AM, Silvia Pfeiffer wrote:
>>
>>> On Mon, Jun 20, 2011 at 7:31 PM, Mark Watson wrote:
>
>> The TrackList object has an onchanged event, which I assumed would fire when any of the information in the TrackList changes (e.g. tracks added or removed). But actually the spec doesn't state when this event fires (as far as I could tell - unless it is implied by some general definition of events called onchanged).
>>
>> Should there be some clarification here ?
>
> I understood that to relate to a change of cues only, since it is on the tracklist. I.e. it's an aggregate event from the oncuechange event of a cue inside the track. I didn't think it would relate to a change of existence of that track.
>
> Note that the event is attached to the TrackList, not the TrackList[], so it cannot be raised when a track is added or removed, only when something inside the TrackList changes.

Are we talking about the same thing ? There is no TrackList array and TrackList is only used for audio/video, not text, so I don't understand the comment about cues. I'm talking about http://www.whatwg.org/specs/web-apps/current-work/multipage/the-iframe-element.html#tracklist which is the base class for MultipleTrackList and ExclusiveTrackList used to represent all the audio and video tracks (respectively). One instance of the object represents all the tracks, so I would assume that a change in the number of tracks is a change to this object.

>>> Ah yes, you're right: I got confused.
>>>
>>> It says "Whenever the selected track is changed, the user agent must queue a task to fire a simple event named change at the MultipleTrackList object." This means it fires when the selectedIndex is changed, i.e. the user chooses a different track for rendering. I still don't think it relates to changes in the composition of tracks of a resource. That should be something different and should probably be on the MediaElement and not on the track list to also cover changes in text tracks.
>>
>> Fair enough.
>>
>> Also, as Eric (C) pointed out, one of the things which can change is which of several available versions of the content is being rendered (for adaptive bitrate cases). This doesn't necessarily change any of the metadata currently exposed on the video element, but nevertheless it's information that the application may need. It would be nice to expose some kind of identifier for the currently rendered stream and have an event when this changes. I think that a stream-format-supplied identifier would be sufficient.
>
> I don't know about the adaptive streaming situation. I think that is more about statistics/metrics rather than about change of resource. All the alternatives in an adaptive streaming "resource" should provide the same number of tracks and the same video dimensions, just at different bitrate/quality, no?

I think of the different adaptive versions on a per-track basis (i.e. the alternatives are *within* each track), not a bunch of alternatives each of which contains several tracks. Both are possible, of course.

It's certainly possible (indeed common) for different bitrate video encodings to have different resolutions - there are video encoding reasons to do this. Of course the aspect ratio should not change and nor should the dimensions on the screen (both would be a little peculiar for the user).

Now, the videoWidth and videoHeight attributes of HTMLVideoElement are not the same as the resolution (for a start, they are in CSS pixels, which are square), but I think it quite likely that if the resolution of the video changes then the videoWidth and videoHeight might change. I'd be interested to hear how existing implementations relate resolution to videoWidth and videoHeight.

>>> Well, if videoWidth and videoHeight change and no dimensions on the video are provided through CSS, then surely the video will change size and the display will shrink. That would be a terrible user experience. For that reason I would suggest that such a change not be made in alternative adaptive streams.
>>
>> That seems backwards to me! I would say "For that reason I would suggest that dimensions are provided through CSS or through the width and height attributes."
>>
>> Alternatively, we change the specification of the video element to accommodate this aspect of adaptive streaming (for example, the videoWidth and videoHeight could be defined to be based on the highest resolution bitrate being considered.)
Re: [whatwg] Video feedback
On Tue, Jun 21, 2011 at 12:07 AM, Mark Watson wrote:
>
> On Jun 20, 2011, at 11:52 AM, Silvia Pfeiffer wrote:
>
>> On Mon, Jun 20, 2011 at 7:31 PM, Mark Watson wrote:
>
> The TrackList object has an onchanged event, which I assumed would fire when any of the information in the TrackList changes (e.g. tracks added or removed). But actually the spec doesn't state when this event fires (as far as I could tell - unless it is implied by some general definition of events called onchanged).
>
> Should there be some clarification here ?

I understood that to relate to a change of cues only, since it is on the tracklist. I.e. it's an aggregate event from the oncuechange event of a cue inside the track. I didn't think it would relate to a change of existence of that track.

Note that the event is attached to the TrackList, not the TrackList[], so it cannot be raised when a track is added or removed, only when something inside the TrackList changes.

>>> Are we talking about the same thing ? There is no TrackList array and TrackList is only used for audio/video, not text, so I don't understand the comment about cues. I'm talking about http://www.whatwg.org/specs/web-apps/current-work/multipage/the-iframe-element.html#tracklist which is the base class for MultipleTrackList and ExclusiveTrackList used to represent all the audio and video tracks (respectively). One instance of the object represents all the tracks, so I would assume that a change in the number of tracks is a change to this object.
>>
>> Ah yes, you're right: I got confused.
>>
>> It says "Whenever the selected track is changed, the user agent must queue a task to fire a simple event named change at the MultipleTrackList object." This means it fires when the selectedIndex is changed, i.e. the user chooses a different track for rendering. I still don't think it relates to changes in the composition of tracks of a resource. That should be something different and should probably be on the MediaElement and not on the track list to also cover changes in text tracks.
>
> Fair enough.
>
> Also, as Eric (C) pointed out, one of the things which can change is which of several available versions of the content is being rendered (for adaptive bitrate cases). This doesn't necessarily change any of the metadata currently exposed on the video element, but nevertheless it's information that the application may need. It would be nice to expose some kind of identifier for the currently rendered stream and have an event when this changes. I think that a stream-format-supplied identifier would be sufficient.

I don't know about the adaptive streaming situation. I think that is more about statistics/metrics rather than about change of resource. All the alternatives in an adaptive streaming "resource" should provide the same number of tracks and the same video dimensions, just at different bitrate/quality, no?

>>> I think of the different adaptive versions on a per-track basis (i.e. the alternatives are *within* each track), not a bunch of alternatives each of which contains several tracks. Both are possible, of course.
>>>
>>> It's certainly possible (indeed common) for different bitrate video encodings to have different resolutions - there are video encoding reasons to do this. Of course the aspect ratio should not change and nor should the dimensions on the screen (both would be a little peculiar for the user).
>>>
>>> Now, the videoWidth and videoHeight attributes of HTMLVideoElement are not the same as the resolution (for a start, they are in CSS pixels, which are square), but I think it quite likely that if the resolution of the video changes then the videoWidth and videoHeight might change. I'd be interested to hear how existing implementations relate resolution to videoWidth and videoHeight.
>>
>> Well, if videoWidth and videoHeight change and no dimensions on the video are provided through CSS, then surely the video will change size and the display will shrink. That would be a terrible user experience. For that reason I would suggest that such a change not be made in alternative adaptive streams.
>
> That seems backwards to me! I would say "For that reason I would suggest that dimensions are provided through CSS or through the width and height attributes."
>
> Alternatively, we change the specification of the video element to accommodate this aspect of adaptive streaming (for example, the videoWidth and videoHeight could be defined to be based on the highest resolution bitrate being considered.)
>
> There are good video encoding reasons for different bitrates to be encoded at different resolutions which are far more important than any reasons not to do either of the above.
Re: [whatwg] Video feedback
On Jun 20, 2011, at 11:52 AM, Silvia Pfeiffer wrote:

> On Mon, Jun 20, 2011 at 7:31 PM, Mark Watson wrote:
>>> The TrackList object has an onchanged event, which I assumed would fire when any of the information in the TrackList changes (e.g. tracks added or removed). But actually the spec doesn't state when this event fires (as far as I could tell - unless it is implied by some general definition of events called onchanged). Should there be some clarification here ?
>>>
>>> I understood that to relate to a change of cues only, since it is on the tracklist. I.e. it's an aggregate event from the oncuechange event of a cue inside the track. I didn't think it would relate to a change of existence of that track.
>>>
>>> Note that the event is attached to the TrackList, not the TrackList[], so it cannot be raised when a track is added or removed, only when something inside the TrackList changes.
>>
>> Are we talking about the same thing ? There is no TrackList array and TrackList is only used for audio/video, not text, so I don't understand the comment about cues. I'm talking about http://www.whatwg.org/specs/web-apps/current-work/multipage/the-iframe-element.html#tracklist which is the base class for MultipleTrackList and ExclusiveTrackList used to represent all the audio and video tracks (respectively). One instance of the object represents all the tracks, so I would assume that a change in the number of tracks is a change to this object.
>
> Ah yes, you're right: I got confused.
>
> It says "Whenever the selected track is changed, the user agent must queue a task to fire a simple event named change at the MultipleTrackList object." This means it fires when the selectedIndex is changed, i.e. the user chooses a different track for rendering. I still don't think it relates to changes in the composition of tracks of a resource. That should be something different and should probably be on the MediaElement and not on the track list to also cover changes in text tracks.

Fair enough.

> Also, as Eric (C) pointed out, one of the things which can change is which of several available versions of the content is being rendered (for adaptive bitrate cases). This doesn't necessarily change any of the metadata currently exposed on the video element, but nevertheless it's information that the application may need. It would be nice to expose some kind of identifier for the currently rendered stream and have an event when this changes. I think that a stream-format-supplied identifier would be sufficient.
>>>
>>> I don't know about the adaptive streaming situation. I think that is more about statistics/metrics rather than about change of resource. All the alternatives in an adaptive streaming "resource" should provide the same number of tracks and the same video dimensions, just at different bitrate/quality, no?
>>
>> I think of the different adaptive versions on a per-track basis (i.e. the alternatives are *within* each track), not a bunch of alternatives each of which contains several tracks. Both are possible, of course.
>>
>> It's certainly possible (indeed common) for different bitrate video encodings to have different resolutions - there are video encoding reasons to do this. Of course the aspect ratio should not change and nor should the dimensions on the screen (both would be a little peculiar for the user).
>>
>> Now, the videoWidth and videoHeight attributes of HTMLVideoElement are not the same as the resolution (for a start, they are in CSS pixels, which are square), but I think it quite likely that if the resolution of the video changes then the videoWidth and videoHeight might change. I'd be interested to hear how existing implementations relate resolution to videoWidth and videoHeight.
>
> Well, if videoWidth and videoHeight change and no dimensions on the video are provided through CSS, then surely the video will change size and the display will shrink. That would be a terrible user experience. For that reason I would suggest that such a change not be made in alternative adaptive streams.

That seems backwards to me! I would say "For that reason I would suggest that dimensions are provided through CSS or through the width and height attributes."

Alternatively, we change the specification of the video element to accommodate this aspect of adaptive streaming (for example, the videoWidth and videoHeight could be defined to be based on the highest resolution bitrate being considered.)

There are good video encoding reasons for different bitrates to be encoded at different resolutions which are far more important than any reasons not to do either of the above.

>>> Different video dimensions should be provided through the <source> element and @media attribute, but within an adaptive stream, the alternatives should be consistent because the target device won't change. I guess this is a discussion for another thread... :-)
Re: [whatwg] Video feedback
On Mon, Jun 20, 2011 at 7:31 PM, Mark Watson wrote:
>
>>> The TrackList object has an onchanged event, which I assumed would fire when any of the information in the TrackList changes (e.g. tracks added or removed). But actually the spec doesn't state when this event fires (as far as I could tell - unless it is implied by some general definition of events called onchanged).
>>>
>>> Should there be some clarification here ?
>>
>> I understood that to relate to a change of cues only, since it is on the tracklist. I.e. it's an aggregate event from the oncuechange event of a cue inside the track. I didn't think it would relate to a change of existence of that track.
>>
>> Note that the event is attached to the TrackList, not the TrackList[], so it cannot be raised when a track is added or removed, only when something inside the TrackList changes.
>
> Are we talking about the same thing ? There is no TrackList array and TrackList is only used for audio/video, not text, so I don't understand the comment about cues. I'm talking about http://www.whatwg.org/specs/web-apps/current-work/multipage/the-iframe-element.html#tracklist which is the base class for MultipleTrackList and ExclusiveTrackList used to represent all the audio and video tracks (respectively). One instance of the object represents all the tracks, so I would assume that a change in the number of tracks is a change to this object.

Ah yes, you're right: I got confused.

It says "Whenever the selected track is changed, the user agent must queue a task to fire a simple event named change at the MultipleTrackList object." This means it fires when the selectedIndex is changed, i.e. the user chooses a different track for rendering. I still don't think it relates to changes in the composition of tracks of a resource. That should be something different and should probably be on the MediaElement and not on the track list to also cover changes in text tracks.
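The gap Silvia identifies can be made concrete with a toy model of the quoted spec text: `change` fires when the selection changes, and nothing in the quoted text fires when the composition changes. `TrackListModel` below is an illustrative stand-in, not the real MultipleTrackList interface.

```javascript
// Toy model of the quoted MultipleTrackList behaviour: a 'change' event
// fires when the selected track changes, but (per the spec text quoted
// above) nothing fires when tracks are added or removed.
// TrackListModel is a hypothetical stand-in, not the real interface.
function TrackListModel() {
  this.tracks = [];
  this.selectedIndex = -1;
  this.firedEvents = []; // records which events would have been fired
}

TrackListModel.prototype.select = function (index) {
  if (index !== this.selectedIndex) {
    this.selectedIndex = index;
    this.firedEvents.push('change'); // per the quoted spec text
  }
};

TrackListModel.prototype.addTrack = function (track) {
  this.tracks.push(track);
  // No event here: this is the gap under discussion, which a separate
  // event on the MediaElement could fill.
};

// Example: adding tracks is silent; selecting one fires 'change'.
var list = new TrackListModel();
list.addTrack({kind: 'main', language: 'en'});
list.addTrack({kind: 'commentary', language: 'en'});
list.select(1);
```

After this runs, `list.firedEvents` contains a single `'change'` even though the track composition changed twice, which is exactly why a composition-change event on the MediaElement is being proposed.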
>>> Also, as Eric (C) pointed out, one of the things which can change is which of several available versions of the content is being rendered (for adaptive bitrate cases). This doesn't necessarily change any of the metadata currently exposed on the video element, but nevertheless it's information that the application may need. It would be nice to expose some kind of identifier for the currently rendered stream and have an event when this changes. I think that a stream-format-supplied identifier would be sufficient.
>>
>> I don't know about the adaptive streaming situation. I think that is more about statistics/metrics rather than about change of resource. All the alternatives in an adaptive streaming "resource" should provide the same number of tracks and the same video dimensions, just at different bitrate/quality, no?
>
> I think of the different adaptive versions on a per-track basis (i.e. the alternatives are *within* each track), not a bunch of alternatives each of which contains several tracks. Both are possible, of course.
>
> It's certainly possible (indeed common) for different bitrate video encodings to have different resolutions - there are video encoding reasons to do this. Of course the aspect ratio should not change and nor should the dimensions on the screen (both would be a little peculiar for the user).
>
> Now, the videoWidth and videoHeight attributes of HTMLVideoElement are not the same as the resolution (for a start, they are in CSS pixels, which are square), but I think it quite likely that if the resolution of the video changes then the videoWidth and videoHeight might change. I'd be interested to hear how existing implementations relate resolution to videoWidth and videoHeight.

Well, if videoWidth and videoHeight change and no dimensions on the video are provided through CSS, then surely the video will change size and the display will shrink. That would be a terrible user experience. For that reason I would suggest that such a change not be made in alternative adaptive streams.

>> Different video dimensions should be provided through the <source> element and @media attribute, but within an adaptive stream, the alternatives should be consistent because the target device won't change. I guess this is a discussion for another thread... :-)
>
> Possibly ;-) The device knows much better than the page author what capabilities it has and so what resolutions are suitable for the device. So it is better to provide all the alternatives as a single resource and have the device work out which subset it can support. Or at least, the list should be provided all at the same level - there is no rationale for a hierarchy of alternatives.

The way in which HTML deals with different devices and their different capabilities is through media queries. As an author you provide your content with different versions of media-dependent style sheets and content, so that when you view th
Re: [whatwg] Video feedback
On Jun 20, 2011, at 10:42 AM, Silvia Pfeiffer wrote:

On Mon, Jun 20, 2011 at 6:29 PM, Mark Watson <wats...@netflix.com> wrote:

On Jun 9, 2011, at 4:32 PM, Eric Carlson wrote:

On Jun 9, 2011, at 12:02 AM, Silvia Pfeiffer wrote:

On Thu, Jun 9, 2011 at 4:34 PM, Simon Pieters <sim...@opera.com> wrote:

On Thu, 09 Jun 2011 03:47:49 +0200, Silvia Pfeiffer <silviapfeiff...@gmail.com> wrote:

For commercial video providers, the tracks in a live stream change all the time; this is not limited to audio and video tracks but would include text tracks as well.

OK, all this indicates to me that we probably want a "metadatachanged" event to indicate there has been a change and that JS may need to check some of its assumptions.

We already have durationchange. Duration is metadata. If we want to support changes to width/height, and the script is interested in when that happens, maybe there should be a dimensionchange event (but what's the use case for changing width/height mid-stream?). Does the spec support changes to text tracks mid-stream?

It's not about what the spec supports, but what real-world streams provide.

I don't think it makes sense to put an event on every single type of metadata that can change. Most of the time, when you have a stream change, many variables will change together, so a single event is a lot less events to raise. It's an event that signifies that the media framework has reset the video/audio decoding pipeline and loaded a whole bunch of new stuff. You should imagine it as a concatenation of different media resources. And yes, they can have different track constitution and different audio sampling rate (which the audio API will care about) etc etc.

In addition, it is possible for a stream to lose or gain an audio track. In this case the dimensions won't change but a script may want to react to the change in audioTracks.

The TrackList object has an onchanged event, which I assumed would fire when any of the information in the TrackList changes (e.g. tracks added or removed). But actually the spec doesn't state when this event fires (as far as I could tell - unless it is implied by some general definition of events called onchanged).

Should there be some clarification here ?

I understood that to relate to a change of cues only, since it is on the tracklist. I.e. it's an aggregate event from the oncuechange event of a cue inside the track. I didn't think it would relate to a change of existence of that track.

Note that the event is attached to the TrackList, not the TrackList[], so it cannot be raised when a track is added or removed, only when something inside the TrackList changes.

Are we talking about the same thing ? There is no TrackList array and TrackList is only used for audio/video, not text, so I don't understand the comment about cues. I'm talking about http://www.whatwg.org/specs/web-apps/current-work/multipage/the-iframe-element.html#tracklist which is the base class for MultipleTrackList and ExclusiveTrackList used to represent all the audio and video tracks (respectively). One instance of the object represents all the tracks, so I would assume that a change in the number of tracks is a change to this object.

I agree with Silvia, a more generic "metadata changed" event makes more sense.

Yes, and it should support the case in which text tracks are added/removed too.

Yes, it needs to be an event on the MediaElement.

Also, as Eric (C) pointed out, one of the things which can change is which of several available versions of the content is being rendered (for adaptive bitrate cases). This doesn't necessarily change any of the metadata currently exposed on the video element, but nevertheless it's information that the application may need. It would be nice to expose some kind of identifier for the currently rendered stream and have an event when this changes. I think that a stream-format-supplied identifier would be sufficient.

I don't know about the adaptive streaming situation. I think that is more about statistics/metrics rather than about change of resource. All the alternatives in an adaptive streaming "resource" should provide the same number of tracks and the same video dimensions, just at different bitrate/quality, no?

I think of the different adaptive versions on a per-track basis (i.e. the alternatives are *within* each track), not a bunch of alternatives each of which contains several tracks. Both are possible, of course.

It's certainly possible (indeed common) for different bitrate video encodings to have different resolutions - there are video encoding reasons to do this. Of course the aspect ratio should not change and nor should the dimensions on the screen (both would be a little peculiar for the user).

Now, the videoWidth and videoHeight attributes of HTMLVideoElement are not the same as the resolution (for a start, they are in CSS pixels, which are square), but I think it quite likely that if the resolution of the video changes then the videoWidth and videoHeight might change.
Re: [whatwg] Video feedback
On Mon, Jun 20, 2011 at 6:29 PM, Mark Watson wrote: > > On Jun 9, 2011, at 4:32 PM, Eric Carlson wrote: > >> >> On Jun 9, 2011, at 12:02 AM, Silvia Pfeiffer wrote: >> >>> On Thu, Jun 9, 2011 at 4:34 PM, Simon Pieters wrote: On Thu, 09 Jun 2011 03:47:49 +0200, Silvia Pfeiffer wrote: >> For commercial video providers, the tracks in a live stream change all >> the time; this is not limited to audio and video tracks but would include >> text tracks as well. > > OK, all this indicates to me that we probably want a "metadatachanged" > event to indicate there has been a change and that JS may need to > check some of its assumptions. We already have durationchange. Duration is metadata. If we want to support changes to width/height, and the script is interested in when that happens, maybe there should be a dimensionchange event (but what's the use case for changing width/height mid-stream?). Does the spec support changes to text tracks mid-stream? >>> >>> It's not about what the spec supports, but what real-world streams provide. >>> >>> I don't think it makes sense to put an event on every single type of >>> metadata that can change. Most of the time, when you have a stream >>> change, many variables will change together, so a single event is a >>> lot less events to raise. It's an event that signifies that the media >>> framework has reset the video/audio decoding pipeline and loaded a >>> whole bunch of new stuff. You should imagine it as a concatenation of >>> different media resources. And yes, they can have different track >>> constitution and different audio sampling rate (which the audio API >>> will care about) etc etc. >>> >> In addition, it is possible for a stream to lose or gain an audio track. In >> this case the dimensions won't change but a script may want to react to the >> change in audioTracks. > > The TrackList object has an onchanged event, which I assumed would fire when > any of the information in the TrackList changes (e.g. 
tracks added or > removed). But actually the spec doesn't state when this event fires (as far > as I could tell - unless it is implied by some general definition of events > called onchanged). > > Should there be some clarification here ? I understood that to relate to a change of cues only, since it is on the tracklist. I.e. it's an aggregate event from the oncuechange event of a cue inside the track. I didn't think it would relate to a change of existence of that track. Note that the event is attached to the TrackList, not the TrackList[], so it cannot be raised when a track is added or removed, only when something inside the TrackList changes. >> I agree with Silvia, a more generic "metadata changed" event makes more >> sense. > > Yes, and it should support the case in which text tracks are added/removed > too. Yes, it needs to be an event on the MediaElement. > Also, as Eric (C) pointed out, one of the things which can change is which of > several available versions of the content is being rendered (for adaptive > bitrate cases). This doesn't necessarily change any of the metadata currently > exposed on the video element, but nevertheless it's information that the > application may need. It would be nice to expose some kind of identifier for > the currently rendered stream and have an event when this changes. I think > that a stream-format-supplied identifier would be sufficient. I don't know about the adaptive streaming situation. I think that is more about statistics/metrics rather than about change of resource. All the alternatives in an adaptive streaming "resource" should provide the same number of tracks and the same video dimensions, just at different bitrate/quality, no? Different video dimensions should be provided through the <source> element and @media attribute, but within an adaptive stream, the alternatives should be consistent because the target device won't change. I guess this is a discussion for another thread... :-) Cheers, Silvia.
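The aggregate-event idea discussed above can be sketched in plain JavaScript. This is illustrative only: "metadatachanged" is a proposed name, not a spec API, and the field names below are invented. The point is that one pipeline reset changes several variables together, so a single event describing all of them is raised instead of one event per field.

```javascript
// Hypothetical sketch: compute which metadata fields changed across a
// stream boundary, and fire a single aggregate event if any did.
function diffMetadata(oldMeta, newMeta) {
  const keys = new Set([...Object.keys(oldMeta), ...Object.keys(newMeta)]);
  const changed = [];
  for (const key of keys) {
    if (oldMeta[key] !== newMeta[key]) changed.push(key);
  }
  return changed;
}

function onChainBoundary(oldMeta, newMeta, fire) {
  const changed = diffMetadata(oldMeta, newMeta);
  // One event no matter how many fields changed together.
  if (changed.length > 0) fire({ type: 'metadatachanged', changed });
}

// Example boundary: resolution and sample rate change at the same time.
const before = { videoWidth: 1280, videoHeight: 720, sampleRate: 48000, audioTracks: 1 };
const after  = { videoWidth: 640,  videoHeight: 360, sampleRate: 44100, audioTracks: 1 };
onChainBoundary(before, after, e => console.log(e.type, e.changed));
// fires once, listing every field that changed together
```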
Re: [whatwg] Video feedback
On Jun 9, 2011, at 4:32 PM, Eric Carlson wrote: > > On Jun 9, 2011, at 12:02 AM, Silvia Pfeiffer wrote: > >> On Thu, Jun 9, 2011 at 4:34 PM, Simon Pieters wrote: >>> On Thu, 09 Jun 2011 03:47:49 +0200, Silvia Pfeiffer >>> wrote: >>> > For commercial video providers, the tracks in a live stream change all > the time; this is not limited to audio and video tracks but would include > text tracks as well. OK, all this indicates to me that we probably want a "metadatachanged" event to indicate there has been a change and that JS may need to check some of its assumptions. >>> >>> We already have durationchange. Duration is metadata. If we want to support >>> changes to width/height, and the script is interested in when that happens, >>> maybe there should be a dimensionchange event (but what's the use case for >>> changing width/height mid-stream?). Does the spec support changes to text >>> tracks mid-stream? >> >> It's not about what the spec supports, but what real-world streams provide. >> >> I don't think it makes sense to put an event on every single type of >> metadata that can change. Most of the time, when you have a stream >> change, many variables will change together, so a single event is a >> lot less events to raise. It's an event that signifies that the media >> framework has reset the video/audio decoding pipeline and loaded a >> whole bunch of new stuff. You should imagine it as a concatenation of >> different media resources. And yes, they can have different track >> constitution and different audio sampling rate (which the audio API >> will care about) etc etc. >> > In addition, it is possible for a stream to lose or gain an audio track. In > this case the dimensions won't change but a script may want to react to the > change in audioTracks. The TrackList object has an onchanged event, which I assumed would fire when any of the information in the TrackList changes (e.g. tracks added or removed). 
But actually the spec doesn't state when this event fires (as far as I could tell - unless it is implied by some general definition of events called onchanged). Should there be some clarification here ? > > I agree with Silvia, a more generic "metadata changed" event makes more > sense. Yes, and it should support the case in which text tracks are added/removed too. Also, as Eric (C) pointed out, one of the things which can change is which of several available versions of the content is being rendered (for adaptive bitrate cases). This doesn't necessarily change any of the metadata currently exposed on the video element, but nevertheless it's information that the application may need. It would be nice to expose some kind of identifier for the currently rendered stream and have an event when this changes. I think that a stream-format-supplied identifier would be sufficient. ...Mark > > eric > >
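The ambiguity Mark raises is whether an onchanged-style event covers tracks being added or removed. Whatever event ends up specified, the underlying computation is a diff of the track set before and after the change; the helper below is a hypothetical illustration (not a spec API), using plain string ids for tracks.

```javascript
// Hypothetical helper: given the track ids exposed before and after a
// stream change, report which tracks were added or removed.
function diffTracks(oldIds, newIds) {
  const oldSet = new Set(oldIds);
  const newSet = new Set(newIds);
  return {
    added:   newIds.filter(id => !oldSet.has(id)),
    removed: oldIds.filter(id => !newSet.has(id)),
  };
}

// Example: a live stream drops its audio track mid-stream.
console.log(diffTracks(['video/main', 'audio/main'], ['video/main']));
// { added: [], removed: ['audio/main'] }
```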
Re: [whatwg] Video feedback
On Jun 9, 2011, at 12:02 AM, Silvia Pfeiffer wrote: > On Thu, Jun 9, 2011 at 4:34 PM, Simon Pieters wrote: >> On Thu, 09 Jun 2011 03:47:49 +0200, Silvia Pfeiffer >> wrote: >> For commercial video providers, the tracks in a live stream change all the time; this is not limited to audio and video tracks but would include text tracks as well. >>> >>> OK, all this indicates to me that we probably want a "metadatachanged" >>> event to indicate there has been a change and that JS may need to >>> check some of its assumptions. >> >> We already have durationchange. Duration is metadata. If we want to support >> changes to width/height, and the script is interested in when that happens, >> maybe there should be a dimensionchange event (but what's the use case for >> changing width/height mid-stream?). Does the spec support changes to text >> tracks mid-stream? > > It's not about what the spec supports, but what real-world streams provide. > > I don't think it makes sense to put an event on every single type of > metadata that can change. Most of the time, when you have a stream > change, many variables will change together, so a single event is a > lot less events to raise. It's an event that signifies that the media > framework has reset the video/audio decoding pipeline and loaded a > whole bunch of new stuff. You should imagine it as a concatenation of > different media resources. And yes, they can have different track > constitution and different audio sampling rate (which the audio API > will care about) etc etc. > In addition, it is possible for a stream to lose or gain an audio track. In this case the dimensions won't change but a script may want to react to the change in audioTracks. I agree with Silvia, a more generic "metadata changed" event makes more sense. eric
Re: [whatwg] Video feedback
On Thu, Jun 9, 2011 at 4:34 PM, Simon Pieters wrote: > On Thu, 09 Jun 2011 03:47:49 +0200, Silvia Pfeiffer > wrote: > >>> For commercial video providers, the tracks in a live stream change all >>> the time; this is not limited to audio and video tracks but would include >>> text tracks as well. >> >> OK, all this indicates to me that we probably want a "metadatachanged" >> event to indicate there has been a change and that JS may need to >> check some of its assumptions. > > We already have durationchange. Duration is metadata. If we want to support > changes to width/height, and the script is interested in when that happens, > maybe there should be a dimensionchange event (but what's the use case for > changing width/height mid-stream?). Does the spec support changes to text > tracks mid-stream? It's not about what the spec supports, but what real-world streams provide. I don't think it makes sense to put an event on every single type of metadata that can change. Most of the time, when you have a stream change, many variables will change together, so a single event is a lot less events to raise. It's an event that signifies that the media framework has reset the video/audio decoding pipeline and loaded a whole bunch of new stuff. You should imagine it as a concatenation of different media resources. And yes, they can have different track constitution and different audio sampling rate (which the audio API will care about) etc etc. The durationchange is a different type of event. It has not much to do with having a change of a media format, but more with getting new information that more data is available than previously expected. It's one that allows streaming of long video resources, even if they are just of a single encoding setting. In contrast, what we are talking about is that the encoding settings change mid-stream. Cheers, Silvia.
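The distinction Silvia draws between durationchange and the proposed event can be made concrete. The sketch below is invented for illustration (the data and the classifier are not from any spec): a duration update leaves the encoding settings alone, while a chained-stream boundary changes them.

```javascript
// Illustrative classifier: distinguish "more data arrived" (already
// covered by durationchange) from "encoding settings changed mid-stream"
// (the case that would want a new aggregate event).
function classifyChange(prev, cur) {
  const encodingKeys = ['width', 'height', 'sampleRate'];
  const encodingChanged = encodingKeys.some(k => prev[k] !== cur[k]);
  if (encodingChanged) return 'encoding-change';
  if (prev.duration !== cur.duration) return 'duration-only';
  return 'none';
}

const a = { width: 640, height: 360, sampleRate: 44100, duration: 120 };
const b = { ...a, duration: 180 };            // more data than expected
const c = { ...b, width: 1280, height: 720 }; // new chained segment

console.log(classifyChange(a, b)); // 'duration-only'
console.log(classifyChange(b, c)); // 'encoding-change'
```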
Re: [whatwg] Video feedback
On Thu, 09 Jun 2011 03:47:49 +0200, Silvia Pfeiffer wrote: For commercial video providers, the tracks in a live stream change all the time; this is not limited to audio and video tracks but would include text tracks as well. OK, all this indicates to me that we probably want a "metadatachanged" event to indicate there has been a change and that JS may need to check some of its assumptions. We already have durationchange. Duration is metadata. If we want to support changes to width/height, and the script is interested in when that happens, maybe there should be a dimensionchange event (but what's the use case for changing width/height mid-stream?). Does the spec support changes to text tracks mid-stream? -- Simon Pieters Opera Software
Re: [whatwg] Video feedback
On Thu, Jun 9, 2011 at 1:57 AM, Bob Lund wrote: > > >> -----Original Message----- >> From: whatwg-boun...@lists.whatwg.org [mailto:whatwg- >> boun...@lists.whatwg.org] On Behalf Of Eric Carlson >> Sent: Wednesday, June 08, 2011 9:34 AM >> To: Silvia Pfeiffer; Philip Jägenstedt >> Cc: whatwg@lists.whatwg.org >> Subject: Re: [whatwg] Video feedback >> >> >> On Jun 8, 2011, at 3:35 AM, Silvia Pfeiffer wrote: >> >> >> Nothing exposed via the current API would change, AFAICT. >> > >> > Thus, after a change mid-stream to, say, a smaller video width and >> > height, would the video.videoWidth and video.videoHeight attributes >> > represent the width and height of the previous stream or the current >> > one? >> > >> > >> >> I agree that if we >> >> start exposing things like sampling rate or want to support arbitrary >> >> chained Ogg, then there is a problem. >> > >> > I think we already have a problem with width and height for chained >> > Ogg and we cannot stop people from putting chained Ogg into the @src. >> > >> > I actually took this discussion away from MPEG PMT, which is where >> > Eric's question came from, because I don't understand how it works >> > with MPEG. But I can see that it's not just a problem of MPEG, but >> > also of Ogg (and possibly of WebM which can have multiple Segments). >> > So, I think we need a generic solution for it. >> > >> The characteristics of an Apple HTTP live stream can change on the >> fly. For example if the user's bandwidth to the streaming server >> changes, the video width and height can change as the stream resolution >> is switched up or down, or the number of tracks can change when a stream >> switches from video+audio to audio only. In addition, a server can >> insert segments with different characteristics into a stream on the fly, >> eg. inserting an ad or emergency announcement. >> >> It is not possible to predict these changes before they occur.
>> >> eric > > For commercial video providers, the tracks in a live stream change all the > time; this is not limited to audio and video tracks but would include text > tracks as well. OK, all this indicates to me that we probably want a "metadatachanged" event to indicate there has been a change and that JS may need to check some of its assumptions. Silvia.
Re: [whatwg] Video feedback
> -----Original Message----- > From: whatwg-boun...@lists.whatwg.org [mailto:whatwg- > boun...@lists.whatwg.org] On Behalf Of Eric Carlson > Sent: Wednesday, June 08, 2011 9:34 AM > To: Silvia Pfeiffer; Philip Jägenstedt > Cc: whatwg@lists.whatwg.org > Subject: Re: [whatwg] Video feedback > > > On Jun 8, 2011, at 3:35 AM, Silvia Pfeiffer wrote: > > >> Nothing exposed via the current API would change, AFAICT. > > > > Thus, after a change mid-stream to, say, a smaller video width and > > height, would the video.videoWidth and video.videoHeight attributes > > represent the width and height of the previous stream or the current > > one? > > > > > >> I agree that if we > >> start exposing things like sampling rate or want to support arbitrary > >> chained Ogg, then there is a problem. > > > > I think we already have a problem with width and height for chained > > Ogg and we cannot stop people from putting chained Ogg into the @src. > > > > I actually took this discussion away from MPEG PMT, which is where > > Eric's question came from, because I don't understand how it works > > with MPEG. But I can see that it's not just a problem of MPEG, but > > also of Ogg (and possibly of WebM which can have multiple Segments). > > So, I think we need a generic solution for it. > > > The characteristics of an Apple HTTP live stream can change on the > fly. For example if the user's bandwidth to the streaming server > changes, the video width and height can change as the stream resolution > is switched up or down, or the number of tracks can change when a stream > switches from video+audio to audio only. In addition, a server can > insert segments with different characteristics into a stream on the fly, > eg. inserting an ad or emergency announcement. > > It is not possible to predict these changes before they occur. > > eric For commercial video providers, the tracks in a live stream change all the time; this is not limited to audio and video tracks but would include text tracks as well.
Bob Lund
Re: [whatwg] Video feedback
On Jun 8, 2011, at 3:35 AM, Silvia Pfeiffer wrote: >> Nothing exposed via the current API would change, AFAICT. > > Thus, after a change mid-stream to, say, a smaller video width and > height, would the video.videoWidth and video.videoHeight attributes > represent the width and height of the previous stream or the current > one? > > >> I agree that if we >> start exposing things like sampling rate or want to support arbitrary >> chained Ogg, then there is a problem. > > I think we already have a problem with width and height for chained > Ogg and we cannot stop people from putting chained Ogg into the @src. > > I actually took this discussion away from MPEG PMT, which is where > Eric's question came from, because I don't understand how it works > with MPEG. But I can see that it's not just a problem of MPEG, but > also of Ogg (and possibly of WebM which can have multiple Segments). > So, I think we need a generic solution for it. > The characteristics of an Apple HTTP live stream can change on the fly. For example if the user's bandwidth to the streaming server changes, the video width and height can change as the stream resolution is switched up or down, or the number of tracks can change when a stream switches from video+audio to audio only. In addition, a server can insert segments with different characteristics into a stream on the fly, eg. inserting an ad or emergency announcement. It is not possible to predict these changes before they occur. eric
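Eric's point is that a player cannot predict these switches; it can only observe them as each segment arrives. The toy model below is illustrative (the segment sequence is made up, and this is not how HLS playlists are actually parsed): a bandwidth-driven resolution drop followed by an audio-only stretch, detected one segment boundary at a time.

```javascript
// Illustrative only: a server-controlled segment sequence whose
// characteristics change without warning. The player can only notice
// each change when the segment is already in hand.
const segments = [
  { width: 1280, height: 720, hasVideo: true },
  { width: 1280, height: 720, hasVideo: true },
  { width: 640,  height: 360, hasVideo: true },   // bandwidth drop
  { width: 0,    height: 0,   hasVideo: false },  // audio-only switch
];

function changePoints(segs) {
  const points = [];
  for (let i = 1; i < segs.length; i++) {
    const prev = segs[i - 1], cur = segs[i];
    if (prev.width !== cur.width || prev.height !== cur.height ||
        prev.hasVideo !== cur.hasVideo) {
      points.push(i); // index of the segment where characteristics changed
    }
  }
  return points;
}

console.log(changePoints(segments)); // [2, 3]
```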
Re: [whatwg] Video feedback
On Wed, 08 Jun 2011 13:38:18 +0200, Silvia Pfeiffer wrote: On Wed, Jun 8, 2011 at 9:18 PM, Philip Jägenstedt wrote: On Wed, 08 Jun 2011 12:35:24 +0200, Silvia Pfeiffer wrote: On Wed, Jun 8, 2011 at 6:14 PM, Philip Jägenstedt wrote: On Wed, 08 Jun 2011 02:46:15 +0200, Silvia Pfeiffer wrote: That is all correct. However, because it is a sequence of Ogg streams, there are new Ogg headers in the middle. These new Ogg headers will lead to new metadata loaded in the media framework - e.g. because the new Ogg stream is encoded with a different audio sampling rate and a different video width/height etc. So, therefore, the metadata in the media framework changes. However, what the browser reports to the JS developer doesn't change. Or if it does change, the JS developer is not informed of it because it is a single infinite audio (or video) stream. Thus the question whether we need a new "metadatachange" event to expose this to the JS developer. It would then also signify that potentially the number of tracks that are available may have changed and other such information. Nothing exposed via the current API would change, AFAICT. Thus, after a change mid-stream to, say, a smaller video width and height, would the video.videoWidth and video.videoHeight attributes represent the width and height of the previous stream or the current one? I agree that if we start exposing things like sampling rate or want to support arbitrary chained Ogg, then there is a problem. I think we already have a problem with width and height for chained Ogg and we cannot stop people from putting chained Ogg into the @src. I actually took this discussion away from MPEG PMT, which is where Eric's question came from, because I don't understand how it works with MPEG. But I can see that it's not just a problem of MPEG, but also of Ogg (and possibly of WebM which can have multiple Segments). So, I think we need a generic solution for it. OK, I don't think we disagree.
I'm just saying that for Icecast audio streams, there is no problem. Hmm.. because there is nothing in the API that actually exposes audio metadata? Yes. As for Ogg and WebM, I'm inclined to say that we just shouldn't support that, unless there's some compelling use case for it. You know that you can also transmit video with icecast...? Nope :) I guess that invalidates everything I've said about Icecast. Practically, though, no one is using Icecast to mix audio tracks with audio+video tracks and getting upset that it doesn't work in browsers, right? -- Philip Jägenstedt Core Developer Opera Software
Re: [whatwg] Video feedback
On Wed, Jun 8, 2011 at 9:18 PM, Philip Jägenstedt wrote: > On Wed, 08 Jun 2011 12:35:24 +0200, Silvia Pfeiffer > wrote: > >> On Wed, Jun 8, 2011 at 6:14 PM, Philip Jägenstedt >> wrote: >>> >>> On Wed, 08 Jun 2011 02:46:15 +0200, Silvia Pfeiffer >>> wrote: >>> That is all correct. However, because it is a sequence of Ogg streams, there are new Ogg headers in the middle. These new Ogg headers will lead to new metadata loaded in the media framework - e.g. because the new Ogg stream is encoded with a different audio sampling rate and a different video width/height etc. So, therefore, the metadata in the media framework changes. However, what the browser reports to the JS developer doesn't change. Or if it does change, the JS developer is not informed of it because it is a single infinite audio (or video) stream. Thus the question whether we need a new "metadatachange" event to expose this to the JS developer. It would then also signify that potentially the number of tracks that are available may have changed and other such information. >>> >>> Nothing exposed via the current API would change, AFAICT. >> >> Thus, after a change mid-stream to, say, a smaller video width and >> height, would the video.videoWidth and video.videoHeight attributes >> represent the width and height of the previous stream or the current >> one? >> >> >>> I agree that if we >>> start exposing things like sampling rate or want to support arbitrary >>> chained Ogg, then there is a problem. >> >> I think we already have a problem with width and height for chained >> Ogg and we cannot stop people from putting chained Ogg into the @src. >> >> I actually took this discussion away from MPEG PMT, which is where >> Eric's question came from, because I don't understand how it works >> with MPEG. But I can see that it's not just a problem of MPEG, but >> also of Ogg (and possibly of WebM which can have multiple Segments). >> So, I think we need a generic solution for it. > > OK, I don't think we disagree.
I'm just saying that for Icecast audio > streams, there is no problem. Hmm.. because there is nothing in the API that actually exposes audio metadata? > As for Ogg and WebM, I'm inclined to say that we just shouldn't support > that, unless there's some compelling use case for it. You know that you can also transmit video with icecast...? Silvia. > There's also the > option of tweaking the muxers so that all the streams are known up-front, > even if there won't be any data arriving for them until half-way through the > file. > > I also know nothing about MPEG or the use cases involved, so no opinions > there. > > -- > Philip Jägenstedt > Core Developer > Opera Software >
Re: [whatwg] Video feedback
On Wed, 08 Jun 2011 12:35:24 +0200, Silvia Pfeiffer wrote: On Wed, Jun 8, 2011 at 6:14 PM, Philip Jägenstedt wrote: On Wed, 08 Jun 2011 02:46:15 +0200, Silvia Pfeiffer wrote: That is all correct. However, because it is a sequence of Ogg streams, there are new Ogg headers in the middle. These new Ogg headers will lead to new metadata loaded in the media framework - e.g. because the new Ogg stream is encoded with a different audio sampling rate and a different video width/height etc. So, therefore, the metadata in the media framework changes. However, what the browser reports to the JS developer doesn't change. Or if it does change, the JS developer is not informed of it because it is a single infinite audio (or video) stream. Thus the question whether we need a new "metadatachange" event to expose this to the JS developer. It would then also signify that potentially the number of tracks that are available may have changed and other such information. Nothing exposed via the current API would change, AFAICT. Thus, after a change mid-stream to, say, a smaller video width and height, would the video.videoWidth and video.videoHeight attributes represent the width and height of the previous stream or the current one? I agree that if we start exposing things like sampling rate or want to support arbitrary chained Ogg, then there is a problem. I think we already have a problem with width and height for chained Ogg and we cannot stop people from putting chained Ogg into the @src. I actually took this discussion away from MPEG PMT, which is where Eric's question came from, because I don't understand how it works with MPEG. But I can see that it's not just a problem of MPEG, but also of Ogg (and possibly of WebM which can have multiple Segments). So, I think we need a generic solution for it. OK, I don't think we disagree. I'm just saying that for Icecast audio streams, there is no problem.
As for Ogg and WebM, I'm inclined to say that we just shouldn't support that, unless there's some compelling use case for it. There's also the option of tweaking the muxers so that all the streams are known up-front, even if there won't be any data arriving for them until half-way through the file. I also know nothing about MPEG or the use cases involved, so no opinions there. -- Philip Jägenstedt Core Developer Opera Software
Re: [whatwg] Video feedback
On Wed, Jun 8, 2011 at 6:14 PM, Philip Jägenstedt wrote: > On Wed, 08 Jun 2011 02:46:15 +0200, Silvia Pfeiffer > wrote: > >> On Tue, Jun 7, 2011 at 7:04 PM, Philip Jägenstedt >> wrote: >>> >>> On Sat, 04 Jun 2011 03:39:58 +0200, Silvia Pfeiffer >>> wrote: >>> >>> On Fri, Jun 3, 2011 at 9:28 AM, Ian Hickson wrote: > > On Thu, 16 Dec 2010, Silvia Pfeiffer wrote: >> >> I do not know how technically the change of stream composition works >> in >> MPEG, but in Ogg we have to end a current stream and start a new one >> to >> switch compositions. This has been called "sequential multiplexing" or >> "chaining". In this case, stream setup information is repeated, which >> would probably lead to creating a new stream handler and possibly a new >> firing of "loadedmetadata". I am not sure how chaining is implemented >> in >> browsers. > > Per spec, chaining isn't currently supported. The closest thing I can > find > in the spec to this situation is handling a non-fatal error, which > causes > the unexpected content to be ignored. > > > On Fri, 17 Dec 2010, Eric Winkelman wrote: >> >> The short answer for changing stream composition is that there is a >> Program Map Table (PMT) that is repeated every 100 milliseconds and >> describes the content of the stream. Depending on the programming, >> the >> stream's composition could change entering/exiting every >> advertisement. > > If this is something that browser vendors want to support, I can > specify > how to handle it. Anyone? Icecast streams have chained files, so streaming Ogg to an audio element would hit this problem. There is a bug in FF for this: https://bugzilla.mozilla.org/show_bug.cgi?id=455165 (and a duplicate bug at https://bugzilla.mozilla.org/show_bug.cgi?id=611519). There's also a webkit bug for icecast streaming, which is probably related https://bugs.webkit.org/show_bug.cgi?id=42750 . I'm not sure how Opera is able to deal with icecast streams, but it seems to deal with it.
The thing is: you can implement playback and seeking without any further changes to the spec. But then the browser-internal metadata states will change depending on the chunk you're on. Should that also update the exposed metadata in the API then? Probably yes, because otherwise the JS developer may deal with contradictory information. Maybe we need a "metadatachange" event for this? >>> >>> An Icecast stream is conceptually just one infinite audio stream, even >>> though at the container level it is several chained Ogg streams. duration >>> will be Infinity and currentTime will be constantly increasing. This >>> doesn't >>> seem to be a case where any spec change is needed. Am I missing >>> something? >> >> >> That is all correct. However, because it is a sequence of Ogg streams, >> there are new Ogg headers in the middle. These new Ogg headers will >> lead to new metadata loaded in the media framework - e.g. because the >> new Ogg stream is encoded with a different audio sampling rate and a >> different video width/height etc. So, therefore, the metadata in the >> media framework changes. However, what the browser reports to the JS >> developer doesn't change. Or if it does change, the JS developer is >> not informed of it because it is a single infinite audio (or video) >> stream. Thus the question whether we need a new "metadatachange" event >> to expose this to the JS developer. It would then also signify that >> potentially the number of tracks that are available may have changed >> and other such information. > > Nothing exposed via the current API would change, AFAICT. Thus, after a change mid-stream to, say, a smaller video width and height, would the video.videoWidth and video.videoHeight attributes represent the width and height of the previous stream or the current one? > I agree that if we > start exposing things like sampling rate or want to support arbitrary > chained Ogg, then there is a problem. 
I think we already have a problem with width and height for chained Ogg and we cannot stop people from putting chained Ogg into the @src. I actually took this discussion away from MPEG PMT, which is where Eric's question came from, because I don't understand how it works with MPEG. But I can see that it's not just a problem of MPEG, but also of Ogg (and possibly of WebM which can have multiple Segments). So, I think we need a generic solution for it. Cheers, Silvia.
Re: [whatwg] Video feedback
On Wed, 08 Jun 2011 02:46:15 +0200, Silvia Pfeiffer wrote: On Tue, Jun 7, 2011 at 7:04 PM, Philip Jägenstedt wrote: On Sat, 04 Jun 2011 03:39:58 +0200, Silvia Pfeiffer wrote: On Fri, Jun 3, 2011 at 9:28 AM, Ian Hickson wrote: On Thu, 16 Dec 2010, Silvia Pfeiffer wrote: I do not know how technically the change of stream composition works in MPEG, but in Ogg we have to end a current stream and start a new one to switch compositions. This has been called "sequential multiplexing" or "chaining". In this case, stream setup information is repeated, which would probably lead to creating a new stream handler and possibly a new firing of "loadedmetadata". I am not sure how chaining is implemented in browsers. Per spec, chaining isn't currently supported. The closest thing I can find in the spec to this situation is handling a non-fatal error, which causes the unexpected content to be ignored. On Fri, 17 Dec 2010, Eric Winkelman wrote: The short answer for changing stream composition is that there is a Program Map Table (PMT) that is repeated every 100 milliseconds and describes the content of the stream. Depending on the programming, the stream's composition could change entering/exiting every advertisement. If this is something that browser vendors want to support, I can specify how to handle it. Anyone? Icecast streams have chained files, so streaming Ogg to an audio element would hit this problem. There is a bug in FF for this: https://bugzilla.mozilla.org/show_bug.cgi?id=455165 (and a duplicate bug at https://bugzilla.mozilla.org/show_bug.cgi?id=611519). There's also a webkit bug for icecast streaming, which is probably related https://bugs.webkit.org/show_bug.cgi?id=42750 . I'm not sure how Opera is able to deal with icecast streams, but it seems to deal with it. The thing is: you can implement playback and seeking without any further changes to the spec. But then the browser-internal metadata states will change depending on the chunk you're on.
Should that also update the exposed metadata in the API then? Probably yes, because otherwise the JS developer may deal with contradictory information. Maybe we need a "metadatachange" event for this? An Icecast stream is conceptually just one infinite audio stream, even though at the container level it is several chained Ogg streams. duration will be Infinity and currentTime will be constantly increasing. This doesn't seem to be a case where any spec change is needed. Am I missing something? That is all correct. However, because it is a sequence of Ogg streams, there are new Ogg headers in the middle. These new Ogg headers will lead to new metadata loaded in the media framework - e.g. because the new Ogg stream is encoded with a different audio sampling rate and a different video width/height etc. So, therefore, the metadata in the media framework changes. However, what the browser reports to the JS developer doesn't change. Or if it does change, the JS developer is not informed of it because it is a single infinite audio (or video) stream. Thus the question whether we need a new "metadatachange" event to expose this to the JS developer. It would then also signify that potentially the number of tracks that are available may have changed and other such information. Nothing exposed via the current API would change, AFAICT. I agree that if we start exposing things like sampling rate or want to support arbitrary chained Ogg, then there is a problem. -- Philip Jägenstedt Core Developer Opera Software
Re: [whatwg] Video feedback
On Tue, Jun 7, 2011 at 7:04 PM, Philip Jägenstedt wrote: > On Sat, 04 Jun 2011 03:39:58 +0200, Silvia Pfeiffer > wrote: > > >> On Fri, Jun 3, 2011 at 9:28 AM, Ian Hickson wrote: >>> >>> On Thu, 16 Dec 2010, Silvia Pfeiffer wrote: I do not know how technically the change of stream composition works in MPEG, but in Ogg we have to end a current stream and start a new one to switch compositions. This has been called "sequential multiplexing" or "chaining". In this case, stream setup information is repeated, which would probably lead to creating a new stream handler and possibly a new firing of "loadedmetadata". I am not sure how chaining is implemented in browsers. >>> >>> Per spec, chaining isn't currently supported. The closest thing I can >>> find >>> in the spec to this situation is handling a non-fatal error, which causes >>> the unexpected content to be ignored. >>> >>> >>> On Fri, 17 Dec 2010, Eric Winkelman wrote: The short answer for changing stream composition is that there is a Program Map Table (PMT) that is repeated every 100 milliseconds and describes the content of the stream. Depending on the programming, the stream's composition could change entering/exiting every advertisement. >>> >>> If this is something that browser vendors want to support, I can specify >>> how to handle it. Anyone? >> >> Icecast streams have chained files, so streaming Ogg to an audio >> element would hit this problem. There is a bug in FF for this: >> https://bugzilla.mozilla.org/show_bug.cgi?id=455165 (and a duplicate >> bug at https://bugzilla.mozilla.org/show_bug.cgi?id=611519). There's >> also a webkit bug for icecast streaming, which is probably related >> https://bugs.webkit.org/show_bug.cgi?id=42750 . I'm not sure how Opera >> is able to deal with icecast streams, but it seems to deal with it. >> >> The thing is: you can implement playback and seeking without any >> further changes to the spec.
But then the browser-internal metadata >> states will change depending on the chunk you're on. Should that also >> update the exposed metadata in the API then? Probably yes, because >> otherwise the JS developer may deal with contradictory information. >> Maybe we need a "metadatachange" event for this? > > An Icecast stream is conceptually just one infinite audio stream, even > though at the container level it is several chained Ogg streams. duration > will be Infinity and currentTime will be constantly increasing. This doesn't > seem to be a case where any spec change is needed. Am I missing something? That is all correct. However, because it is a sequence of Ogg streams, there are new Ogg headers in the middle. These new Ogg headers will lead to new metadata being loaded in the media framework - e.g. because the new Ogg stream is encoded with a different audio sampling rate and a different video width/height etc. Therefore the metadata in the media framework changes. However, what the browser reports to the JS developer doesn't change. Or if it does change, the JS developer is not informed of it because it is a single infinite audio (or video) stream. Thus the question is whether we need a new "metadatachange" event to expose this to the JS developer. It would then also signify that potentially the number of available tracks and other such information may have changed. Hope that clarifies it. Cheers, Silvia.
Re: [whatwg] Video feedback
On Sat, 04 Jun 2011 03:39:58 +0200, Silvia Pfeiffer wrote: On Fri, Jun 3, 2011 at 9:28 AM, Ian Hickson wrote: On Thu, 16 Dec 2010, Silvia Pfeiffer wrote: I do not know how technically the change of stream composition works in MPEG, but in Ogg we have to end a current stream and start a new one to switch compositions. This has been called "sequential multiplexing" or "chaining". In this case, stream setup information is repeated, which would probably lead to creating a new stream handler and possibly a new firing of "loadedmetadata". I am not sure how chaining is implemented in browsers. Per spec, chaining isn't currently supported. The closest thing I can find in the spec to this situation is handling a non-fatal error, which causes the unexpected content to be ignored. On Fri, 17 Dec 2010, Eric Winkelman wrote: The short answer for changing stream composition is that there is a Program Map Table (PMT) that is repeated every 100 milliseconds and describes the content of the stream. Depending on the programming, the stream's composition could change entering/exiting every advertisement. If this is something that browser vendors want to support, I can specify how to handle it. Anyone? Icecast streams have chained files, so streaming Ogg to an audio element would hit this problem. There is a bug in FF for this: https://bugzilla.mozilla.org/show_bug.cgi?id=455165 (and a duplicate bug at https://bugzilla.mozilla.org/show_bug.cgi?id=611519). There's also a webkit bug for icecast streaming, which is probably related https://bugs.webkit.org/show_bug.cgi?id=42750 . I'm not sure how Opera is able to deal with icecast streams, but it seems to deal with it. The thing is: you can implement playback and seeking without any further changes to the spec. But then the browser-internal metadata states will change depending on the chunk you're on. Should that also update the exposed metadata in the API then?
Probably yes, because otherwise the JS developer may deal with contradictory information. Maybe we need a "metadatachange" event for this? An Icecast stream is conceptually just one infinite audio stream, even though at the container level it is several chained Ogg streams. duration will be Infinity and currentTime will be constantly increasing. This doesn't seem to be a case where any spec change is needed. Am I missing something? -- Philip Jägenstedt Core Developer Opera Software
Re: [whatwg] Video feedback
I'll be replying to WebVTT related stuff in a separate thread. Here just feedback on the other stuff. (Incidentally: why is there element feedback in here with video? I don't really understand the connection.) On Fri, Jun 3, 2011 at 9:28 AM, Ian Hickson wrote: > On Thu, 16 Dec 2010, Silvia Pfeiffer wrote: >> >> I do not know how technically the change of stream composition works in >> MPEG, but in Ogg we have to end a current stream and start a new one to >> switch compositions. This has been called "sequential multiplexing" or >> "chaining". In this case, stream setup information is repeated, which >> would probably lead to creating a new stream handler and possibly a new >> firing of "loadedmetadata". I am not sure how chaining is implemented in >> browsers. > > Per spec, chaining isn't currently supported. The closest thing I can find > in the spec to this situation is handling a non-fatal error, which causes > the unexpected content to be ignored. > > > On Fri, 17 Dec 2010, Eric Winkelman wrote: >> >> The short answer for changing stream composition is that there is a >> Program Map Table (PMT) that is repeated every 100 milliseconds and >> describes the content of the stream. Depending on the programming, the >> stream's composition could change entering/exiting every advertisement. > > If this is something that browser vendors want to support, I can specify > how to handle it. Anyone? Icecast streams have chained files, so streaming Ogg to an audio element would hit this problem. There is a bug in FF for this: https://bugzilla.mozilla.org/show_bug.cgi?id=455165 (and a duplicate bug at https://bugzilla.mozilla.org/show_bug.cgi?id=611519). There's also a webkit bug for icecast streaming, which is probably related https://bugs.webkit.org/show_bug.cgi?id=42750 . I'm not sure how Opera is able to deal with icecast streams, but it seems to deal with it. The thing is: you can implement playback and seeking without any further changes to the spec.
But then the browser-internal metadata states will change depending on the chunk you're on. Should that also update the exposed metadata in the API then? Probably yes, because otherwise the JS developer may deal with contradictory information. Maybe we need a "metadatachange" event for this? > On Tue, 24 May 2011, Silvia Pfeiffer wrote: >> >> Ian and I had a brief conversation recently where I mentioned a problem >> with extended text descriptions with screen readers (and worse still >> with braille devices) and the suggestion was that the "paused for user >> interaction" state of a media element may be the solution. I would like >> to pick this up and discuss in detail how that would work to confirm my >> sketchy understanding. >> >> *The use case:* >> >> In the specification for media elements we have a kind of >> "descriptions", which are: >> "Textual descriptions of the video component of the media resource, >> intended for audio synthesis when the visual component is unavailable >> (e.g. because the user is interacting with the application without a >> screen while driving, or because the user is blind). Synthesized as a >> separate audio track." >> >> I'm for now assuming that the synthesis will be done through a screen >> reader and not through the browser itself, thus making the >> descriptions available to users as synthesized audio or as braille if >> the screen reader is set up for a braille device. >> >> The textual descriptions are provided as chunks of text with a start >> and an end time (so-called "cues"). The cues are processed during video >> playback as the video's playback time starts to fall within the time >> frame of the cue. Thus, it is expected that the cues are consumed >> during the cue's time frame and are not present any more when the end >> time of the cue is reached, so they don't conflict with the video's >> normal audio. >> >> However, on many occasions, it is not possible to consume the cue text
In particular not in the following >> situations: >> >> 1. The screen reader takes longer to read out the cue text than the >> cue's time frame provides for. This is particularly the case with long >> cue text, but also when the screen reader's reading rate is slower >> than what the author of the cue text expected. >> >> 2. The braille device is used for reading. Since reading braille is >> much slower than listening to read-out text, the cue time frame will >> invariably be too short. >> >> 3. The user seeked right into the middle of a cue and thus the time >> frame that is available for reading out the cue text is shorter than >> the cue author anticipated. >> >> Correct me if I'm wrong, but it seems that what we need is a way for >> the screen reader to pause the video element from continuing to play >> while the screen reader is still busy delivering the cue text. (In >> a11y talk: what is required is a means to deal with "extended >> descriptions", which extend the timeline of the video.) Onc
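As a rough illustration of the three failure modes above (not of the "paused for user interaction" mechanism itself, which lives inside the UA), a page script could approximate extended descriptions. Everything here is an assumption for illustration: a real screen reader is invisible to script, so the page's own speech synthesis stands in for it, and the reading-rate constant is invented:

```javascript
// Script-level approximation of "extended descriptions". A real screen
// reader cannot be observed from script, so this sketch (all names and
// numbers are illustrative) uses the page's own speech synthesis as a
// stand-in and pauses the video while a description cue is still being
// read out.

// Crude estimate of reading time at ~15 characters per second; a real
// screen reader's rate is user-configured and unknowable to the page.
function estimateReadSeconds(text, charsPerSecond = 15) {
  return text.length / charsPerSecond;
}

// Is the remaining cue time slot too short, given where playback
// actually entered the cue (e.g. after a seek into its middle)?
function needsExtension(cueStart, cueEnd, enteredAt, readSeconds) {
  const available = cueEnd - Math.max(cueStart, enteredAt);
  return readSeconds > available;
}

// Browser-only wiring, guarded so the sketch also runs standalone:
if (typeof document !== 'undefined') {
  const video = document.querySelector('video');
  const descriptions = video.textTracks[0]; // assumed descriptions track
  descriptions.oncuechange = () => {
    const cue = descriptions.activeCues[0];
    if (!cue) return;
    const secs = estimateReadSeconds(cue.text);
    if (needsExtension(cue.startTime, cue.endTime, video.currentTime, secs)) {
      video.pause(); // "extend" the cue past its end time
      const utterance = new SpeechSynthesisUtterance(cue.text);
      utterance.onend = () => video.play();
      speechSynthesis.speak(utterance);
    }
  };
}
```

Cases 1 and 2 show up as a large `readSeconds`, case 3 as a late `enteredAt`; either way the available slot comes up short and the sketch pauses playback.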
Re: [whatwg] Video feedback
On Fri, 03 Jun 2011 01:28:45 +0200, Ian Hickson wrote: > On Fri, 22 Oct 2010, Simon Pieters wrote: Actually it was me, but that's OK :) > > There was also some discussion about metadata. Language is sometimes > > necessary for the font engine to pick the right glyph. > > Could you elaborate on this? My assumption was that we'd just use CSS, > which doesn't rely on language for this. It's not in any spec that I'm aware of, but some browsers (including Opera) pick different glyphs depending on the language of the text, which really helps when rendering CJK when you have several CJK fonts on the system. Browsers will already know the language from <track srclang>, so this would be for external players. How is this problem solved in SRT players today? Not at all, it seems. Both VLC and Totem allow setting the character encoding and font used for subtitles in the (global) preferences menu, so presumably you would change that if the default doesn't work. Font switching seems to mainly be an issue when your system has other default fonts than the text you're reading, and it appears that is rare enough that very little software does anything about it, browsers perhaps being an exception. On Mon, 3 Jan 2011, Philip Jägenstedt wrote: > > * The "bad cue" handling is stricter than it should be. After > > collecting an id, the next line must be a timestamp line. Otherwise, > > we skip everything until a blank line, so in the following the > > parser would jump to "bad cue" on line "2" and skip the whole cue:
> >
> > 1
> > 2
> > 00:00:00.000 --> 00:00:01.000
> > Bla
> >
> > This doesn't match what most existing SRT parsers do, as they simply > > look for timing lines and ignore everything else. If we really need > > to collect the id instead of ignoring it like everyone else, this > > should be more robust, so that a valid timing line always begins a > > new cue. Personally, I'd prefer if it is simply ignored and that we > > use some form of in-cue markup for styling hooks.
> > The IDs are useful for referencing cues from script, so I haven't > removed them. I've also left the parsing as is for when neither the > first nor second line is a timing line, since that gives us a lot of > headroom for future extensions (we can do anything so long as the > second line doesn't start with a timestamp and "-->" and another > timestamp). In the case of feeding future extensions to current parsers, it's far better fallback behavior to simply ignore the unrecognized second line than to discard the entire cue. The current behavior seems unnecessarily strict and makes the parser more complicated than it needs to be. My preference is to just ignore anything preceding the timing line, but even if we must have IDs it can still be made simpler and more robust than what is currently spec'ed. If we just ignore content until we hit a line that happens to look like a timing line, then we are much more constrained in what we can do in the future. For example, we couldn't introduce a "comment block" syntax, since any comment containing a timing line wouldn't be ignored. On the other hand, if we keep the syntax as it is now, we can introduce a comment block just by having its first line include a "-->" but not have it match the timestamp syntax, e.g. by having it be "--> COMMENT" or some such. One of us must be confused; do you mean something like this?

1
--> COMMENT
00:00.000 --> 00:01.000
Cue text

Adding this syntax would break the *current* parser, as it would fail in step 39 (Collect WebVTT cue timings and settings) and then skip the rest of the cue. If we want any room for extensions along these lines, then multiple lines preceding the timing line must be handled gracefully. Looking at the parser more closely, I don't really see how doing anything more complex than skipping the block entirely would be simpler than what we have now, anyway. I suggest:

* Step 31: Try to "collect WebVTT cue timings and settings" instead of checking for the substring "-->". If it succeeds, jump to what is now step 40. If it fails, continue at what is now step 32. (This allows adding any syntax as long as it doesn't exactly match a timing line, including "--> COMMENT". As a bonus, one can fail faster when trying to parse an entire timing line rather than doing a substring search for "-->".)
* Step 32: Only set the id line if it's not already set. (Assuming we want the first line to be the id line in future extensions.)
* Step 39: Jump to the new step 31.

In case not every detail is correct, the idea is to first try to match a timing line and to take the first line that is not a timing line (if any) as the id, leaving everything in between open for future syntax changes, even if they use "-->". I think it's fairly important that we handle this. Doubled id lines are an easy mistake to make when copying things around. Silently dropping those cues would be worse than what many existing (line-based, id-
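The timing-line-first idea sketched above can be expressed as a small standalone parser: try each line as a timing line first, take the first non-timing line (if any) as the id, and skip other pre-timing lines instead of discarding the cue. The timestamp grammar here is deliberately simplified and the function names are invented; the real "collect WebVTT cue timings and settings" algorithm does considerably more:

```javascript
// Lenient cue-header handling per the proposal above. Simplified
// timestamp grammar: [hh:]mm:ss.mmm.
const TIMESTAMP = '(?:\\d+:)?[0-5]\\d:[0-5]\\d\\.\\d{3}';
const TIMING_LINE = new RegExp(`^\\s*(${TIMESTAMP})\\s+-->\\s+(${TIMESTAMP})`);

// Convert "[hh:]mm:ss.mmm" to seconds.
function toSeconds(ts) {
  return ts.split(':').map(Number).reduce((acc, p) => acc * 60 + p);
}

// Returns {id, start, end, text} for the first cue in `lines`, or null
// if no timing line is found at all (the whole block is then skippable,
// e.g. a future comment block).
function parseCueHeader(lines) {
  let id = null;
  for (let i = 0; i < lines.length; i++) {
    const m = TIMING_LINE.exec(lines[i]);
    if (m) {
      return {
        id,
        start: toSeconds(m[1]),
        end: toSeconds(m[2]),
        text: lines.slice(i + 1).join('\n'),
      };
    }
    if (id === null) id = lines[i]; // first non-timing line becomes the id
    // Later pre-timing lines (e.g. "--> COMMENT") are skipped rather than
    // killing the cue, leaving room for future syntax that uses "-->".
  }
  return null;
}
```

With this shape, the "--> COMMENT" example above parses to a cue with id "1" instead of being dropped, and a doubled id line costs only the extra line, not the whole cue.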
Re: [whatwg] Video feedback
On Thu, Jun 2, 2011 at 7:58 PM, Glenn Maynard wrote: > The most straightforward solution would seem to be having @lang be a > CSS property; I don't know the rationale for this being done by HTML > instead. The language of a block of text is a property of the content, not a styling attribute. It must be carried by the content itself. As an interesting aside, the direction of a block of text is a property of the content as well, but CSS has a 'direction' property. We only added that because XML didn't define a generic @dir attribute, so we needed *some* way for generic XML languages to specify the text direction (in this case, by specifying their own direction-specifying attribute and then providing a default stylesheet that sets 'direction' based on that). If XML had specified xml:dir like they did xml:lang, 'direction' wouldn't exist. Similarly, if XML hadn't specified xml:lang, we'd probably have a 'language' property. ~TJ
Re: [whatwg] Video feedback
On Thu, Jun 2, 2011 at 7:28 PM, Ian Hickson wrote: > We can add comments pretty easily (e.g. we could say that "<!" starts a > comment and ">" ends it -- that's already being ignored by the current > parser), if people really need them. But are comments really that useful? > Did SRT have problems due to not supporting inline comments? (Or did it > support inline comments?) I've only worked with SSA subtitles (fansubbing), where {text in braces} effectively worked as a comment. We used them a lot to communicate between editors on a phrase-by-phrase basis. But for that use case, using hidden spans makes more sense, since you can toggle them on and off to view them inline, etc. Given that, I'd be fine with a comment format that doesn't allow mid-cue comments, if it makes the format simpler. >> The text on the left is a transcription, the top is a transliteration, >> and the bottom is a translation. > > Aren't these three separate text tracks? They're all in the same track, in practice, since media players don't play multiple subtitle tracks. It's true that having them in separate tracks would be better, so they can be disabled individually. This is probably rare enough that it should just be sorted out with scripts, at least to start. > It's not clear to me that we need language information to apply proper > font selection and word wrapping, since CSS doesn't do it. But it doesn't have to, since HTML does this with @lang. > Mixing one CJK language with one non-CJK language seems fine. That should > always work, assuming you specify good fonts in the CSS. The font is ultimately in the user's control. I tell Firefox to always use Tahoma for Western text and MS Gothic for Japanese text, ignoring the often ugly site-specified fonts. The only control sites have over my fonts is the language they say the text is (or which the whole page is detected as). The same principle seems to apply for captions.
(That's not to say that it's important enough to add yet and I'm fine with punting on this, at least for now. I just don't think specifying fonts is the right solution.) The most straightforward solution would seem to be having @lang be a CSS property; I don't know the rationale for this being done by HTML instead. > I don't understand why we can't have good typography for CJK and non-CJK > together. Surely there are fonts that get both right? I've never seen a Japanese font that didn't look terrible for English text. Also, I don't want my font selection to be severely limited due to the need to use a single font for both languages, instead of using the right font for the right text. >> One example of how this can be tricky: at 0:17, a caption on the bottom >> wraps and takes two lines, which then pushes the line at 0:19 upward >> (that part's simple enough). If instead the top part had appeared >> first, the renderer would need to figure out in advance to push it >> upwards, to make space for the two-line caption underneath it. >> Otherwise, the captions would be forced to switch places. > > Right, without lookahead I don't know how you'd solve it. With lookahead > things get pretty dicey pretty quickly. The problem is that, at least here, the whole scene is nearly incomprehensible if the top/bottom arrangement isn't maintained. Lacking anything better, I suspect authors would use similar brittle hacks with WebVTT. Anyway, I don't have a simple solution either. >> I think that, no matter what you do, people will insert line breaks in >> cues. I'd follow the HTML model here: convert newlines to spaces and >> have a separate, explicit line break like <br> if needed, so people >> don't manually line-break unless they actually mean to. > > The line-breaks-are-line-breaks feature is one of the features that > originally made SRT seem like a good idea. It still seems like the neatest > way of having a line break. But does this matter?
Line breaks within a cue are relatively uncommon in my experience (perhaps it's different for other languages), compared to how many people will insert line breaks in a text editor simply to break lines while authoring. If you do this while testing on a large monitor, it's likely to look reasonable when rendered; the brokenness won't show up until it's played in a smaller window. Anyone using a non-programmer's text editor that doesn't handle long lines cleanly is likely to do this. Wrapping lines manually in SRTs also appears to be common (even standard) practice, perhaps due to inadequate line wrapping in SRT renderers. Making line breaks explicit should help keep people from translating this habit to WebVTT. >> Related to line breaking, should there be an escape? Inserting >> nbsp literally into files is somewhat annoying for authoring, since >> they're indistinguishable from regular spaces. > > How common would &nbsp; be? I guess the main cases I've used nbsp for don't apply so much to captions, e.g. © 2011 (likely to come at the start of a caption, so not likely to be wrapp