On Aug 12, 2011, at 10:01 AM, Aaron Colwell wrote:

Hi Mark,

comments inline...

On Thu, Aug 11, 2011 at 9:46 AM, Mark Watson <wats...@netflix.com> wrote:
I think it would be good if the API recognized the fact that the media data may 
be coming from several different original files/streams (e.g. different 
bitrates) as the player adapts to network or other conditions.

I agree. I intend to document this when I spec out the format of the byte 
stream that is passed into this API. Initially I'm focusing on WebM, which 
requires this type of functionality if the Vorbis initialization data ever 
needs to change during playback. My intuition says that Ogg & MP4 will require 
similar solutions.


The different files may have different initialization information (Info and 
Tracks in WebM, Movie Box in mp4 etc.), which could be provided either in the 
first append call for each stream or with a separate API call. But subsequently 
you need to know which initialization information is relevant for each appended 
block. An integer streamId in the append call would be sufficient - the 
absolute value has no meaning - it would just associate data from the same 
stream across calls.
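
To make this concrete, here is a rough sketch of what I have in mind (the 
sourceAppend() name and call shape are hypothetical, purely for illustration):

  // streamId is an opaque integer chosen by the page; its value has no
  // meaning beyond tying appends from the same original file together.
  sourceAppend(1, bitrate1InitData);   // initialization info for stream 1
  sourceAppend(1, bitrate1MediaData);  // later blocks from the same stream
  sourceAppend(2, bitrate2InitData);   // data from a different bitrate/file
  sourceAppend(2, bitrate2MediaData);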

Since I'm using WebM for the byte stream I don't need to add explicit streamIds 
to the API or data; stream IDs are already in the byte stream. Ogg bitstream 
serial numbers and MP4 track numbers should serve the same purpose.

I may have inadvertently overloaded "stream id". And I'm assuming that the 
different bitrates essentially come from different media files. If you use the 
track id in mp4 (or its equivalent in WebM) then you require a level of 
coordination in the creation of the different bitrate files: they must all use 
distinct track ids. To add a new bitrate you need to know what track ids were 
used in the old ones and pick a distinct one. When people get it wrong you have 
a difficult-to-detect failure mode.



The alternatives are:
(a) to require that all streams have the same or compatible initialization 
information or
(b) to pass the initialization information every time you change streams

(a) has the disadvantage of constraining encoding and making the addition of 
new streams more dependent on the details of how the existing streams were 
encoded/packaged.
(b) is OK, except that it is nice for the player to know "this data is from the 
same stream you were playing a while ago" - it can re-use some previously 
established state - rather than every stream change being 'out of the blue'.

I'm leaning toward (b) right now. Any time a change in stream parameters is 
needed, new INFO & TRACKS elements will be appended before the media data from 
the new source. This is similar to how Ogg chaining works. I don't think we 
need unique IDs for marking this state. The media engine can look at the new 
codec config data and see if it matches anything it has seen before. If so, it 
can simply reuse whatever resources it sees fit. Another thing to note is that 
just because we append this data every time a stream switch occurs, it doesn't 
mean we have to transfer that data across the network each time. JavaScript can 
cache this data and simply append it when necessary.
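
A rough sketch of the caching point (appendToMediaElement() and the cache 
layout are hypothetical names, not part of the proposal):

  // Keep each stream's INFO & TRACKS bytes around after the first fetch,
  // and re-append them ahead of the media data on every stream switch.
  var initCache = {};

  function cacheInitData(streamUrl, infoAndTracksBytes) {
    initCache[streamUrl] = infoAndTracksBytes;  // fetched once per stream
  }

  function switchToStream(streamUrl, mediaBytes) {
    appendToMediaElement(initCache[streamUrl]);  // re-append INFO & TRACKS
    appendToMediaElement(mediaBytes);            // then the new media data
  }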

That's fine for me. It needs to be clear in the API that this is the expected 
mode of operation. We can word this in a way that is independent of media 
format.



A separate comment is that, in practice, we have found it very useful for the 
media player to know the maximum resolution, frame rate and codec level/profile 
that will be used, which may be different from the resolution and codec 
level/profile of the first stream.


I agree that this info is useful, but it isn't clear to me that this API needs 
to support that. Existing APIs like 
canPlayType()<http://www.w3.org/TR/html5/video.html#dom-navigator-canplaytype> 
could be used to determine whether specific codec parameters are supported. 
Other DOM APIs could be used to determine max screen size. This could all be 
used to prune the candidate streams sent to the MediaSource API.
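
For example (just to illustrate the kind of pruning I mean; the candidates 
array and its fields are made up for this sketch):

  var video = document.querySelector('video');
  var usable = candidates.filter(function(stream) {
    return video.canPlayType(stream.mimeType) !== '' &&
           stream.width <= screen.width &&
           stream.height <= screen.height;
  });

Only the streams left in 'usable' would ever be appended through the 
MediaSource API.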

True, but I wasn't thinking so much of determining whether playback is 
supported, but of warning the media pipeline of what might be coming so that it 
can dimension various resources appropriately.

This may just be a matter of feeding the header for the highest 
resolution/profile stream first, even if you don't feed any media data for that 
stream. It's possible some players will not support switching to a resolution 
higher than the one established at the start of playback (at least we have 
found that to be the case with some embedded media pipelines today).
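
As a sketch of the "header first" idea (appendToMediaElement() is a 
hypothetical helper, and nothing here is a concrete proposal):

  // Prime the pipeline with the init data of the highest resolution/profile
  // stream it might ever see, then start playback from a lower bitrate.
  appendToMediaElement(highestProfileInitData);  // header only, no media data
  appendToMediaElement(startupStreamInitData);
  appendToMediaElement(startupStreamMediaData);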

...Mark



Aaron
