Re: [whatwg] Proposal for a MediaSource API that allows sending media data to a HTMLMediaElement

2011-08-16 Thread Mark Watson

On Aug 12, 2011, at 10:01 AM, Aaron Colwell wrote:

Hi Mark,

comments inline...

On Thu, Aug 11, 2011 at 9:46 AM, Mark Watson wats...@netflix.com wrote:
I think it would be good if the API recognized the fact that the media data may 
be coming from several different original files/streams (e.g. different 
bitrates) as the player adapts to network or other conditions.

I agree. I intend to document this when I spec out the format of the byte 
stream that is passed into this API. Initially I'm focusing on WebM which 
requires this type of functionality if the Vorbis initialization data ever 
needs to change during playback. My intuition says that Ogg & MP4 will require 
similar solutions.


The different files may have different initialization information (Info and 
Tracks in WebM, Movie Box in mp4 etc.), which could be provided either in the 
first append call for each stream or with a separate API call. But subsequently 
you need to know which initialization information is relevant for each appended 
block. An integer streamId in the append call would be sufficient - the 
absolute value has no meaning - it would just associate data from the same 
stream across calls.

Since I'm using WebM for the byte stream I don't need to add explicit streamIds 
to the API or data. StreamIDs are already in the byte stream. Ogg bitstream 
serial numbers, and MP4 track numbers should serve the same purpose.

I may have inadvertently overloaded stream id. And I'm assuming that the 
different bitrates essentially come from different media files. If you use the 
track id in mp4 (or its equivalent in WebM) then you require that there is a 
level of coordination in the creation of the different bitrate files: they must 
all use distinct track ids. To add a new bitrate you need to know what 
track ids were used in the old ones and pick a distinct one. When people get it 
wrong you have a difficult-to-detect failure mode.
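
A minimal sketch of the distinction drawn above, assuming the script (not the media files) hands out the stream id. The names `StreamIdRouter`, `TaggedBlock`, `register` and `initFor` are hypothetical and are not part of the draft spec; the point is only that an id assigned at registration time cannot collide, whatever track ids the files happen to use.

```typescript
// Hypothetical shape of an appended block tagged with a script-assigned id.
interface TaggedBlock {
  streamId: number;
  data: Uint8Array;
}

// Hypothetical registry: associates each source file's initialization
// information with an opaque integer whose absolute value has no meaning.
class StreamIdRouter {
  private initByStream: { [id: number]: Uint8Array } = {};
  private nextId = 1;

  // Registers the init info of one source file and returns its stream id.
  register(init: Uint8Array): number {
    const id = this.nextId++;
    this.initByStream[id] = init;
    return id;
  }

  // Looks up which init info is relevant for an appended block.
  initFor(block: TaggedBlock): Uint8Array {
    const init = this.initByStream[block.streamId];
    if (init === undefined) {
      throw new Error("unknown streamId " + block.streamId);
    }
    return init;
  }
}
```

Two bitrate files that both use track id 1 internally would still receive different stream ids here, so no coordination between encodes is needed.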



The alternatives are:
(a) to require that all streams have the same or compatible initialization 
information or
(b) to pass the initialization information every time you change streams

(a) has the disadvantage of constraining encoding, and making adding new 
streams more dependent on the details of how the existing streams were 
encoded/packaged
(b) is ok, except that it is nice for the player to know this data is from the 
same stream you were playing a while ago - it can re-use some previously 
established state - rather than every stream change being 'out of the blue'.

I'm leaning toward (b) right now. Any time a change in stream parameters is 
needed new INFO & TRACKS elements will be appended before the media data from 
the new source. This is similar to how Ogg chaining works. I don't think we 
need unique IDs for marking this state. The media engine can look at the new 
codec config data and see if it matches anything it has seen before. If so then 
it can simply reuse whatever resources it sees fit. Another thing to note is 
that just because we append this data every time a stream switch occurs, it 
doesn't mean we have to transfer that data across the network each time. 
JavaScript can cache this data and simply append it when necessary.

That's fine for me. It needs to be clear in the API that this is the expected 
mode of operation. We can word this in a way that is independent of media 
format.
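
A minimal sketch of option (b) with script-side caching, assuming an `append`-style callback stands in for the proposed API. `SwitchingAppender`, `cacheInit` and `appendMedia` are hypothetical names; the labels are private to the script and never reach the media engine.

```typescript
// Hypothetical stand-in for the proposed append() call.
type AppendFn = (data: Uint8Array) => void;

class SwitchingAppender {
  // Initialization data (e.g. WebM INFO & TRACKS) fetched once per stream.
  private initCache: { [label: string]: Uint8Array } = {};
  private current: string | null = null;
  private append: AppendFn;

  constructor(append: AppendFn) {
    this.append = append;
  }

  // Stores init data so it is not transferred over the network again.
  cacheInit(label: string, init: Uint8Array): void {
    this.initCache[label] = init;
  }

  // Appends media data; if the stream changed since the last call, the
  // cached init data for the new stream is appended first.
  appendMedia(label: string, media: Uint8Array): void {
    if (label !== this.current) {
      const init = this.initCache[label];
      if (init === undefined) {
        throw new Error("no cached init data for stream " + label);
      }
      this.append(init);
      this.current = label;
    }
    this.append(media);
  }
}
```

Switching from "low" to "high" and back re-appends each stream's init data at the switch point only, which is the mode of operation described above.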



A separate comment is that practically we have found it very useful for the 
media player to know the maximum resolution, frame rate and codec level/profile 
that will be used, which may be different from the resolution and 
codec/level/profile of the first stream.


I agree that this info is useful, but it isn't clear to me that this API needs 
to support that. Existing APIs like 
canPlayType()
http://www.w3.org/TR/html5/video.html#dom-navigator-canplaytype
could be used to determine whether specific codec parameters are supported. 
Other DOM APIs could be used to determine max screen size. This could all be 
used to prune the candidate streams sent to the MediaSource API.

True, but I wasn't thinking so much of determining whether playback is 
supported, but of warning the media pipeline of what might be coming so that it 
can dimension various resources appropriately.

This may just be a matter of feeding the header for the highest 
resolution/profile stream first, even if you don't feed any media data for that 
stream. It's possible some players will not support switching resolution to a 
resolution higher than that established at the start of playback (at least we 
have found that to be the case with some embedded media pipelines today).
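
A minimal sketch of the "feed the biggest header first" idea, assuming pixel area is an acceptable proxy for the most demanding stream (profile and frame rate are ignored here). `StreamHeader`, `pickLargestHeader` and `startupAppendOrder` are hypothetical names.

```typescript
// Hypothetical description of one stream's initialization header.
interface StreamHeader {
  label: string;
  width: number;
  height: number;
  init: Uint8Array;
}

// Picks the header with the largest pixel area.
function pickLargestHeader(headers: StreamHeader[]): StreamHeader {
  if (headers.length === 0) {
    throw new Error("no headers");
  }
  return headers.reduce((best, h) =>
    h.width * h.height > best.width * best.height ? h : best);
}

// Startup append order: the largest header first (with no media data),
// then the header of the stream that playback actually starts with.
function startupAppendOrder(headers: StreamHeader[], startLabel: string): StreamHeader[] {
  const largest = pickLargestHeader(headers);
  const start = headers.filter((h) => h.label === startLabel)[0];
  if (start === undefined) {
    throw new Error("unknown start stream " + startLabel);
  }
  return largest === start ? [start] : [largest, start];
}
```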

...Mark



Aaron



Re: [whatwg] Proposal for a MediaSource API that allows sending media data to a HTMLMediaElement

2011-08-14 Thread Frank Galligan
Hi All,

comments in line...

On Fri, Aug 12, 2011 at 1:01 PM, Aaron Colwell acolw...@google.com wrote:

 Hi Mark,

 comments inline...

 On Thu, Aug 11, 2011 at 9:46 AM, Mark Watson wats...@netflix.com wrote:

  I think it would be good if the API recognized the fact that the media data
  may be coming from several different original files/streams (e.g. different
  bitrates) as the player adapts to network or other conditions.
 

 I agree. I intend to document this when I spec out the format of the byte
 stream that is passed into this API. Initially I'm focusing on WebM which
 requires this type of functionality if the Vorbis initialization data ever
 needs to change during playback. My intuition says that Ogg & MP4 will
 require similar solutions.


 
  The different files may have different initialization information (Info and
  Tracks in WebM, Movie Box in mp4 etc.), which could be provided either in
  the first append call for each stream or with a separate API call. But
  subsequently you need to know which initialization information is relevant
  for each appended block. An integer streamId in the append call would be
  sufficient - the absolute value has no meaning - it would just associate
  data from the same stream across calls.
 

 Since I'm using WebM for the byte stream I don't need to add explicit
 streamIds to the API or data. StreamIDs are already in the byte stream. Ogg
 bitstream serial numbers, and MP4 track numbers should serve the same
 purpose.

 A little background. I have taken what Aaron has written for the MediaChunk
API and I am currently trying to create an adaptive player that will switch
WebM video streams seamlessly. There is only one audio stream. All streams
are in separate files.

Even in the simple case of one video stream and one audio stream, the
problem I'm running into with the current API is that there is no way to
send the header info for the separate streams without re-muxing the separate
headers into a combined header. I can do this in JavaScript for WebM files
(provided the track numbers are different or I would need to change all the
track numbers on the blocks in JavaScript) but I think it would be easier on
the person writing a player if they didn't have to worry about that.
The easiest solution would be to add a stream id. That way the media engine
doesn't need to force the player or encoder to deal with track ids that are
the same in different streams.

I think the next best solution is probably (b) from below. That way you
could send the header info for a video stream and the header info for
an audio stream to initialize the MediaEngine. Not that it is a big deal
but, you would still have the restriction that different stream types cannot
have the same track number.
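
A minimal sketch of the check a player would need before combining headers from separate files, assuming the track numbers have already been parsed out of each file's header. `SourceTracks` and `findTrackCollisions` are hypothetical names, and the file names in the usage note are made up.

```typescript
// Hypothetical summary of the track numbers found in one source file.
interface SourceTracks {
  file: string;
  trackNumbers: number[];
}

// Returns, in ascending order, the track numbers used by more than one file.
// Any number returned here would have to be rewritten on every block of one
// of the files before the headers could be merged.
function findTrackCollisions(sources: SourceTracks[]): number[] {
  const owners: { [track: number]: string[] } = {};
  sources.forEach((s) => {
    s.trackNumbers.forEach((t) => {
      if (owners[t] === undefined) {
        owners[t] = [];
      }
      if (owners[t].indexOf(s.file) < 0) {
        owners[t].push(s.file);
      }
    });
  });
  return Object.keys(owners)
    .map((k) => Number(k))
    .filter((t) => owners[t].length > 1)
    .sort((x, y) => x - y);
}
```

For example, two video files that both use track 1 alongside an audio file on track 2 would report track 1 as colliding.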


 
  The alternatives are:
  (a) to require that all streams have the same or compatible initialization
  information or
  (b) to pass the initialization information every time you change streams
 
  (a) has the disadvantage of constraining encoding, and making adding new
  streams more dependent on the details of how the existing streams were
  encoded/packaged
  (b) is ok, except that it is nice for the player to know this data is from
  the same stream you were playing a while ago - it can re-use some
  previously established state - rather than every stream change being 'out of
  the blue'.
 

 I'm leaning toward (b) right now. Any time a change in stream parameters is
 needed new INFO & TRACKS elements will be appended before the media data
 from the new source. This is similar to how Ogg chaining works. I don't
 think we need unique IDs for marking this state. The media engine can look
 at the new codec config data and see if it matches anything it has seen
 before. If so then it can simply reuse whatever resources it sees fit.
 Another thing to note is that just because we append this data every time a
 stream switch occurs, it doesn't mean we have to transfer that data across
 the network each time. JavaScript can cache this data and simply append it
 when necessary.


 
  A separate comment is that practically we have found it very useful for the
  media player to know the maximum resolution, frame rate and codec
  level/profile that will be used, which may be different from the resolution
  and codec/level/profile of the first stream.
 
 
 I agree that this info is useful, but it isn't clear to me that this API
 needs to support that. Existing APIs like
 canPlayType()
 http://www.w3.org/TR/html5/video.html#dom-navigator-canplaytype
 could be used to determine whether specific codec parameters are supported.
 Other DOM APIs could be used to determine max screen size. This could all be
 used to prune the candidate streams sent to the MediaSource API.


 Aaron



Re: [whatwg] Proposal for a MediaSource API that allows sending media data to a HTMLMediaElement

2011-08-12 Thread Aaron Colwell
Hi Mark,

comments inline...

On Thu, Aug 11, 2011 at 9:46 AM, Mark Watson wats...@netflix.com wrote:

 I think it would be good if the API recognized the fact that the media data
 may be coming from several different original files/streams (e.g. different
 bitrates) as the player adapts to network or other conditions.


I agree. I intend to document this when I spec out the format of the byte
stream that is passed into this API. Initially I'm focusing on WebM which
requires this type of functionality if the Vorbis initialization data ever
needs to change during playback. My intuition says that Ogg & MP4 will
require similar solutions.



 The different files may have different initialization information (Info and
 Tracks in WebM, Movie Box in mp4 etc.), which could be provided either in
 the first append call for each stream or with a separate API call. But
 subsequently you need to know which initialization information is relevant
 for each appended block. An integer streamId in the append call would be
 sufficient - the absolute value has no meaning - it would just associate
 data from the same stream across calls.


Since I'm using WebM for the byte stream I don't need to add explicit
streamIds to the API or data. StreamIDs are already in the byte stream. Ogg
bitstream serial numbers, and MP4 track numbers should serve the same
purpose.



 The alternatives are:
 (a) to require that all streams have the same or compatible initialization
 information or
 (b) to pass the initialization information every time you change streams

 (a) has the disadvantage of constraining encoding, and making adding new
 streams more dependent on the details of how the existing streams were
 encoded/packaged
 (b) is ok, except that it is nice for the player to know this data is from
 the same stream you were playing a while ago - it can re-use some
 previously established state - rather than every stream change being 'out of
 the blue'.


I'm leaning toward (b) right now. Any time a change in stream parameters is
needed new INFO & TRACKS elements will be appended before the media data
from the new source. This is similar to how Ogg chaining works. I don't
think we need unique IDs for marking this state. The media engine can look
at the new codec config data and see if it matches anything it has seen
before. If so then it can simply reuse whatever resources it sees fit.
Another thing to note is that just because we append this data every time a
stream switch occurs, it doesn't mean we have to transfer that data across
the network each time. JavaScript can cache this data and simply append it
when necessary.



 A separate comment is that practically we have found it very useful for the
 media player to know the maximum resolution, frame rate and codec
 level/profile that will be used, which may be different from the resolution
 and codec/level/profile of the first stream.


I agree that this info is useful, but it isn't clear to me that this API
needs to support that. Existing APIs like
canPlayType()
http://www.w3.org/TR/html5/video.html#dom-navigator-canplaytype
could be used to determine whether specific codec parameters are supported.
Other DOM APIs could be used to determine max screen size. This could all be
used to prune the candidate streams sent to the MediaSource API.
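
A minimal sketch of that pruning step, assuming the script already knows each candidate's MIME type and dimensions. `CandidateStream` and `pruneCandidates` are hypothetical names; the capability check is passed in as a function so the sketch does not depend on a real media element.

```typescript
// The three result strings that HTMLMediaElement.canPlayType() can return.
type CanPlayResult = "" | "maybe" | "probably";

// Hypothetical description of one candidate stream.
interface CandidateStream {
  mimeType: string; // e.g. 'video/webm; codecs="vp8, vorbis"'
  width: number;
  height: number;
}

// Keeps only the candidates that the element does not rule out (any
// non-empty canPlayType answer) and that fit within the given size.
function pruneCandidates(
  candidates: CandidateStream[],
  canPlayType: (type: string) => CanPlayResult,
  maxWidth: number,
  maxHeight: number
): CandidateStream[] {
  return candidates.filter((c) =>
    canPlayType(c.mimeType) !== "" &&
    c.width <= maxWidth &&
    c.height <= maxHeight);
}
```

In a page, the check could be supplied as `(t) => video.canPlayType(t)` for some video element `video`, with `screen.width` and `screen.height` as the size limits.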


Aaron


Re: [whatwg] Proposal for a MediaSource API that allows sending media data to a HTMLMediaElement

2011-08-11 Thread Mark Watson
Hi Aaron,

I think it would be good if the API recognized the fact that the media data may 
be coming from several different original files/streams (e.g. different 
bitrates) as the player adapts to network or other conditions.

The different files may have different initialization information (Info and 
Tracks in WebM, Movie Box in mp4 etc.), which could be provided either in the 
first append call for each stream or with a separate API call. But subsequently 
you need to know which initialization information is relevant for each appended 
block. An integer streamId in the append call would be sufficient - the 
absolute value has no meaning - it would just associate data from the same 
stream across calls.

The alternatives are:
(a) to require that all streams have the same or compatible initialization 
information or
(b) to pass the initialization information every time you change streams

(a) has the disadvantage of constraining encoding, and making adding new 
streams more dependent on the details of how the existing streams were 
encoded/packaged
(b) is ok, except that it is nice for the player to know this data is from the 
same stream you were playing a while ago - it can re-use some previously 
established state - rather than every stream change being 'out of the blue'.

A separate comment is that practically we have found it very useful for the 
media player to know the maximum resolution, frame rate and codec level/profile 
that will be used, which may be different from the resolution and 
codec/level/profile of the first stream.

...Mark

On Jul 11, 2011, at 11:42 AM, Aaron Colwell wrote:

 Hi,
 
 Based on comments in the File API Streaming Blobs
 http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2011-January/029973.html
 thread and my Extending HTML 5 video for adaptive streaming
 http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2011-June/032277.html
 thread, I decided on taking a stab at writing a MediaSource API spec
 http://html5-mediasource-api.googlecode.com/svn/trunk/draft-spec/mediasource-draft-spec.html
 for streaming data to a media tag.
 
 Please take a look at the spec
 http://html5-mediasource-api.googlecode.com/svn/trunk/draft-spec/mediasource-draft-spec.html
 and provide some feedback.
 
 I've tried to start with the simplest thing that would work and hope to
 expand from there if need be. For now, I'm intentionally not trying to solve
 the generic streaming file case because I believe there might be media
 specific requirements around handling seeking especially if we intend to
 support non-packetized media streams like WAV.
 
 If the feedback is generally positive on this approach, I'll start working
 on patches for WebKit & Chrome so people can experiment with an actual
 implementation.
 
 Thanks,
 Aaron
 



Re: [whatwg] Proposal for a MediaSource API that allows sending media data to a HTMLMediaElement

2011-07-14 Thread Aaron Colwell
On Wed, Jul 13, 2011 at 8:00 PM, Robert O'Callahan rob...@ocallahan.org wrote:

 On Thu, Jul 14, 2011 at 4:35 AM, Aaron Colwell acolw...@google.com wrote:

 I am open to suggestions. My intent was that the browser would not attempt
 to cache any data passed into append(). It would just demux the buffers that
 are sent in. When a seek is requested, it flushes whatever it has and waits
 for more data from append().  If the web application wants to do caching it
 can use the WebStorage or File APIs. If the browser's media engine needs a
 certain amount of preroll data before it starts playback it can signal
 this explicitly through new attributes or just use HAVE_FUTURE_DATA &
 HAVE_ENOUGH_DATA readyStates to signal when it has enough.


 OK, I sorta get the idea. I think you're defining a new interface to the
 media processing pipeline that integrates with the demuxer and codecs at a
 different level to regular media resource loading. (For example, all the
 browser's built-in logic for seeking and buffering would have to be disabled
 and/or bypassed.)


Yes.


 As such, it would have to be carefully specified, potentially in a
 container- or codec-dependent way, unlike APIs like Blobs which work just
 like regular media resource loading and can thus work with any
 container/codec.


My hope is that the data passed to append will basically look like the live
streaming form of containers like Ogg & WebM so this isn't totally foreign
to the existing browser code. We'd probably have to spec the level of
support for Ogg chaining and multiple WebM segments but I don't think that
should be too bad. Seeking is where the trickiness happens and I was just
planning on making it look like a new live stream whose starting timestamp
indicates the actual point seeked to.

I was tempted to create an API that just passed in compressed video/audio
frames and made JavaScript do all of the demuxing, but I thought people
might find that too radical.



 I'm not sure what the best way to do this is, to be honest. It comes down
 to the use-cases. If you want to experiment with different seeking
 strategies, can't you just do that in Chrome itself? If you want scriptable
 adaptive streaming (or even if you don't) then I think we want APIs for
 seamless transitioning along a sequence of media resources, or between
 resources loaded in parallel.


I think the best course of action is for me to get my prototype in a state
where others can play with it and I can demonstrate some of the uses that
I'm trying to enable. I think that will make this a little more concrete.
 I'll keep this list posted on my progress.

Thanks for your help,
Aaron


Re: [whatwg] Proposal for a MediaSource API that allows sending media data to a HTMLMediaElement

2011-07-13 Thread Aaron Colwell
On Tue, Jul 12, 2011 at 5:05 PM, Robert O'Callahan rob...@ocallahan.org wrote:

 On Wed, Jul 13, 2011 at 12:00 PM, Aaron Colwell acolw...@google.com wrote:

 On Tue, Jul 12, 2011 at 4:44 PM, Robert O'Callahan rob...@ocallahan.org wrote:

 I had imagined that this API would let the author feed in the same data
 as you would load from some URI. But that can't be what's happening, since
 in some element implementations (e.g., Gecko's) loaded data is buffered
 internally and seeking might not require any new data to be loaded.


  No. The idea is to allow JavaScript to manage fetching the media data so
 various fetching strategies could be implemented without needing to change
 the browser. My initial motivation is for supporting adaptive streaming with
 this mechanism, but I think various media mashup and delivery scenarios
 could be explored with this.


 I don't think you can do that with this API without making huge assumptions
 about what the browser's demuxer, internal caching, etc are doing.


I am open to suggestions. My intent was that the browser would not attempt
to cache any data passed into append(). It would just demux the buffers that
are sent in. When a seek is requested, it flushes whatever it has and waits
for more data from append().  If the web application wants to do caching it
can use the WebStorage or File APIs. If the browser's media engine needs a
certain amount of preroll data before it starts playback it can signal
this explicitly through new attributes or just use HAVE_FUTURE_DATA &
HAVE_ENOUGH_DATA readyStates to signal when it has enough.
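
A minimal model of that intent, assuming buffered data is tracked in seconds and that the preroll requirement maps onto readyState as shown. The numeric constants match the HTMLMediaElement readyState values; `NonCachingSink` and the mapping itself are hypothetical, not something the draft spec defines.

```typescript
// Numeric readyState values as defined on HTMLMediaElement.
const HAVE_METADATA = 1;
const HAVE_FUTURE_DATA = 3;
const HAVE_ENOUGH_DATA = 4;

// Hypothetical model of a media engine that keeps no cache of its own:
// it holds only what was appended since the last seek.
class NonCachingSink {
  private bufferedSeconds = 0;
  private prerollSeconds: number;

  constructor(prerollSeconds: number) {
    this.prerollSeconds = prerollSeconds;
  }

  // Data passed to append() is demuxed and held until played or flushed.
  append(durationSeconds: number): void {
    this.bufferedSeconds += durationSeconds;
  }

  // A seek flushes everything; the engine then waits for new appends.
  seek(): void {
    this.bufferedSeconds = 0;
  }

  // Assumed mapping: enough preroll reports HAVE_ENOUGH_DATA, some data
  // reports HAVE_FUTURE_DATA, and an empty buffer reports HAVE_METADATA.
  readyState(): number {
    if (this.bufferedSeconds <= 0) {
      return HAVE_METADATA;
    }
    return this.bufferedSeconds >= this.prerollSeconds
      ? HAVE_ENOUGH_DATA
      : HAVE_FUTURE_DATA;
  }
}
```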

Aaron


Re: [whatwg] Proposal for a MediaSource API that allows sending media data to a HTMLMediaElement

2011-07-13 Thread Robert O'Callahan
On Thu, Jul 14, 2011 at 4:35 AM, Aaron Colwell acolw...@google.com wrote:

 I am open to suggestions. My intent was that the browser would not attempt
 to cache any data passed into append(). It would just demux the buffers that
 are sent in. When a seek is requested, it flushes whatever it has and waits
 for more data from append().  If the web application wants to do caching it
 can use the WebStorage or File APIs. If the browser's media engine needs a
 certain amount of preroll data before it starts playback it can signal
 this explicitly through new attributes or just use HAVE_FUTURE_DATA &
 HAVE_ENOUGH_DATA readyStates to signal when it has enough.


OK, I sorta get the idea. I think you're defining a new interface to the
media processing pipeline that integrates with the demuxer and codecs at a
different level to regular media resource loading. (For example, all the
browser's built-in logic for seeking and buffering would have to be disabled
and/or bypassed.) As such, it would have to be carefully specified,
potentially in a container- or codec-dependent way, unlike APIs like Blobs
which work just like regular media resource loading and can thus work with
any container/codec.

I'm not sure what the best way to do this is, to be honest. It comes down to
the use-cases. If you want to experiment with different seeking strategies,
can't you just do that in Chrome itself? If you want scriptable adaptive
streaming (or even if you don't) then I think we want APIs for seamless
transitioning along a sequence of media resources, or between resources
loaded in parallel.

Rob
-- 
If we claim to be without sin, we deceive ourselves and the truth is not in
us. If we confess our sins, he is faithful and just and will forgive us our
sins and purify us from all unrighteousness. If we claim we have not sinned,
we make him out to be a liar and his word is not in us. [1 John 1:8-10]


Re: [whatwg] Proposal for a MediaSource API that allows sending media data to a HTMLMediaElement

2011-07-12 Thread Harald Alvestrand
Not a comment directly on the spec, but you might want to check what 
people are suggesting for interactive media handling in the WEBRTC 
working group.


Streaming is different from interactive media, but it would be a shame 
to have incompatibilities that can be avoided.


On 07/11/11 20:42, Aaron Colwell wrote:

Hi,

Based on comments in the File API Streaming Blobs
http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2011-January/029973.html
thread and my Extending HTML 5 video for adaptive streaming
http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2011-June/032277.html
thread, I decided on taking a stab at writing a MediaSource API spec
http://html5-mediasource-api.googlecode.com/svn/trunk/draft-spec/mediasource-draft-spec.html
for streaming data to a media tag.

Please take a look at the spec
http://html5-mediasource-api.googlecode.com/svn/trunk/draft-spec/mediasource-draft-spec.html
and provide some feedback.

I've tried to start with the simplest thing that would work and hope to
expand from there if need be. For now, I'm intentionally not trying to solve
the generic streaming file case because I believe there might be media
specific requirements around handling seeking especially if we intend to
support non-packetized media streams like WAV.

If the feedback is generally positive on this approach, I'll start working
on patches for WebKit & Chrome so people can experiment with an actual
implementation.

Thanks,
Aaron





Re: [whatwg] Proposal for a MediaSource API that allows sending media data to a HTMLMediaElement

2011-07-12 Thread Aaron Colwell
On Mon, Jul 11, 2011 at 5:54 PM, Robert O'Callahan rob...@ocallahan.org wrote:

 It seems to me that the spec is written assuming only one media element is
 consuming the MediaSource. But nothing stops multiple elements consuming the
 same URL simultaneously. Maybe instead of going through a URL you should add
 API directly to media elements.


You are right that I don't have anything preventing the MediaSource URL from
being passed to multiple media elements. Only one media element will accept
the URL though because whichever one opens the URL first will transition the
source to the OPEN state. Media elements can only open sources in the CLOSED
state. I'm using a URL for initialization to be consistent with how the
media element is initialized in all other cases. I didn't want to create a
new initialization path.
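
A minimal sketch of the open-once rule described above, assuming element identity can be modelled as a string. `SketchSource`, `tryOpen` and `currentOwner` are hypothetical names; allowing a re-open after `close()` is also an assumption, since the message only says that sources can be opened in the CLOSED state.

```typescript
type SourceState = "CLOSED" | "OPEN";

// Hypothetical model of a source that only one media element can open.
class SketchSource {
  private state: SourceState = "CLOSED";
  private owner: string | null = null;

  // The first element to open the URL moves the source to OPEN.
  // Any later attempt is refused while the source stays OPEN.
  tryOpen(elementId: string): boolean {
    if (this.state !== "CLOSED") {
      return false;
    }
    this.state = "OPEN";
    this.owner = elementId;
    return true;
  }

  // Assumed: closing returns the source to CLOSED so it can be opened again.
  close(): void {
    this.state = "CLOSED";
    this.owner = null;
  }

  currentOwner(): string | null {
    return this.owner;
  }
}
```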

I thought about adding an attribute to HTMLMediaElement that provided a URL
for signalling MediaSource usage. That mechanism would allow you to create a
URL that only works with that element. When this URL is specified, a
MediaSource attribute would be updated on the media element during loading
and JavaScript could use that to pass data to the tag. I couldn't find a
similar pattern in other APIs so I didn't take that path. If people think
that is a better route then I'm all for it.



 bytesAvailable is for flow control? Instead of doing it this way, I would
 follow WebSockets and use a bufferedAmount attribute to indicate how much
 data is currently buffered up. That makes it easy for authors who don't want
 to care about flow control to just append stuff without encountering errors,
 while still allowing authors who care about flow control to do it.


Yes. The intent was to provide a way for the browser to control how much
data was being pushed into it. It looks like WebSocket will just close the
connection if it doesn't have enough buffer space and the API doesn't appear
to provide a mechanism to predict how much buffered data will trigger a
close. Do we want similar semantics for media? It seems like the browser
should provide some hints to indicate that it is not ok to push hours/days
of data into this interface.
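
A minimal sketch of script-side flow control built on a WebSocket-style `bufferedAmount`, assuming the sink exposes such an attribute. `BufferedSink`, `appendWithBackpressure` and the high-water mark are hypothetical; the draft spec at this point uses bytesAvailable instead.

```typescript
// Hypothetical sink exposing a WebSocket-style backlog counter.
interface BufferedSink {
  bufferedAmount: number; // bytes accepted but not yet consumed
  append(data: Uint8Array): void;
}

// Appends queued chunks in order while the backlog stays at or under the
// high-water mark, and returns the chunks that still have to be sent later.
function appendWithBackpressure(
  sink: BufferedSink,
  queue: Uint8Array[],
  highWaterMark: number
): Uint8Array[] {
  let i = 0;
  while (i < queue.length &&
         sink.bufferedAmount + queue[i].length <= highWaterMark) {
    sink.append(queue[i]);
    i++;
  }
  return queue.slice(i);
}
```

An author who does not care about flow control can ignore `bufferedAmount` and call `append()` directly; one who does can re-run this function whenever the backlog drains.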

Thanks for your comments.

Aaron


Re: [whatwg] Proposal for a MediaSource API that allows sending media data to a HTMLMediaElement

2011-07-12 Thread Aaron Colwell
Hi Harald,

Please point me to specific threads that talk about this. I looked through
the public-web...@w3.org archive and didn't see anything about interactive
media handling. I did look through the Mozilla/Cisco proposal thread
http://lists.w3.org/Archives/Public/public-webrtc/2011Jul/0010.html
and didn't see anything in my proposal that is incompatible with what is being
proposed there.

Aaron

On Tue, Jul 12, 2011 at 12:31 AM, Harald Alvestrand har...@alvestrand.no wrote:

 Not a comment directly on the spec, but you might want to check what people
 are suggesting for interactive media handling in the WEBRTC working group.

 Streaming is different from interactive media, but it would be a shame to
 have incompatibilities that can be avoided.





Re: [whatwg] Proposal for a MediaSource API that allows sending media data to a HTMLMediaElement

2011-07-12 Thread Robert O'Callahan
On Wed, Jul 13, 2011 at 8:45 AM, Aaron Colwell acolw...@google.com wrote:

 I thought about adding an attribute to HTMLMediaElement that provided a URL
 for signalling MediaSource usage. That mechanism would allow you to create a
 URL that only works with that element. When this URL is specified, a
 MediaSource attribute would be updated on the media element during loading
 and JavaScript could use that to pass data to the tag. I couldn't find a
 similar pattern in other APIs so I didn't take that path. If people think
 that is a better route then I'm all for it.


I was thinking more of putting the MediaSource functionality
(open/append/close) on the media element itself.

Do you need to support seeking with this API? That's hard. It would be
simpler if we didn't have to support seeking. Instead of seeking you could
just open a new stream and pour data in for the new offset.

Rob


Re: [whatwg] Proposal for a MediaSource API that allows sending media data to a HTMLMediaElement

2011-07-12 Thread Aaron Colwell
On Tue, Jul 12, 2011 at 3:28 PM, Robert O'Callahan rob...@ocallahan.org wrote:

 On Wed, Jul 13, 2011 at 8:45 AM, Aaron Colwell acolw...@google.com wrote:

 I thought about adding an attribute to HTMLMediaElement that provided a
 URL for signalling MediaSource usage. That mechanism would allow you to
 create a URL that only works with that element. When this URL is specified,
 a MediaSource attribute would be updated on the media element during loading
 and JavaScript could use that to pass data to the tag. I couldn't find a
 similar pattern in other APIs so I didn't take that path. If people think
 that is a better route then I'm all for it.


 I was thinking more of putting the MediaSource functionality
 (open/append/close) on the media element itself.


I'm open to that. In fact that is how my current prototype is implemented
because it was the least painful way to test these ideas in WebKit. My
prototype only implements append() and uses existing media element events as
proxies for the events I've proposed. I only separated this out into a
separate object because I thought people might prefer an object to represent
the source of the media and leave the media element object an endpoint for
controlling media playback.



 Do you need to support seeking with this API? That's hard. It would be
 simpler if we didn't have to support seeking. Instead of seeking you could
 just open a new stream and pour data in for the new offset.


 I'd like to be able to support seeking so you can use this mechanism for
on-demand playback. In my prototype seeking wasn't too difficult to
implement. I just triggered it off the seeking event. Any append() that
happens after the seeking event fires is associated with the new seek
location. currentTime is updated with the timestamp in the first cluster
passed to append() after the seeking event fires. Once the media engine has
this timestamp and enough preroll data, then it will fire the seeked event
like normal. I haven't tested this with rapid fire seeking yet, but I think
this mechanism should work.
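
A minimal model of the prototype's seek handling as described above, assuming preroll is measured in seconds of appended data. `SeekModel` and `SketchCluster` are hypothetical names, and appends outside a seek are deliberately not modelled.

```typescript
// Hypothetical summary of one appended cluster.
interface SketchCluster {
  timestamp: number; // start time in seconds
  duration: number;  // length in seconds
}

class SeekModel {
  currentTime = 0;
  seeking = false;
  events: string[] = [];
  private awaitingFirstCluster = false;
  private preroll = 0;
  private prerollNeeded: number;

  constructor(prerollNeeded: number) {
    this.prerollNeeded = prerollNeeded;
  }

  // Fires "seeking"; every later append belongs to the new seek location.
  seek(): void {
    this.seeking = true;
    this.awaitingFirstCluster = true;
    this.preroll = 0;
    this.events.push("seeking");
  }

  append(c: SketchCluster): void {
    if (!this.seeking) {
      return; // normal playback appends are not modelled here
    }
    if (this.awaitingFirstCluster) {
      // currentTime comes from the first cluster appended after "seeking".
      this.currentTime = c.timestamp;
      this.awaitingFirstCluster = false;
    }
    this.preroll += c.duration;
    if (this.preroll >= this.prerollNeeded) {
      this.seeking = false;
      this.events.push("seeked");
    }
  }
}
```

Note that the position the script chose is never passed in explicitly: the model learns it from the first cluster's timestamp, as in the prototype.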

Aaron


Re: [whatwg] Proposal for a MediaSource API that allows sending media data to a HTMLMediaElement

2011-07-12 Thread Robert O'Callahan
On Wed, Jul 13, 2011 at 11:14 AM, Aaron Colwell acolw...@google.com wrote:


 I'm open to that. In fact that is how my current prototype is implemented
 because it was the least painful way to test these ideas in WebKit. My
 prototype only implements append() and uses existing media element events as
 proxies for the events I've proposed. I only separated this out into a
 separate object because I thought people might prefer an object to represent
 the source of the media and leave the media element object an endpoint for
 controlling media playback.


We're kinda stuck with media elements handling both playback endpoints and
resource loading.



 Do you need to support seeking with this API? That's hard. It would be
 simpler if we didn't have to support seeking. Instead of seeking you could
 just open a new stream and pour data in for the new offset.


  I'd like to be able to support seeking so you can use this mechanism for
 on-demand playback. In my prototype seeking wasn't too difficult to
 implement. I just triggered it off the seeking event. Any append() that
 happens after the seeking event fires is associated with the new seek
 location. currentTime is updated with the timestamp in the first cluster
 passed to append() after the seeking event fires. Once the media engine has
 this timestamp and enough preroll data, then it will fire the seeked event
 like normal. I haven't tested this with rapid fire seeking yet, but I think
 this mechanism should work.


How do you communicate the data offset that the element wants to read at
over to the script that provides the data? In general you can't know the
strategy the decoder/demuxer uses for seeking, so you don't know what data
it will request.

Rob
-- 
If we claim to be without sin, we deceive ourselves and the truth is not in
us. If we confess our sins, he is faithful and just and will forgive us our
sins and purify us from all unrighteousness. If we claim we have not sinned,
we make him out to be a liar and his word is not in us. [1 John 1:8-10]


Re: [whatwg] Proposal for a MediaSource API that allows sending media data to a HTMLMediaElement

2011-07-12 Thread Aaron Colwell
On Tue, Jul 12, 2011 at 4:17 PM, Robert O'Callahan rob...@ocallahan.org wrote:

 On Wed, Jul 13, 2011 at 11:14 AM, Aaron Colwell acolw...@google.com wrote:


 I'm open to that. In fact that is how my current prototype is implemented
 because it was the least painful way to test these ideas in WebKit. My
 prototype only implements append() and uses existing media element events as
 proxies for the events I've proposed. I only separated this out into a
 separate object because I thought people might prefer an object to represent
 the source of the media and leave the media element object an endpoint for
 controlling media playback.


 We're kinda stuck with media elements handling both playback endpoints and
 resource loading.


Ok. This makes implementation in WebKit easier for me so I won't push too
hard to keep it separate from the media element. :)





 Do you need to support seeking with this API? That's hard. It would be
 simpler if we didn't have to support seeking. Instead of seeking you could
 just open a new stream and pour data in for the new offset.


  I'd like to be able to support seeking so you can use this mechanism for
 on-demand playback. In my prototype seeking wasn't too difficult to
 implement. I just triggered it off the seeking event. Any append() that
 happens after the seeking event fires is associated with the new seek
 location. currentTime is updated with the timestamp in the first cluster
 passed to append() after the seeking event fires. Once the media engine has
 this timestamp and enough preroll data, then it will fire the seeked event
 like normal. I haven't tested this with rapid fire seeking yet, but I think
 this mechanism should work.


 How do you communicate the data offset that the element wants to read at
 over to the script that provides the data? In general you can't know the
 strategy the decoder/demuxer uses for seeking, so you don't know what data
 it will request.


I'm doing WebM demuxing and media fetching in JavaScript. When a seek
occurs, I look at currentTime to see where we are seeking to. I then look at
the CUES index data I've fetched to find the file offset for the closest
seek point to the desired time. The appropriate data is fetched and pushed
into the element via append(). The seeked event firing and readyState
transitioning to HAVE_FUTURE_DATA or HAVE_ENOUGH_DATA tells me when I've
sent the element enough data. During playback I just monitor the buffered
attribute to keep a specific duration ahead of the current playback time.
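
Roughly, the lookup and the buffer check could be sketched as below. This is
only an illustration, not the prototype's code: CuePoint, findSeekPoint,
BufferedRange and bufferedAhead are hypothetical names, the cue entries are
assumed to be already parsed and sorted by time, and "closest seek point" is
assumed here to mean the last cue at or before the target time:

```typescript
// Hypothetical shape of one entry parsed from the WebM CUES index.
interface CuePoint {
  timeSec: number;    // presentation time of the seek point
  byteOffset: number; // file offset of the data to fetch
}

// Binary search for the last cue at or before targetSec.
// Falls back to the first cue when the target precedes every cue.
function findSeekPoint(cues: CuePoint[], targetSec: number): CuePoint | undefined {
  let lo = 0;
  let hi = cues.length - 1;
  let best: CuePoint | undefined = undefined;
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    const cue = cues[mid];
    if (cue !== undefined && cue.timeSec <= targetSec) {
      best = cue;
      lo = mid + 1;
    } else {
      hi = mid - 1;
    }
  }
  return best ?? cues[0];
}

// One buffered range, as could be read from the element's buffered attribute.
interface BufferedRange {
  start: number;
  end: number;
}

// Seconds of media buffered ahead of the current playback time.
function bufferedAhead(ranges: BufferedRange[], currentTime: number): number {
  for (const range of ranges) {
    if (range.start <= currentTime && currentTime <= range.end) {
      return range.end - currentTime;
    }
  }
  return 0;
}

// True when less than targetAheadSec is buffered and more data should be fetched.
function needsMoreData(ranges: BufferedRange[], currentTime: number, targetAheadSec: number): boolean {
  return bufferedAhead(ranges, currentTime) < targetAheadSec;
}
```

On a seek, the script would fetch from findSeekPoint(cues, currentTime).byteOffset
and append() the result; during playback it would fetch more whenever
needsMoreData() returns true.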

Aaron


Re: [whatwg] Proposal for a MediaSource API that allows sending media data to a HTMLMediaElement

2011-07-12 Thread Robert O'Callahan
On Wed, Jul 13, 2011 at 11:30 AM, Aaron Colwell acolw...@google.com wrote:

 I'm doing WebM demuxing and media fetching in JavaScript. When a seek
 occurs, I look at currentTime to see where we are seeking to. I then look at
 the CUES index data I've fetched to find the file offset for the closest
 seek point to the desired time. The appropriate data is fetched and pushed
 into the element via append(). The seeked event firing and readyState
 transitioning to HAVE_FUTURE_DATA or HAVE_ENOUGH_DATA tells me when I've
 sent the element enough data. During playback I just monitor the buffered
 attribute to keep a specific duration ahead of the current playback time.


Now I'm rather confused about what you're doing and how you're using this
feature. What format is the data that you're feeding into the element?

I had imagined that this API would let the author feed in the same data as
you would load from some URI. But that can't be what's happening, since in
some element implementations (e.g., Gecko's) loaded data is buffered
internally and seeking might not require any new data to be loaded.

Rob


Re: [whatwg] Proposal for a MediaSource API that allows sending media data to a HTMLMediaElement

2011-07-12 Thread Aaron Colwell
On Tue, Jul 12, 2011 at 4:44 PM, Robert O'Callahan rob...@ocallahan.org wrote:

 On Wed, Jul 13, 2011 at 11:30 AM, Aaron Colwell acolw...@google.com wrote:

 I'm doing WebM demuxing and media fetching in JavaScript. When a seek
 occurs, I look at currentTime to see where we are seeking to. I then look at
 the CUES index data I've fetched to find the file offset for the closest
 seek point to the desired time. The appropriate data is fetched and pushed
 into the element via append(). The seeked event firing and readyState
 transitioning to HAVE_FUTURE_DATA or HAVE_ENOUGH_DATA tells me when I've
 sent the element enough data. During playback I just monitor the buffered
 attribute to keep a specific duration ahead of the current playback time.


 Now I'm rather confused about what you're doing and how you're using this
 feature. What format is the data that you're feeding into the element?


Sorry I wasn't clear about my intent. Currently I'm feeding it WebM. I could
see this expanding to Ogg and perhaps MP4. Theoretically any format that
looks like a packet stream could work.



 I had imagined that this API would let the author feed in the same data as
 you would load from some URI. But that can't be what's happening, since in
 some element implementations (e.g., Gecko's) loaded data is buffered
 internally and seeking might not require any new data to be loaded.


No. The idea is to allow JavaScript to manage fetching the media data so
various fetching strategies could be implemented without needing to change
the browser. My initial motivation is for supporting adaptive streaming with
this mechanism, but I think various media mashup and delivery scenarios
could be explored with this.

Aaron


Re: [whatwg] Proposal for a MediaSource API that allows sending media data to a HTMLMediaElement

2011-07-12 Thread Robert O'Callahan
On Wed, Jul 13, 2011 at 12:00 PM, Aaron Colwell acolw...@google.com wrote:

 On Tue, Jul 12, 2011 at 4:44 PM, Robert O'Callahan
 rob...@ocallahan.org wrote:

 I had imagined that this API would let the author feed in the same data as
 you would load from some URI. But that can't be what's happening, since in
 some element implementations (e.g., Gecko's) loaded data is buffered
 internally and seeking might not require any new data to be loaded.


  No. The idea is to allow JavaScript to manage fetching the media data so
 various fetching strategies could be implemented without needing to change
 the browser. My initial motivation is for supporting adaptive streaming with
 this mechanism, but I think various media mashup and delivery scenarios
 could be explored with this.


I don't think you can do that with this API without making huge assumptions
about what the browser's demuxer, internal caching, etc. are doing.

Rob


[whatwg] Proposal for a MediaSource API that allows sending media data to a HTMLMediaElement

2011-07-11 Thread Aaron Colwell
Hi,

Based on comments in the File API Streaming Blobs
<http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2011-January/029973.html>
thread and my Extending HTML 5 video for adaptive streaming
<http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2011-June/032277.html>
thread, I decided on taking a stab at writing a MediaSource API spec
<http://html5-mediasource-api.googlecode.com/svn/trunk/draft-spec/mediasource-draft-spec.html>
for streaming data to a media tag.

Please take a look at the spec
<http://html5-mediasource-api.googlecode.com/svn/trunk/draft-spec/mediasource-draft-spec.html>
and provide some feedback.

I've tried to start with the simplest thing that would work and hope to
expand from there if need be. For now, I'm intentionally not trying to solve
the generic streaming file case because I believe there might be
media-specific requirements around handling seeking, especially if we intend
to support non-packetized media streams like WAV.

If the feedback is generally positive on this approach, I'll start working
on patches for WebKit & Chrome so people can experiment with an actual
implementation.

Thanks,
Aaron


Re: [whatwg] Proposal for a MediaSource API that allows sending media data to a HTMLMediaElement

2011-07-11 Thread Robert O'Callahan
It seems to me that the spec is written assuming only one media element is
consuming the MediaSource. But nothing stops multiple elements consuming the
same URL simultaneously. Maybe instead of going through a URL you should add
API directly to media elements.

bytesAvailable is for flow control? Instead of doing it this way, I would
follow WebSockets and use a bufferedAmount attribute to indicate how much
data is currently buffered up. That makes it easy for authors who don't want
to care about flow control to just append stuff without encountering errors,
while still allowing authors who care about flow control to do it.
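
As a rough sketch of what I mean, assuming a queue that sits behind append()
(AppendQueue, drain and HIGH_WATER_MARK are hypothetical names for
illustration, not anything in the draft spec):

```typescript
// Hypothetical queue behind append(): appends never fail, and
// bufferedAmount reports how many bytes are still waiting to be consumed.
class AppendQueue {
  private chunks: Uint8Array[] = [];
  private queuedBytes = 0;

  // Bytes appended but not yet consumed, in the style of WebSocket.bufferedAmount.
  get bufferedAmount(): number {
    return this.queuedBytes;
  }

  // Always accepts data; authors who ignore flow control just call this.
  append(data: Uint8Array): void {
    this.chunks.push(data);
    this.queuedBytes += data.byteLength;
  }

  // Called by the consumer (the media engine in this sketch): removes whole
  // chunks from the front of the queue while they fit within maxBytes.
  drain(maxBytes: number): Uint8Array[] {
    const taken: Uint8Array[] = [];
    let budget = maxBytes;
    while (this.chunks.length > 0) {
      const next = this.chunks[0];
      if (next === undefined || next.byteLength > budget) {
        break;
      }
      this.chunks.shift();
      budget -= next.byteLength;
      this.queuedBytes -= next.byteLength;
      taken.push(next);
    }
    return taken;
  }
}

// Authors who do care about flow control can hold back fetches
// while the queue is above a threshold of their own choosing.
const HIGH_WATER_MARK = 1024 * 1024; // arbitrary 1 MiB limit for illustration

function shouldFetchMore(queue: AppendQueue): boolean {
  return queue.bufferedAmount < HIGH_WATER_MARK;
}
```

Script that doesn't care never looks at bufferedAmount; script that does
checks it before fetching the next chunk.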

Rob