Re: [whatwg] Extending HTML 5 video for adaptive streaming

2011-07-04 Thread Robert O'Callahan
On Sat, Jul 2, 2011 at 2:51 AM, Aaron Colwell acolw...@google.com wrote:

 On Thu, Jun 30, 2011 at 4:13 PM, Robert O'Callahan 
 rob...@ocallahan.org wrote:

 On Fri, Jul 1, 2011 at 4:59 AM, Aaron Colwell acolw...@google.com wrote:

 I've also been looking at the WebRTC MediaStream API and was wondering if
 it
 makes more sense to create an object similar to the LocalMediaStream
 object.
 This has the benefits of unifying how media streams are handled
 independent
 of whether they come from a camera or a JavaScript based streaming
 algorithm. This could also enable sending the media stream through a
 Peer-to-peer connection instead of only allowing a camera as a source.
 Here
 is an example of the type of object I'm talking about.


 I think MediaStreams should not be dealing with compressed data except as
 an optimization when access to decoded data is not required anywhere in the
 stream pipeline. If you want to do processing of decoded stream data (which
 I do --- see
 http://hg.mozilla.org/users/rocallahan_mozilla.com/specs/raw-file/tip/StreamProcessing/StreamProcessing.html),
 then introducing a decoder inside the stream processing graph creates all
 sorts of complications.

 Nice spec. If I understand correctly, your position is that MediaStreams
 should only represent uncompressed media?


Sort of. I want the data format (compressed vs uncompressed, etc) to be
hidden from Web authors unless they use APIs like Worker-based processing
that require access to decoded data. What I don't want to have to deal with
is compressed data being injected at arbitrary points in the graph. Right
now the only place compressed data is injected is at stream sources ---
media elements and getUserMedia.

Rob
-- 
If we claim to be without sin, we deceive ourselves and the truth is not in
us. If we confess our sins, he is faithful and just and will forgive us our
sins and purify us from all unrighteousness. If we claim we have not sinned,
we make him out to be a liar and his word is not in us. [1 John 1:8-10]


Re: [whatwg] Extending HTML 5 video for adaptive streaming

2011-07-01 Thread Aaron Colwell
Hi Robert,

comments inline.

On Thu, Jun 30, 2011 at 4:13 PM, Robert O'Callahan rob...@ocallahan.org wrote:

 On Fri, Jul 1, 2011 at 4:59 AM, Aaron Colwell acolw...@google.com wrote:

 I've also been looking at the WebRTC MediaStream API and was wondering if
 it
 makes more sense to create an object similar to the LocalMediaStream
 object.
 This has the benefits of unifying how media streams are handled
 independent
 of whether they come from a camera or a JavaScript based streaming
 algorithm. This could also enable sending the media stream through a
 Peer-to-peer connection instead of only allowing a camera as a source.
 Here
 is an example of the type of object I'm talking about.


 I think MediaStreams should not be dealing with compressed data except as
 an optimization when access to decoded data is not required anywhere in the
 stream pipeline. If you want to do processing of decoded stream data (which
 I do --- see
 http://hg.mozilla.org/users/rocallahan_mozilla.com/specs/raw-file/tip/StreamProcessing/StreamProcessing.html),
 then introducing a decoder inside the stream processing graph creates all
 sorts of complications.

Nice spec. If I understand correctly, your position is that MediaStreams
should only represent uncompressed media? In the case of camera/mic data
they represent the uncompressed bits before they go to the codec for
transmission over a PeerConnection or before they are rendered by an
<audio>/<video>. In the case of standard audio/video playback they would
represent the uncompressed audio before it is sent to the audio card and the
uncompressed video before it is blitted on the screen. From a stream
processing point of view I can see how this makes sense. I was thinking of
LocalMediaStream as just a wrapper around a source of media data, and all I
was doing was providing a mechanism to supply media data from JavaScript
instead of from hardware.

 I think the natural way to support the functionality you're looking for is
 to extend the concept of Blob URLs. Right now you can create a binary Blob,
 mint a URL for it and set that URL as the source for a media element. The
 only extension you need is the ability to append data to the Blob while
 retaining the same URL; you would need to initially mark the Blob as "open"
 to indicate to URL consumers that the data stream has not ended. That
 extension would be useful for all sorts of things because you can use those
 Blob URLs anywhere. An alternative would be to create a new kind of object
 representing an appendable sequence of Blobs and create an API to mint URLs
 for it.


I thought about that, but I saw an earlier WHATWG thread
(http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2011-June/032221.html)
which led me down this MediaStream path. Using MediaStreams made more sense to
me because my use case felt similar to the live capture case, except that I'm
using compressed media and it comes from JavaScript instead of hardware.
Also MediaStream already had a way to pass stream URLs to <audio> & <video>
for camera and remote peer stream data, so I figured I could just leverage
that.


 Note that with my API proposal above, you can get a MediaStream from a
 media element that's using any URL and send that through a PeerConnection.

I see that. Interactions with PeerConnection were not a primary concern for
me. I was only mentioning it as a side benefit of using MediaStream.

Thanks for your comments. I appreciate them.

Aaron


Re: [whatwg] Extending HTML 5 video for adaptive streaming

2011-07-01 Thread Aaron Colwell
Hi Adam,

On Thu, Jun 30, 2011 at 5:20 PM, Adam Malcontenti-Wilson adman.com@gmail.com wrote:

 @acolwell:
 Is the appendData method one you're suggesting or one already
 specified/existing?

I'm suggesting it. It was a quick and dirty way to try out some ideas I had
while working on a prototype for Chromium. Now that I actually want to take
this out of the prototype stage, I'm trying to get a sense of whether
appendData() or a MediaStream-based solution is more desirable.


 @robert:
 Some problems with the concept of Blobs being appended to, or what I have
 previously described as "Streaming Blobs", were mentioned at
 http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2011-June/032221.html
 I'm not exactly sure what that meant, but I'd expect the ideas
 discussed are similar.

I saw this thread as well, which is why I went down the MediaStream path.
:)

Aaron


Re: [whatwg] Extending HTML 5 video for adaptive streaming

2011-07-01 Thread Bob Lund
Hi Aaron,

Here are some other aspects of script-controlled adaptive bit rate that occur
to me; perhaps you have already considered these.

1) I guess script will be responsible for maintaining its own playback buffer,
monitoring buffer behavior and selecting the appropriate bit rate for new
fragments. Are there any other network-related events/metrics script might need
to determine which bit rate to fetch for the next segment? Is there any other
information from the user agent about playback performance that script might
need?

2) If a media resource is a multi-track resource then it would seem script will
also have to fetch fragments for those tracks, which implies that the <audio>
element would need the append method. Timed text tracks would also need to be
processed and cues appended.

There is a new media pipeline task force in the Web and TV IG 
(http://www.w3.org/2011/webtv/wiki/MPTF) that is also planning to examine this 
topic. You may want to participate.

Regards,
Bob Lund


 -Original Message-
 From: whatwg-boun...@lists.whatwg.org [mailto:whatwg-
 boun...@lists.whatwg.org] On Behalf Of Aaron Colwell
 Sent: Thursday, June 30, 2011 10:59 AM
 To: wha...@whatwg.org
 Subject: [whatwg] Extending HTML 5 video for adaptive streaming
 
 Hi,
 
 I've been working on an adaptive streaming prototype that uses
 JavaScript to fetch chunks of media and feeds them to the video tag for
 decoding. The idea is to let the adaptation algorithm and CDN
 interactions happen in JavaScript so that they can evolve without the
 need for browser changes. I'm looking for some guidance about the
 preferred method for adding this type of functionality. I'm new to this
 process so please bear with me.
 
 My initial implementation is built around WebM, but I believe this could
 work for Ogg & MP4 as well. The basic idea is to initialize the <video>
 tag with stream initialization data (i.e. WebM info & tracks elements) via
 the <video> src attribute and then send media chunks (i.e. WebM clusters)
 to the tag via a new appendData() method on <video>. Here is a simple
 example of what I'm talking about.
 
   <video id="v" autoplay></video>
   <script>
 function needMoreData(e) {
   e.target.appendData(getNextCluster());
 }
 
 function onSeeking(e) {
   var video = e.target;
   video.appendData(findClusterForTime(video.currentTime));
 }
 
 var video = document.getElementById('v');
 
 video.addEventListener('loadstart', needMoreData);
 video.addEventListener('stalled', needMoreData);
 video.addEventListener('seeking', onSeeking);
 
 video.src = URL.createObjectURL(createStreamInitBlob());
   </script>
 
 appendData() expects to receive a Uint8Array that contains WebM cluster
 elements. The first cluster passed to appendData() initializes the
 starting playback position. Also, after a seeking event fires, the first
 appendData() updates the current position to the seek point.
 
 I've also been looking at the WebRTC MediaStream API and was wondering
 if it makes more sense to create an object similar to the
 LocalMediaStream object.
 This has the benefits of unifying how media streams are handled
 independent of whether they come from a camera or a JavaScript based
 streaming algorithm. This could also enable sending the media stream
 through a Peer-to-peer connection instead of only allowing a camera as a
 source. Here is an example of the type of object I'm talking about.
 
 interface GeneratedMediaStream : MediaStream {
   void init(in DOMString type, in UInt8Array init_data);
   void appendData(in DOMString trackId, in UInt8Array data);
   void endOfStream();
 
   readonly attribute MultipleTrackList audioTracks;
   readonly attribute ExclusiveTrackList videoTracks;
 };
 
 type - identifies the type of stream we are generating (i.e.
 video/x-webm-cluster-stream or video/ogg-page-stream)
 init_data - provides initialization data that indicates the number of
 tracks, codec configs, etc. (i.e. WebM info & tracks elements or Ogg header
 pages)
 trackId - indicates which track the data is for. If this is an empty
 string then multiplexed data is being passed in. If not empty, trackId
 matches the id of a track in the TrackList objects.
 data - media data chunk (i.e. WebM cluster or Ogg page). Data is expected
 to have monotonically increasing timestamps, no gaps, etc.
 
 Here are my questions:
 - Is there a preference for appendData() vs a new MediaStream object?
 - If the MediaStream object is preferred, should this be constructed
 through Navigator.getUserMedia()? I'm unclear about what the criteria are
 for adding this to Navigator vs allowing direct object construction.
 - Are there existing efforts along these lines? If so, please point me
 to them.
 
 Thanks for your help,
 
 Aaron


Re: [whatwg] Extending HTML 5 video for adaptive streaming

2011-07-01 Thread Aaron Colwell
Hi Bob,

Comments inline

On Fri, Jul 1, 2011 at 8:40 AM, Bob Lund b.l...@cablelabs.com wrote:

 Hi Aaron,

 Here are some other aspects of script-controlled adaptive bit rate that
 occur to me; perhaps you have already considered these.

 1) I guess script will be responsible for maintaining its own playback
 buffer, monitoring buffer behavior and selecting the appropriate bit rate
 for new fragments. Are there any other network-related events/metrics script
 might need to determine which bit rate to fetch for the next segment? Is
 there any other information from the user agent about playback performance
 that script might need?


The script would be responsible for managing buffering. It can use the
currentTime & buffered attributes on the video tag to monitor the
consumption of the data passed in via appendData(). I believe the attributes
being proposed in the video metrics proposal
(http://wiki.whatwg.org/wiki/Video_Metrics#Proposal) could also be helpful.
Right now I'm just using XMLHttpRequest to fetch WebM clusters and measuring
how long it takes to fetch them to create a bandwidth estimate. I haven't
spent much time on the BW measurement & adaptation algorithms yet. I'm just
trying to nail down the mechanism for passing the media data to the browser
first.
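
For illustration, a rough sketch of the kind of measurement I mean (the URL
handling and the appendData() plumbing are placeholders, not settled API):

  // Sketch: estimate bandwidth from how long an XMLHttpRequest takes.
  // The cluster URL scheme and video.appendData() call are hypothetical.
  function fetchCluster(url, video) {
    var xhr = new XMLHttpRequest();
    var startMs = Date.now();
    xhr.open('GET', url);
    xhr.responseType = 'arraybuffer';
    xhr.onload = function() {
      var seconds = (Date.now() - startMs) / 1000;
      var bitsPerSecond = (xhr.response.byteLength * 8) / seconds;
      // bitsPerSecond would feed the adaptation algorithm that picks
      // the bit rate for the next fetch.
      video.appendData(new Uint8Array(xhr.response));
    };
    xhr.send();
  }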


 2) If a media resource is a multi-track resource then it would seem script
 will also have to fetch fragments for those tracks, which implies that the
 <audio> element would need the append method. Timed text tracks would also
 need to be processed and cues appended.


The idea is that appendData() can receive media for multiple tracks. In the
case of WebM each cluster can have blocks from different tracks multiplexed
together. The initial stream config information contains the track
mappings necessary to demux the cluster. I was also planning to allow both
multiplexed and demultiplexed clusters. Cluster timecodes must be in
monotonically increasing order, but it would be possible to call
appendData() with a cluster with only audio data followed by a cluster with
only video data. This would allow straightforward support for deployments
where audio & video tracks for a single presentation are in separate WebM
files.
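
For example, an append sequence like the following would be legal
(getNextAudioCluster() and getNextVideoCluster() are hypothetical helpers
that return single-track WebM clusters as Uint8Arrays):

  // Sketch: appending demultiplexed clusters from separate WebM files.
  // Within each track the cluster timecodes increase monotonically.
  video.appendData(getNextAudioCluster());  // e.g. audio for [0ms, 500ms)
  video.appendData(getNextVideoCluster());  // e.g. video for [0ms, 500ms)
  video.appendData(getNextAudioCluster());  // audio for [500ms, 1000ms)
  video.appendData(getNextVideoCluster());  // video for [500ms, 1000ms)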


 There is a new media pipeline task force in the Web and TV IG (
 http://www.w3.org/2011/webtv/wiki/MPTF) that is also planning to examine
 this topic. You may want to participate.


I have signed up to the mailing list and will take some time to catch up
with the archives.

Thanks for your comments.

Aaron


[whatwg] Extending HTML 5 video for adaptive streaming

2011-06-30 Thread Aaron Colwell
Hi,

I've been working on an adaptive streaming prototype that uses JavaScript to
fetch chunks of media and feeds them to the video tag for decoding. The idea
is to let the adaptation algorithm and CDN interactions happen in JavaScript
so that they can evolve without the need for browser changes. I'm looking
for some guidance about the preferred method for adding this type of
functionality. I'm new to this process so please bear with me.

My initial implementation is built around WebM, but I believe this could
work for Ogg & MP4 as well. The basic idea is to initialize the <video> tag
with stream initialization data (i.e. WebM info & tracks elements) via the
<video> src attribute and then send media chunks (i.e. WebM clusters) to the
tag via a new appendData() method on <video>. Here is a simple example of
what I'm talking about.

  <video id="v" autoplay></video>
  <script>
function needMoreData(e) {
  e.target.appendData(getNextCluster());
}

function onSeeking(e) {
  var video = e.target;
  video.appendData(findClusterForTime(video.currentTime));
}

var video = document.getElementById('v');

video.addEventListener('loadstart', needMoreData);
video.addEventListener('stalled', needMoreData);
video.addEventListener('seeking', onSeeking);

video.src = URL.createObjectURL(createStreamInitBlob());
  </script>

appendData() expects to receive a Uint8Array that contains WebM cluster
elements. The first cluster passed to appendData() initializes the starting
playback position. Also, after a seeking event fires, the first appendData()
updates the current position to the seek point.
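
For completeness, findClusterForTime() from the example above might look
roughly like this (clusterIndex and fetchRange() are hypothetical; a real
implementation would build the index from the WebM cues element):

  // Sketch: locate and fetch the cluster containing a given time using
  // a byte-range index built ahead of time.
  var clusterIndex = [];  // entries: { startTime, offset, size }

  function findClusterForTime(time) {
    var entry = clusterIndex[0];
    for (var i = 1; i < clusterIndex.length; i++) {
      if (clusterIndex[i].startTime > time)
        break;
      entry = clusterIndex[i];
    }
    return fetchRange(entry.offset, entry.size);  // returns a Uint8Array
  }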

I've also been looking at the WebRTC MediaStream API and was wondering if it
makes more sense to create an object similar to the LocalMediaStream object.
This has the benefits of unifying how media streams are handled independent
of whether they come from a camera or a JavaScript based streaming
algorithm. This could also enable sending the media stream through a
Peer-to-peer connection instead of only allowing a camera as a source. Here
is an example of the type of object I'm talking about.

interface GeneratedMediaStream : MediaStream {
  void init(in DOMString type, in UInt8Array init_data);
  void appendData(in DOMString trackId, in UInt8Array data);
  void endOfStream();

  readonly attribute MultipleTrackList audioTracks;
  readonly attribute ExclusiveTrackList videoTracks;
};

type - identifies the type of stream we are generating (i.e.
video/x-webm-cluster-stream or video/ogg-page-stream)
init_data - provides initialization data that indicates the number of
tracks, codec configs, etc. (i.e. WebM info & tracks elements or Ogg header
pages)
trackId - indicates which track the data is for. If this is an empty string
then multiplexed data is being passed in. If not empty, trackId matches the
id of a track in the TrackList objects.
data - media data chunk (i.e. WebM cluster or Ogg page). Data is expected to
have monotonically increasing timestamps, no gaps, etc.
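
To make the shape of this API concrete, usage might look roughly like the
following (the direct constructor and the webmInfoAndTracks/getNextCluster
helpers are illustrative only; how the object should be constructed is one
of the open questions below):

  // Sketch: driving the proposed object from script.
  var stream = new GeneratedMediaStream();
  stream.init('video/x-webm-cluster-stream', webmInfoAndTracks);

  var video = document.getElementById('v');
  video.src = URL.createObjectURL(stream);  // same URL mechanism as camera streams

  function onNeedData() {
    // An empty trackId means the appended cluster is multiplexed.
    stream.appendData('', getNextCluster());
  }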

Here are my questions:
- Is there a preference for appendData() vs a new MediaStream object?
- If the MediaStream object is preferred, should this be constructed through
Navigator.getUserMedia()? I'm unclear about what the criteria are for adding
this to Navigator vs allowing direct object construction.
- Are there existing efforts along these lines? If so, please point me to
them.

Thanks for your help,

Aaron


Re: [whatwg] Extending HTML 5 video for adaptive streaming

2011-06-30 Thread Robert O'Callahan
On Fri, Jul 1, 2011 at 4:59 AM, Aaron Colwell acolw...@google.com wrote:

 I've also been looking at the WebRTC MediaStream API and was wondering if
 it
 makes more sense to create an object similar to the LocalMediaStream
 object.
 This has the benefits of unifying how media streams are handled independent
 of whether they come from a camera or a JavaScript based streaming
 algorithm. This could also enable sending the media stream through a
 Peer-to-peer connection instead of only allowing a camera as a source. Here
 is an example of the type of object I'm talking about.


I think MediaStreams should not be dealing with compressed data except as an
optimization when access to decoded data is not required anywhere in the
stream pipeline. If you want to do processing of decoded stream data (which
I do --- see
http://hg.mozilla.org/users/rocallahan_mozilla.com/specs/raw-file/tip/StreamProcessing/StreamProcessing.html),
then introducing a decoder inside the stream processing graph creates all
sorts of complications.

I think the natural way to support the functionality you're looking for is
to extend the concept of Blob URLs. Right now you can create a binary Blob,
mint a URL for it and set that URL as the source for a media element. The
only extension you need is the ability to append data to the Blob while
retaining the same URL; you would need to initially mark the Blob as "open"
to indicate to URL consumers that the data stream has not ended. That
extension would be useful for all sorts of things because you can use those
Blob URLs anywhere. An alternative would be to create a new kind of object
representing an appendable sequence of Blobs and create an API to mint URLs
for it.
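
As a very rough sketch of what the first option could look like (OpenBlob,
append() and close() are invented names for illustration; nothing like this
exists yet):

  // Sketch: a Blob that starts "open" and can grow while its URL stays valid.
  var blob = new OpenBlob();             // hypothetical: created in the open state
  var url = URL.createObjectURL(blob);   // consumers see a growing data stream

  var video = document.getElementById('v');
  video.src = url;

  function onMoreData(chunk) {           // chunk: a Uint8Array of media data
    blob.append(chunk);                  // hypothetical: new data visible to consumers
  }

  function onStreamEnded() {
    blob.close();                        // hypothetical: marks the data stream as ended
  }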

Note that with my API proposal above, you can get a MediaStream from a media
element that's using any URL and send that through a PeerConnection.
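
Concretely, with the media-element capture hook from the StreamProcessing
draft (sketched here as captureStream(); the exact name may differ as the
draft evolves, and the PeerConnection setup is omitted), that would look
roughly like:

  // Sketch: capture a media element's decoded output as a MediaStream and
  // hand it to an already-negotiated PeerConnection. someBlobURL is a
  // placeholder for any URL, including an appendable Blob URL.
  var video = document.getElementById('v');
  video.src = someBlobURL;

  var stream = video.captureStream();   // decoded output as a MediaStream
  peerConnection.addStream(stream);     // early WebRTC API of the time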

Rob
-- 
If we claim to be without sin, we deceive ourselves and the truth is not in
us. If we confess our sins, he is faithful and just and will forgive us our
sins and purify us from all unrighteousness. If we claim we have not sinned,
we make him out to be a liar and his word is not in us. [1 John 1:8-10]


Re: [whatwg] Extending HTML 5 video for adaptive streaming

2011-06-30 Thread Adam Malcontenti-Wilson
@acolwell:
Is the appendData method one you're suggesting or one already specified/existing?

@robert:
Some problems with the concept of Blobs being appended to, or what I have
previously described as "Streaming Blobs", were mentioned at
http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2011-June/032221.html
I'm not exactly sure what that meant, but I'd expect the ideas
discussed are similar.

On Fri, Jul 1, 2011 at 9:13 AM, Robert O'Callahan rob...@ocallahan.org wrote:
 On Fri, Jul 1, 2011 at 4:59 AM, Aaron Colwell acolw...@google.com wrote:

 I've also been looking at the WebRTC MediaStream API and was wondering if
 it
 makes more sense to create an object similar to the LocalMediaStream
 object.
 This has the benefits of unifying how media streams are handled independent
 of whether they come from a camera or a JavaScript based streaming
 algorithm. This could also enable sending the media stream through a
 Peer-to-peer connection instead of only allowing a camera as a source. Here
 is an example of the type of object I'm talking about.


 I think MediaStreams should not be dealing with compressed data except as an
 optimization when access to decoded data is not required anywhere in the
 stream pipeline. If you want to do processing of decoded stream data (which
 I do --- see
 http://hg.mozilla.org/users/rocallahan_mozilla.com/specs/raw-file/tip/StreamProcessing/StreamProcessing.html),
 then introducing a decoder inside the stream processing graph creates all
 sorts of complications.

 I think the natural way to support the functionality you're looking for is
 to extend the concept of Blob URLs. Right now you can create a binary Blob,
 mint a URL for it and set that URL as the source for a media element. The
 only extension you need is the ability to append data to the Blob while
 retaining the same URL; you would need to initially mark the Blob as "open"
 to indicate to URL consumers that the data stream has not ended. That
 extension would be useful for all sorts of things because you can use those
 Blob URLs anywhere. An alternative would be to create a new kind of object
 representing an appendable sequence of Blobs and create an API to mint URLs
 for it.

 Note that with my API proposal above, you can get a MediaStream from a media
 element that's using any URL and send that through a PeerConnection.

 Rob
 --
 If we claim to be without sin, we deceive ourselves and the truth is not in
 us. If we confess our sins, he is faithful and just and will forgive us our
 sins and purify us from all unrighteousness. If we claim we have not sinned,
 we make him out to be a liar and his word is not in us. [1 John 1:8-10]




-- 
Adam Malcontenti-Wilson