Re: [whatwg] VIDEO and pitchAdjustment

2015-09-01 Thread David Singer

> On Sep 1, 2015, at 4:03 , Robert O'Callahan  wrote:
> 
> On Tue, Sep 1, 2015 at 8:02 PM, Kevin Marks  wrote:
> 
>> QuickTime supports full variable speed playback and has done for well over
>> a decade. With bidirectionally predicted frames you need a fair few buffers
>> anyway, so generalising to full variable rate is easier than posters above
>> claim - you need to work a GOP at a time, but memory buffering isn't the
>> big issue these days.
>> 
> 
> "GOP”?

Group of Pictures.  Video-speak for the run between random access points.

> 
> How about a hard but realistic (IMHO) case: 4K video (4096 x 2160), 25 fps,
> keyframe every 10s. Storing all those frames takes 250 x 4096 x 2160 x 2
> bytes = 4.32 GiB. Reading back those frames would kill performance so that
> all has to stay in VRAM. I respectfully deny that in such a case, memory
> buffering "isn't a big issue”.

well, 10s is a pretty long random access interval.

> 
> Now that I think about it, I guess there are more complicated strategies
> available that would reduce memory usage at the expense of repeated
> decoding.

which indeed QuickTime implemented around 10 years ago.

> E.g. in a first pass, decode forward and store every Nth frame.
> Then as you play backwards you need only redecode N-1 intermediate frames
> at a time. I don't know whether HW decoder interfaces would actually let you
> implement that though...
> 
>> What QuickTime got right was having a ToC approach to video so being able
>> to seek rapidly was possible without thrashing, whereas the stream
>> oriented approaches we are stuck with now mean knowing which bit of the file
>> to read to get the previous GOP is the hard part.
>> 
> 
> I don't understand. Can you explain this in more detail?

The movie file structure (and hence MP4) has a table-of-contents approach to 
file structure; each frame has its timestamps, file location, size, and 
keyframe-nature stored in compact tables in the head of the file.  This makes 
trick modes and so on easier; you’re not reading the actual video to seek for a 
keyframe, and so on.
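
As a rough illustration of the table-of-contents idea, here is a minimal
sketch in Python. The SampleEntry layout and the keyframe_at_or_before helper
are hypothetical simplifications (real files split this information across
several compact tables), and the table is assumed to be sorted by timestamp.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class SampleEntry:
    """One row of a hypothetical, simplified per-frame table."""
    timestamp: float   # presentation time in seconds
    offset: int        # byte offset of the frame in the file
    size: int          # frame size in bytes
    is_keyframe: bool  # True if the frame is a random access point


def keyframe_at_or_before(table, t):
    """Return the last keyframe entry with timestamp <= t, or None.

    Only the table is consulted; no video data is read or decoded.
    """
    times = [entry.timestamp for entry in table]
    i = bisect_right(times, t) - 1
    while i >= 0 and not table[i].is_keyframe:
        i -= 1
    return table[i] if i >= 0 else None


# Toy table (made-up values): 25 fps, a keyframe every 5 frames,
# 1000-byte frames laid out back to back.
TABLE = [SampleEntry(n / 25, n * 1000, 1000, n % 5 == 0) for n in range(20)]

entry = keyframe_at_or_before(TABLE, 0.5)
print(entry.timestamp, entry.offset)
```

The lookup yields the byte range to read for the previous random access
point directly, which is what makes stepping back a GOP at a time cheap.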

David Singer
Manager, Software Standards, Apple Inc.



Re: [whatwg] VIDEO and pitchAdjustment

2015-09-01 Thread Kevin Marks
On Tue, Sep 1, 2015 at 10:55 AM, David Singer  wrote:

>
> > On Sep 1, 2015, at 10:47 , Yay295  wrote:
> >
> > On Tue, Sep 1, 2015 at 11:30 AM, David Singer  wrote:
> > > On Sep 1, 2015, at 4:03 , Robert O'Callahan wrote:
> > >> On Tue, Sep 1, 2015 at 8:02 PM, Kevin Marks wrote:
> > >> QuickTime supports full variable speed playback and has done for well
> > >> over a decade. With bidirectionally predicted frames you need a fair few
> > >> buffers anyway, so generalising to full variable rate is easier than
> > >> posters above claim - you need to work a GOP at a time, but memory
> > >> buffering isn't the big issue these days.
> > >
> > > "GOP”?
> >
> > Group of Pictures.  Video-speak for the run between random access points.
> >
> > > How about a hard but realistic (IMHO) case: 4K video (4096 x 2160), 25
> > > fps, keyframe every 10s. Storing all those frames takes 250 x 4096 x 2160
> > > x 2 bytes = 4.32 GiB. Reading back those frames would kill performance so
> > > that all has to stay in VRAM. I respectfully deny that in such a case,
> > > memory buffering "isn't a big issue".
> >
> > well, 10s is a pretty long random access interval.
> >
> > There's no way to know the distance between keyframes though. The video
> > could technically have only one keyframe and still work as a video.
>
> yes, but that is rare. There are indeed videos that don’t play well
> backward, or consume lots of memory and/or CPU, but most are fine.
>
> >
> > >> What QuickTime got right was having a ToC approach to video so being
> > >> able to seek rapidly was possible without thrashing, whereas the stream
> > >> oriented approaches we are stuck with now mean knowing which bit of
> > >> the file to read to get the previous GOP is the hard part.
> > >
> > > I don't understand. Can you explain this in more detail?
>

I explained the essential difference a while ago here:
http://lists.xiph.org/pipermail/vorbis-dev/2001-October/004846.html

The QuickTime file format defines movies that have tracks made of media;
the tracks are an edit list on the media; the media have the frame layout
information encoded.


> >
> > The movie file structure (and hence MP4) has a table-of-contents
> > approach to file structure; each frame has its timestamps, file location,
> > size, and keyframe-nature stored in compact tables in the head of the file.
> > This makes trick modes and so on easier; you’re not reading the actual
> > video to seek for a keyframe, and so on.
> >
> > I suppose the browser could generate this data the first time it reads
> > through the video. It would use a lot less memory. Though that sounds like
> > a problem for the browsers to solve, not the standard.
>
> There is no *generation* on the browser side; these tables are part of the
> file format.


Well, when it imports stream-oriented media it has to construct these in
memory, but they can be saved out again. I know that in theory this made
its way into the mp4 format, but I'm not sure how much of it is real.


Re: [whatwg] VIDEO and pitchAdjustment

2015-09-01 Thread Kevin Marks
On Tue, Sep 1, 2015 at 11:57 AM, David Singer  wrote:

>
> > On Sep 1, 2015, at 11:36 , Kevin Marks  wrote:
> > > I suppose the browser could generate this data the first time it reads
> > > through the video. It would use a lot less memory. Though that sounds like
> > > a problem for the browsers to solve, not the standard.
> >
> > There is no *generation* on the browser side; these tables are part of
> > the file format.
> >
> > Well, when it imports stream-oriented media it has to construct these in
> > memory, but they can be saved out again. I know that in theory this made
> > its way into the mp4 format, but I'm not sure how much of it is real.
>
> Two different questions:
> a) do the QuickTime movie file format and the MP4 format contain these
> tables?  Yes.
> b) if I open another format, what happens?
>
> For case (a), the situation may be more nuanced if Movie Fragments are in
> use (you then get the tables for each fragment of the movie, though they
> are easily coalesced as they arrive).
>
> For case (b), classic QuickTime used to ‘convert to movie’ in memory,
> building the tables.  The situation is more nuanced on more recent engines.
>
> I think the point of the discussion is that one cannot dismiss trick modes
> such as reverse play as being unimplementable.


The other point for me is that, given http://aomedia.org/ announcing plans
to create a new video file format to fix everything, this time we should
actually learn from this history and make one that is editable and seekable
again.


Re: [whatwg] VIDEO and pitchAdjustment

2015-09-01 Thread David Singer

> On Sep 1, 2015, at 11:36 , Kevin Marks  wrote:
> > I suppose the browser could generate this data the first time it reads 
> > through the video. It would use a lot less memory. Though that sounds like 
> > a problem for the browsers to solve, not the standard.
> 
> There is no *generation* on the browser side; these tables are part of the 
> file format.
> 
> Well, when it imports stream-oriented media it has to construct these in 
> memory, but they can be saved out again. I know that in theory this made its 
> way into the mp4 format, but I'm not sure how much of it is real.

Two different questions:
a) do the QuickTime movie file format and the MP4 format contain these tables?  
Yes.
b) if I open another format, what happens?

For case (a), the situation may be more nuanced if Movie Fragments are in use 
(you then get the tables for each fragment of the movie, though they are easily 
coalesced as they arrive).

For case (b), classic QuickTime used to ‘convert to movie’ in memory, building 
the tables.  The situation is more nuanced on more recent engines.

I think the point of the discussion is that one cannot dismiss trick modes such 
as reverse play as being unimplementable. 

David Singer
Manager, Software Standards, Apple Inc.



Re: [whatwg] VIDEO and pitchAdjustment

2015-09-01 Thread Yay295
On Tue, Sep 1, 2015 at 11:30 AM, David Singer  wrote:

> > On Sep 1, 2015, at 4:03 , Robert O'Callahan wrote:
> >> On Tue, Sep 1, 2015 at 8:02 PM, Kevin Marks wrote:
> >> QuickTime supports full variable speed playback and has done for well
> >> over a decade. With bidirectionally predicted frames you need a fair few
> >> buffers anyway, so generalising to full variable rate is easier than
> >> posters above claim - you need to work a GOP at a time, but memory
> >> buffering isn't the big issue these days.
> >
> > "GOP”?
>
> Group of Pictures.  Video-speak for the run between random access points.
>
> > How about a hard but realistic (IMHO) case: 4K video (4096 x 2160), 25
> > fps, keyframe every 10s. Storing all those frames takes 250 x 4096 x 2160
> > x 2 bytes = 4.32 GiB. Reading back those frames would kill performance so
> > that all has to stay in VRAM. I respectfully deny that in such a case,
> > memory buffering "isn't a big issue".
>
> well, 10s is a pretty long random access interval.
>

There's no way to know the distance between keyframes though. The video
could technically have only one keyframe and still work as a video.


> >> What QuickTime got right was having a ToC approach to video so being
> >> able to seek rapidly was possible without thrashing, whereas the stream
> >> oriented approaches we are stuck with now mean knowing which bit of the
> >> file to read to get the previous GOP is the hard part.
> >
> > I don't understand. Can you explain this in more detail?
>
> The movie file structure (and hence MP4) has a table-of-contents approach
> to file structure; each frame has its timestamps, file location, size, and
> keyframe-nature stored in compact tables in the head of the file. This
> makes trick modes and so on easier; you’re not reading the actual video to
> seek for a keyframe, and so on.
>

I suppose the browser could generate this data the first time it reads
through the video. It would use a lot less memory. Though that sounds like
a problem for the browsers to solve, not the standard.


Re: [whatwg] VIDEO and pitchAdjustment

2015-09-01 Thread David Singer

> On Sep 1, 2015, at 10:47 , Yay295  wrote:
> 
> On Tue, Sep 1, 2015 at 11:30 AM, David Singer  wrote:
> > On Sep 1, 2015, at 4:03 , Robert O'Callahan  wrote:
> >> On Tue, Sep 1, 2015 at 8:02 PM, Kevin Marks  wrote:
> >> QuickTime supports full variable speed playback and has done for well over
> >> a decade. With bidirectionally predicted frames you need a fair few buffers
> >> anyway, so generalising to full variable rate is easier than posters above
> >> claim - you need to work a GOP at a time, but memory buffering isn't the
> >> big issue these days.
> >
> > "GOP”?
> 
> Group of Pictures.  Video-speak for the run between random access points.
> 
> > How about a hard but realistic (IMHO) case: 4K video (4096 x 2160), 25 fps,
> > keyframe every 10s. Storing all those frames takes 250 x 4096 x 2160 x 2
> > bytes = 4.32 GiB. Reading back those frames would kill performance so that
> > all has to stay in VRAM. I respectfully deny that in such a case, memory
> > buffering "isn't a big issue”.
> 
> well, 10s is a pretty long random access interval.
> 
> There's no way to know the distance between keyframes though. The video could 
> technically have only one keyframe and still work as a video.

yes, but that is rare. There are indeed videos that don’t play well backward, 
or consume lots of memory and/or CPU, but most are fine.

>  
> >> What QuickTime got right was having a ToC approach to video so being able
> >> to seek rapidly was possible without thrashing, whereas the stream
> >> oriented approaches we are stuck with now mean knowing which bit of the file
> >> to read to get the previous GOP is the hard part.
> >
> > I don't understand. Can you explain this in more detail?
> 
> The movie file structure (and hence MP4) has a table-of-contents approach to 
> file structure; each frame has its timestamps, file location, size, and 
> keyframe-nature stored in compact tables in the head of the file. This makes 
> trick modes and so on easier; you’re not reading the actual video to seek for 
> a keyframe, and so on.
> 
> I suppose the browser could generate this data the first time it reads 
> through the video. It would use a lot less memory. Though that sounds like a 
> problem for the browsers to solve, not the standard.

There is no *generation* on the browser side; these tables are part of the file 
format.

David Singer
Manager, Software Standards, Apple Inc.



Re: [whatwg] deprecating

2015-09-01 Thread Ian Hickson
On Tue, 1 Sep 2015, henry.st...@bblfish.net wrote:
>
> As the WhatWG only recently moved to Github members here may not have 
> noticed that  has been deprecated.
> 
> I opened https://github.com/whatwg/html/issues/67 to give space for the 
> discussion. It is a pity that this was closed so quickly (within an 
> hour) without giving members and the public (the users of the web) 
> time to comment or for their voice to be heard.
> 
> This is a complex issue that involves many different levels of 
> expertise, and it should not be handled so lightly.

The spec just reflects implementations. The majority of implementations of 
 (by usage) have said they want to drop it, and the other major 
implementation has never supported it. The element was originally (and for 
many years) purely a mostly-undocumented proprietary extension; at the 
time it was invented, the HTML spec was edited by the W3C and the W3C did 
not add it (they only ended up speccing it in their most recent HTML spec 
because they forked the WHATWG's spec which did define it -- indeed, even 
then, it was something that W3C HTML working group members argued should 
not have been included). It was only added to the WHATWG spec because one 
of the browser vendors said they could not remove support for it due to 
usage by enterprise customers; that browser vendor is now among those 
wanting to drop it.

As far as I can tell, therefore, things here are working exactly as one 
should expect.

It's worth noting that  is a pretty terrible API. I recommend 
approaching the groups writing new cryptography APIs, explaining your use 
cases, and making sure they are supported in up-and-coming, more widely 
supported, more secure, and more well-thought-out APIs.

-- 
Ian Hickson   U+1047E)\._.,--,'``.fL
http://ln.hixie.ch/   U+263A/,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'


Re: [whatwg] VIDEO and pitchAdjustment

2015-09-01 Thread Robert O'Callahan
On Wed, Sep 2, 2015 at 5:30 AM, David Singer  wrote:

> On Sep 1, 2015, at 4:03 , Robert O'Callahan  wrote:
> > How about a hard but realistic (IMHO) case: 4K video (4096 x 2160), 25
> > fps, keyframe every 10s. Storing all those frames takes 250 x 4096 x 2160
> > x 2 bytes = 4.32 GiB. Reading back those frames would kill performance so
> > that all has to stay in VRAM. I respectfully deny that in such a case,
> > memory buffering "isn't a big issue".
>
> well, 10s is a pretty long random access interval.
>

It's easy to find sources on the Internet advising people to use 10s
keyframe intervals.

> > Now that I think about it, I guess there are more complicated strategies
> > available that would reduce memory usage at the expense of repeated
> > decoding.
>
> which indeed QuickTime implemented around 10 years ago.
>

It appears that most platform and HW decoder interfaces are incompatible
with this strategy, so in practice implementing this across platforms is
still a big problem.

Nevertheless we can hope for that situation to improve, and negative
playback rates are implementable for some videos, so it makes sense to me
to leave negative playback rates in the spec.

> The movie file structure (and hence MP4) has a table-of-contents approach
> to file structure; each frame has its timestamps, file location, size, and
> keyframe-nature stored in compact tables in the head of the file.  This
> makes trick modes and so on easier; you’re not reading the actual video to
> seek for a keyframe, and so on.
>

I think every important video container format has some kind of keyframe
directory.

Rob
-- 
lbir ye,ea yer.tnietoehr  rdn rdsme,anea lurpr  edna e hnysnenh hhe uresyf
toD
selthor  stor  edna  siewaoeodm  or v sstvr  esBa  kbvted,t
rdsme,aoreseoouoto
o l euetiuruewFa  kbn e hnystoivateweh uresyf tulsa rehr  rdm  or rnea
lurpr
.a war hsrer holsa rodvted,t  nenh hneireseoouot.tniesiewaoeivatewt sstvr
esn


Re: [whatwg] VIDEO and pitchAdjustment

2015-09-01 Thread Kevin Marks
QuickTime supports full variable speed playback and has done for well over
a decade. With bidirectionally predicted frames you need a fair few buffers
anyway, so generalising to full variable rate is easier than posters above
claim - you need to work a GOP at a time, but memory buffering isn't the
big issue these days.

What QuickTime got right was having a ToC approach to video so being able
to seek rapidly was possible without thrashing, whereas the stream
oriented approaches we are stuck with now mean knowing which bit of the file
to read to get the previous GOP is the hard part.

On Fri, Aug 28, 2015 at 6:02 PM, Xidorn Quan  wrote:

> On Sat, Aug 29, 2015 at 8:27 AM, Robert O'Callahan wrote:
> > On Sat, Aug 29, 2015 at 8:18 AM, James Ross wrote:
> >
> >> Support is certainly poor; Internet Explorer/Trident and Edge both
> >> support negative playback rates on desktop (I haven’t tested mobile) but
> >> do so by simply showing the key frames as they are reached in reverse, in
> >> my testing.
> >
> > That's not so hard to implement, but it's also mostly useless since
> > keyframes are often several seconds apart or more.
>
> It could be useful for a few usecases like fast-backward. Windows
> Media Player does it this way.
>
> FWIW, QuickTime supports per-frame backward playback if you press and
> hold the left arrow. I guess they cannot guarantee the rate, which
> makes them require holding the key instead of providing a playback
> rate setting.
>
> - Xidorn
>


[whatwg] deprecating

2015-09-01 Thread henry.st...@bblfish.net
As the WhatWG only recently moved to Github members here may not have noticed 
that  has been deprecated. 

I opened https://github.com/whatwg/html/issues/67 to give space for the 
discussion. It is a pity that this was closed so quickly (within an hour) 
without giving members and the public (the users of the web) time to comment 
or for their voice to be heard.

This is a complex issue that involves many different levels of expertise, and 
it should not be handled so lightly.

Henry


Social Web Architect
http://bblfish.net/



Re: [whatwg] VIDEO and pitchAdjustment

2015-09-01 Thread Philip Jägenstedt
On Mon, Aug 31, 2015 at 9:48 PM, Domenic Denicola  wrote:
> From: Eric Carlson [mailto:eric.carl...@apple.com]
>
>>   FWIW, Safari supports negative playback rates on the desktop and on iOS.
>>
>> ...
>>
>>   The crash Garrett noted in Safari 8 is a bug that “only" happens with MSE
>> content.
>
> That's really helpful, thanks. Combined with Edge's keyframes-only support, 
> it sounds like we should probably leave the spec as it is.
>
> Do you have thoughts on a mozPreservesPitch equivalent?

Should we just standardize HTMLMediaElement.preservesPitch, perhaps?
Note that WebKit also has webkitPreservesPitch, but I removed it from
Blink because it didn't actually do anything in Chromium.

In both Gecko and WebKit it defaults to true. Is there anything else
worth knowing before writing the spec for this?

Philip


Re: [whatwg] VIDEO and pitchAdjustment

2015-09-01 Thread Robert O'Callahan
On Tue, Sep 1, 2015 at 8:02 PM, Kevin Marks  wrote:

> QuickTime supports full variable speed playback and has done for well over
> a decade. With bidirectionally predicted frames you need a fair few buffers
> anyway, so generalising to full variable rate is easier than posters above
> claim - you need to work a GOP at a time, but memory buffering isn't the
> big issue these days.
>

"GOP"?

How about a hard but realistic (IMHO) case: 4K video (4096 x 2160), 25 fps,
keyframe every 10s. Storing all those frames takes 250 x 4096 x 2160 x 2
bytes = 4.32 GiB. Reading back those frames would kill performance so that
all has to stay in VRAM. I respectfully deny that in such a case, memory
buffering "isn't a big issue".
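
A quick sketch of the arithmetic behind this estimate, using only the numbers
stated above (2 bytes per pixel, 25 fps, one keyframe every 10 seconds). The
constant names are just labels chosen for this example. By this calculation
the total is 4,423,680,000 bytes, roughly 4.42 GB or about 4.12 GiB; either
way it is several gigabytes for a single GOP.

```python
WIDTH, HEIGHT = 4096, 2160   # 4K frame size from the example
BYTES_PER_PIXEL = 2          # as assumed in the estimate
FPS = 25
KEYFRAME_INTERVAL_S = 10

# Number of decoded frames between two keyframes
frames = FPS * KEYFRAME_INTERVAL_S

# Memory needed to hold every decoded frame of one GOP
total_bytes = frames * WIDTH * HEIGHT * BYTES_PER_PIXEL

print(frames, total_bytes)
print(round(total_bytes / 10**9, 2), "GB")    # decimal gigabytes
print(round(total_bytes / 2**30, 2), "GiB")   # binary gibibytes
```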

Now that I think about it, I guess there are more complicated strategies
available that would reduce memory usage at the expense of repeated
decoding. E.g. in a first pass, decode forward and store every Nth frame.
Then as you play backwards you need only redecode N-1 intermediate frames
at a time. I don't know whether HW decoder interfaces would actually let you
implement that though...
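
To make the every-Nth-frame idea concrete, here is a minimal sketch in Python.
It assumes a toy decoder in which each frame depends only on the previous one
(no B-frames), and decode_next and play_gop_backwards are hypothetical names,
not any real decoder interface. Frames are represented by their index so the
output order can be checked.

```python
def decode_next(prev, k):
    """Toy stand-in for one decoder step: produce frame k from frame k-1."""
    if not ((prev is None and k == 0) or prev == k - 1):
        raise ValueError("frame %d needs frame %d decoded first" % (k, k - 1))
    return k


def play_gop_backwards(gop_len, n):
    """Return (frames in reverse order, peak frames buffered, total decodes)."""
    decodes = 0
    checkpoints = {}
    prev = None
    # First pass: decode forward, keeping only every n-th frame.
    for k in range(gop_len):
        prev = decode_next(prev, k)
        decodes += 1
        if k % n == 0:
            checkpoints[k] = prev
    out = []
    peak = len(checkpoints)
    # Second pass: walk the checkpoints from last to first, re-decoding
    # at most n-1 intermediate frames for each one.
    for start in sorted(checkpoints, reverse=True):
        segment = [checkpoints[start]]
        for k in range(start + 1, min(start + n, gop_len)):
            segment.append(decode_next(segment[-1], k))
            decodes += 1
        peak = max(peak, len(checkpoints) + len(segment) - 1)
        out.extend(reversed(segment))
        del checkpoints[start]   # this checkpoint is no longer needed
    return out, peak, decodes


frames, peak, decodes = play_gop_backwards(250, 16)
print(frames[:3], peak, decodes)
```

In this sketch the buffer holds the stored checkpoints plus one re-decoded
segment, rather than all 250 frames, at the cost of decoding most frames
twice.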

> What QuickTime got right was having a ToC approach to video so being able
> to seek rapidly was possible without thrashing, whereas the stream
> oriented approaches we are stuck with now mean knowing which bit of the file
> to read to get the previous GOP is the hard part.
>

I don't understand. Can you explain this in more detail?

Rob