Re: [whatwg] VIDEO and pitchAdjustment
> On Sep 1, 2015, at 4:03 , Robert O'Callahan wrote:
>
> On Tue, Sep 1, 2015 at 8:02 PM, Kevin Marks wrote:
>
>> QuickTime supports full variable speed playback and has done for well over
>> a decade. With bidirectionally predicted frames you need a fair few buffers
>> anyway, so generalising to full variable wait is easier than posters above
>> claim - you need to work a GOP at a time, but memory buffering isn't the
>> big issue these days.
>
> "GOP"?

Group of Pictures. Video-speak for the run between random access points.

> How about a hard but realistic (IMHO) case: 4K video (4096 x 2160), 25 fps,
> keyframe every 10s. Storing all those frames takes 250 x 4096 x 2160 x 2
> bytes = 4.32 GiB. Reading back those frames would kill performance so that
> all has to stay in VRAM. I respectfully deny that in such a case, memory
> buffering "isn't a big issue".

well, 10s is a pretty long random access interval.

> Now that I think about it, I guess there are more complicated strategies
> available that would reduce memory usage at the expense of repeated
> decoding.

which indeed QuickTime implemented around 10 years ago.

> E.g. in a first pass, decode forward and store every Nth frame.
> Then as you play backwards you need only redecode N-1 intermediate frames
> at a time. I don't know whether HW decoder interfaces would actually let you
> implement that though...
>
>> What QuickTime got right was having a ToC approach to video so being able
>> to seek rapidly was possible without thrashing, whereas the stream
>> oriented approaches we are stuck with now mean knowing which bit of the file
>> to read to get the previous GOP is the hard part.
>
> I don't understand. Can you explain this in more detail?

The movie file structure (and hence MP4) has a table-of-contents approach to file structure; each frame has its timestamps, file location, size, and keyframe-nature stored in compact tables in the head of the file. This makes trick modes and so on easier; you're not reading the actual video to seek for a keyframe, and so on.

David Singer
Manager, Software Standards, Apple Inc.
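As a rough illustration of the table-of-contents idea described in this message, here is a minimal Python sketch. It assumes a simplified, hypothetical per-frame index (`SampleEntry` and `keyframe_at_or_before` are invented names, not the actual MP4 or QuickTime table layout) sorted by presentation time, and shows how the keyframe needed for a seek can be located from the index alone, without reading any video data.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class SampleEntry:
    """One row of a hypothetical per-frame index (not the real MP4 layout)."""
    timestamp: float   # presentation time in seconds
    offset: int        # byte position of the frame in the file
    size: int          # frame size in bytes
    is_keyframe: bool  # True for a random access point


def keyframe_at_or_before(table: Sequence[SampleEntry], t: float) -> SampleEntry:
    """Return the last keyframe with timestamp <= t, using only the index."""
    times = [entry.timestamp for entry in table]
    i = bisect_right(times, t) - 1          # last sample at or before t
    while i >= 0 and not table[i].is_keyframe:
        i -= 1                              # walk back to a random access point
    if i < 0:
        raise ValueError("no keyframe at or before the requested time")
    return table[i]


# Toy index: 25 fps, a keyframe every 5 frames, 1000 bytes per frame.
table = [SampleEntry(n / 25, 1000 * n, 1000, n % 5 == 0) for n in range(20)]
entry = keyframe_at_or_before(table, 0.5)
```

With the toy index above, a seek to 0.5 s resolves to the keyframe at frame 10, so a player would know which byte range to read before touching the media data.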
Re: [whatwg] VIDEO and pitchAdjustment
On Tue, Sep 1, 2015 at 10:55 AM, David Singer wrote:

> > On Sep 1, 2015, at 10:47 , Yay295 wrote:
> >
> > On Tue, Sep 1, 2015 at 11:30 AM, David Singer wrote:
> > > On Sep 1, 2015, at 4:03 , Robert O'Callahan wrote:
> > >> On Tue, Sep 1, 2015 at 8:02 PM, Kevin Marks wrote:
> > >> QuickTime supports full variable speed playback and has done for well over
> > >> a decade. With bidirectionally predicted frames you need a fair few buffers
> > >> anyway, so generalising to full variable wait is easier than posters above
> > >> claim - you need to work a GOP at a time, but memory buffering isn't the
> > >> big issue these days.
> > >
> > > "GOP"?
> >
> > Group of Pictures. Video-speak for the run between random access points.
> >
> > > How about a hard but realistic (IMHO) case: 4K video (4096 x 2160), 25 fps,
> > > keyframe every 10s. Storing all those frames takes 250 x 4096 x 2160 x 2
> > > bytes = 4.32 GiB. Reading back those frames would kill performance so that
> > > all has to stay in VRAM. I respectfully deny that in such a case, memory
> > > buffering "isn't a big issue".
> >
> > well, 10s is a pretty long random access interval.
> >
> > There's no way to know the distance between keyframes though. The video
> > could technically have only one keyframe and still work as a video.
>
> yes, but that is rare. There are indeed videos that don't play well
> backward, or consume lots of memory and/or CPU, but most are fine.
>
> > >> What QuickTime got right was having a ToC approach to video so being able
> > >> to seek rapidly was possible without thrashing, whereas the stream
> > >> oriented approaches we are stuck with now mean knowing which bit of the file
> > >> to read to get the previous GOP is the hard part.
> > >
> > > I don't understand. Can you explain this in more detail?

I explained the essential difference a while ago here:
http://lists.xiph.org/pipermail/vorbis-dev/2001-October/004846.html

The QuickTime file format defines movies that have tracks made of media; the tracks are an edit list on the media; the media have the frame layout information encoded.

> > The movie file structure (and hence MP4) has a table-of-contents
> > approach to file structure; each frame has its timestamps, file location,
> > size, and keyframe-nature stored in compact tables in the head of the file.
> > This makes trick modes and so on easier; you're not reading the actual
> > video to seek for a keyframe, and so on.
> >
> > I suppose the browser could generate this data the first time it reads
> > through the video. It would use a lot less memory. Though that sounds like
> > a problem for the browsers to solve, not the standard.
>
> There is no *generation* on the browser side; these tables are part of the
> file format.

Well, when it imports stream-oriented media it has to construct these in memory, but they can be saved out again. I know that in theory this made its way into the mp4 format, but I'm not sure how much of it is real.
Re: [whatwg] VIDEO and pitchAdjustment
On Tue, Sep 1, 2015 at 11:57 AM, David Singer wrote:

> > On Sep 1, 2015, at 11:36 , Kevin Marks wrote:
> >
> > > I suppose the browser could generate this data the first time it reads
> > > through the video. It would use a lot less memory. Though that sounds like
> > > a problem for the browsers to solve, not the standard.
> >
> > There is no *generation* on the browser side; these tables are part of
> > the file format.
> >
> > Well, when it imports stream-oriented media it has to construct these in
> > memory, but they can be saved out again. I know that in theory this made
> > its way into the mp4 format, but I'm not sure how much of it is real.
>
> Two different questions:
> a) do the QuickTime movie file format and the MP4 format contain these
> tables? Yes.
> b) if I open another format, what happens?
>
> For case (a), the situation may be more nuanced if Movie Fragments are in
> use (you then get the tables for each fragment of the movie, though they
> are easily coalesced as they arrive).
>
> For case (b), classic QuickTime used to 'convert to movie' in memory,
> building the tables. The situation is more nuanced on more recent engines.
>
> I think the point of the discussion is that one cannot dismiss trick modes
> such as reverse play as being unimplementable.

The other point for me is that, given http://aomedia.org/ announcing plans to create a new video file format to fix everything, this time we actually learn from this history and make one that is editable and seekable again.
Re: [whatwg] VIDEO and pitchAdjustment
> On Sep 1, 2015, at 11:36 , Kevin Marks wrote:
>
> > I suppose the browser could generate this data the first time it reads
> > through the video. It would use a lot less memory. Though that sounds like
> > a problem for the browsers to solve, not the standard.
>
> There is no *generation* on the browser side; these tables are part of the
> file format.
>
> Well, when it imports stream-oriented media it has to construct these in
> memory, but they can be saved out again. I know that in theory this made its
> way into the mp4 format, but I'm not sure how much of it is real.

Two different questions:
a) do the QuickTime movie file format and the MP4 format contain these tables? Yes.
b) if I open another format, what happens?

For case (a), the situation may be more nuanced if Movie Fragments are in use (you then get the tables for each fragment of the movie, though they are easily coalesced as they arrive).

For case (b), classic QuickTime used to 'convert to movie' in memory, building the tables. The situation is more nuanced on more recent engines.

I think the point of the discussion is that one cannot dismiss trick modes such as reverse play as being unimplementable.

David Singer
Manager, Software Standards, Apple Inc.
Re: [whatwg] VIDEO and pitchAdjustment
On Tue, Sep 1, 2015 at 11:30 AM, David Singer wrote:

> > On Sep 1, 2015, at 4:03 , Robert O'Callahan wrote:
> >> On Tue, Sep 1, 2015 at 8:02 PM, Kevin Marks wrote:
> >> QuickTime supports full variable speed playback and has done for well over
> >> a decade. With bidirectionally predicted frames you need a fair few buffers
> >> anyway, so generalising to full variable wait is easier than posters above
> >> claim - you need to work a GOP at a time, but memory buffering isn't the
> >> big issue these days.
> >
> > "GOP"?
>
> Group of Pictures. Video-speak for the run between random access points.
>
> > How about a hard but realistic (IMHO) case: 4K video (4096 x 2160), 25 fps,
> > keyframe every 10s. Storing all those frames takes 250 x 4096 x 2160 x 2
> > bytes = 4.32 GiB. Reading back those frames would kill performance so that
> > all has to stay in VRAM. I respectfully deny that in such a case, memory
> > buffering "isn't a big issue".
>
> well, 10s is a pretty long random access interval.

There's no way to know the distance between keyframes though. The video could technically have only one keyframe and still work as a video.

> >> What QuickTime got right was having a ToC approach to video so being able
> >> to seek rapidly was possible without thrashing, whereas the stream
> >> oriented approaches we are stuck with now mean knowing which bit of the file
> >> to read to get the previous GOP is the hard part.
> >
> > I don't understand. Can you explain this in more detail?
>
> The movie file structure (and hence MP4) has a table-of-contents approach
> to file structure; each frame has its timestamps, file location, size, and
> keyframe-nature stored in compact tables in the head of the file. This
> makes trick modes and so on easier; you're not reading the actual video to
> seek for a keyframe, and so on.

I suppose the browser could generate this data the first time it reads through the video. It would use a lot less memory. Though that sounds like a problem for the browsers to solve, not the standard.
Re: [whatwg] VIDEO and pitchAdjustment
> On Sep 1, 2015, at 10:47 , Yay295 wrote:
>
> On Tue, Sep 1, 2015 at 11:30 AM, David Singer wrote:
> > On Sep 1, 2015, at 4:03 , Robert O'Callahan wrote:
> >> On Tue, Sep 1, 2015 at 8:02 PM, Kevin Marks wrote:
> >> QuickTime supports full variable speed playback and has done for well over
> >> a decade. With bidirectionally predicted frames you need a fair few buffers
> >> anyway, so generalising to full variable wait is easier than posters above
> >> claim - you need to work a GOP at a time, but memory buffering isn't the
> >> big issue these days.
> >
> > "GOP"?
>
> Group of Pictures. Video-speak for the run between random access points.
>
> > How about a hard but realistic (IMHO) case: 4K video (4096 x 2160), 25 fps,
> > keyframe every 10s. Storing all those frames takes 250 x 4096 x 2160 x 2
> > bytes = 4.32 GiB. Reading back those frames would kill performance so that
> > all has to stay in VRAM. I respectfully deny that in such a case, memory
> > buffering "isn't a big issue".
>
> well, 10s is a pretty long random access interval.
>
> There's no way to know the distance between keyframes though. The video could
> technically have only one keyframe and still work as a video.

yes, but that is rare. There are indeed videos that don't play well backward, or consume lots of memory and/or CPU, but most are fine.

> >> What QuickTime got right was having a ToC approach to video so being able
> >> to seek rapidly was possible without thrashing, whereas the stream
> >> oriented approaches we are stuck with now mean knowing which bit of the file
> >> to read to get the previous GOP is the hard part.
> >
> > I don't understand. Can you explain this in more detail?
>
> The movie file structure (and hence MP4) has a table-of-contents approach to
> file structure; each frame has its timestamps, file location, size, and
> keyframe-nature stored in compact tables in the head of the file. This makes
> trick modes and so on easier; you're not reading the actual video to seek for
> a keyframe, and so on.
>
> I suppose the browser could generate this data the first time it reads
> through the video. It would use a lot less memory. Though that sounds like a
> problem for the browsers to solve, not the standard.

There is no *generation* on the browser side; these tables are part of the file format.

David Singer
Manager, Software Standards, Apple Inc.
Re: [whatwg] deprecating <keygen>
On Tue, 1 Sep 2015, henry.st...@bblfish.net wrote:

> As the WhatWG only recently moved to Github members here may not have
> noticed that <keygen> has been deprecated.
>
> I opened https://github.com/whatwg/html/issues/67 to give space for the
> discussion. It is a pity that this was closed so quickly ( within an
> hour ) without giving members and the public ( the users of the web )
> time to comment nor for their voice to be heard.
>
> This is a complex issue that involves many different levels of
> expertise, and it should not be handled so lightly.

The spec just reflects implementations. The majority of implementations of <keygen> (by usage) have said they want to drop it, and the other major implementation has never supported it.

The <keygen> element was originally (and for many years) purely a mostly-undocumented proprietary extension; at the time it was invented, the HTML spec was edited by the W3C and the W3C did not add it (they only ended up speccing it in their most recent HTML spec because they forked the WHATWG's spec which did define it -- indeed, even then, it was something that W3C HTML working group members argued should not have been included). It was only added to the WHATWG spec because one of the browser vendors said they could not remove support for it due to usage by enterprise customers; that browser vendor is now amongst one of the ones wanting to drop it.

As far as I can tell, therefore, things here are working exactly as one should expect.

It's worth noting that <keygen> is a pretty terrible API. I recommend approaching the groups writing new cryptography APIs, explaining your use cases, and making sure they are supported in up-and-coming, more widely supported, more secure, and more well-thought-out APIs.

--
Ian Hickson               U+1047E   )\._.,--,'``.fL
http://ln.hixie.ch/       U+263A    /, _.. \ _\ ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
Re: [whatwg] VIDEO and pitchAdjustment
On Wed, Sep 2, 2015 at 5:30 AM, David Singer wrote:

> On Sep 1, 2015, at 4:03 , Robert O'Callahan wrote:
> > How about a hard but realistic (IMHO) case: 4K video (4096 x 2160), 25 fps,
> > keyframe every 10s. Storing all those frames takes 250 x 4096 x 2160 x 2
> > bytes = 4.32 GiB. Reading back those frames would kill performance so that
> > all has to stay in VRAM. I respectfully deny that in such a case, memory
> > buffering "isn't a big issue".
>
> well, 10s is a pretty long random access interval.

It's easy to find sources on the Internet advising people to use 10s keyframe intervals.

> > Now that I think about it, I guess there are more complicated strategies
> > available that would reduce memory usage at the expense of repeated
> > decoding.
>
> which indeed QuickTime implemented around 10 years ago.

It appears that most platform and HW decoder interfaces are incompatible with this strategy, so in practice implementing this across platforms is still a big problem. Nevertheless we can hope for that situation to improve, and negative playback rates are implementable for some videos, so it makes sense to me to leave negative playback rates in the spec.

> The movie file structure (and hence MP4) has a table-of-contents approach
> to file structure; each frame has its timestamps, file location, size, and
> keyframe-nature stored in compact tables in the head of the file. This
> makes trick modes and so on easier; you're not reading the actual video to
> seek for a keyframe, and so on.

I think every important video container format has some kind of keyframe directory.

Rob
--
lbir ye,ea yer.tnietoehr rdn rdsme,anea lurpr edna e hnysnenh hhe uresyf toD selthor stor edna siewaoeodm or v sstvr esBa kbvted,t rdsme,aoreseoouoto o l euetiuruewFa kbn e hnystoivateweh uresyf tulsa rehr rdm or rnea lurpr .a war hsrer holsa rodvted,t nenh hneireseoouot.tniesiewaoeivatewt sstvr esn
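The "store every Nth frame, then redecode" strategy quoted in this message can be sketched in Python as follows. This is only a toy model under strong assumptions: each frame is taken to depend solely on the previously decoded frame (no B-frames or multiple reference frames), and `reverse_gop`, `toy_decode` and the integer "packets" are hypothetical stand-ins rather than any real decoder interface. A real codec would need to checkpoint full decoder state, which is exactly what the message doubts HW interfaces allow.

```python
from typing import Callable, Dict, Iterator, List, Optional, Sequence


def reverse_gop(
    packets: Sequence[int],
    decode: Callable[[Optional[int], int], int],
    n: int,
) -> Iterator[int]:
    """Yield the decoded frames of one GOP in reverse presentation order.

    decode(prev_frame, packet) returns the next frame; prev_frame is None
    for the keyframe that starts the GOP.
    """
    if n < 1:
        raise ValueError("n must be at least 1")

    # Pass 1: decode forward, keeping only every n-th frame as a checkpoint.
    checkpoints: Dict[int, int] = {}
    prev: Optional[int] = None
    for i, packet in enumerate(packets):
        prev = decode(prev, packet)
        if i % n == 0:
            checkpoints[i] = prev

    # Pass 2: visit segments from last to first, redecoding at most n-1
    # frames per segment from its checkpoint, then emit them backwards.
    for start in sorted(checkpoints, reverse=True):
        end = min(start + n, len(packets))
        segment: List[int] = [checkpoints.pop(start)]
        for i in range(start + 1, end):
            segment.append(decode(segment[-1], packets[i]))
        yield from reversed(segment)


def toy_decode(prev: Optional[int], packet: int) -> int:
    """Stand-in decoder: a keyframe carries a value, later packets are deltas."""
    return packet if prev is None else prev + packet


packets = [10, 1, 1, 1, 1, 1, 1, 1, 1, 1]   # decodes to frames 10..19
frames = list(reverse_gop(packets, toy_decode, 4))
```

In this model, peak storage is roughly one checkpoint per N frames plus one segment of N frames, instead of the whole GOP, at the cost of decoding most frames twice.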
Re: [whatwg] VIDEO and pitchAdjustment
QuickTime supports full variable speed playback and has done for well over a decade. With bidirectionally predicted frames you need a fair few buffers anyway, so generalising to full variable wait is easier than posters above claim - you need to work a GOP at a time, but memory buffering isn't the big issue these days.

What QuickTime got right was having a ToC approach to video so being able to seek rapidly was possible without thrashing, whereas the stream oriented approaches we are stuck with now mean knowing which bit of the file to read to get the previous GOP is the hard part.

On Fri, Aug 28, 2015 at 6:02 PM, Xidorn Quan wrote:

> On Sat, Aug 29, 2015 at 8:27 AM, Robert O'Callahan wrote:
> > On Sat, Aug 29, 2015 at 8:18 AM, James Ross wrote:
> >
> >> Support is certainly poor; Internet Explorer/Trident and Edge both support
> >> negative playback rates on desktop (I haven't tested mobile) but do so by
> >> simply showing the key frames as they are reached in reverse, in my testing.
> >
> > That's not so hard to implement, but it's also mostly useless since
> > keyframes are often several seconds apart or more.
>
> It could be useful for a few usecases like fast-backward. Windows
> Media Player does it this way.
>
> FWIW, QuickTime supports per-frame backward playback if you press and
> hold the left arrow. I guess they cannot guarantee the rate, which
> makes them require holding the key instead of providing a playback
> rate setting.
>
> - Xidorn
[whatwg] deprecating <keygen>
As the WhatWG only recently moved to Github members here may not have noticed that <keygen> has been deprecated.

I opened https://github.com/whatwg/html/issues/67 to give space for the discussion. It is a pity that this was closed so quickly (within an hour) without giving members and the public (the users of the web) time to comment nor for their voice to be heard.

This is a complex issue that involves many different levels of expertise, and it should not be handled so lightly.

Henry

Social Web Architect
http://bblfish.net/
Re: [whatwg] VIDEO and pitchAdjustment
On Mon, Aug 31, 2015 at 9:48 PM, Domenic Denicola wrote:

> From: Eric Carlson [mailto:eric.carl...@apple.com]
>
>> FWIW, Safari supports negative playback rates on the desktop and on iOS.
>>
>> ...
>>
>> The crash Garrett noted in Safari 8 is a bug that "only" happens with MSE
>> content.
>
> That's really helpful, thanks. Combined with Edge's keyframes-only support,
> it sounds like we should probably leave the spec as it is.
>
> Do you have thoughts on a mozPreservesPitch equivalent?

Should we just standardize HTMLMediaElement.preservesPitch, perhaps?

Note that WebKit also has webkitPreservesPitch, but I removed it from Blink because it didn't actually do anything in Chromium.

In both Gecko and WebKit it defaults to true. Is there anything else worth knowing before writing the spec for this?

Philip
Re: [whatwg] VIDEO and pitchAdjustment
On Tue, Sep 1, 2015 at 8:02 PM, Kevin Marks wrote:

> QuickTime supports full variable speed playback and has done for well over
> a decade. With bidirectionally predicted frames you need a fair few buffers
> anyway, so generalising to full variable wait is easier than posters above
> claim - you need to work a GOP at a time, but memory buffering isn't the
> big issue these days.

"GOP"?

How about a hard but realistic (IMHO) case: 4K video (4096 x 2160), 25 fps, keyframe every 10s. Storing all those frames takes 250 x 4096 x 2160 x 2 bytes = 4.32 GiB. Reading back those frames would kill performance so that all has to stay in VRAM. I respectfully deny that in such a case, memory buffering "isn't a big issue".

Now that I think about it, I guess there are more complicated strategies available that would reduce memory usage at the expense of repeated decoding. E.g. in a first pass, decode forward and store every Nth frame. Then as you play backwards you need only redecode N-1 intermediate frames at a time. I don't know whether HW decoder interfaces would actually let you implement that though...

> What QuickTime got right was having a ToC approach to video so being able
> to seek rapidly was possible without thrashing, whereas the stream
> oriented approaches we are stuck with now mean knowing which bit of the file
> to read to get the previous GOP is the hard part.

I don't understand. Can you explain this in more detail?

Rob
--
lbir ye,ea yer.tnietoehr rdn rdsme,anea lurpr edna e hnysnenh hhe uresyf toD selthor stor edna siewaoeodm or v sstvr esBa kbvted,t rdsme,aoreseoouoto o l euetiuruewFa kbn e hnystoivateweh uresyf tulsa rehr rdm or rnea lurpr .a war hsrer holsa rodvted,t nenh hneireseoouot.tniesiewaoeivatewt sstvr esn
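For readers who want to check the buffering estimate in this message, the following Python snippet redoes the multiplication with the same inputs (4096 x 2160 pixels, 2 bytes per pixel, 25 fps, a 10 s keyframe interval). The variable names are illustrative only. The product comes out at roughly 4.4 GB, or about 4.1 GiB, which is the same order of magnitude as the figure quoted, so the argument about memory pressure is unaffected.

```python
# Inputs taken from the message; names are illustrative.
width, height = 4096, 2160
bytes_per_pixel = 2
fps = 25
keyframe_interval_s = 10

frames_per_gop = fps * keyframe_interval_s            # 250 frames
total_bytes = frames_per_gop * width * height * bytes_per_pixel

print(f"{total_bytes / 1e9:.2f} GB, {total_bytes / 2**30:.2f} GiB")
```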