On Tue, Mar 12, 2019 at 05:05:00AM +0000, Andreas Rheinhardt wrote: > Michael Niedermayer: > > On Sun, Mar 10, 2019 at 11:03:00PM +0000, Andreas Rheinhardt wrote: > >> Michael Niedermayer: > >>> On Fri, Mar 08, 2019 at 10:25:59AM +0100, Andreas Rheinhardt wrote: > >>>> When the new incremental parser was introduced, the old parser was > >>>> kept, because the new parser was unable to handle the way SSA packets > >>>> are put into Matroska. But since 2014 (since > >>>> c7d8dbad14ed5fa3c217a4fc1790021d6c0b6416) this is no longer needed, so > >>>> that the old parser can be completely removed. > >>>> > >>>> Signed-off-by: Andreas Rheinhardt <andreas.rheinha...@googlemail.com> > >>>> --- > >>>> libavformat/matroskadec.c | 72 ++++++--------------------------------- > >>>> 1 file changed, 10 insertions(+), 62 deletions(-) > >>> > >>> This affects seeking: > >>> > >>> libavformat/tests/seek > >>> issues/388/Matrix.Reloaded.Trailer-640x346-XviD-1.0beta2-HE_AAC_subtitled.mkv > >>> -duration 400 >oldseek > >>> > >>> The file appears to be available here: > >>> https://samples.ffmpeg.org/Matroska/matrix/Matrix.Reloaded.Trailer-640x346-XviD-1.0beta2-HE_AAC_subtitled.mkv > >>> > >>> --- oldseek 2019-03-08 23:08:21.380042329 +0100 > >>> +++ newseek 2019-03-08 23:08:02.048041745 +0100 > >>> @@ -8,7 +8,7 @@ > >>> ret: 0 st:13 flags:1 dts: 86.750000 pts: 86.750000 pos: -1 > >>> size: 50436 > >>> ret:-1 st: 1 flags:0 ts: 250.577000 > >>> ret: 0 st: 1 flags:1 ts: 13.471000 > >>> -ret: 0 st:13 flags:1 dts: 1.776000 pts: 1.776000 pos: -1 > >>> size: 50436 > >>> +ret: 0 st:13 flags:1 dts: 0.000000 pts: 0.000000 pos: -1 > >>> size: 50436 > >>> ret:-1 st: 2 flags:0 ts: 176.365000 > >>> ret: 0 st: 2 flags:1 ts: 339.259000 > >>> ret: 0 st:13 flags:1 dts: 145.080000 pts: 145.080000 pos: -1 > >>> size: 50436 > >>> > >> This is not unexpected. The reason is that the old parser always > >> parses a whole cluster at once while the new parser parses clusters > >> incrementally, i.e. one block (I use block as a general term for > >> SimpleBlock or BlockGroup) at a time. The goal of incremental parsing > >> was reduced latency (and with my patch #6 one also achieves a > >> reduction of memory used and an increase in performance as well). > >> > >> With the old parser, the very first cluster gets parsed during > >> avformat_find_stream_info() and index entries were created for all the > >> keyframes contained therein (the cues only contain entries for the > >> main video track, so this only affects the other tracks). The 1.776 > >> pts corresponds to the block with the highest timestamp for track #1 > >> (or track #2 in Matroska's counting) in the first cluster (with a > >> timestamp of 1776ms). > >> > >> With the incremental parser, only one block of the audio track gets > >> parsed during avformat_find_stream_info() (it has a timestamp of 0) > >> and so only one index entry gets created for it. > >> > >> So when the seek based upon the audio track is performed, the used > >> index point has a timestamp of either 0ms or 1776ms. Then the > >> current_dts is updated based upon this timestamp. > >> > >> Now this file has an attached picture which gets translated into its > >> own track and upon every seek this picture will be put into the > >> raw_packet_buffer from which it will be read by ff_read_packet as the > >> first packet after each seek. Given that this packet doesn't have > >> timestamps, the timestamp will be set equal to current_dts (in > >> compute_pkt_fields()). Hence the discrepancy. > >> > >> Btw: Because of things like this, a fate test had to be changed during > >> the initial introduction of incremental parsing (in > >> 8336eb6f85e4b94b9c198b16bd0ac4178f4dba86). > > > > Sounds like undesirable behavior if not a bug > > > > The seek request is to 13.471000 and the result should not depend on > > what has been parsed or not prior to the seek request. > > also if a packet can be produced for 1.776000 which adequatly fullfills > > the seek reuest that is better than a more distant one as that would > > increase latency experienced by the user > > > If all one cares about is that the returned packet is near to the > desired timestamp, then matroska_read_seek succeeds in two cases: > a) The currently available index entries for the desired track are so > fine-grained that one can use this index to seek accurately. This > includes the standard case that one seeks according to a video track > for which the file provides cues (for the keyframes). > b) The desired timestamp is so big that it is beyond the last index > entry or the returned index entry is the last index entry. In this > case the file is read from the last index entry onward, so that a very > precise timestamp can be found. > > If one only seeks wrt the same track and if said track either has a > good index or no index, then this works well: It's clear that it works > in case there is a good index and if there is no index at all (because > in this case the index that is being built up in the background is > good for a whole initial portion of the file and said portion won't > have "holes"). > > But if one changes the track wrt to one seeks, problems can arise. > Think of the case where one first seeks via the cues according to the > video track to positions t_0 and t_1 and reads a few blocks at both > times. This includes the creation of index entries for other tracks > (at least for continuous tracks like audio tracks). If one now seeks > wrt to such an audio track to time t' with t_0<t'<t_1, one might end > up in a place pretty far away from t'. This happens for both parsing > methods. > > There is a case where the incremental parser fares better: If the > clusters referenced in the cues all begin with video keyframes > (standard mkvmerge behaviour, so pretty normal) and one only wants to > read a single frame from t_0 and t_1, then these frames will be video > keyframes and therefore no index entries for the other tracks will > have been created, so that if one seeks with respect to a different > track than the video track, the whole file will be read until one > reaches the desired timestamp. I therefore view the fact that > matroska_read_seek's result depends upon what has already been parsed > as a general bug in matroska_read_seek, not as a drawback of > incremental parsing compared to classical parsing.
thanks for the detailed analysis, and yes i agree this is a bug in existing code > > Here are the results for what I have just described. Old parsing: > ret: 0 st: 1 flags:1 dts: 0.000000 pts: 0.000000 pos: 6653 > size: 1792 > ret: 0 st:-1 flags:0 ts:-1.000000 > ret: 0 st: 0 flags:1 dts: NOPTS pts: 0.016000 pos: 20998 > size:134602 > ret: 0 st:-1 flags:1 ts: 161.894167 > ret: 0 st: 0 flags:1 dts: NOPTS pts: 161.696000 > pos:212409707 size:128647 > ret: 0 st: 0 flags:0 ts: 324.788000 > ret: 0 st: 0 flags:1 dts: NOPTS pts: 324.936000 > pos:356642724 size:195297 > ret: 0 st: 0 flags:1 ts: 87.683000 > ret: 0 st: 0 flags:1 dts: NOPTS pts: 87.456000 > pos:122474581 size:251000 > ret: 0 st: 1 flags:0 ts: 250.577000 > ret: 0 st: 1 flags:1 dts: 324.960000 pts: 324.960000 > pos:356873874 size: 1792 > ret: 0 st: 1 flags:1 ts: 13.471000 > ret: 0 st: 1 flags:1 dts: 0.512000 pts: 0.512000 pos: 450228 > size: 1792 > > Incremental parsing: > ret: 0 st: 1 flags:1 dts: 0.000000 pts: 0.000000 pos: 6653 > size: 1792 > ret: 0 st:-1 flags:0 ts:-1.000000 > ret: 0 st: 0 flags:1 dts: NOPTS pts: 0.016000 pos: 20998 > size:134602 > ret: 0 st:-1 flags:1 ts: 161.894167 > ret: 0 st: 0 flags:1 dts: NOPTS pts: 161.696000 > pos:212409707 size:128647 > ret: 0 st: 0 flags:0 ts: 324.788000 > ret: 0 st: 0 flags:1 dts: NOPTS pts: 324.936000 > pos:356642724 size:195297 > ret: 0 st: 0 flags:1 ts: 87.683000 > ret: 0 st: 0 flags:1 dts: NOPTS pts: 87.456000 > pos:122474581 size:251000 > ret: 0 st: 1 flags:0 ts: 250.577000 > ret: 0 st: 1 flags:1 dts: 250.592000 pts: 250.592000 > pos:296449118 size: 1792 > ret: 0 st: 1 flags:1 ts: 13.471000 > ret: 0 st: 2 flags:1 dts: 13.216000 pts: 13.216000 > pos:18014477 size: 20 > > > > (I see two main avenues to improve matroska_read_seek: > 1. Use the index of other tracks if the best index entry for the > desired track is too far away. (Problems: What about files with wrong > interleavement? What is too far away? If the targeted track has gaps > in it, then one might end up parsing a lot unnecessarily.) > 2. If the found index is too far away, one could simply parse the file > a bit more. > > 2. would have the disadvantage that even video tracks with proper cues > (i.e. an entry for every keyframe), but long GOPs would be parsed. > This could be mitigated by restricting it to tracks for which no cue > entries were present (i.e. when cue entries are present for a track, > then it is presumed that there is a cue entry for every keyframe of > said track) and which can be presumed to have no holes in it (i.e. no > subtitle tracks).) yes, this sounds reasonable to me but others may have other oppinions and there may of course be unexpected problems with this or any other approuch Thnaks [...] -- Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB Whats the most studid thing your enemy could do ? Blow himself up Whats the most studid thing you could do ? Give up your rights and freedom because your enemy blew himself up.
signature.asc
Description: PGP signature
_______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel