Re: [FFmpeg-devel] [PATCH 2/3] lavf/srtdec: Permit streaming input
lör 2024-03-30 klockan 17:02 +0100 skrev Andreas Rheinhardt: > If you force reordering on our users, then this is a breaking change Another thing that bears keeping in mind: sorting isn't consistent even across demuxers that sort. Most sort by timestamp (SUB_SORT_TS_POS), but vobsub sorts by byte position (SUB_SORT_POS_TS). It does this so as to be able to merge split subtitles later, but it does then not sort by timestamp after merging, as far as I can tell. I'm a bit busy with other stuff in the coming days, but hopefully I should be able to knock out a prototype by the end of the week that still provides sorted subs when possible, except when the input is a stream (which doesn't work now anyways). This kind of nonsense is why being strict with input is a good idea. /Tomas ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".
Re: [FFmpeg-devel] [PATCH 2/3] lavf/srtdec: Permit streaming input
On 30/03/2024 17:02, Andreas Rheinhardt wrote: Tomas Härdin: lör 2024-03-30 klockan 15:49 +0100 skrev Nicolas George: ASS demuxer sorts its packets because there is no guarantee the text are sorted in the file So? I'm making a normative argument. Normative about what? The ASS specification [1] explicitly says: "SSA does not care what order events are entered in. [1]: www.tcax.org/docs/ass-specs.htm It's worth noting that this document, while often referred to as the ASS specification, is not actually an authoritative specification. The ASS format is implementation-defined: https://github.com/libass/libass/wiki/ASS-File-Format-Guide#preface-other-format-guidesspecification But of course this doesn't change the fact that the format allows events to be in any order. (Though this is also not the same as not caring at all what order events are entered in: While any order is valid, the order of events will affect their layering and collision detection.) ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".
Re: [FFmpeg-devel] [PATCH 2/3] lavf/srtdec: Permit streaming input
lör 2024-03-30 klockan 17:02 +0100 skrev Andreas Rheinhardt: > > > ASS demuxer sorts its packets because > > > there is no guarantee the text are sorted in the file > > > > So? I'm making a normative argument. > > > > Normative about what? The ASS specification [1] explicitly says: > > "SSA does not care what order events are entered in. > > They could be entered in complete reverse order, and SSA would > still play everything correctly in the right order ie. you cannot > assume > that each dialogue line is in chronological order in the script > file." This describes what SSA does, not what lavf should do. lavf does not guarantee ordered subtitles in general, as far as I can tell. > If you force reordering on our users, then this is a breaking change Hence why we shouldn't put business logic in lavf. It would have been better to put it in ffmpeg.c. Not putting business logic in lavf is a point I've been making for years. Here we see an excellent reason why. We can maintain the current behavior by putting the sorting logic further up in lavf, assuming API users care. If API users don't care but CLI users do then we could put it in ffmpeg.c. If we are to maintain compatibility with an ill-defined set of API users rather than say specific packages (vlc, mpv, melt etc) then the only sensible solution is to put the sorting logic further up in lavf, say demux.c. /Tomas ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".
Re: [FFmpeg-devel] [PATCH 2/3] lavf/srtdec: Permit streaming input
Tomas Härdin: > lör 2024-03-30 klockan 15:49 +0100 skrev Nicolas George: >> Tomas Härdin (12024-03-30): >>> Players can implement sorting if they wish. >> >> API break. > > lavf's API provides no guarantees regarding presentation order >> > >>> Finally I will note that sorting does not happen when subtitles are >>> muxed in say mkv or avi, so the behavior is not even consistent >>> across >>> demuxers that support subtitles. >> >> AVI or MKV demuxer do not sort their packets because the packets are >> supposed to be already sorted; > > "Supposed to" is doing a lot of work here. IIRC AVI is fundamentally > incapable of providing out-of-order anything, this is true (B-frames > being notably haram in AVI). It is however capable of providing poorly > muxed files. For example it is perfecectly legal in AVI to mux all > video, then all audio, rather than the typical case where audio and > video are interleaved (the I in AVI). The same goes for many formats. > MOV supports basically any ordering via ctts shenanigans if I'm not > mistaken. > 1. AVI does not have a way to signal pts, but you can simply store stuff with reordering in it (in coding order); you just need something (most likely a decoder) that can properly reorder the frames for presentation. 2. IIRC our AVI demuxer tries to properly interleave the packets returned by AVI even if the input file is non-interleaved; the same goes for mov/mp4 (where the index and not the file position is used). >> ASS demuxer sorts its packets because >> there is no guarantee the text are sorted in the file > > So? I'm making a normative argument. > Normative about what? The ASS specification [1] explicitly says: "SSA does not care what order events are entered in. They could be entered in complete reverse order, and SSA would still play everything correctly in the right order ie. you cannot assume that each dialogue line is in chronological order in the script file." If you force reordering on our users, then this is a breaking change and it would impair the usefulness of libavformat: If users have to implement workaround for issues of certain file formats (for files which are not even broken according to the specification of the format in question!), then what point is there in using libavformat at all? - Andreas [1]: www.tcax.org/docs/ass-specs.htm ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".
Re: [FFmpeg-devel] [PATCH 2/3] lavf/srtdec: Permit streaming input
Tomas Härdin (12024-03-30): > lavf's API provides no guarantees regarding presentation order It used to work, you are about to require new code from applications for it to work. That is an API break, and pretending otherwise like you do here is just a cop out. > "Supposed to" is doing a lot of work here. IIRC AVI is fundamentally > incapable of providing out-of-order anything, this is true (B-frames > being notably haram in AVI). It is however capable of providing poorly > muxed files. And these files are widely considered broken and annoying. > So? I'm making a normative argument. I do not know what you try to mean here and I do not want to know. I hope you realize you have no authority to decide which ASS file is valid and which is not. -- Nicolas George ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".
Re: [FFmpeg-devel] [PATCH 2/3] lavf/srtdec: Permit streaming input
lör 2024-03-30 klockan 15:49 +0100 skrev Nicolas George: > Tomas Härdin (12024-03-30): > > Players can implement sorting if they wish. > > API break. lavf's API provides no guarantees regarding presentation order > > > Finally I will note that sorting does not happen when subtitles are > > muxed in say mkv or avi, so the behavior is not even consistent > > across > > demuxers that support subtitles. > > AVI or MKV demuxer do not sort their packets because the packets are > supposed to be already sorted; "Supposed to" is doing a lot of work here. IIRC AVI is fundamentally incapable of providing out-of-order anything, this is true (B-frames being notably haram in AVI). It is however capable of providing poorly muxed files. For example it is perfecectly legal in AVI to mux all video, then all audio, rather than the typical case where audio and video are interleaved (the I in AVI). The same goes for many formats. MOV supports basically any ordering via ctts shenanigans if I'm not mistaken. > ASS demuxer sorts its packets because > there is no guarantee the text are sorted in the file So? I'm making a normative argument. /Tomas ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".
Re: [FFmpeg-devel] [PATCH 2/3] lavf/srtdec: Permit streaming input
Tomas Härdin (12024-03-30): > Players can implement sorting if they wish. API break. > One potential solution is to do this style of parsing when the input is > non-seekable. But then we have the silly situation where streamed and > non-streamed behavior differs considerably. Sure, better break both cases than only one. > Finally I will note that sorting does not happen when subtitles are > muxed in say mkv or avi, so the behavior is not even consistent across > demuxers that support subtitles. AVI or MKV demuxer do not sort their packets because the packets are supposed to be already sorted; ASS demuxer sorts its packets because there is no guarantee the text are sorted in the file. [exploding head emoji] -- Nicolas George ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".
Re: [FFmpeg-devel] [PATCH 2/3] lavf/srtdec: Permit streaming input
lör 2024-03-30 klockan 12:44 +0100 skrev Nicolas George: > Tomas Härdin (12024-03-30): > > More interesting is fate-sub-srt-badsyntax. Despite the name it > > doesn't > > really have bad syntax, but its cues are in a random order. I guess > > it > > exists to test the cue sorting logic. But should subtitle demuxers > > really sort subtitles in this way? We don't do that for other > > formats > > that can have non-decreasing timestamps. For comparison, the WebVTT > > spec explicitly disallows decreasing timestamps. > > On the other hand, I remember seeing a lot of ASS files from the > fansub > world where titles, signs and karaokes are added at the end after the > dialogues, relying on sorting by the player. Players can implement sorting if they wish. Why should we misrepresent what the file says? These people could also fix their workflows, put karaoke lyrics in a separate stream etc.. Business logic does not belong in lavf, and certainly not deep within demuxers. > (But I guess in the New and Improved FFmpeg, files originating from > the > fansub world are not worth our time, it is enough to be able to > decode > files for Crunchyroll…) Snark doesn't help your case. One potential solution is to do this style of parsing when the input is non-seekable. But then we have the silly situation where streamed and non-streamed behavior differs considerably. Another way could be to move the sorting further up, into demux.c or so, extending the generic index functionality. Finally I will note that sorting does not happen when subtitles are muxed in say mkv or avi, so the behavior is not even consistent across demuxers that support subtitles. With logic further up, and proper discarding, sorting could be done there as well. /Tomas ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".
Re: [FFmpeg-devel] [PATCH 2/3] lavf/srtdec: Permit streaming input
Tomas Härdin (12024-03-30): > More interesting is fate-sub-srt-badsyntax. Despite the name it doesn't > really have bad syntax, but its cues are in a random order. I guess it > exists to test the cue sorting logic. But should subtitle demuxers > really sort subtitles in this way? We don't do that for other formats > that can have non-decreasing timestamps. For comparison, the WebVTT > spec explicitly disallows decreasing timestamps. On the other hand, I remember seeing a lot of ASS files from the fansub world where titles, signs and karaokes are added at the end after the dialogues, relying on sorting by the player. (But I guess in the New and Improved FFmpeg, files originating from the fansub world are not worth our time, it is enough to be able to decode files for Crunchyroll…) -- Nicolas George ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".
Re: [FFmpeg-devel] [PATCH 2/3] lavf/srtdec: Permit streaming input
On Sat, Mar 30, 2024 at 9:31 AM Tomas Härdin wrote: > lör 2024-03-30 klockan 00:35 +0100 skrev Michael Niedermayer: > > breaks fate here: > > > > --- ./tests/ref/fate/sub-srt-madness-timeshift 2024-03-29 > > 20:43:34.617419731 +0100 > > +++ tests/data/fate/sub-srt-madness-timeshift 2024-03-30 > > Sorry but this file is utter crap and shouldn't be part of FATE. Look > at this: > > > 53 > > 00:00:01,111 --> 00:00:02,222 > > okay, let's make things easy > > > > 21 lol jk > > 00:00:03,333 --> 00:00:04,444 > > hello > > 5 > > > > > > don't forget me. > > 3 > > > > > > 00:00:05,555 --> 00:00:06,666 > > > > > > no. > > let's add some fun > > 44 yes we can > > 00:00:07,777 --> 00:00:06,666 > > let's do it in reverse bc wtf not > > > > This file is crafted to test srtdec as it is, rather than following > what passes for an SRT spec (that doom9 forum post[1] and the videolan > wiki[2]). We should remove it, or keep it as a sample that should fail > parsing. > > More interesting is fate-sub-srt-badsyntax. Despite the name it doesn't > really have bad syntax, but its cues are in a random order. I guess it > exists to test the cue sorting logic. But should subtitle demuxers > really sort subtitles in this way? We don't do that for other formats > that can have non-decreasing timestamps. For comparison, the WebVTT > spec explicitly disallows decreasing timestamps. I also wonder how this > file was created. My guess is "via a text editor" since it seems to > consist of bits from different SRT files. There's little reason to > support such deliberately broken files, at least having a bunch of > sorting logic just for it. There's no reason we couldn't output packets > in stored order. > > The rest of the subtitle test cases pass. > I agree, current subtitles not being handled in streamed way is bad practice. If some subtitle have wrong order of items, than new generic subtitle filter could be implemented which would fix that by either using queue or seeking around in subtitle stream. > > /Tomas > > [1] https://forum.doom9.org/showthread.php?p=470941#post470941 > [2] https://wiki.videolan.org/SubRip/ > ___ > ffmpeg-devel mailing list > ffmpeg-devel@ffmpeg.org > https://ffmpeg.org/mailman/listinfo/ffmpeg-devel > > To unsubscribe, visit link above, or email > ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe". > ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".
Re: [FFmpeg-devel] [PATCH 2/3] lavf/srtdec: Permit streaming input
lör 2024-03-30 klockan 00:35 +0100 skrev Michael Niedermayer: > breaks fate here: > > --- ./tests/ref/fate/sub-srt-madness-timeshift 2024-03-29 > 20:43:34.617419731 +0100 > +++ tests/data/fate/sub-srt-madness-timeshift 2024-03-30 Sorry but this file is utter crap and shouldn't be part of FATE. Look at this: > 53 > 00:00:01,111 --> 00:00:02,222 > okay, let's make things easy > > 21 lol jk > 00:00:03,333 --> 00:00:04,444 > hello > 5 > > > don't forget me. > 3 > > > 00:00:05,555 --> 00:00:06,666 > > > no. > let's add some fun > 44 yes we can > 00:00:07,777 --> 00:00:06,666 > let's do it in reverse bc wtf not > This file is crafted to test srtdec as it is, rather than following what passes for an SRT spec (that doom9 forum post[1] and the videolan wiki[2]). We should remove it, or keep it as a sample that should fail parsing. More interesting is fate-sub-srt-badsyntax. Despite the name it doesn't really have bad syntax, but its cues are in a random order. I guess it exists to test the cue sorting logic. But should subtitle demuxers really sort subtitles in this way? We don't do that for other formats that can have non-decreasing timestamps. For comparison, the WebVTT spec explicitly disallows decreasing timestamps. I also wonder how this file was created. My guess is "via a text editor" since it seems to consist of bits from different SRT files. There's little reason to support such deliberately broken files, at least having a bunch of sorting logic just for it. There's no reason we couldn't output packets in stored order. The rest of the subtitle test cases pass. /Tomas [1] https://forum.doom9.org/showthread.php?p=470941#post470941 [2] https://wiki.videolan.org/SubRip/ ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".
Re: [FFmpeg-devel] [PATCH 2/3] lavf/srtdec: Permit streaming input
lör 2024-03-30 klockan 00:35 +0100 skrev Michael Niedermayer: > On Thu, Mar 28, 2024 at 11:57:57PM +0100, Tomas Härdin wrote: > > Here as well > > > libavformat/srtdec.c | 211 -- > > - > > tests/ref/fate/sub-srt-rrn-remux | 4 > > 2 files changed, 116 insertions(+), 99 deletions(-) > > 699d8b957286e190de6d5ca5cd17e67bb59dab7c 0002-lavf-srtdec-Permit- > > streaming-input.patch > > From 6d0684ca6fe02d80fc07a622fb85445a6917c29f Mon Sep 17 00:00:00 > > 2001 > > From: =?UTF-8?q?Tomas=20H=C3=A4rdin?= > > Date: Thu, 28 Mar 2024 22:15:18 +0100 > > Subject: [PATCH 2/3] lavf/srtdec: Permit streaming input > > > > This is largely a rewrite. > > > > Read packets in srt_read_packet() rather than reading the entire > > file in srt_read_header(). > > Rely on AVFMT_GENERIC_INDEX for seeking. > > Allow zero-length packets (same as WebVTT). > > The implementation before this is broken in at least the following > > ways: > > > > Decimals like .99 are silently accepted and converted to > > 999.999 seconds. > > This is because no verification is done on the milliseconds part. > > This patch enforces that the minutes and seconds parts are 00-59, > > and the milliseconds 000-999. > > It's not perfect since FFmpeg doesn't have regex functionality or > > indeed any kind of parsing framework, > > but it's better than before. > > > > Segmenting cues by lines that consist of just a single integer is > > incredibly wrong, > > since the subtitle text itself may well contain lines that are just > > a lone integer. > > This means files written with CR line endings that have text with > > lone integers are > > parsed in a completely broken manner. Neither can we segment by > > lines containing --> > > since this is permissible in SubRip (as far as I can tell). WebVTT > > explicitly forbids it however. > > --- > > libavformat/srtdec.c | 211 --- > > > > tests/ref/fate/sub-srt-rrn-remux | 4 + > > 2 files changed, 116 insertions(+), 99 deletions(-) > > breaks fate here: > > --- ./tests/ref/fate/sub-srt-madness-timeshift 2024-03-29 > 20:43:34.617419731 +0100 > +++ tests/data/fate/sub-srt-madness-timeshift 2024-03-30 > 00:30:08.776949369 +0100 > @@ -3,34 +3,7 @@ > okay, let's make things easy > > 2 > -00:00:05,160 --> 00:00:05,263 > -31 i'm a number but the only payload so please keep me :) > - > -3 > 00:00:06,473 --> 00:00:07,584 > hello > 5 > -don't forget me. > - > -4 > -00:00:08,695 --> 00:00:09,806 > -no. > -let's add some fun > - > -5 > -00:00:10,917 --> 00:00:12,028 > -let's do it in reverse bc wtf not > -45 yes this is a number but i'm actually part of the sub > - > -6 > -00:00:12,028 --> 00:00:13,139 > -1 > -0 > -next is negative, not a chapnum ;) > --1 > - > -7 > -00:00:13,241 --> 00:00:13,263 > -credits > -2015 > > Test sub-srt-madness-timeshift failed. Look at tests/data/fate/sub- > srt-madness-timeshift.err for details. > tests/Makefile:309: recipe for target 'fate-sub-srt-madness- > timeshift' failed > make: *** [fate-sub-srt-madness-timeshift] Error 1 Can confirm. I was busy with fate-sub-srt-rrn-remux which popped up when running fate that I forgot to run a full fate after fixing it. Will submit a new patch series later /Tomas ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".
Re: [FFmpeg-devel] [PATCH 2/3] lavf/srtdec: Permit streaming input
On Thu, Mar 28, 2024 at 11:57:57PM +0100, Tomas Härdin wrote: > Here as well > libavformat/srtdec.c | 211 > --- > tests/ref/fate/sub-srt-rrn-remux |4 > 2 files changed, 116 insertions(+), 99 deletions(-) > 699d8b957286e190de6d5ca5cd17e67bb59dab7c > 0002-lavf-srtdec-Permit-streaming-input.patch > From 6d0684ca6fe02d80fc07a622fb85445a6917c29f Mon Sep 17 00:00:00 2001 > From: =?UTF-8?q?Tomas=20H=C3=A4rdin?= > Date: Thu, 28 Mar 2024 22:15:18 +0100 > Subject: [PATCH 2/3] lavf/srtdec: Permit streaming input > > This is largely a rewrite. > > Read packets in srt_read_packet() rather than reading the entire file in > srt_read_header(). > Rely on AVFMT_GENERIC_INDEX for seeking. > Allow zero-length packets (same as WebVTT). > The implementation before this is broken in at least the following ways: > > Decimals like .99 are silently accepted and converted to 999.999 seconds. > This is because no verification is done on the milliseconds part. > This patch enforces that the minutes and seconds parts are 00-59, and the > milliseconds 000-999. > It's not perfect since FFmpeg doesn't have regex functionality or indeed any > kind of parsing framework, > but it's better than before. > > Segmenting cues by lines that consist of just a single integer is incredibly > wrong, > since the subtitle text itself may well contain lines that are just a lone > integer. > This means files written with CR line endings that have text with lone > integers are > parsed in a completely broken manner. Neither can we segment by lines > containing --> > since this is permissible in SubRip (as far as I can tell). WebVTT explicitly > forbids it however. > --- > libavformat/srtdec.c | 211 --- > tests/ref/fate/sub-srt-rrn-remux | 4 + > 2 files changed, 116 insertions(+), 99 deletions(-) breaks fate here: --- ./tests/ref/fate/sub-srt-madness-timeshift 2024-03-29 20:43:34.617419731 +0100 +++ tests/data/fate/sub-srt-madness-timeshift 2024-03-30 00:30:08.776949369 +0100 @@ -3,34 +3,7 @@ okay, let's make things easy 2 -00:00:05,160 --> 00:00:05,263 -31 i'm a number but the only payload so please keep me :) - -3 00:00:06,473 --> 00:00:07,584 hello 5 -don't forget me. - -4 -00:00:08,695 --> 00:00:09,806 -no. -let's add some fun - -5 -00:00:10,917 --> 00:00:12,028 -let's do it in reverse bc wtf not -45 yes this is a number but i'm actually part of the sub - -6 -00:00:12,028 --> 00:00:13,139 -1 -0 -next is negative, not a chapnum ;) --1 - -7 -00:00:13,241 --> 00:00:13,263 -credits -2015 Test sub-srt-madness-timeshift failed. Look at tests/data/fate/sub-srt-madness-timeshift.err for details. tests/Makefile:309: recipe for target 'fate-sub-srt-madness-timeshift' failed make: *** [fate-sub-srt-madness-timeshift] Error 1 [...] -- Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB You can kill me, but you cannot change the truth. signature.asc Description: PGP signature ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".
Re: [FFmpeg-devel] [PATCH 2/3] lavf/srtdec: Permit streaming input
Here as well From 6d0684ca6fe02d80fc07a622fb85445a6917c29f Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Tomas=20H=C3=A4rdin?= Date: Thu, 28 Mar 2024 22:15:18 +0100 Subject: [PATCH 2/3] lavf/srtdec: Permit streaming input This is largely a rewrite. Read packets in srt_read_packet() rather than reading the entire file in srt_read_header(). Rely on AVFMT_GENERIC_INDEX for seeking. Allow zero-length packets (same as WebVTT). The implementation before this is broken in at least the following ways: Decimals like .99 are silently accepted and converted to 999.999 seconds. This is because no verification is done on the milliseconds part. This patch enforces that the minutes and seconds parts are 00-59, and the milliseconds 000-999. It's not perfect since FFmpeg doesn't have regex functionality or indeed any kind of parsing framework, but it's better than before. Segmenting cues by lines that consist of just a single integer is incredibly wrong, since the subtitle text itself may well contain lines that are just a lone integer. This means files written with CR line endings that have text with lone integers are parsed in a completely broken manner. Neither can we segment by lines containing --> since this is permissible in SubRip (as far as I can tell). WebVTT explicitly forbids it however. --- libavformat/srtdec.c | 211 --- tests/ref/fate/sub-srt-rrn-remux | 4 + 2 files changed, 116 insertions(+), 99 deletions(-) diff --git a/libavformat/srtdec.c b/libavformat/srtdec.c index 6bf73599a7..c74d40b726 100644 --- a/libavformat/srtdec.c +++ b/libavformat/srtdec.c @@ -28,7 +28,8 @@ #include "libavutil/intreadwrite.h" typedef struct { -FFDemuxSubtitlesQueue q; +AVBPrint buf; +FFTextReader tr; } SRTContext; static int srt_probe(const AVProbeData *p) @@ -72,54 +73,66 @@ struct event_info { static int get_event_info(const char *line, struct event_info *ei) { -int hh1, mm1, ss1, ms1; -int hh2, mm2, ss2, ms2; - -ei->x1 = ei->x2 = ei->y1 = ei->y2 = ei->duration = -1; -ei->pts = AV_NOPTS_VALUE; -ei->pos = -1; -if (sscanf(line, "%d:%d:%d%*1[,.]%d --> %d:%d:%d%*1[,.]%d" +unsigned int hh1, mm1, ss1, ms1; +unsigned int hh2, mm2, ss2, ms2; +int n, m = 0; + +// require exactly two digits for mm and ss, three digits for ms +n = sscanf(line, "%*u:%*1u%*1u:%*1u%*1u%*1[,.]%*1u%*1u%*1u --> %*u:%*1u%*1u:%*1u%*1u%*1[,.]%*1u%*1u%*1u%n", ); +if (n < 0 || m <= 0) +return -1; +n = sscanf(line, "%u:%u:%u%*1[,.]%u --> %u:%u:%u%*1[,.]%u" "%*[ ]X1:%"PRId32" X2:%"PRId32" Y1:%"PRId32" Y2:%"PRId32, , , , , , , , , - >x1, >x2, >y1, >y2) >= 8) { + >x1, >x2, >y1, >y2); +// do not accept partial positions (9 <= n <= 11) +if ((n == 8 || n == 12) && +// require timestamps to be well-formed +mm1 < 60 && ss1 < 60 && ms1 < 1000 && +mm2 < 60 && ss2 < 60 && ms2 < 1000) { +// hh is converted to long long before the multiplication +// hence this cannot overflow even if hh == UINT_MAX const int64_t start = (hh1*3600LL + mm1*60LL + ss1) * 1000LL + ms1; const int64_t end = (hh2*3600LL + mm2*60LL + ss2) * 1000LL + ms2; -ei->duration = end - start; -ei->pts = start; -return 0; +// accept start == end (hidden subtitles) since a FATE test requires it +// WebVTT by contrast does not allow this +if (start <= end) { +ei->duration = end - start; +ei->pts = start; +return 0; +} } return -1; } -static int add_event(FFDemuxSubtitlesQueue *q, AVBPrint *buf, char *line_cache, - const struct event_info *ei, int append_cache) +static int output_packet(AVBPrint *buf, + const struct event_info *ei, AVPacket *pkt) { -if (append_cache && line_cache[0]) -av_bprintf(buf, "%s\n", line_cache); -line_cache[0] = 0; +int ret; if (!av_bprint_is_complete(buf)) return AVERROR(ENOMEM); +// len < size throughout this loop, so we don't need to +// call av_bprint_is_complete() twice like the old code did while (buf->len > 0 && buf->str[buf->len - 1] == '\n') buf->str[--buf->len] = 0; -if (buf->len) { -AVPacket *sub = ff_subtitles_queue_insert_bprint(q, buf, 0); -if (!sub) -return AVERROR(ENOMEM); -av_bprint_clear(buf); -sub->pos = ei->pos; -sub->pts = ei->pts; -sub->duration = ei->duration; -if (ei->x1 != -1) { -uint8_t *p = av_packet_new_side_data(sub, AV_PKT_DATA_SUBTITLE_POSITION, 16); -if (p) { -AV_WL32(p, ei->x1); -AV_WL32(p + 4, ei->y1); -AV_WL32(p + 8, ei->x2); -AV_WL32(p + 12, ei->y2); -} +ret = av_new_packet(pkt, buf->len);
[FFmpeg-devel] [PATCH 2/3] lavf/srtdec: Permit streaming input
I am once again asking more people on this list to peruse https://langsec.org/ /Tomas ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".