Re: [FFmpeg-devel] [PATCH 2/3] lavf/srtdec: Permit streaming input

2024-04-01 Thread Tomas Härdin
lör 2024-03-30 klockan 17:02 +0100 skrev Andreas Rheinhardt:
> If you force reordering on our users, then this is a breaking change

Another thing that bears keeping in mind: sorting isn't consistent even
across demuxers that sort. Most sort by timestamp (SUB_SORT_TS_POS),
but vobsub sorts by byte position (SUB_SORT_POS_TS). It does this so as
to be able to merge split subtitles later, but it does then not sort by
timestamp after merging, as far as I can tell.

I'm a bit busy with other stuff in the coming days, but hopefully I
should be able to knock out a prototype by the end of the week that
still provides sorted subs when possible, except when the input is a
stream (which doesn't work now anyways).

This kind of nonsense is why being strict with input is a good idea.

/Tomas
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".


Re: [FFmpeg-devel] [PATCH 2/3] lavf/srtdec: Permit streaming input

2024-04-01 Thread arch1t3cht

On 30/03/2024 17:02, Andreas Rheinhardt wrote:

Tomas Härdin:

lör 2024-03-30 klockan 15:49 +0100 skrev Nicolas George:

ASS demuxer sorts its packets because
there is no guarantee the text are sorted in the file


So? I'm making a normative argument.



Normative about what? The ASS specification [1] explicitly says:

"SSA does not care what order events are entered in.

[1]: www.tcax.org/docs/ass-specs.htm


It's worth noting that this document, while often referred to as the
ASS specification, is not actually an authoritative specification.
The ASS format is implementation-defined:

https://github.com/libass/libass/wiki/ASS-File-Format-Guide#preface-other-format-guidesspecification

But of course this doesn't change the fact that the format allows events
to be in any order. (Though this is also not the same as not caring at
all what order events are entered in: While any order is valid, the 
order of events will affect their layering and collision detection.)


___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".


Re: [FFmpeg-devel] [PATCH 2/3] lavf/srtdec: Permit streaming input

2024-03-30 Thread Tomas Härdin
lör 2024-03-30 klockan 17:02 +0100 skrev Andreas Rheinhardt:
> > > ASS demuxer sorts its packets because
> > > there is no guarantee the text are sorted in the file
> > 
> > So? I'm making a normative argument.
> > 
> 
> Normative about what? The ASS specification [1] explicitly says:
> 
> "SSA does not care what order events are entered in.
> 
>  They could be entered in complete reverse order, and SSA would
> still play everything correctly in the right order ie. you cannot
> assume
> that each dialogue line is in chronological order in the script
> file."

This describes what SSA does, not what lavf should do. lavf does not
guarantee ordered subtitles in general, as far as I can tell.

> If you force reordering on our users, then this is a breaking change

Hence why we shouldn't put business logic in lavf. It would have been
better to put it in ffmpeg.c. Not putting business logic in lavf is a
point I've been making for years. Here we see an excellent reason why.

We can maintain the current behavior by putting the sorting logic
further up in lavf, assuming API users care. If API users don't care
but CLI users do then we could put it in ffmpeg.c.

If we are to maintain compatibility with an ill-defined set of API
users rather than say specific packages (vlc, mpv, melt etc) then the
only sensible solution is to put the sorting logic further up in lavf,
say demux.c.

/Tomas
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".


Re: [FFmpeg-devel] [PATCH 2/3] lavf/srtdec: Permit streaming input

2024-03-30 Thread Andreas Rheinhardt
Tomas Härdin:
> lör 2024-03-30 klockan 15:49 +0100 skrev Nicolas George:
>> Tomas Härdin (12024-03-30):
>>> Players can implement sorting if they wish.
>>
>> API break.
> 
> lavf's API provides no guarantees regarding presentation order
>>
> 
>>> Finally I will note that sorting does not happen when subtitles are
>>> muxed in say mkv or avi, so the behavior is not even consistent
>>> across
>>> demuxers that support subtitles.
>>
>> AVI or MKV demuxer do not sort their packets because the packets are
>> supposed to be already sorted;
> 
> "Supposed to" is doing a lot of work here. IIRC AVI is fundamentally
> incapable of providing out-of-order anything, this is true (B-frames
> being notably haram in AVI). It is however capable of providing poorly
> muxed files. For example it is perfecectly legal in AVI to mux all
> video, then all audio, rather than the typical case where audio and
> video are interleaved (the I in AVI). The same goes for many formats.
> MOV supports basically any ordering via ctts shenanigans if I'm not
> mistaken.
> 

1. AVI does not have a way to signal pts, but you can simply store stuff
with reordering in it (in coding order); you just need something (most
likely a decoder) that can properly reorder the frames for presentation.
2. IIRC our AVI demuxer tries to properly interleave the packets
returned by AVI even if the input file is non-interleaved; the same goes
for mov/mp4 (where the index and not the file position is used).

>> ASS demuxer sorts its packets because
>> there is no guarantee the text are sorted in the file
> 
> So? I'm making a normative argument.
> 

Normative about what? The ASS specification [1] explicitly says:

"SSA does not care what order events are entered in.

 They could be entered in complete reverse order, and SSA would
still play everything correctly in the right order ie. you cannot assume
that each dialogue line is in chronological order in the script file."

If you force reordering on our users, then this is a breaking change and
it would impair the usefulness of libavformat: If users have to
implement workaround for issues of certain file formats (for files which
are not even broken according to the specification of the format in
question!), then what point is there in using libavformat at all?

- Andreas

[1]: www.tcax.org/docs/ass-specs.htm

___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".


Re: [FFmpeg-devel] [PATCH 2/3] lavf/srtdec: Permit streaming input

2024-03-30 Thread Nicolas George
Tomas Härdin (12024-03-30):
> lavf's API provides no guarantees regarding presentation order

It used to work, you are about to require new code from applications for
it to work. That is an API break, and pretending otherwise like you do
here is just a cop out.

> "Supposed to" is doing a lot of work here. IIRC AVI is fundamentally
> incapable of providing out-of-order anything, this is true (B-frames
> being notably haram in AVI). It is however capable of providing poorly
> muxed files.

And these files are widely considered broken and annoying.

> So? I'm making a normative argument.

I do not know what you try to mean here and I do not want to know. I
hope you realize you have no authority to decide which ASS file is valid
and which is not.

-- 
  Nicolas George
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".


Re: [FFmpeg-devel] [PATCH 2/3] lavf/srtdec: Permit streaming input

2024-03-30 Thread Tomas Härdin
lör 2024-03-30 klockan 15:49 +0100 skrev Nicolas George:
> Tomas Härdin (12024-03-30):
> > Players can implement sorting if they wish.
> 
> API break.

lavf's API provides no guarantees regarding presentation order
> 

> > Finally I will note that sorting does not happen when subtitles are
> > muxed in say mkv or avi, so the behavior is not even consistent
> > across
> > demuxers that support subtitles.
> 
> AVI or MKV demuxer do not sort their packets because the packets are
> supposed to be already sorted;

"Supposed to" is doing a lot of work here. IIRC AVI is fundamentally
incapable of providing out-of-order anything, this is true (B-frames
being notably haram in AVI). It is however capable of providing poorly
muxed files. For example it is perfecectly legal in AVI to mux all
video, then all audio, rather than the typical case where audio and
video are interleaved (the I in AVI). The same goes for many formats.
MOV supports basically any ordering via ctts shenanigans if I'm not
mistaken.

> ASS demuxer sorts its packets because
> there is no guarantee the text are sorted in the file

So? I'm making a normative argument.

/Tomas
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".


Re: [FFmpeg-devel] [PATCH 2/3] lavf/srtdec: Permit streaming input

2024-03-30 Thread Nicolas George
Tomas Härdin (12024-03-30):
> Players can implement sorting if they wish.

API break.

> One potential solution is to do this style of parsing when the input is
> non-seekable. But then we have the silly situation where streamed and
> non-streamed behavior differs considerably.

Sure, better break both cases than only one.

> Finally I will note that sorting does not happen when subtitles are
> muxed in say mkv or avi, so the behavior is not even consistent across
> demuxers that support subtitles.

AVI or MKV demuxer do not sort their packets because the packets are
supposed to be already sorted; ASS demuxer sorts its packets because
there is no guarantee the text are sorted in the file. [exploding head
emoji]

-- 
  Nicolas George
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".


Re: [FFmpeg-devel] [PATCH 2/3] lavf/srtdec: Permit streaming input

2024-03-30 Thread Tomas Härdin
lör 2024-03-30 klockan 12:44 +0100 skrev Nicolas George:
> Tomas Härdin (12024-03-30):
> > More interesting is fate-sub-srt-badsyntax. Despite the name it
> > doesn't
> > really have bad syntax, but its cues are in a random order. I guess
> > it
> > exists to test the cue sorting logic. But should subtitle demuxers
> > really sort subtitles in this way? We don't do that for other
> > formats
> > that can have non-decreasing timestamps. For comparison, the WebVTT
> > spec explicitly disallows decreasing timestamps.
> 
> On the other hand, I remember seeing a lot of ASS files from the
> fansub
> world where titles, signs and karaokes are added at the end after the
> dialogues, relying on sorting by the player.

Players can implement sorting if they wish. Why should we misrepresent
what the file says? These people could also fix their workflows, put
karaoke lyrics in a separate stream etc.. Business logic does not
belong in lavf, and certainly not deep within demuxers.

> (But I guess in the New and Improved FFmpeg, files originating from
> the
> fansub world are not worth our time, it is enough to be able to
> decode
> files for Crunchyroll…)

Snark doesn't help your case.

One potential solution is to do this style of parsing when the input is
non-seekable. But then we have the silly situation where streamed and
non-streamed behavior differs considerably.

Another way could be to move the sorting further up, into demux.c or
so, extending the generic index functionality.

Finally I will note that sorting does not happen when subtitles are
muxed in say mkv or avi, so the behavior is not even consistent across
demuxers that support subtitles. With logic further up, and proper
discarding, sorting could be done there as well.

/Tomas
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".


Re: [FFmpeg-devel] [PATCH 2/3] lavf/srtdec: Permit streaming input

2024-03-30 Thread Nicolas George
Tomas Härdin (12024-03-30):
> More interesting is fate-sub-srt-badsyntax. Despite the name it doesn't
> really have bad syntax, but its cues are in a random order. I guess it
> exists to test the cue sorting logic. But should subtitle demuxers
> really sort subtitles in this way? We don't do that for other formats
> that can have non-decreasing timestamps. For comparison, the WebVTT
> spec explicitly disallows decreasing timestamps.

On the other hand, I remember seeing a lot of ASS files from the fansub
world where titles, signs and karaokes are added at the end after the
dialogues, relying on sorting by the player.

(But I guess in the New and Improved FFmpeg, files originating from the
fansub world are not worth our time, it is enough to be able to decode
files for Crunchyroll…)

-- 
  Nicolas George
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".


Re: [FFmpeg-devel] [PATCH 2/3] lavf/srtdec: Permit streaming input

2024-03-30 Thread Paul B Mahol
On Sat, Mar 30, 2024 at 9:31 AM Tomas Härdin  wrote:

> lör 2024-03-30 klockan 00:35 +0100 skrev Michael Niedermayer:
> > breaks fate here:
> >
> > --- ./tests/ref/fate/sub-srt-madness-timeshift  2024-03-29
> > 20:43:34.617419731 +0100
> > +++ tests/data/fate/sub-srt-madness-timeshift   2024-03-30
>
> Sorry but this file is utter crap and shouldn't be part of FATE. Look
> at this:
>
> > 53
> > 00:00:01,111 --> 00:00:02,222
> > okay, let's make things easy
> >
> > 21 lol jk
> > 00:00:03,333 --> 00:00:04,444
> > hello
> > 5
> >
> >
> > don't forget me.
> > 3
> >
> >
> > 00:00:05,555 --> 00:00:06,666
> >
> >
> > no.
> > let's add some fun
> > 44 yes we can
> > 00:00:07,777 --> 00:00:06,666
> > let's do it in reverse bc wtf not
> >
>
> This file is crafted to test srtdec as it is, rather than following
> what passes for an SRT spec (that doom9 forum post[1] and the videolan
> wiki[2]). We should remove it, or keep it as a sample that should fail
> parsing.
>
> More interesting is fate-sub-srt-badsyntax. Despite the name it doesn't
> really have bad syntax, but its cues are in a random order. I guess it
> exists to test the cue sorting logic. But should subtitle demuxers
> really sort subtitles in this way? We don't do that for other formats
> that can have non-decreasing timestamps. For comparison, the WebVTT
> spec explicitly disallows decreasing timestamps. I also wonder how this
> file was created. My guess is "via a text editor" since it seems to
> consist of bits from different SRT files. There's little reason to
> support such deliberately broken files, at least having a bunch of
> sorting logic just for it. There's no reason we couldn't output packets
> in stored order.
>
> The rest of the subtitle test cases pass.
>

I agree, current subtitles not being handled in streamed way is bad
practice.
If some subtitle have wrong order of items, than new generic subtitle
filter could be implemented
which would fix that by either using queue or seeking around in subtitle
stream.


>
> /Tomas
>
> [1] https://forum.doom9.org/showthread.php?p=470941#post470941
> [2] https://wiki.videolan.org/SubRip/
> ___
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>
> To unsubscribe, visit link above, or email
> ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".
>
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".


Re: [FFmpeg-devel] [PATCH 2/3] lavf/srtdec: Permit streaming input

2024-03-30 Thread Tomas Härdin
lör 2024-03-30 klockan 00:35 +0100 skrev Michael Niedermayer:
> breaks fate here:
> 
> --- ./tests/ref/fate/sub-srt-madness-timeshift  2024-03-29
> 20:43:34.617419731 +0100
> +++ tests/data/fate/sub-srt-madness-timeshift   2024-03-30

Sorry but this file is utter crap and shouldn't be part of FATE. Look
at this:

> 53
> 00:00:01,111 --> 00:00:02,222
> okay, let's make things easy
> 
> 21 lol jk
> 00:00:03,333 --> 00:00:04,444
> hello
> 5
> 
> 
> don't forget me.
> 3
> 
> 
> 00:00:05,555 --> 00:00:06,666
> 
> 
> no.
> let's add some fun
> 44 yes we can
> 00:00:07,777 --> 00:00:06,666
> let's do it in reverse bc wtf not
> 

This file is crafted to test srtdec as it is, rather than following
what passes for an SRT spec (that doom9 forum post[1] and the videolan
wiki[2]). We should remove it, or keep it as a sample that should fail
parsing.

More interesting is fate-sub-srt-badsyntax. Despite the name it doesn't
really have bad syntax, but its cues are in a random order. I guess it
exists to test the cue sorting logic. But should subtitle demuxers
really sort subtitles in this way? We don't do that for other formats
that can have non-decreasing timestamps. For comparison, the WebVTT
spec explicitly disallows decreasing timestamps. I also wonder how this
file was created. My guess is "via a text editor" since it seems to
consist of bits from different SRT files. There's little reason to
support such deliberately broken files, at least having a bunch of
sorting logic just for it. There's no reason we couldn't output packets
in stored order.

The rest of the subtitle test cases pass.

/Tomas

[1] https://forum.doom9.org/showthread.php?p=470941#post470941
[2] https://wiki.videolan.org/SubRip/
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".


Re: [FFmpeg-devel] [PATCH 2/3] lavf/srtdec: Permit streaming input

2024-03-29 Thread Tomas Härdin
lör 2024-03-30 klockan 00:35 +0100 skrev Michael Niedermayer:
> On Thu, Mar 28, 2024 at 11:57:57PM +0100, Tomas Härdin wrote:
> > Here as well
> 
> >  libavformat/srtdec.c |  211 --
> > -
> >  tests/ref/fate/sub-srt-rrn-remux |    4 
> >  2 files changed, 116 insertions(+), 99 deletions(-)
> > 699d8b957286e190de6d5ca5cd17e67bb59dab7c  0002-lavf-srtdec-Permit-
> > streaming-input.patch
> > From 6d0684ca6fe02d80fc07a622fb85445a6917c29f Mon Sep 17 00:00:00
> > 2001
> > From: =?UTF-8?q?Tomas=20H=C3=A4rdin?= 
> > Date: Thu, 28 Mar 2024 22:15:18 +0100
> > Subject: [PATCH 2/3] lavf/srtdec: Permit streaming input
> > 
> > This is largely a rewrite.
> > 
> > Read packets in srt_read_packet() rather than reading the entire
> > file in srt_read_header().
> > Rely on AVFMT_GENERIC_INDEX for seeking.
> > Allow zero-length packets (same as WebVTT).
> > The implementation before this is broken in at least the following
> > ways:
> > 
> > Decimals like .99 are silently accepted and converted to
> > 999.999 seconds.
> > This is because no verification is done on the milliseconds part.
> > This patch enforces that the minutes and seconds parts are 00-59,
> > and the milliseconds 000-999.
> > It's not perfect since FFmpeg doesn't have regex functionality or
> > indeed any kind of parsing framework,
> > but it's better than before.
> > 
> > Segmenting cues by lines that consist of just a single integer is
> > incredibly wrong,
> > since the subtitle text itself may well contain lines that are just
> > a lone integer.
> > This means files written with CR line endings that have text with
> > lone integers are
> > parsed in a completely broken manner. Neither can we segment by
> > lines containing -->
> > since this is permissible in SubRip (as far as I can tell). WebVTT
> > explicitly forbids it however.
> > ---
> >  libavformat/srtdec.c | 211 ---
> > 
> >  tests/ref/fate/sub-srt-rrn-remux |   4 +
> >  2 files changed, 116 insertions(+), 99 deletions(-)
> 
> breaks fate here:
> 
> --- ./tests/ref/fate/sub-srt-madness-timeshift  2024-03-29
> 20:43:34.617419731 +0100
> +++ tests/data/fate/sub-srt-madness-timeshift   2024-03-30
> 00:30:08.776949369 +0100
> @@ -3,34 +3,7 @@
>  okay, let's make things easy
> 
>  2
> -00:00:05,160 --> 00:00:05,263
> -31 i'm a number but the only payload so please keep me :)
> -
> -3
>  00:00:06,473 --> 00:00:07,584
>  hello
>  5
> -don't forget me.
> -
> -4
> -00:00:08,695 --> 00:00:09,806
> -no.
> -let's add some fun
> -
> -5
> -00:00:10,917 --> 00:00:12,028
> -let's do it in reverse bc wtf not
> -45 yes this is a number but i'm actually part of the sub
> -
> -6
> -00:00:12,028 --> 00:00:13,139
> -1
> -0
> -next is negative, not a chapnum ;)
> --1
> -
> -7
> -00:00:13,241 --> 00:00:13,263
> -credits
> -2015
> 
> Test sub-srt-madness-timeshift failed. Look at tests/data/fate/sub-
> srt-madness-timeshift.err for details.
> tests/Makefile:309: recipe for target 'fate-sub-srt-madness-
> timeshift' failed
> make: *** [fate-sub-srt-madness-timeshift] Error 1

Can confirm. I was busy with fate-sub-srt-rrn-remux which popped up
when running fate that I forgot to run a full fate after fixing it.
Will submit a new patch series later

/Tomas
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".


Re: [FFmpeg-devel] [PATCH 2/3] lavf/srtdec: Permit streaming input

2024-03-29 Thread Michael Niedermayer
On Thu, Mar 28, 2024 at 11:57:57PM +0100, Tomas Härdin wrote:
> Here as well

>  libavformat/srtdec.c |  211 
> ---
>  tests/ref/fate/sub-srt-rrn-remux |4 
>  2 files changed, 116 insertions(+), 99 deletions(-)
> 699d8b957286e190de6d5ca5cd17e67bb59dab7c  
> 0002-lavf-srtdec-Permit-streaming-input.patch
> From 6d0684ca6fe02d80fc07a622fb85445a6917c29f Mon Sep 17 00:00:00 2001
> From: =?UTF-8?q?Tomas=20H=C3=A4rdin?= 
> Date: Thu, 28 Mar 2024 22:15:18 +0100
> Subject: [PATCH 2/3] lavf/srtdec: Permit streaming input
> 
> This is largely a rewrite.
> 
> Read packets in srt_read_packet() rather than reading the entire file in 
> srt_read_header().
> Rely on AVFMT_GENERIC_INDEX for seeking.
> Allow zero-length packets (same as WebVTT).
> The implementation before this is broken in at least the following ways:
> 
> Decimals like .99 are silently accepted and converted to 999.999 seconds.
> This is because no verification is done on the milliseconds part.
> This patch enforces that the minutes and seconds parts are 00-59, and the 
> milliseconds 000-999.
> It's not perfect since FFmpeg doesn't have regex functionality or indeed any 
> kind of parsing framework,
> but it's better than before.
> 
> Segmenting cues by lines that consist of just a single integer is incredibly 
> wrong,
> since the subtitle text itself may well contain lines that are just a lone 
> integer.
> This means files written with CR line endings that have text with lone 
> integers are
> parsed in a completely broken manner. Neither can we segment by lines 
> containing -->
> since this is permissible in SubRip (as far as I can tell). WebVTT explicitly 
> forbids it however.
> ---
>  libavformat/srtdec.c | 211 ---
>  tests/ref/fate/sub-srt-rrn-remux |   4 +
>  2 files changed, 116 insertions(+), 99 deletions(-)

breaks fate here:

--- ./tests/ref/fate/sub-srt-madness-timeshift  2024-03-29 20:43:34.617419731 
+0100
+++ tests/data/fate/sub-srt-madness-timeshift   2024-03-30 00:30:08.776949369 
+0100
@@ -3,34 +3,7 @@
 okay, let's make things easy

 2
-00:00:05,160 --> 00:00:05,263
-31 i'm a number but the only payload so please keep me :)
-
-3
 00:00:06,473 --> 00:00:07,584
 hello
 5
-don't forget me.
-
-4
-00:00:08,695 --> 00:00:09,806
-no.
-let's add some fun
-
-5
-00:00:10,917 --> 00:00:12,028
-let's do it in reverse bc wtf not
-45 yes this is a number but i'm actually part of the sub
-
-6
-00:00:12,028 --> 00:00:13,139
-1
-0
-next is negative, not a chapnum ;)
--1
-
-7
-00:00:13,241 --> 00:00:13,263
-credits
-2015

Test sub-srt-madness-timeshift failed. Look at 
tests/data/fate/sub-srt-madness-timeshift.err for details.
tests/Makefile:309: recipe for target 'fate-sub-srt-madness-timeshift' failed
make: *** [fate-sub-srt-madness-timeshift] Error 1


[...]

-- 
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

You can kill me, but you cannot change the truth.


signature.asc
Description: PGP signature
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".


Re: [FFmpeg-devel] [PATCH 2/3] lavf/srtdec: Permit streaming input

2024-03-28 Thread Tomas Härdin
Here as well
From 6d0684ca6fe02d80fc07a622fb85445a6917c29f Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Tomas=20H=C3=A4rdin?= 
Date: Thu, 28 Mar 2024 22:15:18 +0100
Subject: [PATCH 2/3] lavf/srtdec: Permit streaming input

This is largely a rewrite.

Read packets in srt_read_packet() rather than reading the entire file in srt_read_header().
Rely on AVFMT_GENERIC_INDEX for seeking.
Allow zero-length packets (same as WebVTT).
The implementation before this is broken in at least the following ways:

Decimals like .99 are silently accepted and converted to 999.999 seconds.
This is because no verification is done on the milliseconds part.
This patch enforces that the minutes and seconds parts are 00-59, and the milliseconds 000-999.
It's not perfect since FFmpeg doesn't have regex functionality or indeed any kind of parsing framework,
but it's better than before.

Segmenting cues by lines that consist of just a single integer is incredibly wrong,
since the subtitle text itself may well contain lines that are just a lone integer.
This means files written with CR line endings that have text with lone integers are
parsed in a completely broken manner. Neither can we segment by lines containing -->
since this is permissible in SubRip (as far as I can tell). WebVTT explicitly forbids it however.
---
 libavformat/srtdec.c | 211 ---
 tests/ref/fate/sub-srt-rrn-remux |   4 +
 2 files changed, 116 insertions(+), 99 deletions(-)

diff --git a/libavformat/srtdec.c b/libavformat/srtdec.c
index 6bf73599a7..c74d40b726 100644
--- a/libavformat/srtdec.c
+++ b/libavformat/srtdec.c
@@ -28,7 +28,8 @@
 #include "libavutil/intreadwrite.h"
 
 typedef struct {
-FFDemuxSubtitlesQueue q;
+AVBPrint buf;
+FFTextReader tr;
 } SRTContext;
 
 static int srt_probe(const AVProbeData *p)
@@ -72,54 +73,66 @@ struct event_info {
 
 static int get_event_info(const char *line, struct event_info *ei)
 {
-int hh1, mm1, ss1, ms1;
-int hh2, mm2, ss2, ms2;
-
-ei->x1 = ei->x2 = ei->y1 = ei->y2 = ei->duration = -1;
-ei->pts = AV_NOPTS_VALUE;
-ei->pos = -1;
-if (sscanf(line, "%d:%d:%d%*1[,.]%d --> %d:%d:%d%*1[,.]%d"
+unsigned int hh1, mm1, ss1, ms1;
+unsigned int hh2, mm2, ss2, ms2;
+int n, m = 0;
+
+// require exactly two digits for mm and ss, three digits for ms
+n = sscanf(line, "%*u:%*1u%*1u:%*1u%*1u%*1[,.]%*1u%*1u%*1u --> %*u:%*1u%*1u:%*1u%*1u%*1[,.]%*1u%*1u%*1u%n", );
+if (n < 0 || m <= 0)
+return -1;
+n = sscanf(line, "%u:%u:%u%*1[,.]%u --> %u:%u:%u%*1[,.]%u"
"%*[ ]X1:%"PRId32" X2:%"PRId32" Y1:%"PRId32" Y2:%"PRId32,
, , , ,
, , , ,
-   >x1, >x2, >y1, >y2) >= 8) {
+   >x1, >x2, >y1, >y2);
+// do not accept partial positions (9 <= n <= 11)
+if ((n == 8 || n == 12) &&
+// require timestamps to be well-formed
+mm1 < 60 && ss1 < 60 && ms1 < 1000 &&
+mm2 < 60 && ss2 < 60 && ms2 < 1000) {
+// hh is converted to long long before the multiplication
+// hence this cannot overflow even if hh == UINT_MAX
 const int64_t start = (hh1*3600LL + mm1*60LL + ss1) * 1000LL + ms1;
 const int64_t end   = (hh2*3600LL + mm2*60LL + ss2) * 1000LL + ms2;
-ei->duration = end - start;
-ei->pts = start;
-return 0;
+// accept start == end (hidden subtitles) since a FATE test requires it
+// WebVTT by contrast does not allow this
+if (start <= end) {
+ei->duration = end - start;
+ei->pts = start;
+return 0;
+}
 }
 return -1;
 }
 
-static int add_event(FFDemuxSubtitlesQueue *q, AVBPrint *buf, char *line_cache,
- const struct event_info *ei, int append_cache)
+static int output_packet(AVBPrint *buf,
+ const struct event_info *ei, AVPacket *pkt)
 {
-if (append_cache && line_cache[0])
-av_bprintf(buf, "%s\n", line_cache);
-line_cache[0] = 0;
+int ret;
 if (!av_bprint_is_complete(buf))
 return AVERROR(ENOMEM);
 
+// len < size throughout this loop, so we don't need to
+// call av_bprint_is_complete() twice like the old code did
 while (buf->len > 0 && buf->str[buf->len - 1] == '\n')
 buf->str[--buf->len] = 0;
 
-if (buf->len) {
-AVPacket *sub = ff_subtitles_queue_insert_bprint(q, buf, 0);
-if (!sub)
-return AVERROR(ENOMEM);
-av_bprint_clear(buf);
-sub->pos = ei->pos;
-sub->pts = ei->pts;
-sub->duration = ei->duration;
-if (ei->x1 != -1) {
-uint8_t *p = av_packet_new_side_data(sub, AV_PKT_DATA_SUBTITLE_POSITION, 16);
-if (p) {
-AV_WL32(p,  ei->x1);
-AV_WL32(p +  4, ei->y1);
-AV_WL32(p +  8, ei->x2);
-AV_WL32(p + 12, ei->y2);
-}
+ret = av_new_packet(pkt, buf->len);

[FFmpeg-devel] [PATCH 2/3] lavf/srtdec: Permit streaming input

2024-03-28 Thread Tomas Härdin
I am once again asking more people on this list to peruse
https://langsec.org/

/Tomas
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".