> -----Original Message-----
> From: ffmpeg-devel <ffmpeg-devel-boun...@ffmpeg.org> On Behalf Of Daniel
> Cantarín
> Sent: Sunday, December 12, 2021 12:39 AM
> To: ffmpeg-devel@ffmpeg.org
> Subject: Re: [FFmpeg-devel] [PATCH v20 02/20] avutil/frame: Prepare AVFrame
> for subtitle handling
>
> > One of the important points to understand is that - in case of
> > subtitles - the AVFrame IS NOT the subtitle event. The subtitle event
> > is actually a different and separate entity. (...)
>
> Wouldn't it qualify then as a different abstraction?
>
> I mean: instead of avframe.subtitle_property, perhaps something along the
> lines of avframe.some_property_used_for_linked_abstractions, which in
> turn lets you access a proper Subtitle abstraction instance.
>
> That way, devs would not need to defend AVFrame, and Subtitle could have
> whatever properties are needed.
>
> I see there's AVSubtitle, as you mention:
> https://ffmpeg.org/doxygen/trunk/structAVSubtitle.html
>
> Isn't it less socially problematic to just link an instance of AVSubtitle
> instead of adding a subtitle timing property to AVFrame?
> IIUC, that AVSubtitle instance could live in the filter context, and be
> linked by the filter doing the heartbeat frames.
>
> Please note I'm not saying the property is wrong, or even that I
> understand the best way to deal with it, but that I recognize some social
> problem here. Devs don't like that property; that's a fact. And technical
> or not, it seems to be a problem.
>
> > (...)
> > The chairs are obviously AVFrames. They need to be numbered
> > monotonically increasing - that's the frame.pts. Without increasing
> > numbering, the transport would get stuck. We are filling the chairs
> > with copies of the most recent subtitle event, so an AVSubtitle could
> > be repeated, for example, 5 times. It's always the exact same
> > AVSubtitle event sitting in those 5 chairs.
> > The subtitle event always has the same start time (subtitle_pts), but
> > each frame has a different pts.
>
> I can see AVSubtitle has a "start_display_time" property, as well as a
> "pts" property "in AV_TIME_BASE":
> https://ffmpeg.org/doxygen/trunk/structAVSubtitle.html#af7cc390bba4f9d6c32e391ca59d117a2
>
> Is it too much trouble to reuse that while persisting an AVSubtitle
> instance in the filter context? I guess it could even be used in the
> decoder context.
>
> I also see a quirky property in AVFrame: "best_effort_timestamp":
> https://ffmpeg.org/doxygen/trunk/structAVFrame.html#a0943e85eb624c2191490862ececd319d
> Perhaps some of the "various heuristics" it claims to have could be
> extended, this time related to a linked AVSubtitle, so an extra property
> is not needed?
>
> > (...)
> > Considering the relation between AVFrame and subtitle event as laid
> > out above, it should be apparent that there's no guarantee of a
> > certain kind of relation between the subtitle_pts and the pts of the
> > frame that is carrying it. Such a relation _can_ exist, but doesn't
> > necessarily. It can easily be possible that the frame pts is just
> > increased by 1 on subsequent frames. The time_base may change from
> > filter to filter and can be oriented on the transport of the subtitle
> > events, which might have nothing to do with the subtitle display time
> > at all.
>
> This confuses me.
> I understand the difference between filler frame pts and subtitle pts.
> That's OK.
> But if the transport timebase changes, I understand that the subtitle
> pts also changes.
>
> I mean: "transport timebase" means "video timebase", and if subs are
> synced to video, then that sync needs to be maintained. If subs are
> synced, then their timing is never independent. And if they're not
> synced, then their AVFrame is independent from the video frames, and
> thus doesn't need any extra property.
>
> Here's what I do right now with the filler frames.
> I'm talking about current ffmpeg, with no subs frames in lavfi, and
> real-time conversion from dvbsub to WEBVTT using OCR. What I do is quite
> dirty stuff:
> - Change the FPS to a low value, let's say 1.
> - Apply OCR to the dvb sub, using vf_ocr.
> - Read the metadata downstream, and write vtt to a file or pipe output.
>
> As there's no sub frame capability in lavfi, I can't use the vtt encoder
> downstream. Therefore, the output is raw C string and file manipulation.
> And given that I first set the FPS to 1, I get 1 line per second, no
> matter the timestamp of either the subs, the video, or the filler frame.
> The point then is to check for text diffs instead of pts to detect the
> frame's nature. And I can even naively just print the frame's pts once
> per second with the same text, and with empty lines when there's no
> text, without caring about the frame's nature (filler or not).
>
> There's a similar behaviour when dealing with CEA-608: I need to check
> text differences instead of any pts, as the inner workings of these
> captions are more related to video than to subs. I assume in my filters
> that the frame pts is correct.
>
> I understand the idea behind pts, I get that there's also dts, and so I
> can accept that there could be a use case where another timing is
> needed. But I still don't see the need for this particular extra timing,
> as the distance between subtitle_pts and filler.pts does not mean
> something downstream like "now clear the current subtitle line". What
> will happen if there's no subtitle_pts is that the same line will still
> be active, and it will only change when there's an actual subtitle
> difference. So I believe this value is more theoretically useful than
> factual.
>
> I understand that there are subs formats that need precise start and end
> timing, but I fail to see the case where that timing avoids the need for
> text-difference checking, be it in a filter or an encoder.
> And if filters or encoders naively use pts, then the filler frames would
> not break anything: they would just show the same text line repeatedly,
> at the current FPS speed. And if the sparseness problem is finally
> solved somehow by your logic, and there's no need for filler frames,
> then there's also no need for subtitle_pts, as pts would actually be
> fine.
>
> So, I'm confused, given that you state this property is very important.
> Would you please tell us some actual, non-theoretical use case for the
> property?
>
> > Also, subtitle events are sometimes duplicated. If we were to convert
> > the subtitle_pts to the time_base that is negotiated between two
> > filters, it could happen that multiple copies of a single subtitle
> > event have different subtitle_pts values.
>
> If it's repeated, doesn't it have a different pts?
> I get repeated lines from time to time, but they have slightly different
> pts.
>
> "Repeated event" != "same event".
> If you check for repeated events, then you're doing some extra checking,
> as I point out with the "text difference checks" in the previous
> paragraphs, and so pts is not ruling all the logic. Otherwise, in the
> worst-case scenario you get the same pts twice, which will discard some
> frame. And in the most likely scenario, you get two identical frames
> with different pts, which actually changes nothing in the viewer's
> experience.
>
> > Besides that, there are practical considerations: The subtitle_pts is
> > needed almost nowhere in any other time_base than AV_TIME_BASE_Q.
> > All decoders expect it to be like this, as do all encoders and all
> > filters. Conversion would need to happen all over the place.
> > Every filter would need to take care of rescaling the subtitle_pts
> > value (when the time_base differs between in and out).
>
> I'm not well versed enough in ffmpeg/libav to understand that.
> But I'll tell you what: do you think it's possible for you to do some
> practical test?
> I mean this:
> - Take some short video example with dvbsubs (or whatever graphical subs).
> - Apply graphicsub2text, converting to webvtt, srt, or something similar.
> - Do the same, but taking subtitle_pts away from AVFrame.
>
> Let's compare both text outputs.
> I propose text because it's easier to share, but if you can think of any
> other practical example like this, it's also welcome. The point is to
> understand the relevance of subtitle_pts by looking at the problem of
> not having it.
>
> If there's no big deal, then screw it: you take it away, the devs are
> pleased, and everybody in the world gets the blessing of having subtitle
> frames in lavfi. If there's some big deal, then the devs should
> understand.
I'm afraid the only reply I have to this is:

- Take my patchset
- Remove subtitle_pts
- Get everything working (all example command lines in filters.texi)

=> THEN start talking

The same goes out to everybody else who keeps saying it can be removed
and that it's an unnecessary duplication.

The stage is yours...

Kind regards,
softworkz

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".