Hi Lance,

On Sun, Apr 30, 2023 at 7:01 PM Lance Wang <lance.lmw...@gmail.com> wrote:
> This implementation is limited to decklink SDI output only. If possible,
> can we implement the function at the demuxer layer, and then pass it
> through as SEI side data? That way, we could also easily convert such a
> stream into embedded CC in the video stream.
I did consider this approach, and it does raise the more fundamental issue of trying to minimize the number of ways we process CC data depending on whether it originated in SEI metadata or in separate packets. There are a number of problems with what you are proposing, though:

1. There can be multiple CC streams within an MOV file, but only a single CC stream can be embedded into AVFrame side data. Hence you would have to add some sort of argument to the demuxer to decide which stream to embed. That makes it much more difficult to do things like ingest a stream with multiple CC streams and feed separate outputs with different CC streams. Performing the work on the output side lets you use the standard "-map" mechanism to dictate which CC streams are routed to which outputs, and to deliver the content to different outputs with different CC streams.

2. I have use cases in mind where the captions originate from sources other than MOV files, where the video framerate is not known (or there is no video at all in the source). For example, I want to be able to consume video from a TS source while simultaneously demuxing an SCC or MCC file and sending the result to the output. In such cases the correct rate control for the captions can only be implemented on the output side, since the SCC/MCC demuxer doesn't have access to the corresponding video stream (it won't know the video framerate, nor can it embed the captions into AVFrame side data).

I can certainly imagine use cases where doing it further up the pipeline would be useful. For example, if you were taking in an MOV file and producing a TS where the captions need to be embedded as SEI metadata, you would need the e608 packets converted to AVFrame side data before they reach the encoder. However, I don't see this as a substitute for being able to do it on the output side, which is the most flexible approach for the other use cases described above.
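To make the "-map" point concrete, the output-side routing could look something like the sketch below. This is purely illustrative: the input filenames, DeckLink device names, and stream indices are hypothetical, and it assumes the MOV's caption tracks are exposed as data streams selectable via "0:d:N".

```sh
# Route the first caption track to one DeckLink output and the
# second caption track to another (devices/indices are hypothetical):
ffmpeg -i input.mov \
    -map 0:v:0 -map 0:a:0 -map 0:d:0 -f decklink "DeckLink Duo (1)" \
    -map 0:v:0 -map 0:a:0 -map 0:d:1 -f decklink "DeckLink Duo (2)"

# The second use case: take video from a TS while demuxing an SCC file
# alongside it, leaving caption rate control to the output side:
ffmpeg -i input.ts -i captions.scc \
    -map 0:v:0 -map 0:a:0 -map 1:0 -f decklink "DeckLink Duo (1)"
```

In both cases the selection of which CC stream reaches which output is expressed entirely with "-map", with no demuxer-side option needed.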
Much of this comes down to fundamental limitations in the ffmpeg framework around moving data back and forth between data packets and side data. You can't feed data packets into AVFilterGraphs. You can't easily combine data from data packets into AVFrames carrying video (or extract side data from AVFrames to generate data packets). You can't use BSF filters after encoding to combine data from multiple inputs, such as compressed video streams and data streams. I've run across all of these limitations over the years, and at this point I'm trying to take the least invasive approach possible, one that doesn't require changes to the fundamental frameworks for handling data packets.

It's worth noting that nothing you have suggested is an "either/or" situation. Because caption processing is inexpensive, there is no significant overhead in having multiple AvCCFifo instances in the pipeline. In other words, if you added such a feature to the MOV demuxer, it wouldn't prevent us from running the packets through an AvCCFifo instance on the output side. The proposed patch doesn't preclude adding such a feature on the demux side in the future.

Devin

-- 
Devin Heitmueller, Senior Software Engineer
LTN Global Communications
o: +1 (301) 363-1001
w: https://ltnglobal.com  e: devin.heitmuel...@ltnglobal.com