On Mon, 2 Nov 2020, Michael Niedermayer wrote:

Please correct me if I am wrong, but
in cases where no audio is missing or damaged, this would also ignore how much
audio is in each packet. So you could have, let's say, a timestamp difference
of exactly 1 second between 2 packets while there is actually not exactly
1 second worth of audio samples between them.

This is true: by using the frame counter (and the video time base) for
audio, we inherently lose some audio packet timestamp precision. However, I
don't consider this a problem; audio timestamps do not have to be sample
accurate, and for most formats they are not.
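
To give a feel for the size of the precision loss, here is a minimal sketch that compares frame-counter timestamps with timestamps obtained by accumulating sample counts. The 30000/1001 frame rate, the 48 kHz nominal rate, the per-frame sample pattern and all function names are assumptions chosen for illustration; none of this is taken from the patch.

```python
from fractions import Fraction

FRAME_DURATION = Fraction(1001, 30000)  # seconds per video frame (assumed rate)
SAMPLE_RATE = 48000                     # nominal audio rate in Hz (assumed)


def frame_based_pts(frame_index):
    """Audio timestamp derived from the frame counter and the video time base."""
    return frame_index * FRAME_DURATION


def sample_based_pts(samples_per_frame):
    """Audio timestamps derived by accumulating the samples seen so far."""
    pts, total = [], 0
    for n in samples_per_frame:
        pts.append(Fraction(total, SAMPLE_RATE))
        total += n
    return pts


# Hypothetical per-frame sample counts that average 1601.6 samples per frame.
counts = [1600, 1602, 1602, 1602, 1602] * 6
sample_pts = sample_based_pts(counts)

# Worst disagreement between the two ways of timestamping the same packets.
max_error = max(abs(frame_based_pts(i) - p) for i, p in enumerate(sample_pts))
print(float(max_error * SAMPLE_RATE), "samples of worst-case disagreement")
```

With this made-up pattern the two methods never disagree by more than 1.6 samples, which is far below one frame duration.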


Also, it is not practical to keep
track of how many samples there are in the packets. For example, when you
seek, you obviously can't read all the audio data before the seek point
to get a precise, sample-accurate timestamp.

It's true that with seeking there is not enough information for sample-precise
timestamps. But from packet to packet, as long as no seek happened, there is.

And that timestamp can turn out to be wrong. If the audio clock is running at a little more than 48 kHz, there will be A-V desync, because after some time the audio and video timestamps for packets coming from the same DV frame will diverge significantly.
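
A rough illustration of how quickly this adds up, assuming timestamps are computed from sample counts at the nominal 48 kHz rate while the capture clock actually runs slightly fast. The 48005 Hz figure and the function name are hypothetical.

```python
from fractions import Fraction

NOMINAL_RATE = 48000  # rate assumed when converting sample counts to time


def av_drift_seconds(actual_rate_hz, elapsed_seconds, nominal_rate_hz=NOMINAL_RATE):
    """Return how far sample-counted audio timestamps run ahead of real time."""
    samples_captured = Fraction(actual_rate_hz) * elapsed_seconds
    audio_pts = samples_captured / nominal_rate_hz
    return audio_pts - elapsed_seconds


# Hypothetical capture clock running 5 Hz fast for one hour.
drift = av_drift_seconds(48005, 3600)
print(f"drift after one hour: {float(drift) * 1000:.1f} ms")
```

An error of only 5 Hz already gives 375 ms of drift after one hour, which is more than 11 frame durations at 30000/1001 fps.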

My concern was more about something like significant frame to frame
differences in audio sample numbers.
Because if some hw or sw generates this we would produce packets of
identical duration which differ substantially in number of samples and
that would not be handled well in any scenario that accepted the timestamps
and durations as exact.

In general, you can't assume that timestamps or packet durations are exact. Consider a format which stores timestamps and durations in milliseconds: rounding errors will occur. Also, for consumer equipment, audio and video are rarely locked together, and audio sample rates are rarely very precise.
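
As a small sketch of the rounding effect, take a hypothetical stream of 1024-sample packets at 44100 Hz stored with millisecond timestamps. The packet size, sample rate and function name are assumptions picked only to make the arithmetic visible.

```python
from fractions import Fraction


def ms_timestamps(packet_samples, sample_rate, packet_count):
    """Exact packet start times, rounded to whole milliseconds."""
    return [round(Fraction(i * packet_samples * 1000, sample_rate))
            for i in range(packet_count)]


# Hypothetical stream: 1024-sample packets at 44100 Hz (about 23.22 ms each).
pts = ms_timestamps(1024, 44100, 8)
durations = [b - a for a, b in zip(pts, pts[1:])]
print(pts)
print(durations)
```

Every packet holds the same number of samples, yet the stored durations alternate between 23 and 24 ms, so neither the timestamps nor the durations can be taken as exact.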

Maybe this never occurs and in that case your patch should be a good idea
but if it does happen then some code would be needed to deal with that.
It is detectable when sample counts do not match what is expected.
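
One possible shape for such a check, as a sketch only: compare each frame's sample count with the count expected from the nominal sample rate and frame duration, and flag frames outside a tolerance. The tolerance value, the sample counts and the function name are hypothetical and not taken from any existing code.

```python
from fractions import Fraction


def find_unexpected_frames(samples_per_frame, sample_rate, frame_duration, tolerance):
    """Return (frame_index, sample_count) pairs that deviate from the expected count."""
    expected = sample_rate * frame_duration  # may be fractional, e.g. 1601.6
    return [(i, n) for i, n in enumerate(samples_per_frame)
            if abs(n - expected) > tolerance]


# Hypothetical per-frame sample counts with two outliers.
counts = [1600, 1602, 1602, 1580, 1602, 1700]
flagged = find_unexpected_frames(counts, 48000, Fraction(1001, 30000), tolerance=2)
print(flagged)
```

Here frames 3 and 5 are flagged, while the ordinary 1600/1602 variation stays within the tolerance.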

Yeah, and we have tools to fix that, like -af aresample=async=1.

That said, I would consider a fix for #8762 to produce correct audio in
all cases including wav/pcm/mov/... output and not just when the output
can store "corrupted"/"sparse" audio.

I think ffmpeg.c should be smarter about it, and be aware of whether unlocked or sparse audio (or audio not starting at the same time as video) is supported by a given muxer or not. And if it is not supported, then maybe -af aresample=async=1 or similar should be used automagically.

Also, to me, returning the data from the input file which would represent audio
if it was not corrupt seems to be somehow the "correct" thing to do.
Maybe this never contains any useful data, in which case it doesn't matter in
reality, but still it feels a bit odd to fix just the timestamps.

I am not strictly against applying your patch. I can accept that it might be useful for users to get the data at the demuxer level without having to play with async=1; yes, sparse audio requires extra care. I might even be OK with changing the default to pass corrupt packets. But this does not change the fact that the audio timestamps are currently wrong, because they ignore that audio and video from the same DV frame are synced together with at most 1/3 frame duration error.

Regards,
Marton