On Mon, 2 Nov 2020, Michael Niedermayer wrote:
Please correct me if I am wrong, but
in cases where no audio is missing or damaged, this would also ignore how much
audio is in each packet. So you could have, let's say, a timestamp difference
of exactly 1 second between 2 packets while there is actually not exactly
1 second worth of audio samples between them.
This is true: by using the frame counter (and the video time base) for
audio, we inherently lose some audio packet timestamp precision. However, I
don't consider this a problem: audio timestamps do not have to be sample
accurate, and for most formats they are not.
Also, it is not practical to keep
track of how many samples there are in the packets. For example, when you
seek, you obviously can't read all the audio data before the seek point
to get a precise, sample-accurate timestamp.
It's true that with seeking there is not enough information for sample-precise
timestamps. But from packet to packet, as long as no seek happened, there is.
And that timestamp can turn out to be wrong. If the audio clock is running
at a little more than 48 kHz, there will be A-V desync because after some
time audio and video timestamps for packets coming from the same DV frame
will diverge significantly.
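
To put a rough number on that drift, here is a small sketch. The 48010 Hz
clock rate is a made-up example value and the function name is only
illustrative. It assumes timestamps are derived by counting samples at a
nominal 48 kHz while the device clock actually runs slightly faster.

```python
from fractions import Fraction

NOMINAL_RATE = 48000  # rate assumed when converting sample counts to time

def av_drift_seconds(actual_rate, elapsed_seconds):
    """Drift between sample-counted audio timestamps and real (video) time.

    actual_rate is the real audio clock in Hz (hypothetical value). In
    elapsed_seconds of real time the device emits actual_rate * elapsed_seconds
    samples, which a sample counter converts to time using NOMINAL_RATE.
    """
    samples = Fraction(actual_rate) * elapsed_seconds
    audio_ts = samples / NOMINAL_RATE
    return audio_ts - elapsed_seconds

# One hour of capture with a clock that is 10 Hz fast.
print(float(av_drift_seconds(48010, 3600)))  # 0.75
```

With these example numbers the sample-counted audio timestamps end up 0.75
seconds ahead of the video after one hour.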
My concern was more about something like significant frame to frame
differences in audio sample numbers.
Because if some hw or sw generates this we would produce packets of
identical duration which differ substantially in number of samples and
that would not be handled well in any scenario that accepted the timestamps
and durations as exact.
In general, you can't assume that timestamps or packet durations are
exact. Consider a format which stores timestamps and durations in
milliseconds. Rounding errors will occur. Also, for consumer equipment
audio and video are rarely locked together, and audio sample rates are
rarely very precise.
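
A quick illustration of the rounding point, as a sketch only: it assumes an
NTSC frame duration of 1001/30000 s as the example and a hypothetical
container that stores whole milliseconds.

```python
from fractions import Fraction

FRAME_DURATION = Fraction(1001, 30000)  # assumed NTSC frame duration, seconds

def ms_timestamps(n_frames):
    """Timestamps of the first n_frames frames, rounded to whole milliseconds."""
    return [round(i * FRAME_DURATION * 1000) for i in range(n_frames)]

ts = ms_timestamps(7)
durations = [b - a for a, b in zip(ts, ts[1:])]
print(ts)         # [0, 33, 67, 100, 133, 167, 200]
print(durations)  # [33, 34, 33, 33, 34, 33]
```

The stored durations alternate between 33 and 34 ms even though every frame
has the same real duration, so none of them can be taken as exact.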
Maybe this never occurs and in that case your patch should be a good idea
but if it does happen then some code would be needed to deal with that.
It is detectable when sample counts do not match what is expected.
Yeah, and we have tools to fix that, like -af aresample=async=1.
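
The detection mentioned above could look roughly like this. This is only a
sketch and not what any demuxer does: the function name, the 20-sample
tolerance and the per-frame sample counts are made up, and the frame
duration assumes NTSC (1001/30000 s) with 48 kHz audio.

```python
from fractions import Fraction

SAMPLE_RATE = 48000
FRAME_DURATION = Fraction(1001, 30000)  # assumed NTSC frame duration, seconds

def mismatched_frames(samples_per_frame, tolerance=20):
    """Return indices of frames whose audio sample count deviates from the
    expected samples per frame by more than `tolerance` samples."""
    expected = SAMPLE_RATE * FRAME_DURATION  # 1601.6 samples per frame
    return [i for i, n in enumerate(samples_per_frame)
            if abs(n - expected) > tolerance]

counts = [1600, 1602, 1602, 1200, 1602, 1602]  # made-up example data
print(mismatched_frames(counts))  # [3]
```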
That said, I would consider a fix for #8762 to produce correct audio in
all cases including wav/pcm/mov/... output and not just when the output
can store "corrupted"/"sparse" audio.
I think ffmpeg.c should be smarter about it, and be aware of whether unlocked or
sparse audio (or audio not starting at the same time as video) is
supported by certain muxers or not. And if it is not supported, then maybe
-af aresample=async=1 or similar should be used automagically.
Also to me returning the data from the input file which would represent audio
if it was not corrupt seems to be somehow the "correct" thing to do.
Maybe this never contains any useful data, in which case it doesn't matter in
reality, but it still feels a bit odd to fix just the timestamps.
I am not strictly against applying your patch. I can accept that for the
users it might be useful to get the data at the demuxer level and not play
with async=1; yes, sparse audio requires extra care. I might even be OK
with changing the default to pass corrupt packets. But this does not
change the fact that the audio timestamps are currently wrong, because
they ignore that audio and video from the same DV frame are synced
together with at most 1/3 frame duration error.
Regards,
Marton