>Yeah, I'm having the same problem. I'll repost a question I asked on a
>forum here, since I got no answer there.

>For example:
>The Video stream has a start_time of 64194, with a timebase of 1/90000.
>The Audio stream has a start_time of 56994, with a timebase of 1/90000.

>The format context has a start_time of 633267.

>So should I start playing the audio 64194 - 56994 = 7200 timebase units
>(0.08 seconds) before the video? Where does the format context's
>start_time come into play here?

>What do I do if the first audio packet has a pts that's bigger than the
>audio stream's start_time? Do I just play silence for the first little
>bit?

>Or should the first decoded audio packet match up with the first decoded
>video packet?
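
For reference: per-stream start_time values are in that stream's
time_base, while AVFormatContext.start_time is in AV_TIME_BASE
(1/1000000) units, so the 633267 above is just the earliest stream
start expressed differently (56994 / 90000 = 0.63327 s). A minimal
sketch, assuming an already-opened AVFormatContext, that prints
everything in seconds so the values can be compared:

#include <stdio.h>
#include <libavformat/avformat.h>

/* Sketch: print each stream's start in seconds.  Per-stream start_time
 * is in stream->time_base units; AVFormatContext.start_time is in
 * AV_TIME_BASE (1/1000000) units.  Check for AV_NOPTS_VALUE in real
 * code. */
static void print_start_times(const AVFormatContext *fmt)
{
    for (unsigned i = 0; i < fmt->nb_streams; i++) {
        const AVStream *st = fmt->streams[i];
        printf("stream %u starts at %.5f s\n",
               i, st->start_time * av_q2d(st->time_base));
    }
    /* 633267 / 1000000 = 0.63327 s, i.e. the earliest stream start
     * (56994 / 90000) converted to AV_TIME_BASE units. */
    printf("container starts at %.5f s\n",
           fmt->start_time / (double)AV_TIME_BASE);
}

With your numbers this gives 0.71327 s (video) and 0.63327 s (audio),
so the audio really does start 7200 / 90000 = 0.08 s before the first
video frame.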

In general, it's best to have a "master clock" and sync samples to it.
Store both audio and video in a queue, and read them back in separate
threads.
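
A minimal sketch of such a queue, assuming pthreads; the Frame record
is hypothetical (decoded data plus its pts in seconds):

#include <pthread.h>
#include <stddef.h>

/* Hypothetical decoded-frame record: payload plus pts in seconds. */
typedef struct Frame { void *data; double pts; struct Frame *next; } Frame;

typedef struct FrameQueue {
    Frame *head, *tail;
    pthread_mutex_t lock;
    pthread_cond_t  cond;
} FrameQueue;

#define FRAME_QUEUE_INIT \
    { NULL, NULL, PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER }

/* Called by the decoder thread. */
static void queue_push(FrameQueue *q, Frame *f)
{
    pthread_mutex_lock(&q->lock);
    f->next = NULL;
    if (q->tail) q->tail->next = f; else q->head = f;
    q->tail = f;
    pthread_cond_signal(&q->cond);          /* wake a waiting consumer */
    pthread_mutex_unlock(&q->lock);
}

/* Called by the audio/video threads; blocks until a frame arrives. */
static Frame *queue_pop(FrameQueue *q)
{
    pthread_mutex_lock(&q->lock);
    while (!q->head)
        pthread_cond_wait(&q->cond, &q->lock);
    Frame *f = q->head;
    q->head = f->next;
    if (!q->head) q->tail = NULL;
    pthread_mutex_unlock(&q->lock);
    return f;
}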

From my experience it's best to sync on audio (most players do, either
on the clock from the audio card or on the pts values in the queue).

Slave the "delay" of the other streams to this master clock. E.g. if you
are currently playing audio samples at time x, check the pts of the other
streams and either drop frames or insert duplicates.
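
A sketch of that decision for the video thread; get_master_clock() is a
hypothetical function returning the pts (in seconds) of the audio
currently playing, and the 40 ms threshold (one frame at 25 fps) is
just a starting point:

/* Hypothetical master clock: pts, in seconds, of the audio currently
 * coming out of the speakers (maintained by the audio thread). */
extern double get_master_clock(void);

#define SYNC_THRESHOLD 0.040    /* ~one frame at 25 fps; tune to taste */

enum SyncAction { SHOW, DROP, REPEAT };

/* Decide what to do with a decoded video frame whose pts (converted
 * to seconds) is frame_pts. */
static enum SyncAction video_sync_action(double frame_pts)
{
    double diff = frame_pts - get_master_clock();
    if (diff < -SYNC_THRESHOLD)
        return DROP;    /* video is late: skip this frame */
    if (diff > SYNC_THRESHOLD)
        return REPEAT;  /* video is early: show the previous frame again */
    return SHOW;        /* close enough: display it now */
}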

This approach lets you take several scenarios into account (e.g. badly
encoded videos, or pts values that are off-sync at the start).

What is important, though, is that you sync on the audio _that is
playing_ (i.e. coming out of the speakers). Audio boards usually have
hardware buffers and may buffer up to a few seconds. I solve this by
encoding a timestamp in my buffers, which is read when the buffer is
freed right after it is played.
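
One way to sketch that (the buffer layout and callback are
illustrative, not any particular sound API): tag each buffer handed to
the card with the pts of its first sample, and advance the master clock
when the card hands the buffer back:

/* Illustrative audio buffer: samples plus the pts, in seconds, of the
 * first sample in the buffer. */
typedef struct AudioBuf {
    short *samples;
    int    nb_samples;     /* samples per channel */
    double pts;
} AudioBuf;

static double master_clock;    /* protect with a mutex in real code */

/* Hypothetical callback invoked by the sound layer when a buffer has
 * finished playing and is returned for reuse.  At that moment the
 * speakers have just emitted the end of this buffer, so the clock is
 * its start pts plus its duration. */
static void on_buffer_played(const AudioBuf *b, int sample_rate)
{
    master_clock = b->pts + (double)b->nb_samples / sample_rate;
}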

Users are less likely to notice the occasional doubled frame. They will,
however, immediately hear a click or a "NULL" sound.

HTH

Erik
