The very first frames on each subsession always come back with 'wall clock'
presentationTime (as filled in by code in RTPSource.cpp, lines 309-318).
Then, once an SR packet has arrived, the presentationTime jumps to the
NTP time advertised by the source in the SR packet.

I don't think this is correct behavior.

Yes it is. The key thing to realize is that the first few presentation times - before RTCP synchronization occurs - are just 'guesses' made by the receiving code (based on the receiver's 'wall clock' and the RTP timestamp). However, once RTCP synchronization occurs, all subsequent presentation times will be accurate, and will be THE SAME PRESENTATION TIMES that the server generated (i.e., they will be times that were computed from the server's clock).

All this means is that a receiver should be prepared for the fact that the first few presentation times (until RTCP synchronization starts) will not be accurate. The code, however, can check this by calling "RTPSource::hasBeenSynchronizedUsingRTCP()". If this returns False, then the presentation times are not accurate, and should be treated with 'a grain of salt'. However, once the call returns True, the presentation times (from then on) will be accurate.
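
As a rough illustration of that check, here is a minimal sketch that records, for each delivered frame, whether its presentation time can be trusted yet. "SourceLike", "TimedFrame" and "tagFrame" are hypothetical names invented for this example; "SourceLike" is only a stand-in so that the sketch compiles without the LIVE555 headers. In a real client you would call hasBeenSynchronizedUsingRTCP() on the subsession's actual RTPSource object.

```cpp
#include <sys/time.h>   // struct timeval

// Hypothetical stand-in for the single RTPSource method used below.
// It is NOT part of LIVE555; it exists only to keep this sketch self-contained.
struct SourceLike {
    bool synced;
    bool hasBeenSynchronizedUsingRTCP() { return synced; }
};

// What a receiver might store for each frame handed to it (hypothetical type).
struct TimedFrame {
    struct timeval presentationTime;
    bool accurate;   // false while the presentation time is still a 'guess'
};

// Tag a frame's presentation time as accurate only once the source reports
// that RTCP synchronization has occurred.
template <typename Source>
TimedFrame tagFrame(Source& source, struct timeval presentationTime) {
    TimedFrame frame;
    frame.presentationTime = presentationTime;
    frame.accurate = source.hasBeenSynchronizedUsingRTCP();
    return frame;
}
```

A receiver could, for example, hold back or discard frames whose "accurate" flag is false, and start scheduling playout from the first frame for which it is true.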


The main reason for this is what kind of SR reports I'm getting from
YouTube:

At time NTP+0.0 seconds, I get an RTP timestamp of 42.0 seconds.
At time NTP+8.0 seconds, I get an RTP timestamp of 54.0 seconds.
At time NTP+16.0 seconds, I get an RTP timestamp of 62.0 seconds.

This shows that in RTP time, 12 seconds worth of data have been transmitted,
however in real time (NTP time), only 8 seconds have elapsed.

All this means is the server is (apparently) streaming 20 seconds worth of data in 16 seconds, apparently to allow the client to pre-buffer the excess data (so it can ensure smooth playout). This means, therefore, that your receiving client needs to buffer this extra data, and play out each frame based on the *presentation time*, *not* at the time at which the frame actually arrives.
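
To make the arithmetic explicit, here is a small sketch that computes how fast media time is advancing relative to the sender's wall clock between two reports. "SrSample" and "mediaRate" are hypothetical names for this example, and the values are assumed to have been reduced to seconds already, as in the figures quoted above.

```cpp
// One Sender Report pairing, reduced to seconds for illustration (hypothetical type).
struct SrSample {
    double ntpSeconds;   // sender wall-clock time carried in the SR
    double rtpSeconds;   // RTP timestamp, converted to seconds of media
};

// Ratio of media time to wall-clock time between two reports.
// A value above 1.0 means the server is sending faster than real time.
// The caller must pass two reports with different NTP times.
double mediaRate(const SrSample& earlier, const SrSample& later) {
    return (later.rtpSeconds - earlier.rtpSeconds) /
           (later.ntpSeconds - earlier.ntpSeconds);
}
```

With the three reports above, the first and last give (62 - 42) / (16 - 0) = 1.25, i.e. 20 seconds of media sent in 16 seconds of real time.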

Therefore, to use your example, you would:
- play the frame whose presentation time is 42.0 at time 0
- play the frame whose presentation time is 54.0 at time 12
- play the frame whose presentation time is 62.0 at time 20
*regardless* of the times at which these frames actually arrived.
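
The schedule above amounts to subtracting the first frame's presentation time from each later one. A minimal sketch, assuming the anchor frame is one whose presentation time is already accurate (i.e. delivered after RTCP synchronization); "toMicros" and "playoutOffsetMicros" are hypothetical helper names, not LIVE555 functions:

```cpp
#include <cstdint>
#include <sys/time.h>   // struct timeval

// Flatten a timeval into a single microsecond count.
inline std::int64_t toMicros(const struct timeval& tv) {
    return static_cast<std::int64_t>(tv.tv_sec) * 1000000 + tv.tv_usec;
}

// How long after the anchor frame a given frame should be played, in
// microseconds. Arrival time plays no part in the calculation.
inline std::int64_t playoutOffsetMicros(const struct timeval& anchor,
                                        const struct timeval& frame) {
    return toMicros(frame) - toMicros(anchor);
}
```

A player would add this offset to the local time at which it played the anchor frame, and buffer each frame until that moment.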

I really wish people would stop thinking that they need to do their own implementation of the RTP/RTCP protocol (e.g., look at RTP timestamps or sequence numbers, and/or RTCP reports). You don't - we already implement all of this! All you need to do is use the presentation times that are delivered to you (but be aware that the first few presentation times may not be accurate, as noted above).
--

Ross Finlayson
Live Networks, Inc.
http://www.live555.com/
_______________________________________________
live-devel mailing list
[email protected]
http://lists.live555.com/mailman/listinfo/live-devel
