Hi,

I am trying to do some work using libav* and am having some issues with my MPEG 
streams, due to the dreaded B-frame reordering issue! Hopefully you guys can 
help me.

My video stream has packets physically ordered as a 15-frame GOP 
"IBBPBBPBBPBBPBB". However, the presentation order is "BBIBBPBBPBBPBBP". The 
first GOP in the stream seems to be closed, so its first 2 B frames depend only 
on the I frame. The subsequent GOPs are open, however: their leading B frames 
rely on the previous GOP's final P frame, which in turn relies on each previous 
P frame in that GOP, which finally relies on the previous GOP's I frame.

This in itself isn't causing me a problem: I can decode the stream happily, and 
I have even implemented frame-accurate seeking. However, it is far less 
efficient than is (theoretically) possible. I'll give an example to illustrate 
the problem:

For clarity I will refer to frames as the presentation order and packets as the 
physical order. E.g. frame 2 is effectively packet 0 (it's physically contained 
in the first packet in the stream, but it's the 3rd frame presented to the 
viewer). Indexes are all from 0.
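
To make that numbering concrete, here is a minimal sketch in C of the mapping 
for my stream. It assumes every GOP is exactly the 15-packet "IBBPBBPBBPBBPBB" 
pattern described above. The names packet_to_frame and frame_to_packet are 
made up for illustration and are not part of libav*.

```c
/* Sketch only: assumes a fixed 15-packet GOP stored as IBBPBBPBBPBBPBB.
 * These helpers are illustrative and are not libav* functions. */
enum { GOP_SIZE = 15 };

/* Packet index (physical order) -> frame index (presentation order). */
static int packet_to_frame(int packet)
{
    int k = packet % GOP_SIZE;      /* position inside the GOP */
    int base = packet - k;          /* first packet of this GOP */

    /* I/P packets sit at positions 0, 3, 6, ... and are shown after the
     * two B frames that follow them in the stream. */
    return (k % 3 == 0) ? base + k + 2 : base + k - 1;
}

/* Frame index (presentation order) -> packet index (physical order). */
static int frame_to_packet(int frame)
{
    int k = frame % GOP_SIZE;       /* position inside the GOP */
    int base = frame - k;           /* first frame of this GOP */

    /* In presentation order the I/P frames land on positions 2, 5, 8, ... */
    return (k % 3 == 2) ? base + k - 2 : base + k + 1;
}
```

With these, packet_to_frame(0) is 2 and frame_to_packet(17) is 15, which 
matches the examples in this mail.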

If I want to seek to frame 17 (which is the second I frame and packet 15), I 
would expect that I just do an av_seek_frame(ctx, vid, frame_index, flags). The 
only way I could get this to work reliably though was to use the BYTE_OFFSET 
flag and use the byte offset of the desired *packet* in the video stream (e.g. 
the byte offset of packet 15). I would then expect to be able to decode that 
packet immediately and get a visible frame (because it's an I-frame). Instead, 
I actually have to read/decode four(!) packets, presumably because the codec is 
confused by the reordered B frames: seeking to frame 17 (and flushing the 
buffers) causes it to lose the previous I / P frames that are needed to display 
frames 15 and 16 (the B frames). But I am only trying to decode frame 17, and I 
do not care about frames 15 and 16. Because they come after frame 17 in the 
packet stream, I have no need to decode them or even read them.

Is there a way to convey this to the codec? Essentially I need to be able to do 
some sort of matched seek, where I can say 'skip to this frame and set the 
"codec's time" to the frame's timestamp'. I would then expect the next two 
packet reads to be ignored (or fail maybe?) as they are for frames that 
occurred in the past.

The reason for doing this is that I'd like to optimise my frame-accurate seek. 
At the moment it goes to the most recent I frame needed to decode the target 
frame, then decodes every frame forward from there.

So in the worst case, if I seek to the second B frame in a GOP (e.g. frame 16 / 
packet 17), it decodes every frame of the entire previous GOP, then the I frame 
of the new GOP, and the first two B frames of the GOP. This means it has to 
decode 2I 4P and 12B frames. In the perfect implementation, it would decode the 
previous GOP's I frame, the 4 P frames, the new GOP's I frame, then use the last 
P frame and the new I frame to generate the second B frame. At this point the 
codec would be looking at packet 17, and would have all the required reference 
frames needed in the buffers. It could then resume playback from this point 
normally.
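
As a rough sketch of that "perfect" plan, the C below lists which packets would 
have to be decoded for a given target packet. It assumes every GOP is the fixed 
15-packet pattern above and that only the very first GOP is closed. 
plan_packets is a hypothetical helper of my own, not a libav* function, and it 
says nothing about how to make the codec accept such a sequence.

```c
#include <stddef.h>

/* Sketch only: assumes every GOP is the fixed 15-packet pattern
 * IBBPBBPBBPBBPBB and that only the very first GOP is closed.
 * plan_packets() is a hypothetical helper, not a libav* function. */
enum { PLAN_GOP_SIZE = 15 };

/* Fill out[] with the packet indexes (ascending) that must be decoded to
 * show the frame stored in target_packet, skipping unneeded B packets.
 * Returns the number of entries written, or -1 if out[] is too small. */
static int plan_packets(int target_packet, int *out, size_t max)
{
    int k = target_packet % PLAN_GOP_SIZE;  /* position inside the GOP */
    int last_ref;   /* last I/P packet that must be decoded */
    int first_ref;  /* I packet that starts the reference chain */
    size_t n = 0;

    if (k % 3 == 0) {
        /* Target is an I or P packet: decode the chain inside its own GOP. */
        last_ref = target_packet;
        first_ref = target_packet - k;
    } else {
        /* Target is a B packet: it needs the I/P packet just before it in
         * the stream, and the I/P packet before that one. */
        int back;
        last_ref = target_packet - (k % 3);
        back = last_ref - 3;
        /* back < 0 only for the leading B frames of the closed first GOP. */
        first_ref = (back < 0) ? last_ref : back - (back % PLAN_GOP_SIZE);
    }

    /* Every third packet between first_ref and last_ref is an I or P. */
    for (int p = first_ref; p <= last_ref; p += 3) {
        if (n == max)
            return -1;
        out[n++] = p;
    }
    /* A B-frame target is decoded last, after its references. */
    if (last_ref != target_packet) {
        if (n == max)
            return -1;
        out[n++] = target_packet;
    }
    return (int)n;
}
```

For frame 16 (packet 17) this gives packets 0, 3, 6, 9, 12, 15 and 17, i.e. 
seven packets in the order I P P P P I B, rather than every packet from 0 
through 17.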

However, trying to implement this is difficult - if I do a flush, then 
repeatedly seek/read/decode to the IPPPPIB packets, I can't work out what frame 
corresponds to what packet passed in. If I pass NULL into the codec to try to 
drain the buffered packets out of it, then when I start passing packets into it 
again the images become horribly corrupted.

How can I get this working? It shouldn't be so hard! Help!

Tom
