Francesco Romani wrote:
> On Mon, 2008-10-06 at 18:04 -0500, Carl Karsten wrote:
> [...]
>>> The good news is that even smarter algos should fit nicely into the
>>> synchro engine currently being laid.
>> Any chance of that happening this year?
>
> Very unlikely at this development speed.
> A lot of code rewriting, careful planning of some architectural changes,
> and the design and implementation of a better synchro algorithm are
> the steps needed. None of them is easy or quick :)
Can you post a road map, along with what sorts of skills are needed? I'll try
to get some help if I know what kind of help is needed.
>
>>>> I am wondering what it would take to make a test app: something that
>>>> would generate an input AV stream (file, v4l device... whatever),
>>>> something like transcode would do something to it, and the output would
>>>> be analyzed to see if the AV is still in sync.
>>>>
>>>> I did something similar using a webcam/mic pointed at a laptop that had
>>>> a clock on the screen, and an audio clock ("at the tone, the time will
>>>> be eight, oh one. beep.") I let that record for an hour, then played it
>>>> back and looked to see if the tone beeped as the second hand crossed 12.
>>>> I would like to think an app could do this too.
>>> It is an interesting challenge (and it would be a nice tool to have).
>> How could it be broken up so that something like python could be used for
>> doing the test?
>
> By adding tc bindings to python? ;P
Careful - if we are testing tc, it may not be good to use tc itself in the
test harness. But being able to use tc from python might make it easier to
test it too, so.. yeah.
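For a first pass the harness would not even need bindings: it could drive the
transcode binary as a subprocess and keep all of the analysis in python, so a
buggy binding can't mask a buggy tc. A rough sketch (the -i/-o/-y options are
from my reading of the man page, so treat them as assumptions):

import subprocess

def run_transcode(infile, outfile, extra_args=()):
    """Run the transcode binary over infile, writing outfile.
    The -i/-o/-y options are what I believe the CLI takes;
    raises if transcode exits nonzero."""
    cmd = ["transcode", "-i", infile, "-o", outfile, "-y", "ffmpeg"]
    cmd.extend(extra_args)
    rc = subprocess.call(cmd)
    if rc != 0:
        raise RuntimeError("transcode failed with status %d" % rc)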
>
>> I am really having a hard time imagining how it would work... but this is
>> what comes to mind:
>> Create input from scratch:
>> Use the python PIL module to construct 2 images: one all black, one all
>> white. Use some mystery module to generate a tone every second. Use
>> something to construct a stream that just alternates between the 2 images,
>> and plays the tone during one.
>
> Yep, the key point is to be able to clearly identify the frames in the
> stream (both audio and video); then we can assert the streams are in sync
> if a given audio frame matches a given video frame (modulo a small fixed
> error).
Is there currently any way to test for sync errors? My understanding is that
the resulting stream is valid, just not what is expected.
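To make that testable from scratch, here is roughly the generator I was
imagining, using PIL for the frames and the stdlib wave module for the tone
track; the constants are arbitrary and the muxing step is left out (the image
sequence would still need to be fed to some encoder):

import math, struct, wave
import Image  # PIL

FPS = 25
RATE = 48000
SIZE = (320, 240)

def make_frames(seconds):
    """Yield one PIL image per frame: white on each whole second,
    black otherwise, so the flash marks the tick."""
    black = Image.new("RGB", SIZE, (0, 0, 0))
    white = Image.new("RGB", SIZE, (255, 255, 255))
    for n in range(seconds * FPS):
        yield white if n % FPS == 0 else black

def write_tone_track(path, seconds, beep_len=0.1, freq=1000.0):
    """Write a mono 16-bit WAV that beeps for beep_len seconds at the
    start of every second and is silent otherwise."""
    w = wave.open(path, "wb")
    w.setnchannels(1)
    w.setsampwidth(2)
    w.setframerate(RATE)
    for i in range(seconds * RATE):
        if (i % RATE) < beep_len * RATE:
            sample = int(20000 * math.sin(2 * math.pi * freq * i / RATE))
        else:
            sample = 0
        w.writeframes(struct.pack("<h", sample))
    w.close()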
>
>> Represent playback as data:
>> Given a stream, I want access to the frames as if it was a list, or at
>> least an iterator. Video would be rendered as single image blobs. I am not
>> really sure how the audio would be represented, but I need some way of
>> detecting the transition.
>
> audio is just a different blob.
Can you have a frame of audio?
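The way I picture it, an "audio frame" would just be the slice of PCM that
covers the same wall-clock interval as one video frame, i.e. samplerate/fps
samples. Assuming that, and assuming we already have detectors that return
the frame index of each flash and each beep (not shown here), the sync check
itself is tiny:

RATE = 48000
FPS = 25
SAMPLES_PER_FRAME = RATE // FPS  # 1920 audio samples per video frame

def audio_frames(samples):
    """Slice a flat PCM sample sequence into per-video-frame chunks."""
    for i in range(0, len(samples), SAMPLES_PER_FRAME):
        yield samples[i:i + SAMPLES_PER_FRAME]

def check_sync(flash_frames, beep_frames, tolerance=2):
    """Assert each detected flash lines up with the matching beep,
    modulo a small fixed error measured in video frames."""
    for flash, beep in zip(flash_frames, beep_frames):
        if abs(flash - beep) > tolerance:
            raise AssertionError("drift: flash at frame %d, beep at frame %d"
                                 % (flash, beep))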
>
>> Any idea how much of this has already been done, or where a good starting
>> point is?
>
> Well, test frame generators (both audio and video) are quite common; we
> have some examples in the TC codebase, in the ffmpeg one, and I guess in
> many others.
> We can also just tag the frames using some kind of watermark in order to
> make them recognizable.
I think it will be easy enough to identify video frames: draw text, OCR the
text - fairly reliable and easy to use.
I have been looking at speech recognition, but even if that was easy and
worked, I am not sure how it would help.
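For the tagging side, stamping each frame with its own index via PIL's
ImageDraw is easy, and we could even dodge OCR by also encoding the index as
a strip of black/white blocks that a reader can simply threshold. A sketch
(the block layout is made up):

import Image, ImageDraw  # PIL

def tag_frame(img, index):
    """Stamp the frame number as text plus a crude 16-bit 'barcode'
    strip along the top edge; the reader can OCR the text or just
    threshold the strip."""
    draw = ImageDraw.Draw(img)
    draw.text((10, 20), "frame %06d" % index, fill=(255, 255, 255))
    w, h = img.size
    block = w // 16
    for bit in range(16):  # MSB first, white block = 1
        on = (index >> (15 - bit)) & 1
        color = (255, 255, 255) if on else (0, 0, 0)
        draw.rectangle([bit * block, 0, (bit + 1) * block - 1, 8], fill=color)
    return img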
>
> Unfortunately, I don't know anything that is usable directly in python;
> I'm afraid anyone interested has to get their hands dirty with C/C++ or
> maybe java too.
I am planning on getting familiar with ctypes, swig and pyrex in the next few
months.
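ctypes looks like the lightest way in, since it needs no compile step. A
hypothetical first stab (libtc.so and tc_probe_file are made-up names just to
show the pattern; the real symbols would come from the tc headers):

import ctypes

# Hypothetical library and symbol names - only the ctypes pattern
# is real here.
libtc = ctypes.CDLL("libtc.so")
libtc.tc_probe_file.argtypes = [ctypes.c_char_p]
libtc.tc_probe_file.restype = ctypes.c_int

def probe(path):
    """Call the (hypothetical) tc_probe_file() and return its status."""
    return libtc.tc_probe_file(path)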
Carl K