Francesco Romani wrote:
> On Mon, 2008-10-06 at 18:04 -0500, Carl Karsten wrote:
> [...]
>>> The good news is that even smarter algos should fit nicely into the
>>> synchro engine currently being laid.
>> Any chance of that happening this year?
>
> Very unlikely at this development speed.
> A lot of code rewriting, careful planning of some architectural changes,
> and the design and implementation of a better synchro algorithm are
> the steps needed. None of them is easy or quick :)
Can you post a road map, along with what sorts of skills are needed? I'll try
to get some help if I know what kind of help is needed.
>
>>>> I am wondering what it would take to make a test app: something that
>>>> would generate an input AV stream (file, v4l device... whatever),
>>>> something like transcode would do something to it, and the output would
>>>> be analyzed to see if the AV is still in sync.
>>>>
>>>> I did something similar using a webcam/mic pointed at a laptop that had
>>>> a clock on the screen, and an audio clock ("at the tone, the time will
>>>> be eight, oh one. beep.") I let that record for an hour, then played it
>>>> back and looked to see if the tone beeped as the second hand crossed 12.
>>>> I would like to think an app could do this too.
>>> It is an interesting challenge (and it would be a nice tool to have).
>> How could it be broken up so that something like python could be used for
>> doing the test?
>
> By adding tc bindings to python? ;P
Careful - if we are testing tc, it may not be good to use tc itself in the
test harness. But being able to use tc from python might make it easier to
test it too, so.. yeah.
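For a first pass the harness would not even need bindings: it could drive the
transcode binary as a subprocess and keep all of the analysis in python, so a
buggy binding can't mask a buggy tc. A rough sketch (the -i/-o/-y options are
from my reading of the man page, so treat them as assumptions):

import subprocess

def run_transcode(infile, outfile, extra_args=()):
    """Run the transcode binary over infile, writing outfile.
    The -i/-o/-y options are what I believe the CLI takes;
    raises if transcode exits nonzero."""
    cmd = ["transcode", "-i", infile, "-o", outfile, "-y", "ffmpeg"]
    cmd.extend(extra_args)
    rc = subprocess.call(cmd)
    if rc != 0:
        raise RuntimeError("transcode failed with status %d" % rc)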
>
>> I am really having a hard time imagining how it would work... but this is
>> what comes to mind:
>> Create input from scratch:
>> Use the python PIL module to construct 2 images: one all black, one all
>> white. Use some mystery module to generate a tone every second. Use
>> something to construct a stream that just alternates between the 2 images,
>> and plays the tone during one.
>
> Yep, the key point is to be able to clearly identify the frames in the
> stream (both audio and video); then we can assert the streams are in sync
> if a given audio frame matches a given video frame (modulo a small fixed
> error).
Is there currently any way to test for sync errors? My understanding is that
the resulting stream is valid, just not what is expected.
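To make that testable from scratch, here is roughly the generator I was
imagining, using PIL for the frames and the stdlib wave module for the tone
track; the constants are arbitrary and the muxing step is left out (the image
sequence would still need to be fed to some encoder):

import math, struct, wave
import Image  # PIL

FPS = 25
RATE = 48000
SIZE = (320, 240)

def make_frames(seconds):
    """Yield one PIL image per frame: white on each whole second,
    black otherwise, so the flash marks the tick."""
    black = Image.new("RGB", SIZE, (0, 0, 0))
    white = Image.new("RGB", SIZE, (255, 255, 255))
    for n in range(seconds * FPS):
        yield white if n % FPS == 0 else black

def write_tone_track(path, seconds, beep_len=0.1, freq=1000.0):
    """Write a mono 16-bit WAV that beeps for beep_len seconds at the
    start of every second and is silent otherwise."""
    w = wave.open(path, "wb")
    w.setnchannels(1)
    w.setsampwidth(2)
    w.setframerate(RATE)
    for i in range(seconds * RATE):
        if (i % RATE) < beep_len * RATE:
            sample = int(20000 * math.sin(2 * math.pi * freq * i / RATE))
        else:
            sample = 0
        w.writeframes(struct.pack("<h", sample))
    w.close()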
>
>> Represent playback as data:
>> Given a stream, I want access to the frames as if it was a list, or at
>> least an iterator. Video would be rendered as single image blobs. I am not
>> really sure how the audio would be represented, but I need some way of
>> detecting the transition.
>
> audio is just a different blob.
Can you have a frame of audio?
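The way I picture it, an "audio frame" would just be the slice of PCM that
covers the same wall-clock interval as one video frame, i.e. samplerate/fps
samples. Assuming that, and assuming we already have detectors that return
the frame index of each flash and each beep (not shown here), the sync check
itself is tiny:

RATE = 48000
FPS = 25
SAMPLES_PER_FRAME = RATE // FPS  # 1920 audio samples per video frame

def audio_frames(samples):
    """Slice a flat PCM sample sequence into per-video-frame chunks."""
    for i in range(0, len(samples), SAMPLES_PER_FRAME):
        yield samples[i:i + SAMPLES_PER_FRAME]

def check_sync(flash_frames, beep_frames, tolerance=2):
    """Assert each detected flash lines up with the matching beep,
    modulo a small fixed error measured in video frames."""
    for flash, beep in zip(flash_frames, beep_frames):
        if abs(flash - beep) > tolerance:
            raise AssertionError("drift: flash at frame %d, beep at frame %d"
                                 % (flash, beep))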
>
>> Any idea how much of this has already been done, or where a good starting
>> point is?
>
> Well, test frame generators (both audio and video) are quite common; we
> have some examples in the TC codebase, in the ffmpeg one, and I guess in
> many others.
> We can also just tag the frames using some kind of watermark in order to
> make them recognizable.
I think it will be easy enough to identify video frames: draw text, OCR the
text - fairly reliable and easy to use.
I have been looking at speech recognition, but even if that was easy and
worked, I am not sure how it would help.
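For the tagging side, stamping each frame with its own index via PIL's
ImageDraw is easy, and we could even dodge OCR by also encoding the index as
a strip of black/white blocks that a reader can simply threshold. A sketch
(the block layout is made up):

import Image, ImageDraw  # PIL

def tag_frame(img, index):
    """Stamp the frame number as text plus a crude 16-bit 'barcode'
    strip along the top edge; the reader can OCR the text or just
    threshold the strip."""
    draw = ImageDraw.Draw(img)
    draw.text((10, 20), "frame %06d" % index, fill=(255, 255, 255))
    w, h = img.size
    block = w // 16
    for bit in range(16):  # MSB first, white block = 1
        on = (index >> (15 - bit)) & 1
        color = (255, 255, 255) if on else (0, 0, 0)
        draw.rectangle([bit * block, 0, (bit + 1) * block - 1, 8], fill=color)
    return img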
>
> Unfortunately, I don't know anything that is usable directly in python;
> I'm afraid anyone interested has to get their hands dirty with C/C++ or
> maybe java too.
I am planning on getting familiar with ctypes, swig and pyrex in the next few
months.
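ctypes looks like the lightest way in, since it needs no compile step. A
hypothetical first stab (libtc.so and tc_probe_file are made-up names just to
show the pattern; the real symbols would come from the tc headers):

import ctypes

# Hypothetical library and symbol names - only the ctypes pattern
# is real here.
libtc = ctypes.CDLL("libtc.so")
libtc.tc_probe_file.argtypes = [ctypes.c_char_p]
libtc.tc_probe_file.restype = ctypes.c_int

def probe(path):
    """Call the (hypothetical) tc_probe_file() and return its status."""
    return libtc.tc_probe_file(path)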
Carl K