I think you should have three info blocks: video streams, audio streams and
subtitles (if the container supports embedding them). Sort them in natural
order, or by vid/aid/sid when those ids are present.
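
For instance, each stream could be modelled with a small value class and
sorted by its id when the container provides one (only a sketch; the class
and field names are not an existing Tika type):

    import java.util.Comparator;
    import java.util.List;

    // Hypothetical holder for one stream's info; names are illustrative only.
    class StreamInfo {
        enum Type { VIDEO, AUDIO, SUBTITLE }

        final Type type;
        final int id;          // vid/aid/sid when present, otherwise stream index
        final String codec;    // e.g. "h264", "aac", "ac3"
        final String lang;     // e.g. "rus", "eng"; may be null
        final String comment;  // free-form tag, e.g. "Rus BaibaKo.tv"

        StreamInfo(Type type, int id, String codec, String lang, String comment) {
            this.type = type;
            this.id = id;
            this.codec = codec;
            this.lang = lang;
            this.comment = comment;
        }

        // Keeps the streams of one type in a stable, id-based order.
        static void sortById(List<StreamInfo> streams) {
            streams.sort(Comparator.comparingInt((StreamInfo s) -> s.id));
        }
    }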

You shouldn't couple video and audio streams together, since any video
stream can be combined with any audio stream.

In terms of XML, you can have the container as the root element, which
embeds the streams grouped by type.
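
A rough sketch of how a parser could emit that grouping. Since Tika parsers
normally produce XHTML, the groups would sit under the body rather than a
custom root element; the div class names and the per-stream text below are
only placeholders, not a proposed final format:

    import org.apache.tika.metadata.Metadata;
    import org.apache.tika.sax.XHTMLContentHandler;
    import org.xml.sax.ContentHandler;
    import org.xml.sax.SAXException;

    public class ContainerOutputSketch {
        // Emits one <div> per stream type, with one <p> line per stream,
        // mirroring the mplayer output quoted below.
        public static void describe(ContentHandler handler, Metadata metadata)
                throws SAXException {
            XHTMLContentHandler xhtml = new XHTMLContentHandler(handler, metadata);
            xhtml.startDocument();

            xhtml.startElement("div", "class", "video-streams");
            xhtml.element("p", "stream 0: video (h264), vid 0");
            xhtml.endElement("div");

            xhtml.startElement("div", "class", "audio-streams");
            xhtml.element("p", "stream 1: audio (aac), aid 0, lang rus");
            xhtml.element("p", "stream 2: audio (ac3), aid 1, lang eng");
            xhtml.endElement("div");

            xhtml.startElement("div", "class", "subtitle-streams");
            xhtml.endElement("div");

            xhtml.endDocument();
        }
    }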

-- 
Best regards,
Konstantin Gribov.
On 28.03.2014 at 1:29, "Nick Burch" <apa...@gagravarr.org> wrote:

> On Thu, 27 Mar 2014, Konstantin Gribov wrote:
>
>> Some containers (like matroska/mkv) tag audio and subtitle streams with a
>> language tag and a comment. From mplayer's console output:
>>
>>> [lavf] stream 0: video (h264), -vid 0
>>> [lavf] stream 1: audio (aac), -aid 0, -alang rus, Rus BaibaKo.tv
>>> [lavf] stream 2: audio (ac3), -aid 1, -alang eng, Eng
>>
> Ogg + CMML would give something similar
>
>> I don't know of any established semantics for video streams, but the first
>> one is usually the default for playback.
>>
>
> How should a Tika parser handle such a file though? Include the primary
> audio metadata with the video stream as the primary object, and report
> embedded items for the other audio streams? Report all as embedded items?
> Report the primary video stream as the main thing, and give all other video
> + audio as embedded items? Something else?
>
> Nick
>
