Sorry for the delay on this. I've updated tika-ffmpeg with a new file with 2 audio tracks and a subtitle track and added a test. The metadata looks as follows:

pbcore:instantiationDataRate=3511 kb/s
pbcore:instantiationDuration=00:00:01.03
pbcore:instantiationEssenceTrack[0]/pbcore:essenceTrackType=Video
pbcore:instantiationEssenceTrack[0]/pbcore:essenceTrackFrameSize=480x270
pbcore:instantiationEssenceTrack[0]/pbcore:essenceTrackFrameRate=29.97 fps
pbcore:instantiationEssenceTrack[0]/pbcore:essenceTrackDataRate=360 kb/s
pbcore:instantiationEssenceTrack[0]/pbcore:essenceTrackEncoding=h264
pbcore:instantiationEssenceTrack[0]/pbcore:essenceTrackLanguage=eng
pbcore:instantiationEssenceTrack[1]/pbcore:essenceTrackType=Audio
pbcore:instantiationEssenceTrack[1]/pbcore:essenceTrackSamplingRate=48000 Hz
pbcore:instantiationEssenceTrack[1]/pbcore:essenceTrackDataRate=1536 kb/s
pbcore:instantiationEssenceTrack[1]/pbcore:essenceTrackEncoding=pcm_s16le
pbcore:instantiationEssenceTrack[1]/pbcore:essenceTrackLanguage=eng
pbcore:instantiationEssenceTrack[2]/pbcore:essenceTrackType=Audio
pbcore:instantiationEssenceTrack[2]/pbcore:essenceTrackSamplingRate=48000 Hz
pbcore:instantiationEssenceTrack[2]/pbcore:essenceTrackDataRate=1536 kb/s
pbcore:instantiationEssenceTrack[2]/pbcore:essenceTrackEncoding=pcm_s16le
pbcore:instantiationEssenceTrack[2]/pbcore:essenceTrackLanguage=eng
pbcore:instantiationEssenceTrack[3]/pbcore:essenceTrackType=Subtitle
pbcore:instantiationEssenceTrack[3]/pbcore:essenceTrackEncoding=eia_608
pbcore:instantiationEssenceTrack[3]/pbcore:essenceTrackLanguage=eng

and the alternative representation would look like:

pbcore:instantiationDataRate=3511 kb/s
pbcore:instantiationDuration=00:00:01.03
stream[0]/pbcore:essenceTrackType=Video
stream[0]/pbcore:essenceTrackFrameSize=480x270
stream[0]/pbcore:essenceTrackFrameRate=29.97 fps
stream[0]/pbcore:essenceTrackDataRate=360 kb/s
stream[0]/pbcore:essenceTrackEncoding=h264
stream[0]/pbcore:essenceTrackLanguage=eng
stream[1]/pbcore:essenceTrackType=Audio
stream[1]/pbcore:essenceTrackSamplingRate=48000 Hz
stream[1]/pbcore:essenceTrackDataRate=1536 kb/s
stream[1]/pbcore:essenceTrackEncoding=pcm_s16le
stream[1]/pbcore:essenceTrackLanguage=eng
stream[2]/pbcore:essenceTrackType=Audio
stream[2]/pbcore:essenceTrackSamplingRate=48000 Hz
stream[2]/pbcore:essenceTrackDataRate=1536 kb/s
stream[2]/pbcore:essenceTrackEncoding=pcm_s16le
stream[2]/pbcore:essenceTrackLanguage=eng
stream[3]/pbcore:essenceTrackType=Subtitle
stream[3]/pbcore:essenceTrackEncoding=eia_608
stream[3]/pbcore:essenceTrackLanguage=eng

I really think that if we encounter another 'kind of thing' that might utilize some form of sub-streams, that 'other thing' will need to be namespaced as well or we'll start to lose the value of using specifications as metadata keys in the first place.

Another example that could make use of this general concept of structured mapping is our IPTC metadata interface. For instance, that specification uses a structured LocationDetails object for both a single-valued LocationCreated field and for a multi-valued LocationShown field. That LocationDetails object contains fields like City and CountryName, so we currently have that mapped as:

Iptc4xmpExt:LocationCreatedCity (internalText)
Iptc4xmpExt:LocationCreatedCountryName (internalText)
...
Iptc4xmpExt:LocationShownCity (internalTextBag)
Iptc4xmpExt:LocationShownCountryName (internalTextBag)
...

which strays from the specification a bit to accommodate our metadata structure, i.e. LocationCreatedCity is not a field in the spec, and if one LocationShown entry only contains City and another only contains CountryName we have to rely on empty, 'padding' entries.

A much more concise representation would be:

Iptc4xmpExt:LocationCreated/Iptc4xmpExt:City
Iptc4xmpExt:LocationCreated/Iptc4xmpExt:CountryName
...
Iptc4xmpExt:LocationShown[0]/Iptc4xmpExt:City
Iptc4xmpExt:LocationShown[0]/Iptc4xmpExt:CountryName
...

IMHO, a generic 'streams' prefix would seem out of place next to those fields.
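
To make the comparison a little more concrete, here is a rough sketch of how a consumer might regroup the structured keys above into per-entry objects. It is written in Python purely for brevity and is not Tika code: the STRUCTURED_KEY pattern, the group_structured helper, and the IPTC sample values are all hypothetical, and the key syntax it assumes ("parent[index]/child", with the index optional) is only what the examples in this mail suggest.

```python
import re
from collections import defaultdict

# Hypothetical pattern for the key syntax shown above: "<parent>[<index>]/<child>".
# The "[<index>]" part is optional so single-valued structures also match.
STRUCTURED_KEY = re.compile(
    r"^(?P<parent>[^\[\]/]+)(?:\[(?P<index>\d+)\])?/(?P<child>.+)$"
)


def group_structured(metadata):
    """Split flat metadata into top-level fields and per-entry structures."""
    flat = {}
    nested = defaultdict(lambda: defaultdict(dict))
    for key, value in metadata.items():
        match = STRUCTURED_KEY.match(key)
        if match is None:
            # No "/" separator: an ordinary file-level field.
            flat[key] = value
            continue
        # Single-valued structures (no "[n]") are stored under index 0.
        index = int(match.group("index") or 0)
        nested[match.group("parent")][index][match.group("child")] = value
    return flat, {parent: dict(entries) for parent, entries in nested.items()}


sample = {
    "pbcore:instantiationDuration": "00:00:01.03",
    "pbcore:instantiationEssenceTrack[0]/pbcore:essenceTrackType": "Video",
    "pbcore:instantiationEssenceTrack[1]/pbcore:essenceTrackType": "Audio",
    # Placeholder values: one entry has only City, the other only CountryName.
    "Iptc4xmpExt:LocationShown[0]/Iptc4xmpExt:City": "ExampleCity",
    "Iptc4xmpExt:LocationShown[1]/Iptc4xmpExt:CountryName": "ExampleCountry",
}

flat, nested = group_structured(sample)
for parent, entries in nested.items():
    for index in sorted(entries):
        print(parent, index, entries[index])
```

The point of the sketch is that each LocationShown entry keeps only the fields it actually has, so no 'padding' entries are needed, and the parent key stays in the namespace of its own specification.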

Regards,

Ray


On July 24, 2014 at 9:52:47 AM, Nick Burch (apa...@gagravarr.org) wrote:

> On Wed, 23 Jul 2014, Ray Gauss wrote:
> > 2) There are several PBCore instantiation properties that apply to
> > the entire file like duration and tracks that we'd want prefixed with
> > pbcore so I think it would be odd to see:
> >
> > pbcore:instantiationDuration=00:00:05.20
> > stream[0]/pbcore:essenceTrackType=Video
>
> This structure does have the advantage that any tool can easily see that
> the second metadata key relates to a sub-stream / sub-track etc, without
> having to know anything about PBCore. That will make it easier for tools
> to exclude or handle these differently in a general way.
>
> (I can't think, off the top of my head, of another kind of thing that
> might need this structure, but I'm reluctant to nail it down to being only
> for PBCore if that'll cause us issues when we try to support something
> very similar in future)
>
>
> Any chance you could get / fake a nearly-full set of metadata keys and
> value for a media file with (say) 3 streams? We can then generate pbcore
> prefixed and general prefixed versions, which should hopefully make it
> easier for other community members to compare and offer their input!
>
> Nick