Re

Paul Jakubik at "Mon, 12 Jul 2010 11:26:16 -0500" wrote:
 PJ> On Mon, Jul 12, 2010 at 10:37 AM, Nick Burch <nick.bu...@alfresco.com> wrote:

 PJ> I've tried to summarize the various use cases mentioned in your email.
 PJ> Please let me know if I have correctly captured everything.
 PJ> - *Containers that are conceptually a single document.* eg .doc (several
 PJ> named streams in an OLE2 file), or .xlsx (several named xml files in a zip
 PJ> file)
 PJ> - *Containers that are conceptually containers of many separate documents.* eg
 PJ> a zip file with several text files in it, or a tar file with zip files, doc
 PJ> files, and text files in it.
 PJ> - *Containers that are both a single document and separate documents.* eg an
 PJ> email with multiple parts and/or attachments, or a .doc with embedded
 PJ> spreadsheets.
 PJ> - *Single documents with metadata associated with regions of the document.* eg
 PJ> PDF?

What about multimedia containers (ASF, OGG, etc.) that could contain data in
different formats? Conceptually they look like a single file, but with
different metadata.

 PJ> From the point of view of reporting metadata for documents, it might be
 PJ> useful to group these use cases the following way:

 PJ> - Single documents with multiple sets of metadata
 PJ>   - Containers that are conceptually single documents
 PJ>   - PDF?
 PJ> - Containers that contain many distinct documents and/or containers
 PJ>   - Containers that are conceptually containers
 PJ>   - Containers that are conceptually documents and containers

Maybe it is worth separating the metadata of top-level objects from the
metadata of embedded objects, and allowing traversal of the hierarchy of
embedded objects? We could then provide several implementations, for example
a collector of metadata for all embedded objects, a collector of top-level
metadata only, etc. This could improve performance in some cases (imho),
because for some tasks people need only the top-level metadata.

--
With best wishes, Alex Ott, MBA
http://alexott.blogspot.com/ http://alexott.net/
http://alexott-ru.blogspot.com/
Skype: alex.ott
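
P.S. A rough sketch of the collector idea, only to make the proposal concrete. It is written in Python for brevity and is not Tika code: `Node`, `walk`, `collect_top_level` and `collect_all` are hypothetical names, and the sample file names and metadata values are made up for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, List, Tuple


@dataclass
class Node:
    """Hypothetical parsed object: its own metadata plus its embedded objects."""
    name: str
    metadata: Dict[str, str] = field(default_factory=dict)
    embedded: List["Node"] = field(default_factory=list)


def walk(node: Node, path: Tuple[str, ...] = ()) -> Iterator[Tuple[Tuple[str, ...], Node]]:
    """Depth-first traversal: yields (path, node) for the node and everything embedded in it."""
    here = path + (node.name,)
    yield here, node
    for child in node.embedded:
        yield from walk(child, here)


def collect_top_level(root: Node) -> Dict[str, str]:
    """Collector that returns only the root's metadata and never visits embedded objects."""
    return dict(root.metadata)


def collect_all(root: Node) -> Dict[str, Dict[str, str]]:
    """Collector that gathers metadata for the root and every embedded object, keyed by path."""
    return {"/".join(path): dict(node.metadata) for path, node in walk(root)}


# Made-up example: an email with a .doc attachment that embeds a spreadsheet.
sheet = Node("budget.xls", {"Content-Type": "application/vnd.ms-excel"})
doc = Node("report.doc", {"Content-Type": "application/msword"}, [sheet])
mail = Node("message.eml", {"Content-Type": "message/rfc822"}, [doc])

top = collect_top_level(mail)
everything = collect_all(mail)
```

Here `top` holds just the metadata of the email itself, while `everything` has one entry per object in the hierarchy, keyed by paths such as `message.eml/report.doc`. In a real parser the embedded objects would presumably be produced lazily, so that the top-level collector never pays for parsing them.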