On Wed, Sep 03, 2008 at 05:41:45PM -0700, Jimmy Huang wrote:
> I completed a design document draft of Content Manager for Moblin 2.0,
> which is part of the Application and Framework infrastructure. It is
> based on Meta Tracker (Trackerd) and SQLite.
>
> Please review the document and provide feedbacks, or comments are
> welcome. Thanks.

I've been looking into meta data management for moblin2 myself this
week, and this mail sums up my thoughts on what I found. Good meta data
APIs will be instrumental in creating a good, innovative user
interface. My assertions about the different frameworks are based on my
own experiments in http://pippin.gimp.org/stuff/.

RDF is an extendable abstract technology similar to XML, but it is not
XML. Meta data is also not only about search; it can also be storage of
information like the change in brightness or the crop desired on a
photo.

We can use ontologies (vocabularies for describing different types of
data like files, images, videos and applications) specified by Xesam,
as well as other ontologies originating within the semantic web
communities. We will probably also invent meta data of our own, like
the brightness and contrast as well as the cropping and sharpening
applied to an image in the photo manager.

We need to do more than feed meta data in and query what is there; we
also need DOM-like access to traverse the arcs of the graph when
creating visualizations and user interfaces.

Since meta data is not only about search, full text indexing is a
separate issue, and a full text index should be stored in a separate
database. We might be able to do without such functionality anyway.

Potential RDF storage frameworks
================================

What follows is a review of the candidate libraries I've looked at in
greatest depth. (I've gone through most of the C based RDF libraries,
actively developed or not, that I've found with freshmeat and google.)
These are the three I ended up finding most relevant and studied in
most detail.

librdf (redland)
----------------
link: http://librdf.org/

pros:
- well tested and documented, uses RDF natively.
- DOM like API to navigate the graph.
- abstraction glue layer.
- multiple backends, could be extended with a mobile dedicated backend
  like TT.
- supports multiple and pluggable query languages, allows full reuse of
  existing literature applicable to development using RDF from the
  semantic web domain.
- works with multiple clients using libsql and some other backends;
  this means no marshalling of data over dbus but direct access from
  all clients.
- written in C.

cons:
- large.
- verbose API (can be wrapped in macros, or an abstraction layer can be
  created.)
- the library does not do file locking, at least not on Berkeley DB
  files.
- doesn't support concurrent access/syncing by multiple clients with
  the Berkeley DB backend (can be added with transactions.) But a
  native quarked string/hashtable based approach similar to TT could be
  a better long term plan for mobile memory footprint/performance
  optimization.

TT (from stuff)
---------------
link: http://pippin.gimp.org/tmp/tt.h.txt
      http://pippin.gimp.org/tmp/tt.c.txt

pros:
- developed in-house.
- we have plans for how to make it efficiently shared between processes
  using mmap and tiny per-process indexes for efficient queries.
- small DOM like API, not as extensive as librdf though.
- fast since it works with an in memory index; at a later stage the
  actual strings could be swapped out to an mmapped string storage
  shared between client processes.

cons:
- experimental, with a minimal developer base.
- no developer community.
- few features, needs development.
- will not work correctly for RDF when there are multiple objects with
  the same relation (e.g. multiple dc:contributor relations).
- very simplistic query model.

Tracker:
--------
link: http://www.gnome.org/projects/tracker/

pros:
- used by others, improved by Nokia.
- responds in real time to filesystem changes.
- has many extractors.
- could potentially have its data store replaced with librdf, which
  could allow clients direct access to the nicer APIs there without
  going over dbus.

cons:
- does more than what we need, and doesn't deal with RDF directly.
- own high level abstractions for types.
- lack of high level DOM API.
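
To make the two recurring requirements above concrete (DOM-like
traversal of arcs, and several objects for the same relation, which is
the case TT currently gets wrong), here is a minimal Python sketch. It
is only an illustration under my own assumptions and is not the API of
librdf, TT or Tracker: the class TripleStore, its method names, the
example file URI and the ex:brightness property are all hypothetical.

```python
from collections import defaultdict


class TripleStore:
    """Tiny in-memory graph: subject -> relation -> list of objects.

    Hypothetical illustration only; not the librdf, TT or Tracker API.
    """

    def __init__(self):
        self._arcs = defaultdict(lambda: defaultdict(list))

    def add(self, subject, relation, obj):
        # Keep every distinct object, so repeated relations such as
        # several dc:contributor arcs on one subject are all preserved.
        objects = self._arcs[subject][relation]
        if obj not in objects:
            objects.append(obj)

    def relations(self, subject):
        # Outgoing arc labels of a node, in insertion order.
        return list(self._arcs.get(subject, {}))

    def objects(self, subject, relation):
        # Follow one arc label from a node to all of its targets.
        return list(self._arcs.get(subject, {}).get(relation, []))

    def subjects(self, relation, obj):
        # Reverse traversal: which nodes point at obj via relation?
        return [s for s, rels in self._arcs.items()
                if obj in rels.get(relation, [])]


store = TripleStore()
photo = "file:///photos/beach.jpg"  # hypothetical example URI
store.add(photo, "dc:contributor", "Alice")
store.add(photo, "dc:contributor", "Bob")
store.add(photo, "ex:brightness", "0.2")  # hypothetical custom property

for relation in store.relations(photo):
    print(relation, store.objects(photo, relation))
```

A store that maps each (subject, relation) pair to a single value would
silently drop one of the two contributors; keeping a list per pair is
the smallest change that avoids that.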

A plan
======
- Create a separate double-bookkeeping librdf based database (using
  sqlite) that can be manually populated using a commandline spidering
  tool.
- Allow application developers to store custom data and use various
  front ends to librdf (query languages, higher level abstraction APIs
  etc.)
- Use tracker for monitoring the file system and tracking additions,
  deletions and changes to files on disk.
- Update the double-bookkeeping librdf database periodically, or upon
  changes reported by tracker, by patching tracker.
- Make tracker use librdf as its backend, thus getting rid of the
  double bookkeeping.
- Create a custom footprint optimized backend for librdf (similar in
  spirit to TT?) for memory constrained devices if necessary.

This development plan makes it possible to parallelize development and
avoid having some branches of development depend on the others.

/Øyvind K.

-- 
▼△▼△▼△▼△▼△▼△▼△▼△▼△▼△▼△▼△▼△▼△▼△▼△▼△▼△▼△▼△▼△▼△▼△▼△▼△▼△▼△▼△▼△▼△▼△▼△▼△▼△▼△▼△▼△▼△
'Truth comes out of error more readily than confusion.'
                                                     -- Francis Bacon
