Re: [Moblin Dev] Meta data storage/management

Øyvind Kolås Tue, 09 Sep 2008 04:53:44 -0700

On Mon, Sep 08, 2008 at 12:18:54PM -0700, Jimmy Huang wrote:
> On Fri, 2008-09-05 at 12:23 +0100, Øyvind Kolås wrote:
> > On Wed, Sep 03, 2008 at 05:41:45PM -0700, Jimmy Huang wrote:
> > > I completed a design document draft of Content Manager for Moblin 2.0,
> > > which is part of the Application and Framework infrastructure.  It is
> > > based on Meta Tracker (Trackerd) and SQLite.
> > > 
> I agree, because applications need a way not only to query the
> extracted metadata, but also to store metadata that is
> application-specific.  Tracker does not provide much of this: it lets
> the user attach a tag string to a file, but you can't really attach
> labels to the information dynamically.


Internally tracker is technically also RDF, since it stores its data as
triplets. But by not allowing the set of stored meta data to be
extended, it limits some of its potential uses as a meta data store.
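
To make the triplet point concrete, here is a rough Python sketch of an
extensible triple store. It is not tracker's actual schema or API; the
class name and the predicates (`dc:title`, `app:rating`) are made up for
illustration. The point is only that a new kind of meta data is a new
predicate string, not a schema change.

```python
from collections import defaultdict


class TripleStore:
    """Toy in-memory store of (subject, predicate, object) triples.

    Hypothetical illustration only; not tracker's or librdf's API.
    """

    def __init__(self):
        # subject -> set of (predicate, object) pairs
        self._by_subject = defaultdict(set)

    def add(self, subject, predicate, obj):
        # Any predicate is accepted, so applications can introduce
        # their own meta data without the store knowing about it.
        self._by_subject[subject].add((predicate, obj))

    def query(self, subject=None, predicate=None, obj=None):
        """Return matching triples, sorted; None acts as a wildcard."""
        subjects = [subject] if subject is not None else list(self._by_subject)
        results = []
        for s in subjects:
            for p, o in self._by_subject.get(s, ()):
                if predicate is not None and p != predicate:
                    continue
                if obj is not None and o != obj:
                    continue
                results.append((s, p, o))
        return sorted(results)


store = TripleStore()
store.add("file:///photos/a.jpg", "dc:title", "Harbour")
# Application-specific predicate, added without touching any schema.
store.add("file:///photos/a.jpg", "app:rating", "5")
store.add("file:///photos/b.jpg", "app:rating", "3")

rated = store.query(predicate="app:rating")
```

Here `rated` holds the two `app:rating` triples, even though the store
was never told that ratings exist.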

> The intention of the project is not to focus on text indexing at all.
> I view what Tracker provides as good bonus information, from which we
> may also derive some key usages for the Moblin framework.

> I haven't worked with RDF libraries, and I am curious about the memory
> footprint and scalability of the API for large datasets.

I do not know the details for different data sets etc; this depends
entirely on the backend being used. librdf can use a triplet store kept
in sqlite (which is the same thing that tracker does), it can also use
berkeleydb as well as a range of other backends, and it could even be
possible to hook up librdf on top of an optimized TT core.

> * snip *

> > A plan
> > ======
> > 
> * snip *
> > 
> > This development plan makes it possible to parallelize development and avoid
> > having some branches of development depend on the others.
> 
> It makes sense to do parallel development because Tracker is actively
> developed and constantly optimized.  I think what needs to drive the
> metadata storage design is how the applications are intended to use
> it.

The plan above is one possible course of action. My main concern about
tracker is that it is not flexible enough and that it might already be
doing quite a few things that we do not need. I am experimenting with a
prototype of shared core infrastructure for photo/video/music
management/browsing applications and I still do not know the extent of
the APIs desired, but I do know that I would prefer to do queries and
traversals of the database in-process rather than proxy them over d-bus.

> Scalability and performance are also a primary concern.  How does an
> RDF-based backend perform when a query has to parse the RDF for large
> datasets on a small device like a MID?  The current usage model is
> that an application filters through tens of thousands of media items
> and easily creates a tag cloud based on the tags provided.  It would
> be nice to see some performance analysis of parsing/querying RDF
> using librdf.

RDF does not need to be parsed at query time; RDF is the abstract data
model, which can be embodied in XML, N3 or other formats. On a device an
index would not be stored in such a serialized format but be kept in
data structures efficient for queries, in an sqlite database or in a
more dedicated triplet store.
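
As a rough sketch of what "kept in an sqlite database" could look like
for the tag cloud case, using Python's standard sqlite3 module: the
table layout, the index, the function names and the `app:tag` predicate
are all my own assumptions for illustration, not what tracker or librdf
actually use. Note that no RDF/XML or N3 text is parsed anywhere; the
triples live in an indexed table and the tag cloud is a single
aggregate query.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) a hypothetical triple table in sqlite."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS triples ("
        " subject TEXT NOT NULL,"
        " predicate TEXT NOT NULL,"
        " object TEXT NOT NULL,"
        " UNIQUE (subject, predicate, object))"
    )
    # Index chosen so that 'all values of one predicate' is cheap.
    db.execute(
        "CREATE INDEX IF NOT EXISTS idx_pred_obj"
        " ON triples (predicate, object)"
    )
    return db


def add_triple(db, subject, predicate, obj):
    # Duplicate triples are silently ignored.
    db.execute(
        "INSERT OR IGNORE INTO triples VALUES (?, ?, ?)",
        (subject, predicate, obj),
    )


def tag_cloud(db, tag_predicate="app:tag"):
    """Return (tag, count) pairs, most used tag first."""
    rows = db.execute(
        "SELECT object, COUNT(*) AS n FROM triples"
        " WHERE predicate = ?"
        " GROUP BY object ORDER BY n DESC, object ASC",
        (tag_predicate,),
    )
    return rows.fetchall()


db = open_store()
for media, tag in [
    ("a.jpg", "holiday"),
    ("b.jpg", "holiday"),
    ("b.jpg", "family"),
    ("c.ogg", "holiday"),
]:
    add_triple(db, media, "app:tag", tag)

cloud = tag_cloud(db)
```

How this behaves with tens of thousands of items on a MID is exactly
what would need measuring; the sketch only shows the shape of the query.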

/Øyvind K.
-- 
▼△▼△▼△▼△▼△▼△▼△▼△▼△▼△▼△▼△▼△▼△▼△▼△▼△▼△▼△▼△▼△▼△▼△▼△▼△▼△▼△▼△▼△▼△▼△▼△▼△▼△▼△▼△▼△▼△
'Truth comes out of error more readily than confusion.' -- Francis Bacon

_______________________________________________
dev mailing list
[email protected]
https://www.moblin.org/mailman/listinfo/dev
