Actually I have a meeting on Friday, so tomorrow at the same time would
be best for me.

jamie

On Wed, 2009-03-11 at 11:13 -0400, Jamie McCracken wrote:
> Hi Carlos,
> 
> There are indeed advantages and disadvantages to doing things
> differently.
> 
> I'm really busy today and tomorrow, but could we have an IRC meeting on
> Friday at the usual time, 14:00 GMT (9am for me), to discuss this?
> 
> jamie
> 
> On Tue, 2009-03-10 at 19:56 +0100, Carlos Garnacho wrote:
> > Hi all,
> > 
> > We (as in the contributors from Nokia) have been revisiting the
> > daemon/indexer architecture, which has proven to be a much better
> > approach, both conceptually and code-wise; however, we still feel
> > there are some issues left.
> > 
> > Current situation
> > =================
> > Tracker is now split into a daemon (which handles requests, does file
> > monitoring, etc.) and an indexer (which collects metadata from files
> > and saves it in the Tracker database). As things stand, the indexer
> > has little reason to stay alive once indexing has been performed, and
> > file operations (which aren't that frequent in normal user
> > interaction) simply trigger the indexer again to do its job.
> > 
> > What we think could be the future
> > =================================
> > Metadata could come from virtually any source (emails, RSS feeds, ...)
> > and it could also arrive at any time, not necessarily requiring user
> > interaction; the more metadata sources there are, the busier Tracker
> > gets. This makes the separation between indexer and daemon less
> > useful, since the indexer would be active more and more of the time,
> > as it is also responsible for writing data to the databases.
> > 
> > Also, as the indexer is asked to handle more information, the DBus
> > communication between the two processes might become a bottleneck in
> > this scenario.
> > 
> > Solution
> > ========
> > We feel it would be better to move the write functionality from the
> > indexer back to the daemon, and have the indexer just hand the
> > metadata over to the daemon so it can store it.
> > 
> > As for the communication medium between the daemon and the indexer, it
> > has to be compact and fast, so sending Turtle (TTL) files through a
> > pipe sounds close to optimal.
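> > 
> > A rough sketch of what the indexer end could look like (plain POSIX,
> > with a made-up prefix and properties, not actual Tracker code; framing
> > and error handling are left out for brevity):
> > 
> >     /* indexer side: serialize one crawled file as Turtle and push it
> >      * down a pipe fd inherited from the daemon */
> >     #include <stdio.h>
> >     #include <unistd.h>
> > 
> >     static void
> >     send_file_metadata (int pipe_fd, const char *uri, long size)
> >     {
> >             char ttl[1024];
> >             int  len;
> > 
> >             /* the "ex:" prefix and properties are hypothetical */
> >             len = snprintf (ttl, sizeof (ttl),
> >                             "@prefix ex: <http://example.org/meta#> .\n"
> >                             "<%s> ex:fileSize %ld ;\n"
> >                             "     ex:indexed true .\n",
> >                             uri, size);
> > 
> >             /* one write per document keeps the protocol trivial */
> >             if (len > 0 && write (pipe_fd, ttl, len) < 0)
> >                     perror ("write");
> >     }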
> > 
> > This architecture change would also ensure that the part responsible
> > for storing metadata and monitoring external sources is always kept
> > alive.
> > 
> > So summing up, these would be the roles:
> > 
> > trackerd:
> >      - Database reads and writes
> >      - File system monitoring
> >      - Importing metadata into the database using the TTL file format
> >        (see the daemon-side sketch below)
> >      - Handling remote/virtual data
> > 
> > tracker-indexer:
> >      - File system crawling
> >      - Exporting metadata in the TTL file format for the daemon
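> > 
> > And a similarly rough sketch of the daemon end, which just drains the
> > pipe and hands the Turtle text to a stand-in import routine (again
> > hypothetical code, not what would land in trunk):
> > 
> >     /* daemon side: read Turtle text off the pipe and pass it on;
> >      * import_turtle() is a placeholder for the real database import */
> >     #include <stdio.h>
> >     #include <unistd.h>
> > 
> >     static void
> >     import_turtle (const char *ttl)
> >     {
> >             /* stand-in: the real thing would parse the Turtle and
> >              * write the triples to the database */
> >             fputs (ttl, stdout);
> >     }
> > 
> >     static void
> >     read_and_import (int pipe_fd)
> >     {
> >             char    buf[4096];
> >             ssize_t n;
> > 
> >             while ((n = read (pipe_fd, buf, sizeof (buf) - 1)) > 0) {
> >                     buf[n] = '\0';
> >                     import_turtle (buf);
> >             }
> >     }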
> > 
> > This solution would imply (again) plenty of changes in the code base,
> > and it would take some time to get it ready for inclusion in trunk,
> > but we think it could be beneficial enough to be worth the pain. Some
> > pros would be:
> > 
> >     - More efficient communication than DBus.
> >     - Smaller overall memory footprint: there's no need to duplicate
> > strings on the indexer/daemon sides for every file we crawl.
> >     - Faster response times for user requests when inserting and then
> > looking up virtual data (based on various ontologies).
> >     - A much more lightweight daemon than we have now.
> >     - Less code duplication (e.g. indexer/daemon crawling).
> > 
> > Opinions? Issues?
> > 
> > Regards,
> >    Carlos
> > 

_______________________________________________
tracker-list mailing list
tracker-list@gnome.org
http://mail.gnome.org/mailman/listinfo/tracker-list
