Re: [Tracker] Revisiting indexer/daemon architecture

Jamie McCracken Wed, 11 Mar 2009 08:13:21 -0700

Hi Carlos,

There are indeed advantages and disadvantages to doing things
differently.


I'm really busy today and tomorrow, but could we have an IRC meet on Friday
at the usual time, 14:00 GMT (9am for me), to discuss this?

jamie

On Tue, 2009-03-10 at 19:56 +0100, Carlos Garnacho wrote:
> Hi all,
> 
> We (as in the contributors from Nokia) have been revisiting the idea of
> the daemon/indexer architecture, which has proven to be a much better
> approach, both conceptually and code-wise. However, we still feel there
> are some issues left.
> 
> Current situation
> =================
> Tracker is now split into the daemon (which handles requests, does file
> monitoring, etc.) and the indexer (which collects metadata from files
> and saves it in the Tracker database). As it stands, the indexer
> has little reason to stay alive after indexing has been
> performed, and file operations (which aren't that frequent in normal
> user interaction) would trigger the indexer again to do its job.
> 
> What we think could be the future
> =================================
> Metadata could come from virtually any source (Emails, RSS feeds, ...),
> and it could also come at any time, not necessarily requiring user
> interaction, and the more metadata sources, the busier Tracker would
> get. This makes the separation between indexer and daemon less useful,
> since the indexer part would be active more and more of the time, as it's
> also responsible for writing data to the databases.
> 
> Also, as the indexer is told to handle more information, the DBus
> communication between both processes might become an issue in this
> scenario.
> 
> Solution
> ========
> We feel it would be better to move the write functionality of the
> indexer back into the daemon, and have the indexer just hand the
> metadata to the daemon so that the daemon can store it.
> 
> As for the communication medium between the daemon and the indexer, it
> has to be compact and fast, so sending Turtle (TTL) files through a pipe
> sounds like a good fit.
> 
> This architecture change would also ensure that the part that is
> responsible for storing metadata and monitoring external sources is
> always kept alive.
> 
> So summing up, these would be the roles:
> 
> trackerd:
>      - Database reads and writes
>      - File system monitoring
>      - Importing metadata to the database using the TTL file format
>      - Handling remote/virtual data
> 
> tracker-indexer:
>      - File system crawling
>      - Exporting metadata to the TTL file format for the daemon
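
A minimal sketch of the hand-off described above, not Tracker code: an
"indexer" function serializes metadata as Turtle triples, writes them to a
pipe, and a "daemon" function reads them back and stores them. Everything
named here is an assumption made for illustration: the example.org
predicate namespace, the one-triple-per-line subset of Turtle, the use of
JSON string escaping for literals (a shortcut that covers the common
Turtle escapes), and the SQLite table layout. Both ends run in one process
to keep the example self-contained; the proposal has them in separate
processes.

```python
import json
import os
import sqlite3

# Hypothetical predicate namespace, used only for this illustration.
EX = "http://example.org/ontology#"


def export_turtle(uri, metadata):
    """Indexer side: serialize one resource's metadata, one triple per line."""
    lines = []
    for key, value in sorted(metadata.items()):
        # json.dumps quotes the value and escapes quotes, backslashes and
        # newlines, so each triple stays on a single line.
        literal = json.dumps(value, ensure_ascii=False)
        lines.append("<%s> <%s%s> %s ." % (uri, EX, key, literal))
    return "\n".join(lines) + "\n"


def import_turtle(stream, db):
    """Daemon side: read triples from a text stream and store them.

    This only understands the restricted subset emitted by export_turtle();
    it is not a general Turtle parser.
    """
    count = 0
    for line in stream:
        line = line.strip()
        if not line:
            continue
        subject, predicate, rest = line.split(" ", 2)
        if not rest.endswith(" ."):
            raise ValueError("unsupported triple: %r" % line)
        value = json.loads(rest[:-2])
        db.execute(
            "INSERT INTO triples (subject, predicate, object) VALUES (?, ?, ?)",
            (subject[1:-1], predicate[1:-1], value),
        )
        count += 1
    db.commit()
    return count


def demo():
    """Push one file's metadata through a pipe and return the stored rows."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE triples (subject TEXT, predicate TEXT, object TEXT)")

    read_fd, write_fd = os.pipe()
    # The payload is small, so writing before reading cannot fill the pipe.
    with os.fdopen(write_fd, "w", encoding="utf-8") as out:
        out.write(export_turtle("file:///tmp/a.txt",
                                {"title": "Notes", "mime": "text/plain"}))
    with os.fdopen(read_fd, "r", encoding="utf-8") as inp:
        import_turtle(inp, db)

    return db.execute(
        "SELECT subject, predicate, object FROM triples ORDER BY predicate"
    ).fetchall()


if __name__ == "__main__":
    for row in demo():
        print(row)
```

In this sketch only the daemon side touches the database, which matches
the proposed role split; the indexer side needs nothing beyond a writable
file descriptor.
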
> 
> This solution would imply (again) plenty of changes in the code base,
> and it would take some time to have it ready for inclusion in trunk, but
> we think it could be beneficial enough to be worth the pain. Some pros
> could be:
> 
>     - More efficient communication than DBus.
>     - Overall memory footprint is smaller. There's no need to duplicate
> strings on indexer/daemon sides for every file we crawl.
>     - Faster response times for user requests when inserting and then
> looking up virtual data (based on various ontologies).
>     - A much more lightweight daemon than we have now.
>     - Less code duplication (i.e. indexer/daemon crawling).
> 
> Opinions? Issues?
> 
> Regards,
>    Carlos
> 
> 
> 
> _______________________________________________
> tracker-list mailing list
> tracker-list@gnome.org
> http://mail.gnome.org/mailman/listinfo/tracker-list

