Hi Carlos,

There are indeed advantages and disadvantages to doing things
differently.

I'm really busy today and tomorrow, but could we have an IRC meeting on
Friday at the usual time, 14:00 GMT (9am for me), to discuss this?

jamie

On Tue, 2009-03-10 at 19:56 +0100, Carlos Garnacho wrote:
> Hi all,
> 
> We (as in the contributors from Nokia) have been revisiting the idea of
> the daemon/indexer architecture, which has proven to be a much better
> approach, both conceptually and code-wise; however, we still feel there
> are some issues left.
> 
> Current situation
> =================
> Tracker is now split into a daemon (which handles requests, does file
> monitoring, etc.) and an indexer (which collects metadata from files
> and saves it in the Tracker database). As things stand, the indexer
> has little reason to stay alive once indexing has been performed, and
> file operations (which aren't that frequent in normal user
> interaction) would trigger the indexer again to do its job.
> 
> What we think could be the future
> =================================
> Metadata could come from virtually any source (emails, RSS feeds, ...),
> and it could also come at any time, not necessarily requiring user
> interaction; the more metadata sources there are, the busier Tracker
> would get. This makes the separation between indexer and daemon less
> useful, since the indexer would be active more and more of the time,
> as it's also responsible for writing data to the databases.
> 
> Also, as the indexer is asked to handle more information, the DBus
> communication between the two processes might become a bottleneck in
> this scenario.
> 
> Solution
> ========
> We feel it would be better to move the indexer's write functionality
> back to the daemon, and have the indexer just hand the metadata over
> to the daemon so it can store it.
> 
> As for the communication medium between the daemon and the indexer, it
> has to be compact and fast, so sending Turtle (TTL) files through a
> pipe sounds close to optimal.
> 
> This architecture change would also ensure that the part that is
> responsible for storing metadata and monitoring external sources is
> always kept alive.
> 
> So summing up, these would be the roles:
> 
> trackerd:
>      - Database reads and writes
>      - File system monitoring
>      - Importing metadata to the database using the TTL file format
>      - Handling remote/virtual data
> 
> tracker-indexer:
>      - File system crawling
>      - Exporting metadata to the TTL file format for the daemon
> 
> This solution would (again) imply plenty of changes in the code base,
> and it would take some time to get it ready for inclusion in trunk,
> but we think it could be beneficial enough to be worth the pain. Some
> pros:
> 
>     - More efficient communication than DBus.
>     - Smaller overall memory footprint: there's no need to duplicate
> strings on the indexer and daemon sides for every file we crawl.
>     - Faster response times for user requests when inserting and then
> looking up virtual data (based on various ontologies).
>     - A much more lightweight daemon than we have now.
>     - Less code duplication (i.e. indexer/daemon crawling).
> 
> Opinions? Issues?
> 
> Regards,
>    Carlos
> 
> 
> 
> _______________________________________________
> tracker-list mailing list
> tracker-list@gnome.org
> http://mail.gnome.org/mailman/listinfo/tracker-list
