Hi Carlos,

There are indeed advantages and disadvantages to doing things differently.

I'm really busy today and tomorrow, but could we have an IRC meeting on Friday at the usual time, 14:00 GMT (9am for me), to discuss this?

jamie

On Tue, 2009-03-10 at 19:56 +0100, Carlos Garnacho wrote:
> Hi all,
>
> We (as in the contributors from Nokia) have been revisiting the idea of
> the daemon/indexer architecture, which has proven to be a much better
> approach, both conceptually and code-wise. However, we still feel there
> are some issues left.
>
> Current situation
> =================
> Tracker is now split into the daemon (which handles requests, does file
> monitoring, etc.) and the indexer (which collects metadata from files
> and saves it in the Tracker database). As things stand, the indexer
> has little reason to stay alive after indexing has been
> performed, and file operations (which aren't that frequent in normal
> user interaction) would trigger the indexer again to do its job.
>
> What we think could be the future
> =================================
> Metadata could come from virtually any source (emails, RSS feeds, ...),
> and it could also come at any time, not necessarily requiring user
> interaction, and the more metadata sources, the busier Tracker would
> get. This makes the separation between indexer and daemon less useful,
> since the indexer would be active more and more of the time, as it's also
> responsible for writing data to the databases.
>
> Also, as the indexer is told to handle more information, the DBus
> communication between both processes might become an issue in this
> scenario.
>
> Solution
> ========
> We feel it would be better to move the write functionality of the
> indexer back to the daemon, and make the indexer just hand the metadata
> to the daemon so that the daemon is able to store it.
>
> As for the communication medium between the daemon and the indexer, it
> has to be compact and fast, so sending Turtle files through a pipe
> sounds quite optimal.
>
> This architecture change would also ensure that the part that is
> responsible for storing metadata and monitoring external sources is always
> kept alive.
>
> So, summing up, these would be the roles:
>
> trackerd:
> - Database reads and writes
> - File system monitoring
> - Importing metadata to the database using the TTL file format
> - Handling remote/virtual data
>
> tracker-indexer:
> - File system crawling
> - Exporting metadata to the TTL file format for the daemon
>
> This solution would imply (again) plenty of changes in the code base,
> and would take some time to have it ready for inclusion in trunk, but we
> think it could be beneficial enough to be worth the pain. Some pros
> could be:
>
> - More efficient communication than DBus.
> - Smaller overall memory footprint. There's no need to duplicate
>   strings on the indexer and daemon sides for every file we crawl.
> - Faster response times for user requests when inserting and then
>   looking up virtual data (based on various ontologies).
> - A much more lightweight daemon than we have now.
> - Less code duplication (i.e. indexer/daemon crawling).
>
> Opinions? Issues?
>
> Regards,
> Carlos
>
> _______________________________________________
> tracker-list mailing list
> tracker-list@gnome.org
> http://mail.gnome.org/mailman/listinfo/tracker-list
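
For the Friday discussion, the proposed handoff (the indexer exports Turtle, the daemon imports it from a pipe) could look roughly like the sketch below. It is only an illustration in Python and not Tracker code: the predicate URIs, the function names, and the dict standing in for the database are all assumptions, and the reader handles only the restricted Turtle subset that the writer emits (single-line string literals, one subject per statement).

```python
import os
import re

# Hypothetical predicate URIs, used for illustration only.
TITLE = "http://example.org/ontology#title"
MIME = "http://example.org/ontology#mimeType"


def escape(value):
    """Escape backslashes and double quotes for a Turtle string literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"')


def export_turtle(uri, metadata):
    """Indexer side: serialize one resource's metadata as a Turtle statement."""
    pairs = [f'    <{pred}> "{escape(val)}"' for pred, val in metadata.items()]
    return f"<{uri}>\n" + " ;\n".join(pairs) + " .\n"


# These patterns match only the restricted subset that export_turtle() emits.
SUBJECT = re.compile(r"^<([^>]*)>$")
PAIR = re.compile(r'^\s+<([^>]*)> "((?:[^"\\]|\\.)*)" [;.]$')


def import_turtle(stream, store):
    """Daemon side: read statements from the pipe and record them in store."""
    subject = None
    for line in stream:
        line = line.rstrip("\n")
        match = SUBJECT.match(line)
        if match:
            subject = match.group(1)
            continue
        match = PAIR.match(line)
        if match and subject is not None:
            # Undo escape(): each backslash-escaped character becomes itself.
            value = re.sub(r"\\(.)", r"\1", match.group(2))
            store.setdefault(subject, {})[match.group(1)] = value
    return store


def demo():
    """Send one statement through a pipe and return what the reader stored."""
    read_fd, write_fd = os.pipe()
    # Writing before reading in a single process is fine here only because
    # the payload is far smaller than the pipe buffer.
    with os.fdopen(write_fd, "w", encoding="utf-8") as out:
        out.write(export_turtle("file:///home/user/notes.txt",
                                {TITLE: 'My "notes"', MIME: "text/plain"}))
    with os.fdopen(read_fd, "r", encoding="utf-8") as inp:
        return import_turtle(inp, {})
```

In a real split, the two ends would live in separate processes and the daemon would need a full Turtle parser rather than these two regular expressions; the sketch is only meant to show how little framing a pipe-based handoff requires.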