FYI: the diff contains a first look at the tracker-db-sqlite.c file. I added some comments that illustrate how a journal table, "Events", will be filled.
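To make the journal idea a bit more concrete, here is a minimal sketch in plain C of what "filling" and periodically flushing such a journal amounts to. Everything in it is an assumption for illustration: the names Event, Journal, journal_record, journal_mark_all and journal_purge_handled are hypothetical and don't exist in tracker, and a fixed-size array stands in for whatever storage we end up using. The being_handled flag mirrors the beinghandled column from the periodic handler in the quoted mail further down.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names throughout; none of this exists in tracker. */
typedef enum { EVENT_CREATE, EVENT_DELETE, EVENT_UPDATE } EventType;

typedef struct {
	unsigned int service_id;    /* Services.ID of the affected item */
	EventType    type;
	int          being_handled; /* set once the periodic handler picks it up */
} Event;

#define JOURNAL_MAX 1024

typedef struct {
	Event  items[JOURNAL_MAX];
	size_t len;
} Journal;

/* What each "XESAM TODO" spot in the diff would do: append one record.
 * Returns 0 on success, -1 when the journal is full. */
int
journal_record (Journal *j, unsigned int service_id, EventType type)
{
	assert (j != NULL);

	if (j->len == JOURNAL_MAX)
		return -1;

	j->items[j->len].service_id = service_id;
	j->items[j->len].type = type;
	j->items[j->len].being_handled = 0;
	j->len++;

	return 0;
}

/* First short critical section: mark everything currently journaled. */
void
journal_mark_all (Journal *j)
{
	size_t i;

	assert (j != NULL);

	for (i = 0; i < j->len; i++)
		j->items[i].being_handled = 1;
}

/* Second short critical section: drop the marked records. Records that
 * were added after journal_mark_all() stay for the next round. */
void
journal_purge_handled (Journal *j)
{
	size_t i, kept = 0;

	assert (j != NULL);

	for (i = 0; i < j->len; i++) {
		if (!j->items[i].being_handled)
			j->items[kept++] = j->items[i];
	}

	j->len = kept;
}
```

The point of the flag is that the indexer can keep journaling while the live searches are being checked: only what was marked gets purged.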
Note that the table will most likely become an SQLite in-memory table. The reason I don't think a GHashTable in the C code is as good is that we want to repeat the query in the TrackerXesamLiveSearch on this "Events" table (for example with an INNER JOIN with Services). If it were a GHashTable, that query would either need a lot of OR clauses (one per ServiceID) or we'd need to do a query for each item in the table to check whether the items affect a live search.

/me is the master of pseudo code, here I go again!

For each query in live-search-queries do

  // This one sounds like the best to me. It requires an in-SQLite,
  // in-memory table called "Events"

  SELECT ... FROM Events, Services ...
  WHERE Events.ServiceID = Services.ID
  AND the live-search-query
  AND (ServiceID is in the table)

  // Pro: short arguments list, easy query
  // Con: JOIN (although the cartesian product is relatively small)

or

  // This one doesn't need an "Events" table in SQLite but does need an
  // in-C, in-memory GHashTable holding all the affected ServiceIDs

  SELECT ... FROM Services ...
  WHERE the live-search-query
  AND (
    ServiceID = hashtable[0].key OR
    ServiceID = hashtable[1].key OR
    ServiceID = hashtable[2].key OR
    ServiceID = hashtable[n].key ... )

  // Pro: no JOIN
  // Con: long arguments list

done


On Tue, 2008-04-29 at 17:56 +0200, Philip Van Hoof wrote:
> Pre note:
>
> This is about the Xesam support being done (since this week) in the
> indexer-split.
>
> About:
>
> Xesam requires notifying live searches about changes that affect them.
> We plan to implement this with a "events" table that journals all
> creates, deletes and updates that the indexer causes.
>
> Periodically we will handle and then flush the items in that events
> table.
>
> I made a cracktasty diagram that contains the from-a-high-distance
> abstract proposal that we have in mind for this.
>
>
> This is pseudo code that illustrates the periodic handler:
>
> bool periodic_handler (...)
>
> {
>
>   lock indexer
>   update eventstable set beinghandled=1 where 1=1 (all items)
>   unlock indexer
>
>   foreach query in all livequeries
>     added, modified, removed = query.execute-on (eventstable)
>     query.emit_added (added)
>     query.emit_removed (removed)
>     query.emit_modified (modified)
>   done
>
>   lock indexer
>   delete from eventstable where beinghandled = 1
>   unlock indexer
>
>   return (!stopping)
>
> }
>
>
> Here's a piece of IRC log between me and jamiecc about the proposal:
>
> pvanhoof ping jamiemcc
> pvanhoof same thing
> pvanhoof I'll make a pdf
> jamiemcc oh ok
> pvanhoof Sending
> pvanhoof ok
> pvanhoof so
> pvanhoof it's about the hitsadded, hitsremoved and hitsmodified signals for
> xesam
> pvanhoof What we have in mind is using a "events" table that is a journal for
> all creates, deletes and updates
> pvanhoof Periodically we will flush that table, each create (insert), update
> and each delete we add a record in that table
> pvanhoof We'll make sure the table is queryable in a similar fashion as how
> the Xesam query will execute
> pvanhoof In the periodical handler we'll for each live search check whether
> it got affected by the items in the events table
> pvanhoof In pseudo, the handler:
> jamiemcc sounds feasible
> pvanhoof gboolean periodic_handler (void data) {
> pvanhoof lock indexer
> pvanhoof update eventstable set beinghandled=1 where 1=1 (all items)
> pvanhoof unlock indexer
> pvanhoof foreach query in all live queries
> pvanhoof added, modified, removed = query.execute-on (eventstable)
> pvanhoof query.emit_added (added)
> pvanhoof query.emit_removed (removed)
> pvanhoof query.emit_modified (modified)
> pvanhoof done
> pvanhoof lock indexer
> pvanhoof delete from eventstable where beinghandled = 1
> pvanhoof unlock indexer
> pvanhoof }
> pvanhoof I've send you a diagram that you can look at as if it's a
> state-activity one, a ERD and a class diagram :) now how cool is that??
> :)
> pvanhoof it's just three columns, although the ERD is quite simplistic of
> course
> jamiemcc yeah just go tit
> * fritschy ([EMAIL PROTECTED]) has left #tracker
> pvanhoof so, the current idea is to adapt those stored procedures into
> transactions that will also add this record to the "events" table
> * fritschy ([EMAIL PROTECTED]) has joined #tracker
> pvanhoof Which might not be sufficient, and we kinda lack the in-depth
> know-how of all the db handling of tracker
> pvanhoof So that's a first issue we want to discuss with you
> pvanhoof The other is stopping the indexing, restarting it (locking it, in
> the pseudo code): what you think about that
> jamiemcc ok I will need to think about it - I iwll probably reply later
> tonight and we can discuss tomorrow
> pvanhoof I adapted my initial proposal to have two short critical sections
> rather than letting the entire periodic handler be one critical section
> pvanhoof that way the lock is smaller
> jamiemcc the indexer will be seaparte process so will need to be locked via
> dbus signals
> pvanhoof by just adding a column to the events table
> pvanhoof yes but I guess we want any such locking to be short
> jamiemcc well yes
> pvanhoof then once the items that are to be handled are identified, we for
> each live-search check whether the live-search is affected
> pvanhoof and we perform the necessary hitsadded, hitsremoved and hitsmodified
> signals if needed
> pvanhoof if all is done, we simply purge the handled items from the events
> table
> jamiemcc the query results will be store din temp tables
> pvanhoof which is the second location where we want the indexer to be
> locked-out
> jamiemcc remember a query may be a cursor so wont include entire result set
> pvanhoof No okay, but that's something the check needs to worry about
> pvanhoof so ottela is working on a query for the live-search
> jamiemcc ok cool
> pvanhoof and if we only want to update if the client has the affected item
> visible, due to
> cursor-usage
> pvanhoof then i guess we'll somehow need to get that info into trackerd
> jamiemcc any reason we dont store whats change din memory rather than sqlite
> table?
> pvanhoof oh, that's abstract right now
> jamiemcc o
> jamiemcc ok
> pvanhoof "tracker's event table" can also be a hashtable for me ..
> jamiemcc yeah fine
> pvanhoof implementation detail
> pvanhoof since it doesn't need to be persistent ...
> pvanhoof difference is that either we use a memory table and still a
> transaction for the three stored procedures
> pvanhoof or we adapt code
> jamiemcc prefer hashtable as amount of data will be small
> jamiemcc can even be a list
> pvanhoof ok, your comments/ideas on this would of course be very useful btw
> jamiemcc yeah I will think about it more tonight and get back to you
> pvanhoof sounds great
> pvanhoof I'll make a mail about this to the mailing list? or I await your
> ideas tomorrow?
> pvanhoof I'll just wait for now
> jamiemcc you cna mail if you like
> jamiemcc I will reply to it
>
>
> _______________________________________________
> tracker-list mailing list
> tracker-list@gnome.org
> http://mail.gnome.org/mailman/listinfo/tracker-list

-- 
Philip Van Hoof, freelance software developer
home: me at pvanhoof dot be
gnome: pvanhoof at gnome dot org
http://pvanhoof.be/blog
http://codeminded.be

Index: src/trackerd/tracker-db-sqlite.c
===================================================================
--- src/trackerd/tracker-db-sqlite.c	(revision 1330)
+++ src/trackerd/tracker-db-sqlite.c	(working copy)
@@ -3346,6 +3346,9 @@
 	}
 
 	id = tracker_db_interface_sqlite_get_last_insert_id (TRACKER_DB_INTERFACE_SQLITE (db_con->db));
+	// XESAM TODO
+	// INSERT INTO Events (ServiceID, ..., EventType) VALUES (sid, ..., 'Create')
+
 	if (info->is_hidden) {
 		tracker_db_exec_no_reply (db_con,
 					  "Update services set Enabled = 0 where ID = %d",
@@ -3549,6 +3552,9 @@
 	tracker_exec_proc (db_con->common, "DeleteService7", path, name, NULL);
 	tracker_exec_proc (db_con->common, "DeleteService9", path, name, NULL);
 
+	// XESAM TODO"
+	// INSERT INTO Events (ServiceID, ..., EventType) VALUES (str_file_id, ..., 'Delete')
+
 	g_free (name);
 	g_free (path);
 }
@@ -3637,8 +3643,16 @@
 	name = g_path_get_basename (info->uri);
 	path = g_path_get_dirname (info->uri);
 
+	// Comment by Philip Van Hoof:
+	// Please verify that str_service_type_id must be the first argument.
+	// Reading the file sqlite-stored-procs.sql this doesn't seem to be
+	// true
+
 	tracker_exec_proc (db_con->index, "UpdateFile", str_service_type_id, path, name, info->mime, str_size, str_mtime, str_offset, str_file_id, NULL);
-
+
+	// XESAM TODO:
+	// INSERT INTO Events (ServiceID, ..., EventType) VALUES (str_file_id, ..., 'Update')
+
 	g_free (str_service_type_id);
 	g_free (str_size);
 	g_free (str_offset);
@@ -4248,6 +4262,9 @@
 	/* update db so that fileID reflects new uri */
 	tracker_exec_proc (db_con, "UpdateFileMove", path, name, str_file_id, NULL);
 
+	// XESAM TODO:
+	// INSERT INTO Events (ServiceID, ..., EventType) VALUES (str_file_id, ..., 'Update')
+
 	/* update File:Path and File:Filename metadata */
 	tracker_db_set_single_metadata (db_con, "Files", str_file_id, "File:Path", path, FALSE);
 	tracker_db_set_single_metadata (db_con, "Files", str_file_id, "File:Name", name, FALSE);
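
To illustrate the "long arguments list" con of the second variant in the pseudo code at the top of this mail, here is a rough sketch of how the OR clause would have to be assembled in C for every live search. It is only a sketch under assumptions: build_service_id_clause and append are hypothetical helpers, and a plain array of IDs stands in for the keys of the GHashTable so that the example has no GLib dependency.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: appends s to buf at offset *used.
 * Returns 0 on success, -1 if s (plus the terminating NUL) does not fit. */
static int
append (char *buf, size_t size, size_t *used, const char *s)
{
	size_t len = strlen (s);

	if (len >= size - *used)
		return -1;

	memcpy (buf + *used, s, len + 1);
	*used += len;

	return 0;
}

/* Hypothetical helper: writes "(ServiceID = a OR ServiceID = b ...)" into
 * buf. The ids array stands in for the keys of the GHashTable.
 * Returns the length of the clause, or -1 if buf is too small. */
int
build_service_id_clause (char *buf, size_t size,
			 const unsigned int *ids, size_t n_ids)
{
	char   num[16]; /* enough for a 32-bit unsigned int in decimal */
	size_t used = 0;
	size_t i;

	assert (buf != NULL && ids != NULL && n_ids > 0);

	if (append (buf, size, &used, "(") < 0)
		return -1;

	for (i = 0; i < n_ids; i++) {
		snprintf (num, sizeof num, "%u", ids[i]);

		if ((i > 0 && append (buf, size, &used, " OR ") < 0) ||
		    append (buf, size, &used, "ServiceID = ") < 0 ||
		    append (buf, size, &used, num) < 0)
			return -1;
	}

	if (append (buf, size, &used, ")") < 0)
		return -1;

	return (int) used;
}
```

The clause grows linearly with the number of journaled items and has to be rebuilt and appended to each live-search query, whereas the "Events" table variant keeps the query text the same no matter how many items were journaled.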