On Thu, 2010-08-12 at 14:57 -0400, Jamie McCracken wrote:
> On Thu, 2010-08-12 at 14:54 -0400, Jamie McCracken wrote:
> > your proposal sounds fine - what are you complaining about?
> >
> > Only one thing stands out - direct access. You will surely need IPC to
> > signal changes made by a direct access user as well as to receive them
>
> I should elaborate
>
> If I insert a new record via direct access, how will other users be
> informed? Note I might not know the ID of the inserted item or will the
> system be smart enough to work out what signals and IDs to send from the
> SPARQL and do it automatically behind the scenes?

(again) Direct-access can't insert new records, it's read-only (SELECT
only).

With FD-passing (on the Steroids D-Bus object) and the traditional
D-Bus Resources object you have the UpdateBlank() APIs, which return
the URLs of anonymous blank nodes. When you pass one of those to
tracker:id(string subject) in a SPARQL query, you'll get the ID.

Not sure if that's what you meant?

Cheers,

Philip

> >
> > jamie
> >
> > On Thu, 2010-08-12 at 20:43 +0200, Philip Van Hoof wrote:
> > > Come on guys,
> > >
> > > I know I'm a natural born pessimist, and I know I shouldn't be. But
> > > still, there must be *something* wrong about this proposal?!
> > >
> > > Nobody is commenting at all? You know, the idea of posting it here is to
> > > get some discussion going "before" I implement it ;-)
> > >
> > > Ping, everybody!
> > >
> > > Cheers,
> > >
> > > Philip
> > >
> > >
> > > On Thu, 2010-08-12 at 15:03 +0200, Philip Van Hoof wrote:
> > > > A new class signal for Tracker
> > > >
> > > > Today's situation
> > > >
> > > > Today we have a simple signal system that causes quite a bit of
> > > > overhead which we over time tried to reduce. The overhead comes from:
> > > >      A. Having to store the URIs of the resources involved in a
> > > >         changeset in tracker-store's memory;
> > > >      B. Having to store the predicates involved in a changeset in
> > > >         tracker-store's memory (although far less severe than A);
> > > >      C. Having to UTF-8 validate the strings when we emit them over
> > > >         D-Bus (D-Bus does this implicitly);
> > > >      D. D-Bus's own copying and handling of string data;
> > > >      E. Heavy traffic on D-Bus;
> > > >      F. Context switching between tracker-store and dbus-daemon;
> > > >      G. We have to wait with turning on the D-Bus objects until after
> > > >         we have the latest ontology. So after journal replay. And we
> > > >         need to reset the situation after a backup restore. Complex!
> > > > Besides this overhead there are problems the consumers have too.
> > > > I'll make a list in the next section.
> > > >
> > > > Problems of today's signal
> > > >      1. Aforementioned overhead: consumes a lot of D-Bus traffic. This
> > > >         is caused by sending over URLs for the subjects and the
> > > >         predicates;
> > > >      2. Doesn't make it possible, in case of a delete of <a>, to know
> > > >         <b> in <a> nfo:isLogicalPartOf <b>, as <a> is removed at the
> > > >         point of signal emission;
> > > >      3. Round trips to know the literals create more D-Bus traffic;
> > > >      4. Transactional changes can't be reliably identified with
> > > >         SubjectsAdded, SubjectsChanged and SubjectsRemoved being
> > > >         separate signals;
> > > >      5. A lot of D-Bus objects, instead of letting clients use D-Bus's
> > > >         filtering system.
> > > >
> > > > The drive for a solution
> > > >
> > > > Jürg Billeter and I brainstormed a bit about all these problems. Over
> > > > the last few months, while optimizing tracker-store's INSERT
> > > > performance and memory utilization, we brainstormed a lot about how
> > > > we could reduce the overhead. I believe we have a good idea of the
> > > > current situation, its internal problems and our current solution
> > > > (hey of course, we implemented it :p).
> > > >
> > > > We also gained know-how about most of the problems consumers have from
> > > > the maintainer of libqttracker, Petteri Iridian Kiiskinen. Thanks
> > > > Iridian!
> > > >
> > > > Today I believe that we must abandon the old ship, redo the signal
> > > > system, break the API. Break it all. Get over it, heal our wounds.
> > > > Even if that means taking the stress away from all sorts of people
> > > > who've been using the old signal system, offering massages, giving out
> > > > sauna coupons. You know, the usual stuff that we won't do for real.
> > > > Although I'm sure that at a next code-camp in Helsinki we'll have a
> > > > good sauna to burn all our own stress away.
> > > >
> > > > Anyway ...
> > > > *shrug*
> > > >
> > > > A proposed solution
> > > >
> > > > Part one: Direct access
> > > > With direct-access we will reduce the round-trip cost of a query from
> > > > a consumer who wants a literal object involved in a changeset: it'll
> > > > be executed directly on meta.db; you won't use libsqlite's API yourself
> > > > but libtracker-sparql. However, libtracker-sparql is for direct-access
> > > > a layer on top of aforementioned libsqlite. The so-called "round-trip"
> > > > won't even involve IPC: by utilizing the TrackerSparqlCursor API,
> > > > you'll end up doing sqlite3_step() in your own process, directly on
> > > > meta.db.
> > > >
> > > > For the consumers of the signal, this removes 3.
> > > >
> > > > Part two: Sending IDs
> > > > A while ago we introduced the SPARQL function tracker:id(). The
> > > > tracker:id() function gives you a unique number that Tracker's RDF
> > > > store uses internally. It's not RDF, RDF uses subject URL strings. We
> > > > just convert this internally for performance reasons, and with
> > > > tracker:id() you can access that.
> > > >
> > > > Each resource, each class and each predicate (latter two are resources
> > > > like any other) has such a unique internal ID.
> > > >
> > > > Given that Tracker's class signal system isn't RDF anyway, we decided
> > > > not to give you subject URL strings in it anymore. Instead, we'll give
> > > > you these integer IDs.
> > > >
> > > > This for us removes A, B, C, D and E. For the consumers of the signal,
> > > > this removes 1. Whoohoo!
> > > >
> > > > Part three: Combine SubjectsAdded and SubjectsChanged, and put
> > > > SubjectsRemoved in the same signal
> > > > So we give you two arrays: Inserts and Deletes.
> > > >
> > > > For consumers of the signal, this removes 4.
> > > >
> > > > Part five: Add the class name to the signal
> > > > This allows you to use a string filter on your signal subscription in
> > > > D-Bus.
> > > >
> > > > For us this removes G.
> > > > For consumers of the signal, this removes 5.
> > > >
> > > > Part six: Pass the object-id for resource objects
> > > > You'll get a third number in the Inserts and Deletes arrays:
> > > > object-id. We won't send you object literals, although for integral
> > > > objects we're still discussing this. But for resource objects we can
> > > > without much extra cost give you the object-id.
> > > >
> > > > For consumers of the signal, this removes 2. Whoohoo (this was a hard
> > > > one)!
> > > >
> > > > Part seven: SPARQL IN, tracker:id() and tracker:subject()
> > > > We recently added support for SPARQL IN, we already have tracker:id()
> > > > and we'll implement tracker:subject().
> > > >
> > > > This makes things like this possible:
> > > >
> > > >     SELECT ?t { ?r nie:title ?t .
> > > >                 FILTER (tracker:id(?r) IN (800, 801, 802, 807)) }
> > > >
> > > > Where 800, 801, 802 and 807 will be the IDs that you receive in the
> > > > class signal.
> > > >
> > > > The tracker:subject() SPARQL function will allow you to make a very
> > > > fast version of this:
> > > >
> > > >     SELECT ?s { ?s a rdfs:Resource .
> > > >                 FILTER (tracker:id(?s) IN (800)) }
> > > >
> > > > So it would be something like ... (not sure that you can omit { } in
> > > > SPARQL, though):
> > > >
> > > >     SELECT tracker:subject (800)
> > > >
> > > > For consumers this removes most of the burden introduced by IDs.
> > > > Consumers are also advised to keep a local Map<tracker:id(), subject>
> > > > to avoid a lot of SPARQL queries. Although with direct-access it might
> > > > be just fine.
> > > >
> > > > Part eight: What is left?
> > > >
> > > > What is left is context switching between tracker-store and
> > > > dbus-daemon, F. But that's our problem. We'll reduce it by grouping
> > > > transactions and signals together. It's mostly a problem on ARM
> > > > hardware, but yeah that's a major and important target platform for
> > > > us. We're on it, we will care about this!
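
Part seven's advice (put the IDs from the class signal into one IN
filter, and keep a local Map<tracker:id(), subject>) could look roughly
like this on the client side. It is only a sketch in Python, not
libtracker-sparql API: build_title_query, SubjectCache and the resolve
callback are made-up names, and resolve stands in for whatever query
(for example the planned tracker:subject()) turns an ID back into a
subject.

```python
from typing import Callable, Dict, Iterable


def build_title_query(ids: Iterable[int]) -> str:
    """Build the part-seven title query for a batch of IDs from the signal."""
    # int() makes sure only numbers end up in the query text.
    id_list = ", ".join(str(int(i)) for i in ids)
    return (
        "SELECT ?t { ?r nie:title ?t . "
        f"FILTER (tracker:id(?r) IN ({id_list})) }}"
    )


class SubjectCache:
    """Local Map<tracker:id(), subject>.

    `resolve` is a hypothetical hook supplied by the caller; it is asked
    at most once per ID until that ID is forgotten again.
    """

    def __init__(self, resolve: Callable[[int], str]) -> None:
        self._resolve = resolve
        self._subjects: Dict[int, str] = {}

    def subject(self, resource_id: int) -> str:
        # Only go to the store the first time an ID is seen.
        if resource_id not in self._subjects:
            self._subjects[resource_id] = self._resolve(resource_id)
        return self._subjects[resource_id]

    def forget(self, resource_id: int) -> None:
        # Drop an entry, e.g. once the caller knows the resource is gone.
        self._subjects.pop(resource_id, None)
```

With direct-access the resolve hook would be cheap anyway, so the cache
mostly pays off for consumers that still go over D-Bus.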
> > > >
> > > > Let's take a look!
> > > >
> > > > <node name="/org/freedesktop/Tracker1/Resources">
> > > >   <interface name="org.freedesktop.Tracker1.Resources.Class">
> > > >     <signal name="class-signal">
> > > >       <arg type="s" name="class-name" />
> > > >       <arg type="a(iii)" name="inserts" />
> > > >       <arg type="a(iii)" name="deletes" />
> > > >     </signal>
> > > >   </interface>
> > > > </node>
> > > >
> > > > Or in short: sa(iii)a(iii). Here's a bit of pseudo code how it'll look
> > > > clientside:
> > > >
> > > > void m_callback (cursor) {
> > > >   while (cursor.next()) {
> > > >     // With direct-access, these cursor.next() calls are
> > > >     // sqlite3_step() calls in your own process
> > > >     print ("title: %s", cursor.get_string ());
> > > >   }
> > > > }
> > > >
> > > > // Arguments in the same order as in the signal: inserts, then deletes
> > > > void on_signal (class_name, inserted, deleted) {
> > > >   string in_qry = "", qry;
> > > >   bool first = true;
> > > >
> > > >   foreach (insert in inserted) {
> > > >     if (insert.subject_id is_in (my_resources)) {
> > > >       if (!first) { in_qry += ", "; }
> > > >       in_qry += insert.subject_id;
> > > >       first = false;
> > > >     }
> > > >   }
> > > >
> > > >   // None of our resources were touched: don't send "IN ()"
> > > >   if (first) { return; }
> > > >
> > > >   qry = string.printf ("SELECT ?titles { ?r nie:title ?titles . " +
> > > >                        "FILTER (tracker:id(?r) IN (%s)) }", in_qry);
> > > >
> > > >   connection.query_async (qry, m_callback);
> > > > }
> > > >
> > > >
> > > > Cheers!
> > > > :-)
> > > >
> > > > Philip
> > > >
> > > >
> > > > --
> > > >
> > > >
> > > > Philip Van Hoof
> > > > phi...@codeminded.be
> > > > freelance software developer
> > > > Codeminded BVBA - http://codeminded.be
> > > > _______________________________________________
> > > > tracker-list mailing list
> > > > tracker-list@gnome.org
> > > > http://mail.gnome.org/mailman/listinfo/tracker-list