On Sun, Jul 26, 2009 at 9:23 PM, Jay A. Kreibich<j...@kreibi.ch> wrote: > On Sat, Jul 25, 2009 at 09:26:16AM -0700, Jim Showalter scratched on the wall: >> You can have the tags in a separate table that has a foreign-key to >> the table with the rows in in you want to tag, > > That's essentially what the OP is doing, except they've built a > many-to-many relationship using a bridge table, rather than a > one-to-many using a simple foreign key. This allows any tag to be > associated with any row without duplicating any of the tags, allowing > for better data normalization.
Indeed. My query is working quite well for now. I do realize that it will not be efficient for very large datasets, but it is quite good for my needs. > >> If you need to search for >> multiple tags at a time, each requires a join: > > Right, and that's the problem. If you've got an application that > allows you to search for tags with an arbitrary number of tags, you > need a different query for each situation (one tag, two tags, etc.). > Also, adding two more JOINs (since you need to join through the > bridge table) for each additional search target means your performance > degrades very quickly when you've got four or five search tags. > > This is why I suggested looking at Relational division. Even if SQL > doesn't support it directly, there are other ways of doing it to make > the query more generic. With division, you can do something like (in the > terms of the OP's post): > > (foo JOIN foo_tag JOIN tag) / (search_tags) = foo with search_tags > > This basically builds a table of all foo:tag combinations and then > divides out our search tags, resulting in a list of foo.f_id values > that have an association with all of the tags found in search_tags. > You can then JOIN the results of this expression back to the tags. I gave relational division a gander. The text was a bit thick so my eyes started glazing after a while, but it piqued my curiosity sufficiently that I am going to go back to it and try and understand it better. > The OP asked to have the search tags removed from the results, which > is also a bit of a trick to do in a generic way. Yes, this is a "view" problem that is fairly easily solved in the application layer. Essentially, one wants to see all the tags for the result returned for a given tag... for example, for a given nationality, "US," show me all the ethnicities, but, since I already know that all the rows in the result are for nationality = "US," there is no point in displaying "US" along with the rows. Much easier to do this in the application logic. > > Anyways... The relational division makes any query search -- from > 1 to 1000 tags -- have the same structure. As long as the division > performance is good, the query performance shouldn't horribly > degrade as you add more search tags. > >> select t >> from tagged as t >> join t.tags as tag1 >> join t.tags as tag2 >> where tag1 = some value >> and tag2 = some value. > > This points in the direction of a one-to-many structure, only the > syntax is not making sense to me. JOINs work on tables, not columns, > and WHERE expressions work on columns, not tables. > > You're also only doing the first half of the problem, since you need > to join this back to the tags to get the result the OP was looking for > (including pulling back out the search targets). It's a trickier > problem than it first looks, especially if you want to do it in one > SQL statement with a minimal number of sub-SELECTs. > > -j > >> ----- Original Message ----- >> From: "Jay A. Kreibich" <j...@kreibi.ch> >> To: <punk...@eidesis.org>; "General Discussion of SQLite Database" >> <sqlite-users@sqlite.org> >> Sent: Friday, July 24, 2009 9:21 PM >> Subject: Re: [sqlite] a system for arbitrarily tagging rows in a table >> >> >> > On Fri, Jul 24, 2009 at 09:20:29PM -0500, P Kishor scratched on the >> > wall: >> >> I am trying to develop a "tagging" system, whereby each row in a >> >> table >> >> can be tagged with arbitrary number of tags. >> > >> > This smells of a Relational division problem. If you're dealing >> > with >> > tags you might want to have a look at that (Celko has a few good >> > articles on it). Since SQL lacks a native Relational division >> > operator, chances are a solution in that direction is going to be >> > more complex -- at least for this problem. But any time I've done >> > tags or attributes, sooner or later I find myself needing to do a >> > division. They come in handy any time you say "my data is vertical >> > but I need it horizontal." You might want to read up on them just >> > to >> > have that knowledge available. >> > >> > >> >> TABLE foo (f_id INTEGER PRIMARY KEY, f_name TEXT); >> >> TABLE tag (t_id INTEGER PRIMARY KEY, t_name TEXT); >> >> TABLE foo_tag (f_id INTEGER, t_id INTEGER); >> > >> > >> >> I have the following solution. Could I do better or differently? >> > >> > I'm not sure about "better", but here's different: >> > >> > sqlite> SELECT foo.f_name, tag.t_name >> > ...> FROM tag AS target >> > ...> NATURAL JOIN foo_tag AS target_ft >> > ...> NATURAL JOIN foo >> > ...> NATURAL JOIN foo_tag >> > ...> NATURAL JOIN tag >> > ...> WHERE target.t_name = 'bar' >> > ...> AND tag.t_id != target.t_id >> > ...> ORDER BY foo.f_name, tag.t_name; >> > >> > This basically folds your IN sub-select back into the main query. >> > We join "foo" to the tag table in two directions... one to find >> > the search target tag id and the other to produce the output. >> > >> > "tag AS target" with the first WHERE clause should return one row. >> > We join that through "foo_tag AS target_ft" to get a list of foo >> > ids >> > that have the search target tag. We then build the normal output >> > list by joining that back through the foo_tag bridge table to the >> > tags, and throw out any rows with an output tag id that matches the >> > search target tag id. >> > >> > Simple! >> > >> > Best of all, the target only appears in the query once, and your >> > name >> > convention means we can use NATURAL JOINs to keep things clean. It >> > also totally falls apart if you need to search on more than one >> > tag. >> > That's where Relational division comes in. >> > >> > -j (who had to draw a picture to get it right) > > -- > Jay A. Kreibich < J A Y @ K R E I B I.C H > > > "Our opponent is an alien starship packed with atomic bombs. We have > a protractor." "I'll go home and see if I can scrounge up a ruler > and a piece of string." --from Anathem by Neal Stephenson > _______________________________________________ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > -- Puneet Kishor http://www.punkish.org Carbon Model http://carbonmodel.org Charter Member, Open Source Geospatial Foundation http://www.osgeo.org Science Commons Fellow, http://sciencecommons.org/about/whoweare/kishor Nelson Institute, UW-Madison http://www.nelson.wisc.edu ----------------------------------------------------------------------- Assertions are politics; backing up assertions with evidence is science ======================================================================= Sent from Madison, WI, United States _______________________________________________ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users