On Fri, Aug 22, 2003 at 09:03:55AM -0700, Steven Roussey wrote: > > All the indexes were single indexes, partly because I haven't > > yet made the effort to understand composite index. I guess it's > > time ;-). > > Oh. > > There are better places to start than this list. ;) The manual can be a > great starting place, and several people on this list have written books > about MySQL which are great for getting started. You can look at > Amazon.com, etc.
No, I have them all, I just...I guess haven't gotten around to them yet. I mistakenly figured that individually indexing everything in sight would do the trick. > > mysql> EXPLAIN SELECT cg.cw FROM cg,q,cit,sref > > -> WHERE cg.q_id = q.id AND q.cit_id = cit.id AND cit.sref_id = > sref.id > > -> AND cg.cw LIKE 't%' > > -> AND sref.cd > DATE_SUB(CURDATE(), INTERVAL 6 MONTH) > > -> ORDER BY cg.cw > > -> LIMIT 1000,10; > > You do a range on the fist and last table in the chain. :( > > It is best to normalize your table structure. I should have noticed that > upfront. The 'simple, well-indexed query' through me off. Normalizing is > a great thing to learn, and probably the first thing to understand after > how to do a SELECT and composite indexes. I wonder if anyone else is still reading :-/. While I don't claim to be an expert by any means, and while it may not look like it, this database _is_ highly normalized, or at least I think it is; I put a lot of effort into making it so. Let me just explain what it all does. The database holds lexical data for use in linguistic analysis. For the tables we're looking at now, it works like this: sref contains the bibliographic details for a single physical book or other text, along with some metadata about when the text was keyed, who read it, etc. cit contains the bibliographic details for a quotation taken from a text; there might be only one cit per sref if these details don't change (e.g. a novel), or there might be many cits per sref if they do (e.g. a collection of essays, each one written by a different author at a different date). q contains a single quotation. cg contains information about specific words. I believe this is normalized because each element appears only once and is linked to others as necessary, so that, for example, a single quotation reading "This is a foobarred, conformiferous structiform" is stored once in the database, and "foobarred", "conformiferous", and "structiform" are each stored independently, joined to the q table on the relevant id fields. So if you had the word "structiform" and wanted to see the quotation, you could retrive it by the q_id that's stored in cg. It's the same for the other elements; there's only one sref section per text, so if you're at any other point and want to get the date the text was read, you have to join your way up to sref to retrieve sref.cd. Though I haven't shown them yet, there are also some other tables, holding authors at the sref level, subjects at the sref level, and authors at the cit level; for any of these, there can be any number of elements (i.e. no subjects, one subject, ten subjects etc.), so I normalized them by putting them in to separate tables linked to sref.id or cit.id etc. as appropriate. So for the queries we've been discussing, I'm looking for all the words beginning with "t" that have been keyed in the last six months; the reason I'm getting from the first and last table in the chain should now be obvious. In reality, I wouldn't just be retrieving the word (the cg.cw), I'd be retrieving both the quote and bibliographic info in cit, and the (unshown) author related to cit. Other queries can involve any combination of these factors--searching based on the fulltext content of a q, on an author, on a work title (perhaps not shown), on subjects, and so on and so on. I have created (individual) indexes on all the id fields, and on any value used in a search. In almost, if not every case, a search will involve a particular value used for the search as well as the id fields of relevant tables; what that means for multiple queries, I'm now not sure. I'll stop there and try to answer the suggestions in your other message. Thanks again for all the time you've been spending. Best, Jesse -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED]