Re: are my queries bloated?
On Saturday, December 8, 2001, at 12:53 PM, Arjen G. Lentz wrote: Re-order the list of tables in the FROM, and put the subprojectweb.subproject_name = 'shoeshine.com' bit first after the WHERE, that will also make it clearer for you to read. Is this just for my own personal clarity? I was under the impression that the exact order of the JOINs wouldn't matter very much, but I haven't found any evidence of this yet. Join order *does* matter, the table with the search criteria should generally be checked first, most limiting the number of rows. Then the other tables are joined into the result using their foreign keys. You don't want te query to be performed based on the foreign keys, with the search criteria being applied last! I didn't realize that the order of JOIN statements mattered... For a regular join with commas, the optimiser will try to work out an optimal join order. However, you will need to have a look at the output from EXPLAIN, which will show the tables in the order they are joined. If it is non-optimial, you could modify your query by using STRAIGHT_JOIN or other tricks to get an optimal join order. ... but now I feel like I have a better grasp of how it works. I guess the best thing to do is just examine the results of the EXPLAIN. Thanks very much for responding to my questions about this, Arjen. You're quite welcome. This kind of stuff (optimising) is an important subject in MySQL training courses (www.mysql.com/training/). I wish! At some point in the future my organization might grant me that kind of benefit, but for the time being I'm just an office temp who is having a hard enough time trying to justify his project!! Basically, I proposed this project because I wanted to learn more about database-driven web applications using MySQL and PHP and because my organization needed a way to keep track of projects and image files. I really am hoping to take this to a new level eventually, a sort of web-based workspace that will include message boards and other functionality to keep all of the users in touch. I would love to take a class on this subject! -- Erik - Before posting, please check: http://www.mysql.com/manual.php (the manual) http://lists.mysql.com/ (the list archive) To request this thread, e-mail [EMAIL PROTECTED] To unsubscribe, e-mail [EMAIL PROTECTED] Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php
Re: are my queries bloated?
Hi Erik, - Original Message - From: Erik Price [EMAIL PROTECTED] I see. If I mentally apply this advice to the rest of my database, it seems that I should index (or apply a prefix index) to any column that holds search criteria -- such as files.file_name or subprojectweb.subproject_name.I would assume that I should really only do this with the most-commonly-searched columns, because indexing every column would be unwieldy. As well as useless: the server can only use 1 index per table for any particular query. A good rule would be: find the most limiting factor in your search criteria, a field that is tested for something that will most limit the number of rows returned. Perhaps I should just go with what I have, and once the database (which will only have 30 users or so for now) is cooking, I can go in and analyze the queries and the optimizer and determine where best to place the indexes. In theory that is fine. However, when your user numbers grow, if you find later that you want to redesign some table structure (beyond simply adding indexes) for better results, it will be more tricky since it involves copying the db and putting new scripts into place. So it IS quite important to get the basic design right before your userbase starts to grow. For now, the only indexes I have are the PRIMARY KEYs on the *_id columns (for unique ID numbers) and the UNIQUE INDEX on the middle table (foreign key) between files and projects. Well, as noted, you need to verify that an index is actually being used, by looking at EXPLAIN SELECT In case of multiple indexes on a table, you'll also want check that the best one has been chosen. I read in my book that one way to do this testing is to just reconstruct a table using CREATE TEMPORARY TABLE and go ahead and make changes to this temporary table, and anything that seems to run faster can be later applied to the actual table. Sure. But adding/removing an index on a small data set is easy enough, you can try that kind of stuff on the main tables. Re-order the list of tables in the FROM, and put the subprojectweb.subproject_name = 'shoeshine.com' bit first after the WHERE, that will also make it clearer for you to read. Is this just for my own personal clarity? I was under the impression that the exact order of the JOINs wouldn't matter very much, but I haven't found any evidence of this yet. Join order *does* matter, the table with the search criteria should generally be checked first, most limiting the number of rows. Then the other tables are joined into the result using their foreign keys. You don't want te query to be performed based on the foreign keys, with the search criteria being applied last! For a regular join with commas, the optimiser will try to work out an optimal join order. However, you will need to have a look at the output from EXPLAIN, which will show the tables in the order they are joined. If it is non-optimial, you could modify your query by using STRAIGHT_JOIN or other tricks to get an optimal join order. Thanks very much for responding to my questions about this, Arjen. You're quite welcome. This kind of stuff (optimising) is an important subject in MySQL training courses (www.mysql.com/training/). Regards, Arjen. -- MySQL Training Worldwide, http://www.mysql.com/training/ __ ___ ___ __ / |/ /_ __/ __/ __ \/ /Mr. Arjen G. Lentz [EMAIL PROTECTED] / /|_/ / // /\ \/ /_/ / /__ MySQL AB, Technical Writer, Trainer /_/ /_/\_, /___/\___\_\___/ Brisbane, QLD Australia ___/ www.mysql.com - Before posting, please check: http://www.mysql.com/manual.php (the manual) http://lists.mysql.com/ (the list archive) To request this thread, e-mail [EMAIL PROTECTED] To unsubscribe, e-mail [EMAIL PROTECTED] Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php
Re: are my queries bloated?
http://www.tums.com On Friday, December 7, 2001, at 06:58 AM, Erik Price wrote: On Thursday, December 6, 2001, at 08:35 PM, Arjen G. Lentz wrote: There is also the one part that limits your selection (shoeshine.com). So that is very important. I've quoted the table structure of the subprojectweb table above, and the 'subproject_name' field is not indexed. Therefore the server will have to do a table scan on the subprojectweb table, to find the rows that match the specified name. So, add an index on that field, or at least a prefix. You don't need to index all 64 chars of it. I see. If I mentally apply this advice to the rest of my database, it seems that I should index (or apply a prefix index) to any column that holds search criteria -- such as files.file_name or subprojectweb.subproject_name.I would assume that I should really only do this with the most-commonly-searched columns, because indexing every column would be unwieldy. Perhaps I should just go with what I have, and once the database (which will only have 30 users or so for now) is cooking, I can go in and analyze the queries and the optimizer and determine where best to place the indexes. For now, the only indexes I have are the PRIMARY KEYs on the *_id columns (for unique ID numbers) and the UNIQUE INDEX on the middle table (foreign key) between files and projects. I read in my book that one way to do this testing is to just reconstruct a table using CREATE TEMPORARY TABLE and go ahead and make changes to this temporary table, and anything that seems to run faster can be later applied to the actual table. Re-order the list of tables in the FROM, and put the subprojectweb.subproject_name = 'shoeshine.com' bit first after the WHERE, that will also make it clearer for you to read. Is this just for my own personal clarity? I was under the impression that the exact order of the JOINs wouldn't matter very much, but I haven't found any evidence of this yet. Thanks very much for responding to my questions about this, Arjen. Erik - Before posting, please check: http://www.mysql.com/manual.php (the manual) http://lists.mysql.com/ (the list archive) To request this thread, e-mail [EMAIL PROTECTED] To unsubscribe, e-mail mysql-unsubscribe- [EMAIL PROTECTED] Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php --- René Fournier [EMAIL PROTECTED] - Before posting, please check: http://www.mysql.com/manual.php (the manual) http://lists.mysql.com/ (the list archive) To request this thread, e-mail [EMAIL PROTECTED] To unsubscribe, e-mail [EMAIL PROTECTED] Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php
Re: are my queries bloated?
I haven't populated this database yet. There are actually many more columns, this was kind of a rough draft -- I only featured the ones that I needed to see if my JOINs were okay. I just got a little nervous when I drafted a few sample queries and saw that much text !! That makes me feel much relieved, thanks Rodney. I suppose there's nothing wrong after all but, it seemed like a lot of joins (more than I see in other examples). I guess I will get started with the PHP !!! -- Erik On Thursday, December 6, 2001, at 06:05 PM, Rodney Broom wrote: What you've got looks fine to me. Joins like this are not uncommon at all. I wonder, where is your concern with this? Is it in how much text you've typed to create a query, or is the query itself actually running slowly? - Before posting, please check: http://www.mysql.com/manual.php (the manual) http://lists.mysql.com/ (the list archive) To request this thread, e-mail [EMAIL PROTECTED] To unsubscribe, e-mail [EMAIL PROTECTED] Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php
Re: are my queries bloated?
Hi Erik, - Original Message - From: Erik Price [EMAIL PROTECTED] mysql SHOW COLUMNS FROM subprojectweb; +-+---+--+-+-+ | Field | Type | Null | Key | Default | +-+---+--+-+-+ | subproject_id | mediumint(8) unsigned | | PRI | NULL| | subproject_name | varchar(64) | | | | +-+---+--+-+-+ 2 rows in set (0.01 sec) So if I want to find out which files were used in a web subproject called shoeshine.com, do I really use the following query? SELECT files.file_name FROM files, projfile, projects, subprojectweb WHERE files.file_id = projfile.file_id AND projfile.project_id = projects.project_id AND projects.subprojectweb_id = subprojectweb.subproject_id AND subprojectweb.subproject_name = 'shoeshine.com' ; This is a HUGE query. Well, maybe it's not -- maybe there are lots of queries this big. The point is, I rarely see reference to a query this big on this list or in my book. Is this the norm? Or am I doing something weird, am I not getting this relational database concept? Well, books give examples that need to be clear. A 3 or 4-way join is theoretically the same as a 2-way join, the same 'rules' apply. The above query is actually quite small ;-) There is a prob with the above query though, in that it will turn out to be very slow when your tables grow. When you look at the WHERE clause, most of it just deals with the table relationships. That's fine. There is also the one part that limits your selection (shoeshine.com). So that is very important. I've quoted the table structure of the subprojectweb table above, and the 'subproject_name' field is not indexed. Therefore the server will have to do a table scan on the subprojectweb table, to find the rows that match the specified name. So, add an index on that field, or at least a prefix. You don't need to index all 64 chars of it. Re-order the list of tables in the FROM, and put the subprojectweb.subproject_name = 'shoeshine.com' bit first after the WHERE, that will also make it clearer for you to read. Then, run the query with EXPLAIN in front of the SELECT, and look at the output. With the EXPLAIN command, the server will tell you how the optimiser has organised query execution: in which order will the tables be joined, and which indexes can and will be used. You'll want to set up your query so that the most limiting stuff is done first, and then the other tables are just used based on their primary key, one for each matching row in the first table. For this type of query, that'd be the fastest route. If that's not what happens (see EXPLAIN) you can rephrase your query so that it is (for instance by using STRAIGHT_JOIN). There's quite a bit of background info involved in the above, which I can't possibly post in a short message. So, it's just some things to point you in the right direction By the way, this is the kind of cool stuff you learn as part of a MySQL training course (www.mysql.com/training/), the bit we call Make The Dolphin Fly ;-) Regards, Arjen. -- MySQL Training Worldwide, http://www.mysql.com/training/ __ ___ ___ __ / |/ /_ __/ __/ __ \/ /Mr. Arjen G. Lentz [EMAIL PROTECTED] / /|_/ / // /\ \/ /_/ / /__ MySQL AB, Technical Writer, Trainer /_/ /_/\_, /___/\___\_\___/ Brisbane, QLD Australia ___/ www.mysql.com - Before posting, please check: http://www.mysql.com/manual.php (the manual) http://lists.mysql.com/ (the list archive) To request this thread, e-mail [EMAIL PROTECTED] To unsubscribe, e-mail [EMAIL PROTECTED] Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php