Re: Unique items from all columns, very slow
On 4/21/05, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote: Willie Gnarlson [EMAIL PROTECTED] wrote on 04/20/2005 05:46:25 PM: On 4/20/05, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote: Willie Gnarlson [EMAIL PROTECTED] wrote on 04/20/2005 01:56:03 PM: On 4/20/05, Dan Nelson [EMAIL PROTECTED] wrote: In the last episode (Apr 20), Willie Gnarlson said: I have been attempting to uncover a fast method for retrieving unique items from every column in a table. The table has approximately 60 thousand rows. Each row has an index associated with it, and running 'explain select distinct class from dlist' shows that it isusing the index. However, since there are 24 columns, running that select query for each column takes about 4 long seconds on a P4 2.4 Ghz. snip - see previous posts on this thread for details Okay, they aren't creating new values using that form. The form I've been talking about is a form to limit results from the database to only specific values for specific fields without the user having to know from memory what the possible values are (plus, they'll change every so often). (...) If I read that last part correctly, you have already identified the need for a mechanism for managing the value lists separately from what the user enters. In this case you should have everything in place (or almost in place) to put each list into its own table (Someone else suggested this design, too). Right, yes. (...) Run your queries once to build your lists then use another form (or even use a separate application) to manage the lists. Because it's a one-time event, getting the first set of unique values can take all night if it needs to. What you need to be fast is the building of the select boxes. By having those lists ready-to-go on their own tables, it will be as fast as a sequential read from the disk (very fast) or a read from the disk cache (faster) or a read from the query cache (fastest). In any case, running the queries will no longer slow you down. It actually seems slower. The separate tables from a previous try look like this: CREATE TABLE `ET` ( `ET` double NOT NULL default '0', PRIMARY KEY (`ET`), ) ENGINE=MyISAM DEFAULT CHARSET=latin1; That table has 4781 records, but some have more (one has ~18 thousand). There are 42 column tables. Querying all of them took 5.58 seconds. -- Willie Gnarlson -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
Re: Unique items from all columns, very slow
On 4/21/05, Willie Gnarlson [EMAIL PROTECTED] wrote: (...) It actually seems slower. The separate tables from a previous try look like this: CREATE TABLE `ET` ( `ET` double NOT NULL default '0', PRIMARY KEY (`ET`), ) ENGINE=MyISAM DEFAULT CHARSET=latin1; That table has 4781 records, but some have more (one has ~18 thousand). There are 42 column tables. Querying all of them took 5.58 seconds. I forgot to mention that those tables *do* have unique items, too. -- Willie Gnarlson -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
Re: Unique items from all columns, very slow
On 4/21/05, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote: Willie Gnarlson [EMAIL PROTECTED] wrote on 04/21/2005 01:39:15 PM: On 4/21/05, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote: Willie Gnarlson [EMAIL PROTECTED] wrote on 04/20/2005 05:46:25 PM: On 4/20/05, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote: Willie Gnarlson [EMAIL PROTECTED] wrote on 04/20/2005 01:56:03 PM: On 4/20/05, Dan Nelson [EMAIL PROTECTED] wrote: In the last episode (Apr 20), Willie Gnarlson said: I have been attempting to uncover a fast method for retrieving unique items from every column in a table. The table has approximately 60 thousand rows. Each row has an index associated with it, and running 'explain select distinct class from dlist' shows that it isusing the index. However, since there are 24 columns, running that select query for each column takes about 4 long seconds on a P4 2.4 Ghz. snip - see previous posts on this thread for details Okay, they aren't creating new values using that form. The form I've been talking about is a form to limit results from the database to only specific values for specific fields without the user having to know from memory what the possible values are (plus, they'll change every so often). (...) If I read that last part correctly, you have already identified the need for a mechanism for managing the value lists separately from what the user enters. In this case you should have everything in place (or almost in place) to put each list into its own table (Someone else suggested this design, too). Right, yes. (...) Run your queries once to build your lists then use another form (or even use a separate application) to manage the lists. Because it's a one-time event, getting the first set of unique values can take all night if it needs to. What you need to be fast is the building of the select boxes. By having those lists ready-to-go on their own tables, it will be as fast as a sequential read from the disk (very fast) or a read from the disk cache (faster) or a read from the query cache (fastest). In any case, running the queries will no longer slow you down. It actually seems slower. The separate tables from a previous try look like this: CREATE TABLE `ET` ( `ET` double NOT NULL default '0', PRIMARY KEY (`ET`), ) ENGINE=MyISAM DEFAULT CHARSET=latin1; That table has 4781 records, but some have more (one has ~18 thousand). There are 42 column tables. Querying all of them took 5.58 seconds. -- Willie Gnarlson I am SO confused!!! (and that's hard to do) You have a lookup list with eighteen thousand entries in it? Your users must pick 1 of 4781 floating point numbers from a select box? What kind of information are you trying to work with? I cannot imagine an application that uses data like you are describing. Can you help a poor fellow out and let us in on what you are working on? LOL! It sounds a little insane, I'll agree. Actually that table example I provided holds Elapsed Times. Many, many entries can contain the same ET. Here's a question, it may have taken 5.58 seconds for the first pass through all 42 tables but how fast was the second pass? Was that 5.58 measuring just the queries or the time it took to build the select boxes, No, I singled out the code up until the end of the queries, and only that is 5.58 seconds. Yipes, the CPU isn't under any real load now, previously it was. Sorry. It's still too slow I feel. I thought the caching might have helped, but apparently: 1 pass: 3.35 seconds 2 pass: 3.64 seconds (!?) 3 pass: 3.36 seconds too? As far as performance goes, It may be faster to check the user's entries during the form validation/processing phase than it will be to force them to enter the information the correct way the first time. I was trying to set it up a series of select boxes for limiting a search. I may have to re-think this if I can't get the data from the tables fast enough. -- Willie Gnarlson -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
Re: Unique items from all columns, very slow
On 4/20/05, Dan Nelson [EMAIL PROTECTED] wrote: In the last episode (Apr 20), Willie Gnarlson said: I have been attempting to uncover a fast method for retrieving unique items from every column in a table. The table has approximately 60 thousand rows. Each row has an index associated with it, and running 'explain select distinct class from dlist' shows that it is using the index. However, since there are 24 columns, running that select query for each column takes about 4 long seconds on a P4 2.4 Ghz. If there are a significant number of identical values (ages, sizes, etc), then setting PACK_KEYS=1 on the table may help by making smaller There are, but PACK_KEYS=1 seemed to shed about a second (just guessing, but it did seem only slightly faster) off the total query time. indexes. You might also be able to convert some columns to ENUMs, which will make their indexes even smaller (and let your SELECT be almost instantaneous if mysql knows how to optimize it right :) One column was already ENUM, but it wouldn't be possible to make any others that type. Another alternative would be to build a table for each column, holding the output of SELECT DISTINCT(column) FROM dlist. Rebuild it as frequently as needed. Okay, I decided to try that option, as it sounded like a good one. I created the tables, and since the main table would be rarely updated I was going to run DELETE FROM on the column tables and run the queries `INSERT INTO table (col) SELECT DISTINCT col FROM dlist' each time a new record was updated or added. It turns out that is almost as slow as the original method. So then I added an index to every column table, and it was still no better. I thought maybe generating a select box for each column's contents was the problem, since unique items in some tables are as much as 4 thousand, but singling out the code which performs the queries, eliminating the select box generation, confirmed that the queries were indeed the slow part. I may have to abandon the retrieving of unique items from every column in a table unless you or anyone else have any other ideas. Thanks for your help, Dan. -- Willie Gnarlson -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
Re: Unique items from all columns, very slow
On 4/20/05, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote: Willie Gnarlson [EMAIL PROTECTED] wrote on 04/20/2005 01:56:03 PM: On 4/20/05, Dan Nelson [EMAIL PROTECTED] wrote: In the last episode (Apr 20), Willie Gnarlson said: I have been attempting to uncover a fast method for retrieving unique items from every column in a table. The table has approximately 60 thousand rows. Each row has an index associated with it, and running 'explain select distinct class from dlist' shows that it is using the index. However, since there are 24 columns, running that select query for each column takes about 4 long seconds on a P4 2.4 Ghz. If there are a significant number of identical values (ages, sizes, etc), then setting PACK_KEYS=1 on the table may help by making smaller There are, but PACK_KEYS=1 seemed to shed about a second (just guessing, but it did seem only slightly faster) off the total query time. indexes. You might also be able to convert some columns to ENUMs, which will make their indexes even smaller (and let your SELECT be almost instantaneous if mysql knows how to optimize it right :) One column was already ENUM, but it wouldn't be possible to make any others that type. Another alternative would be to build a table for each column, holding the output of SELECT DISTINCT(column) FROM dlist. Rebuild it as frequently as needed. Okay, I decided to try that option, as it sounded like a good one. I created the tables, and since the main table would be rarely updated I was going to run DELETE FROM on the column tables and run the queries `INSERT INTO table (col) SELECT DISTINCT col FROM dlist' each time a new record was updated or added. It turns out that is almost as slow as the original method. So then I added an index to every column table, and it was still no better. I thought maybe generating a select box for each column's contents was the problem, since unique items in some tables are as much as 4 thousand, but singling out the code which performs the queries, eliminating the select box generation, confirmed that the queries were indeed the slow part. I may have to abandon the retrieving of unique items from every column in a table unless you or anyone else have any other ideas. (...) If you are using select boxes for every column, how can you possibly worry about new or changed values as the users must pick from a fixed list of values, don't they? I am building (or attempting tobuild) the select boxes on the fly based on unique items from columns in the table. You don't rebuild _all_ of your umpteen pick-list tables from scratch every time, detect just the new or changed values and add/update them (which goes Yes, that would be the smarter thing to do, you're right. But alas that can be set aside for my testing so far. At this point I'm simply trying to get the values from the columns in a reasonable amount of time. back to my original question of how are the users creating new values if you are really using a select box and not something else like a combo box (part list, part text field)) Okay, they aren't creating new values using that form. The form I've been talking about is a form to limit results from the database to only specific values for specific fields without the user having to know from memory what the possible values are (plus, they'll change every so often). Maybe if I understood your front-end application a little bit better, I could help you work around the issue. The way you described your user interface design made me wince; something just doesn't feel right about it. No, it's not as bad as it sounds. :-) (I hope) -- Willie Gnarlson -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
Unique items from all columns, very slow
Hello, I have been attempting to uncover a fast method for retrieving unique items from every column in a table. The table has approximately 60 thousand rows. Each row has an index associated with it, and running 'explain select distinct class from dlist' shows that it is using the index. However, since there are 24 columns, running that select query for each column takes about 4 long seconds on a P4 2.4 Ghz. I've been unable to find any suitable leads through google or the list archives. Any cluebats will be gleefully appreciated. Thanks for reading! -- Willie Gnarlson -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]