Re: Unique items from all columns, very slow

2005-04-21 Thread Willie Gnarlson
On 4/21/05, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote:
 Willie Gnarlson [EMAIL PROTECTED] wrote on 04/20/2005 05:46:25  PM:
   On 4/20/05, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote:
Willie Gnarlson [EMAIL PROTECTED] wrote on 04/20/2005
 01:56:03 PM:
  On 4/20/05, Dan Nelson [EMAIL PROTECTED] wrote:
   In the last episode (Apr 20), Willie Gnarlson said:
I have been attempting to uncover a fast method for retrieving
 unique
items from every column in a table. The table has approximately
 60
thousand rows. Each row has an index associated with it, and
 running
'explain select distinct class from dlist' shows that it isusing
 the
index.
   
However, since there are 24 columns, running that select query
 for
each column takes about 4 long seconds on a P4 2.4 Ghz.
   
  snip - see previous posts on this thread for details
   Okay, they aren't creating new values using that form. The form I've
   been talking about is a form to limit results from the database to
   only specific values for  specific fields without the user having to
   know from memory what the possible values are (plus, they'll change
   every so often).
 

(...)

  
 If I read that last part correctly, you have already identified the need for
 a mechanism for managing the value lists separately from what the user
 enters. In this case you should have everything in place (or almost in
 place) to put each list into its own table (Someone else suggested this
 design, too). 

Right, yes.
  
(...)
 Run your queries once to build your lists then use another form (or even use
 a separate application) to manage the lists. Because it's a one-time event,
 getting the first set of unique values can take all night if it needs to.
 What you need to be fast is the building of the select boxes. By having
 those lists ready-to-go on their own tables, it will be as fast as a
 sequential read from the disk (very fast) or a read from the disk cache
 (faster) or a read from the query cache (fastest). In any case, running the
 queries will no longer slow you down. 

It actually seems slower. The separate tables from a previous try look
like this:

CREATE TABLE `ET` (
  `ET` double NOT NULL default '0',
  PRIMARY KEY  (`ET`),
) ENGINE=MyISAM DEFAULT CHARSET=latin1;

That table has 4781 records, but some have more (one has ~18
thousand). There are 42 column tables.

Querying all of them took 5.58 seconds.

-- 
Willie Gnarlson

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: Unique items from all columns, very slow

2005-04-21 Thread Willie Gnarlson
On 4/21/05, Willie Gnarlson [EMAIL PROTECTED] wrote:

(...)

 It actually seems slower. The separate tables from a previous try look
 like this:
 
 CREATE TABLE `ET` (
   `ET` double NOT NULL default '0',
   PRIMARY KEY  (`ET`),
 ) ENGINE=MyISAM DEFAULT CHARSET=latin1;
 
 That table has 4781 records, but some have more (one has ~18
 thousand). There are 42 column tables.
 
 Querying all of them took 5.58 seconds.

I forgot to mention that those tables *do* have unique items, too. 

-- 
Willie Gnarlson

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: Unique items from all columns, very slow

2005-04-21 Thread Willie Gnarlson
On 4/21/05, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote:
 Willie Gnarlson [EMAIL PROTECTED] wrote on 04/21/2005 01:39:15
 PM:
  On 4/21/05, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote:
   Willie Gnarlson [EMAIL PROTECTED] wrote on 04/20/2005
  05:46:25  PM:
 On 4/20/05, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote:
  Willie Gnarlson [EMAIL PROTECTED] wrote on 04/20/2005
   01:56:03 PM:
On 4/20/05, Dan Nelson [EMAIL PROTECTED] wrote:
 In the last episode (Apr 20), Willie Gnarlson said:
  I have been attempting to uncover a fast method for
 retrieving
   unique
  items from every column in a table. The table has
 approximately
   60
  thousand rows. Each row has an index associated with it,
 and
   running
  'explain select distinct class from dlist' shows that it
 isusing
   the
  index.
 
  However, since there are 24 columns, running that select
 query
   for
  each column takes about 4 long seconds on a P4 2.4 Ghz.

snip - see previous posts on this thread for details
 Okay, they aren't creating new values using that form. The form
 I've
 been talking about is a form to limit results from the database to
 only specific values for  specific fields without the user having
 to
 know from memory what the possible values are (plus, they'll change
 every so often).

 
  (...)
 
  
   If I read that last part correctly, you have already identified the
 need for
   a mechanism for managing the value lists separately from what the user
   enters. In this case you should have everything in place (or almost in
   place) to put each list into its own table (Someone else suggested
 this
   design, too).
 
  Right, yes.
 
  (...)
   Run your queries once to build your lists then use another form (or
 even use
   a separate application) to manage the lists. Because it's a one-time
 event,
   getting the first set of unique values can take all night if it needs
 to.
   What you need to be fast is the building of the select boxes. By
 having
   those lists ready-to-go on their own tables, it will be as fast as a
   sequential read from the disk (very fast) or a read from the disk
 cache
   (faster) or a read from the query cache (fastest). In any case,
 running the
   queries will no longer slow you down.
 
  It actually seems slower. The separate tables from a previous try look
  like this:
 
  CREATE TABLE `ET` (
`ET` double NOT NULL default '0',
PRIMARY KEY  (`ET`),
  ) ENGINE=MyISAM DEFAULT CHARSET=latin1;
 
  That table has 4781 records, but some have more (one has ~18
  thousand). There are 42 column tables.
 
  Querying all of them took 5.58 seconds.
 
  --
  Willie Gnarlson
 
 I am SO confused!!! (and that's hard to do) You have a lookup list with
 eighteen thousand entries in it? Your users must pick 1 of 4781 floating
 point numbers from a select box? What kind of information are you trying
 to work with? I cannot imagine an application that uses data like you are
 describing. Can you help a poor fellow out and let us in on what you are
 working on?

LOL! 

It sounds a little insane, I'll agree. Actually that table example I
provided holds Elapsed Times. Many, many entries can contain the same
ET.

 Here's a question, it may have taken 5.58 seconds for the first pass
 through all 42 tables but how fast was the second pass? Was that 5.58
 measuring just the queries or the time it took to build the select boxes,

No, I singled out the code up until the end of the queries, and only
that is 5.58 seconds. Yipes, the CPU isn't under any real load now,
previously it was. Sorry. It's still too slow I feel.

I thought the caching might have helped, but apparently:

1 pass:  3.35 seconds
2 pass:  3.64 seconds (!?)
3 pass:  3.36 seconds

 too? As far as performance goes, It may be faster to check the user's
 entries during the form validation/processing phase than it will be to
 force them to enter the information the correct way the first time.

I was trying to set it up a series of select  boxes for limiting a
search. I may have to re-think this if I can't get the data from the
tables fast enough.

-- 
Willie Gnarlson

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: Unique items from all columns, very slow

2005-04-20 Thread Willie Gnarlson
On 4/20/05, Dan Nelson [EMAIL PROTECTED] wrote:
 In the last episode (Apr 20), Willie Gnarlson said:
  I have been attempting to uncover a fast method for retrieving unique
  items from every column in a table. The table has approximately 60
  thousand rows. Each row has an index associated with it, and running
  'explain select distinct class from dlist' shows that it is using the
  index.
 
  However, since there are 24 columns, running that select query for
  each column takes about 4 long seconds on a P4 2.4 Ghz.
 
 If there are a significant number of identical values (ages, sizes,
 etc), then setting PACK_KEYS=1 on the table may help by making smaller

There are, but PACK_KEYS=1 seemed to shed about a second (just
guessing, but it did seem only slightly faster) off the total query
time.

 indexes.  You might also be able to convert some columns to ENUMs,
 which will make their indexes even smaller (and let your SELECT be
 almost instantaneous if mysql knows how to optimize it right :)

One column was already ENUM, but it wouldn't be possible to make any
others that type.
 
 Another alternative would be to build a table for each column, holding
 the output of SELECT DISTINCT(column) FROM dlist.  Rebuild it as
 frequently as needed.

Okay, I decided to try that option, as it sounded like a good one. I
created the tables, and since the main table would be rarely updated I
was going to run DELETE FROM on the column tables and run the queries
`INSERT INTO table (col) SELECT DISTINCT col FROM dlist' each time a
new record was updated or added.

It turns out that is almost as slow as the original method. So then I
added an index to every column table, and it was still no better.

I thought maybe generating a select box for each column's contents was
the problem, since unique items in some tables are as much as 4
thousand, but singling out the code which performs the queries,
eliminating the select box generation, confirmed that the queries were
indeed the slow part.

I may have to abandon the retrieving of unique items from every column
in a table unless you or anyone else have any other ideas.
 
Thanks for your help, Dan.
 
-- 
Willie Gnarlson

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: Unique items from all columns, very slow

2005-04-20 Thread Willie Gnarlson
On 4/20/05, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote:
 Willie Gnarlson [EMAIL PROTECTED] wrote on 04/20/2005 01:56:03 PM:
   On 4/20/05, Dan Nelson [EMAIL PROTECTED] wrote:
In the last episode (Apr 20), Willie Gnarlson said:
 I have been attempting to uncover a fast method for retrieving unique
 items from every column in a table. The table has approximately 60
 thousand rows. Each row has an index associated with it, and running
 'explain select distinct class from dlist' shows that it is using the
 index.

 However, since there are 24 columns, running that select query for
 each column takes about 4 long seconds on a P4 2.4 Ghz.

If there are a significant number of identical values (ages, sizes,
etc), then setting PACK_KEYS=1 on the table may help by making smaller
   
   There are, but PACK_KEYS=1 seemed to shed about a second (just
   guessing, but it did seem only slightly faster) off the total query
   time.
   
indexes.  You might also be able to convert some columns to ENUMs,
which will make their indexes even smaller (and let your SELECT be
almost instantaneous if mysql knows how to optimize it right :)
   
   One column was already ENUM, but it wouldn't be possible to make any
   others that type.

Another alternative would be to build a table for each column, holding
the output of SELECT DISTINCT(column) FROM dlist.  Rebuild it as
frequently as needed.
   
   Okay, I decided to try that option, as it sounded like a good one. I
   created the tables, and since the main table would be rarely updated I
   was going to run DELETE FROM on the column tables and run the queries
   `INSERT INTO table (col) SELECT DISTINCT col FROM dlist' each time a
   new record was updated or added.
   
   It turns out that is almost as slow as the original method. So then I
   added an index to every column table, and it was still no better.
   
   I thought maybe generating a select box for each column's contents was
   the problem, since unique items in some tables are as much as 4
   thousand, but singling out the code which performs the queries,
   eliminating the select box generation, confirmed that the queries were
   indeed the slow part.
   
   I may have to abandon the retrieving of unique items from every column
   in a table unless you or anyone else have any other ideas.

(...)

 If you are using select boxes for every column, how can you possibly worry
 about new or changed values as the users must pick from a fixed list of
 values, don't they? 

I am building (or attempting tobuild) the select boxes on the fly
based on unique items from columns in the table.
  
 You don't rebuild _all_ of your umpteen pick-list tables from scratch every
 time, detect just the new or changed values and add/update them (which goes

Yes, that would be the smarter thing to do, you're right. But alas
that can be set aside for my testing so far. At this point I'm simply
trying to get the values from the columns in a reasonable amount of
time.

 back to my original question of how are the users creating new values if you
 are really using a select box and not something else like a combo box (part
 list, part text field)) 

Okay, they aren't creating new values using that form. The form I've
been talking about is a form to limit results from the database to
only specific values for  specific fields without the user having to
know from memory what the possible values are (plus, they'll change
every so often).
  
 Maybe if I understood your front-end application a little bit better, I
 could help you work around the issue. The way you described your user
 interface design made me wince; something just doesn't feel right about it. 

No, it's not as bad as it sounds. :-) (I hope)

-- 
Willie Gnarlson

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Unique items from all columns, very slow

2005-04-19 Thread Willie Gnarlson
Hello,

I have been attempting to uncover a fast method for retrieving unique
items from every column in a table. The table has approximately 60
thousand rows. Each row has an index associated with it, and running
'explain select distinct class from dlist' shows that it is using the
index.

However, since there are 24 columns, running that select query for
each column takes about 4 long seconds on a P4 2.4 Ghz.

I've been unable to find any suitable leads through google or the list
archives.

Any cluebats will be gleefully appreciated. 

Thanks for reading!

-- 
Willie Gnarlson

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]