Shawn,

 

Thank you for the advice, I had a feeling you'd say something like that.
Will reorganize my databases.  Makes my code a little easier to follow to
boot :-P

 

Thanks again,

 

- Jeff

 

  _____  

From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] 
Sent: Monday, October 10, 2005 1:55 PM
To: Jeffrey Santos
Cc: mysql@lists.mysql.com
Subject: RE: UPDATE statement optimization

 


Unless you are on a severely underpowered machine, MySQL will handle 3
million rows without any problems. If you are on such an underpowered
machine, then your current process must absolutely CRAWL!!! Don't worry
about numbers in the millions when you are working with enterprise-quality
database engines, like MySQL. As long as you have the disk space, I wouldn't
think twice about it. :-) As an example, those three million rows may use up
to 15-18MB including indexes. You have WAY more free space than that on your
disks, don't you? 

As it is, you have to decompose 1000 different lists of numbers, manually
scan your decompositions for certain values, delete those certain values (if
they exist), reconstitute each list, and update the original record. None of
those activities (except for the final UPDATE) can possibly use an index
and you always have to do it to EVERY row!  It's much, MUCH more practical
to use the separate table method (not to mention much, much faster.) 
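
The contrast is easy to see side by side. A sketch of both approaches, using the table and column names from the messages below (`fcusers`.`vsets` and a normalized `users_sets` table); the set key 42 is just a placeholder, and the REPLACE-based rewrite is one possible form of the list surgery, not Jeff's exact statement:

```sql
-- CSL approach: string surgery on every matching row of fcusers.
-- No index can help; MySQL must examine and rewrite each list.
UPDATE fcusers
SET vsets = TRIM(BOTH ',' FROM
      REPLACE(CONCAT(',', vsets, ','), ',42,', ','))
WHERE FIND_IN_SET('42', vsets);

-- Normalized approach: one DELETE, driven by the primary key,
-- removes the set from every user that has it.
DELETE FROM users_sets WHERE sets_id = 42;
```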

Shawn Green
Database Administrator
Unimin Corporation - Spruce Pine 



"Jeffrey Santos" <[EMAIL PROTECTED]> wrote on 10/10/2005 01:37:58 PM:

> I understand and it's actually the original way I was designing the 
> databases.  Only thing that stopped me from doing it that way was 
> the fact that the sets a user belongs to (vsets) is also dynamic, 
> the user can change which sets he/she belongs to.  Meaning that if 
> there are 50 sets and 50 users then we have the possibility of 2500 
> rows in the table for a relatively small population.  Since each 
> user will have most likely created their own sets and the ability to
> create multiple ones, the dynamic is probably more along the lines 
> of a 3:1 set to user ratio, if not higher. So: 
>   
> 1000 users, 3000 sets = 3,000,000 possible rows.  While this is an 
> extreme case, the possibility remains.  Now having said that, the 
> way I'm doing it isn't efficient due to processing time it would 
> take to delete from the CSL on each 1000 users every time a set gets
> deleted.  But, is it not equally inefficient to have to record and 
> store 3 million rows?  I'm not very familiar with the backend 
> structure of how mysql actually stores information so I recognize 
> the comparison might not even be valid. 
>   
> - Jeff 
>   
> 
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] 
> Sent: Monday, October 10, 2005 1:26 PM
> To: Jeffrey Santos
> Cc: mysql@lists.mysql.com
> Subject: Re: UPDATE statement optimization 
>   
> 
> 
> "Jeffrey Santos" <[EMAIL PROTECTED]> wrote on 10/10/2005 01:07:03 PM:
> 
> > Hey all!
> > 
> >  
> > 
> > I'm a very novice MYSQL user of the mentality of "get it working, then
> > get it working the right way," feeling the best way to learn is to just
> > do it.
> > I got things working but now I'm looking back and trying to get better
> > efficiency out of my SQL statements.
> > 
> >  
> > 
> > Database is setup like this:
> > 
> >  
> > 
> > Have one table using primary keys called "fcsets"
> > 
> > Have a second table called "fcusers" with a field "vsets" which is a
> > comma separated list of primary keys referring to the first table
> > 
> >  
> > 
> > Users can dynamically delete entries from the first table, so I want to
> > be able to remove that entry from all users who have it in their csl
> > vsets in fcusers.  This is a PHP statement but I think you'll get the
> > general idea of how I'm doing it.  Am looking for some suggestions on
> > how to do this more efficiently:
> > 
> >  
> > 
> > UPDATE fcusers SET vsets=TRIM(BOTH ',' FROM CONCAT_WS(',',
> > SUBSTRING_INDEX(`vsets`, ',', FIND_IN_SET('{$setkey}', `vsets`) - 1),
> > SUBSTRING_INDEX(`vsets`, ',', FIND_IN_SET(REVERSE('{$setkey}'),
> > REVERSE(`vsets`)) * -1 + 1))) WHERE vsets REGEXP '^.*,?{$setkey},?.*$'
> > 
> >  
> > 
> > Where $setkey obviously refers to the key of the entry in fcsets being
> > deleted.
> > 
> >  
> > 
> > Thanks!
> > 
> >  
> > 
> > - Jeff
> > 
> 
> You are having a hard time managing your set to user relationships 
> because of your schema (database design). 
> 
> You need to normalize your data. That means you need to get rid of 
> the comma separated list on fcusers. You need to make one entry in 
> some table for each association with a set each user has. 
> 
> I assume there is more information in the fcusers table other than 
> just what set a user belongs to and you don't want multiple copies 
> of that information floating about. That means you need to make a 
> third table, called something like `users_sets`, just to contain 
> each pair of PK values of your users and the PK values of the sets
> they belong to. It would look something like: 
> 
> CREATE TABLE users_sets ( 
>         users_id int, 
>         sets_id int, 
>         PRIMARY KEY (users_id, sets_id) 
> ); 
> 
> Does that make sense? You create one row in users_sets for each time
> a user and a set become associated. When that association ends, 
> delete the appropriate row. 
> 
> You might be thinking that working with multiple rows of data seems 
> like more work; it won't be. Adding and subtracting sets to users 
> (or users to sets) becomes simple INSERT and DELETE statements. 
> Everything will move much faster. 
> 
> Can you see the pattern or would you like more details? 
> 
> 
> Shawn Green
> Database Administrator
> Unimin Corporation - Spruce Pine 
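
With a `users_sets` table in place, the day-to-day operations Shawn describes reduce to plain statements. A sketch, assuming the column names from his CREATE TABLE; the ids 7 and 42 are placeholders, and `fcsets.id` is an assumed name for the fcsets primary key, which the thread never shows:

```sql
-- A user joins a set:
INSERT INTO users_sets (users_id, sets_id) VALUES (7, 42);

-- A user leaves a set:
DELETE FROM users_sets WHERE users_id = 7 AND sets_id = 42;

-- A set is deleted entirely: one statement removes it from every user.
DELETE FROM users_sets WHERE sets_id = 42;

-- List the sets a given user belongs to:
SELECT s.*
FROM fcsets AS s
JOIN users_sets AS us ON us.sets_id = s.id
WHERE us.users_id = 7;
```

Every one of these hits the composite primary key on (users_id, sets_id), which is what makes the normalized design faster than rewriting comma-separated strings.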
