RE: Managing Very Large Tables

2004-03-30 Thread Henrik Schröder
I did some tests with 20GB tables and several millions of rows a few months
back, and what helped improve the performance the most was to separate the
(small) columns used for searching from the (large) columns containing data.
My test table was from a messaging system, and I redesigned it so that the
larger table hold only a message-id and some TEXT-fields, and the smaller
table held all usernames, timestamps, and other columns used for selecting
and sorting.

Using the older table design, I tried to set the primary key as optimal as
possible so that the most frequently run query could use it directly.
However, since I used InnoDB, the primary key *is* the table, doing index
searches is rather slow simply because of the amount of disk you have to
traverse to get to the index. Adding secondary indexes and forcing the
queries to use them did not help.

When I changed the structure, the info table shrunk to a few hundred MB, and
searches in that smaller table and index was considerably faster. Getting
data from the data table was also very fast, since all access to it was
reduced to primary key lookups instead of index scans. All of this combined
made my queries go ten(!) times faster.

I don't know if you can do the same, if you have large data-columns you can
split off, but if you do, it won't hurt that much to try. :-)

I also don't know how MyISAM compares to InnoDb in this specific case, maybe
the result is smaller for MyISAM because of the difference in how the
primary key is created and used.


/Henrik 

-Original Message-
From: Chad Attermann [mailto:[EMAIL PROTECTED] 
Sent: den 30 mars 2004 19:42
To: [EMAIL PROTECTED]
Subject: Managing Very Large Tables


Hello,

I am trying to determine the best way to manage very large (MyISAM) tables,
ensuring that they can be queried in reasonable amounts of time.  One table
in particular has over 18 million records (8GB data) and is growing by more
than 150K records per day, and that rate is increasing.  Besides the obvious
things like better hardware and load-balancing across multiple replicating
databases, I am trying to determine how to keep these data sets optimized
for fastest queries.  In my particular situation, the most recent data is
queried most often, and data over 30-45 days old is not queried much at all
but still must remain accessible.  Each record has an integer time column
that is indexed for querying over periods of time.  Currently I run a script
regularly that moves records older than 45 days from tables in the main
database into identical tables in another (archive) database running in the
same server process.  This seems to speed up the tables in the main
database, but I realize that deleting records leaves holes in the tables,
and that this slows inserts as well as makes it impossible to read and write
concurrently from these tables.  My question is, is it better to keep all of
the data in the original tables to avoid holes, or is 'archiving' records to
another database a wise approach?  How much does the size of a table really
affect performance when querying the more recent data?  If archiving is
reasonable, is there a way to optimize the tables to get rid of the holes
without siginificantly impacting ongoing activity on these tables?

Thanks for your time!

Chad Attermann
[EMAIL PROTECTED]

-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



RE: INSERT ... SELECT question

2004-03-29 Thread Henrik Schröder
No, no, no, no need to make it that complicated! :-)

If your table has the columns col1, col2, col3, col4, etc, and col1 is the
autoincrement column, and you want to duplicate the entire table, something
like this should work:

INSERT INTO t1(col2, col3, col4, ...)
SELECT col2, col3, col4, ... FROM t1

The trick is that you select rows from your table that contain all columns
except the auto_increment column, and then re-insert them without specifying
a value for the auto_increment column, thereby assigning them a new value.

If you don't want to duplicate the entire table, just add an appropriate
where-clause to the above statement.


/Henrik 

-Original Message-
From: Eric J. Janus [mailto:[EMAIL PROTECTED] 
Sent: den 29 mars 2004 21:28
To: Matt Chatterley; 'MySQL'
Subject: RE: INSERT ... SELECT question


Views would be nice. :)

That idea sounds like it would work in a single-user envrinoment, which I'm
not.  I agree, messy.

I'll just write a function to build a field list from a specified table
leaving out the AUTO_INCREMENT field...I can then do something like this:

SQL = INSERT INTO tbl_name( + generate_field_list(tbl_name) + ); SQL +=
SELECT  + generate_field_list(tbl_name)   FROM tbl_name WHERE
where_clause

Just a little more work that I was hoping for though!

Thanks for the help.

Eric

 -Original Message-
 From: Matt Chatterley [mailto:[EMAIL PROTECTED]
 Sent: Monday, March 29, 2004 2:25 PM
 To: 'Eric J. Janus'; 'MySQL'
 Subject: RE: INSERT ... SELECT question
 Importance: Low


 Oh, if only there were views!! That would make this easy. Maybe soon 
 (*please*). :)

 Another approach (more cumbersome) might be to insert the rows you 
 need to duplicate into a temporary table, update the id_col adding
 max(id_col) from
 the original table to each, and then to insert from the temporary 
 table back into the original.

 I don't like that at all, though. It seems messy. Hopefully someone 
 here can come up with a better idea!

 Cheers,


 Matt

 -Original Message-
 From: Eric J. Janus [mailto:[EMAIL PROTECTED]
 Sent: 29 March 2004 20:12
 To: Matt Chatterley; 'MySQL'
 Subject: RE: INSERT ... SELECT question

 There is only 1 table.  I want to replicate a record in a table except 
 the AUTO_INCREMENT column.

 Your solution would work, but I'd prefer to not have to maintain a 
 list of columns in the application.  Worst case I'll have the 
 application generate the query based on the table definition, but I 
 was hoping that MySQL had something built in to make this easier.

 Thanks,

 Eric

  -Original Message-
  From: Matt Chatterley [mailto:[EMAIL PROTECTED]
  Sent: Monday, March 29, 2004 2:11 PM
  To: 'Eric J. Janus'; 'MySQL'
  Subject: RE: INSERT ... SELECT question
  Importance: Low
 
 
  Can you roughly outline the schema of the two tables?
 
  If the key value doesn't have to match from the input to the
 final result,
  you can always insert into a list of fields, skipping the 
  auto_increment/key column, and they will continue to be generated..
 
  Assuming you have two tables with id_col, col2, col3:
 
  INSERT INTO table (col2, col3)
  SELECT col2, col3 FROM table2 WHERE id_col=1;
 
 
  Regards,
 
  Matt
 
  -Original Message-
  From: Eric J. Janus [mailto:[EMAIL PROTECTED]
  Sent: 29 March 2004 19:37
  To: MySQL
  Subject: INSERT ... SELECT question
 
  I have a table with just about 100 columns, and I would like to 
  duplicate a row exactly, except for one column, which is the 
  AUTO_INCREMENT column.
 
  Using 'INSERT INTO tbl_name SELECT * FROM tbl_name WHERE id_col = 1' 
  doesn't work, because it produces the following error: 'ERROR 1062:
  Duplicate entry
  '1' for key 1'.
 
  Because I'd like the application using this database to be simpler 
  to maintain, I'd prefer to not have to change the code each time a 
  field is added...so is there a way to duplicate a row, but still 
  have it automatically assigned an value for the AUTO_INCREMENT 
  column?
 
  Thanks,
 
  Eric
 
 
  --
  MySQL General Mailing List
  For list archives: http://lists.mysql.com/mysql
  To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
 
 
 





-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



How to optimize ugly order by?

2004-03-24 Thread Henrik Schröder
Hi all,

I have a table of users which contains a lot of information, and also a
column called membershiptype which can either be 0, 1, or 2. This table is
used a lot in various searches, and all results should be ordered by
membership type, but the order should not be 0-1-2, but instead 1-2-0.
Currently, this is achieved like this:

SELECT ... ORDER BY (membershiptype  1) ASC, (membershiptype  2) ASC,
login ASC

...which is rather ugly, and forces MySQL to create a temp table with the
calculated expressions and then re-sort the result using these. Since this
query is used a lot, it would be nice if I could get rid of this. I'm
completely stumped. Any ideas?

No, I did not choose the sort order. No, I can not change the values used.
Yes, it has to be this sort order. :-)


/Henrik Schröder

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



RE: How to optimize ugly order by?

2004-03-24 Thread Henrik Schröder
I don't think it does right now, but that can always be taken care of later.

Assuming that the query uses one index that contains all the where-columns
and the orderby-columns, having an ORDER BY that sorts on expressions is
still the biggest problem, right?


/Henrik 

-Original Message-
From: Victor Pendleton [mailto:[EMAIL PROTECTED] 
Sent: den 24 mars 2004 16:22
To: [EMAIL PROTECTED]; Henrik Schröder; '[EMAIL PROTECTED]'
Subject: Re: How to optimize ugly order by?


Let me rephrase my statement. Does the index being used include the order by
columns? If not, the optimizer will not be able to perform both row
selection and row ordering in the same pass.

 On Wed, 24 Mar 2004, Victor Pendleton wrote:
 Does the where clause contain the order by columns? If not, then you 
 will have a filesort no matter what.
 
  On Wed, 24 Mar 2004, Henrik Schröder wrote:
  Hi all,
  
  I have a table of users which contains a lot of information, and 
  also a column called membershiptype which can either be 0, 1, or 2. 
  This table is used a lot in various searches, and all results should 
  be ordered by membership type, but the order should not be 0-1-2, 
  but instead 1-2-0. Currently, this is achieved like this:
  
  SELECT ... ORDER BY (membershiptype  1) ASC, (membershiptype  2) 
  ASC, login ASC
  
  ...which is rather ugly, and forces MySQL to create a temp table 
  with tthe calculated expressions and then re-sort the result using 
  these. Since this query is used a lot, it would be nice if I could 
  get rid of this. I'm completely stumped. Any ideas?
  
  No, I did not choose the sort order. No, I can not change the values 
  used. Yes, it has to be this sort order. :-)
  
  
  /Henrik Schröder
  
  --
  MySQL General Mailing List
  For list archives: http://lists.mysql.com/mysql
  To unsubscribe:
http://lists.mysql.com/[EMAIL PROTECTED]
 
 --
 MySQL General Mailing List
 For list archives: http://lists.mysql.com/mysql
 To unsubscribe:
http://lists.mysql.com/[EMAIL PROTECTED]

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



RE: How to optimize ugly order by?

2004-03-24 Thread Henrik Schröder
Thanks, I'll have to try that to see if it's faster.

Adding another column as someone else suggested is too complex in comparison
because it forces a bigger code change, and I don't really want that.
Changing one SELECT-statement is enough...


/Henrik 

-Original Message-
From: Jigal van Hemert [mailto:[EMAIL PROTECTED] 
Sent: den 24 mars 2004 16:12
To: [EMAIL PROTECTED]
Subject: Re: How to optimize ugly order by?


From: Henrik Schröder [EMAIL PROTECTED]
 used a lot in various searches, and all results should be ordered by 
 membership type, but the order should not be 0-1-2, but instead 1-2-0.

One thing you could do (not very neat IMHO) is to JOIN this query with a
table that contains membershiptype and a sortorder. Then you can ORDER BY
the sortorder of that table...

Regards, Jigal.



-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]