Re: Mysql growing pains, 4 days to create index on one table!

2004-07-15 Thread Tim Brody
I know it's naff but I've found it quicker to use myisamchk with row-sort
than it is to get the MySQL daemon to regenerate keys (and if you know
you're not changing the data file you can tell myisamchk not to copy the
data), unless I've missed something in the MySQL config ...

The only way I know to do this for PRIMARY KEYs is a process described by
Kyle J. Munn (but watch out for free space :-):
http://lists.mysql.com/mysql/158737

This boils down to moving the MYD file, truncating the table to create an
empty MYD, modify your key(s), move the MYD back, and repair using myisamchk
to re-create the keys.

Otherwise you can use ALTER TABLE table DISABLE KEYS/ENABLE KEYS to
disable keys while you make your inserts, then enable keys to update
non-Primary keys in one go.

I've never tried PACK_KEYS so can't suggest anything on that, other than it
sounds like it adds more data to your index which is probably a bad thing
with such short values.

You may find that the 'dic' KEY isn't necessary, as it's the first part of
your PRIMARY KEY.

Depending on your data you may be better off storing integer IDs and the
char values in a look-up table.

I've found better performance for multi-column keys by putting the columns
in order of least variance first, e.g. for a list of dates:
1979-04-23
1979-07-15
1980-02-04
1980-06-04
You want a key on (YEAR-MONTH-DAY)

If you can you could put the index/data on different disks - not sure how
you would do that in Windows (INDEX DIRECTORY option for CREATE TABLE?).

You should definitely put the binary log file on another disk, but again not
something I've used.

I've found MySQL to be a royal pain working with multi-GB tables (my biggest
is 12GB+13GB index). I've learnt that MySQL is a bit like a confused puppy -
it doesn't know why it wet the floor, but it expects you to know that pained
expression means you need to move it somewhere else ...

All the best,
Tim.

- Original Message - 
From: matt ryan [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Wednesday, June 30, 2004 7:50 PM
Subject: Mysql growing pains, 4 days to create index on one table!


 Rebuilding index takes 3 1/2 days!!!  Growing pains with mysql..

 I've got 2 dedicated servers, each with a slave, all run 32gig 15k rpm
 raid 5 on u320 perc raid cards, dell 2600/4600's with single channel
 backplanes (new ones will have dual channel)

 All have 2 gig of ram, but I've never seen mysql use more than 600mb
 of ram.

 The servers handle huge loads, each day there are 30 1-2 gig files
 loaded into large tables, total mysql data size is 96 gig, the large
 tables are 2-6 gig.

 Inserts are done on dupe key ignore, this takes hours on the large
 files, it barely keeps up with input files.

 At the bottom of this post I've got the mysql.ini config lines, any
 suggestions are welcome, I'm already beyond the mysql huge sample
 they used to include in the program.

 Sample table that I load is as follows.

 each day I get 40 % new records on the text file, the input file is
 normally 20mb, once a week I get one that's 1-2 gig, these take all
 day to load.

 I need more multiple column indexes, as some querys return millions of
 rows that must be scanned, but the index size already exceeds the
 table size, and the combinations I need would result in an myi that's
 5x larger than the data itself.

 Here's an example of the speed problem, the index was corrupt so I
 dropped all and recreated, rather than a myisam repair. I think 3 days
 is a little excessive for a table that's only 3.428 gig, index is
 2.729 gig.  I cant remove the primary key, as it keeps duplicates out
 of the system, the input files are from old database's, we use mysql
 to store the data for the web frontend, mostly done in ASP, most
 queries take less than a second, unforuntatly we have big queries that
 take way more than the IIS timeout setting all the time, but no way
 around it, I cant add more indexes without making it even slower :(



 I cant tell if it's mysql that's the problem, or the hardware, Here's a
 screenshot of the disk IO, if I copy a file while mysql is doing the
 build index, the io shoots way up, which tells me, mysql is NOT maxing
 out the drives, and it's also not maxing out the memory.

 Unless it's doing lots and lots of seeks on the drive, which is harder
 to test using perfmon, are there any mysql test setups that would help
 identify where the bottleneck is?


 screenshot of disk io usage

 http://www.geekopolis.com/pics/diskio.jpg

 I'm all out of ideas, other than switching to another db, and the table
  indexes split across drives, maybe a 2 channel setup, 4 drives per
 channel, each 4 is a separate raid 5 setup, one holds data one holds
 indexes, cant do this with mysql though

 mysql alter table hood_stat add primary key
 (dic,niin,fr_ric,don,suf,dte_txn,sta) , add index `don` (`don`), add
 index `niin` (`niin`), add index `stor` (`stor`), add index `dic`
 (`dic`), add index `ctasc` (`ctasc`);

 Query OK, 45449534 rows 

Re: Mysql growing pains, 4 days to create index on one table!

2004-07-15 Thread Tim Brody
- Original Message - 
From: matt ryan [EMAIL PROTECTED]

 I need the DIC in the key to keep the record unique, I have thousands
 with everything identical except the DIC.

In your schema you had DIC in the PRIMARY KEY and an additional
(unnecessary?) KEY on DIC.

 I was confused on the multi key index issue, I thought it would seek
 faster if I put the most unique field up front, which I do on most
 tables, I did not on this one though.   I have one large raid array now,
 so I cant split the data, or put the binary log on another disk.

Having performed a couple of tests on a multi-key index I can't find any
significant difference between different orders on the columns (logical I
guess), although there may be certain uses that might work better in either
situation ...

Perhaps a better point to make would be to say change the order of your
PRIMARY KEY to reflect the queries that you're performing. e.g. as I said
above MySQL will efficiently use the first part of any key. Certainly, if
you're performing a WHERE or ORDER BY on any of the columns in the primary
key put those at the front.

 I found mysql was great up to about 3 gig, then everything hit the
 brakes and got really really really slow

 I'm scared of joines, every time I do a join in mysql on indexed fields
 in mysql, the performance is horrible, because the where clause is not a
 field that's in the join, performance is poopy

Hmm, how about (sorry if I'm pointing out the obvious) e.g.:
_left (col1 int, col2 int, col3 int, unique(col3,col1,col2));
_right (col1 int, col2 int, col3 int, unique(col1,col2,col3));

_left inner join _right using(col1,col2) where _left.col3=42 and
_right.col3=35;

The analyzer (explain) shows a const at the end of the ref columns for
_right and const for _left.

All the best,
Tim.


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: ENUM vs TINYINT

2004-06-25 Thread Tim Brody
If you want to use as little space as possible use char(0) and null/not-null
for your boolean values:

CREATE TEMPORARY TABLE test_bool (i char(5), bool char(0));
INSERT INTO test_bool VALUES
('true',''),('false',null),('true',''),('true',''),('false',null);
SELECT * FROM test_bool WHERE bool is not Null;
SELECT * FROM test_bool WHERE bool is Null;

In a multi-column table that saves you 1 byte per row.

Regards,
Tim.

- Original Message - 
From: Brian Mansell [EMAIL PROTECTED]
To: Cemal Dalar [EMAIL PROTECTED]
Cc: Group MySQL List [EMAIL PROTECTED]
Sent: Friday, June 25, 2004 8:01 AM
Subject: Re: ENUM vs TINYINT


 Cemal,

 I recall hearing a similar question mentioned in a previous email
 thread. In fact, here it is:

http://groups.google.com/groups?hl=enlr=ie=UTF-8threadm=c6h60e%2419dd%241%40FreeBSD.csie.NCTU.edu.twrnum=1prev=/groups%3Fq%3D%2522enum%2Bor%2Btinyint%2522%26hl%3Den%26lr%3D%26ie%3DUTF-8%26selm%3Dc6h60e%252419dd%25241%2540FreeBSD.csie.NCTU.edu.tw%26rnum%3D1

 (thread is titled enum or tinyint)

 I hope that helps!

 On Thu, 24 Jun 2004 21:17:09 +0300, Cemal Dalar [EMAIL PROTECTED] wrote:
 
  Hi all,
 
  I need a boolean column and at to this time I always used
ENUM('Y','N')
  for this. I'am wondering that will there be a performance difference
between
  using ENUM('Y','N') and TINYINT(1) or BOOLEAN?. And put 0 or 1 to
TINYINT
  column.
 
  Best Regards,
  Cemal Dalar a.k.a Jimmy
  System Administrator  Web Developer
  http://www.dalar.net
 
 
  --
  MySQL General Mailing List
  For list archives: http://lists.mysql.com/mysql
  To unsubscribe:
http://lists.mysql.com/[EMAIL PROTECTED]
 
 

 -- 
 MySQL General Mailing List
 For list archives: http://lists.mysql.com/mysql
 To unsubscribe:
http://lists.mysql.com/[EMAIL PROTECTED]



-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Index Building on Large Tables Stalling

2004-06-16 Thread Tim Brody
binary MySQL 4.0.20-standard on Redhat 7.2/Linux
Dual-proc, 4Gb ram, raid

I'm trying to change an index on a 12Gb table (270 million rows). Within an
hour or so the entire table is copied, and the index reaches 3.7Gb of data.
Then the database appears to do nothing more, except for touching the Index
file (the timestamp changes, with no change to file size). I've left it for
4 days with no further increase in the index size (which should eventually
be a similar size to the data). The MySQL process is at most 4-5% CPU util,
having been at 60-90% during the initial hour.

I've left it performing Repair with keycache, tried set
myisam_max_sort_file = 17179869184 forcing MySQL to Repair by sorting
without success. With Repair by sorting a TMD file is created with a
duplicate of the data, but the index reaches the same point and stops.

I originally created the table through multiple-inserts.

The table is a mix of varchars, ints and a year. It has a primary key over
two ints, and a key over varchars/year.

I know this isn't much to go on, but what's special about 3.7Gb? (A rogue
unsigned int somewhere in Repair with keycache/Repair by sorting?)

All the best,
Tim.


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: Index Building on Large Tables Stalling

2004-06-16 Thread Tim Brody
Nope.
As far as I'm aware the only disk space being used is in the database's 
directory, and that file system has 200Gb spare.
(/tmp has 19Gb free anyway)

Regards,
Tim.
gerald_clark wrote:
Are you running out of temp space?
Tim Brody wrote:
binary MySQL 4.0.20-standard on Redhat 7.2/Linux
Dual-proc, 4Gb ram, raid
I'm trying to change an index on a 12Gb table (270 million rows). 
Within an
hour or so the entire table is copied, and the index reaches 3.7Gb of 
data.
Then the database appears to do nothing more, except for touching the 
Index
file (the timestamp changes, with no change to file size). I've left 
it for
4 days with no further increase in the index size (which should 
eventually
be a similar size to the data). The MySQL process is at most 4-5% CPU 
util,
having been at 60-90% during the initial hour.

I've left it performing Repair with keycache, tried set
myisam_max_sort_file = 17179869184 forcing MySQL to Repair by sorting
without success. With Repair by sorting a TMD file is created with a
duplicate of the data, but the index reaches the same point and stops.
I originally created the table through multiple-inserts.
The table is a mix of varchars, ints and a year. It has a primary key 
over
two ints, and a key over varchars/year.

I know this isn't much to go on, but what's special about 3.7Gb? (A rogue
unsigned int somewhere in Repair with keycache/Repair by sorting?)
All the best,
Tim.
 




--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]