Re: Possible tricks to ALTER on huge tables?

2010-08-06 Thread Jigal van Hemert

Daevid Vincent wrote:

We currently have some tables that are approaching 1 BILLION rows (real
Billion, with nine zeros, not that silly six zero version). Trying to do an
ALTER on them to add a column can sometimes take hours.


A few years ago I have tested possible table structures for an 
application which had to store at least a million profiles of persons. 
Because we expected that properties would be added (and/or removed) from 
the database quite often I also tested a structure where the properties 
of a single profile were stored in tables based on the data type.


So we had tables with integers, strings, dates, etc. and used a record 
for each property; columns were like: id, property name, value, and a 
few other relevant things to handle and display the data.


Most select queries were about as fast as they would be with a single 
table. Database size was approximately the same because not all profiles 
used all properties, so we only needed to store the properties a certain 
 profile would use.
The only limitation at that time was 31 joins, but I don't think we've 
ever hit that limit.


Adding properties was easy, just adding them to the configuration of the 
application was enough.


It really depends on the situation of your application which table 
structure is the most suitable. Test the performance of all kinds of 
operations you need to do with realistic data and various amounts of 
data to see how it scales.


--
Jigal van Hemert

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/mysql?unsub=arch...@jab.org



Re: Possible tricks to ALTER on huge tables?

2010-08-06 Thread Andy Wallace

I had to do this trick with a few million rows in the table, and what
I did was to create a new table with the required structure, then did
insert into select from, starting with the newest data first, cause
that made sense for my application. Then, renamed the old table and the
new.

YMMV
andu

Jigal van Hemert wrote:

Daevid Vincent wrote:

We currently have some tables that are approaching 1 BILLION rows (real
Billion, with nine zeros, not that silly six zero version). Trying to 
do an

ALTER on them to add a column can sometimes take hours.




--
Andy Wallace
iHOUSEweb, Inc.
awall...@ihouseweb.com
(866) 645-7700 ext 219
--
There are two ways to build software:
Make it so simple that there are obviously no bugs,
or make it so complex that there are no obvious bugs.

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/mysql?unsub=arch...@jab.org



RE: Possible tricks to ALTER on huge tables?

2010-08-06 Thread Daevid Vincent
 

 -Original Message-
 From: Rob Wultsch [mailto:wult...@gmail.com] 
 Sent: Thursday, August 05, 2010 6:05 PM
 To: Daevid Vincent
 Cc: MySQL List
 Subject: Re: Possible tricks to ALTER on huge tables?
 
 Having significant amount of overhead for unused columns will without
 doubt harm performance significantly for certain operations.

I don't see this as significant overhead. A few null columns that are
dormant are not going to impact performance in any measurable way I don't
think.

 Altering .frm files should is always be tried on a non-prod box before
 even considering using it on prod. 

Thanks Capt'n Obvious. ;-p

Isn't that the case for ANY and ALL DB operations?

 There are some well known use cases
 (adding enums values, enlarging varchar columns) where altering a .frm
 is useful, but it should always be considered very dangerous.

Of course.

 You could of course consider using PostgreSQL which would only need a
 very brief exclusive lock for adding a default null column...

...an interesting thought.


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/mysql?unsub=arch...@jab.org



Possible tricks to ALTER on huge tables?

2010-08-05 Thread Daevid Vincent
Driving to work today, I had an epiphany thought, but wanted to see if
anyone could prove my theory or not.

We currently have some tables that are approaching 1 BILLION rows (real
Billion, with nine zeros, not that silly six zero version). Trying to do an
ALTER on them to add a column can sometimes take hours.

I'm wondering if we had the foresight to create the tables, and then tack
on extra dormant columns of various common types, such as:

future_uint int(10) unsigned null,
future_int int(10) signed null,
future_var varchar(255) null,
Etc.

So basically they'd be unused, then when we wanted a new column of that
type, we would just rename the dormant one. 

I'm not sure if mySQL is smart enough to realize that if the schema
definition for a column is identical, then it's just a simple rename, or if
it treats any change the same and will still take hours to complete (if so,
perhaps there's an optimization for you mysql developer team)

Another option I considered, was is it possible to just go in with a hex
editor and rename the field in the .frm file? Is there some kind of .frm
editor available anywhere?

r...@mypse:/var/lib/mysql/agis_core# hexdump -C country.frm
1000  01 01 00 00 0a 00 00 00  02 00 01 00 00 00 01 80
||
1010  02 00 00 12 00 02 00 ff  50 52 49 4d 41 52 59 ff
|PRIMARY.|
1020  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
||
*
13f0  00 00 00 00 00 00 00 00  00 00 00 00 06 00 49 6e
|..In|
1400  6e 6f 44 42 00 00 00 00  00 00 00 00 00 00 00 00
|noDB|
1410  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
||
*
2150  04 00 0b 69 64 5f 63 6f  75 6e 74 72 79 00 05 00
|...id_country...|
2160  05 69 73 6f 32 00 06 00  05 69 73 6f 33 00 07 00
|.iso2iso3...|
2170  0d 63 6f 75 6e 74 72 79  5f 6e 61 6d 65 00 04 0b
|.country_name...|
2180  05 05 00 02 00 00 12 00  0f 00 00 02 c0 00 00 05
||
2190  05 06 06 00 04 00 00 00  80 00 00 00 fe c0 00 00
||
21a0  06 05 09 09 00 0a 00 00  00 80 00 00 00 fe c0 00
||
21b0  00 07 0d 42 fd 02 13 00  00 00 00 00 00 00 0f c0
|...B|
21c0  00 00 ff 69 64 5f 63 6f  75 6e 74 72 79 ff 69 73
|...future_var...|
21d0  6f 32 ff 69 73 6f 33 ff  63 6f 75 6e 74 72 79 5f
|...future_int...|


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/mysql?unsub=arch...@jab.org



Re: Possible tricks to ALTER on huge tables?

2010-08-05 Thread Buford Tannen

Daevid Vincent wrote:

I'm wondering if we had the foresight to create the tables, and then tack
on extra dormant columns of various common types, such as:


Nothing beats empirical evidence. Why don't you try it and find 
out (and report back)!


--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/mysql?unsub=arch...@jab.org



Re: Possible tricks to ALTER on huge tables?

2010-08-05 Thread Rob Wultsch
On Thu, Aug 5, 2010 at 5:07 PM, Daevid Vincent dae...@daevid.com wrote:
 Driving to work today, I had an epiphany thought, but wanted to see if
 anyone could prove my theory or not.

 We currently have some tables that are approaching 1 BILLION rows (real
 Billion, with nine zeros, not that silly six zero version). Trying to do an
 ALTER on them to add a column can sometimes take hours.

 I'm wondering if we had the foresight to create the tables, and then tack
 on extra dormant columns of various common types, such as:

 future_uint int(10) unsigned null,
 future_int int(10) signed null,
 future_var varchar(255) null,
 Etc.

 So basically they'd be unused, then when we wanted a new column of that
 type, we would just rename the dormant one.

 I'm not sure if mySQL is smart enough to realize that if the schema
 definition for a column is identical, then it's just a simple rename, or if
 it treats any change the same and will still take hours to complete (if so,
 perhaps there's an optimization for you mysql developer team)

 Another option I considered, was is it possible to just go in with a hex
 editor and rename the field in the .frm file? Is there some kind of .frm
 editor available anywhere?

 r...@mypse:/var/lib/mysql/agis_core# hexdump -C country.frm
 1000  01 01 00 00 0a 00 00 00  02 00 01 00 00 00 01 80
 ||
 1010  02 00 00 12 00 02 00 ff  50 52 49 4d 41 52 59 ff
 |PRIMARY.|
 1020  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
 ||
 *
 13f0  00 00 00 00 00 00 00 00  00 00 00 00 06 00 49 6e
 |..In|
 1400  6e 6f 44 42 00 00 00 00  00 00 00 00 00 00 00 00
 |noDB|
 1410  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
 ||
 *
 2150  04 00 0b 69 64 5f 63 6f  75 6e 74 72 79 00 05 00
 |...id_country...|
 2160  05 69 73 6f 32 00 06 00  05 69 73 6f 33 00 07 00
 |.iso2iso3...|
 2170  0d 63 6f 75 6e 74 72 79  5f 6e 61 6d 65 00 04 0b
 |.country_name...|
 2180  05 05 00 02 00 00 12 00  0f 00 00 02 c0 00 00 05
 ||
 2190  05 06 06 00 04 00 00 00  80 00 00 00 fe c0 00 00
 ||
 21a0  06 05 09 09 00 0a 00 00  00 80 00 00 00 fe c0 00
 ||
 21b0  00 07 0d 42 fd 02 13 00  00 00 00 00 00 00 0f c0
 |...B|
 21c0  00 00 ff 69 64 5f 63 6f  75 6e 74 72 79 ff 69 73
 |...future_var...|
 21d0  6f 32 ff 69 73 6f 33 ff  63 6f 75 6e 74 72 79 5f
 |...future_int...|

Having significant amount of overhead for unused columns will without
doubt harm performance significantly for certain operations.

Altering .frm files should is always be tried on a non-prod box before
even considering using it on prod. There are some well known use cases
(adding enums values, enlarging varchar columns) where altering a .frm
is useful, but it should always be considered very dangerous.

You could of course consider using PostgreSQL which would only need a
very brief exclusive lock for adding a default null column...

-- 
Rob Wultsch
wult...@gmail.com

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/mysql?unsub=arch...@jab.org