Hi,
A few questions for you.
On Friday, August 6, 2004, at 06:17 AM, Jan Kirchhoff wrote:
I was just wondering if anybody has been using very large HEAP-tables and if there are ways to have mysql use the memory more efficient:
(I have no experience with all heap-tables but using them as temporary tables...)
I just started testing with 2 heap-tables on a development-system (p4 3.2ghz, 2GB RAM) to get an idea of what's possible and what's not:
I think mysql uses way too much memory (overhead) to store my data.
How much overhead do you think it is using? Each row is 61 bytes in geldbrief, which is *exactly* the amount needed for the datatypes you have.
char(12) = 12 bytes
int(10) = 4 bytes
datetime = 8 bytes
int(10) = 4 bytes
double(16,4) = 8 bytes
double(16,4) = 8 bytes
double(16,4) = 8 bytes
double(16,4) = 8 bytes
NULL values = 1 byte
---------------------------
Total 61 bytes
Now if you take 61 * 2449755 (number of rows) = 149435055 bytes used
157468096 Real amount 149435055 Data size --------- 8033041 Overhead 5.1% Total overhead of data
I don't see how you could get it to be much smaller than that. Even with zero overhead it would only be 5% smaller, which still would require *a lot* of memory to store it all. The primary key itself is only using about 8 bytes of memory per row (because it is a hashed index, btree would be much larger), which is also very compact.
With your own in-memory database, do you some sort of compression algorithm? That is the only way that I could see it taking up much less space. MySQL is pretty close to as efficient as you can get without compression.
I've fed the database with realtime-data for 1 hour now and I need at least 30 times as much in the tables. (200 times would be perfect - of course on some other machine with more memory)
Right now "top" tells me that mysql is using around 10% of the memory.
I already increased max_heap_table_size and I will have to increase it much more, but right now it seems that I would need 6GB of RAM to get my minimum amount of data in those tables. Which means I'd need a 64bit-system. But can mysql deal with 6GB-HEAP-tables???
So has anybody tried something like this, yet? We are currently using our own, self-written databases for this, but we are thinking about either using a sql-database in order to be able to make joins to other tables or adding features to our own software (64bit-support, sql-like interface etc.). If it works with mysql, we'd probably prefer that since its much less work for us and easier to handle. It doesn't matter if mysql uses a little more memory, but right now it seems like mysql is wasting way too much memory :(
thanks for any help!
Jan
| max_heap_table_size | 499999744 |
mysql> show table status\G *************************** 1. row *************************** Name: geldbrief Engine: HEAP Version: 9 Row_format: Fixed Rows: 2449755 Avg_row_length: 61 Data_length: 157468096 Max_data_length: 609999634 Index_length: 19690688 Data_free: 0 Auto_increment: NULL Create_time: NULL Update_time: NULL Check_time: NULL Collation: latin1_swedish_ci Checksum: NULL Create_options: max_rows=20000000 Comment: *************************** 2. row *************************** Name: umsaetze Engine: HEAP Version: 9 Row_format: Fixed Rows: 236425 Avg_row_length: 45 Data_length: 11402880 Max_data_length: 535713975 Index_length: 1942648 Data_free: 0 Auto_increment: NULL Create_time: NULL Update_time: NULL Check_time: NULL Collation: latin1_swedish_ci Checksum: NULL Create_options: max_rows=20000000 Comment: 2 rows in set (0.00 sec)
CREATE TABLE `geldbrief` ( `symbol` char(12) NOT NULL default '', `quelle` int(10) unsigned NOT NULL default '0', `kurszeit` datetime NOT NULL default '0000-00-00 00:00:00', `ticknumber` int(10) unsigned NOT NULL default '0', `bid` double(16,4) default NULL, `bidsize` double(16,4) default NULL, `ask` double(16,4) default NULL, `asksize` double(16,4) default NULL, PRIMARY KEY (`symbol`,`quelle`,`kurszeit`,`ticknumber`) ) ENGINE=HEAP DEFAULT CHARSET=latin1 MAX_ROWS=20000000
CREATE TABLE `umsaetze` ( `symbol` char(12) NOT NULL default '', `quelle` int(10) unsigned NOT NULL default '0', `kurszeit` datetime NOT NULL default '0000-00-00 00:00:00', `ticknumber` int(10) unsigned NOT NULL default '0', `kurs` double(16,4) default NULL, `umsatz` double(16,4) default NULL, PRIMARY KEY (`symbol`,`quelle`,`kurszeit`,`ticknumber`) ) ENGINE=HEAP DEFAULT CHARSET=latin1 MAX_ROWS=20000000
With all of that being said, I would just go with InnoDB, which can buffer the data in memory as well. In a later email you mention that you need to delete a lot of rows per hour. HEAP wouldn't work all that well for that since it uses table level locks. If it took 5 seconds to delete a large portion of rows, then the table would be locked for the duration of that. InnoDB, with its row level locking, would be much better for that purge process. If you turn off the innodb_flush_log_at_trx_commit (which would most likely be OK if your case) then most of the inserting would be done in memory anyways, and only written to disk in batches.
The other option is as you said before using myisam merge tables. That will make dropping old records easier if you partition the data into the separate days. The only thing I would be worried about in that scenario is if inserts are coming from multiple threads. In that case you could still possibly run into locking issues with the inserts locking each other.
Regards,
Harrison
-- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED]