Re: Tracking changes in large datasets over time
On Fri, Jul 18, 2008 at 3:46 PM, Rob Wultsch [EMAIL PROTECTED] wrote:
> I'm sure there is a way to write this in a single query, but I bet it
> is ugly as heck, probably with a bunch of derived tables.
>
> INSERT INTO balances (userid, points)
> SELECT users.userid, users.points
> FROM users
> LEFT JOIN (
>     SELECT userid, MAX(timestamp_dump) AS 'timestamp_dump'
>     FROM balances
>     GROUP BY userid
> ) AS b1 USING(userid)
> LEFT JOIN balances b2 USING(userid, timestamp_dump)
> WHERE users.points != b2.points
>    OR b2.points IS NULL;
>
> This query is a dog. That stored procedure is pretty horrible.

I'm going to play with trying to figure out how to optimize this. Fun stuff.

--
Rob Wultsch

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED]
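[Editor's note] The snapshot-diff query above can be sketched end to end in a few lines. The sketch below uses SQLite (via Python's sqlite3) rather than MySQL so it runs anywhere; the table names and the LEFT JOIN against each user's latest snapshot follow the query in the thread, while the explicit 't1'/'t2' timestamps are stand-ins for the TIMESTAMP default.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE users(userid INTEGER PRIMARY KEY, points INT);
    CREATE TABLE balances(
        userid INT,
        points INT,
        timestamp_dump TEXT,
        PRIMARY KEY(userid, timestamp_dump)
    );
    INSERT INTO users(userid, points) VALUES (1, 10), (2, 20), (3, 30);
""")

# Same shape as the thread's query: join each user to their most recent
# snapshot and insert only rows that are new or changed.
SNAPSHOT = """
    INSERT INTO balances(userid, points, timestamp_dump)
    SELECT u.userid, u.points, ?
    FROM users u
    LEFT JOIN (SELECT userid, MAX(timestamp_dump) AS timestamp_dump
               FROM balances GROUP BY userid) AS b1 USING(userid)
    LEFT JOIN balances b2 USING(userid, timestamp_dump)
    WHERE u.points != b2.points OR b2.points IS NULL
"""
con.execute(SNAPSHOT, ("t1",))   # first run: every user is new -> 3 rows
con.execute("UPDATE users SET points = 99 WHERE userid = 2")
con.execute(SNAPSHOT, ("t2",))   # second run: only user 2 changed -> 1 row

rows = con.execute("""SELECT userid, points, timestamp_dump FROM balances
                      ORDER BY timestamp_dump, userid""").fetchall()
print(rows)
```

The second run inserts a single row for user 2, which is the behavior Jason asked for; the cost problem discussed later in the thread is that the MAX-per-user derived table has to rescan the whole history on every run.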
Re: Tracking changes in large datasets over time
On Sat, Jul 19, 2008 at 4:18 AM, Rob Wultsch [EMAIL PROTECTED] wrote:
> On Fri, Jul 18, 2008 at 3:46 PM, Rob Wultsch [EMAIL PROTECTED] wrote:
>> I'm sure there is a way to write this in a single query, but I bet it
>> is ugly as heck, probably with a bunch of derived tables.
>>
>> INSERT INTO balances (userid, points)
>> SELECT users.userid, users.points
>> FROM users
>> LEFT JOIN (
>>     SELECT userid, MAX(timestamp_dump) AS 'timestamp_dump'
>>     FROM balances
>>     GROUP BY userid
>> ) AS b1 USING(userid)
>> LEFT JOIN balances b2 USING(userid, timestamp_dump)
>> WHERE users.points != b2.points
>>    OR b2.points IS NULL;
>>
>> This query is a dog. That stored procedure is pretty horrible.
>
> I'm going to play with trying to figure out how to optimize this. Fun stuff.
>
> --
> Rob Wultsch

Unless I am missing something, Postgres is 100X faster for the query shown
above. Awesome...

postgres=# DROP TABLE IF EXISTS users, balances;
DROP TABLE
postgres=# CREATE TABLE users(
postgres(#     userid SERIAL,
postgres(#     points INT);
NOTICE:  CREATE TABLE will create implicit sequence users_userid_seq for serial column users.userid
CREATE TABLE
postgres=# CREATE TABLE balances(
postgres(#     userid INT,
postgres(#     points INT,
postgres(#     timestamp_dump TIMESTAMP DEFAULT NOW(),
postgres(#     PRIMARY KEY(userid, timestamp_dump) );
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index balances_pkey for table balances
CREATE TABLE
postgres=# INSERT INTO users(points)
postgres-#     SELECT random()*100 FROM generate_series(0,10000);
INSERT 0 10001
postgres=# INSERT INTO balances (userid, points)
postgres-# SELECT users.userid, users.points
postgres-# FROM users
postgres-# LEFT JOIN (
postgres(#     SELECT userid, MAX(timestamp_dump) AS timestamp_dump
postgres(#     FROM balances
postgres(#     GROUP BY userid
postgres(# ) AS b1 USING(userid)
postgres-# LEFT JOIN balances b2 USING(userid, timestamp_dump)
postgres-# WHERE users.points != b2.points
postgres-#    OR b2.points IS NULL;
INSERT 0 10001
postgres=# UPDATE users
postgres-#     SET points = random()*100
postgres-#     WHERE random() < .3;
UPDATE 2976
postgres=# select now();
             now
-----------------------------
 2008-07-19 06:26:31.359-07
(1 row)

postgres=# INSERT INTO balances (userid, points)
postgres-# SELECT users.userid, users.points
postgres-# FROM users
postgres-# LEFT JOIN (
postgres(#     SELECT userid, MAX(timestamp_dump) AS timestamp_dump
postgres(#     FROM balances
postgres(#     GROUP BY userid
postgres(# ) AS b1 USING(userid)
postgres-# LEFT JOIN balances b2 USING(userid, timestamp_dump)
postgres-# WHERE users.points != b2.points
postgres-#    OR b2.points IS NULL;
INSERT 0 2946
postgres=# select now();
             now
-----------------------------
 2008-07-19 06:26:31.593-07
(1 row)

mysql> DROP TABLE IF EXISTS users, balances, integers;
Query OK, 0 rows affected (0.41 sec)

mysql> CREATE TABLE users(
    ->     userid INT PRIMARY KEY AUTO_INCREMENT,
    ->     points INT);
Query OK, 0 rows affected (0.13 sec)

mysql> CREATE TABLE balances(
    ->     userid INT,
    ->     points INT,
    ->     timestamp_dump TIMESTAMP,
    ->     PRIMARY KEY(userid, timestamp_dump) );
Query OK, 0 rows affected (0.13 sec)

mysql> CREATE TABLE integers(i tinyint);
Query OK, 0 rows affected (0.11 sec)

mysql> INSERT INTO integers(i)
    ->     VALUES(0),(1),(2),(3),(4),(5),(6),(7),(8),(9);
Query OK, 10 rows affected (0.06 sec)
Records: 10  Duplicates: 0  Warnings: 0

mysql> INSERT INTO users(points)
    -> SELECT rand()*100
    -> FROM integers AS i1,
    ->      integers AS i2,
    ->      integers AS i3,
    ->      integers AS i4
    -> WHERE i1.i + i2.i*10 + i3.i*100 + i4.i*1000 < 10001;
Query OK, 10000 rows affected (0.31 sec)
Records: 10000  Duplicates: 0  Warnings: 0

mysql> INSERT INTO balances (userid, points)
    -> SELECT users.userid, users.points
    -> FROM users
    -> LEFT JOIN (
    ->     SELECT userid, MAX(timestamp_dump) AS 'timestamp_dump'
    ->     FROM balances
    ->     GROUP BY userid
    -> ) AS b1 USING(userid)
    -> LEFT JOIN balances b2 USING(userid, timestamp_dump)
    -> WHERE users.points != b2.points
    ->    OR b2.points IS NULL;
Query OK, 10000 rows affected (0.38 sec)
Records: 10000  Duplicates: 0  Warnings: 0

mysql> UPDATE users
    ->     SET points = rand()*100
    ->     WHERE rand() < .3;
Query OK, 2970 rows affected (0.16 sec)
Rows matched: 2990  Changed: 2970  Warnings: 0

mysql> INSERT INTO balances (userid, points)
    -> SELECT users.userid, users.points
    -> FROM users
    -> LEFT JOIN (
    ->     SELECT userid, MAX(timestamp_dump) AS 'timestamp_dump'
    ->     FROM balances
    ->     GROUP BY userid
    -> ) AS b1 USING(userid)
    -> LEFT JOIN balances b2 USING(userid, timestamp_dump)
    -> WHERE users.points != b2.points
    ->    OR b2.points IS NULL;
Query OK, 2970 rows affected (20.05 sec)
Records: 2970  Duplicates: 0  Warnings: 0

--
Rob Wultsch
Re: Tracking changes in large datasets over time
On 7/18/08, Jason Yergeau [EMAIL PROTECTED] wrote:
> Hi Rob --- MySQL 5.
>
> On Fri, Jul 18, 2008 at 3:01 PM, Rob Wultsch [EMAIL PROTECTED] wrote:
>> On Fri, Jul 18, 2008 at 2:00 PM, Jason Yergeau [EMAIL PROTECTED] wrote:
>>> I'm having trouble working through a data problem. Any tips or keywords
>>> that might clue me into a known pattern would be incredibly appreciated!
>>>
>>> I have about 500,000 users in my system. Each user has a points balance,
>>> updated by a system over which I have no control. I'd like to track
>>> changes to each user's point balance over time by taking timestamped
>>> snapshots of their balance, and saving it into a new table.
>>>
>>> It's easy to take the snapshot:
>>>
>>>     insert into balances (userid, points) select userid, points from users;
>>>
>>> This quickly takes the points field from my users table and saves it
>>> into a balances table, which saves the data along with a timestamp. I
>>> can run that query on a regular basis without overly taxing my system.
>>>
>>> The first time it's run, I get 500,000 rows of data. That's fine. But
>>> the next time I run the query, I only want to save the differences in
>>> balance. Anyone have any tips?
>>>
>>> Best,
>>> Jason
>>
>> What version of mysql do you need this to work with?

I may be wrong, but I think you could accomplish this through the use of
triggers. Triggers are designed to monitor data change activity.

-jp

--
Broken promises don't upset me. I just think, why did they believe me?
deepthoughtsbyjackhandy.com
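[Editor's note] The trigger suggestion can be demonstrated in miniature. The sketch below uses SQLite so it is self-contained; MySQL trigger syntax differs slightly (no WHEN clause before 5.x, DELIMITER handling in the client), but the idea is the same: an AFTER UPDATE trigger writes a history row the moment a balance actually changes, so no diff query is needed later.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE users(userid INTEGER PRIMARY KEY, points INT);
    CREATE TABLE balances(userid INT, points INT,
                          timestamp_dump TEXT DEFAULT CURRENT_TIMESTAMP);
    -- Fire only when the value really changed, so unchanged updates
    -- produce no history rows.
    CREATE TRIGGER users_update AFTER UPDATE OF points ON users
    FOR EACH ROW WHEN NEW.points != OLD.points
    BEGIN
        INSERT INTO balances(userid, points) VALUES (NEW.userid, NEW.points);
    END;
    INSERT INTO users VALUES (1, 10), (2, 20);
""")
con.execute("UPDATE users SET points = 15 WHERE userid = 1")  # fires trigger
con.execute("UPDATE users SET points = 20 WHERE userid = 2")  # no-op, no row
captured = con.execute("SELECT userid, points FROM balances").fetchall()
print(captured)
```

Only user 1's change lands in balances; the update that set user 2 to its existing value is filtered out by the WHEN clause.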
Re: counting by countrycode
If your country codes are going to be embedded in the callednumber, you
really need a delimiter like a dash "-". Then you can use some string
functions to do the count. Ideally, as others suggested, the country code
should be in its own field/column. Below is the general idea for the
SELECT; of course, the problem in this data set is the country code 001:

mysql> SELECT LEFT(callednumber, 4), COUNT(LEFT(callednumber, 4))
    -> FROM call_log GROUP BY LEFT(callednumber, 4);
+-----------------------+------------------------------+
| LEFT(callednumber, 4) | COUNT(LEFT(callednumber, 4)) |
+-----------------------+------------------------------+
| 001X                  |                            1 |
| 0060                  |                            1 |
| 0061                  |                            6 |
| 0063                  |                            2 |
+-----------------------+------------------------------+

The bottom line is that you really need to use cleaner, more discrete data.

Cheers,
Cassj

Jerry Schwartz wrote:
> How are you storing phone numbers? I don't think there are leading zeroes
> in country codes, and country codes are not all the same length. Are you
> padding them with leading zeroes to five characters? If you have padded
> them so that the first five characters are always the country code, then
> you can simply do SELECT
>
> -----Original Message-----
> From: Ron [mailto:[EMAIL PROTECTED]
> Sent: Friday, July 18, 2008 2:27 AM
> To: mysql@lists.mysql.com
> Subject: counting by countrycode
>
> | customercode | customer name | customerphonenumber | callednumber | calldate   | callstart | callend  | callduration | callcost |
> | 1116         | Company Name  | Customer NUmber     | 0061X        | 2008-07-16 | 15:25:19  | 15:26:00 |            1 | 0.229375 |
> | 1116         | Company Name  | Customer NUmber     | 0063X        | 2008-07-16 | 15:25:21  | 15:26:44 |            2 | 0.07759  |
> | 1116         | Company Name  | Customer NUmber     | 001XX        | 2008-07-16 | 15:25:24  | 15:25:34 |            1 | 0.229375 |
> | 1116         | Company Name  | Customer NUmber     | 0060X        | 2008-07-16 | 15:25:25  | 15:29:33 |            5 | 1.14688  |
> | 1116         | Company Name  | Customer NUmber     | 0061X        | 2008-07-16 | 15:25:29  | 15:25:58 |            1 | 0.038795 |
> | 1116         | Company Name  | Customer NUmber     | 0061X        | 2008-07-16 | 15:25:29  | 15:25:48 |            1 | 0.038795 |
> | 1116         | Company Name  | Customer NUmber     | 0061X        | 2008-07-16 | 15:25:32  | 15:25:54 |            1 | 0.038795 |
> | 1116         | Company Name  | Customer NUmber     | 0063X        | 2008-07-16 | 15:25:33  | 15:25:45 |            1 | 0.038795 |
> | 1116         | Company Name  | Customer NUmber     | 0061X        | 2008-07-16 | 15:25:35  | 15:26:29 |            1 | 0.229375 |
> | 1116         | Company Name  | Customer NUmber     | 0061X        | 2008-07-16 | 15:25:38  | 15:25:53 |            1 | 0.038795 |
>
> Hi,
> How can I count by countrycode with that table? I kind of need a summary
> to count how many calls went to a certain country. Thank you. I hope my
> question is clear.
>
> [JS] How are you storing phone numbers? I don't think there are leading
> zeroes in country codes, and country codes are not all the same length.
> Are you padding them with leading zeroes to five characters? If you have
> padded them so that the first five characters are always the country
> code, then you can simply do
>
>     SELECT SUBSTRING(callednumber,1,5) AS countrycode
>     FROM tablename
>     GROUP BY countrycode;
>
> Regards,
> Ron
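[Editor's note] Jerry's point that country codes are variable length is the crux: a fixed LEFT(callednumber, 4) mis-buckets the "001..." numbers. One way around it is a longest-prefix match against a code table, sketched below in Python. The COUNTRY_CODES dict and the sample numbers are a small illustrative subset, not a complete code table.

```python
from collections import Counter

# Illustrative subset: real country-code tables have codes of 1-3 digits.
COUNTRY_CODES = {"1": "US/Canada", "60": "Malaysia", "61": "Australia",
                 "63": "Philippines"}

def country_of(callednumber: str) -> str:
    digits = callednumber.lstrip("0")   # drop the 00 international prefix
    for length in (3, 2, 1):            # longest matching code wins
        code = digits[:length]
        if code in COUNTRY_CODES:
            return COUNTRY_CODES[code]
    return "unknown"

calls = ["0061555", "0063555", "0015551234", "0060555", "0061444"]
counts = Counter(country_of(c) for c in calls)
print(counts)
```

The same longest-prefix idea can be done in SQL by joining the call log against a country_codes table on a LIKE prefix and keeping the longest match, but as Cassj says, storing the code in its own column is the clean fix.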
Suggestion Setting For Highload Server
Hi,

I have a situation where a MySQL server processes about 10-20 thousand
requests per minute. I need suggestions from you for tuning this server to
get an optimized setting.

TIA,
Willy
Re: Suggestion Setting For Highload Server
At 12:11 PM 7/19/2008, sangprabv wrote:
> Hi,
> I have a situation where a MySQL server processes about 10-20 thousand
> requests per minute. I need suggestions from you for tuning this server
> to get an optimized setting.
> TIA
> Willy

Willy,

You will need to post more information:

1) What type of hardware are you running this on? (OS, drives, memory, etc.)
2) What table engine are you using (MyISAM or InnoDB)?
3) What do the slow queries look like? Updates? Table joins? Inserts?
   Simple selects?
4) How large are the tables?
5) What does your my.cnf file look like?

Mike
Re: Tracking changes in large datasets over time
On Sat, Jul 19, 2008 at 6:54 AM, Jake Peavy [EMAIL PROTECTED] wrote:
> I may be wrong, but I think you could accomplish this through the use of
> triggers. Triggers are designed to monitor data change activity.
>
> -jp

Good thought. Using triggers to keep track of changes in an intermediary
table allows for good performance using the approach shown below. I think
that should work well, or at least be a good start...

mysql> DROP TABLE IF EXISTS users, balances, balances_temp, integers;
Query OK, 0 rows affected (0.41 sec)

mysql> CREATE TABLE users(
    ->     userid INT PRIMARY KEY AUTO_INCREMENT,
    ->     points INT);
Query OK, 0 rows affected (0.13 sec)

mysql> CREATE TABLE balances(
    ->     userid INT,
    ->     points INT,
    ->     timestamp_dump TIMESTAMP,
    ->     PRIMARY KEY(userid, timestamp_dump) );
Query OK, 0 rows affected (0.11 sec)

mysql> CREATE TABLE balances_temp(
    ->     userid INT,
    ->     points INT,
    ->     points_original INT,
    ->     PRIMARY KEY(userid) );
Query OK, 0 rows affected (0.16 sec)

mysql> CREATE TABLE integers(i tinyint);
Query OK, 0 rows affected (0.11 sec)

mysql> INSERT INTO integers(i)
    ->     VALUES(0),(1),(2),(3),(4),(5),(6),(7),(8),(9);
Query OK, 10 rows affected (0.05 sec)
Records: 10  Duplicates: 0  Warnings: 0

mysql> CREATE TRIGGER users_insert
    ->     AFTER INSERT ON users
    ->     FOR EACH ROW
    ->     INSERT INTO balances_temp(userid, points) VALUES(NEW.userid, NEW.points);
Query OK, 0 rows affected (0.00 sec)

mysql> CREATE TRIGGER users_update
    ->     BEFORE UPDATE ON users
    ->     FOR EACH ROW
    ->     INSERT INTO balances_temp(userid, points, points_original)
    ->     VALUES(NEW.userid, NEW.points, OLD.points)
    ->     ON DUPLICATE KEY UPDATE points = VALUES(points);
Query OK, 0 rows affected (0.01 sec)

mysql> INSERT INTO users(points)
    -> SELECT rand()*100
    -> FROM integers AS i1,
    ->      integers AS i2,
    ->      integers AS i3,
    ->      integers AS i4
    -> WHERE i1.i + i2.i*10 + i3.i*100 + i4.i*1000 < 10001;
Query OK, 10000 rows affected (0.69 sec)
Records: 10000  Duplicates: 0  Warnings: 0

mysql> INSERT INTO balances(userid, points)
    -> SELECT userid, points
    -> FROM balances_temp
    -> WHERE points != points_original OR points_original IS NULL;
Query OK, 10000 rows affected (0.30 sec)
Records: 10000  Duplicates: 0  Warnings: 0

mysql> DELETE FROM balances_temp;
Query OK, 10000 rows affected (0.28 sec)

mysql> SELECT SLEEP(60);
+-----------+
| SLEEP(60) |
+-----------+
|         0 |
+-----------+
1 row in set (1 min 0.00 sec)

mysql> UPDATE users
    ->     SET points = rand()*100
    ->     WHERE rand() < .3;
Query OK, 3014 rows affected (0.53 sec)
Rows matched: 3044  Changed: 3014  Warnings: 0

mysql> INSERT INTO balances(userid, points)
    -> SELECT userid, points
    -> FROM balances_temp
    -> WHERE points != points_original OR points_original IS NULL;
Query OK, 3014 rows affected (0.19 sec)
Records: 3014  Duplicates: 0  Warnings: 0

mysql> DELETE FROM balances_temp;
Query OK, 3044 rows affected (0.13 sec)

mysql> SELECT * FROM balances WHERE userid < 10 ORDER BY timestamp_dump, userid;
+--------+--------+---------------------+
| userid | points | timestamp_dump      |
+--------+--------+---------------------+
|      1 |     42 | 2008-07-19 13:57:54 |
|      2 |     76 | 2008-07-19 13:57:54 |
|      3 |     50 | 2008-07-19 13:57:54 |
|      4 |     24 | 2008-07-19 13:57:54 |
|      5 |     70 | 2008-07-19 13:57:54 |
|      6 |     76 | 2008-07-19 13:57:54 |
|      7 |     72 | 2008-07-19 13:57:54 |
|      8 |     33 | 2008-07-19 13:57:54 |
|      9 |     48 | 2008-07-19 13:57:54 |
|      1 |      7 | 2008-07-19 13:58:56 |
|      2 |     33 | 2008-07-19 13:58:56 |
|      8 |     44 | 2008-07-19 13:58:56 |
|      9 |     19 | 2008-07-19 13:58:56 |
+--------+--------+---------------------+
13 rows in set (0.22 sec)

--
Rob Wultsch
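[Editor's note] The balances_temp staging pattern above can be sketched end to end in SQLite (via Python's sqlite3). SQLite has no ON DUPLICATE KEY UPDATE, so the update trigger below uses INSERT OR IGNORE followed by an UPDATE to play the same role of keeping one staging row per user; the integer snap column stands in for the TIMESTAMP column.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE users(userid INTEGER PRIMARY KEY, points INT);
    CREATE TABLE balances(userid INT, points INT, snap INT);
    CREATE TABLE balances_temp(userid INTEGER PRIMARY KEY,
                               points INT, points_original INT);
    CREATE TRIGGER users_insert AFTER INSERT ON users BEGIN
        INSERT INTO balances_temp(userid, points)
        VALUES (NEW.userid, NEW.points);
    END;
    -- Keep one staging row per user: first change records the original
    -- value, later changes just refresh the current value.
    CREATE TRIGGER users_update BEFORE UPDATE ON users BEGIN
        INSERT OR IGNORE INTO balances_temp(userid, points, points_original)
        VALUES (NEW.userid, NEW.points, OLD.points);
        UPDATE balances_temp SET points = NEW.points
        WHERE userid = NEW.userid;
    END;
    INSERT INTO users(userid, points) VALUES (1, 10), (2, 20);
""")

def snapshot(snap_id):
    # The periodic job only scans the small staging table, not all users.
    con.execute("""INSERT INTO balances(userid, points, snap)
                   SELECT userid, points, ? FROM balances_temp
                   WHERE points != points_original
                      OR points_original IS NULL""", (snap_id,))
    con.execute("DELETE FROM balances_temp")

snapshot(1)                                        # both users are new
con.execute("UPDATE users SET points = 55 WHERE userid = 2")
snapshot(2)                                        # only user 2 changed
history = con.execute("""SELECT userid, points, snap FROM balances
                         ORDER BY snap, userid""").fetchall()
print(history)
```

The second snapshot writes a single row, which is why the thread's timing for this approach (0.19 sec) beats the diff-query approach (20.05 sec): the expensive MAX-per-user join over the full history is gone.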
Re: Suggestion Setting For Highload Server
Hi Mike,

Thanks for the reply.

1. Currently the hardware is a P4 2.8 on Slackware 12 with 1GB of DDR
   memory (we plan to upgrade it).
2. The table type is MyISAM.
3. There are no slow queries, because all of the queries are of a simple type.
4. The tables' size is increasing dynamically, with at least 10 thousand
   new records/minute.
5. Here is the current my.cnf:

key_buffer          = 16M
max_allowed_packet  = 16M
thread_stack        = 128K
thread_cache_size   = 8
max_connections     = 1000
query_cache_limit   = 1M
query_cache_size    = 16M
tmp_table_size      = 64M
max_heap_table_size = 32M
log_slow_queries    = /var/log/mysql/mysql-slow.log
long_query_time     = 2
log-queries-not-using-indexes
log_bin             = /var/log/mysql/mysql-bin.log
expire_logs_days    = 10
max_binlog_size     = 100M
skip-bdb

[mysqldump]
quick
quote-names
max_allowed_packet  = 16M

[isamchk]
key_buffer          = 16M

TIA

Regards,
Willy

On Sat, 2008-07-19 at 15:45 -0500, mos wrote:
> At 12:11 PM 7/19/2008, sangprabv wrote:
> > Hi,
> > I have a situation where a MySQL server processes about 10-20 thousand
> > requests per minute. I need suggestions from you for tuning this
> > server to get an optimized setting.
> > TIA
> > Willy
>
> Willy,
> You will need to post more information:
> 1) What type of hardware are you running this on? (OS, drives, memory, etc.)
> 2) What table engine are you using (MyISAM or InnoDB)?
> 3) What do the slow queries look like? Updates? Table joins? Inserts?
>    Simple selects?
> 4) How large are the tables?
> 5) What does your my.cnf file look like?
> Mike

--
---
SMS SERVER, SMS BULK, SMS PREMIUM, WEB DESIGN, NETWORKING
Contact [EMAIL PROTECTED], 62 811 923 464, 62 21 944 8282 4
http://www.binbit.co.id
Re: Suggestion Setting For Highload Server
At 04:42 PM 7/19/2008, sangprabv wrote:
> Hi Mike,
> Thanks for the reply.
> 1. Currently the hardware is a P4 2.8 on Slackware 12 with 1GB of DDR
>    memory (we plan to upgrade it).
> 2. The table type is MyISAM.
> 3. There are no slow queries, because all of the queries are of a simple type.
> 4. The tables' size is increasing dynamically, with at least 10 thousand
>    new records/minute.
> 5. Here is the current my.cnf:
> key_buffer = 16M
> max_allowed_packet = 16M
> thread_stack = 128K
> thread_cache_size = 8
> max_connections = 1000
> query_cache_limit = 1M
> query_cache_size = 16M
> tmp_table_size = 64M
> max_heap_table_size = 32M
> log_slow_queries = /var/log/mysql/mysql-slow.log
> long_query_time = 2
> log-queries-not-using-indexes
> log_bin = /var/log/mysql/mysql-bin.log
> expire_logs_days = 10
> max_binlog_size = 100M
> skip-bdb
> [mysqldump]
> quick
> quote-names
> max_allowed_packet = 16M
> [isamchk]
> key_buffer = 16M
> TIA
> Regards,
> Willy

Willy,

You definitely need more RAM. Use as much RAM as you can cram into the
machine (8-16GB).

You didn't mention a problem with table locking? As the rows are being
inserted, is anyone reading from the table? If so, the table will be
locked as rows are inserted, and people are prevented from executing a
SELECT on the table until the inserts are completed. You can get around
this locking problem by optimizing the table, which removes holes in the
table (deleted rows); then the inserts will not require table locks. If
you want to delete a row, set a row column to 0 to indicate "deleted" so
it can't be seen by your SELECT statements. Once a day, delete these
deleted rows and re-optimize the table.

The slowest part about rebuilding a large table is building the index. You
will need to add 500M (or more) to key_buffer_size in order to try and
build the index in memory (100x faster than building the index on disk).

Do you really need 1000 connections? Are you not using connection pooling?
Otherwise it appears your settings are extremely conservative.

There is a book called "High Performance MySQL: Optimization, Backups,
Replication, and More", published 2008. It is available from Amazon.
(There was a 2004 version of the book, so make sure you order the newer
one.) You can also post your explanation to the MySQL list to get more help.

Mike
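[Editor's note] The soft-delete idea above can be sketched concretely. The example below uses SQLite for a self-contained demo; in MyISAM the payoff is that INSERTs into a table with no deleted-row holes can proceed concurrently with SELECTs, and the nightly purge would be followed by OPTIMIZE TABLE. The msgs table name is just illustrative.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE msgs(id INTEGER PRIMARY KEY, body TEXT,
                                 deleted INT DEFAULT 0)""")
con.executemany("INSERT INTO msgs(body) VALUES (?)",
                [("a",), ("b",), ("c",)])

# 'Delete' is just a flag flip; no hole is created in the table.
con.execute("UPDATE msgs SET deleted = 1 WHERE id = 2")

# All reads filter on the flag.
visible = con.execute(
    "SELECT body FROM msgs WHERE deleted = 0 ORDER BY id").fetchall()
print(visible)

# Nightly batch: physically remove flagged rows, then (in MyISAM)
# OPTIMIZE TABLE to compact the data file and rebuild indexes.
con.execute("DELETE FROM msgs WHERE deleted = 1")
```

The trade-off is that every read query must carry the deleted = 0 predicate (or read through a view that applies it), and the flag column should be part of any covering indexes.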
Insert ... select ... On Duplicate Update Question
Is there a way to get INSERT ... SELECT ... ON DUPLICATE KEY UPDATE to
update the row with the duplicate key? Otherwise I'll have to use REPLACE,
which is inefficient because it deletes the old duplicated row and then
inserts the new row with the same key. I'd much rather have it update the
existing row with the same existing keys, but use the new values from the
SELECT statement. Why can't it do this? It would be much faster.

Mike
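[Editor's note] The delete-then-reinsert behavior of REPLACE versus an in-place upsert can be observed directly. The sketch below uses SQLite, where the upsert is spelled INSERT ... SELECT ... ON CONFLICT DO UPDATE (MySQL spells it ON DUPLICATE KEY UPDATE); the rowid makes the difference visible, since REPLACE assigns a fresh one while the upsert leaves the row in place. Table names here are illustrative.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE target(k TEXT PRIMARY KEY, v INT);
    CREATE TABLE source(k TEXT, v INT);
    INSERT INTO target VALUES ('a', 1), ('b', 0);
    INSERT INTO source VALUES ('a', 2);
""")
before = con.execute("SELECT rowid FROM target WHERE k = 'a'").fetchone()[0]

# Upsert: the duplicate-key row is UPDATEd in place with the SELECT's
# values. (The 'WHERE true' disambiguates SQLite's INSERT..SELECT..ON
# CONFLICT grammar.)
con.execute("""INSERT INTO target(k, v)
               SELECT k, v FROM source WHERE true
               ON CONFLICT(k) DO UPDATE SET v = excluded.v""")
after_upsert = con.execute(
    "SELECT rowid, v FROM target WHERE k = 'a'").fetchone()

# REPLACE: the conflicting row is deleted and a new row inserted,
# so it gets a brand-new rowid.
con.execute("REPLACE INTO target(k, v) VALUES ('a', 3)")
after_replace = con.execute(
    "SELECT rowid, v FROM target WHERE k = 'a'").fetchone()
print(before, after_upsert, after_replace)
```

The upsert keeps the original rowid while updating the value; REPLACE produces a new rowid, which is exactly the delete-and-reinsert cost (and FOREIGN KEY / trigger side effects) the question is trying to avoid.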