Help with purging old logs for each customer ID
I have a customer log table that is starting to rapidly fill up (we have hundreds of thousands of users, but many are transient, and use the service for a few months, or use the free trial and quit, etc.) CREATE TABLE `customers_log` ( `customer_log_id` bigint(20) unsigned NOT NULL auto_increment, `customer_id` int(10) unsigned default '0', `created_on` timestamp NOT NULL default CURRENT_TIMESTAMP on update CURRENT_TIMESTAMP, `type` enum('View','Action','Admin','Search','Login','Logout','Access','General','A PI'), `source` enum('web','mobile','system'), `body` text, PRIMARY KEY (`customer_log_id`), KEY `created_on` (`created_on`), KEY `customers_id` (`customer_id`) ) ENGINE=InnoDB What I'd like to do now is make a 'rolling log' in that I want to DELETE any entries older than 90 days for EACH `customer_id`. I'm not sure how to do that in a query? I'd rather not iterate over each customer_id if I can help it. -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/mysql
RE: Help with purging old logs for each customer ID
Off hand, I would iterate over the PRIMARY KEY, looking at a thousand rows at a time, DELETEing any that need to be purged. I would use a Perl or PHP loop, or write a stored procedure. More discussion of huge deletes (which this _could_ be): http://mysql.rjweb.org/doc.php/deletebig (PARTITIONing does not apply in your case, as I understand it.) I like the loop: SELECT @z := customer_log_id FROM customers_log LIMIT 1000, 1; DELETE FROM customers_log WHERE customer_log_id = @left_off AND customer_log_id @z AND created_on NOW() - INTERVAL 90 DAY; sleep a few seconds (to be a nice guy) Plus code to take care of iterating and terminating. That loop could be done continually. It seems that customer_id is irrelevant?? -Original Message- From: Daevid Vincent [mailto:dae...@daevid.com] Sent: Thursday, October 25, 2012 11:46 AM To: mysql@lists.mysql.com Subject: Help with purging old logs for each customer ID I have a customer log table that is starting to rapidly fill up (we have hundreds of thousands of users, but many are transient, and use the service for a few months, or use the free trial and quit, etc.) CREATE TABLE `customers_log` ( `customer_log_id` bigint(20) unsigned NOT NULL auto_increment, `customer_id` int(10) unsigned default '0', `created_on` timestamp NOT NULL default CURRENT_TIMESTAMP on update CURRENT_TIMESTAMP, `type` enum('View','Action','Admin','Search','Login','Logout','Access','Genera l','A PI'), `source` enum('web','mobile','system'), `body` text, PRIMARY KEY (`customer_log_id`), KEY `created_on` (`created_on`), KEY `customers_id` (`customer_id`) ) ENGINE=InnoDB What I'd like to do now is make a 'rolling log' in that I want to DELETE any entries older than 90 days for EACH `customer_id`. I'm not sure how to do that in a query? I'd rather not iterate over each customer_id if I can help it. -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/mysql -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/mysql
RE: Help with purging old logs for each customer ID
Well, the customer_id is relevant in that I want the last 90 days relative to each customer. customer_id = 123 might have logs from jan - mar customer_id = 444 might have logs from feb - may So it's a rolling log of THEIR last 90 days from their last log (most recent) back 90 days from there. Does that make more sense? I guess I was trying to avoid looping over every customer ID and computing if I could help it. I thought by using a GROUP BY or something it could group all the logs for a given customer and then trim them that way. But maybe brute force is the way to go? -Original Message- From: Rick James [mailto:rja...@yahoo-inc.com] Sent: Thursday, October 25, 2012 1:09 PM To: Daevid Vincent; mysql@lists.mysql.com Subject: RE: Help with purging old logs for each customer ID Off hand, I would iterate over the PRIMARY KEY, looking at a thousand rows at a time, DELETEing any that need to be purged. I would use a Perl or PHP loop, or write a stored procedure. More discussion of huge deletes (which this _could_ be): http://mysql.rjweb.org/doc.php/deletebig (PARTITIONing does not apply in your case, as I understand it.) I like the loop: SELECT @z := customer_log_id FROM customers_log LIMIT 1000, 1; DELETE FROM customers_log WHERE customer_log_id = @left_off AND customer_log_id @z AND created_on NOW() - INTERVAL 90 DAY; sleep a few seconds (to be a nice guy) Plus code to take care of iterating and terminating. That loop could be done continually. It seems that customer_id is irrelevant?? -Original Message- From: Daevid Vincent [mailto:dae...@daevid.com] Sent: Thursday, October 25, 2012 11:46 AM To: mysql@lists.mysql.com Subject: Help with purging old logs for each customer ID I have a customer log table that is starting to rapidly fill up (we have hundreds of thousands of users, but many are transient, and use the service for a few months, or use the free trial and quit, etc.) CREATE TABLE `customers_log` ( `customer_log_id` bigint(20) unsigned NOT NULL auto_increment, `customer_id` int(10) unsigned default '0', `created_on` timestamp NOT NULL default CURRENT_TIMESTAMP on update CURRENT_TIMESTAMP, `type` enum('View','Action','Admin','Search','Login','Logout','Access','Genera l','A PI'), `source` enum('web','mobile','system'), `body` text, PRIMARY KEY (`customer_log_id`), KEY `created_on` (`created_on`), KEY `customers_id` (`customer_id`) ) ENGINE=InnoDB What I'd like to do now is make a 'rolling log' in that I want to DELETE any entries older than 90 days for EACH `customer_id`. I'm not sure how to do that in a query? I'd rather not iterate over each customer_id if I can help it. -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/mysql -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/mysql -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/mysql
RE: Help with purging old logs for each customer ID
If the 90 days is back from MAX(created_on) for a given customer... INDEX(customer_id, created_on) will probably be needed. And that should replace KEY `customers_id` (`customer_id`). Maybe... DELETE FROM customers_log AS a WHERE a.customer_log_id = @left_off AND a.customer_log_id @z AND a.created_on ( SELECT MAX(created_on) FROM customers_log WHERE customer_id = a.customer_id ) - INTERVAL 90 DAY ); (Since this has the subquery, I would do only 100 at a time, not 1000) Or... CREATE TEMPORARY TABLE tmp SELECT customer_id, MAX(created_on) - INTERVAL 90 DAY AS cutoff FROM customers_log GROUP BY customer_id; DELETE FROM customers_log AS a JOIN tmp ON a.customer_id = tmp.customer_id WHERE a.customer_log_id = @left_off AND a.customer_log_id @z AND a.created_on tmp.cutoff; If you have millions of rows, a delete without some kind of loop is asking for trouble. Or... Turning things around to base it on customers... Loop through customer_ids (yeah, you did not want to do this) SELECT @id := 0; StartLoop: SELECT @id := customer_id WHERE customer_id @id ORDER BY customer_id LIMIT 1; if @id is NULL, exit DELETE FROM customers_log AS a WHERE a.customer_id = @id AND a.created_on ( SELECT MAX(created_on) FROM customers_log WHERE customer_id = @id ) - INTERVAL 90 DAY ); EndLoop. Since there is no rush for the purging, there is little need to optimize it other than to keep it from interfering with other queries. To that end, the compound index I propose is important. -Original Message- From: Daevid Vincent [mailto:dae...@daevid.com] Sent: Thursday, October 25, 2012 1:33 PM To: Rick James; mysql@lists.mysql.com Subject: RE: Help with purging old logs for each customer ID Well, the customer_id is relevant in that I want the last 90 days relative to each customer. customer_id = 123 might have logs from jan - mar customer_id = 444 might have logs from feb - may So it's a rolling log of THEIR last 90 days from their last log (most recent) back 90 days from there. Does that make more sense? I guess I was trying to avoid looping over every customer ID and computing if I could help it. I thought by using a GROUP BY or something it could group all the logs for a given customer and then trim them that way. But maybe brute force is the way to go? -Original Message- From: Rick James [mailto:rja...@yahoo-inc.com] Sent: Thursday, October 25, 2012 1:09 PM To: Daevid Vincent; mysql@lists.mysql.com Subject: RE: Help with purging old logs for each customer ID Off hand, I would iterate over the PRIMARY KEY, looking at a thousand rows at a time, DELETEing any that need to be purged. I would use a Perl or PHP loop, or write a stored procedure. More discussion of huge deletes (which this _could_ be): http://mysql.rjweb.org/doc.php/deletebig (PARTITIONing does not apply in your case, as I understand it.) I like the loop: SELECT @z := customer_log_id FROM customers_log LIMIT 1000, 1; DELETE FROM customers_log WHERE customer_log_id = @left_off AND customer_log_id @z AND created_on NOW() - INTERVAL 90 DAY; sleep a few seconds (to be a nice guy) Plus code to take care of iterating and terminating. That loop could be done continually. It seems that customer_id is irrelevant?? -Original Message- From: Daevid Vincent [mailto:dae...@daevid.com] Sent: Thursday, October 25, 2012 11:46 AM To: mysql@lists.mysql.com Subject: Help with purging old logs for each customer ID I have a customer log table that is starting to rapidly fill up (we have hundreds of thousands of users, but many are transient, and use the service for a few months, or use the free trial and quit, etc.) CREATE TABLE `customers_log` ( `customer_log_id` bigint(20) unsigned NOT NULL auto_increment, `customer_id` int(10) unsigned default '0', `created_on` timestamp NOT NULL default CURRENT_TIMESTAMP on update CURRENT_TIMESTAMP, `type` enum('View','Action','Admin','Search','Login','Logout','Access','Gen era l','A PI'), `source` enum('web','mobile','system'), `body` text, PRIMARY KEY (`customer_log_id`), KEY `created_on` (`created_on`), KEY `customers_id` (`customer_id`) ) ENGINE=InnoDB What I'd like to do now is make a 'rolling log' in that I want to DELETE any entries older than 90 days for EACH `customer_id`. I'm not sure how to do that in a query? I'd rather not iterate over each customer_id if I can help it. -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/mysql -- MySQL General Mailing List For list
RE: Help with purging old logs for each customer ID
Thank you Rick! You're a super freak! ;-p This gets me in the vicinity of where I'm trying to go and I learned a few new tricks with the StartLoop: stuff too! Neat! d. -Original Message- From: Rick James [mailto:rja...@yahoo-inc.com] Sent: Thursday, October 25, 2012 1:54 PM To: Daevid Vincent; mysql@lists.mysql.com Subject: RE: Help with purging old logs for each customer ID If the 90 days is back from MAX(created_on) for a given customer... INDEX(customer_id, created_on) will probably be needed. And that should replace KEY `customers_id` (`customer_id`). Maybe... DELETE FROM customers_log AS a WHERE a.customer_log_id = @left_off AND a.customer_log_id @z AND a.created_on ( SELECT MAX(created_on) FROM customers_log WHERE customer_id = a.customer_id ) - INTERVAL 90 DAY ); (Since this has the subquery, I would do only 100 at a time, not 1000) Or... CREATE TEMPORARY TABLE tmp SELECT customer_id, MAX(created_on) - INTERVAL 90 DAY AS cutoff FROM customers_log GROUP BY customer_id; DELETE FROM customers_log AS a JOIN tmp ON a.customer_id = tmp.customer_id WHERE a.customer_log_id = @left_off AND a.customer_log_id @z AND a.created_on tmp.cutoff; If you have millions of rows, a delete without some kind of loop is asking for trouble. Or... Turning things around to base it on customers... Loop through customer_ids (yeah, you did not want to do this) SELECT @id := 0; StartLoop: SELECT @id := customer_id WHERE customer_id @id ORDER BY customer_id LIMIT 1; if @id is NULL, exit DELETE FROM customers_log AS a WHERE a.customer_id = @id AND a.created_on ( SELECT MAX(created_on) FROM customers_log WHERE customer_id = @id ) - INTERVAL 90 DAY ); EndLoop. Since there is no rush for the purging, there is little need to optimize it other than to keep it from interfering with other queries. To that end, the compound index I propose is important. -Original Message- From: Daevid Vincent [mailto:dae...@daevid.com] Sent: Thursday, October 25, 2012 1:33 PM To: Rick James; mysql@lists.mysql.com Subject: RE: Help with purging old logs for each customer ID Well, the customer_id is relevant in that I want the last 90 days relative to each customer. customer_id = 123 might have logs from jan - mar customer_id = 444 might have logs from feb - may So it's a rolling log of THEIR last 90 days from their last log (most recent) back 90 days from there. Does that make more sense? I guess I was trying to avoid looping over every customer ID and computing if I could help it. I thought by using a GROUP BY or something it could group all the logs for a given customer and then trim them that way. But maybe brute force is the way to go? -Original Message- From: Rick James [mailto:rja...@yahoo-inc.com] Sent: Thursday, October 25, 2012 1:09 PM To: Daevid Vincent; mysql@lists.mysql.com Subject: RE: Help with purging old logs for each customer ID Off hand, I would iterate over the PRIMARY KEY, looking at a thousand rows at a time, DELETEing any that need to be purged. I would use a Perl or PHP loop, or write a stored procedure. More discussion of huge deletes (which this _could_ be): http://mysql.rjweb.org/doc.php/deletebig (PARTITIONing does not apply in your case, as I understand it.) I like the loop: SELECT @z := customer_log_id FROM customers_log LIMIT 1000, 1; DELETE FROM customers_log WHERE customer_log_id = @left_off AND customer_log_id @z AND created_on NOW() - INTERVAL 90 DAY; sleep a few seconds (to be a nice guy) Plus code to take care of iterating and terminating. That loop could be done continually. It seems that customer_id is irrelevant?? -Original Message- From: Daevid Vincent [mailto:dae...@daevid.com] Sent: Thursday, October 25, 2012 11:46 AM To: mysql@lists.mysql.com Subject: Help with purging old logs for each customer ID I have a customer log table that is starting to rapidly fill up (we have hundreds of thousands of users, but many are transient, and use the service for a few months, or use the free trial and quit, etc.) CREATE TABLE `customers_log` ( `customer_log_id` bigint(20) unsigned NOT NULL auto_increment, `customer_id` int(10) unsigned default '0', `created_on` timestamp NOT NULL default CURRENT_TIMESTAMP on update CURRENT_TIMESTAMP, `type` enum('View','Action','Admin','Search','Login','Logout','Access','Gen era l','A PI'), `source` enum('web','mobile','system'), `body` text, PRIMARY KEY (`customer_log_id`), KEY `created_on` (`created_on