Help with purging old logs for each customer ID

2012-10-25 Thread Daevid Vincent
I have a customer log table that is starting to rapidly fill up (we have
hundreds of thousands of users, but many are transient, and use the service
for a few months, or use the free trial and quit, etc.)

CREATE TABLE `customers_log` (
  `customer_log_id` bigint(20) unsigned NOT NULL auto_increment,
  `customer_id` int(10) unsigned default '0',
  `created_on` timestamp NOT NULL default CURRENT_TIMESTAMP on update CURRENT_TIMESTAMP,
  `type` enum('View','Action','Admin','Search','Login','Logout','Access','General','API'),
  `source` enum('web','mobile','system'),
  `body` text,
  PRIMARY KEY  (`customer_log_id`),
  KEY `created_on` (`created_on`),
  KEY `customers_id` (`customer_id`)
) ENGINE=InnoDB

What I'd like to do now is make a 'rolling log': I want to DELETE any
entries older than 90 days for EACH `customer_id`.

I'm not sure how to do that in a single query. I'd rather not iterate over
each customer_id if I can help it.







RE: Help with purging old logs for each customer ID

2012-10-25 Thread Rick James
Off hand, I would iterate over the PRIMARY KEY, looking at a thousand rows at a 
time, DELETEing any that need to be purged.  I would use a Perl or PHP loop, or 
write a stored procedure.  More discussion of huge deletes (which this 
_could_ be):
http://mysql.rjweb.org/doc.php/deletebig
(PARTITIONing does not apply in your case, as I understand it.)

I like the loop:

SELECT @z := customer_log_id FROM customers_log LIMIT 1000, 1;
DELETE FROM customers_log
WHERE customer_log_id >= @left_off AND customer_log_id < @z
  AND created_on < NOW() - INTERVAL 90 DAY;
sleep a few seconds (to be a nice guy)

Plus code to take care of iterating and terminating.

That loop could be done continually.
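
For what it's worth, the whole loop can also live server-side as a stored
procedure. A minimal, untested sketch (the procedure name, the 1000-row chunk
size, and the 2-second sleep are arbitrary choices of mine; the table and
columns are the ones from the CREATE TABLE above):

DELIMITER //
CREATE PROCEDURE purge_customers_log_90d()
BEGIN
  DECLARE left_off BIGINT UNSIGNED DEFAULT 0;
  DECLARE z        BIGINT UNSIGNED;
  purge_loop: LOOP
    SET z = NULL;
    -- find the id roughly 1000 rows past where we left off
    SELECT customer_log_id INTO z
    FROM customers_log
    WHERE customer_log_id > left_off
    ORDER BY customer_log_id
    LIMIT 1000, 1;
    IF z IS NULL THEN
      LEAVE purge_loop;        -- fewer than 1000 rows remain
    END IF;
    DELETE FROM customers_log
    WHERE customer_log_id >= left_off AND customer_log_id < z
      AND created_on < NOW() - INTERVAL 90 DAY;
    SET left_off = z;
    DO SLEEP(2);               -- be a nice guy
  END LOOP;
  -- take care of the final partial chunk
  DELETE FROM customers_log
  WHERE customer_log_id >= left_off
    AND created_on < NOW() - INTERVAL 90 DAY;
END //
DELIMITER ;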

It seems that customer_id is irrelevant??  




RE: Help with purging old logs for each customer ID

2012-10-25 Thread Daevid Vincent
Well, the customer_id is relevant in that I want the last 90 days relative
to each customer.

customer_id = 123 might have logs from jan - mar
customer_id = 444 might have logs from feb - may

So it's a rolling log of THEIR last 90 days: from each customer's most recent
log entry, back 90 days.

Does that make more sense?

I guess I was trying to avoid looping over every customer ID and computing a
cutoff for each, if I could help it. I thought that by using a GROUP BY or
something I could group all the logs for a given customer and then trim them
that way.

But maybe brute force is the way to go?
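
For concreteness, the per-customer spread being described is easy to see with
a quick GROUP BY (purely a diagnostic query against the table above, not the
purge itself):

SELECT customer_id,
       MIN(created_on) AS first_log,
       MAX(created_on) AS last_log,
       COUNT(*)        AS entries
FROM customers_log
GROUP BY customer_id;

The purge then needs a per-customer cutoff of MAX(created_on) - INTERVAL 90 DAY,
which is what the next reply builds on.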




RE: Help with purging old logs for each customer ID

2012-10-25 Thread Rick James
If the 90 days is back from MAX(created_on) for a given customer...
   INDEX(customer_id, created_on)
will probably be needed.  And that should replace KEY `customers_id` 
(`customer_id`).
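
In concrete terms, that index swap would look something like this (a sketch;
the new index name is arbitrary, and on a big InnoDB table the ALTER can take
a while and needs spare disk space):

ALTER TABLE customers_log
  DROP KEY customers_id,
  ADD KEY customer_id_created_on (customer_id, created_on);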

Maybe...

DELETE FROM customers_log AS a
WHERE a.customer_log_id >= @left_off AND a.customer_log_id < @z
  AND a.created_on <
      ( SELECT MAX(created_on) FROM customers_log
        WHERE customer_id = a.customer_id
      ) - INTERVAL 90 DAY;
(Since this has the subquery, I would do only 100 at a time, not 1000)

Or...
CREATE TEMPORARY TABLE tmp
SELECT customer_id, MAX(created_on) - INTERVAL 90 DAY  AS cutoff
FROM customers_log
GROUP BY customer_id;
DELETE a FROM customers_log AS a
   JOIN tmp  ON a.customer_id = tmp.customer_id
WHERE a.customer_log_id >= @left_off AND a.customer_log_id < @z
  AND a.created_on < tmp.cutoff;

If you have millions of rows, a delete without some kind of loop is asking for 
trouble.

Or...
Turning things around to base it on customers...
Loop through customer_ids (yeah, you did not want to do this)

SELECT @id := 0;
StartLoop:
SELECT @id := customer_id FROM customers_log WHERE customer_id > @id
ORDER BY customer_id LIMIT 1;
if @id is NULL, exit
DELETE FROM customers_log AS a
WHERE a.customer_id = @id
  AND a.created_on <
      ( SELECT MAX(created_on) FROM customers_log
        WHERE customer_id = @id
      ) - INTERVAL 90 DAY;
EndLoop.
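
One caveat on the DELETEs above that subquery the same table: MySQL normally
refuses that with "You can't specify target table ... for update in FROM
clause" (error 1093). A workaround, assuming the loop fetches the cutoff into
a user variable first, is roughly:

SELECT MAX(created_on) - INTERVAL 90 DAY INTO @cutoff
FROM customers_log
WHERE customer_id = @id;

DELETE FROM customers_log
WHERE customer_id = @id
  AND created_on < @cutoff;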

Since there is no rush for the purging, there is little need to optimize it 
other than to keep it from interfering with other queries.  To that end, the 
compound index I propose is important.


RE: Help with purging old logs for each customer ID

2012-10-25 Thread Daevid Vincent
Thank you Rick! You're a super freak! ;-p

This gets me in the vicinity of where I'm trying to go, and I learned a few
new tricks with the StartLoop: stuff too! Neat!

d.
