* Bengt Lindholm
> I have a table where I need to group the content on a timestamp. Any
> record that is less than say 5 minutes from any other record needs to
> be grouped with that other record.
>
> ID   timestamp
>   1   2004-02-02 12:00:00
>   2   2004-02-02 12:00:05
>   3   2004-02-02 12:05:20
>   4   2004-02-02 12:15:00
>   5   2004-02-02 12:17:15
>   6   2004-02-02 12:21:20
>
> With this recordset I would get:
> 1 & 2 should be grouped since there's less than 5 minutes between the
> records
> 3 will be in a group of its own
> 4, 5 & 6 is a third group since any record in the group has less than 5
> minutes to another record in the group
>
> Is this possible to do in MySQL, and how is it accomplished?

You could use mysql user variables:

mysql> set @a:=NULL,@b:=0;
Query OK, 0 rows affected (0.00 sec)

mysql> select if(clicktime-interval 5 minute>[EMAIL PROTECTED],@b:[EMAIL 
PROTECTED],@b) x,
    -> @a:=clicktime from stats where clicktime between
    -> "2003-11-24 02:00:00" and "2003-11-24 02:59:59"
    -> order by clicktime;
+------+---------------------+
| x    | @a:=clicktime       |
+------+---------------------+
|    0 | 2003-11-24 02:09:57 |
|    0 | 2003-11-24 02:10:05 |
|    0 | 2003-11-24 02:10:09 |
|    0 | 2003-11-24 02:10:22 |
|    0 | 2003-11-24 02:10:24 |
  :
|    0 | 2003-11-24 02:41:42 |
|    0 | 2003-11-24 02:41:43 |
|    0 | 2003-11-24 02:43:12 |
|    1 | 2003-11-24 02:49:19 |
|    1 | 2003-11-24 02:50:46 |
  :
|    1 | 2003-11-24 02:56:28 |
|    1 | 2003-11-24 02:56:41 |
|    1 | 2003-11-24 02:56:50 |
+------+---------------------+
136 rows in set (0.01 sec)

As you can see, I only get two groups with my test data, and the result
seems to be correct, but when I apply a GROUP BY and a COUNT() I get a
different result. I get groups of all rows within five minutes of the
_first_ row of each group:

mysql> set @a:=NULL,@b:=0;
Query OK, 0 rows affected (0.00 sec)

mysql> select if(clicktime-interval 5 minute>[EMAIL PROTECTED],@b:[EMAIL 
PROTECTED],@b) x,
    -> @a:=clicktime,count(*) from stats where clicktime between
    -> "2003-11-24 02:00:00" and "2003-11-24 02:59:59"
    -> group by x order by clicktime;
+------+---------------------+----------+
| x    | @a:=clicktime       | count(*) |
+------+---------------------+----------+
|    0 | 2003-11-24 02:09:57 |       22 |
|    2 | 2003-11-24 02:15:17 |       29 |
|    4 | 2003-11-24 02:20:55 |       17 |
|    6 | 2003-11-24 02:25:55 |       22 |
|    8 | 2003-11-24 02:31:12 |        4 |
|   10 | 2003-11-24 02:36:19 |       21 |
|   12 | 2003-11-24 02:41:42 |        3 |
|   14 | 2003-11-24 02:49:19 |       13 |
|   16 | 2003-11-24 02:56:12 |        5 |
+------+---------------------+----------+
9 rows in set (0.00 sec)

This seems to have something to do with how mysql handles GROUP BY queries,
the @a variable is not re-assigned for each row. Note that this is a
non-standard GROUP BY statement, as a column (clicktime) is used in the
field list but not in the GROUP BY clause. Using a group function (MIN() in
this case) on the assignment expression makes this a more standard compliant
GROUP BY, and it seems to give the result we want:

mysql> set @a:=NULL,@b:=0;
Query OK, 0 rows affected (0.00 sec)

mysql> select if(clicktime-interval 5 minute>[EMAIL PROTECTED],@b:[EMAIL 
PROTECTED],@b) x,
    -> min(@a:=clicktime),count(*) from stats where clicktime between
    -> "2003-11-24 02:00:00" and "2003-11-24 02:59:59"
    -> group by x order by clicktime;
+------+---------------------+----------+
| x    | min(@a:=clicktime)  | count(*) |
+------+---------------------+----------+
|    0 | 2003-11-24 02:09:57 |      118 |
|    1 | 2003-11-24 02:49:19 |       18 |
+------+---------------------+----------+
2 rows in set (0.01 sec)

--
Roger


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:    http://lists.mysql.com/[EMAIL PROTECTED]

Reply via email to