Re: [SQL] selecting records X minutes apart

2011-06-05 Thread Masaru Sugawara
On Fri, 3 Jun 2011 15:52:53 -0400
lists-pg...@useunix.net wrote:


I also think you might want to use WITH RECURSIVE clause. 
This SQL searches the case of an interval of 5 minutes or more, 
and sets a relationship between a parent to its child.


CREATE TABLE tbl(id integer, ts time) ;
INSERT INTO tbl VALUES
(0, '20:00'),
(0, '20:05'),
(0, '20:08'),
(0, '20:10'),
(0, '20:11'),
(1, '20:03'),
(1, '20:04'),
(1, '20:05'),
(1, '20:09'),
(1, '20:16');

SELECT * FROM tbl;


--
WITH RECURSIVE rec(id , ts_p, ts_c) AS (
SELECT a1.id, min(a1.ts), min(b1.ts)
   FROM tbl AS a1, tbl AS b1 
   WHERE a1.id=b1.id AND a1.ts + interval'5 minute' <= b1.ts 
   GROUP BY a1.id
UNION ALL
SELECT t2.id, t2.ts_p, t2.ts_c
   FROM rec AS t1 INNER JOIN 
   (SELECT a2.id, a2.ts as ts_p, min(b2.ts) AS ts_c
   FROM tbl AS a2, tbl AS b2 
   WHERE a2.id = b2.id AND a2.ts + interval'5 minute' <= b2.ts 
   GROUP BY a2.id, a2.ts
UNION ALL
SELECT a3.id, a3.ts, null
   FROM tbl AS a3  
) AS t2 ON t1.id = t2.id AND t1.ts_c=t2.ts_p
)
SELECT DISTINCT id, ts_p AS ts FROM rec
ORDER BY 1,2;






> I have a table that, at a minimum, has ID and timestamp columns.  Records
> are inserted into with random IDs and timestamps.  Duplicate IDs are allowed. 
> 
> I want to select records grouped by ID, ordered by timestamp that are X 
> minutes
> apart. In this case X is 5.
> 
> Note, the intervals are not X minute wall clock intervals, they are X minute
> intervals from the last accepted record, per-id.
> 
> For instance here is some sample input data:
> 
> IDTS (HH:MM)
> ---
> 0 20:00
> 1 20:03
> 1 20:04
> 0 20:05
> 1 20:05
> 0 20:08
> 1 20:09
> 0 20:10
> 
> I'd want the select to return:
> 
> IDTS (HH:MM)
> ---
> 0 20:00
> 0 20:05
> 0 20:10
> 1 20:03
> 1 20:09
> 
> 
> Does my question make sense?
> 
> Thanks in advance,
> Wayne
> 
> -- 
> Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-sql




-- 
Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql


Re: [SQL] selecting records X minutes apart

2011-06-05 Thread Masaru Sugawara
On Fri, 3 Jun 2011 15:52:53 -0400
lists-pg...@useunix.net wrote:


I also think you might want to use WITH RECURSIVE clause. 
This SQL searches the case of an interval of 5 minutes or more, 
and sets a relationship between a parent to its child.


CREATE TABLE tbl(id integer, ts time) ;
INSERT INTO tbl VALUES
(0, '20:00'),
(0, '20:05'),
(0, '20:08'),
(0, '20:10'),
(0, '20:11'),   -- added as a sample. 
(1, '20:03'),
(1, '20:04'),
(1, '20:05'),
(1, '20:09'),
(1, '20:16');   -- added as a sample. 

SELECT * FROM tbl;


--
WITH RECURSIVE rec(id , ts_p, ts_c) AS (
SELECT a1.id, min(a1.ts), min(b1.ts)
   FROM tbl AS a1, tbl AS b1 
   WHERE a1.id=b1.id AND a1.ts + interval'5 minute' <= b1.ts 
   GROUP BY a1.id
UNION ALL
SELECT t2.id, t2.ts_p, t2.ts_c
   FROM rec AS t1 INNER JOIN 
   (SELECT a2.id, a2.ts as ts_p, min(b2.ts) AS ts_c
   FROM tbl AS a2, tbl AS b2 
   WHERE a2.id = b2.id AND a2.ts + interval'5 minute' <= b2.ts 
   GROUP BY a2.id, a2.ts
UNION ALL
SELECT a3.id, a3.ts, null
   FROM tbl AS a3  
) AS t2 ON t1.id = t2.id AND t1.ts_c=t2.ts_p
)
SELECT DISTINCT id, ts_p AS ts FROM rec
ORDER BY 1,2;




Regards,
Masaru Sugawara


> I have a table that, at a minimum, has ID and timestamp columns.  Records
> are inserted into with random IDs and timestamps.  Duplicate IDs are allowed. 
> 
> I want to select records grouped by ID, ordered by timestamp that are X 
> minutes
> apart. In this case X is 5.
> 
> Note, the intervals are not X minute wall clock intervals, they are X minute
> intervals from the last accepted record, per-id.
> 
> For instance here is some sample input data:
> 
> IDTS (HH:MM)
> ---
> 0 20:00
> 1 20:03
> 1 20:04
> 0 20:05
> 1 20:05
> 0 20:08
> 1 20:09
> 0 20:10
> 
> I'd want the select to return:
> 
> IDTS (HH:MM)
> ---
> 0 20:00
> 0 20:05
> 0 20:10
> 1 20:03
> 1 20:09
> 
> 
> Does my question make sense?
> 
> Thanks in advance,
> Wayne
> 
> -- 
> Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-sql




-- 
Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql


Re: [SQL] selecting records X minutes apart

2011-06-05 Thread lists-pgsql
Thank you all who replied!!  It looks like Sugawara's recursive solution does
the trick.  Unfortunately performance is quite poor for the sample dataset I'm
working with which is a table of about 5 records.  Indeed, there are
indexes applied to the table.  I believe the recursive select is being
executed a great number of times causing the first part of the query to take a
long time.

The fastest solution I've come up with is a plpgsql procedure the loops over a
select where the result is ordered by (id,tstamp) and examines the tstamp
values and only returns rows that meet the interval criteria. This technique
takes roughly 2 seconds to filter out records over my 5 record sample
set which is acceptable but not nearly as elegant as a single SQL
statement.

Again, thank you for all the replies.

Wayne

On Sun, Jun 05, 2011 at 08:52:30PM +0900, Masaru Sugawara wrote:
> On Fri, 3 Jun 2011 15:52:53 -0400
> lists-pg...@useunix.net wrote:
> 
> 
> I also think you might want to use WITH RECURSIVE clause. 
> This SQL searches the case of an interval of 5 minutes or more, 
> and sets a relationship between a parent to its child.
> 
> 
> CREATE TABLE tbl(id integer, ts time) ;
> INSERT INTO tbl VALUES
> (0, '20:00'),
> (0, '20:05'),
> (0, '20:08'),
> (0, '20:10'),
> (0, '20:11'),
> (1, '20:03'),
> (1, '20:04'),
> (1, '20:05'),
> (1, '20:09'),
> (1, '20:16');
> 
> SELECT * FROM tbl;
> 
> 
> --
> WITH RECURSIVE rec(id , ts_p, ts_c) AS (
> SELECT a1.id, min(a1.ts), min(b1.ts)
>FROM tbl AS a1, tbl AS b1 
>WHERE a1.id=b1.id AND a1.ts + interval'5 minute' <= b1.ts 
>GROUP BY a1.id
> UNION ALL
> SELECT t2.id, t2.ts_p, t2.ts_c
>FROM rec AS t1 INNER JOIN 
>(SELECT a2.id, a2.ts as ts_p, min(b2.ts) AS ts_c
>FROM tbl AS a2, tbl AS b2 
>WHERE a2.id = b2.id AND a2.ts + interval'5 minute' <= b2.ts 
>GROUP BY a2.id, a2.ts
> UNION ALL
> SELECT a3.id, a3.ts, null
>FROM tbl AS a3  
> ) AS t2 ON t1.id = t2.id AND t1.ts_c=t2.ts_p
> )
> SELECT DISTINCT id, ts_p AS ts FROM rec
> ORDER BY 1,2;
> 
> 
> 
> 
> 
> 
> > I have a table that, at a minimum, has ID and timestamp columns.  Records
> > are inserted into with random IDs and timestamps.  Duplicate IDs are 
> > allowed. 
> > 
> > I want to select records grouped by ID, ordered by timestamp that are X 
> > minutes
> > apart. In this case X is 5.
> > 
> > Note, the intervals are not X minute wall clock intervals, they are X minute
> > intervals from the last accepted record, per-id.
> > 
> > For instance here is some sample input data:
> > 
> > ID  TS (HH:MM)
> > ---
> > 0   20:00
> > 1   20:03
> > 1   20:04
> > 0   20:05
> > 1   20:05
> > 0   20:08
> > 1   20:09
> > 0   20:10
> > 
> > I'd want the select to return:
> > 
> > ID  TS (HH:MM)
> > ---
> > 0   20:00
> > 0   20:05
> > 0   20:10
> > 1   20:03
> > 1   20:09
> > 
> > 
> > Does my question make sense?
> > 
> > Thanks in advance,
> > Wayne
> > 
> > -- 
> > Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org)
> > To make changes to your subscription:
> > http://www.postgresql.org/mailpref/pgsql-sql
> 
> 
> 
> 
> -- 
> Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-sql

-- 
Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql