On Mar 23, 2009, at 5:44 AM, Brad Murray wrote:

My current procedure...
1) Create a temporary table with each possible data point. This example uses recursive queries from PostgreSQL 8.4, but it was originally implemented with large numbers of queries from PHP. My knowledge of recursive queries
is pretty weak, but I was able to put this together without too much
trouble.

create temp table timerange as with recursive f as (
    select '2009-03-21 18:20:00'::timestamp as a
    union all
    select a + '30 seconds'::interval as a from f
    where a < '2009-03-21 20:20:00'::timestamp
) select a from f;

I think you can do that more easily with the generate_series function; there's no need for recursion that way. It's probably also convenient to have the end of each interval available. It would be something like:

SELECT TIMESTAMP WITH TIME ZONE 'epoch' + f.a * INTERVAL '1 second' AS start,
       TIMESTAMP WITH TIME ZONE 'epoch' + f.a * INTERVAL '1 second'
         + INTERVAL '30 seconds' AS "end"
FROM generate_series(
        EXTRACT(EPOCH FROM '2009-03-21 18:20:00'::timestamp)::bigint,
        EXTRACT(EPOCH FROM '2009-03-21 20:20:00'::timestamp)::bigint,
        30
) AS f(a);

(Note that "end" is a reserved word, hence the quotes.)
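A sketch of an alternative: generate_series also has a variant that takes timestamps and an interval directly (it's there in 8.4), which avoids the round trip through epoch seconds entirely:

```sql
-- Same interval table, using the timestamp form of generate_series.
SELECT g.a AS start,
       g.a + INTERVAL '30 seconds' AS "end"
FROM generate_series('2009-03-21 18:20:00'::timestamp,
                     '2009-03-21 20:20:00'::timestamp,
                     '30 seconds'::interval) AS g(a);
```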

I get the impression you don't use this just once, so it may be better to keep the results in a permanent table (maybe with some added columns holding derived values that are easy to join on) instead of creating a temp table each time. You could also add your mycount column here with some initialisation value (likely 0).

I used something similar to generate a table containing the start and end dates of weeks, based on week numbers and years. We had another table containing periodic information; by left joining the two it was easy to split the period table into one record per week, with either the periodic information or NULL values (meaning no data for that week). I realise weeks per year aren't much data, but neither are your periods, I think (although there are more of them). A scheduled script that deletes everything older than, say, a month would keep this table quite manageable (~90k records).
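A sketch of that idea, reusing the timerange name from above (the other names are only suggestions):

```sql
-- Keep the intervals in a permanent table instead of a temp one,
-- with mycount initialised to 0.
CREATE TABLE timerange AS
SELECT g.a AS start,
       g.a + INTERVAL '30 seconds' AS "end",
       0 AS mycount
FROM generate_series('2009-03-21 18:20:00'::timestamp,
                     '2009-03-21 20:20:00'::timestamp,
                     '30 seconds'::interval) AS g(a);

-- Scheduled cleanup: drop intervals older than a month.
DELETE FROM timerange WHERE "end" < now() - INTERVAL '1 month';
```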

2) Update table with record counts
alter table timerange add column mycount integer;
explain analyze update timerange
    set mycount = (select count(*) from streamlogfoo
                   where client = 3041 and a between startts(ts,viewtime) and ts);

With the above you could join streamlogfoo and group by timerange.start, like so:

SELECT timerange.start, COUNT(streamlog.ts) AS mycount
FROM timerange
LEFT JOIN streamlog ON streamlog.ts BETWEEN timerange.start AND timerange."end"
GROUP BY timerange.start;

(Use COUNT(streamlog.ts) rather than COUNT(*) here: with a left join, COUNT(*) would report 1 for intervals with no matching rows, while counting the joined column reports 0.)

-----------------

This seems to work reasonably well, with the following exceptions...

1) The number reported is the count at the set time point, not the highest value between data points. With a 30 second interval this isn't a big problem, but with larger intervals it gives results that do not match what I'm
looking for (maximum users).
2) This does not scale well to large numbers of points, as internally each data point is a complete scan through the data, even though most of the data
points are common to the entire range.

I'm thinking this would be a good use for the new window functions, but I'm
not sure where to begin.  Any ideas?


Well, you'd need something to partition over, and since you don't know in advance where your intervals start and end, I don't see how you could do that without at least first generating your intervals. After that there doesn't seem to be much use for the window functions, as a simple GROUP BY seems to do what you want.
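Putting the pieces together, the whole thing could look something like this. A sketch only: the streamlog, ts, and client names are taken from your earlier snippets and may need adjusting:

```sql
-- Generate the intervals and count matching rows per interval in one query.
-- Half-open intervals (>= start, < start + step) avoid counting a row that
-- falls exactly on a boundary in two adjacent intervals.
SELECT t.start, COUNT(s.ts) AS mycount
FROM generate_series('2009-03-21 18:20:00'::timestamp,
                     '2009-03-21 20:20:00'::timestamp,
                     '30 seconds'::interval) AS t(start)
LEFT JOIN streamlog s
       ON s.client = 3041
      AND s.ts >= t.start
      AND s.ts <  t.start + INTERVAL '30 seconds'
GROUP BY t.start
ORDER BY t.start;
```

Again, COUNT(s.ts) rather than COUNT(*), so that empty intervals come out as 0.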

Alban Hertroys

--
If you can't see the forest for the trees,
cut the trees and you'll see there is no forest.





--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
