Re: [GENERAL] aggregate and order by

2007-12-07 Thread Sam Mason
On Thu, Dec 06, 2007 at 02:12:48PM -0600, Matthew Dennis wrote:
> I want to create an aggregate that will give the average velocity (sum of
> distance traveled / sum of elapsed time) from position and timestamps.

How do you want to handle noisy data?  If you want to handle it in any
reasonable way you'd need to some sort of regression; i.e. you'd need
to consider all the relavant data and then try and minimise the total
error somehow.  I've always relied on external tools to do this sort of
thing, but if you want to do it in the database you may be able to get
somewhere with pl/r.


  Sam

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [GENERAL] aggregate and order by

2007-12-06 Thread Gregory Stark
"Matthew Dennis" <[EMAIL PROTECTED]> writes:

> So, my question is if I can have PostgreSQL honor order by clauses such as:
>
> select trip_id, avg_vel(position, pos_time) 
>   from (select position, pos_time, trip_id from data order by pos_time) 
> sorted_data
>
> Would this in fact guarantee that the rows are passed into the aggregate in
> the order specified?

Yes. AFAIK this isn't covered by the spec but it works in Postgres and we know
there are people depending on it so we wouldn't break it without a big notice
and presumably some replacement.

> Other suggestions/discussions/questions/etc are welcome.

Good luck, this looks pretty painful to get right. Keep in mind you have to
keep all your state in the state data. If you keep a temporary variable
outside that data then your function won't work if it's called twice in the
same query like "select avg_vel(position, pos_time), avg_vel(position2,
pos_time2) from ..."

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's 24x7 Postgres support!

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


[GENERAL] aggregate and order by

2007-12-06 Thread Matthew Dennis
I want to create an aggregate that will give the average velocity (sum of
distance traveled / sum of elapsed time) from position and timestamps.

example:

create table data(position integer, pos_time timestamp, trip_id integer);

insert into data values(1, "time x", 1);
insert into data values(2, "time x plus 1 second", 1);
insert into data values(4, "time x plus 2 second", 1);
insert into data values(1, "time y", 2);
insert into data values(4, "time y plus 1 second", 2);
insert into data values(16, "time y plus 3 second", 2);
etc, etc, etc...

select trip_id, avg_vel(position, pos_time) from data group by trip_id;

Row one to row two has an elapsed time of ("time x plus 1 second" - "time
x") and a difference in position of (2 - 1) and from row two to row three
there is a difference in position of (4 -2) and a elapsed time of ("time x
plus 1 second" - "time x plus 2 seconds") so for trip_id we get ((2-1) +
(4-2)) / (1 + 1).

Row 4 to row 5 has a difference in position of (4-1) and a elapsed time of
("time y plus 1 second" - "time y") and from row 5 to row 6 there is a
position difference of (16-4) and time difference of ("time y plus 3
seconds" - "time y plus 1 second") so for trip_id 2 we get ((4-1) + (16-4))
/ (1 + 2).

Keep in mind that I can't just take the difference between the start and end
of the trip because I might move from 1 to 10 to 1.  If I just took the end
points (1-1) the velocity would be zero because it looks like I didn't move.

So I could write an aggregate that remembers the last row and on each new
row, does the diff and keeps the running sums and then when it's done, the
final function does the division and returns the average velocity.  However,
this only works if the rows come into the aggregate function in the correct
order (otherwise I might count the total distance and/or elapsed time wrong
because both are calculated from the difference of the previous row).  So,
my question is if I can have PostgreSQL honor order by clauses such as:

select trip_id, avg_vel(position, pos_time) from (select position, pos_time,
trip_id from data order by pos_time) sorted_data

Would this in fact guarantee that the rows are passed into the aggregate in
the order specified?

Other suggestions/discussions/questions/etc are welcome.