Re: [GENERAL] how do you write aggregate function

2008-03-11 Thread Martijn van Oosterhout
On Mon, Mar 10, 2008 at 10:01:47AM -0500, Justin wrote:
 i wish that could work  but need to keep track of the individual weights 
 as its a percentage of the total amount of the weight. 

Unless you have a different meaning of weighted average to me, I don't
think you do. AFAIK this would produce exactly the same result as what
your code does:

state_function(state,weight,value)
  state[0] = state[0]+weight
  state[1] = state[1]+weight*value

final_function(state)
  return state[1]/state[0]

The state[0] is the same as your _sumedWeight. Because it's a constant in
the second loop you can do the division after. state[1] is your
_sumedWxV times _sumedWeight.

Have a nice day,
-- 
Martijn van Oosterhout   [EMAIL PROTECTED]   http://svana.org/kleptog/
 Please line up in a tree and maintain the heap invariant while 
 boarding. Thank you for flying nlogn airlines.


signature.asc
Description: Digital signature


Re: [GENERAL] how do you write aggregate function

2008-03-11 Thread Justin

thanks

it now takes 806 ms

Martijn van Oosterhout wrote:

On Mon, Mar 10, 2008 at 10:01:47AM -0500, Justin wrote:
  
i wish that could work  but need to keep track of the individual weights 
as its a percentage of the total amount of the weight. 



Unless you have a different meaning of weighted average to me, I don't
think you do. AFAIK this would produce exactly the same result as what
your code does:

state_function(state,weight,value)
  state[0] = state[0]+weight
  state[1] = state[1]+weight*value

final_function(state)
  return state[1]/state[0]

The state[0] is the same as your _sumedWeight. Because it's a constant in
the second loop you can do the division after. state[1] is your
_sumedWxV times _sumedWeight.

Have a nice day,
  


Re: [GENERAL] how do you write aggregate function

2008-03-10 Thread Martijn van Oosterhout
On Sun, Mar 09, 2008 at 11:40:47AM -0500, Justin wrote:
 I got the aggregate function for weighted average done.   I finely left 
 alone for more than 10 minutes  to actual get it written.  It takes 2 
 value input  Weight and the Value.  it will sum the weighted entries to 
 create the constant then does the normal formula, but does not 
 percentage number but averaged number.  A simple change on the return 
 line it can do percentages.

If you're looking for performance, ISTM the best option would be to
simply accumulate the weights and value*weight as you go and do a
division at the end. That seems likely to beat any implementation
involving array_append.

Have a nice day,
-- 
Martijn van Oosterhout   [EMAIL PROTECTED]   http://svana.org/kleptog/
 Please line up in a tree and maintain the heap invariant while 
 boarding. Thank you for flying nlogn airlines.


signature.asc
Description: Digital signature


Re: [GENERAL] how do you write aggregate function

2008-03-10 Thread Justin
i wish that could work  but need to keep track of the individual weights 
as its a percentage of the total amount of the weight. 

I would never have thought the performance of the Array would suck this 
bad. 


Martijn van Oosterhout wrote:

On Sun, Mar 09, 2008 at 11:40:47AM -0500, Justin wrote:
  
I got the aggregate function for weighted average done.   I finely left 
alone for more than 10 minutes  to actual get it written.  It takes 2 
value input  Weight and the Value.  it will sum the weighted entries to 
create the constant then does the normal formula, but does not 
percentage number but averaged number.  A simple change on the return 
line it can do percentages.



If you're looking for performance, ISTM the best option would be to
simply accumulate the weights and value*weight as you go and do a
division at the end. That seems likely to beat any implementation
involving array_append.

Have a nice day,
  


Re: [GENERAL] how do you write aggregate function

2008-03-10 Thread Justin

A couple of email pointers:

*  Please don't use  language like the Array would suck this bad in
lists.  I swear like a sailor in person, but it reads *very* badly in
email.  There are a lot of people on this list who might take offense,
since it is their hard work that means that arrays work at all for
you...


That comment was not meant to be an insult or disparaging in any way 
what so ever.  If it was taken as such then i'm sorry.


It seems the biggest performance hit is copying of the array content 
from one memory variable to another which is happening allot.


I'm not really against using a temp tables to hold  onto values.  I used 
to do that in Foxpro when i hit the hard limit on its array but other 
problems start popping up.  If we use a temp table keeping track what 
going with other users can make life fun.


I really want to figure this out how to speed it up.  I have to write 
allot more aggregate functions to analyze RD data which will happen 
latter this year.   right now this function will be used in calculating 
manufacturing cost. 


Webb Sprague wrote:

A couple of email pointers:

*  Please don't use  language like the Array would suck this bad in
lists.  I swear like a sailor in person, but it reads *very* badly in
email.  There are a lot of people on this list who might take offense,
since it is their hard work that means that arrays work at all for
you...

* Please don't topquote.

On Mon, Mar 10, 2008 at 8:01 AM, Justin [EMAIL PROTECTED] wrote:
  

 i wish that could work  but need to keep track of the individual weights as
its a percentage of the total amount of the weight.



The definition of weighted mean doesn't require that the weights be
normalized.  You probably need to calculate this separately, either
using a temp table or a column in the original table.

  

 I would never have thought the performance of the Array would suck this
bad.



You should be using a temp table or a new column as above to store the
normalized weights, and then calling the original aggregate on those.
(a) calculate sum of weights (aggregate), (b) calculate normalized
weight for each row (weight / total), (c) find weighted mean with
aggregate that retains only the totals between function calls.  If you
let your arrays grow to the size of tables, performance will suffer
(as they are not meant to do that anyway).  I don't think you can
avoid two passes, whether inside a function or outside.

  

 Martijn van Oosterhout wrote:
 On Sun, Mar 09, 2008 at 11:40:47AM -0500, Justin wrote:


 I got the aggregate function for weighted average done. I finely left
alone for more than 10 minutes to actual get it written. It takes 2
value input Weight and the Value. it will sum the weighted entries to
create the constant then does the normal formula, but does not
percentage number but averaged number. A simple change on the return
line it can do percentages.

 If you're looking for performance, ISTM the best option would be to
simply accumulate the weights and value*weight as you go and do a
division at the end. That seems likely to beat any implementation
involving array_append.

Have a nice day,





Re: [GENERAL] how do you write aggregate function

2008-03-10 Thread Webb Sprague
On Mon, Mar 10, 2008 at 11:00 AM, Justin [EMAIL PROTECTED] wrote:

  That comment was not meant to be an insult or disparaging in any way what
 so ever.  If it was taken as such then i'm sorry.

I am sure it would have been fine in person, I just think over email
it sounded abrasive.

But could you stop topquoting please?

  It seems the biggest performance hit is copying of the array content from
 one memory variable to another which is happening allot.

Yeah, I think arrays just can't handle a whole lot of data, that is
all.  They are tricky, and shouldn't be used for heavy lifting (more
than 1k of elements feels like you are asking for trouble).

  I'm not really against using a temp tables to hold  onto values.  I used to
 do that in Foxpro when i hit the hard limit on its array but other problems
 start popping up.  If we use a temp table keeping track what going with
 other users can make life fun.

I think temp tables have scope, though you should test this, so that
you can use them with impunity in functions and not worry with
multiple users.

  I really want to figure this out how to speed it up.  I have to write allot
 more aggregate functions to analyze RD data which will happen latter this
 year.   right now this function will be used in calculating manufacturing
 cost.

I think a combination of aggregate functions along with some more
design would be best.  For example:  can you have a trigger calculate
the normalized weight of a row on insert?  Have triggers keep another
table with summary information updated as you modify the data?  Etc.
There is a lot to PG that would help for this kind of thing.

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] how do you write aggregate function

2008-03-09 Thread Justin
I got the aggregate function for weighted average done.   I finely left 
alone for more than 10 minutes  to actual get it written.  It takes 2 
value input  Weight and the Value.  it will sum the weighted entries to 
create the constant then does the normal formula, but does not 
percentage number but averaged number.  A simple change on the return 
line it can do percentages.


I did a couple of  things a little odd .  instead of doing a multi 
dimensional array i did a one dimensional array where the 1st row is 
Weight and the  2nd row is Value.   This made the loop through the array 
look stupid. 

I tested it across 50,000 records with a group by it took 3.3 seconds to 
run.


without the group by clause performance is terrible taking several 
minutes just to do the sfunc part. 371,563ms


The Array seems to have performance hit any advice?It could be the 
way i'm appending to the Array which has a performance hit as the array 
gets bigger and bigger ?



[EMAIL PROTECTED] wrote:

Richard Broersma wrote:

On Tue, Mar 4, 2008 at 8:44 AM, Justin [EMAIL PROTECTED] wrote:


I searched the archive of the mail list and did not find anything


Search the documentation. There are a couple great examples posted at

http://www.postgresql.org/docs/8.0/interactive/sql-createaggregate.html



---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster

create or replace function wcost_average_sf (numeric[],  numeric, numeric)
returns numeric[] as 
$Body$
declare 
_state numeric[];
begin 
_state := $1;
_state := array_append(_state, $2);
_state := array_append(_state, $3);
return _state;
end;
$Body$
LANGUAGE 'plpgsql' VOLATILE;

create or replace function wcost_average_ff (numeric[] )
returns numeric as 
$Body$
declare 
iState   alias for $1 ;
_sumedWeight numeric ;
_sumedWxVnumeric ;
_elmentCount integer ;
_icounterinteger ;
begin 
_elmentCount := array_upper(iState,1) ;
_sumedWeight := 0 ;
_sumedWxV:= 0 ;
_icounter:= 0 ;

loop
_sumedWeight := _sumedWeight + iState[_icounter + 1] ;
_icounter := _icounter + 2 ;

if ( _icounter = _elmentCount ) then
exit; 
end if ;
end loop ; 

_icounter := 0;
loop 
_sumedWxV := _sumedWxV + ( (iState[_icounter + 
1]/_sumedWeight) * iState[_icounter+2]) ;
_icounter := _icounter + 2 ;

if ( _icounter = _elmentCount ) then
exit; 
end if ;
end loop ;
return _sumedWxV;
end;
$Body$
LANGUAGE 'plpgsql' VOLATILE;


create aggregate wcost_average (numeric, numeric)(
sfunc = wcost_average_sf, 
stype = numeric[],
initcond = '{0,0}',
finalfunc = wcost_average_ff
);
-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] how do you write aggregate function

2008-03-09 Thread dmp
Array appends are usually a performance hit, as you said. I'm not sure 
though with
PostgreSQL. Why not try it with two arrays and see what happens. At 
least you would

reducing the single array and the eliminating the append.

danap.

I got the aggregate function for weighted average done.   I finely 
left alone for more than 10 minutes  to actual get it written.  It 
takes 2 value input  Weight and the Value.  it will sum the weighted 
entries to create the constant then does the normal formula, but does 
not percentage number but averaged number.  A simple change on the 
return line it can do percentages.


I did a couple of  things a little odd .  instead of doing a multi 
dimensional array i did a one dimensional array where the 1st row is 
Weight and the  2nd row is Value.   This made the loop through the 
array look stupid.
I tested it across 50,000 records with a group by it took 3.3 seconds 
to run.


without the group by clause performance is terrible taking several 
minutes just to do the sfunc part. 371,563ms


The Array seems to have performance hit any advice?It could be the 
way i'm appending to the Array which has a performance hit as the 
array gets bigger and bigger ?



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] how do you write aggregate function

2008-03-09 Thread Justin
If i use two arrays how do i append new entries into these arrays.  
array_append only allows for 1 dimension.


what other methods are there to add new entries to the array. 


dmp wrote:
Array appends are usually a performance hit, as you said. I'm not sure 
though with
PostgreSQL. Why not try it with two arrays and see what happens. At 
least you would

reducing the single array and the eliminating the append.

danap.

I got the aggregate function for weighted average done.   I finely 
left alone for more than 10 minutes  to actual get it written.  It 
takes 2 value input  Weight and the Value.  it will sum the weighted 
entries to create the constant then does the normal formula, but does 
not percentage number but averaged number.  A simple change on the 
return line it can do percentages.


I did a couple of  things a little odd .  instead of doing a multi 
dimensional array i did a one dimensional array where the 1st row is 
Weight and the  2nd row is Value.   This made the loop through the 
array look stupid.
I tested it across 50,000 records with a group by it took 3.3 seconds 
to run.


without the group by clause performance is terrible taking several 
minutes just to do the sfunc part. 371,563ms


The Array seems to have performance hit any advice?It could be 
the way i'm appending to the Array which has a performance hit as the 
array gets bigger and bigger ?





--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[Fwd: Re: [GENERAL] how do you write aggregate function]

2008-03-09 Thread Justin


DMP   you did give me an idea on how to call the append array 


sfunc looks like this

create or replace function wcost_average_sf (numeric[],  numeric, numeric)
returns numeric[] as
$Body$
  begin
  return array_append(array_append($1, $2), $3);
  end;
$Body$
LANGUAGE 'plpgsql' VOLATILE;

this yanked out 140,000 ms aka 2.3 minutes  on the run time,  a big improvement but no 
where, where i want it to be.


are there speed improvements in the other languages TCL

dmp wrote:
Array appends are usually a performance hit, as you said. I'm not sure 
though with
PostgreSQL. Why not try it with two arrays and see what happens. At 
least you would

reducing the single array and the eliminating the append.

danap.

I got the aggregate function for weighted average done.   I finely 
left alone for more than 10 minutes  to actual get it written.  It 
takes 2 value input  Weight and the Value.  it will sum the weighted 
entries to create the constant then does the normal formula, but does 
not percentage number but averaged number.  A simple change on the 
return line it can do percentages.


I did a couple of  things a little odd .  instead of doing a multi 
dimensional array i did a one dimensional array where the 1st row is 
Weight and the  2nd row is Value.   This made the loop through the 
array look stupid.
I tested it across 50,000 records with a group by it took 3.3 seconds 
to run.


without the group by clause performance is terrible taking several 
minutes just to do the sfunc part. 371,563ms


The Array seems to have performance hit any advice?It could be 
the way i'm appending to the Array which has a performance hit as the 
array gets bigger and bigger ?







--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] how do you write aggregate function

2008-03-09 Thread dmp

Justin,
I'm really not a expert in this area, on how to code this, or functions in
PostgreSQL. All I could offer is some ideas which you might try. Sometimes
this is all it takes. Perhaps someone else will respond that might be more
helpful. With that said I have read in the documentation the use of other
languages and if you are looking for a higher performance, that is the
way I would said its going to come about perhaps.

danap.


DMP   you did give me an idea on changing how to call the append array
sfunc looks like this

create or replace function wcost_average_sf (numeric[],  numeric, 
numeric)

returns numeric[] as
$Body$
   begin
   return array_append(array_append($1, $2), $3);
   end;
$Body$
LANGUAGE 'plpgsql' VOLATILE;

this yanked out 140,000 ms   on the run time,  a big improvement but 
no where i want it to be.


are there speed improvements in the other languages TCL

dmp wrote:

Array appends are usually a performance hit, as you said. I'm not 
sure though with
PostgreSQL. Why not try it with two arrays and see what happens. At 
least you would

reducing the single array and the eliminating the append.

danap.

I got the aggregate function for weighted average done.   I finely 
left alone for more than 10 minutes  to actual get it written.  It 
takes 2 value input  Weight and the Value.  it will sum the weighted 
entries to create the constant then does the normal formula, but 
does not percentage number but averaged number.  A simple change on 
the return line it can do percentages.


I did a couple of  things a little odd .  instead of doing a multi 
dimensional array i did a one dimensional array where the 1st row is 
Weight and the  2nd row is Value.   This made the loop through the 
array look stupid.
I tested it across 50,000 records with a group by it took 3.3 
seconds to run.


without the group by clause performance is terrible taking several 
minutes just to do the sfunc part. 371,563ms


The Array seems to have performance hit any advice?It could be 
the way i'm appending to the Array which has a performance hit as 
the array gets bigger and bigger ?






--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] how do you write aggregate function

2008-03-05 Thread btober

Richard Broersma wrote:

On Tue, Mar 4, 2008 at 8:44 AM, Justin [EMAIL PROTECTED] wrote:


I searched the archive of the mail list and did not find anything


Search the documentation. There are a couple great examples posted at

http://www.postgresql.org/docs/8.0/interactive/sql-createaggregate.html



---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [GENERAL] how do you write aggregate function

2008-03-05 Thread Justin
Thank you i was looking for examples and never thought to look at the 
user comments.  I searched the help files several times but never 
thought to look at the user comment help online. 



the help coming that comes with PGAdmin III does not have the user 
comments.  once i get the aggregate function working i will add it to 
the comment section or post it back here.  Sense it will be a more 
complicated aggregate function that what is listed here with sfunc and 
ffunc calls


[EMAIL PROTECTED] wrote:

Richard Broersma wrote:

On Tue, Mar 4, 2008 at 8:44 AM, Justin [EMAIL PROTECTED] wrote:


I searched the archive of the mail list and did not find anything


Search the documentation. There are a couple great examples posted at

http://www.postgresql.org/docs/8.0/interactive/sql-createaggregate.html




---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


[GENERAL] how do you write aggregate function

2008-03-04 Thread Justin
The help is not real clear nor can i find examples how to write an 
aggregate function. 


I searched the archive of the mail list and did not find anything

I need to write Weighted Average function so the need to keep track of 
multiple variables between function calls is must?


I see how the /|sfunc|/( internal-state, next-data-values )   keeps 
track of the passed values with the internal state.  Can the internal 
state be an array type where i  just adding new entries to a 
multi-dimensional array then on ffunc do all the math in.


One person pointed out the pl/r plugin but what i'm working on gets tied 
into the PostBooks/OpenMfg project and they are not real big fans on 
adding another language and plugin they have to make sure their 
customers have deployed (which i'm one of those customers).


PostBooks/OpenMfg will want all the functions in pl/pgsql  does this 
present any major problems??


Re: [GENERAL] how do you write aggregate function

2008-03-04 Thread Colin Wetherbee

Justin wrote:
The help is not real clear nor can i find examples how to write an 
aggregate function. 


Examples:

http://www.postgresql.org/docs/8.2/static/xaggr.html


I searched the archive of the mail list and did not find anything


The online documentation is excellent for these sorts of things.

I need to write Weighted Average function so the need to keep track of 
multiple variables between function calls is must?


You don't necessarily need an aggregate function to do this.

A weighted average takes several independent variables, weights them 
based on some constant (usually the difference between a static time and 
the time at which the data were recorded), and returns a value [0]. 
Maintaining state between calls is probably going to be more trouble 
than it's worth, especially if you're recomputing the weights all the 
time... which, in most cases, is what happens.


I perform exponential moving average analysis of stock market and 
trading data, for which I have a table that contains columns like the 
following (these data are not intended to resemble the performance of 
any particular security).


id | time  | price
-
 1 | 09:30 | 89.54
 2 | 09:31 | 89.58
 3 | 09:32 | 89.53
 4 | 09:33 | 89.5
 5 | 09:34 | 89.51
 6 | 09:35 | 89.5
 7 | 09:36 | 89.42
 8 | 09:37 | 89.44

When I compute the exponential average of these data, I'm always looking 
at the most recent X prices, as I loop over all the rows in which I'm 
interested.  Which means I need to recompute the weighted values for 
every minute of data (in the case of this sample table, anyway). 
Maintaining state for that sort of calculation wouldn't be worth the 
overhead.


I suggest writing a function (in PL/pgSQL or whatever your favorite 
flavor is) that performs a query to retrieve all the rows you need and 
outputs a SETOF data that contains the weighted averages.  If you only 
need one average at a time, just return a single value instead of a 
SETOF values.


I hope this helps, but in case it doesn't, you should probably give us a 
little more detail about what you're actually trying to do.


Colin

[0] Using a formula like this: 
http://en.wikipedia.org/wiki/Moving_average#Weighted_moving_average


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [GENERAL] how do you write aggregate function

2008-03-04 Thread Richard Broersma
On Tue, Mar 4, 2008 at 8:44 AM, Justin [EMAIL PROTECTED] wrote:


 I searched the archive of the mail list and did not find anything


I don't know if you've already seen this, but this is the question that I
asked a while back:
http://archives.postgresql.org/pgsql-general/2007-12/msg00681.php

Regards,
Richard Broersma Jr.