Re: [GENERAL] how do you write aggregate function
On Mon, Mar 10, 2008 at 10:01:47AM -0500, Justin wrote: i wish that could work but need to keep track of the individual weights as its a percentage of the total amount of the weight. Unless you have a different meaning of weighted average to me, I don't think you do. AFAIK this would produce exactly the same result as what your code does: state_function(state,weight,value) state[0] = state[0]+weight state[1] = state[1]+weight*value final_function(state) return state[1]/state[0] The state[0] is the same as your _sumedWeight. Because it's a constant in the second loop you can do the division after. state[1] is your _sumedWxV times _sumedWeight. Have a nice day, -- Martijn van Oosterhout [EMAIL PROTECTED] http://svana.org/kleptog/ Please line up in a tree and maintain the heap invariant while boarding. Thank you for flying nlogn airlines. signature.asc Description: Digital signature
Re: [GENERAL] how do you write aggregate function
thanks it now takes 806 ms Martijn van Oosterhout wrote: On Mon, Mar 10, 2008 at 10:01:47AM -0500, Justin wrote: i wish that could work but need to keep track of the individual weights as its a percentage of the total amount of the weight. Unless you have a different meaning of weighted average to me, I don't think you do. AFAIK this would produce exactly the same result as what your code does: state_function(state,weight,value) state[0] = state[0]+weight state[1] = state[1]+weight*value final_function(state) return state[1]/state[0] The state[0] is the same as your _sumedWeight. Because it's a constant in the second loop you can do the division after. state[1] is your _sumedWxV times _sumedWeight. Have a nice day,
Re: [GENERAL] how do you write aggregate function
On Sun, Mar 09, 2008 at 11:40:47AM -0500, Justin wrote: I got the aggregate function for weighted average done. I finely left alone for more than 10 minutes to actual get it written. It takes 2 value input Weight and the Value. it will sum the weighted entries to create the constant then does the normal formula, but does not percentage number but averaged number. A simple change on the return line it can do percentages. If you're looking for performance, ISTM the best option would be to simply accumulate the weights and value*weight as you go and do a division at the end. That seems likely to beat any implementation involving array_append. Have a nice day, -- Martijn van Oosterhout [EMAIL PROTECTED] http://svana.org/kleptog/ Please line up in a tree and maintain the heap invariant while boarding. Thank you for flying nlogn airlines. signature.asc Description: Digital signature
Re: [GENERAL] how do you write aggregate function
i wish that could work but need to keep track of the individual weights as its a percentage of the total amount of the weight. I would never have thought the performance of the Array would suck this bad. Martijn van Oosterhout wrote: On Sun, Mar 09, 2008 at 11:40:47AM -0500, Justin wrote: I got the aggregate function for weighted average done. I finely left alone for more than 10 minutes to actual get it written. It takes 2 value input Weight and the Value. it will sum the weighted entries to create the constant then does the normal formula, but does not percentage number but averaged number. A simple change on the return line it can do percentages. If you're looking for performance, ISTM the best option would be to simply accumulate the weights and value*weight as you go and do a division at the end. That seems likely to beat any implementation involving array_append. Have a nice day,
Re: [GENERAL] how do you write aggregate function
A couple of email pointers: * Please don't use language like the Array would suck this bad in lists. I swear like a sailor in person, but it reads *very* badly in email. There are a lot of people on this list who might take offense, since it is their hard work that means that arrays work at all for you... That comment was not meant to be an insult or disparaging in any way what so ever. If it was taken as such then i'm sorry. It seems the biggest performance hit is copying of the array content from one memory variable to another which is happening allot. I'm not really against using a temp tables to hold onto values. I used to do that in Foxpro when i hit the hard limit on its array but other problems start popping up. If we use a temp table keeping track what going with other users can make life fun. I really want to figure this out how to speed it up. I have to write allot more aggregate functions to analyze RD data which will happen latter this year. right now this function will be used in calculating manufacturing cost. Webb Sprague wrote: A couple of email pointers: * Please don't use language like the Array would suck this bad in lists. I swear like a sailor in person, but it reads *very* badly in email. There are a lot of people on this list who might take offense, since it is their hard work that means that arrays work at all for you... * Please don't topquote. On Mon, Mar 10, 2008 at 8:01 AM, Justin [EMAIL PROTECTED] wrote: i wish that could work but need to keep track of the individual weights as its a percentage of the total amount of the weight. The definition of weighted mean doesn't require that the weights be normalized. You probably need to calculate this separately, either using a temp table or a column in the original table. I would never have thought the performance of the Array would suck this bad. You should be using a temp table or a new column as above to store the normalized weights, and then calling the original aggregate on those. (a) calculate sum of weights (aggregate), (b) calculate normalized weight for each row (weight / total), (c) find weighted mean with aggregate that retains only the totals between function calls. If you let your arrays grow to the size of tables, performance will suffer (as they are not meant to do that anyway). I don't think you can avoid two passes, whether inside a function or outside. Martijn van Oosterhout wrote: On Sun, Mar 09, 2008 at 11:40:47AM -0500, Justin wrote: I got the aggregate function for weighted average done. I finely left alone for more than 10 minutes to actual get it written. It takes 2 value input Weight and the Value. it will sum the weighted entries to create the constant then does the normal formula, but does not percentage number but averaged number. A simple change on the return line it can do percentages. If you're looking for performance, ISTM the best option would be to simply accumulate the weights and value*weight as you go and do a division at the end. That seems likely to beat any implementation involving array_append. Have a nice day,
Re: [GENERAL] how do you write aggregate function
On Mon, Mar 10, 2008 at 11:00 AM, Justin [EMAIL PROTECTED] wrote: That comment was not meant to be an insult or disparaging in any way what so ever. If it was taken as such then i'm sorry. I am sure it would have been fine in person, I just think over email it sounded abrasive. But could you stop topquoting please? It seems the biggest performance hit is copying of the array content from one memory variable to another which is happening allot. Yeah, I think arrays just can't handle a whole lot of data, that is all. They are tricky, and shouldn't be used for heavy lifting (more than 1k of elements feels like you are asking for trouble). I'm not really against using a temp tables to hold onto values. I used to do that in Foxpro when i hit the hard limit on its array but other problems start popping up. If we use a temp table keeping track what going with other users can make life fun. I think temp tables have scope, though you should test this, so that you can use them with impunity in functions and not worry with multiple users. I really want to figure this out how to speed it up. I have to write allot more aggregate functions to analyze RD data which will happen latter this year. right now this function will be used in calculating manufacturing cost. I think a combination of aggregate functions along with some more design would be best. For example: can you have a trigger calculate the normalized weight of a row on insert? Have triggers keep another table with summary information updated as you modify the data? Etc. There is a lot to PG that would help for this kind of thing. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] how do you write aggregate function
I got the aggregate function for weighted average done. I finely left alone for more than 10 minutes to actual get it written. It takes 2 value input Weight and the Value. it will sum the weighted entries to create the constant then does the normal formula, but does not percentage number but averaged number. A simple change on the return line it can do percentages. I did a couple of things a little odd . instead of doing a multi dimensional array i did a one dimensional array where the 1st row is Weight and the 2nd row is Value. This made the loop through the array look stupid. I tested it across 50,000 records with a group by it took 3.3 seconds to run. without the group by clause performance is terrible taking several minutes just to do the sfunc part. 371,563ms The Array seems to have performance hit any advice?It could be the way i'm appending to the Array which has a performance hit as the array gets bigger and bigger ? [EMAIL PROTECTED] wrote: Richard Broersma wrote: On Tue, Mar 4, 2008 at 8:44 AM, Justin [EMAIL PROTECTED] wrote: I searched the archive of the mail list and did not find anything Search the documentation. There are a couple great examples posted at http://www.postgresql.org/docs/8.0/interactive/sql-createaggregate.html ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster create or replace function wcost_average_sf (numeric[], numeric, numeric) returns numeric[] as $Body$ declare _state numeric[]; begin _state := $1; _state := array_append(_state, $2); _state := array_append(_state, $3); return _state; end; $Body$ LANGUAGE 'plpgsql' VOLATILE; create or replace function wcost_average_ff (numeric[] ) returns numeric as $Body$ declare iState alias for $1 ; _sumedWeight numeric ; _sumedWxVnumeric ; _elmentCount integer ; _icounterinteger ; begin _elmentCount := array_upper(iState,1) ; _sumedWeight := 0 ; _sumedWxV:= 0 ; _icounter:= 0 ; loop _sumedWeight := _sumedWeight + iState[_icounter + 1] ; _icounter := _icounter + 2 ; if ( _icounter = _elmentCount ) then exit; end if ; end loop ; _icounter := 0; loop _sumedWxV := _sumedWxV + ( (iState[_icounter + 1]/_sumedWeight) * iState[_icounter+2]) ; _icounter := _icounter + 2 ; if ( _icounter = _elmentCount ) then exit; end if ; end loop ; return _sumedWxV; end; $Body$ LANGUAGE 'plpgsql' VOLATILE; create aggregate wcost_average (numeric, numeric)( sfunc = wcost_average_sf, stype = numeric[], initcond = '{0,0}', finalfunc = wcost_average_ff ); -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] how do you write aggregate function
Array appends are usually a performance hit, as you said. I'm not sure though with PostgreSQL. Why not try it with two arrays and see what happens. At least you would reducing the single array and the eliminating the append. danap. I got the aggregate function for weighted average done. I finely left alone for more than 10 minutes to actual get it written. It takes 2 value input Weight and the Value. it will sum the weighted entries to create the constant then does the normal formula, but does not percentage number but averaged number. A simple change on the return line it can do percentages. I did a couple of things a little odd . instead of doing a multi dimensional array i did a one dimensional array where the 1st row is Weight and the 2nd row is Value. This made the loop through the array look stupid. I tested it across 50,000 records with a group by it took 3.3 seconds to run. without the group by clause performance is terrible taking several minutes just to do the sfunc part. 371,563ms The Array seems to have performance hit any advice?It could be the way i'm appending to the Array which has a performance hit as the array gets bigger and bigger ? -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] how do you write aggregate function
If i use two arrays how do i append new entries into these arrays. array_append only allows for 1 dimension. what other methods are there to add new entries to the array. dmp wrote: Array appends are usually a performance hit, as you said. I'm not sure though with PostgreSQL. Why not try it with two arrays and see what happens. At least you would reducing the single array and the eliminating the append. danap. I got the aggregate function for weighted average done. I finely left alone for more than 10 minutes to actual get it written. It takes 2 value input Weight and the Value. it will sum the weighted entries to create the constant then does the normal formula, but does not percentage number but averaged number. A simple change on the return line it can do percentages. I did a couple of things a little odd . instead of doing a multi dimensional array i did a one dimensional array where the 1st row is Weight and the 2nd row is Value. This made the loop through the array look stupid. I tested it across 50,000 records with a group by it took 3.3 seconds to run. without the group by clause performance is terrible taking several minutes just to do the sfunc part. 371,563ms The Array seems to have performance hit any advice?It could be the way i'm appending to the Array which has a performance hit as the array gets bigger and bigger ? -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
[Fwd: Re: [GENERAL] how do you write aggregate function]
DMP you did give me an idea on how to call the append array sfunc looks like this create or replace function wcost_average_sf (numeric[], numeric, numeric) returns numeric[] as $Body$ begin return array_append(array_append($1, $2), $3); end; $Body$ LANGUAGE 'plpgsql' VOLATILE; this yanked out 140,000 ms aka 2.3 minutes on the run time, a big improvement but no where, where i want it to be. are there speed improvements in the other languages TCL dmp wrote: Array appends are usually a performance hit, as you said. I'm not sure though with PostgreSQL. Why not try it with two arrays and see what happens. At least you would reducing the single array and the eliminating the append. danap. I got the aggregate function for weighted average done. I finely left alone for more than 10 minutes to actual get it written. It takes 2 value input Weight and the Value. it will sum the weighted entries to create the constant then does the normal formula, but does not percentage number but averaged number. A simple change on the return line it can do percentages. I did a couple of things a little odd . instead of doing a multi dimensional array i did a one dimensional array where the 1st row is Weight and the 2nd row is Value. This made the loop through the array look stupid. I tested it across 50,000 records with a group by it took 3.3 seconds to run. without the group by clause performance is terrible taking several minutes just to do the sfunc part. 371,563ms The Array seems to have performance hit any advice?It could be the way i'm appending to the Array which has a performance hit as the array gets bigger and bigger ? -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] how do you write aggregate function
Justin, I'm really not a expert in this area, on how to code this, or functions in PostgreSQL. All I could offer is some ideas which you might try. Sometimes this is all it takes. Perhaps someone else will respond that might be more helpful. With that said I have read in the documentation the use of other languages and if you are looking for a higher performance, that is the way I would said its going to come about perhaps. danap. DMP you did give me an idea on changing how to call the append array sfunc looks like this create or replace function wcost_average_sf (numeric[], numeric, numeric) returns numeric[] as $Body$ begin return array_append(array_append($1, $2), $3); end; $Body$ LANGUAGE 'plpgsql' VOLATILE; this yanked out 140,000 ms on the run time, a big improvement but no where i want it to be. are there speed improvements in the other languages TCL dmp wrote: Array appends are usually a performance hit, as you said. I'm not sure though with PostgreSQL. Why not try it with two arrays and see what happens. At least you would reducing the single array and the eliminating the append. danap. I got the aggregate function for weighted average done. I finely left alone for more than 10 minutes to actual get it written. It takes 2 value input Weight and the Value. it will sum the weighted entries to create the constant then does the normal formula, but does not percentage number but averaged number. A simple change on the return line it can do percentages. I did a couple of things a little odd . instead of doing a multi dimensional array i did a one dimensional array where the 1st row is Weight and the 2nd row is Value. This made the loop through the array look stupid. I tested it across 50,000 records with a group by it took 3.3 seconds to run. without the group by clause performance is terrible taking several minutes just to do the sfunc part. 371,563ms The Array seems to have performance hit any advice?It could be the way i'm appending to the Array which has a performance hit as the array gets bigger and bigger ? -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] how do you write aggregate function
Richard Broersma wrote: On Tue, Mar 4, 2008 at 8:44 AM, Justin [EMAIL PROTECTED] wrote: I searched the archive of the mail list and did not find anything Search the documentation. There are a couple great examples posted at http://www.postgresql.org/docs/8.0/interactive/sql-createaggregate.html ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [GENERAL] how do you write aggregate function
Thank you i was looking for examples and never thought to look at the user comments. I searched the help files several times but never thought to look at the user comment help online. the help coming that comes with PGAdmin III does not have the user comments. once i get the aggregate function working i will add it to the comment section or post it back here. Sense it will be a more complicated aggregate function that what is listed here with sfunc and ffunc calls [EMAIL PROTECTED] wrote: Richard Broersma wrote: On Tue, Mar 4, 2008 at 8:44 AM, Justin [EMAIL PROTECTED] wrote: I searched the archive of the mail list and did not find anything Search the documentation. There are a couple great examples posted at http://www.postgresql.org/docs/8.0/interactive/sql-createaggregate.html ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
[GENERAL] how do you write aggregate function
The help is not real clear nor can i find examples how to write an aggregate function. I searched the archive of the mail list and did not find anything I need to write Weighted Average function so the need to keep track of multiple variables between function calls is must? I see how the /|sfunc|/( internal-state, next-data-values ) keeps track of the passed values with the internal state. Can the internal state be an array type where i just adding new entries to a multi-dimensional array then on ffunc do all the math in. One person pointed out the pl/r plugin but what i'm working on gets tied into the PostBooks/OpenMfg project and they are not real big fans on adding another language and plugin they have to make sure their customers have deployed (which i'm one of those customers). PostBooks/OpenMfg will want all the functions in pl/pgsql does this present any major problems??
Re: [GENERAL] how do you write aggregate function
Justin wrote: The help is not real clear nor can i find examples how to write an aggregate function. Examples: http://www.postgresql.org/docs/8.2/static/xaggr.html I searched the archive of the mail list and did not find anything The online documentation is excellent for these sorts of things. I need to write Weighted Average function so the need to keep track of multiple variables between function calls is must? You don't necessarily need an aggregate function to do this. A weighted average takes several independent variables, weights them based on some constant (usually the difference between a static time and the time at which the data were recorded), and returns a value [0]. Maintaining state between calls is probably going to be more trouble than it's worth, especially if you're recomputing the weights all the time... which, in most cases, is what happens. I perform exponential moving average analysis of stock market and trading data, for which I have a table that contains columns like the following (these data are not intended to resemble the performance of any particular security). id | time | price - 1 | 09:30 | 89.54 2 | 09:31 | 89.58 3 | 09:32 | 89.53 4 | 09:33 | 89.5 5 | 09:34 | 89.51 6 | 09:35 | 89.5 7 | 09:36 | 89.42 8 | 09:37 | 89.44 When I compute the exponential average of these data, I'm always looking at the most recent X prices, as I loop over all the rows in which I'm interested. Which means I need to recompute the weighted values for every minute of data (in the case of this sample table, anyway). Maintaining state for that sort of calculation wouldn't be worth the overhead. I suggest writing a function (in PL/pgSQL or whatever your favorite flavor is) that performs a query to retrieve all the rows you need and outputs a SETOF data that contains the weighted averages. If you only need one average at a time, just return a single value instead of a SETOF values. I hope this helps, but in case it doesn't, you should probably give us a little more detail about what you're actually trying to do. Colin [0] Using a formula like this: http://en.wikipedia.org/wiki/Moving_average#Weighted_moving_average ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [GENERAL] how do you write aggregate function
On Tue, Mar 4, 2008 at 8:44 AM, Justin [EMAIL PROTECTED] wrote: I searched the archive of the mail list and did not find anything I don't know if you've already seen this, but this is the question that I asked a while back: http://archives.postgresql.org/pgsql-general/2007-12/msg00681.php Regards, Richard Broersma Jr.