Thanks for your comment. I see two possible solution directions:
1. Is it possible to create a C function which could accept something like
ROWSET(ARRAY[]) in its input parameters? E.g. to call it as

    SELECT array_rowset_glue((SELECT arrayfield FROM arraytable));

or something like this?

2. Is it possible to implement in C something like this?

    array_buffer_init();
    SELECT array_buffer_push(arrayfield) FROM arraytable;
    ids := array_buffer_get();
    array_buffer_free();

where array_buffer_push() is an aggregate function which returns void but, as
a side effect, appends arrayfield to a global array buffer for later access
with array_buffer_get().

On 10/10/07, Filip Rembiałkowski <[EMAIL PROTECTED]> wrote:
> 2007/10/10, Dmitry Koterov <[EMAIL PROTECTED]>:
> > Hello.
> >
> > I created an aggregate:
> >
> > CREATE AGGREGATE intarray_aggregate_push (_int4)
> > (
> >   STYPE = _int4,
> >   SFUNC = intarray_push_array,
> >   INITCOND = '{}'
> > );
> >
> > (or I may use _int_union instead of intarray_push_array; its speed is
> > practically the same in my case). This aggregate merges together a list
> > of integer[] arrays, resulting in one big array with all the elements.
> >
> > Then I want to use this aggregate:
> >
> > SELECT intarray_aggregate_push(arrayfield)
> > FROM arraytable
> >
> > The table arraytable contains a lot of rows (about 5000); each row has
> > an array of 5-10 elements, so the resulting array should contain about
> > 50000 elements.
> >
> > The query is okay, but its speed is too bad: about 1 second.
> >
> > The main problem is the speed of the intarray_aggregate_push function:
> > it is quite slow, because intarray_push_array reallocates the memory
> > each time I merge two arrays.
> > I am pretty sure that the reallocation and copying is the bottleneck,
> > because if I use another dummy aggregate:
> >
> > CREATE AGGREGATE intarray_aggregate_dummy (_int4)
> > (
> >   STYPE = _int4,
> >   SFUNC = dummy,
> >   INITCOND = '{}'
> > );
> >
> > CREATE OR REPLACE FUNCTION "public"."dummy" (a integer[], b integer[])
> > RETURNS integer[] AS
> > $body$ BEGIN RETURN a; END; $body$
> > LANGUAGE 'plpgsql' VOLATILE CALLED ON NULL INPUT SECURITY INVOKER;
> >
> > where dummy() is a function which returns its first argument without
> > any modification, the speed grows dramatically: about 25 ms (instead of
> > 1000 ms!).
> >
> > The question is: how could I optimize this, and is it possible at all
> > in Postgres? I just want to get one large array glued from a lot of
> > smaller arrays...
>
> 1. No wonder copying is the bottleneck - this is what the aggregate
> mostly does.
>
> 2. You can use plain array_cat for this; in my test it is a few percent
> faster.
>
> 3. In this case I guess the intarray contrib is not an option; AFAIK it
> was created only for speeding up searches, that is, int4[] lookups.
>
> 4. To have the kind of optimization you talk about, we would need an
> aggregate operating (in this case, appending) directly on its internal
> state. I'm not sure if this is possible in Postgres.
>
> 5. My results:
>    your method (using intarray_push_array): 940 ms
>    using array_cat: 860 ms
>    same in PL/pgSQL (LOOP, append): 800 ms
>    same thing in Perl, no database (push array of arrays into one and
>    print): 18 ms
>
> cheers, Filip
>
> --
> Filip Rembiałkowski
>
> ---------------------------(end of broadcast)---------------------------
> TIP 4: Have you searched our list archives?
>
>    http://archives.postgresql.org/