Josh Berkus wrote:
intagg: what does this module do which is not already available through the built-in array functions and operators? Maybe I don't understand what it does. Unnatributed in the README. Move to pgfoundry?
Short summary: Is there an equivalent of "int_array_enum()" built in? I use it for substantial (9X) performance improvements for doing joins similar to those described in its README. I think it can be used to do somewhat similar things with integer arrays that the SQL2003 UNNEST operator does on MULTISETs (but yeah, they're quite different too). Long and boring, but with examples: I find that it can speed up certain kinds of joins (like those described in it's readme) drastically. I have a pretty big application that has a lot of joins that use int_array_enum() to expand an array stored in one column into something that looks like a table instead of having a third join table connecting two tables. Note that this is often much faster than the array IN/ANY/SOME/NOT IN comparisons because when planning the join it can all the various join plans like hash joins; while the array operators seem to just do linear searches of the arrays. This trick is especially useful in conjunction with an aggregate based on the "_int_union" function from the intarray/ contrib module (similar to the FUSION operator for MULTISETS) when you only want distinct values for that type of join. Sample queries from an actual application showing a factor-of-9 performance improvement(7 seconds to 800ms) are shown below. -- similar to the standard FUSION operator for MULTISETS. create aggregate intarray_union_agg ( sfunc = _int_union, basetype = int[], stype = int[], initcond = '{-1}' ); explain analyze select fac_nam from userfeatures.point_features join entity_facets using (entity_id) where featureid=115 group by fac_nam; -- Total runtime: 7125.322 ms explain analyze select fac_nam from (select distinct int_array_enum(fac_ids) as fac_id from (select distinct fac_ids from entity_facids natural join point_features where featureid=115) as a) as a join facet_lookup using (fac_id); -- Total runtime: 1297.558 ms explain analyze select fac_nam from (select distinct int_array_enum(fac_ids) as fac_id from (select intarray_union_agg(fac_ids) as fac_ids from entity_facids natural join point_features where featureid=115) as a) as a join facet_lookup using (fac_id); -- Total runtime: 803.187 ms I don't have access to the system right now, so I don't have the full table definitions - but the basic problem is that there are many "facets" for each row in the "point_features" table and there are many "features" with featureid=115. The queries are trying to find the names of each facet available from that set of point_features.
intarray: data_types/
Well, the array of int's data type is built in, so I think this module is more about the functions, operators, and indexes that it provides that operate on arrays of ints. Would that make it fit better under functions/ in your new directory tree? If I had a vote, I'd think it nice if the intagg module got merged with the intarray module (wherever it ends up) because they really are quite complementary in providing useful tools for manipulating arrays of ints. ---------------------------(end of broadcast)--------------------------- TIP 8: explain analyze is your friend