On Sat, Jul 18, 2009 at 6:35 PM, Alan Li <a...@truviso.com> wrote:
> Yeah, are you running into the same issue as well? I tried to figure
> out a way around the O(n^2) behavior, but there was no existing direct
> relationship between the child subpath and its associated
> AppendRelInfo. I think an AppendRelInfo dynahash would work; just have
> it hash by the child_relid.
I don't see any place in my patch where I had to do anything like this. I
do have to loop through all the appendrel elements and skip over the ones
which don't belong to this appendrel, which seems weird to me, but it
isn't O(n^2).

When generating a normal Append node you generate a brand-new path for
each child and attach them to the append path. It looks like you're
allowing the append rel to generate paths and then digging around looking
at those paths. I wonder if that's the right approach.

>> The other thing is it would be nice if we could avoid making separate
>> subplans for each child and instead make one for the whole structure,
>> including the aggregate. It would at the very least make the explain
>> output prettier, but I think it would avoid repeated execution of the
>> aggregate in some cases as well.
>
> How would this plan look? I think the repeated execution of the
> aggregate comes from having to process the output of each child. So
> it's O(n) executions, where n is the number of children, assuming each
> child has the appropriate index for this optimization.

No, I just mean that instead of generating

  InitPlan 1
    Limit
      Index Scan
  InitPlan 2
    Limit
      Index Scan
  Aggregate
    Append
      Result
      Result

I think it would be better to generate this:

  InitPlan 1
    Aggregate
      Append
        Limit 1
          IndexScan
        Limit 1
          IndexScan
  Result

The reason I think this is better is that if the subquery is executed
multiple times it will just return the one precalculated result. In your
plan all the child minimums will be precalculated, but it will still have
to re-execute the aggregate and append nodes. That's not terribly
expensive, but we might as well generate the simpler, more efficient
plan.

> The other optimization that I thought of was that the optimizer might
> use the check constraints (in the range partitioning) on the column
> that I want to do a MIN/MAX on. Then it'll only have to execute the
> subplan in one child partition and the parent partition.
> But it was more of a wild idea on my part, not sure if that's possible.

Yes, I think this is the long-term goal. That's the whole "better
representation of partitions" plotline. To do this efficiently the
planner should know what the partition key is and what the partitioning
structure is.

In an ideal world we would be able to collapse the whole structure and
eliminate the append relation entirely. To do that we need some way to
indicate that the parent relation is provably empty. I had in mind to
mark it as read-only and the statistics as authoritative, since that
seems more useful than just being able to mark it empty. I think there
are a lot of other interesting things we could do with statistics if we
knew they were authoritative for a read-only table.

(The read-only property we would need here is very strong. There would
have to be a vacuum freeze, and moreover we would have to know that all
tuples were successfully frozen.)

-- 
greg
http://mit.edu/~gsstark/resume.pdf

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers