I forgot to mention that I have had enable_partitionwise_aggregate=on
in the global settings since the beginning. According to the docs:
Enables or disables the query planner's use of partitionwise grouping or
aggregation, which allows grouping or aggregation on partitioned tables
to be performed separately for each partition.
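For reference, the setting can be inspected and toggled per session to compare plans; the query below is the one from earlier in this thread:

```sql
-- Check the globally-set value:
SHOW enable_partitionwise_aggregate;

-- Toggle it for the current session only, to compare plans:
SET enable_partitionwise_aggregate = off;
EXPLAIN SELECT DISTINCT workitem_n FROM test_runs_raw
        ORDER BY workitem_n DESC LIMIT 10;
RESET enable_partitionwise_aggregate;
```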
On Tue, 14 May 2024, David Rowley wrote:
If you were to put the n_distinct_inherited estimate back to 200 and
disable sort, you should see the costs are higher for the index plan.
If that's not the case then there might be a bug. It seems more
likely that due to the n_distinct estimate being so
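A minimal sketch of the experiment David suggests (column name assumed from the query earlier in the thread):

```sql
-- Put the estimate back to 200; the attribute option takes effect
-- at the next ANALYZE of the parent.
ALTER TABLE test_runs_raw
  ALTER COLUMN workitem_n SET (n_distinct_inherited = 200);
ANALYZE test_runs_raw;

-- With sorting disabled, compare the cost shown for the index plan:
SET enable_sort = off;
EXPLAIN SELECT DISTINCT workitem_n FROM test_runs_raw
        ORDER BY workitem_n DESC LIMIT 10;
RESET enable_sort;
```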
On Tue, 14 May 2024, Dimitrios Apostolou wrote:
It took long but it finished:
ANALYZE
Time: 19177398.025 ms (05:19:37.398)
I see now that default_statistics_target is globally set to 500, so this
is probably the reason it took so long. I guess with the default of 100,
it would take approximat
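If the long ANALYZE is mostly due to the statistics target, it can be confirmed, and tuned per column rather than globally (a sketch, not from the thread):

```sql
SHOW default_statistics_target;   -- 500 here; the shipped default is 100

-- The target can also be overridden for a single column, taking effect
-- at the next ANALYZE:
ALTER TABLE test_runs_raw
  ALTER COLUMN workitem_n SET STATISTICS 100;
ANALYZE test_runs_raw;
```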
On Fri, 10 May 2024, Tom Lane wrote:
Dimitrios Apostolou writes:
Further digging into this simple query, if I force the non-parallel plan
by setting max_parallel_workers_per_gather TO 0, I see that the query
planner comes up with a cost much higher:
Limit (cost=363.84..1134528847.47 rows=10 width=4)
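The experiment being described can be reproduced in a session like this:

```sql
-- Force the planner to consider only non-parallel plans:
SET max_parallel_workers_per_gather = 0;
EXPLAIN SELECT DISTINCT workitem_n FROM test_runs_raw
        ORDER BY workitem_n DESC LIMIT 10;
RESET max_parallel_workers_per_gather;
```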
On Tue, 14 May 2024 at 02:07, Dimitrios Apostolou wrote:
>
> On Tue, 14 May 2024, David Rowley wrote:
> > Parallel Append can also run in a way that the Append child nodes will
> > only get 1 worker each.
>
> How can I tell which case it is, from the EXPLAIN output (for example
> the output at [1]
On Tue, 14 May 2024 at 01:52, Dimitrios Apostolou wrote:
>
> On Tue, 14 May 2024, David Rowley wrote:
> > The query does contain an ORDER BY, so if the index is not chosen to
> > provide pre-sorted input, then something has to put the results in the
> > correct order before the LIMIT is applied.
>
On Tue, 14 May 2024, David Rowley wrote:
That assumes the Append won't ever use > 1 worker per subnode, but
that's not the case for your plan as the subnodes are "Parallel".
That means all the workers could be working on the same subnode which
could result in one group being split between 2 o
On Tue, 14 May 2024, David Rowley wrote:
On Tue, 14 May 2024 at 00:41, Dimitrios Apostolou wrote:
On Sat, 11 May 2024, David Rowley wrote:
It will. It's just that Sorting requires fetching everything from its subnode.
Isn't it plain wrong to have a sort step in the plan then? The different
On Tue, 14 May 2024 at 00:46, Dimitrios Apostolou wrote:
>
> On Mon, 13 May 2024, Dimitrios Apostolou wrote:
>
> > On Sat, 11 May 2024, David Rowley wrote:
> >> If you look at [1], it says "Tuples changed in partitions and
> >> inheritance children do not trigger analyze on the parent table."
>
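Since tuple churn in the children does not count toward the parent's autoanalyze threshold, the parent's inherited statistics have to be refreshed by hand, and the analyze timestamps can be verified (sketch):

```sql
-- Refresh the parent's inherited statistics manually:
ANALYZE test_runs_raw;

-- Check when the parent and its partitions were last analyzed:
SELECT relname, last_analyze, last_autoanalyze
FROM pg_stat_user_tables
WHERE relname LIKE 'test_runs_raw%'
ORDER BY relname;
```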
On Tue, 14 May 2024 at 00:41, Dimitrios Apostolou wrote:
>
> On Sat, 11 May 2024, David Rowley wrote:
> > It will. It's just that Sorting requires fetching everything from its
> > subnode.
>
> Isn't it plain wrong to have a sort step in the plan then? The different
> partitions contain different
On Tue, 14 May 2024 at 00:28, Dimitrios Apostolou wrote:
>
> On Sat, 11 May 2024, David Rowley wrote:
>
> > On Sat, 11 May 2024 at 13:33, Tom Lane wrote:
> >> I do kind of wonder why it's producing both a hashagg and a Unique
> >> step --- seems like it should do one or the other.
> >
> > It still needs to make the duplicate groups from parallel workers unique.
On Mon, 13 May 2024, Dimitrios Apostolou wrote:
On Sat, 11 May 2024, David Rowley wrote:
On Sat, 11 May 2024 at 13:11, Dimitrios Apostolou wrote:
Indeed that's an awful estimate, the table has more than 1M of unique
values in that column. Looking into pg_stat_user_tables, I can't see the
On Sat, 11 May 2024, David Rowley wrote:
On Sat, 11 May 2024 at 13:11, Dimitrios Apostolou wrote:
Indeed that's an awful estimate, the table has more than 1M of unique
values in that column. Looking into pg_stat_user_tables, I can't see the
partitions having been vacuum'd or analyzed at all. I
On Sat, 11 May 2024, David Rowley wrote:
On Sat, 11 May 2024 at 13:33, Tom Lane wrote:
I do kind of wonder why it's producing both a hashagg and a Unique
step --- seems like it should do one or the other.
It still needs to make the duplicate groups from parallel workers unique.
Range parti
On Sat, 11 May 2024 at 13:33, Tom Lane wrote:
> I do kind of wonder why it's producing both a hashagg and a Unique
> step --- seems like it should do one or the other.
It still needs to make the duplicate groups from parallel workers unique.
David
On Sat, 11 May 2024 at 13:11, Dimitrios Apostolou wrote:
> Indeed that's an awful estimate, the table has more than 1M of unique
> values in that column. Looking into pg_stat_user_tables, I can't see the
> partitions having been vacuum'd or analyzed at all. I think they should
> have been auto-analyzed.
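The distinct-value estimate the planner is working from can be read directly out of pg_stats (column name taken from the query earlier in the thread):

```sql
-- n_distinct > 0 is an absolute count; n_distinct < 0 is a fraction of rows.
SELECT tablename, attname, n_distinct
FROM pg_stats
WHERE tablename = 'test_runs_raw'
  AND attname = 'workitem_n';
```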
Dimitrios Apostolou writes:
> On Fri, 10 May 2024, Tom Lane wrote:
>> I'd say the blame lies with that (probably-default) estimate of
>> just 200 distinct rows. That means the planner expects to have
>> to read about 5% (10/200) of the tables to get the result, and
>> that's making fast-start pla
Dimitrios Apostolou writes:
> Further digging into this simple query, if I force the non-parallel plan
> by setting max_parallel_workers_per_gather TO 0, I see that the query
> planner comes up with a cost much higher:
> Limit (cost=363.84..1134528847.47 rows=10 width=4)
> -> Unique (cos
On Fri, 10 May 2024, Dimitrios Apostolou wrote:
On Fri, 10 May 2024, Dimitrios Apostolou wrote:
Update: even the simplest SELECT DISTINCT query shows similar behaviour:
Further digging into this simple query, if I force the non-parallel plan
by setting max_parallel_workers_per_gather TO 0,
On Fri, 10 May 2024, Dimitrios Apostolou wrote:
I noticed that the following query is very very slow (too long to wait for it
to finish):
SELECT DISTINCT workitem_n FROM test_runs_raw ORDER BY workitem_n DESC LIMIT 10;
Update: even the simplest SELECT DISTINCT query shows similar behaviour:
Hello list,
INTRO
I have a huge (multi-billion rows) table partitioned into 1000 partitions.
Around half of the partitions are full and the rest are empty, created in
advance, ready to receive future incoming data. Postgres is 16.2.
Here are the relevant parts of the schema:
\d test_runs_raw
22 matches