Re: Merging statistics from children instead of re-sampling everything

Andrey V. Lepikhov Mon, 14 Feb 2022 02:22:50 -0800

On 2/11/22 20:12, Tomas Vondra wrote:

On 2/11/22 05:29, Andrey V. Lepikhov wrote:
On 2/11/22 03:37, Tomas Vondra wrote:
That being said, this thread was not really about foreign partitions,
but about re-analyzing inheritance trees in general. And sampling
foreign partitions doesn't really solve that - we'll still do the
sampling over and over.
IMO, to solve the problem we should do two things:
1. Avoid repeated partition scans in the case of an inheritance tree.
2. Avoid re-analyzing everything when only a small subset of partitions is actively changing.
For (1) I can imagine a solution like multiplexing: at the stage of defining which relations to scan, group them and prepare the scan parameters so that multiple samples are taken in one shot.
I'm not sure I understand what you mean by multiplexing. The term usually means "sending multiple signals at once" but I'm not sure how that applies to this issue. Can you elaborate?

I propose making a set of samples in one scan: one sample for the plain table, another for its parent, and so on up the inheritance tree, and caching these samples in memory. We can calculate all the parameters of the reservoir method needed to do this.
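
To illustrate the idea, here is a minimal sketch in Python (for brevity, not PostgreSQL code). It assumes plain Algorithm R reservoir sampling and an in-memory cache; the names Reservoir, sample_tree and parent_of are hypothetical and chosen only for this example. Each leaf partition is scanned once, and every row is offered both to the leaf's reservoir and to the reservoirs of all its ancestors.

```python
import random


class Reservoir:
    """Fixed-size uniform sample of a row stream (Algorithm R)."""

    def __init__(self, size, rng):
        self.size = size
        self.rng = rng
        self.seen = 0      # rows offered so far
        self.rows = []     # current sample

    def offer(self, row):
        self.seen += 1
        if len(self.rows) < self.size:
            # Reservoir not full yet: keep every row.
            self.rows.append(row)
        else:
            # Keep the new row with probability size / seen.
            j = self.rng.randrange(self.seen)
            if j < self.size:
                self.rows[j] = row


def sample_tree(partitions, parent_of, size, seed=0):
    """Scan each leaf once and fill samples for the leaf and its ancestors.

    partitions: dict mapping leaf name -> iterable of rows (hypothetical input)
    parent_of:  dict mapping child name -> parent name (hypothetical input)
    Returns a dict mapping relation name -> Reservoir (the in-memory cache).
    """
    rng = random.Random(seed)
    reservoirs = {}
    for leaf, rows in partitions.items():
        # Walk up the inheritance tree to collect every relation
        # whose sample this leaf contributes to.
        chain = [leaf]
        while chain[-1] in parent_of:
            chain.append(parent_of[chain[-1]])
        targets = [reservoirs.setdefault(name, Reservoir(size, rng))
                   for name in chain]
        for row in rows:          # the single scan of this leaf
            for reservoir in targets:
                reservoir.offer(row)
    return reservoirs


if __name__ == "__main__":
    parts = {"part_1": range(1000), "part_2": range(1000, 3000)}
    tree = {"part_1": "parent", "part_2": "parent"}
    cache = sample_tree(parts, tree, size=100)
    for name, reservoir in cache.items():
        print(name, reservoir.seen, len(reservoir.rows))
```

Because the parent's reservoir sees the concatenated stream of all leaves, its sample is uniform over the whole tree without a second pass over any partition.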

sample might be used for estimation of clauses directly.

You mean, to use them in difficult cases, such as estimation of grouping over APPEND?

But it requires storing the sample somewhere, and I haven't found a good and simple way to do that. We could serialize that into bytea, or we could create a new fork, or something, but what should that do with oversized attributes (how would TOAST work for a fork) and/or large samples (which might not fit into 1GB bytea)?

This feature looks like meta-information about a database. It can be stored in a separate relation. It is not obvious that we need to use it for every relation, for example, for those with large samples. I think it can be controlled by a table parameter.


--
regards,
Andrey Lepikhov
Postgres Professional
