On Tue, Mar 26, 2019 at 01:37:33PM +0000, Dean Rasheed wrote:
> On Tue, 26 Mar 2019 at 11:59, Dean Rasheed <dean.a.rash...@gmail.com> wrote:
>
>> On Mon, 25 Mar 2019 at 23:36, Tomas Vondra <tomas.von...@2ndquadrant.com> wrote:
>>>
>>> Attached is an updated patch...
>>
>> I just looked through the latest set of changes and I have a couple of
>> additional review comments:


> I just spotted another issue while reading the code:
>
> It's possible to build an MCV list with more than
> STATS_MCVLIST_MAX_ITEMS = 8192 items, which then causes an error when
> the code tries to read it back in:
>
> create temp table foo(a int, b int);
> insert into foo select x,x from generate_series(1,10000) g(x);
> insert into foo select x,x from generate_series(1,10000) g(x);
> alter table foo alter column a set statistics 10000;
> alter table foo alter column b set statistics 10000;
> create statistics s (mcv) on a,b from foo;
> analyse foo;
> select * from foo where a=1 and b=1;
>
> ERROR:  invalid length (10000) item array in MCVList
>
> So this needs to be checked when building the MCV list.
>
> In fact, the stats targets for table columns can be as large as 10000
> (a hard-coded constant in tablecmds.c, which is pretty ugly, but
> that's a different matter), so I think STATS_MCVLIST_MAX_ITEMS
> probably ought to match that.
>
> There are also a couple of comments that refer to the 8k limit, which
> would need updating, if you change it.


I think we can simply ditch this separate limit, and rely on the
statistics target. The one issue with it is that if we ever allow the
statistics target to be raised, we may end up overflowing uint16 (which
is used in the serialized representation). But I think it's OK to just
check that with an assert, or so.
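Something along these lines, perhaps (a sketch with hypothetical names,
not the actual patch):

```c
/*
 * Sketch only: the serialized MCV list stores the item count in a
 * uint16 field, so assert that the statistics target can never push the
 * count past UINT16_MAX. MAX_STATISTICS_TARGET mirrors the hard-coded
 * constant in tablecmds.c; the function name is made up.
 */
#include <assert.h>
#include <stdint.h>

#define MAX_STATISTICS_TARGET 10000     /* hard-coded in tablecmds.c */

static uint16_t
serialize_mcv_nitems(int nitems)
{
    /*
     * The statistics target caps nitems well below UINT16_MAX today;
     * these asserts make a future increase of the target fail loudly
     * rather than silently truncate the serialized count.
     */
    assert(nitems >= 0 && nitems <= MAX_STATISTICS_TARGET);
    assert(MAX_STATISTICS_TARGET <= UINT16_MAX);

    return (uint16_t) nitems;
}
```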

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

