Re: stawidth inconsistency with all NULL columns
Joe Conway writes: > On 5/21/19 3:55 PM, Tom Lane wrote: >> Are you suggesting that we should set stawidth to zero even for a >> fixed-width datatype? That seems pretty silly. We know exactly what >> the value should be, and would be if we'd chanced to find even one >> non-null entry. > Well you could argue in similar fashion for variable width values -- if > we find even one of those, it will be at least 4 bytes. So why set those > to zero? Um, really the minimum width is 1 byte, given short headers. But as the code notes, zero means we don't know what a sane estimate would be, which is certainly not the case for fixed-width types. regards, tom lane
Re: stawidth inconsistency with all NULL columns
On 5/21/19 3:55 PM, Tom Lane wrote: > Joe Conway writes: >> else if (null_cnt > 0) >> { >> /* We found only nulls; assume the column is entirely null */ >> stats->stats_valid = true; >> stats->stanullfrac = 1.0; >> if (is_varwidth) >> stats->stawidth = 0;/* "unknown" */ >> else >> stats->stawidth = stats->attrtype->typlen; >> stats->stadistinct = 0.0; /* "unknown" */ >> } >> 8<- > >> So apparently intentional, but seems gratuitously inconsistent. Could >> this cause any actual inconsistent behaviors? In any case that first >> comment does not reflect the code. > > Are you suggesting that we should set stawidth to zero even for a > fixed-width datatype? That seems pretty silly. We know exactly what > the value should be, and would be if we'd chanced to find even one > non-null entry. Well you could argue in similar fashion for variable width values -- if we find even one of those, it will be at least 4 bytes. So why set those to zero? Not a big deal, but it struck me as odd when I was looking at the current state of affairs. Joe -- Crunchy Data - http://crunchydata.com PostgreSQL Support for Secure Enterprises Consulting, Training, & Open Source Development signature.asc Description: OpenPGP digital signature
Re: stawidth inconsistency with all NULL columns
Joe Conway writes: > else if (null_cnt > 0) > { > /* We found only nulls; assume the column is entirely null */ > stats->stats_valid = true; > stats->stanullfrac = 1.0; > if (is_varwidth) > stats->stawidth = 0;/* "unknown" */ > else > stats->stawidth = stats->attrtype->typlen; > stats->stadistinct = 0.0; /* "unknown" */ > } > 8<- > So apparently intentional, but seems gratuitously inconsistent. Could > this cause any actual inconsistent behaviors? In any case that first > comment does not reflect the code. Are you suggesting that we should set stawidth to zero even for a fixed-width datatype? That seems pretty silly. We know exactly what the value should be, and would be if we'd chanced to find even one non-null entry. regards, tom lane
stawidth inconsistency with all NULL columns
Consider: CREATE TABLE testwid ( txtnotnull text, txtnull text, int8notnull int8, int8null int8 ); INSERT INTO testwid SELECT 'a' || g.i, NULL, g.i, NULL FROM generate_series(1,1) AS g(i); ANALYZE testwid; SELECT attname, avg_width FROM pg_stats WHERE tablename = 'testwid'; attname | avg_width -+--- txtnotnull | 5 txtnull | 0 int8notnull | 8 int8null| 8 (4 rows) I see in analyze.c 8<- /* We can only compute average width if we found some non-null values.*/ if (nonnull_cnt > 0) [snip] else if (null_cnt > 0) { /* We found only nulls; assume the column is entirely null */ stats->stats_valid = true; stats->stanullfrac = 1.0; if (is_varwidth) stats->stawidth = 0;/* "unknown" */ else stats->stawidth = stats->attrtype->typlen; stats->stadistinct = 0.0; /* "unknown" */ } 8<- So apparently intentional, but seems gratuitously inconsistent. Could this cause any actual inconsistent behaviors? In any case that first comment does not reflect the code. Joe -- Crunchy Data - http://crunchydata.com PostgreSQL Support for Secure Enterprises Consulting, Training, & Open Source Development signature.asc Description: OpenPGP digital signature