On Fri, Jan 04, 2019 at 08:58:57AM +0000, Abadie Lana wrote: > SELECT (SELECT sum(x) FROM unnest(most_common_freqs) x) frac_MCV, tablename, > attname, null_frac, n_distinct, array_length(most_common_vals,1) n_mcv, > array_length(histogram_bounds,1) n_hist FROM pg_stats WHERE attname='...' AND > tablename='...' ORDER BY 1 DESC; > > Hmm. Is it normal that the couple (tablename,attname ) is not unique? I'm > surprised to see sample_{ctrl,util,buil} quoted twice
One of the rows is for "inherited stats" (including child tables) stats and one is "noninherited stats". The unique index on the table behind that view is: "pg_statistic_relid_att_inh_index" UNIQUE, btree (starelid, staattnum, stainherit) On the wiki, I added inherited and correlation columns. Would you rerun that query ? https://wiki.postgresql.org/wiki/Slow_Query_Questions#Statistics:_n_distinct.2C_MCV.2C_histogram I'm also interested to see \d and channel_id statistics for the channel table. > explain (analyze, buffers) select > 'BUIL-B36-VA-RT-RT1:CL0001-2-ABW',c.smpl_time,c.nanosecs,c.float_val,c.num_val,c.str_val,c.datatype,c.array_val > from sample c WHERE c.channel_id = (SELECT channel_id FROM channel WHERE > name='BUIL-B36-VA-RT-RT1:CL0001-2-ABW') order by c.smpl_time desc limit 5; You originally wrote this as a implicit comma join. Does the original query still have an issue ? The =(subselect query) doesn't allow the planner to optimize for the given channel, which seems to be a fundamental problem. On Fri, Jan 04, 2019 at 08:58:57AM +0000, Abadie Lana wrote: > Based on your feedback...i rerun analyse directly on the two table > sample_ctrl_year and sample_buil_year > [...] Now when running the query again, only for sample_buil_year table the > wrong index is picked up... It looks like statistics on your tables were completely wrong; not just sample_ctrl_year and sample_buil_year. Right ? Autoanalyze would normally handle this on nonempty tables (children or otherwise) and you should manually run ANALZYE on the parents (both levels of them) whenever statistics change, like after running a big DELETE or DROP or after a significant interval of time has passed relative to the range of time in the table's timestamp columns. Do you know why autoanalze didn't handle the nonempty tables on its own ? > Now, the channel name I gave has no entries in sample_buil_year...(and when I > run the query directly against sample_buil_year the right index is picked > up).... So maybe something related with the partitioning? > -> Index Scan Backward using smpl_time_bx2_idx on > sample_buil_year c_5 (cost=0.56..2023054.76 rows=665761 width=75) (actual > time=13216.589..13216.589 rows=0 loops=1) > Filter: (channel_id = $0) > Rows Removed by Filter: 50597834 > Buffers: shared hit=26626368 So it scanned the entire index expecting to find 5 matching channel IDs "pretty soon", based on the generic distribution of channel IDs, without the benefit of knowing that this channel ID doesn't exist at all (due to =(subquery)). 26e6 buffers is 200GB, apparently accessing some pages many times (even if cached). table_name | index_name | table_size | index_size sample_buil_year | smpl_time_bx2_idx | 4492 MB | 1084 MB General comments: On Wed, Jan 02, 2019 at 04:28:41PM +0000, Abadie Lana wrote: > "sample_time_bm_idx" btree (channel_id, smpl_time) > "sample_time_mb1_idx" btree (smpl_time, channel_id) > "smpl_time_bx1_idx" btree (smpl_time) The smpl_time index is loosely redundant with index on (smpl_time,channel_id). You might consider dropping it, or otherwise dropping the smpl_time,channel_id index and making two separate indices on smpl_time and channel. That would allow bitmap ANDing them together. Or possibly (depending on detail of your data loading) leaving the composite index and changing smpl_time to a BRIN index - it's nice to be able to CLUSTER on the btree index to maximize the efficiency of the brin index. >Check constraints: > "sample_buil_month_smpl_time_check" CHECK (smpl_time >= (now() - '32 > days'::interval)::timestamp without time zone AND smpl_time <= now()) I'm surprised that works, and not really sure what it's doing..but in any case it's maybe not doing what you wanted(??). I'm guessing you never get constraint exclusion (which is irrelevant for this query but still). Justin