On Thu, Jan 11, 2024 at 3:31 PM Jeremiah Bauer <jba...@agristats.com> wrote:
> Hello all, > > We are having an issue with a materialized view refresh never finishing, > any help is appreciated. It will run at 100% CPU and no IO traffic > indefinitely after about 15 minutes of parallel workers and the parent > worker consuming CPU and IO. > > PostgreSQL Version: 15.5 > > Due to some design decisions, we have a large table that we need a > distinct list of associated id's out of for an application interface. This > table has approximately 1,650,000,000 rows with a table size of 175GB and > 250GB of indexes. > > I anonymized the column names in the table structure, but this is the > table structure of our large table. > > create table public.large_table( > large_table_id bigint NOT NULL DEFAULT > nextval(public.large_table_id_seq'::regclass), > nmrc numeric(30,5), > id1 bigint NOT NULL, > id2 bigint NOT NULL, > id3 bigint NOT NULL, > id4 bigint NOT NULL, > id5 bigint NOT NULL, > last_updt_ts timestamp without time zone, > last_db_updt_ts timestamp without time zone, > charval character varying(30) > > ) > > The materialized view is defined as: > > create materialized view public.vw_distinct_list_of_ids as ( > > select distinct id1, id2, id3 from public.large_table > > ) with no data; > > When refreshed the materialized view will contain approximately 59,000,000 > rows. > > It takes approximately 12 minutes to run the select statement from the > view definition. When you refresh the materialized view for the first > time, it will spawn several parallel workers that run at 100% CPU and > produce some I/O for about 12 or 13 minutes, then they finish their work > and terminate. The parent worker will continue to run at 100% CPU until I > terminate the process. I've let it run for over 24 hours. > > What I find interesting is that if I do this: > > create table public.table_of_distinct_list_of_ids as ( > > select distinct id1, id2, id3 from public.large_table > > ); > > It will complete in 12 or 13 minutes, the same as the select statement. I > have three questions. > My question is: what indexes are on public.large_table? Hopefully there's a compound b-tree index on id1, id2, id3. > > 1. Why is the materialized view refresh stalling and never completing? > 2. Is there a setting I am missing that affects materialized view > refreshes? > 3. Should I just go with a table that is dropped and recreated > instead? I was hoping to avoid this. > > A materialized view isn't too different from an unlogged table.