On Thu, Jan 11, 2024 at 3:31 PM Jeremiah Bauer <jba...@agristats.com> wrote:

> Hello all,
>
> We are having an issue with a materialized view refresh never finishing,
> any help is appreciated.  It will run at 100% CPU and no IO traffic
> indefinitely after about 15 minutes of parallel workers and the parent
> worker consuming CPU and IO.
>
> PostgreSQL Version: 15.5
>
> Due to some design decisions, we have a large table that we need a
> distinct list of associated id's out of for an application interface.  This
> table has approximately 1,650,000,000 rows with a table size of 175GB and
> 250GB of indexes.
>
> I anonymized the column names in the table structure, but this is the
> table structure of our large table.
>
> create table public.large_table(
>    large_table_id bigint NOT NULL DEFAULT
> nextval(public.large_table_id_seq'::regclass),
>     nmrc numeric(30,5),
>     id1 bigint NOT NULL,
>     id2 bigint NOT NULL,
>     id3 bigint NOT NULL,
>     id4 bigint NOT NULL,
>     id5 bigint NOT NULL,
>     last_updt_ts timestamp without time zone,
>     last_db_updt_ts timestamp without time zone,
>     charval character varying(30)
>
> )
>
> The materialized view is defined as:
>
> create materialized view public.vw_distinct_list_of_ids as (
>
> select distinct id1, id2, id3 from public.large_table
>
> ) with no data;
>
> When refreshed the materialized view will contain approximately 59,000,000
> rows.
>
> It takes approximately 12 minutes to run the select statement from the
> view definition.  When you refresh the materialized view for the first
> time, it will spawn several parallel workers that run at 100% CPU and
> produce some I/O for about 12 or 13 minutes, then they finish their work
> and terminate.  The parent worker will continue to run at 100% CPU until I
> terminate the process.  I've let it run for over 24 hours.
>
> What I find interesting is that if I do this:
>
> create table public.table_of_distinct_list_of_ids as (
>
> select distinct id1, id2, id3 from public.large_table
>
> );
>
> It will complete in 12 or 13 minutes, the same as the select statement.  I
> have three questions.
>

My question is: what indexes are on public.large_table?  Hopefully there's
a compound b-tree index on id1, id2, id3.

>
>    1. Why is the materialized view refresh stalling and never completing?
>    2. Is there a setting I am missing that affects materialized view
>    refreshes?
>    3. Should I just go with a table that is dropped and recreated
>    instead?  I was hoping to avoid this.
>
> A materialized view isn't too different from an unlogged table.

Reply via email to