Daniel,
  You have a commit [1] that MIGHT fix this.
I have a script that recreates the problem, using random data in pg_temp.
And a nested cursor.

  It took me a few days to reduce this from actual code that was
experiencing this.  If I turn off JIT, the problem goes away.  (if I don't
FETCH the first row, the memory loss does not happen.  Maybe because
opening a cursor is more decoration/prepare)

  I don't have an easy way to test this script right now against the commit.
I am hopeful that your fix fixes this.

  This was my first OOM issue in PG in 3yrs of working with it.

  The problem goes away if the TABLE is analyzed, or JIT is disabled.

  The current script, if run, will consume about 25% of my system memory
(10GB).
Just call the function below until it dies if that's what you need.  The
only way to get the memory back down is to close the connection.

SELECT pg_temp.fx(497);

Surprisingly, to me, the report from pg_get_backend_memory_contexts()
doesn't really show "missing memory", which  I thought it would.  (FWIW, we
caught this with multiple rounds of testing our code, slowing down, then
crashing...  Is there ANY way to interrogate that we are above X% of system
memory so we know to let this backend go?)

It takes about 18 minutes to run on my 4 CPU VM.

For now, we are going to add some ANALYZE statements to our code.
We will consider disabling JIT.

Thanks,
Kirk
[1] = 2cf50585e54a7b0c6bc62a087c69043ae57e4252
<https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=2cf50585e54a7b0c6bc62a087c69043ae57e4252>
CREATE TABLE pg_temp.parts
(
    seid             bigint,
    r_field_name_1   smallint,
    fr_field_name    smallint           NOT NULL,
    p1_field_name    varchar(4),
    qty_field_name   integer,
    p5_field_name    varchar(30),
    partnum          varchar(30),
    st_field_name    smallint DEFAULT 0 NOT NULL
); -- drop table pg_temp.parts;

INSERT INTO pg_temp.parts (seid, partnum, qty_field_name, fr_field_name, 
st_field_name)
SELECT (RANDOM() * 3821 + 1)::bigint                                   AS seid,
       (RANDOM() * 123456789)::text                                    AS 
partnum,
       CASE
           WHEN q.rnd BETWEEN 0 AND 0.45 THEN FLOOR(RANDOM() * 900) + 100 -- 
Random number in the range [100, 999]
           WHEN q.rnd BETWEEN 0.46 AND 0.96 THEN LEAST(TRUNC(FLOOR(RANDOM() * 
999999) + 1000)::int, 999999::int) -- Random number in the range [1000, 9999]
           ELSE FLOOR(RANDOM() * 9000000) + 1000000 -- Random number in the 
range [100000, 999999]
           END                                                         AS 
qty_field_name,
       CASE WHEN RANDOM() < 0.72 THEN 0::smallint ELSE 1::smallint END AS 
fr_field_name,
       CASE WHEN RANDOM() < 0.46 THEN 1::smallint ELSE 2::smallint END AS 
st_field_name
  FROM (SELECT RANDOM() AS rnd, x FROM GENERATE_SERIES(1, 90_000_000) x) q;

CREATE INDEX idx_parts_supid ON pg_temp.parts USING btree (seid, p1_field_name, 
partnum, st_field_name, r_field_name_1, qty_field_name);
CREATE INDEX idx_parts_p5 ON pg_temp.parts USING btree (p5_field_name, seid, 
st_field_name, r_field_name_1, p1_field_name);
CREATE INDEX idx_parts_partnum ON pg_temp.parts USING btree (partnum, seid, 
st_field_name, r_field_name_1, p1_field_name);

CREATE OR REPLACE FUNCTION pg_temp.fx(asupplier bigint = 497 )
    RETURNS void
    LANGUAGE plpgsql
AS
$function$
DECLARE
    supplier_parts       CURSOR (sid bigint) FOR  -- Again, selecting with 
COUNT() would reduce 1 query per row!
        SELECT
            partnum, qty_field_name, st_field_name, sum(qty_field_name) as qty
            FROM pg_temp.parts
            WHERE seid = sid AND (st_field_name = 1)
            GROUP BY partnum, qty_field_name, st_field_name
            ORDER BY partnum, qty_field_name, st_field_name;
    supplier_part_qty_matches CURSOR (sid bigint, pnum varchar(30), pqty 
bigint) FOR
        SELECT DISTINCT
            seid, fr_field_name, partnum, st_field_name
            FROM pg_temp.parts
            WHERE seid <> sid AND partnum = pnum AND qty_field_name = pqty
            ORDER BY seid, partnum;

    a_partnum     varchar(30);
    a_qty         integer;
    a_st          smallint;
    a_cnt         integer = 0;
    b_partnum     varchar(30);
    b_fr          smallint;
    b_seid        bigint;
    b_st          smallint;
    b_cnt         bigint = 0;
BEGIN
    RAISE NOTICE '%', (SELECT (PG_SIZE_PRETTY(SUM(used_bytes)), 
PG_SIZE_PRETTY(SUM(total_bytes)), PG_SIZE_PRETTY(SUM(free_bytes))) FROM 
pg_get_backend_memory_contexts());
    OPEN supplier_parts (asupplier);
    LOOP
        FETCH supplier_parts INTO a_partnum, a_qty, a_st, a_qty;
        EXIT WHEN NOT FOUND;
        a_cnt := a_cnt + 1;
        OPEN supplier_part_qty_matches (sid := asupplier, pnum := a_partnum, 
pqty := a_qty);
        LOOP
            FETCH supplier_part_qty_matches INTO b_seid, b_fr, b_partnum, b_st;
            b_cnt := b_cnt + 1;
            EXIT WHEN TRUE;  -- no Need to loop here  One FETCH per query 
triggers the losses.
        END LOOP;
        CLOSE supplier_part_qty_matches;
    END LOOP;
    CLOSE supplier_parts;
    RAISE NOTICE '-----------after close, Count a: %, count b: %', a_cnt, b_cnt;
    RAISE NOTICE '%', (SELECT (PG_SIZE_PRETTY(SUM(used_bytes)), 
PG_SIZE_PRETTY(SUM(total_bytes)), PG_SIZE_PRETTY(SUM(free_bytes))) FROM 
pg_get_backend_memory_contexts());
    --perform meminfo();
END;
$function$;

-- This will use JIT until the table is analyzed, which causes the problem
explain SELECT DISTINCT seid, fr_field_name, st_field_name
          FROM pg_temp.parts
         WHERE seid <> 497 AND partnum >= '1'
         ORDER BY seid;

-- But using JIT results in loss of some bytes.
SELECT pg_temp.fx(497);


Reply via email to