On Fri, Nov 13, 2020 at 01:39:31PM -0300, Alvaro Herrera wrote:
> On 2020-Nov-13, Justin Pryzby wrote:
>
> > I saw a bunch of these in my logs:
> >
> > log_time | 2020-10-25 22:59:45.619-07
> > database |
> > left | could not open relation with OID 292103095
> > left | processing work entry for relation "ts.child.alarms_202010_alarm_clear_time_idx"
> >
> > Those happen following a REINDEX job on that index.
> >
> > I think that should be more like an INFO message, since that's what
> > vacuum does (vacuum_open_relation), and a queued work item is even more
> > likely to hit a dropped relation.
>
> Ah, interesting. Yeah, I agree this is a bug. I think it can be fixed
> by using try_relation_open() on the index; if that returns NULL, discard
> the work item.
>
> Does this patch solve the problem?
Your patch didn't actually say "try_relation_open", so it didn't work.
But it does work if I do that, and close the table.
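
For illustration only, the "try to open, and quietly discard the work item
if the relation is gone" pattern can be sketched as a standalone C program.
This is not backend code: every demo_* name, the DemoRelation struct, and
the OIDs 1001/1002 are hypothetical stand-ins, and the sketch only assumes
that a try-open call returns NULL for a dropped relation instead of raising
an error.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef unsigned int Oid;		/* stand-in, not PostgreSQL's definition */

/* Hypothetical in-memory "catalog" entry. */
typedef struct DemoRelation
{
	Oid			oid;
	bool		exists;			/* false once the relation is "dropped" */
	int			open_count;		/* how many times it is currently open */
} DemoRelation;

static DemoRelation demo_catalog[] = {
	{1001, true, 0},			/* plays the role of the heap */
	{1002, true, 0},			/* plays the role of the BRIN index */
};

#define DEMO_NRELS (sizeof(demo_catalog) / sizeof(demo_catalog[0]))

/* Return the relation, or NULL (no error) if it does not exist anymore. */
DemoRelation *
demo_try_open(Oid oid)
{
	for (size_t i = 0; i < DEMO_NRELS; i++)
	{
		if (demo_catalog[i].oid == oid && demo_catalog[i].exists)
		{
			demo_catalog[i].open_count++;
			return &demo_catalog[i];
		}
	}
	return NULL;
}

void
demo_close(DemoRelation *rel)
{
	assert(rel != NULL && rel->open_count > 0);
	rel->open_count--;
}

/* Simulate a concurrent drop, e.g. the old index after a reindex. */
void
demo_drop(Oid oid)
{
	for (size_t i = 0; i < DEMO_NRELS; i++)
	{
		if (demo_catalog[i].oid == oid)
			demo_catalog[i].exists = false;
	}
}

/* Returns 1 if the work item was processed, 0 if it was discarded. */
int
demo_summarize(Oid heapoid, Oid indexoid)
{
	DemoRelation *heap = demo_try_open(heapoid);
	DemoRelation *idx = demo_try_open(indexoid);

	if (heap == NULL || idx == NULL)
	{
		/* Release whichever one did open, then give up silently. */
		if (heap != NULL)
			demo_close(heap);
		if (idx != NULL)
			demo_close(idx);
		return 0;
	}

	/* ... the real summarization work would happen here ... */

	demo_close(idx);
	demo_close(heap);
	return 1;
}
```

With both relations present, demo_summarize(1001, 1002) returns 1; after
demo_drop(1002) the same call returns 0 and leaves nothing open.
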
I tested like:
pryzbyj=# ALTER SYSTEM SET backtrace_functions='try_relation_open,relation_open';
pryzbyj=# ALTER SYSTEM SET autovacuum_naptime=3; SELECT pg_reload_conf();
pryzbyj=# CREATE TABLE tt AS SELECT generate_series(1,9999)i;
pryzbyj=# CREATE INDEX ON tt USING brin(i) WITH(autosummarize,pages_per_range=1);
pryzbyj=# \! while :; do psql -h /tmp -qc 'SET client_min_messages=info' -c 'REINDEX INDEX CONCURRENTLY tt_i_idx'; done&
-- run this 5-10 times and hit the "...was not recorded" message, which for
-- whatever reason causes the race condition involving the work queue
pryzbyj=# UPDATE tt SET i=1+i;
2020-11-13 11:50:46.093 CST [30687] ERROR: could not open relation with OID 1110882
2020-11-13 11:50:46.093 CST [30687] CONTEXT: processing work entry for relation "pryzbyj.public.tt_i_idx"
2020-11-13 11:50:46.093 CST [30687] BACKTRACE:
postgres: autovacuum worker pryzbyj(+0xb9ce8) [0x55acf2af0ce8]
postgres: autovacuum worker pryzbyj(index_open+0xb) [0x55acf2bab59b]
postgres: autovacuum worker pryzbyj(brin_summarize_range+0x8f) [0x55acf2b5b5bf]
postgres: autovacuum worker pryzbyj(DirectFunctionCall2Coll+0x62) [0x55acf2f40372]
...
--
Justin
From e08c6d3e2b10964633904ff247e70330077d31b4 Mon Sep 17 00:00:00 2001
From: Alvaro Herrera <[email protected]>
Date: Fri, 13 Nov 2020 13:39:31 -0300
Subject: [PATCH v2] error_severity of brin work item
---
src/backend/access/brin/brin.c | 15 ++++++++++++---
1 file changed, 12 insertions(+), 3 deletions(-)
diff --git a/src/backend/access/brin/brin.c b/src/backend/access/brin/brin.c
index 1f72562c60..8278a5209c 100644
--- a/src/backend/access/brin/brin.c
+++ b/src/backend/access/brin/brin.c
@@ -887,8 +887,10 @@ brin_summarize_range(PG_FUNCTION_ARGS)
/*
* We must lock table before index to avoid deadlocks. However, if the
* passed indexoid isn't an index then IndexGetRelation() will fail.
- * Rather than emitting a not-very-helpful error message, postpone
- * complaining, expecting that the is-it-an-index test below will fail.
+ * Rather than emitting a not-very-helpful error message, prepare to
+ * return without doing anything. This allows autovacuum work-items to be
+ * silently discarded rather than uselessly accumulating error messages in
+ * the server log.
*/
heapoid = IndexGetRelation(indexoid, true);
if (OidIsValid(heapoid))
@@ -896,7 +898,14 @@ brin_summarize_range(PG_FUNCTION_ARGS)
else
heapRel = NULL;
- indexRel = index_open(indexoid, ShareUpdateExclusiveLock);
+ indexRel = try_relation_open(indexoid, ShareUpdateExclusiveLock);
+ if (heapRel == NULL || indexRel == NULL)
+ {
+ if (heapRel != NULL)
+ table_close(heapRel, ShareUpdateExclusiveLock);
+
+ PG_RETURN_INT32(0);
+ }
/* Must be a BRIN index */
if (indexRel->rd_rel->relkind != RELKIND_INDEX ||
--
2.17.0