Am 20.02.20 21:41 schrieb(en) Adrian Klaver:
It would be nice to know what:
[snip]
represented in:

Dropping and re-creating the function is actually the last operation in the script.  The 
function is /very/ simple (just a wrapper to hide all internals from "agent" 
clients):

---8<--------------------------------------------------------------------
DROP FUNCTION public.get_result2(mytaskid bigint, OUT data bytea, OUT metadata 
jsonb, OUT errortext text, OUT vanished boolean);
CREATE FUNCTION public.get_result2(mytaskid bigint, OUT data bytea, OUT 
metadata jsonb, OUT errortext text, OUT vanished boolean) RETURNS record
    LANGUAGE plpgsql STABLE SECURITY DEFINER
    SET search_path TO 'public', 'pg_temp'
    AS $$
BEGIN
    SELECT r.data, r.metadata, r.errortext FROM results r INNER JOIN tasks 
USING(resultid) WHERE taskid = mytaskid LIMIT 1 INTO data, metadata, errortext;
    SELECT COUNT(*) = 0 FROM tasks WHERE taskid = mytaskid INTO vanished;
END;
$$;
ALTER FUNCTION public.get_result2(mytaskid bigint, OUT data bytea, OUT metadata 
jsonb, OUT errortext text, OUT vanished boolean) OWNER TO manager;
REVOKE ALL ON FUNCTION public.get_result2(mytaskid bigint, OUT data bytea, OUT 
metadata jsonb, OUT errortext text, OUT vanished boolean) FROM PUBLIC;
GRANT ALL ON FUNCTION public.get_result2(mytaskid bigint, OUT data bytea, OUT metadata 
jsonb, OUT errortext text, OUT vanished boolean) TO "agent";
COMMIT;
---8<--------------------------------------------------------------------

The Postgres logs during and after restart might provide some info.

Also the errors thrown when accessing the other function.

I attach the (slightly stripped down; I don't want to post ~100k…) log, 
starting with the very first error at 13:39:59.302 UTC.  Prior to that line are 
*no* errors.  I added a few [comments].

At 13:39:59.484 the error message changes, referring to an ancient function 
“retrieve_single_result()” which (according to the person who wrote the “agent” 
client) is *not* called.  The clients try periodically poll “get_result2()”.

At 13:42:00 the “systemctl restart” has been initiated.  At 13:42:02 the 
database has been stopped and is restarted immediately, revealing one 
misconfigured client (should be harmless), but the cache lookup error persists.

Thanks in advance for your help,
Albrecht.
[*no* error prior to this line]
2020-02-20 13:39:59.302 UTC [27971] agent@taskdb ERROR:  cache lookup failed for function 1821571
2020-02-20 13:39:59.302 UTC [27971] agent@taskdb CONTEXT:  PL/pgSQL function get_result2(bigint) while casting return value to function's return type
2020-02-20 13:39:59.302 UTC [27971] agent@taskdb STATEMENT:  SELECT data, metadata, errortext, vanished FROM get_result2(26671107)
2020-02-20 13:39:59.302 UTC [28048] agent@taskdb ERROR:  cache lookup failed for function 1821571
2020-02-20 13:39:59.302 UTC [28048] agent@taskdb CONTEXT:  PL/pgSQL function get_result2(bigint) while casting return value to function's return type
2020-02-20 13:39:59.302 UTC [28048] agent@taskdb STATEMENT:  SELECT data, metadata, errortext, vanished FROM get_result2(26670465)
2020-02-20 13:39:59.302 UTC [28021] agent@taskdb ERROR:  cache lookup failed for function 1821571
2020-02-20 13:39:59.302 UTC [28021] agent@taskdb CONTEXT:  PL/pgSQL function get_result2(bigint) while casting return value to function's return type
2020-02-20 13:39:59.302 UTC [28021] agent@taskdb STATEMENT:  SELECT data, metadata, errortext, vanished FROM get_result2(26670465)
[… repeating again and again, with different arguments for the DB function]
2020-02-20 13:39:59.333 UTC [28017] agent@taskdb ERROR:  cache lookup failed for function 1821571
2020-02-20 13:39:59.333 UTC [28017] agent@taskdb CONTEXT:  PL/pgSQL function get_result2(bigint) while casting return value to function's return type
2020-02-20 13:39:59.333 UTC [28017] agent@taskdb STATEMENT:  SELECT data, metadata, errortext, vanished FROM get_result2(26671517)
2020-02-20 13:39:59.335 UTC [27999] agent@taskdb ERROR:  cache lookup failed for function 1821571
2020-02-20 13:39:59.335 UTC [27999] agent@taskdb CONTEXT:  PL/pgSQL function get_result2(bigint) while casting return value to function's return type
2020-02-20 13:39:59.335 UTC [27999] agent@taskdb STATEMENT:  SELECT data, metadata, errortext, vanished FROM get_result2(26671570)
2020-02-20 13:39:59.484 UTC [2504] agent@taskdb ERROR:  error result "Error: cache lookup failed for function 1821571
	CONTEXT:  PL/pgSQL function get_result2(bigint) while casting return value to function's return type
	"
2020-02-20 13:39:59.484 UTC [2504] agent@taskdb CONTEXT:  PL/pgSQL function retrieve_single_result(bigint) line 7 at RAISE
2020-02-20 13:39:59.484 UTC [2504] agent@taskdb STATEMENT:  SELECT data, metadata FROM retrieve_single_result(26619789)
2020-02-20 13:39:59.504 UTC [2504] agent@taskdb ERROR:  error result "Error: cache lookup failed for function 1821571
	CONTEXT:  PL/pgSQL function get_result2(bigint) while casting return value to function's return type
	"
2020-02-20 13:39:59.504 UTC [2504] agent@taskdb CONTEXT:  PL/pgSQL function retrieve_single_result(bigint) line 7 at RAISE
2020-02-20 13:39:59.504 UTC [2504] agent@taskdb STATEMENT:  SELECT data, metadata FROM retrieve_single_result(26618953)
2020-02-20 13:39:59.514 UTC [2504] agent@taskdb ERROR:  error result "Error: cache lookup failed for function 1821571
	CONTEXT:  PL/pgSQL function get_result2(bigint) while casting return value to function's return type
	"
2020-02-20 13:39:59.514 UTC [2504] agent@taskdb CONTEXT:  PL/pgSQL function retrieve_single_result(bigint) line 7 at RAISE
2020-02-20 13:39:59.514 UTC [2504] agent@taskdb STATEMENT:  SELECT data, metadata FROM retrieve_single_result(26483130)
[…repeating again and again, with different arguments for the DB function]
2020-02-20 13:41:45.469 UTC [2504] agent@taskdb ERROR:  error result "Error: cache lookup failed for function 1821571
	CONTEXT:  PL/pgSQL function get_result2(bigint) while casting return value to function's return type
	"
2020-02-20 13:41:45.469 UTC [2504] agent@taskdb CONTEXT:  PL/pgSQL function retrieve_single_result(bigint) line 7 at RAISE
2020-02-20 13:41:45.469 UTC [2504] agent@taskdb STATEMENT:  SELECT data, metadata FROM retrieve_single_result(26619490)
2020-02-20 13:42:00.000 UTC [2036] LOG:  cron job 1 starting: SELECT FROM task_watchdog()
2020-02-20 13:42:00.010 UTC [2036] LOG:  cron job 1 completed: 1 row
2020-02-20 13:42:00.404 UTC [2027] LOG:  received fast shutdown request
2020-02-20 13:42:00.404 UTC [2027] LOG:  aborting any active transactions
2020-02-20 13:42:00.405 UTC [28051] agent@taskdb FATAL:  terminating connection due to administrator command
2020-02-20 13:42:00.405 UTC [28055] agent@taskdb FATAL:  terminating connection due to administrator command
[…many more clients being disconnected…]
2020-02-20 13:42:00.407 UTC [28036] agent@taskdb FATAL:  terminating connection due to administrator command
2020-02-20 13:42:00.407 UTC [7065] agent@taskdb FATAL:  terminating connection due to administrator command
2020-02-20 13:42:00.407 UTC [2036] LOG:  pg_cron scheduler shutting down
2020-02-20 13:42:00.407 UTC [28042] agent@taskdb FATAL:  terminating connection due to administrator command
2020-02-20 13:42:00.408 UTC [27987] agent@taskdb FATAL:  terminating connection due to administrator command
[…still disconnecting clients…]
2020-02-20 13:42:00.433 UTC [20299] agent@taskdb FATAL:  terminating connection due to administrator command
2020-02-20 13:42:00.434 UTC [20296] agent@taskdb FATAL:  terminating connection due to administrator command
[…]
2020-02-20 13:42:00.713 UTC [1895] agent@taskdb FATAL:  the database system is shutting down
2020-02-20 13:42:00.717 UTC [1896] agent@taskdb FATAL:  the database system is shutting down
2020-02-20 13:42:00.725 UTC [1897] agent@taskdb FATAL:  the database system is shutting down
[…and many more of these…]
2020-02-20 13:42:00.729 UTC [1923] agent@taskdb FATAL:  the database system is shutting down
2020-02-20 13:42:00.738 UTC [1928] agent@taskdb FATAL:  the database system is shutting down
2020-02-20 13:42:00.742 UTC [2027] LOG:  background worker "logical replication launcher" (PID 2037) exited with exit code 1
2020-02-20 13:42:00.745 UTC [1902] agent@taskdb FATAL:  the database system is shutting down
2020-02-20 13:42:00.747 UTC [1900] agent@taskdb FATAL:  the database system is shutting down
[…]
2020-02-20 13:42:01.489 UTC [1958] agent@taskdb FATAL:  the database system is shutting down
2020-02-20 13:42:01.491 UTC [1956] agent@taskdb FATAL:  the database system is shutting down
2020-02-20 13:42:01.494 UTC [2031] LOG:  shutting down
2020-02-20 13:42:01.497 UTC [1957] agent@taskdb FATAL:  the database system is shutting down
2020-02-20 13:42:01.498 UTC [1955] agent@taskdb FATAL:  the database system is shutting down
[…]
2020-02-20 13:42:01.766 UTC [1988] agent@taskdb FATAL:  the database system is shutting down
2020-02-20 13:42:01.940 UTC [1989] agent@taskdb FATAL:  the database system is shutting down
2020-02-20 13:42:01.986 UTC [2027] LOG:  database system is shut down
[as I initiated s 'systemctl restart', the system is started immediately]
2020-02-20 13:42:02.214 UTC [1995] LOG:  listening on IPv4 address "0.0.0.0", port 5432
2020-02-20 13:42:02.214 UTC [1995] LOG:  listening on IPv6 address "::", port 5432
2020-02-20 13:42:02.215 UTC [1995] LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
2020-02-20 13:42:02.243 UTC [1996] LOG:  database system was shut down at 2020-02-20 13:42:01 UTC
2020-02-20 13:42:02.265 UTC [1995] LOG:  database system is ready to accept connections
2020-02-20 13:42:02.284 UTC [1997] agent@taskdb FATAL:  the database system is starting up
2020-02-20 13:42:02.287 UTC [2005] agent@taskdb FATAL:  no pg_hba.conf entry for host "172.16.98.8", user "agent", database "taskdb", SSL off
[misconfigured client, the message is ok]
2020-02-20 13:42:02.292 UTC [2003] LOG:  pg_cron scheduler started
2020-02-20 13:42:02.782 UTC [2012] [unknown]@[unknown] LOG:  incomplete startup packet
2020-02-20 13:42:20.350 UTC [2133] agent@taskdb ERROR:  error result "Error: cache lookup failed for function 1821571
	CONTEXT:  PL/pgSQL function get_result2(bigint) while casting return value to function's return type
	"
2020-02-20 13:42:20.350 UTC [2133] agent@taskdb CONTEXT:  PL/pgSQL function retrieve_single_result(bigint) line 7 at RAISE
2020-02-20 13:42:20.350 UTC [2133] agent@taskdb STATEMENT:  SELECT data, metadata FROM retrieve_single_result(26619789)
2020-02-20 13:42:20.374 UTC [2133] agent@taskdb ERROR:  error result "Error: cache lookup failed for function 1821571
	CONTEXT:  PL/pgSQL function get_result2(bigint) while casting return value to function's return type
	"
2020-02-20 13:42:20.374 UTC [2133] agent@taskdb CONTEXT:  PL/pgSQL function retrieve_single_result(bigint) line 7 at RAISE
2020-02-20 13:42:20.374 UTC [2133] agent@taskdb STATEMENT:  SELECT data, metadata FROM retrieve_single_result(26618953)
2020-02-20 13:42:20.388 UTC [2133] agent@taskdb ERROR:  error result "Error: cache lookup failed for function 1821571
	CONTEXT:  PL/pgSQL function get_result2(bigint) while casting return value to function's return type
	"
2020-02-20 13:42:20.388 UTC [2133] agent@taskdb CONTEXT:  PL/pgSQL function retrieve_single_result(bigint) line 7 at RAISE
2020-02-20 13:42:20.388 UTC [2133] agent@taskdb STATEMENT:  SELECT data, metadata FROM retrieve_single_result(26483130)
[…continues with the same error again and again]

Attachment: pgpW6cu1QwYvL.pgp
Description: PGP signature

Reply via email to