On Wed, Aug 03, 2011 at 11:18:20AM -0400, Tom Lane wrote:
> Evidently not, if it's not logging anything, but now the question is
> why. One possibility is that for some reason RelationGetNumberOfBlocks
> is persistently lying about the file size. (We've seen kernel bugs
> before that resulted in transiently wrong values, so this isn't totally
> beyond the realm of possibility.) Please try the attached patch, which
> extends the previous one to add a summary line including the number of
> blocks physically scanned by the seqscan.
OK, I have results from the latest patch and have attached a redacted
server log with the output of the relfilenode SELECT added inline. This is
the shorter of the logs and shows the sequence pretty clearly; I have
additional logs if they would be useful.

Summary: the failing process reads 0 rows from 0 blocks of the OLD
relfilenode.
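
For reference, the summary line Tom's patch emits is the one at 25.736 below
("ScanPgRelationDetailed: found 0 tuples with OID 2662 in 0 blocks of
filenode 1245833951"). The instrumentation amounts to roughly the following
shape; this is only my sketch against the 9.0-era heapam API, not the actual
patch, and the function name, includes, and block accounting are
illustrative:

/*
 * Sketch only, not the real patch: do a plain sequential scan of
 * pg_class for the target OID and log how many matching tuples were
 * seen, how many blocks the seqscan believed the relation had, and
 * which relfilenode was being read.
 */
#include "postgres.h"

#include "access/heapam.h"
#include "access/htup.h"
#include "access/relscan.h"
#include "access/skey.h"
#include "access/sysattr.h"
#include "catalog/pg_class.h"
#include "utils/fmgroids.h"
#include "utils/rel.h"
#include "utils/tqual.h"

static HeapTuple
ScanPgRelationDetailedSketch(Oid targetRelId)
{
    Relation    pg_class_desc;
    HeapScanDesc scan;
    ScanKeyData key[1];
    HeapTuple   tup;
    HeapTuple   result = NULL;
    int         ntuples = 0;

    /* match pg_class rows whose oid system column equals the target */
    ScanKeyInit(&key[0],
                ObjectIdAttributeNumber,
                BTEqualStrategyNumber, F_OIDEQ,
                ObjectIdGetDatum(targetRelId));

    pg_class_desc = heap_open(RelationRelationId, AccessShareLock);

    /* forced heap scan, so a bogus index can't hide anything */
    scan = heap_beginscan(pg_class_desc, SnapshotNow, 1, key);

    while ((tup = heap_getnext(scan, ForwardScanDirection)) != NULL)
    {
        ntuples++;
        if (result == NULL)
            result = heap_copytuple(tup);
    }

    /*
     * rs_nblocks is the relation size captured at scan start via
     * RelationGetNumberOfBlocks(); if that call is lying about the
     * file size, it shows up right here.
     */
    elog(LOG, "ScanPgRelationDetailed: found %d tuples with OID %u "
         "in %u blocks of filenode %u",
         ntuples, targetRelId, scan->rs_nblocks,
         pg_class_desc->rd_node.relNode);

    heap_endscan(scan);
    heap_close(pg_class_desc, AccessShareLock);

    return result;
}
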
-dg
--
David Gould [email protected] 510 536 1443 510 282 0869
If simplicity worked, the world would be overrun with insects.
SS.msec pid db user statement
------ ----- --- -------- -----------------------------------------------
03.804 29706 c27 postgres connection authorized: user=postgres database=c27
...
03.824 29706 c27 postgres set statement_timeout=600000;
03.824 29706 c27 postgres 0.364 ms
03.825 29706 c27 postgres select current_database() as db, relname, oid,
pg_relation_filenode(oid) as filenode, clock_timestamp()::time as ts from
pg_class where relname like 'pg_class%';
03.829 29706 c27 postgres 4.173 ms
db | relname | oid | filenode | ts
-----+----------------------------+------+------------+-----------------
c27 | pg_class | 1259 | 1245833951 | 21:31:03.828293
c27 | pg_class_oid_index | 2662 | 1245833955 | 21:31:03.828791
c27 | pg_class_relname_nsp_index | 2663 | 1259100530 | 21:31:03.828807
03.829 29706 c27 postgres vacuum full pg_catalog.pg_class;
03.829 29706 c27 postgres LOCATION: exec_simple_query, postgres.c:900
...
03.845 29707 c27 postgres disconnection: session time: 0:00:00.041
user=postgres database=c27 host=bk-0
...
08.856 29706 c27 postgres process 29706 still waiting for RowExclusiveLock
on relation 1214 of database 0 after 5000.483 ms
08.856 29706 c27 postgres LOCATION: ProcSleep, proc.c:1059
08.856 29706 c27 postgres STATEMENT: vacuum full pg_catalog.pg_class;
09.383 29711 LOG: 00000: process 29711 still waiting for AccessShareLock on
relation 1259 of database 16408 after 5000.331 ms
09.383 29711 LOCATION: ProcSleep, proc.c:1059
...
11.559 28857 c27 apps SELECT
...
16.560 28857 c27 apps process 28857 still waiting for AccessShareLock
on relation 1259 of database 16408 after 5001.209 ms
16.560 28857 c27 apps LOCATION: ProcSleep, proc.c:1059
16.560 28857 c27 apps STATEMENT: SELECT
...
19.763 29706 c27 postgres process 29706 acquired RowExclusiveLock on
relation 1214 of database 0 after 15907.284 ms
19.763 29706 c27 postgres LOCATION: ProcSleep, proc.c:1063
19.763 29706 c27 postgres STATEMENT: vacuum full pg_catalog.pg_class;
...
25.735 29711 LOG: 00000: process 29711 acquired AccessShareLock on relation
1259 of database 16408 after 21352.393 ms
25.735 29711 LOCATION: ProcSleep, proc.c:1063
25.735 28857 c27 apps process 28857 acquired AccessShareLock on
relation 1259 of database 16408 after 14176.040 ms
25.735 28857 c27 apps LOCATION: ProcSleep, proc.c:1063
25.735 28857 c27 apps STATEMENT: SELECT
25.736 28857 c27 apps ScanPgRelationDetailed: found 0 tuples with OID
2662 in 0 blocks of filenode 1245833951
25.736 28857 c27 apps LOCATION: ScanPgRelationDetailed, relcache.c:372
25.736 28857 c27 apps STATEMENT: SELECT
25.736 28857 c27 apps ERROR: XX000: could not find pg_class tuple for
index 2662
25.736 28857 c27 apps LOCATION: RelationReloadIndexInfo, relcache.c:1816
25.736 28857 c27 apps STATEMENT: SELECT
25.736 29706 c27 postgres 21906.865 ms
25.737 29706 c27 postgres select current_database() as db, relname, oid,
pg_relation_filenode(oid) as filenode, clock_timestamp()::time as ts from
pg_class where relname like 'pg_class%';
25.767 29706 c27 postgres 30.902 ms
db | relname | oid | filenode | ts
-----+----------------------------+------+------------+-----------------
c27 | pg_class | 1259 | 1279787837 | 21:31:25.76726
c27 | pg_class_oid_index | 2662 | 1279788022 | 21:31:25.767764
c27 | pg_class_relname_nsp_index | 2663 | 1279788023 | 21:31:25.767782
25.768 29706 c27 postgres vacuum full pg_catalog.pg_attribute;
25.775 28857 c27 apps disconnection: session time: 0:07:07.758
user=apps database=c27 host=op-01
25.775 28857 c27 apps LOCATION: log_disconnections, postgres.c:4339
...
30.914 29711 LOG: 00000: process 29711 still waiting for AccessShareLock on
relation 1249 of database 16408 after 5000.857 ms
30.914 29711 LOCATION: ProcSleep, proc.c:1059
35.770 29840 c27 apps connection authorized: user=apps database=c27
35.770 29840 c27 apps LOCATION: PerformAuthentication, postinit.c:242
40.773 29840 c27 apps process 29840 still waiting for AccessShareLock
on relation 1249 of database 16408 after 5000.530 ms
40.773 29840 c27 apps LOCATION: ProcSleep, proc.c:1059
...
44.545 29711 LOG: 00000: process 29711 acquired AccessShareLock on relation
1249 of database 16408 after 18631.071 ms
44.545 29711 LOCATION: ProcSleep, proc.c:1063
44.545 29840 c27 apps process 29840 acquired AccessShareLock on
relation 1249 of database 16408 after 8772.382 ms
44.545 29840 c27 apps LOCATION: ProcSleep, proc.c:1063
44.546 29706 c27 postgres 18777.831 ms
44.546 29706 c27 postgres vacuum full pg_catalog.pg_shdepend;
44.550 29711 LOG: 00000: automatic vacuum of table "c27.pg_catalog.pg_class":
index scans: 1
pages: 0 removed, 67 remain
44.550 29711 LOCATION: lazy_vacuum_rel, vacuumlazy.c:244
44.552 29840 c27 apps SHOW default_transaction_isolation
44.552 29840 c27 apps 0.326 ms