> > ERROR:  index "pg_class_oid_index" is not a btree
> 
> That means you got bogus data while reading the metapage.
> I'm beginning to wonder about the hardware on this server ...

This happened again, and this time I went back through 
the logs and found that it is always the exact same query causing 
the issue. I also found it occurring on different servers, 
which rules out RAM at least (the disks are still shared, so those remain suspect).
This query also sometimes gives errors like this:

ERROR:  could not read block 3 of relation 1663/1554846571/3925298284: 
  read only 0 of 8192 bytes

The final number changes each time: these are invariably temporary relations. 
The query itself is a GROUP BY over a large view and the explain plan is 
107 rows long, with nothing esoteric about it. Most of the tables used are 
fairly common ones. I'm trying to duplicate it on a non-production box, without 
success so far, and I'm loath to run it on production as it sometimes 
causes multiple backends to freeze up and requires a forceful restart.

Any ideas on how to carefully debug this? There are a couple of quicksorts 
when I explain analyze on a non-prod system, which I am guessing is where 
the temp tables come from (work_mem is 24MB). I'm not sure I understand 
what could be causing both the 'read 0' and btree errors for the 
same query - bad blocks on disk for one of the underlying tables?
I'll work next on checking each of the tables the view is using.
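
For going back through the logs, the 'read 0' errors above could be tallied 
with something like the following. This is only a minimal Python sketch, and 
it assumes the three numbers in the message are tablespace OID / database OID / 
relfilenode and that the block size is the 8192 shown in the error. The 
function names are made up for illustration.

```python
import re

# Matches the "could not read block" message quoted above. The path is
# assumed to be tablespace OID / database OID / relfilenode.
READ_ERROR = re.compile(
    r"could not read block (?P<block>\d+) of relation "
    r"(?P<tablespace>\d+)/(?P<database>\d+)/(?P<relfilenode>\d+)"
)

BLOCK_SIZE = 8192  # assumed from "read only 0 of 8192 bytes"


def parse_read_error(line):
    """Return the decoded fields of a read error line, or None if no match."""
    m = READ_ERROR.search(line)
    if m is None:
        return None
    fields = {name: int(value) for name, value in m.groupdict().items()}
    # Byte offset in the relation file where the failed block would start.
    fields["byte_offset"] = fields["block"] * BLOCK_SIZE
    return fields


def summarize(log_lines):
    """Group failing block numbers by (database, relfilenode)."""
    seen = {}
    for line in log_lines:
        hit = parse_read_error(line)
        if hit is not None:
            key = (hit["database"], hit["relfilenode"])
            seen.setdefault(key, []).append(hit["block"])
    return seen
```

Feeding the server log lines to summarize() would show whether the failing 
relfilenode really differs every time and whether the same block numbers recur.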

-- 
Greg Sabino Mullane g...@endpoint.com
End Point Corporation
PGP Key: 0x14964AC8