Re: [GENERAL] Rows missing from table despite FK constraint
Guys, just following up on the issue... For no apparent reason the row became visible in seq scans. After another REINDEX everything appears to be correct. I don't know what the issue was and I'm rather uncomfortable about how it appeared and went away, but anyway it seems to have been resolved. Thank you all for help. -- Konrad Garus -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Rows missing from table despite FK constraint
2010/1/8 Alban Hertroys dal...@solfertje.student.utwente.nl: Did you turn off seqscans in the postgres.conf? Seq scan is enabled. Could you try a REINDEX TABLE attachment again in case you somehow reindexed the wrong index or table? How about this test? On a dump from before the rows were gone: # select count(*) from attachment where when_uploaded '2010-01-01'; count -- 22523642 (1 row) On production database: # explain select count(*) from attachment where when_uploaded '2010-01-01'; QUERY PLAN -- Aggregate (cost=1794931.20..1794931.21 rows=1 width=0) - Seq Scan on attachment (cost=0.00..1738076.24 rows=22741985 width=0) Filter: (when_uploaded '2010-01-01 00:00:00'::timestamp without time zone) (3 rows) # select count(*) from attachment where when_uploaded '2010-01-01'; count -- 22523639 (1 row) -- Konrad Garus -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Rows missing from table despite FK constraint
On 7 Jan 2010, at 11:12, Konrad Garus wrote: Hello, We use PG 8.3. We use pg_dump and pg_restore overnight to create ... You seem to have lost the actual data, not the index entries pointing to it, or a sequential scan (eg. pg_dump) would still have found your rows. Do you have any ideas on how it could possibly happen? What research could help find the root cause and fix the database? What kind of file-system is the affected table on? - and while we're at it, what OS/Distribution and version? Is your data on some kind of RAID array? If so, what type (hardware/software, RAID type)? I get the impression the data you lost and the data around it hasn't been written to in a long time; it wouldn't surprise me if your problem would have been caused by a bad sector on a disk, but that depends on how reliable your storage is set up to be. Bad memory is another typical cause of corruption, but not likely in your case. And of course there could be a bug in PG; are you up to date on the minor versions? Alban Hertroys -- If you can't see the forest for the trees, cut the trees and you'll see there is no forest. !DSPAM:737,4b47130010732637119309! -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Rows missing from table despite FK constraint
2010/1/8 Alban Hertroys dal...@solfertje.student.utwente.nl: You seem to have lost the actual data, not the index entries pointing to it, or a sequential scan (eg. pg_dump) would still have found your rows. I agree. What kind of file-system is the affected table on? - and while we're at it, what OS/Distribution and version? Is your data on some kind of RAID array? If so, what type (hardware/software, RAID type)? It's ext3 on a hardware RAID1. The array is in perfect condition, according to its diag tool. The OS is Ubuntu 8.04. The exact PG version is 8.3.8. I get the impression the data you lost and the data around it hasn't been written to in a long time; it wouldn't surprise me if your problem would have been caused by a bad sector on a disk, but that depends on how reliable your storage is set up to be. You are correct about the first point. It's a write-only table with thousands of inserts daily, and the lost rows were written 7 months ago. -- Konrad Garus -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Rows missing from table despite FK constraint
Konrad Garus konrad.ga...@gmail.com writes: 2010/1/8 Alban Hertroys dal...@solfertje.student.utwente.nl: I get the impression the data you lost and the data around it hasn't been written to in a long time; it wouldn't surprise me if your problem would have been caused by a bad sector on a disk, but that depends on how reliable your storage is set up to be. You are correct about the first point. It's a write-only table with thousands of inserts daily, and the lost rows were written 7 months ago. Do you know that the rows disappeared recently? regards, tom lane -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Rows missing from table despite FK constraint
2010/1/8 Tom Lane t...@sss.pgh.pa.us: Do you know that the rows disappeared recently? Yes. They are present in dump from 9 PM and missing from dump from 1 AM. It must've happened within this 4-hour window. -- Konrad Garus -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Rows missing from table despite FK constraint
Konrad Garus konrad.ga...@gmail.com writes: 2010/1/8 Tom Lane t...@sss.pgh.pa.us: Do you know that the rows disappeared recently? Yes. They are present in dump from 9 PM and missing from dump from 1 AM. It must've happened within this 4-hour window. Hm. It would be interesting to see if you can find the place where the rows had been and dump it with pg_filedump http://sources.redhat.com/rhdb/ You can look at the ctid column of the rows that are adjacent to the missing ones according to your older dump, and then dump out those blocks (I recommend -i -f style). regards, tom lane -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Rows missing from table despite FK constraint
How shall I do it? Is this correct: 1. Run: select ctid, * from attachment where ... on the table with such a WHERE clause that includes rows around the missing ones. ctid around missing rows seems to be (603712,78) and (603714,1). Note that 603713 is missing. 2. Run: select relfilenode from pg_class where relname = 'attachment'; (returns 922494) 3. Run: pg_filedump -i -f -R 603712 603714 /var/lib/postgresql/8.3/main/base/922438/922494 myfile I'm unsure about it, because the resulting file does not seem to have the rows I saw listed for block 603712 or 603714. I checked by text in VARCHAR columns. -- Konrad Garus -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Rows missing from table despite FK constraint
2010/1/8 Konrad Garus konrad.ga...@gmail.com: I'm unsure about it, because the resulting file does not seem to have the rows I saw listed for block 603712 or 603714. I checked by text in VARCHAR columns. I must've done something wrong. I found that row listed at: Block 603712 (header etc.) Item 15 -- Length: 184 Offset: 5496 (0x1578) Flags: NORMAL XMIN: 8124 XMAX: 0 CID|XVAC: 0 Block Id: 79424 linp Index: 15 Attributes: 19 Size: 32 infomask: 0x0903 (HASNULL|HASVARWIDTH|XMIN_COMMITTED|XMAX_INVALID) t_bits: [0]: 0xff [1]: 0x79 [2]: 0x07 Has ctid equal (79424,15). How do I run pg_filedump for rows with ctid between (603712,78) and (603714,1)? -- Konrad Garus -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Rows missing from table despite FK constraint
OK, I got it. Attached is the dump of the missing block. -- Konrad Garus missing_block Description: Binary data -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Rows missing from table despite FK constraint
Konrad Garus konrad.ga...@gmail.com writes: OK, I got it. Attached is the dump of the missing block. So which of these rows are invisible? According to the flags items 1, 3 and 5 should be visible while 2 and 4 are dead versions (of 3 and 5 respectively). regards, tom lane -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Rows missing from table despite FK constraint
2010/1/8 Tom Lane t...@sss.pgh.pa.us: So which of these rows are invisible? According to the flags items 1, 3 and 5 should be visible while 2 and 4 are dead versions (of 3 and 5 respectively). All 3 are invisible, and at the same time they are the only 3 rows missing from the table. -- Konrad Garus -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Rows missing from table despite FK constraint
Konrad Garus konrad.ga...@gmail.com writes: 2010/1/8 Tom Lane t...@sss.pgh.pa.us: So which of these rows are invisible? According to the flags items 1, 3 and 5 should be visible while 2 and 4 are dead versions (of 3 and 5 respectively). All 3 are invisible, and at the same time they are the only 3 rows missing from the table. Huh. Nothing obviously wrong with the data ... maybe an xid wraparound issue? What's your current XID counter? (pg_controldata is the easiest way to answer that) regards, tom lane -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Rows missing from table despite FK constraint
I wrote: Huh. Nothing obviously wrong with the data ... maybe an xid wraparound issue? What's your current XID counter? (pg_controldata is the easiest way to answer that) Also, what are the XMINs of the non-missing tuples in the adjacent blocks? regards, tom lane -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Rows missing from table despite FK constraint
2010/1/8 Tom Lane t...@sss.pgh.pa.us: Also, what are the XMINs of the non-missing tuples in the adjacent blocks? # /usr/lib/postgresql/8.3/bin/pg_controldata /var/lib/postgresql/8.3/main/ pg_control version number:833 Catalog version number: 200711281 Database system identifier: 5246886698902745063 Database cluster state: in production pg_control last modified: Fri 08 Jan 2010 10:20:56 AM CST Latest checkpoint location: 1D6/186B6BA0 Prior checkpoint location:1D6/165DAF60 Latest checkpoint's REDO location:1D6/174C8FB8 Latest checkpoint's TimeLineID: 1 Latest checkpoint's NextXID: 0/83037806 Latest checkpoint's NextOID: 142180690 Latest checkpoint's NextMultiXactId: 2250472 Latest checkpoint's NextMultiOffset: 5954794 Time of latest checkpoint:Fri 08 Jan 2010 10:18:33 AM CST Minimum recovery ending location: 0/0 Maximum data alignment: 8 Database block size: 8192 Blocks per segment of large relation: 131072 WAL block size: 8192 Bytes per WAL segment:16777216 Maximum length of identifiers:64 Maximum columns in an index: 32 Maximum size of a TOAST chunk:1996 Date/time type storage: 64-bit integers Maximum length of locale name:128 LC_COLLATE: en_US.UTF-8 LC_CTYPE: en_US.UTF-8 2 mins around the missing rows: ctid | xmin | attachment_id -+--+--- (603712,67) | 17140362 | 15460680 (603712,69) | 17140363 | 15460871 (603712,71) | 17140364 | 15460681 (603712,73) | 17140368 | 15460872 (603712,75) | 17140369 | 15460682 (603712,78) | 17140373 | 15460873 (603714,1) | 17140379 | 15460685 (603714,3) | 17140380 | 15460473 (603714,5) | 17140381 | 15460875 (603714,7) | 17140382 | 15460686 (603714,9) | 17140383 | 15460474 -- Konrad Garus -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Rows missing from table despite FK constraint
Just a reminder - these rows are over 6 months old and were lost at night when the system was lightly used. -- Konrad Garus -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Rows missing from table despite FK constraint
Konrad Garus konrad.ga...@gmail.com writes: Latest checkpoint's NextXID: 0/83037806 So, no wraparound problem ... odder and odder. Could we see the whole -i -f printout for that block? You trimmed some of it before, particularly the block header. regards, tom lane -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Rows missing from table despite FK constraint
2010/1/8 Tom Lane t...@sss.pgh.pa.us: So, no wraparound problem ... odder and odder. Could we see the whole -i -f printout for that block? You trimmed some of it before, particularly the block header. Attached. Since data on disk looks correct, is it possible to diagnose it on a higher level? Could the damage be done by vacuum? -- Konrad Garus missing_block Description: Binary data -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Rows missing from table despite FK constraint
Konrad Garus escribió: 2010/1/8 Tom Lane t...@sss.pgh.pa.us: So, no wraparound problem ... odder and odder. Could we see the whole -i -f printout for that block? You trimmed some of it before, particularly the block header. Attached. Since data on disk looks correct, is it possible to diagnose it on a higher level? Could the damage be done by vacuum? I'm a bit surprised by the block numbers in the block header vs. t_self ... I would have guessed that they come from a different segment (and the numbers seem to match, as 603713 % 131072 = 79425), but Konrad doesn't seem to be using the foo.4 file. -- Alvaro Herrerahttp://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Rows missing from table despite FK constraint
2010/1/8 Alvaro Herrera alvhe...@commandprompt.com: I'm a bit surprised by the block numbers in the block header vs. t_self ... I would have guessed that they come from a different segment (and the numbers seem to match, as 603713 % 131072 = 79425), but Konrad doesn't seem to be using the foo.4 file. I am not sure I understand what you say, but the 922494 file has 12 parts (922494, 922494.1 through 922494.11). The file I gave you is, indeed, dumped from 922494.4. -- Konrad Garus -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Rows missing from table despite FK constraint
Konrad Garus konrad.ga...@gmail.com writes: 2010/1/8 Alvaro Herrera alvhe...@commandprompt.com: I'm a bit surprised by the block numbers in the block header vs. t_self ... I would have guessed that they come from a different segment (and the numbers seem to match, as 603713 % 131072 = 79425), but Konrad doesn't seem to be using the foo.4 file. I am not sure I understand what you say, but the 922494 file has 12 parts (922494, 922494.1 through 922494.11). The file I gave you is, indeed, dumped from 922494.4. Yeah, it sounds like you did it right, and anyway the block ids prove this is the right block --- the forward links in HOT-updated tuples have to point to the same block. I'm just completely baffled at this point. The data appears perfectly okay according to pg_filedump, and if pg_filedump can read the page then the backend should be able to as well. Just to confirm, if you try to select any of these rows by ctid, ie select * from tablename where ctid = '(603713,1)'; you get nothing? What *should* happen is that you get the row if you mention offset 1, 3, or 5, but nothing if you say 2 or 4. regards, tom lane -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Rows missing from table despite FK constraint
2010/1/8 Tom Lane t...@sss.pgh.pa.us: Just to confirm, if you try to select any of these rows by ctid, ie select * from tablename where ctid = '(603713,1)'; you get nothing? What *should* happen is that you get the row if you mention offset 1, 3, or 5, but nothing if you say 2 or 4. How about this? # select attachment_id from attachment where ctid = '(603713,1)'; attachment_id --- 15460683 (1 row) # select attachment_id from attachment where attachment_id = 15460683; attachment_id --- (0 rows) -- Konrad Garus -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Rows missing from table despite FK constraint
Konrad Garus konrad.ga...@gmail.com writes: 2010/1/8 Tom Lane t...@sss.pgh.pa.us: Just to confirm, if you try to select any of these rows by ctid, ie select * from tablename where ctid = '(603713,1)'; you get nothing? What *should* happen is that you get the row if you mention offset 1, 3, or 5, but nothing if you say 2 or 4. How about this? # select attachment_id from attachment where ctid = '(603713,1)'; attachment_id --- 15460683 (1 row) # select attachment_id from attachment where attachment_id = 15460683; attachment_id --- (0 rows) Oh, so the row *is* there. What the above says is that you have a corrupt index on attachment_id, which you should be able to fix via REINDEX. However, I'm still a bit confused, because corrupt indexes don't normally cause a problem for pg_dump (which is just doing SELECT * or COPY, so the index wouldn't be consulted). Are the dumps you are talking about perhaps made with something other than pg_dump? regards, tom lane -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Rows missing from table despite FK constraint
2010/1/8 Tom Lane t...@sss.pgh.pa.us: Oh, so the row *is* there. Right. I'm happy to see it. What the above says is that you have a corrupt index on attachment_id, which you should be able to fix via REINDEX. This is not correct. The dumps are made with pg_dump. We did reindex on the table. I also tried looking for the row with another index and with seq scan and could not see it. -- Konrad Garus -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Rows missing from table despite FK constraint
On 01/08/2010 09:31 AM, Konrad Garus wrote: 2010/1/8 Tom Lanet...@sss.pgh.pa.us: Oh, so the row *is* there. Right. I'm happy to see it. What the above says is that you have a corrupt index on attachment_id, which you should be able to fix via REINDEX. This is not correct. The dumps are made with pg_dump. We did reindex on the table. I also tried looking for the row with another index and with seq scan and could not see it. This looks a lot like this thread: http://archives.postgresql.org/pgsql-general/2009-12/msg00726.php Could we see the schema and indexes for this table? -- Adrian Klaver adrian.kla...@gmail.com -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Rows missing from table despite FK constraint
2010/1/8 Adrian Klaver adrian.kla...@gmail.com: This looks a lot like this thread: http://archives.postgresql.org/pgsql-general/2009-12/msg00726.php Could we see the schema and indexes for this table? Table public.attachment Column |Type | Modifiers | Description +-+---+- attachment_id | integer | not null | entity_kind| character varying(15) | | entity_id | integer | | attached_by| integer | not null | when_attached | timestamp without time zone | not null | when_uploaded | timestamp without time zone | | file_name | character varying(255) | not null | file_size | integer | not null | hash | character varying(50) | | description| character varying(300) | | thumb | bytea | | target_entity_kind | character varying(15) | | target_entity_id | integer | | file_size_enc | bigint | | hash_enc | character varying(50) | | secure_key | bytea | | status | character varying(50) | | width | integer | | height | integer | | lat| numeric(10,7) | | lon| numeric(10,7) | | created_date | timestamp without time zone | | created_time | integer | | Indexes: attachment_pkey PRIMARY KEY, btree (attachment_id) attachment_by_entity btree (entity_kind, entity_id) attachment_by_entity_id btree (entity_id) attachment_by_target_entity btree (target_entity_kind, target_entity_id) attachment_by_uploaded btree (when_uploaded) attachment_by_user btree (attached_by) attachment_hash_ix btree (hash) Foreign-key constraints: fk8af75923d38260d2 FOREIGN KEY (attached_by) REFERENCES usr(usr_id) Rules: attachment_no_delete AS ON DELETE TO attachment DO INSTEAD SELECT no_delete() AS no_delete Has OIDs: no -- Konrad Garus -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Rows missing from table despite FK constraint
Konrad Garus konrad.ga...@gmail.com writes: 2010/1/8 Tom Lane t...@sss.pgh.pa.us: What the above says is that you have a corrupt index on attachment_id, which you should be able to fix via REINDEX. This is not correct. The dumps are made with pg_dump. We did reindex on the table. I also tried looking for the row with another index and with seq scan and could not see it. I'm still baffled then, and am starting to think that this really needs some investigation with a debugger. How are you with gdb? Or is there a chance of letting me or another developer poke at (a copy of) your database? regards, tom lane -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Rows missing from table despite FK constraint
On 8 Jan 2010, at 18:28, Tom Lane wrote: # select attachment_id from attachment where ctid = '(603713,1)'; attachment_id --- 15460683 (1 row) # select attachment_id from attachment where attachment_id = 15460683; attachment_id --- (0 rows) Oh, so the row *is* there. What the above says is that you have a corrupt index on attachment_id, which you should be able to fix via REINDEX. However, I'm still a bit confused, because corrupt indexes don't normally cause a problem for pg_dump (which is just doing SELECT * or COPY, so the index wouldn't be consulted). Are the dumps you are talking about perhaps made with something other than pg_dump? Would pg_dump still not consult the index if someone sets enable_seqscan=false in the config file? To Konrad: Did you turn off seqscans in the postgres.conf? Could you try a REINDEX TABLE attachment again in case you somehow reindexed the wrong index or table? You seem to know what you're doing, but just in case we missed something as this is strange enough to have even the devs scratching their heads. The rows are there, so it _has_ to be an index or a transaction visibility issue... Alban Hertroys -- Screwing up is the best way to attach something to the ceiling. !DSPAM:737,4b47b35810731946694119! -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Rows missing from table despite FK constraint
Alban Hertroys dal...@solfertje.student.utwente.nl writes: You seem to know what you're doing, but just in case we missed something as this is strange enough to have even the devs scratching their heads. The rows are there, so it _has_ to be an index or a transaction visibility issue... The successful fetch-by-ctid test seems to have eliminated the transaction-visibility-problem theory too. The whole thing is passing strange at this point. regards, tom lane -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Rows missing from table despite FK constraint
One more bit of information. The master table has exactly 3 rows missing. They were all inserted half a year ago within one minute and no other rows have been inserted in between them. Is it possible that we lost a consistent piece of data (like a block or a page)? What can I do to track it down? I suspect it is a bug in PostgreSQL. Does it resemble an already submitted issue? -- Konrad Garus -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Rows missing from table despite FK constraint
Konrad Garus wrote: We use PG 8.3. We use pg_dump and pg_restore overnight to create copies of main database for reporting etc. One dump/restore runs at 9 PM, another at 11 PM. Today I discovered that the restore at 11 PM failed to recreate a foreign key constraint, because one row from master table was missing. It is also missing from main database, but not from the 9 PM dump. The main database is in curious state: The row from master table is missing, the row referencing it from slave table is present, and finally the FK constraint on slave is in place. Do you have any ideas on how it could possibly happen? What research could help find the root cause and fix the database? If that's really the case, it sounds like curruption. Is there anything in the server logs? Yours, Laurenz Albe -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Rows missing from table despite FK constraint
2010/1/7 Albe Laurenz laurenz.a...@wien.gv.at: If that's really the case, it sounds like curruption. Is there anything in the server logs? I'm pretty sure it is corruption. With FK in place I don't see any other way how it would be possible. The log is too large to read through, but I haven't spotted anything suspicious (searching by table name, grepping for error and vacuum, quick scan from when the state was consistent until when it was not). Also, reindex on the master table did not help. Is there anything I could do on logs or the database that could help diagnose the issue? -- Konrad Garus -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Rows missing from table despite FK constraint
Le 07/01/2010 11:12, Konrad Garus a écrit : Hello, We use PG 8.3. We use pg_dump and pg_restore overnight to create copies of main database for reporting etc. One dump/restore runs at 9 PM, another at 11 PM. Today I discovered that the restore at 11 PM failed to recreate a foreign key constraint, because one row from master table was missing. It is also missing from main database, but not from the 9 PM dump. The main database is in curious state: The row from master table is missing, the row referencing it from slave table is present, and finally the FK constraint on slave is in place. Do you have any ideas on how it could possibly happen? Disabling trigger does this. Here is a quick (but long) example: guilla...@laptop:~$ createdb b1 guilla...@laptop:~$ LANG=C psql b1 psql (8.5devel) Type help for help. b1=# create table t1 (c1 integer primary key, c2 integer); NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index t1_pkey for table t1 CREATE TABLE b1=# create table t2 (c1 integer, c1_t1 integer references t1(c1)); CREATE TABLE b1=# \d t1 Table public.t1 Column | Type | Modifiers +-+--- c1 | integer | not null c2 | integer | Indexes: t1_pkey PRIMARY KEY, btree (c1) Referenced by: TABLE t2 CONSTRAINT t2_c1_t1_fkey FOREIGN KEY (c1_t1) REFERENCES t1(c1) b1=# \d t2 Table public.t2 Column | Type | Modifiers +-+--- c1 | integer | c1_t1 | integer | Foreign-key constraints: t2_c1_t1_fkey FOREIGN KEY (c1_t1) REFERENCES t1(c1) b1=# insert into t1 values (1, 100); INSERT 0 1 b1=# insert into t1 values (2, 200); INSERT 0 1 b1=# insert into t2 values (1, 1); INSERT 0 1 b1=# insert into t2 values (1, 2); INSERT 0 1 b1=# insert into t2 values (1, 3); ERROR: insert or update on table t2 violates foreign key constraint t2_c1_t1_fkey DÉTAIL : Key (c1_t1)=(3) is not present in table t1. Which is right. Now, we disable triggers: b1=# alter table t2 disable trigger all; ALTER TABLE b1=# insert into t2 values (1, 3); INSERT 0 1 The INSERT now works. The FK is not checked. b1=# alter table t2 enable trigger all; ALTER TABLE Reenabling triggers won't alert you. Rows are inserted and will stay that way. b1=# select * from t1; c1 | c2 +- 1 | 100 2 | 200 (2 lines) b1=# select * from t2; c1 | c1_t1 +--- 1 | 1 1 | 2 1 | 3 (3 lines) Despite my examples are on 8.5dev, you have the same issue with 8.3. See http://www.postgresql.org/docs/8.3/interactive/sql-altertable.html for more details. So, question is: did you disable triggers sometime on the referenced table? -- Guillaume. http://www.postgresqlfr.org http://dalibo.com -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Rows missing from table despite FK constraint
So, question is: did you disable triggers sometime on the referenced table? No, at least not intentionally. More information: 1. Missing are 3 rows added quickly one after another over half a year ago. They were lost this week. That is the only corruption I am aware of. 2. The problem is rows that were present in master table and are gone, not superfluous rows in slave table. 3. The master table has this rule preventing DELETE on it: Rules: master_table_no_delete AS ON DELETE TO master_table DO INSTEAD SELECT no_delete() AS no_delete Where no_delete is: begin raise exception 'Cannot remove rows from the table.'; end; I am pretty confident it was not caused by us or our software. The data is relatively old, the missing piece is small and the no_delete rule is in place. Can it be a side effect of some PG job, such as vacuum etc? Let me know what other information could be helpful. -- Konrad Garus -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general