Re: [GENERAL] Corrupted DB? could not open file pg_clog/####

2006-08-01 Thread Martijn van Oosterhout
On Mon, Jul 31, 2006 at 06:09:33PM -0400, Francisco Reyes wrote:
> Martijn van Oosterhout writes:
> 
> >That's when you've reached the end of the table. The point is that
> >before then you'll have found the value of N that produces the error.
> 
> Will be a while.. my little python script is doing under 10 selects/sec... 
> and there are nearly 67 million records. :-(

Naturally you'd do a binary search. That'd take a maximum of about
log2(67 million) = about 26 scans.

Once you find it you should be able to identify the ctid. You may be
able to delete it.
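Because OFFSET N forces a scan of the first N+1 rows (assuming a sequential
scan, which is what you get with no ORDER BY), the probe is monotone: every
offset at or beyond the broken tuple fails, so binary search applies. A sketch
of that search in Python, where `probe` is a hypothetical stand-in for running
`SELECT ctid,* FROM big_table OFFSET n LIMIT 1` and reporting whether it
succeeded:

```python
# Binary search for the first OFFSET that triggers the pg_clog error.
# probe(n) should run:  SELECT ctid,* FROM big_table OFFSET n LIMIT 1;
# and return True if the query succeeded, False if it raised the error.
def find_first_bad_offset(probe, nrows):
    """Return the smallest offset in [0, nrows) where probe fails,
    or None if every probe succeeds."""
    lo, hi = 0, nrows            # invariant: all offsets < lo probe OK
    first_bad = None
    while lo < hi:
        mid = (lo + hi) // 2
        if probe(mid):           # rows 0..mid all read fine
            lo = mid + 1
        else:                    # error at or before mid; look earlier
            first_bad = mid
            hi = mid
    return first_bad
```

At ~67 million rows that is at most about 26 probes instead of millions of
single-row selects; the first failing offset pins down the tuple whose ctid
you then want.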

> >It will rollback all pending transactions. The point is that it's
> >looking for information about transactions that were committed. This is
> >usually a memory or disk error.
> 
> So, should it be safe to create the file and fill it up with 256K zeros?

For a certain value of "safe". It gets the system running, but there are
no guarantees about the data in it...
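For the archives, the zero-fill itself is a one-liner; a sketch with a
placeholder path (the real target would be the missing segment name under
$PGDATA/pg_clog, written as the postgres user with the server stopped):

```shell
# Recreate a missing clog segment as 256 KB of zeros.
# CLOG_FILE is a placeholder for this sketch; in a real recovery it would
# be the missing segment file under $PGDATA/pg_clog.
CLOG_FILE=/tmp/clog_stub
dd if=/dev/zero of="$CLOG_FILE" bs=8192 count=32   # 32 pages x 8 KB = 256 KB
```

`ls -l` on the result should show exactly 262144 bytes, matching the size of
the other segments in the directory.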

> At the rate my script is going.. it's going to take a very long time to 
> find out where the problem is. If I have a dump.. any useful info I can 
> take from the point the dump stopped?

That gives you the place in the file where it broke...

Have a nice day,
-- 
Martijn van Oosterhout  http://svana.org/kleptog/
From each according to his ability. To each according to his ability to litigate.




Re: [GENERAL] Corrupted DB? could not open file pg_clog/####

2006-07-31 Thread Francisco Reyes

Martijn van Oosterhout writes:


> That's when you've reached the end of the table. The point is that
> before then you'll have found the value of N that produces the error.


Will be a while.. my little python script is doing under 10 selects/sec... 
and there are nearly 67 million records. :-(
 

> > How?
> > Tried set client_min_message='DEBUG';

> That should do it.


The right one (for the archives) was actually:
set client_min_messages=DEBUG;


> It will rollback all pending transactions. The point is that it's
> looking for information about transactions that were committed. This is
> usually a memory or disk error.


So, should it be safe to create the file and fill it up with 256K zeros?


> Sounds like some corrupt data. Once you've located the invalid data,
> dump the block with pg_filedump; that should give you more info about
> what happened.


At the rate my script is going.. it's going to take a very long time to 
find out where the problem is. If I have a dump.. any useful info I can 
take from the point the dump stopped?




Re: [GENERAL] Corrupted DB? could not open file pg_clog/####

2006-07-31 Thread Martijn van Oosterhout
On Sun, Jul 30, 2006 at 04:58:34PM -0400, Francisco Reyes wrote:
> Martijn van Oosterhout writes:
> 
> >It's still a reasonable suggestion. The maximum offset is the number of
> >rows in the table. You'll notice when the output is empty.
> 
> Once I find the point where the output is empty then what?

That's when you've reached the end of the table. The point is that
before then you'll have found the value of N that produces the error.

> How?
> Tried set client_min_message='DEBUG';

That should do it.

> If the pg_clog files are to keep track of transactions, shouldn't a "pg_ctl 
> restart"  rollback all pending transactions.. so there are no pending 
> transactions upon the restart and this error should not appear again?
> Using 8.1.4

It will rollback all pending transactions. The point is that it's
looking for information about transactions that were committed. This is
usually a memory or disk error.

> >However, pg_clog/ is the very first transaction file created. Is it
> >in the range of the files that do exist?
> 
> There are 228 files in the directory and the oldest one is "016E" from 
> about a month ago.

Sounds like some corrupt data. Once you've located the invalid data,
dump the block with pg_filedump; that should give you more info about
what happened.
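To work out which block to hand to the dump tool: the first field of the ctid
is the block number, and large relations are split into 1 GB segment files on
disk. A small sketch of that arithmetic, assuming the default 8 KB page size
and 1 GB segment size:

```python
# Map a ctid like '(12345,6)' to the relation segment file and the
# block number within that segment.  PostgreSQL defaults assumed:
# 8 KB pages, relations split into 1 GB segment files (.1, .2, ...).
BLOCK_SIZE = 8192
SEGMENT_SIZE = 1 << 30
BLOCKS_PER_SEGMENT = SEGMENT_SIZE // BLOCK_SIZE   # 131072

def locate_block(ctid):
    """Return (segment number, block within segment) for a ctid string."""
    block, _tuple_index = (int(x) for x in ctid.strip('()').split(','))
    segment = block // BLOCKS_PER_SEGMENT         # 0 = base file, 1 = ".1", ...
    return segment, block - segment * BLOCKS_PER_SEGMENT
```

So a ctid of (12345,6) on a small table means block 12345 of the base relation
file; only tables past 1 GB spill into numbered segment files.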

Hope this helps,
-- 
Martijn van Oosterhout  http://svana.org/kleptog/
From each according to his ability. To each according to his ability to litigate.




Re: [GENERAL] Corrupted DB? could not open file pg_clog/####

2006-07-30 Thread Francisco Reyes

Martijn van Oosterhout writes:


> It's still a reasonable suggestion. The maximum offset is the number of
> rows in the table. You'll notice when the output is empty.


Once I find the point where the output is empty then what?


> Do you have
> an idea how much data it contains?


Yes. Around 87 million rows.



> If you connect via psql, you can set the client_min_message to DEBUG to
> get some more stuff.


How?
Tried set client_min_message='DEBUG';


> Well, it will stop complaining. What will happen is that any
> transactions involving those transaction IDs will be assumed to have
> been committed.


If the pg_clog files are to keep track of transactions, shouldn't a "pg_ctl 
restart"  rollback all pending transactions.. so there are no pending 
transactions upon the restart and this error should not appear again?

Using 8.1.4


> This may be ok, but in extreme cases it could lead to
> broken constraints.


That extreme case sounds pretty bad. :-(
Will it be safe to do after a restart? After all there should not be any 
transactions..
 

> However, pg_clog/ is the very first transaction file created. Is it
> in the range of the files that do exist?


There are 228 files in the directory and the oldest one is "016E" from about 
a month ago.



> Are you sure some other
> process didn't remove it accidentally?


Not that I can think of. I didn't even know what this directory was until 
this problem showed up.. much less delete anything from it.




Re: [GENERAL] Corrupted DB? could not open file pg_clog/####

2006-07-30 Thread Martijn van Oosterhout
On Sun, Jul 30, 2006 at 01:31:14AM -0400, Francisco Reyes wrote:
> Looking at archives seem to indicate missing pg_clog files is some form 
> of row or page corruption.
> 
> In an old thread from back in 2003 Tom Lane recommended 
> (http://tinyurl.com/jushf):
> 
> >>If you want to try to narrow down where the corruption is, you can 
> >>experiment with commands like 
> >>select ctid,* from big_table offset N limit 1;
> 
> Is that still a valid suggestion?
> How do I know the possible maximum value for offset to try for each table?

It's still a reasonable suggestion. The maximum offset is the number of
rows in the table. You'll notice when the output is empty. Do you have
an idea how much data it contains?

> If I have logs turned on.. at which level will the error show? I am only 
> aware of the problem, because an application connected to postgresql had 
> the errors in its logs, but not seeing anything in the postgresql logs 
> themselves.

If you connect via psql, you can set the client_min_message to DEBUG to
get some more stuff.

> Just tried a pg_dump and got the 
> could not open file "pg_clog/"
> error.
> 
> The file pg_clog/ is missing.
> Looking at another thread (http://tinyurl.com/feyye) I see that the file 
> can be created as 256K worth of zeroes. If I do this.. will operations 
> resume normally? Is there a way to tell if any data was lost?

Well, it will stop complaining. What will happen is that any
transactions involving those transaction IDs will be assumed to have
been committed. This may be ok, but in extreme cases it could lead to
broken constraints.

However, pg_clog/ is the very first transaction file created. Is it
in the range of the files that do exist? Are you sure some other
process didn't remove it accidentally?

Have a nice day,
-- 
Martijn van Oosterhout  http://svana.org/kleptog/
From each according to his ability. To each according to his ability to litigate.




[GENERAL] Corrupted DB? could not open file pg_clog/####

2006-07-29 Thread Francisco Reyes
Looking at archives seem to indicate missing pg_clog files is some form 
of row or page corruption.


In an old thread from back in 2003 Tom Lane recommended 
(http://tinyurl.com/jushf):


If you want to try to narrow down where the corruption is, you can 
experiment with commands like 
select ctid,* from big_table offset N limit 1;


Is that still a valid suggestion?
How do I know the possible maximum value for offset to try for each table?

If I have logs turned on.. at which level will the error show? I am only 
aware of the problem, because an application connected to postgresql had the 
errors in its logs, but not seeing anything in the postgresql logs 
themselves.


Just tried a pg_dump and got the 
could not open file "pg_clog/"

error.

The file pg_clog/ is missing.
Looking at another thread (http://tinyurl.com/feyye) I see that the file can 
be created as 256K worth of zeroes. If I do this.. will operations resume 
normally? Is there a way to tell if any data was lost?

