Re: [GENERAL] Primary key Index Error

2011-10-25 Thread Merlin Moncure
On Tue, Oct 25, 2011 at 2:41 AM, Raghavendra
 wrote:
> On Tue, Oct 25, 2011 at 11:40 AM, Manoj K P  wrote:
>>
>> Server log Oct 1 00:06:59 server_host_name postgres[1453]: [5-1]
>> 2011-10-01 00:06:59.831 EDT 1453 4e869041.5ad postgres [local] postgres LOG:
>> duration: 418583.238 ms statement: select pg_start_backup('fortnightly');
>> Oct 2 03:03:18 server_host_name postgres[1453]: [6-1] 2011-10-02
>> 03:03:18.243 EDT 1453 4e869041.5ad postgres [local] postgres LOG: duration:
>> 8034.385 ms statement: select pg_stop_backup(); In between stop and start
>> process server_host_name is receiving all type of DML & DDL and generating
>> new WAL file Taking base backup in between start and stop process Client Log
>> Details Oct 25 05:16:18 client_server_name postgres[28858]: [2-1] 2011-10-25
>> 05:16:18.202 BST 28858 LOG: could not open file
>> "pg_xlog/271047B1008C" (log file 18353, segment 140): No such
>> file or directory Oct 25 05:16:18 client_server_name postgres[28858]: [3-1]
>> 2011-10-25 05:16:18.203 BST 28858 LOG: invalid checkpoint record Oct 25
>> 05:16:18 client_server_name postgres[28858]: [4-1] 2011-10-25 05:16:18.203
>> BST 28858 FATAL: could not locate required checkpoint record Oct 25 05:16:18
>> client_server_name postgres[28858]: [4-2] 2011-10-25 05:16:18.203 BST 28858
>> HINT: If you are not restoring from a backup, try removing the file
>> "/mnt/new_cluster/backup_label". Oct 25 05:16:18 client_server_name
>> postgres[28857]: [1-1] 2011-10-25 05:16:18.205 BST 28857 LOG: startup
>> process (PID 28858) exited with exit code 1 Oct 25 05:16:18
>> client_server_name postgres[28857]: [2-1] 2011-10-25 05:16:18.205 BST 28857
>> LOG: aborting startup due to startup process failure Oct 25 05:20:53
>> client_server_name postgres[29030]: [2-1] 2011-10-25 05:20:53.630 BST 29030
>> LOG: could not open file "pg_xlog/271047B10068" (log file 18353,
>> segment 104): No such file or directory Oct 25 05:20:53 client_server_name
>> postgres[29030]: [3-1] 2011-10-25 05:20:53.630 BST 29030 FATAL: could not
>> find redo location referenced by checkpoint record Oct 25 05:20:53
>> client_server_name postgres[29030]: [3-2] 2011-10-25 05:20:53.630 BST 29030
>> HINT: If you are not restoring from a backup, try removing the file
>> "/mnt/new_cluster/backup_label". Oct 25 05:20:53 client_server_name
>> postgres[29029]: [1-1] 2011-10-25 05:20:53.633 BST 29029 LOG: startup
>> process (PID 29030) exited with exit code 1 Oct 25 05:20:53
>> client_server_name postgres[29029]: [2-1] 2011-10-25 05:20:53.633 BST 29029
>> LOG: aborting startup due to startup process failure manually copy following
>> file to pg_xlog folder 271047B1008C 271047B10068
>> After words i can start postgres and accessing the database , but same
>> error
>
> As per the logs, do you see missing XLOG files in Archive Destination ? becz
> these kind of situations mostly missing files will be in WAL-Archive
> location. You need to copy to pg_xlog directory and start the instance.
>
> As Merlin Said, you need to dig more to know why its crashing by increasing
> the LOG-DEBUG level's. Increasing DEBUG level may occupy good amount of
> space in log-location, so make sure you have good space for logs to get what
> exactly happening at the time of backup in particular. Am not sure whether
> its safe to attach ***backtrace*** to instance for information.

yeah. also, what's the setting of archive_command (or is it even set)?
 taking a 'hot' filesystem backup without having an archive_command
and not doing any other intervention to guarantee the necessary WAL
segments are present will not give you a complete backup.  my money is
on you having an invalid backup procedure.  the only way to take a
filesystem snapshot without dealing with WAL files is to bring the
database down.

merlin

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Primary key Index Error

2011-10-25 Thread Raghavendra
On Tue, Oct 25, 2011 at 11:40 AM, Manoj K P  wrote:

> *Server log* Oct 1 00:06:59 server_host_name postgres[1453]: [5-1]
> 2011-10-01 00:06:59.831 EDT 1453 4e869041.5ad postgres [local] postgres LOG:
> duration: 418583.238 ms statement: select pg_start_backup('fortnightly');
> Oct 2 03:03:18 server_host_name postgres[1453]: [6-1] 2011-10-02
> 03:03:18.243 EDT 1453 4e869041.5ad postgres [local] postgres LOG: duration:
> 8034.385 ms statement: select pg_stop_backup(); In between stop and start
> process server_host_name is receiving all type of DML & DDL and generating
> new WAL file Taking base backup in between start and stop process *Client
> Log Details* Oct 25 05:16:18 client_server_name postgres[28858]: [2-1]
> 2011-10-25 05:16:18.202 BST 28858 LOG: could not open file
> "pg_xlog/271047B1008C" (log file 18353, segment 140): No such
> file or directory Oct 25 05:16:18 client_server_name postgres[28858]: [3-1]
> 2011-10-25 05:16:18.203 BST 28858 LOG: invalid checkpoint record Oct 25
> 05:16:18 client_server_name postgres[28858]: [4-1] 2011-10-25 05:16:18.203
> BST 28858 FATAL: could not locate required checkpoint record Oct 25 05:16:18
> client_server_name postgres[28858]: [4-2] 2011-10-25 05:16:18.203 BST 28858
> HINT: If you are not restoring from a backup, try removing the file
> "/mnt/new_cluster/backup_label". Oct 25 05:16:18 client_server_name
> postgres[28857]: [1-1] 2011-10-25 05:16:18.205 BST 28857 LOG: startup
> process (PID 28858) exited with exit code 1 Oct 25 05:16:18
> client_server_name postgres[28857]: [2-1] 2011-10-25 05:16:18.205 BST 28857
> LOG: aborting startup due to startup process failure Oct 25 05:20:53
> client_server_name postgres[29030]: [2-1] 2011-10-25 05:20:53.630 BST 29030
> LOG: could not open file "pg_xlog/271047B10068" (log file 18353,
> segment 104): No such file or directory Oct 25 05:20:53 client_server_name
> postgres[29030]: [3-1] 2011-10-25 05:20:53.630 BST 29030 FATAL: could not
> find redo location referenced by checkpoint record Oct 25 05:20:53
> client_server_name postgres[29030]: [3-2] 2011-10-25 05:20:53.630 BST 29030
> HINT: If you are not restoring from a backup, try removing the file
> "/mnt/new_cluster/backup_label". Oct 25 05:20:53 client_server_name
> postgres[29029]: [1-1] 2011-10-25 05:20:53.633 BST 29029 LOG: startup
> process (PID 29030) exited with exit code 1 Oct 25 05:20:53
> client_server_name postgres[29029]: [2-1] 2011-10-25 05:20:53.633 BST 29029
> LOG: aborting startup due to startup process failure manually copy following
> file to pg_xlog folder 271047B1008C 271047B10068
> After words i can start postgres and accessing the database , but same
> error


As per the logs, do you see missing XLOG files in Archive Destination ? becz
these kind of situations mostly missing files will be in WAL-Archive
location. You need to copy to pg_xlog directory and start the instance.

As Merlin Said, you need to dig more to know why its crashing by increasing
the LOG-DEBUG level's. Increasing DEBUG level may occupy good amount of
space in log-location, so make sure you have good space for logs to get what
exactly happening at the time of backup in particular. Am not sure whether
its safe to attach ***backtrace*** to instance for information.

---
Regards,
Raghavendra
EnterpriseDB Corporation
Blog: http://raghavt.blogspot.com/


Re: [GENERAL] Primary key Index Error

2011-10-24 Thread Manoj K P
*Server log*

Oct  1 00:06:59 server_host_name postgres[1453]: [5-1] 2011-10-01
00:06:59.831 EDT 1453 4e869041.5ad postgres [local] postgres LOG:  duration:
418583.238 ms  statement: select pg_start_backup('fortnightly');

Oct  2 03:03:18 server_host_name postgres[1453]: [6-1] 2011-10-02
03:03:18.243 EDT 1453 4e869041.5ad postgres [local] postgres LOG:  duration:
8034.385 ms  statement: select pg_stop_backup();


In between stop and start process  server_host_name is receiving all type of
DML & DDL and  generating new WAL file 

Taking base backup in between start and stop process 

/*Client Log Details*/

Oct 25 05:16:18 client_server_name postgres[28858]: [2-1] 2011-10-25
05:16:18.202 BST 28858 LOG:  could not open file
"pg_xlog/271047B1008C" (log file 18353, segment 140): No such
file or directory
Oct 25 05:16:18 client_server_name postgres[28858]: [3-1] 2011-10-25
05:16:18.203 BST 28858 LOG:  invalid checkpoint record
Oct 25 05:16:18 client_server_name postgres[28858]: [4-1] 2011-10-25
05:16:18.203 BST 28858 FATAL:  could not locate required checkpoint record
Oct 25 05:16:18 client_server_name postgres[28858]: [4-2] 2011-10-25
05:16:18.203 BST 28858 HINT:  If you are not restoring from a backup, try
removing the file "/mnt/new_cluster/backup_label".
Oct 25 05:16:18 client_server_name postgres[28857]: [1-1] 2011-10-25
05:16:18.205 BST 28857 LOG:  startup process (PID 28858) exited with exit
code 1
Oct 25 05:16:18 client_server_name postgres[28857]: [2-1] 2011-10-25
05:16:18.205 BST 28857 LOG:  aborting startup due to startup process failure


Oct 25 05:20:53 client_server_name postgres[29030]: [2-1] 2011-10-25
05:20:53.630 BST 29030 LOG:  could not open file
"pg_xlog/271047B10068" (log file 18353, segment 104): No such
file or directory
Oct 25 05:20:53 client_server_name postgres[29030]: [3-1] 2011-10-25
05:20:53.630 BST 29030 FATAL:  could not find redo location referenced by
checkpoint record
Oct 25 05:20:53 client_server_name postgres[29030]: [3-2] 2011-10-25
05:20:53.630 BST 29030 HINT:  If you are not restoring from a backup, try
removing the file "/mnt/new_cluster/backup_label".
Oct 25 05:20:53 client_server_name postgres[29029]: [1-1] 2011-10-25
05:20:53.633 BST 29029 LOG:  startup process (PID 29030) exited with exit
code 1
Oct 25 05:20:53 client_server_name postgres[29029]: [2-1] 2011-10-25
05:20:53.633 BST 29029 LOG:  aborting startup due to startup process failure



manually copy  following file to pg_xlog folder

   271047B1008C
   271047B10068

After words i can start postgres and accessing the database , but same error 

   



--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/Primary-key-Index-Error-tp4931714p4935172.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.

Re: [GENERAL] Primary key Index Error

2011-10-24 Thread Merlin Moncure
On Mon, Oct 24, 2011 at 3:22 AM, Manoj K P  wrote:
> Postgres recovered from data folder , after that some queries shows error
>
>
> select * from table2 order by app_id  ; - its work ( 5000 data)
> select * from table2 order by app_id   desc ; - its work
>
> Here app_id contains  binary index
>
>
> select * from table2 order by id  ; - its work ( 5000 data)
> select * from table2 order by id   desc ; - shows following error
>
> manoj[local] postgres ERROR:  index "tbl2_id_pkey" contains unexpected zero
> page at block 311121
> manoj [local] postgres HINT:  Please REINDEX it.
>
> index "tbl2_id_pkey" is primary key
>
> Reindex not pratical this table (250 GB data)
>
> How i can solve this ?

It looks like you have data corruption.  Your first priority should be
to do the minimum necessary to get a usable database dump and then to
get back online.  I would consider your database off line until this
is done. Essentially, the only practical ways to repair the error you
see would be to reindex or restore from backup.  Did you have any
unexpected power events?  Server crashes?

merlin

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Primary key Index Error

2011-10-24 Thread Merlin Moncure
On Mon, Oct 24, 2011 at 8:32 AM, manoj  wrote:
> On 10/24/2011 06:38 PM, Merlin Moncure wrote:
>
> On Mon, Oct 24, 2011 at 3:22 AM, Manoj K P  wrote:
>
> Postgres recovered from data folder , after that some queries shows error
>
>
> select * from table2 order by app_id  ; - its work ( 5000 data)
> select * from table2 order by app_id   desc ; - its work
>
> Here app_id contains  binary index
>
>
> select * from table2 order by id  ; - its work ( 5000 data)
> select * from table2 order by id   desc ; - shows following error
>
> manoj[local] postgres ERROR:  index "tbl2_id_pkey" contains unexpected zero
> page at block 311121
> manoj [local] postgres HINT:  Please REINDEX it.
>
> index "tbl2_id_pkey" is primary key
>
> Reindex not pratical this table (250 GB data)
>
> How i can solve this ?
>
> It looks like you have data corruption.  Your first priority should be
> to do the minimum necessary to get a usable database dump and then to
> get back online.  I would consider your database off line until this
> is done. Essentially, the only practical ways to repair the error you
> see would be to reindex or restore from backup.  Did you have any
> unexpected power events?  Server crashes?
>
> merlin
>
>   Actually this is backup server ,no power failure & Server crashes happens
> in between , The data folder is base backup(pg_start_backup) on the server.
> pg_dump not practical in this case  because of data size.
>   After recovering from base backup run  WAL recovery file successfully and
> its updates current date data.
>   When i am running query on big table(more than 5 Million  data)  shows
> error other wise its work fine.
>
>   select id , name  from table2 order by id  ; It uses the index
>
>   select id , name  from  table2 order by id desc ; - shows the error
>
> Error happens only using (DESC clause against primary key) , all other case
> its work fine
>
>  REINDEX will take ages

do you happen to still have the database logs on the backup from
startup to end of recovery?  anything interesting in there?

merlin

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Primary key Index Error

2011-10-24 Thread manoj

On 10/24/2011 06:38 PM, Merlin Moncure wrote:

On Mon, Oct 24, 2011 at 3:22 AM, Manoj K P  wrote:

Postgres recovered from data folder , after that some queries shows error


select * from table2 order by app_id  ; - its work ( 5000 data)
select * from table2 order by app_id   desc ; - its work

Here app_id contains  binary index


select * from table2 order by id  ; - its work ( 5000 data)
select * from table2 order by id   desc ; - shows following error

manoj[local] postgres ERROR:  index "tbl2_id_pkey" contains unexpected zero
page at block 311121
manoj [local] postgres HINT:  Please REINDEX it.

index "tbl2_id_pkey" is primary key

Reindex not pratical this table (250 GB data)

How i can solve this ?

It looks like you have data corruption.  Your first priority should be
to do the minimum necessary to get a usable database dump and then to
get back online.  I would consider your database off line until this
is done. Essentially, the only practical ways to repair the error you
see would be to reindex or restore from backup.  Did you have any
unexpected power events?  Server crashes?

merlin


  Actually this is backup server ,no power failure&  Server crashes happens  in 
between , The data folder is base backup(|pg_start_backup)|  on the server. pg_dump 
not practical in this case  because of data size.
  After recovering from base backup run  WAL recovery file successfully and its 
updates current date data.
  When i am running query on big table(more than 5 Million  data)  shows 
error other wise its work fine.

  select id , name  from table2 order by id  ; It uses the index

  select id , name  from  table2 order by id desc ; - shows the error

Error happens only using (DESC clause against primary key) , all other case its 
work fine

 REINDEX will take ages



--
*Manoj K P*
*Postgres DBA*
*Comodo India*


smime.p7s
Description: S/MIME Cryptographic Signature