Re: [BUGS] BUG #8154: pg_dump throws error because of field called new.

2013-07-29 Thread Jeff Janes
On Wed, Jul 24, 2013 at 5:18 AM, Willy-Bas Loos willy...@gmail.com wrote:
 Hi,

 The manual says:
 It is recommended that you use the pg_dump and pg_dumpall programs from the
 newer version of PostgreSQL, to take advantage of enhancements that might
 have been made in these programs. Current releases of the dump programs can
 read data from any server version back to 7.0.
 http://www.postgresql.org/docs/9.1/static/upgrading.html

 That means that this actually should work.

It will work if you use the --quote-all-identifiers option.
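
For example, a minimal sketch of that (host and database names are
placeholders):

# dump with every identifier quoted, so a column named "new" (or any
# other now-reserved word) round-trips cleanly into the newer version
pg_dump --quote-all-identifiers -h oldhost olddb > olddb.sql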

Cheers,

Jeff




Re: [BUGS] BUG #8273: Assertion failure in 9.3 beta2 with serializable and savepoints

2013-07-01 Thread Jeff Janes
On Mon, Jul 1, 2013 at 3:43 AM, levert...@googlemail.com wrote:

 The following bug has been logged on the website:

 Bug reference:  8273
 Logged by:  David Leverton
 Email address:  levert...@googlemail.com
 PostgreSQL version: Unsupported/Unknown
 Operating system:   RHEL 5 x86_64
 Description:

 The following test case causes a backend assertion failure in 9.3 beta2:


 START TRANSACTION ISOLATION LEVEL SERIALIZABLE;
 CREATE TABLE testing(
   x INTEGER PRIMARY KEY
 );
 INSERT INTO testing VALUES(1);
 SELECT * FROM testing WHERE x = 1 FOR UPDATE;
 SAVEPOINT test;
 UPDATE testing SET x = 2 WHERE x = 1;
 ROLLBACK TO test;
 UPDATE testing SET x = 3 WHERE x = 1;
 ROLLBACK;


 TRAP: FailedAssertion("!(((xid) != ((TransactionId) 0)))", File:
 "predicate.c", Line: 3936)


 Postgres was installed using the RPMs from http://yum.pgrpms.org/, and is
 using a near-default configuration (changes to port number, data directory
 and pg_hba.conf, and some roles created, but nothing likely to influence
 this bug).



I can confirm this compiling from source on CentOS 6.4.

The bug was introduced in commit: 0ac5ad5... Improve concurrency of foreign
key locking.

I don't know what more to look into on this, so I'm cc'ing Alvaro, the patch
author.

Cheers,

Jeff


Re: [BUGS] BUG #8192: On very large tables the concurrent update with vacuum lag the hot_standby replica

2013-06-11 Thread Jeff Janes
On Fri, Jun 7, 2013 at 6:08 AM, Federico Campoli feder...@brandwatch.com wrote:

 On 06/06/13 21:22, Jeff Janes wrote:



 I'd probably approach this with a combination of strace -T -ttt -p
 PID and lsof -p PID on the PID of the start-up process, to see
 which file descriptors it is waiting for read or writes on, and what the
 underlying names of the files for those file descriptors are.

 Cheers,

 Jeff


 I've generated a strace from the startup process, a new profile report and
 a new log file.

 It's quite big so you can download here

 http://d01.megashares.com/dl/BqZLh5H/HS_data_lag.tar

 I've monitored the startup process with lsof, and the only files accessed are
 the recovering wal and the relation affected by the master's activity.


You didn't need to strace the entire operation; a sampling of ~1
minute in the middle of it should be enough.
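
Something along these lines is what I had in mind (the startup process
PID is a placeholder; timeout is from coreutils):

# ~60 seconds of syscall timings from the startup process
timeout 60 strace -T -ttt -p "$STARTUP_PID" -o startup_strace.out
# map the file descriptors seen above to file names
lsof -p "$STARTUP_PID" > startup_lsof.out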


It looks like it is basically idle before 1370603193.846487, so I ignored
that part.  Of the remaining 472 seconds covered (during which time it
replays ~19 WAL files), the sum of the system call times covers only 210
seconds, so the rest is presumably CPU time (although the error due to
rounding and instrumentation vagaries could be quite high).

Even during the active part, quite a bit of the time is still spent idle
(poll([{fd=3, events=POLLIN}], 1, 5000)).  And the biggest remaining
chunk is spent on lseek(11, ...), but again I don't know how much to trust
that, as any individual call is at the limit of measurability, so it could
just be the accumulation of instrumentation errors, not real time.

zcat pg_HS_strace.out.gz | perl -lne '/([\d.]+)/ and $_ >
1370603193.846487 and print "$1\t$_"' | sort -rn | perl -lne '$sum+=$_; print
"$sum\t$_"' | less

zcat pg_HS_strace.out.gz | perl -lne '/ ([^,]+,).*([\d.]+)/ and $_ >
1370603193.846487 and $h{$1}+=$2; END {print "$h{$_}\t$_" foreach keys
%h}' | sort -g


For the warm stand-by, if I cut off the uninteresting parts at the beginning
and end (1370604295.966518   and
1370605296.458290), then it applies ~33 WAL files in 1000 seconds, so it
doesn't seem to be any faster than the hot standby.

Are you sure the strace covered the events of interest?

Do you have output of vmstat or sar for the period of interest?  It would
be interesting to see what the user time and wait time are.
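
For example, something like this, run while the standby is replaying
(sar assumes the sysstat package is installed):

vmstat 5 60 > vmstat.out &      # 60 samples, 5 seconds apart
sar -u 5 60 > sar_cpu.out       # user/iowait breakdown over the same window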




 Regarding the warm standby, I've repeated the test on my sandbox with the
 slave in warm standby and I've noticed a replication lag spike.

 This does not affect the production system, same database version and
 architecture from the Debian package.


I'm not sure what you are saying here.  Both HS and WS lag on your test
machine, but only HS does on the production?  And the straces you posted
are from the test machine?

In any case, nothing really stands out here.  Other than the idle periods,
no one thing seems to be dominating the time.

Cheers,

Jeff


Re: [BUGS] BUG #8192: On very large tables the concurrent update with vacuum lag the hot_standby replica

2013-06-06 Thread Jeff Janes
On Tue, Jun 4, 2013 at 5:57 AM, Federico Campoli feder...@brandwatch.comwrote:


 I'm sorry, just guessing it could be a loop.
 The read remains stuck on the same segment.
 On my testbox I have at least 1 minute at 20 Mb/s.
 On the live system the peak is 124 Mb/s for 2 to 3 minutes without any
 progress in the wal reply.

 I've attached the part of postgresql's log with debug2  from my sandbox
 when that happens.


It looks like it is making progress, just very slowly.  Basically I think
every WAL record that needs to be replayed triggers the random read of
some data block which is not already cached on the standby.



 In warm standby everything is fine, no lag at all.


OK, then I don't think I can reproduce it.  The spiky replay I see is the
same whether the standby is hot or warm.


Cheers,

Jeff


Re: [BUGS] BUG #8192: On very large tables the concurrent update with vacuum lag the hot_standby replica

2013-06-01 Thread Jeff Janes
On Thursday, May 30, 2013, wrote:

 The following bug has been logged on the website:

 Bug reference:  8192
 Logged by:  Federico Campoli
 Email address:  feder...@brandwatch.com
 PostgreSQL version: 9.2.4
 Operating system:   Debian 6.0
 Description:

 /*

 Description:

 It seems on very large tables the concurrent update with vacuum (or
 autovacuum),
 when the slave is in hot standby mode, generates long loops in read on a
 single wal segment during the recovery process.

 This has two nasty effects:
 a massive read IO peak, and the replay lag increasing as the recovery
 process hangs for long periods in a pointless loop.


Are you observing a loop, and if so how are you observing it?  What is it
that is looping?


 SET client_min_messages='debug2';
 SET trace_sort='on';


Are these settings useful?  What are they showing you?



 --in a new session and start an huge table update
 UPDATE t_vacuum set ts_time=now() WHERE i_id_row < 2000;

 --then vacuum the table
 VACUUM VERBOSE t_vacuum;


Are you running the update and vacuum concurrently or serially?



 --at some point the startup process will get stuck recovering one single wal
 file and
 --the DISK READ column will show a huge IO for a while.


What is huge?


I don't know if I can reproduce this or not.  I certainly get spiky lag,
but I see no reason to think it is anything other than IO congestion,
occurring during stretches of WAL records where compact records describe a
larger amount of work that needs to be done in terms of poorly-cached IO.
 Perhaps the kernel's read-ahead mechanism is not working as well as it
theoretically could be.  Also the standby isn't using a ring-buffer
strategy, but I see no reason to think it would help were it to do so.

The DISK READ column is not what I would call huge during this, often 10-15
MB/s, because much of the IO is scattered rather than sequential.  The IO
wait % on the other hand is maxed out.

It is hard to consider it as a bug that the performance is not as high as
one might wish it to be.  Is this behavior a regression from some earlier
version?  What if hot-standby is turned off?

Cheers,

Jeff


Re: [BUGS] BUG #8025: PostgreSQL crash (>= 9.1 64 bit)

2013-04-18 Thread Jeff Janes
I've bisected this down to this commit:

commit 0f61d4dd1b4f95832dcd81c9688dac56fd6b5687
Author: Tom Lane t...@sss.pgh.pa.us
Date:   Fri Nov 19 17:31:50 2010 -0500

Improve relation width estimation for subqueries.

And looking at the core dump,

#0  make_rel_from_joinlist (root=0xee4fb8, joinlist=<value optimized out>)
at allpaths.c:942
942 if (IsA(jlnode, RangeTblRef))

jlnode holds 0x20, which is not a legal address.  Full backtrace at bottom
of this email.

Also, if I enable cassert on 0f61d4dd1b4f95, I get:

TRAP: FailedAssertion("!(te->resno >= rel->min_attr && te->resno <=
rel->max_attr)", File: "costsize.c", Line: 3217)

At which point, resno = 16, min_attr = 0, max_attr = 15

I don't know where to take it from here, but maybe this info will help someone
else get there faster.

Cheers,

Jeff

#0  make_rel_from_joinlist (root=0xee4fb8, joinlist=<value optimized out>)
at allpaths.c:942
#1  0x005ae856 in query_planner (root=0xee4fb8, tlist=<value
optimized out>, tuple_fraction=0, limit_tuples=-1,
cheapest_path=0x7fff041d8420,
sorted_path=0x7fff041d8418, num_groups=0x7fff041d8428) at planmain.c:271
#2  0x005b0304 in grouping_planner (root=<value optimized out>,
tuple_fraction=0) at planner.c:1144
#3  0x005b1ee3 in subquery_planner (glob=0xe6bb20, parse=0xee1350,
parent_root=<value optimized out>, hasRecursion=<value optimized out>,
tuple_fraction=0, subroot=0x7fff041d8568) at planner.c:535
#4  0x005b57e4 in make_subplan (node=0xee10e0, context=<value
optimized out>) at subselect.c:378
#5  process_sublinks_mutator (node=0xee10e0, context=<value optimized out>)
at subselect.c:1680
#6  0x0057bc71 in expression_tree_mutator (node=<value optimized
out>, mutator=0x5b55e0 <process_sublinks_mutator>, context=0x7fff041d8640)
at nodeFuncs.c:2060
#7  0x005b5837 in process_sublinks_mutator (node=0xee1090,
context=0x7fff041d8730) at subselect.c:1770
#8  0x0057c09b in expression_tree_mutator (node=<value optimized
out>, mutator=0x5b55e0 <process_sublinks_mutator>, context=0x7fff041d8730)
at nodeFuncs.c:1991
#9  0x005b5837 in process_sublinks_mutator (node=0xee1060,
context=0x7fff041d8790) at subselect.c:1770
#10 0x005b5d6a in SS_process_sublinks (root=<value optimized out>,
expr=<value optimized out>, isQual=<value optimized out>) at
subselect.c:1653
#11 0x005aeebf in preprocess_expression (root=0xee0d10, expr=<value
optimized out>, kind=1) at planner.c:656
#12 0x005b1cbd in subquery_planner (glob=0xe6bb20, parse=0xedc2f0,
parent_root=<value optimized out>, hasRecursion=<value optimized out>,
tuple_fraction=0, subroot=0x7fff041d88b8) at planner.c:417
#13 0x00599426 in set_subquery_pathlist (root=0xedba38,
rel=0xedbed8, rti=<value optimized out>, rte=<value optimized out>) at
allpaths.c:754
#14 set_rel_pathlist (root=0xedba38, rel=0xedbed8, rti=<value optimized
out>, rte=<value optimized out>) at allpaths.c:178
#15 0x005997d3 in set_base_rel_pathlists (root=0xedba38,
joinlist=0xedc070) at allpaths.c:158
#16 make_one_rel (root=0xedba38, joinlist=0xedc070) at allpaths.c:94
#17 0x005ae856 in query_planner (root=0xedba38, tlist=<value
optimized out>, tuple_fraction=0, limit_tuples=-1,
cheapest_path=0x7fff041d8b60,
sorted_path=0x7fff041d8b58, num_groups=0x7fff041d8b68) at planmain.c:271
#18 0x005b0304 in grouping_planner (root=<value optimized out>,
tuple_fraction=0) at planner.c:1144
#19 0x005b1ee3 in subquery_planner (glob=0xe6bb20, parse=0xe6aa18,
parent_root=<value optimized out>, hasRecursion=<value optimized out>,
tuple_fraction=0, subroot=0x7fff041d8c48) at planner.c:535
#20 0x005b2191 in standard_planner (parse=0xe6aa18,
cursorOptions=0, boundParams=0x0) at planner.c:200
#21 0x00602b4a in pg_plan_query (querytree=<value optimized out>,
cursorOptions=<value optimized out>, boundParams=<value optimized out>)
at postgres.c:764
#22 0x00602c34 in pg_plan_queries (querytrees=<value optimized
out>, cursorOptions=0, boundParams=0x0) at postgres.c:823
#23 0x00603bf0 in exec_simple_query (query_string=0xe69bf0 "select
* from vinfo;") at postgres.c:988
#24 0x00604a73 in PostgresMain (argc=<value optimized out>,
argv=<value optimized out>, username=<value optimized out>) at
postgres.c:3932
#25 0x005d1bc9 in BackendRun () at postmaster.c:3561
#26 BackendStartup () at postmaster.c:3246
#27 ServerLoop () at postmaster.c:1432
#28 0x005d41bc in PostmasterMain (argc=<value optimized out>,
argv=<value optimized out>) at postmaster.c:1093
#29 0x0057a210 in main (argc=4, argv=0xdbed00) at main.c:188

Cheers,

Jeff


Re: [BUGS] BUG #8043: 9.2.4 doesn't open WAL files from archive, only looks in pg_xlog

2013-04-06 Thread Jeff Janes
On Sat, Apr 6, 2013 at 1:24 AM, Heikki Linnakangas
hlinnakan...@vmware.com wrote:


 Incidentally, I bumped into another custom backup script just a few weeks
 back that also excluded backup_label. I don't know what the author was
 thinking when he wrote that, but it seems to be a surprisingly common
 mistake. Maybe it's the "label" in the filename that makes people think
 it's not important.



I think part of it is the name 'label', and part of it is that this file is
similar to and hence easily confused with the .history files, which (as far
as I know) truly are there only for human information and not for system
operation.



 Perhaps we should improve the documentation to make it more explicit that
 backup_label must be included in the backup. The docs already say that,
 though, so I suspect that people making this mistake have not read the docs
 very carefully anyway.



I don't think the docs are very clear on that.  They say "This file will of
course be archived as a part of your backup dump file", but "will be" does
not imply "must be".  Elsewhere it emphasizes that the label you gave to
pg_start_backup is written into the file, but doesn't really say what the
file itself is there for.  To me it seems to imply that the file is there
for your convenience, to hold that label, and not as a critical part of the
system.

Patch attached, which I hope can be back-patched.  I'll also add it to
commitfest-Next.

Cheers,

Jeff


backup_label_warning_v1.patch
Description: Binary data



Re: [BUGS] BUG #8043: 9.2.4 doesn't open WAL files from archive, only looks in pg_xlog

2013-04-05 Thread Jeff Janes
On Fri, Apr 5, 2013 at 12:27 PM, boh...@visionlink.org wrote:

 The following bug has been logged on the website:

 Bug reference:  8043
 Logged by:  Jeff Bohmer
 Email address:  boh...@visionlink.org
 PostgreSQL version: 9.2.4
 Operating system:   CentOS 5.9 x86_64 kernel 2.6.18-348.3.1.el5
 Description:

 Hi pgsql-bugs,

 PG version: 9.2.4, from yum.postgresql.org
 OS: CentOS 5.9 x86_64 kernel 2.6.18-348.3.1.el5

 Upgrading from 9.2.3 to 9.2.4 has broken archive recovery for me. Probably
 related to this 9.2.4 change:

 Ensure we do crash recovery before entering archive recovery,
 if the database was not stopped cleanly and a recovery.conf
 file is present (Heikki Linnakangas, Kyotaro Horiguchi,
 Mitsumasa Kondo)

 When starting the PostgreSQL 9.2.4 service on a base backup, I get this:

 2013-04-05 12:49:04 MDT [10302]: [1-1] user=,db= LOG:  database system was
 interrupted; last known up at 2013-04-05 10:02:01 MDT
 2013-04-05 12:49:04 MDT [10302]: [2-1] user=,db= LOG:  starting archive
 recovery
 2013-04-05 12:49:04 MDT [10302]: [3-1] user=,db= LOG:  could not open file
 pg_xlog/000000010000002F0000002D (log file 47, segment 45): No such file
 or directory


 ...


 I use a custom base backup script to call pg_start/stop_backup() and make
 the backup with rsync.

 The restore_command in recovery.conf is never called by PG 9.2.4 during
 startup. I confirmed this by adding a touch /tmp/restore_command.`date
 +%H:%M:%S` line at the beginning of the shell script I use for my
 restore_command. No such files are created when starting PG 9.2.4.

 After downgrading back to 9.2.3, archive recovery works using the very same
 base backup, recovery.conf file, and restore_command. The log indicates
 that
 PG 9.2.3 begins recovery by pulling WAL files from the archive instead of
 pg_xlog:



I can reproduce the behavior you report only if I remove the backup_label
file from the restored data directory before I begin recovery.  Of course,
doing that renders the backup invalid, as without it recovery is very
likely to begin from the wrong WAL recovery location.

I think it is appropriate that 9.2.4 refuses to cooperate in those
circumstances, and it was a bug that 9.2.3 did allow it.

Do you have a backup_label file?



 Or, must I now include pg_xlog files when taking base backups with 9.2.4,
 contrary to the documentation?



You do not need to include pg_xlog, but you do need to include
backup_label.  And you always did need to include it--if you were not
including it in the past, then you were playing with fire, and it is only due
to luck that your database survived.
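
For reference, the skeleton of a script that keeps backup_label looks
something like this (paths and the label are placeholders, and a real
script needs error checking around every step):

psql -c "SELECT pg_start_backup('nightly', true);"
# backup_label now exists in $PGDATA and must travel with the copy;
# only the *contents* of pg_xlog are excluded
rsync -a --exclude='pg_xlog/*' "$PGDATA/" /backups/base/
psql -c "SELECT pg_stop_backup();"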

Cheers,

Jeff


[BUGS] BUG #8013: Memory leak

2013-03-31 Thread Jeff Janes
On Sunday, March 31, 2013, Tom Lane wrote:


 A different line of thought is that you might have set work_mem to
 an unreasonably large value --- the sort step will happily try to
 consume work_mem worth of memory.


I don't think that that can be the problem here, because memtuples can
never be more than 1GB even if work_mem is much larger than that.  Even if
his sort is using pass-by-reference (I don't think it would be), they
should be skinny enough that that limitation should prevent it from blowing
out memory.

Cheers,

Jeff


Re: [BUGS] BUG #7853: Incorrect statistics in table with many dead rows.

2013-03-02 Thread Jeff Janes
On Fri, Feb 22, 2013 at 3:41 PM, James R Skaggs
james.r.ska...@seagate.com wrote:

 Okay, I have some more info.

 Some background info.  This one table gets so many changes, I CLUSTER it
 each night.  However, after I do this, the statistics still appear to be
 incorrect, even after I do a select pg_stat_reset(), followed by 3
 ANALYZEs with default_statistics_target as 1, 10, and 100.

 select relname, n_live_tup, n_dead_tup, n_tup_ins, n_tup_del,
 n_tup_hot_upd
 from pg_stat_all_tables
 ('stream_file', 109061143L, 103913868L, 0L, 14201L, 0L)

 Is it possible that there are still dead tuples after a CLUSTER?


Yes.  A cluster must bring along any tuples which are possibly visible to
any open transaction.  Your root problem seems to be that you have
long-open transactions which are preventing vacuum from doing its thing,
which leads you to try clustering, but the long-open transactions prevent
that from doing its thing effectively as well.

Perhaps PG could deal with this situation more gracefully, but
fundamentally you have to figure out why you have these ancient transactions
lying around, and fix them or kill them.
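
A query along these lines (column names as of 9.2's pg_stat_activity)
will show the candidates:

psql -c "SELECT pid, usename, xact_start, state
           FROM pg_stat_activity
          WHERE xact_start < now() - interval '1 day'
          ORDER BY xact_start;"
# for sessions that cannot be fixed: SELECT pg_terminate_backend(pid);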

Cheers,

Jeff


Re: [BUGS] BUG #7902: lazy cleanup of extraneous WAL files can cause out of disk issues

2013-02-26 Thread Jeff Janes
On Friday, February 22, 2013, wrote:

 The following bug has been logged on the website:

 Bug reference:  7902
 Logged by:  Jeff Frost
 Email address:  j...@pgexperts.com
 PostgreSQL version: 9.2.3
 Operating system:   Ubuntu 12.04
 Description:

 While doing acceptance testing on a new Ubuntu 12.04 PostgreSQL server
 running 9.2.3, we set checkpoint_segments = 128,
 checkpoint_completion_target = 0.9 and placed pg_xlog on a separate 20G
 partition. Also, archive_mode = off on this system.

 According to the docs, you would expect the system to attempt to keep the
 WAL files down close to 3 * checkpoint_segments + 1.


Assuming checkpoints are driven by checkpoint_segments and not time or
manual checkpoints, this actually seems to be an error in the docs.  What
it brings it back down to is something more like
(3 + checkpoint_completion_target) * checkpoint_segments + 1.

At the point in the code where it does the removal, it has all of the logs
since the start of the just-finished checkpoint
(completion_target*checkpoint_segments) that can't be removed, all of the
logs since the start of the checkpoint before that one
(=checkpoint_segments) that can't be removed, plus it will recycle enough
to have checkpoint_segments*2+1 future logs before it removes the rest.
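
As a rough sanity check of that formula against your settings (and
assuming the standard 16MB segment size):

echo '(3 + 0.9) * 128 + 1' | bc -l   # ~500 segments kept after a checkpoint
echo '500 * 16 / 1024' | bc -l       # ~7.8GB, which fits a 20G pg_xlog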

But this describes what happens once things have gotten out of control but
then calm down again.  In your case it never does calm down again.

Cheers,

Jeff


Re: [BUGS] BUG #7902: lazy cleanup of extraneous WAL files can cause out of disk issues

2013-02-26 Thread Jeff Janes
On Sunday, February 24, 2013, Jeff Frost wrote:


 This is how the log_checkpoint output looks during the run:

 2013-02-24 21:06:31.156 UTC,,,1624,,51282598.658,114,,2013-02-23 02:12:40
 UTC,,0,LOG,0,checkpoint starting: immediate force wait,
 2013-02-24 21:06:31.216 UTC,,,1624,,51282598.658,115,,2013-02-23 02:12:40
 UTC,,0,LOG,0,checkpoint complete: wrote 108 buffers (0.0%); 0
 transaction log file(s) added, 0 removed, 4 recycled; write=0.054 s,
 sync=0.002 s, total=0.059 s; sync files=10, longest=0.000 s, average=0.000
 s,

 So, this ^ is the manual checkpoint prior to the pgbench run and
 everything else is part of the pgbench run.

 2013-02-24 21:06:45.227 UTC,,,1624,,51282598.658,116,,2013-02-23 02:12:40
 UTC,,0,LOG,0,checkpoints are occurring too frequently (14 seconds
 apart),,Consider increasing the configuration parameter
 checkpoint_segments.,,,


It took 14 seconds to fill up 128 files.


 2013-02-24 21:06:45.227 UTC,,,1624,,51282598.658,117,,2013-02-23 02:12:40
 UTC,,0,LOG,0,checkpoint starting: xlog,
 2013-02-24 21:08:35.239 UTC,,,1624,,51282598.658,118,,2013-02-23 02:12:40
 UTC,,0,LOG,0,checkpoint complete: wrote 266935 buffers (12.7%); 0
 transaction log file(s) added, 0 removed, 1 recycled; write=10.889 s,
 sync=99.079 s, total=110.011 s; sync files=29, longest=12.888 s,
 average=3.416 s,


It spent 10.9 seconds writing buffers, either because it knew it was behind
time and was writing as fast as it could, or because it figured it needed
to be done with everything in 10.9 / 0.9 = 12.1 seconds, as that was when
the next checkpoint was due to start (i.e. when the next 128 were going to be
filled).  But it took 99 seconds to accomplish the part of the checkpoint
that is supposed to fit in 1.2 seconds or less.  Assuming it was still
generating log files at the same rate of 128 every 12 to 14 seconds, there
would be an extra 1000 files by the time it is done.  That is almost enough
there to fill up your hard drive.  Now it probably won't manage to do that,
as I imagine the long fsync times would also lead to stalls in throughput.


 2013-02-24 21:08:35.241 UTC,,,1624,,51282598.658,119,,2013-02-23 02:12:40
 UTC,,0,LOG,0,checkpoint starting: xlog,
 2013-02-24 21:11:19.506 UTC,,,1624,,51282598.658,120,,2013-02-23 02:12:40
 UTC,,0,LOG,0,checkpoint complete: wrote 801517 buffers (38.2%); 0
 transaction log file(s) added, 0 removed, 128 recycled; write=15.484 s,
 sync=148.739 s, total=164.266 s; sync files=32, longest=11.773 s,
 average=4.648 s,


Here again, either the writes are being metered so they finish after 15.484
s when PG estimates that 0.9 * 128 log files would be filled, or they were
going as fast as possible because it thought they would be filled even
earlier.  It then took nearly ten times that long to finish the checkpoint
(which, according to completion_target, is supposed to take 1/10 the time
of the write, not 10 times longer).  So of course the actual number of log
files is going to greatly exceed the planned 128.


 2013-02-24 21:11:19.506 UTC,,,1624,,51282598.658,121,,2013-02-23 02:12:40
 UTC,,0,LOG,0,checkpoint starting: xlog,
 2013-02-24 21:14:20.486 UTC,,,1624,,51282598.658,122,,2013-02-23 02:12:40
 UTC,,0,LOG,0,checkpoint complete: wrote 1009466 buffers (48.1%); 0
 transaction log file(s) added, 140 removed, 257 recycled; write=13.690 s,
 sync=167.245 s, total=180.980 s; sync files=33, longest=14.442 s,
 average=5.067 s,
 2013-02-24 21:14:20.487 UTC,,,1624,,51282598.658,123,,2013-02-23 02:12:40
 UTC,,0,LOG,0,checkpoint starting: xlog,
 2013-02-24 21:17:31.507 UTC,,,1624,,51282598.658,124,,2013-02-23 02:12:40
 UTC,,0,LOG,0,checkpoint complete: wrote 1055059 buffers (50.3%); 0
 transaction log file(s) added, 259 removed, 257 recycled; write=22.731 s,
 sync=168.239 s, total=191.020 s; sync files=32, longest=13.650 s,
 average=5.257 s,
 2013-02-24 21:17:31.507 UTC,,,1624,,51282598.658,125,,2013-02-23 02:12:40
 UTC,,0,LOG,0,checkpoint starting: xlog,
 2013-02-24 21:23:36.201 UTC,,,1624,,51282598.658,126,,2013-02-23 02:12:40
 UTC,,0,LOG,0,checkpoint complete: wrote 814767 buffers (38.9%); 0
 transaction log file(s) added, 546 removed, 0 recycled; write=269.553 s,
 sync=95.042 s, total=364.693 s; sync files=31, longest=11.211 s,
 average=3.065 s,

 They're all xlog-count based, and the first one doesn't remove any;
 that's not surprising,


What was the state of the pg_xlog directory before you started the test?
 If it were not full up to checkpoint_segments * 2 (which it likely was
not, since populating pgbench_accounts avoids WAL when not archiving), then
I agree it is not surprising.

 but I'm surprised that the second one doesn't remove any.  It does
recycle quite a few.

PG keeps two checkpoints' worth of WAL files (that doesn't really seem to be
documented, and I'm not sure why it is done, other than an abundance of

[BUGS] new BUG: postgresql 9.2.3: very long query time

2013-02-23 Thread Jeff Janes
On Friday, February 22, 2013, Tom Lane wrote:

 Jeff Janes jeff.ja...@gmail.com writes:
  The slowness was introduced with this:
  Use parameterized paths to generate inner indexscans more flexibly.

 Try increasing from_collapse_limit to 11 or more.


I've increased it to 20 and still no luck.



 I'm not exactly sure why the param-path patch might have changed this
 behavior, but in any case the collapse limits ought to be the first
 thing you think of when you see a crappy plan for a many-relation query.


I've stripped it down to this:

explain (analyze,buffers)
select  1 as shift_date
from cb_order order1
inner join Template template2 on order1.template =
template2.id
left outer join Product product1 on template2.id =
product1.id,
Template template1  cross join Product
product2
where   order1.template = template1.id
and (
template2.id=product2.id
or
case when product1.id is not null
then 1
when template2.id is not null
then 0
end <> 1
and
product2.id=2916353
   ) ;


The good plan uses a BitmapOr on a product2.id index to satisfy 
template2.id=product2.id or product2.id= 2916353 (which then needs to use
a filter to check that the CASE part holds in case the true branch of the
OR was the branch with 2916353)

The bad plan seems to have forgotten how to do that, and so seq scans
product2 repeatedly.  If I remove the CASE, then it uses the BitmapOr, so
what it has forgotten seems to be that A or (B and C) can only be true if
(A or C) is true.

I say "forgot", because the planner knows that the bad plan is way worse
than the good one, so it is probably about a lack-of-proof-of-correctness
rather than some small change in cost estimation pushing one over the other.

But it isn't as simple as that, as if I replace the CASE with one that
doesn't refer to product1.id, then it relearns how to use the BitmapOr.

case when random() > 0.5 then 1
     when template2.id is not null then 0
end <> 1

I'm not sure where to go from here.

Cheers,

Jeff


Re: [BUGS] BUG #7853: Incorrect statistics in table with many dead rows.

2013-02-23 Thread Jeff Janes
On Sun, Feb 10, 2013 at 12:10 PM, Jeff Janes jeff.ja...@gmail.com wrote:

 On Tue, Feb 5, 2013 at 2:00 PM, Kevin Grittner kgri...@ymail.com wrote:
 
  OK, the estimate was 13 million and there were actually 13.8
  million, but it is a random sample used to generate estimates.
  That seems worse than average, but close enough to be useful.
  The 158.7 million total rows includes dead rows, which must be
  visited to determine visibility, but will not be counted because
  they are not visible to the counting transaction.

 To clarify here, the 158.7 million estimate does not *intentionally*
 include dead rows.  As you say, the ANALYZE did get a very good
 instantaneous estimate of the number of live rows.  However, ANALYZE
 doesn't over-write the old estimate, it averages its estimate into the
 old one.  After the table shape changes dramatically, the ANALYZE
 needs to be run repeatedly before the estimate will converge to the
 new reality.  (Of course a cluster or vacuum full will blow away the
 old statistics, so the next analyze after that will solely determine
 the new statistics.)


I was incredibly wrong here.  The cluster or vacuum do not blow away the
stats so that the next analyze gets to solely determine them.  Rather, they
impose their own idea of live tuples, and then analyze can only update that
incrementally as it averages itself into the old value.

Worse, the two methods have very different ideas of what constitutes a live
tuple.  ANALYZE thinks tuples that are visible to a current/recent snapshot
are live.  While CLUSTER and VACUUM think tuples that are possibly visible
to anyone are live.

I would say that this is a bug, or at least approaching being one.  It
is not obvious whether reltuples and n_live_tuples should count the
recently dead, but it should either be one way or the other and not an
unholy mixture of the two.

As it is now, a cluster or simple vacuum will snap n_live_tuples so that it
counts recently dead, then analyze will slowly converge it to exclude
recently dead, and then the next vacuum will snap it back again.

Of course, all of this only comes into play in the presence of very
long-lived transactions that prevent tuples from going away.  Otherwise the
number recently dead is small enough not to matter.


create table foo as select (random()*100)::integer as val from
generate_series(1,5000);

In a different session, open a transaction and leave it open: begin; create
temp table adlfkj (x serial);

Back in the main session:

delete from foo where val < 100;

run this repeatedly and watch the rows estimate slowly decay:

ANALYZE verbose foo;  explain select count(*) from foo;

Then run this and watch it instantly spring back:

VACUUM VERBOSE foo ; explain select count(*) from foo;

Cheers,

Jeff


Re: [BUGS] new BUG: postgresql 9.2.3: very long query time

2013-02-21 Thread Jeff Janes
On Wed, Feb 20, 2013 at 5:42 AM, Claude Speed gta3u...@gmail.com wrote:
 Hi all,

 Postgresql 9.2.3 is processing my query much longer than Postgresql
 9.1.8:
 Postgresql 9.1.8 - 2292 ms
 Postgresql 9.2.3 - 163336 ms

 I provided my query in the attachment and the database dump too;
 this bug is reproducible.

 Operating system: Gentoo x86_64 Linux 2.6.36

The slowness was introduced with this:

commit e2fa76d80ba571d4de8992de6386536867250474
Author: Tom Lane t...@sss.pgh.pa.us
Date:   Fri Jan 27 19:26:38 2012 -0500

Use parameterized paths to generate inner indexscans more flexibly.


Do you know of simpler queries that also exhibit this behavior?  It
would certainly make analysis of the issue easier.


Cheers,

Jeff




[BUGS] BUG #7882: plperl poor error message

2013-02-14 Thread jeff.janes
The following bug has been logged on the website:

Bug reference:  7882
Logged by:  Jeff Janes
Email address:  jeff.ja...@gmail.com
PostgreSQL version: 9.2.3
Operating system:   Linux
Description:

If plperl.on_init attempts to load a module which does not exist, then
attempts to use plperl give a reasonable error message upon first attempt,
but after that the error message is rather cryptic.

in postgresql.conf:

plperl.on_init='use Nonexistent::Module;'

CREATE OR REPLACE FUNCTION perlfunc(text) RETURNS text AS $BODY$ return
scalar reverse $_[0]; $BODY$ LANGUAGE plperl;

ERROR:  Can't locate Nonexistent/Module.pm in @INC (@INC contains: ....)
at -e line 98.
BEGIN failed--compilation aborted
CONTEXT:  while parsing Perl initialization

Upon a second attempt, the error becomes much less useful:

CREATE OR REPLACE FUNCTION perlfunc(text) RETURNS text AS $BODY$ return
scalar reverse $_[0]; $BODY$ LANGUAGE plperl;

ERROR:  attempt to redefine parameter plperl.use_strict






Re: [BUGS] BUG #7853: Incorrect statistics in table with many dead rows.

2013-02-10 Thread Jeff Janes
On Tue, Feb 5, 2013 at 2:00 PM, Kevin Grittner kgri...@ymail.com wrote:
 jim...@seagate.com jim...@seagate.com wrote:

 INFO:  analyzing public.stream_file
 INFO:  stream_file: scanned 30000 of 2123642 pages, containing
 184517 live rows and 2115512 dead rows; 30000 rows in sample,
 158702435 estimated total rows

 184517 live rows in 30000 randomly sampled pages out of 2123642
 total pages, means that the statistics predict that a select
 count(*) will find about  13 million live rows to count.

 After analyze verbose, the table shows 158 million rows. A
 select count(1) yields 13.8 million rows.

 OK, the estimate was 13 million and there were actually 13.8
 million, but it is a random sample used to generate estimates.
 That seems worse than average, but close enough to be useful.
 The 158.7 million total rows includes dead rows, which must be
 visited to determine visibility, but will not be counted because
 they are not visible to the counting transaction.

To clarify here, the 158.7 million estimate does not *intentionally*
include dead rows.  As you say, the ANALYZE did get a very good
instantaneous estimate of the number of live rows.  However, ANALYZE
doesn't over-write the old estimate, it averages its estimate into the
old one.  After the table shape changes dramatically, the ANALYZE
needs to be run repeatedly before the estimate will converge to the
new reality.  (Of course a cluster or vacuum full will blow away the
old statistics, so the next analyze after that will solely determine
the new statistics.)

I agree, not a bug.

Cheers,

Jeff




Re: [BUGS] Re: BUG #7748: drop owned by fails with error message: unrecognized object class: 1262

2013-01-29 Thread Jeff Janes
On Mon, Jan 28, 2013 at 2:37 PM, Alvaro Herrera
alvhe...@2ndquadrant.com wrote:
 Pushed, thanks.

 Jeff, Thomas, Jaime: please have a look and let me know what you think.

I've tested it in the 9_2_STABLE branch and found no problems.

Thanks,

Jeff




Re: [BUGS] BUG #7836: COPY command does not honor 'FORMAT' option

2013-01-29 Thread Jeff Janes
On Tue, Jan 29, 2013 at 12:19 PM, hubert depesz lubaczewski
dep...@depesz.com wrote:
 On Tue, Jan 29, 2013 at 06:20:05PM +0000, kurt.l...@cello.com wrote:
 template1=# copy pg_aggregate to '/tmp/agg.bin' with format binary;

 correct syntax:

 copy pg_aggregate to '/tmp/agg.bin' with (format 'binary');

I was just about to call this confirmed before seeing your email.

I figured out the need for the parentheses by reading the docs.  The
need to also quote 'binary' really took me by surprise, especially as
the other two formats don't need to be quoted.  I assume the reason
for this is the level of reservedness of the various words in bison.
But is this a doc bug?  Should <literal>binary</literal> be shown
quoted if it needs to be used quoted?  Or at least maybe the Example
section could include an example of the use of FORMAT.
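
For instance, the three formats side by side, run from the shell (COPY
to a server file needs superuser); only binary takes the quotes:

psql -c "COPY pg_aggregate TO '/tmp/agg.txt' WITH (FORMAT text);"
psql -c "COPY pg_aggregate TO '/tmp/agg.csv' WITH (FORMAT csv);"
psql -c "COPY pg_aggregate TO '/tmp/agg.bin' WITH (FORMAT 'binary');"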


Cheers,

Jeff




Re: [BUGS] BUG #6528: pglesslog still referenced in docs, but no 9.1 support

2013-01-25 Thread Jeff Janes
On Fri, Jan 25, 2013 at 2:01 PM, Peter Eisentraut pete...@gmx.net wrote:
 On 1/25/13 1:35 AM, Kevin Grittner wrote:
 Peter, do you have a version that works with 9.3?

 I don't, but it shouldn't be hard for someone more up to date with the
 internal WAL addressing changes.

I've been thinking about that.  Should pg_clearxlogtail for 9.3 only
work for logs from 9.3, or should it be arranged to work on logs from
older versions as well (which probably means parts of the 9.2 headers
would need to be copied into the body of pg_clearxlogtail)?


 Given that we're past the start of the last CF, do we still want to
 consider including it? If you tilt your head at just the right
 angle, I guess it could look like a bug fix -- but it is a stretch.

 Given that clearxlog doesn't actually work in 9.3 right now (see above)
 and doesn't have any documentation, I think the stretch is a bit too big.

 I do like the idea behind it, but the first step would be to address
 these two points and let people try it out.  Note also that in our
 discussion we had some questions about how exactly to write the archive
 command to use it.

This is what I use:

archive_command = 'set -C -o pipefail;
(/usr/local/pgsql9_2/bin/pg_clearxlogtail < %p || cat %p) | gzip >
/mnt/foo/archivelog9_2/%f.gz'

The || cat %p business is because pg_clearxlogtail will fail on
.history files and so needs an alternative path.

I think it would be better to have pg_clearxlogtail instead just
silently pass any unrecognized files through without manipulation,
rather than erroring out.

Most people would probably want to use a more explicit shell script
with notifications and such, like one Kevin posted to general a few
months ago under "checkpoint_timeout and archive_timeout".
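
The more explicit variant might look roughly like this (the script path,
mail address, and notification step are all hypothetical):

#!/bin/bash
# called as: archive_command = '/usr/local/bin/archive_wal.sh %p %f'
set -C -o pipefail
p="$1"; f="$2"
if ! (/usr/local/pgsql9_2/bin/pg_clearxlogtail < "$p" || cat "$p") \
     | gzip > "/mnt/foo/archivelog9_2/$f.gz"
then
    echo "WAL archiving failed for $f" | mail -s 'archive failure' dba@example.com
    exit 1   # non-zero exit makes the server retry this segment
fi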


Cheers,

Jeff




[BUGS] BUG #7814: Rotation of the log is not carried out.

2013-01-23 Thread Jeff Janes
On Tue, Jan 22, 2013 at 9:29 PM, Jeff Janes jeff.ja...@gmail.com wrote:
 On Tuesday, January 22, 2013, Tom Lane wrote:


 So what we need on Windows is for the data transfer thread to notice
 when Log_RotationSize > 0 && ftell(syslogFile) >= Log_RotationSize,
 and then either signal the control thread to wake up, or do the
 rotation itself.  Probably the former is less risk.

 I'm not coding this fix though, since I'm not in a position to test it.


 If I can follow through on Andrew Dunstan's instructions to get mingw up
 and running, I'll take it for a spin.

The attached patch fixes the problem, tested under mingw.

I've included an elog of "JJ logger woke up" in the patch to verify that the
wake up is not happening more often than it should.  Obviously this line
should not be included in the actual commit--but I thought a tester would
like to see how I verified this.

This is my first effort in both Windows and in the Latch code, so committer
beware.

It should be applied to 9_2 and to head.

I've not added it to the commit-fest as it is a bug fix.

Cheers,

Jeff


syslogger_rotate_win_v1.patch
Description: Binary data



Re: [BUGS] Re: BUG #7748: drop owned by fails with error message: unrecognized object class: 1262

2013-01-23 Thread Jeff Janes
On Sunday, December 9, 2012, Alvaro Herrera wrote:

 Tom Lane wrote:
 
  spam_ea...@gmx.net writes:
   postgres=# create user testuser with password 'secret';
   CREATE ROLE
   postgres=# create database testdb owner testuser;
   CREATE DATABASE
   testdb=> drop owned by testuser;
   ERROR:  unrecognized object class: 1262
 
  I can reproduce this in all versions back to 8.3.  In 8.2, the role's
  ownership of the database is silently ignored, which I think was the
  design intention.  I doubt that we want DROP OWNED dropping whole
  databases.  At most maybe we should raise a NOTICE?

 I broke it recently: fe3b5eb08

 Got a one day old guy at home, can't look at this for at least a
 week, sorry.


Since back-branch releases are coming up, I think fe3b5eb08 and its
analogues in all branches should be reverted.

The issue it was intended to solve was not really a bug in the first place,
and this patch didn't solve it anyway. But it introduced new behavior (into
all supported branches) which certainly is a bug.


I don't think the original issue it was intended to solve can be solved
easily, either. It would be easy to fix the code to add tablespaces to the
list of things to be reassigned, but the code that does the actual
reassignment can't deal with tablespaces anyway.



Cheers,


Jeff


Re: [BUGS] BUG #7814: Rotation of the log is not carried out.

2013-01-22 Thread Jeff Janes
On Tuesday, January 22, 2013, Tom Lane wrote:

 Jeff Janes jeff.ja...@gmail.com writes:
  For what it is worth, I can reproduce this on Windows 7, using the
  9.2.2 and 9.1.7 windows 64 installers from EDB (i.e.
 
 http://www.enterprisedb.com/postgresql-922-installers-win64?ls=Crossover&type=Crossover
 ),
  with completely default installation (EDB changes the default to be
  logging_collector=on), and using the below to blow up the log files
  with error messages:

  perl -le 'print "FOO$_;" foreach 1..1e7' | psql

  9.1.7 rotates the log files in $DATA/pg_log, while 9.2.2 does not
  rotate based on size.

 [ scratches head for awhile... ]  Hm, what you are testing is the
 default rotation parameters, right?  log_rotation_age = 1 day,
 log_rotation_size = 10MB?  So you didn't wait to see what would happen
 at a rotation-age boundary.


Correct, I did not wait.  But from the original reporter's screenshot, it
seemed to be rotating correctly based on time, just not on size.


  If so, does pg_ctl reload kick it into
 rotating once the file is too big?


Yes, triggering a reload (through pgAdmin) does cause it to kick over to a
new log file.

...


 So what we need on Windows is for the data transfer thread to notice
 when Log_RotationSize > 0 && ftell(syslogFile) >= Log_RotationSize,
 and then either signal the control thread to wake up, or do the
 rotation itself.  Probably the former is less risk.

 I'm not coding this fix though, since I'm not in a position to test it.


If I can follow through on Andrew Dunstan's instructions to get mingw up and
running, I'll take it for a spin.

Cheers,

Jeff


Re: [BUGS] BUG #7814: Rotation of the log is not carried out.

2013-01-19 Thread Jeff Janes
On Friday, January 18, 2013, Tsunezumi wrote:


 I installed ordinarily.


Could you be more specific?  I do not know what is ordinary for you.

I ordinarily install from source (although not on Windows).  Other people
ordinarily do it differently.


 I do not have this problem on PostgreSQL 9.1.7.


Could you show a directory listing of the pg_log directory under both 9.1.7
and 9.2.2, so that we can see what you are seeing that leads to the
conclusion that rotation is not occurring?

Cheers,

Jeff


Re: [BUGS] BUG #7748: drop owned by fails with error message: unrecognized object class: 1262

2012-12-11 Thread Jeff Janes
On Sun, Dec 9, 2012 at 8:28 PM, Jaime Casanova ja...@2ndquadrant.com wrote:
 On Sun, Dec 9, 2012 at 11:13 PM, Alvaro Herrera
 alvhe...@2ndquadrant.com wrote:
 Tom Lane wrote:

 spam_ea...@gmx.net writes:
  postgres=# create user testuser with password 'secret';
  CREATE ROLE
  postgres=# create database testdb owner testuser;
  CREATE DATABASE
  testdb=> drop owned by testuser;
  ERROR:  unrecognized object class: 1262

 I can reproduce this in all versions back to 8.3.  In 8.2, the role's
 ownership of the database is silently ignored, which I think was the
 design intention.  I doubt that we want DROP OWNED dropping whole
 databases.  At most maybe we should raise a NOTICE?

 I broke it recently: fe3b5eb08


 whatever is the right way to solve this... shouldn't we do something
 similar in shdepReassignOwned() in which we are still ignoring shared
 objects?

Based on the commit message, it seems like it should *only* be in
shdepReassignOwned.

However, when I put it there it fails, as the code that does the
ownership change cannot deal with tablespaces (or databases)

ERROR:  unexpected classid 1213

I am unsure of the goal here.  The docs clearly say that only objects
in the current database are affected, so why are we even trying to do
something with tablespaces (or databases), which do not live in any
database?  And if we want to change the contract to allow it to climb
out of the current database, why limit it to shared objects rather
than crawling all databases?

Cheers,

Jeff




Re: [BUGS] PITR potentially broken in 9.2

2012-12-05 Thread Jeff Janes
On Wed, Dec 5, 2012 at 8:40 AM, Tom Lane t...@sss.pgh.pa.us wrote:

 The real question here probably needs to be "what is the point of
 recoveryPauseAtTarget in the first place?".  I find it hard to envision
 what's the point of pausing unless the user has an opportunity to
 make a decision about whether to continue applying WAL.

Right now if I'm doing a PITR and want to look around before blessing
the restore, I have to:

0) restore from the base backup.

1) change pg_hba to lock out everyone else

2) do the restore to my best guess of what was just before the tragic DML.

3) Look around and see if I guessed correctly.

4a) If I replayed too far, blow everything away and start over.

4b) If I replayed not far enough, and opened the database normally
upon completion so I could look around, then blow everything away and
start over.

4c) If I replayed not far enough and opened the database in hot
standby, I don't have to blow everything away, I just have to shut
down the server, change the restore point forward, and restart it.

5) change pg_hba back to normal and restart the server.

It would be nice if I could execute 4c without a shutdown/restart, but
certainly a shutdown/restart is better than blowing everything away
and starting over, like you have to do once the restore picks a new
time line.  So while the current pause behavior is incomplete, it is
still useful.

It would also be nice if only the superuser were allowed to connect to
the hot standby when pause_at_recovery_target=true, until after
pg_xlog_replay_resume() is called.  That way I could skip steps 1 and
5--steps which are very easy to forget to do.
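
The lockout in steps 1 and 5 is simple enough in itself; a sketch,
assuming local peer authentication for the postgres superuser:

cp "$PGDATA/pg_hba.conf" "$PGDATA/pg_hba.conf.orig"
echo 'local all postgres peer' > "$PGDATA/pg_hba.conf"
pg_ctl reload -D "$PGDATA"
# ... inspect the restore, resume replay if needed ...
mv "$PGDATA/pg_hba.conf.orig" "$PGDATA/pg_hba.conf"
pg_ctl reload -D "$PGDATA"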

Cheers,

Jeff




Re: [BUGS] PITR potentially broken in 9.2

2012-12-05 Thread Jeff Janes
On Wed, Dec 5, 2012 at 11:17 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Jeff Janes jeff.ja...@gmail.com writes:
 Right now if I'm doing a PITR and want to look around before blessing
 the restore, I have to:
 [ do painful stuff ]

 Yeah.  The worst thing about this is the cost of stepping too far
 forward, but I doubt we can do much about that --- WAL isn't reversible
 and I can't see us making it so.  What we can get rid of is the pain
 of shutting down to move the recovery target forward.

 Another thought here is that it would be good to have some kind of
 visibility of the last few potential stop points (timestamps/XIDs),
 so that if you do roll too far forward, you have some idea of what
 to try after you reset everything.  A zero-order implementation of
 that would be to emit LOG messages as we replay each potential
 commit, but maybe we can do better.

Probably embellishments on xlogdump or xlogreader would be the way to go.


 I would also be nice if only the superuser is allowed to connect to
 the hot standby when pause_at_recovery_target=true, until after
 pg_xlog_replay_resume() is called.

 Uh, why?  Other users won't be able to do anything except look around;

On some systems, 95% of users never do anything (that the database
knows about) except look around.  But I think it would be unfortunate
to accidentally show them data that will soon be revoked.

 they can't force the database to become read/write.  I can't see that
 it's a good idea for recovery to play games with the pg_hba rules;
 too much chance of screwing things up for too little benefit.

OK.  I don't know at all what is involved in implementing such a
thing.  But a DBA in the middle of a rather arcane but urgent task has
a pretty high chance of screwing things up, too.

Cheers,

Jeff




Re: [BUGS] PITR potentially broken in 9.2

2012-12-04 Thread Jeff Janes
On Sun, Dec 2, 2012 at 1:02 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Jeff Janes jeff.ja...@gmail.com writes:
 On Sat, Dec 1, 2012 at 1:56 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 I'm confused.  Are you now saying that this problem only exists in
 9.1.x?  I tested current HEAD because you indicated the problem was
 still there.

 No, I'm saying the problem exists both in 9.1.x and in hypothetical
 9.2.2 and in hypothetical 9.3, but not in 9.2.[01] because in those it
 is masked by that other problem which has just been fixed.

 I'm still confused.  I've now tried this in both HEAD and 9.1 branch
 tip, and I do not see any misbehavior.  If I set recovery_target_time to
 before the pg_stop_backup time, I get FATAL:  requested recovery stop
 point is before consistent recovery point which is what I expect; and
 if I set it to after the pg_stop_backup time, it starts up as expected.
 So if there's a remaining unfixed bug here, I don't understand what
 that is.

I've reproduced it again using the just-tagged 9.2.2, and uploaded a
135MB tarball of the /tmp/data_slave2 and /tmp/archivedir to google
drive.  The data directory contains the recovery.conf which is set to
end recovery between the two critical time points.

https://docs.google.com/open?id=0Bzqrh1SO9FcES181YXRVdU5NSlk

This is the command line I use to start recovery, and the resulting log output.

https://docs.google.com/open?id=0Bzqrh1SO9FcEaTQ2QXhFdDZYaUE

I can't connect to the standby to execute pg_xlog_replay_resume() because:
FATAL:  the database system is starting up

Cheers,

Jeff




Re: [BUGS] PITR potentially broken in 9.2

2012-12-04 Thread Jeff Janes
On Tue, Dec 4, 2012 at 4:20 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Jeff Janes jeff.ja...@gmail.com writes:
 I've reproduced it again using the just-tagged 9.2.2, and uploaded a
 135MB tarball of the /tmp/data_slave2 and /tmp/archivedir to google
 drive.  The data directory contains the recovery.conf which is set to
 end recovery between the two critical time points.

 Hmmm ... I can reproduce this with current 9.2 branch tip.  However,
 more or less by accident I first tried it with a 9.2-branch postmaster
 from a couple weeks ago, and it works as expected with that: the log
 output looks like

 LOG:  restored log file 0001001B from archive
 LOG:  restored log file 0001001C from archive
 LOG:  restored log file 0001001D from archive
 LOG:  database system is ready to accept read only connections
 LOG:  recovery stopping before commit of transaction 305610, time 2012-12-02 
 15:08:54.000131-08
 LOG:  recovery has paused
 HINT:  Execute pg_xlog_replay_resume() to continue.

 and I can connect and do the pg_xlog_replay_resume() thing.

But the key is, the database was not actually consistent at that
point, and so opening hot standby was a dangerous thing to do.

The bug that allowed the database to open early (the original topic of
this email chain) was masking this secondary issue.

 So apparently this is something we broke since Nov 18.  Don't know what
 yet --- any thoughts?  Also, I am still not seeing what the connection
 is to the original report against 9.1.6.

The behavior that we both see in 9.2.2, where it waits for a
pg_xlog_replay_resume() that cannot be delivered because the database
is not yet open, is the same thing I'm seeing in 9.1.6.  I'll see if I
can repeat it in 9.1.7 and post the tarball of the data directory.

Cheers,

Jeff




Re: [BUGS] PITR potentially broken in 9.2

2012-12-04 Thread Jeff Janes
On Tue, Dec 4, 2012 at 4:35 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 I wrote:
 So apparently this is something we broke since Nov 18.  Don't know what
 yet --- any thoughts?

 Further experimentation shows that reverting commit
 ffc3172e4e3caee0327a7e4126b5e7a3c8a1c8cf makes it work.  So there's
 something wrong/incomplete about that fix.

I can't independently vouch for the correctness of that fix, but I can
vouch that there is so far no evidence that it is incorrect.

It is re-revealing an undesirable (but safe, as far as we know)
behavior that is present in 9.1.x but which was temporarily hidden by
a corruption-risk bug in 9.2.0 and 9.2.1.


 This is a bit urgent since we now have to consider whether to withdraw
 9.2.2 and issue a hasty 9.2.3.  Do we have a regression here since
 9.2.1, and if so how bad is it?

I don't think this is urgent.  The error-message issue in 9.1.6 and
9.2.2 is merely annoying, while the early-opening one in 9.2.0 and
9.2.1 seems fundamentally unsafe.

Cheers,

Jeff




Re: [BUGS] PITR potentially broken in 9.2

2012-12-01 Thread Jeff Janes
On Sat, Dec 1, 2012 at 12:47 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Jeff Janes jeff.ja...@gmail.com writes:
 On Wed, Nov 28, 2012 at 7:51 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Is this related at all to the problem discussed over at
 http://archives.postgresql.org/pgsql-general/2012-11/msg00709.php
 ?  The conclusion-so-far in that thread seems to be that an error
 ought to be thrown for recovery_target_time earlier than the
 backup stop time, but one is not getting thrown.

 It is not directly related.  That thread was about 9.1.6.

 In the newly fixed 9_2_STABLE, that problem still shows up the same as
 it does in 9.1.6.

 (In 9.2.1, recovery sometimes blows up before that particular problem
 could be detected, which is what lead me here in the first place--that
 is the extent of the relationship AFAIK)

 To see this one, follow the instructions in my previous email, but set
 recovery_target_time to a time just after the end of the
 pg_start_backup checkpoint, rather than just before it, and turn on
 hot_standby

 I tried to reproduce this as per your directions, and see no problem in
 HEAD.  Recovery advances to the specified stop time, hot standby mode
 wakes up,

Hot standby should only wake up once recovery has proceeded beyond the
pg_stop_backup() point.

If the specified stop point is after pg_start_backup() finishes, but
before pg_stop_backup(), then hot standby should not start up (and
with the newest HEAD, in my hands, it does not).  Are you sure you set
the stop time to just after pg_start_backup, not to just after
pg_stop_backup?
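
Concretely, the recovery.conf shape I mean, with hot standby on (the
timestamp is a placeholder that falls between the backup's start and
stop times):

restore_command = 'cp /tmp/archivedir/%f %p'
recovery_target_time = '2012-11-27 09:41:00 PST'
# and hot_standby = on in postgresql.conf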

Cheers,

Jeff




Re: [BUGS] PITR potentially broken in 9.2

2012-12-01 Thread Jeff Janes
On Sat, Dec 1, 2012 at 1:56 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Jeff Janes jeff.ja...@gmail.com writes:
 On Sat, Dec 1, 2012 at 12:47 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Jeff Janes jeff.ja...@gmail.com writes:
 In the newly fixed 9_2_STABLE, that problem still shows up the same as
 it does in 9.1.6.

 I tried to reproduce this as per your directions, and see no problem in
 HEAD.  Recovery advances to the specified stop time, hot standby mode
 wakes up,

 Hot standby should only wake up once recovery has proceeded beyond the
 pg_stop_backup() point.

 If the specified stop point is after pg_start_backup() finishes, but
 before pg_stop_backup(), then hot standby should not start up (and
 with the newest HEAD, in my hands, it does not).  Are you sure you set
 the stop time to just after pg_start_backup, not to just after
 pg_stop_backup?

 I'm confused.  Are you now saying that this problem only exists in
 9.1.x?  I tested current HEAD because you indicated the problem was
 still there.

No, I'm saying the problem exists both in 9.1.x and in hypothetical
9.2.2 and in hypothetical 9.3, but not in 9.2.[01] because in those it
is masked by that other problem which has just been fixed.

I'll try it again in b1346822f3048ede254647f3a46 just to be sure, but
I'm already fairly sure.

Cheers,

Jeff


-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs


Re: [BUGS] PITR potentially broken in 9.2

2012-11-28 Thread Jeff Janes
On Wed, Nov 28, 2012 at 7:51 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Heikki Linnakangas hlinnakan...@vmware.com writes:
 On 28.11.2012 06:27, Noah Misch wrote:
 I observed a similar problem with 9.2.  Despite a restore_command that
 failed every time, startup from a hot backup completed.  At the time, I
 suspected a mistake on my part.

 I believe this was caused by this little typo/thinko:

 Is this related at all to the problem discussed over at
 http://archives.postgresql.org/pgsql-general/2012-11/msg00709.php
 ?  The conclusion-so-far in that thread seems to be that an error
 ought to be thrown for recovery_target_time earlier than the
 backup stop time, but one is not getting thrown.

It is not directly related.  That thread was about 9.1.6.

In the newly fixed 9_2_STABLE, that problem still shows up the same as
it does in 9.1.6.

(In 9.2.1, recovery sometimes blows up before that particular problem
could be detected, which is what led me here in the first place--that
is the extent of the relationship AFAIK)

To see this one, follow the instructions in my previous email, but set
recovery_target_time to a time just after the end of the
pg_start_backup checkpoint, rather than just before it, and turn on
hot_standby

Cheers,

Jeff


-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs


[BUGS] PITR potentially broken in 9.2

2012-11-27 Thread Jeff Janes
When doing PITR in 9.2.1, the system claims that it reached a consistent
recovery state immediately after redo starts.
This leads to various mysterious failures, when it should instead
throw a "requested recovery stop point is before consistent recovery
point" error.
(If you are unlucky, I think it might even silently start up in a
corrupt state.)
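
The missing safeguard is conceptually simple.  Here is a standalone
sketch in plain C (not the actual xlog.c logic) of what should happen
when recovery ends; because 9.2.1 computes the consistent point wrongly,
the error branch never fires:

/*
 * Sketch of the end-of-recovery check: if replay stopped before the
 * point at which the cluster first becomes consistent (the backup's
 * stop point), recovery must fail rather than open the cluster.
 */
#include <stdio.h>
#include <stdlib.h>

typedef unsigned long long Lsn; /* stand-in for XLogRecPtr */

static void
finish_recovery(Lsn stopped_at, Lsn consistent_at)
{
    if (stopped_at < consistent_at)
    {
        fprintf(stderr, "FATAL: requested recovery stop point is before "
                "consistent recovery point\n");
        exit(1);
    }
    printf("recovery complete, cluster is consistent\n");
}

int
main(void)
{
    finish_recovery(200, 100);  /* stopped past consistency: fine */
    finish_recovery(100, 200);  /* must fail, as 9.2.1 does not */
    return 0;
}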

This seems to have been introduced in:
commit 8366c7803ec3d0591cf2d1226fea1fee947d56c3
Author: Simon Riggs si...@2ndquadrant.com
Date:   Wed Jan 25 18:02:04 2012 +

Allow pg_basebackup from standby node with safety checking.
Base backup follows recommended procedure, plus goes to great
lengths to ensure that partial page writes are avoided.

Jun Ishizuka and Fujii Masao, with minor modifications


the backup file:

START WAL LOCATION: 1/CD89E48 (file 00010001000C)
STOP WAL LOCATION: 1/1AFA11A0 (file 00010001001A)
CHECKPOINT LOCATION: 1/188D8120
BACKUP METHOD: pg_start_backup
BACKUP FROM: master
START TIME: 2012-11-27 09:40:13 PST
LABEL: label
STOP TIME: 2012-11-27 09:42:10 PST

(The file 00010001000C was archived at 9:37.)

recovery.conf:
restore_command = 'cp /tmp/archivedir/%f %p'
recovery_target_time = '2012-11-27 09:38:00 PST'


Log file:

22110  2012-11-27 09:49:15.220 PST LOG:  database system was
interrupted; last known up at 2012-11-27 09:40:13 PST
22110  2012-11-27 09:49:15.235 PST LOG:  starting point-in-time
recovery to 2012-11-27 09:38:00-08
22110  2012-11-27 09:49:15.271 PST LOG:  restored log file
000100010018 from archive
22110  2012-11-27 09:49:15.367 PST LOG:  restored log file
00010001000C from archive
22110  2012-11-27 09:49:15.372 PST LOG:  redo starts at 1/CD89E48
22110  2012-11-27 09:49:15.374 PST LOG:  consistent recovery state
reached at 1/CD8B7F0
22110  2012-11-27 09:49:15.490 PST LOG:  restored log file
00010001000D from archive
22110  2012-11-27 09:49:15.775 PST LOG:  restored log file
00010001000E from archive
22110  2012-11-27 09:49:16.078 PST LOG:  restored log file
00010001000F from archive
22110  2012-11-27 09:49:16.345 PST LOG:  restored log file
000100010010 from archive
22110  2012-11-27 09:49:16.533 PST LOG:  recovery stopping before
commit of transaction 951967, time 2012-11-27 09:38:00.000689-08
22110  2012-11-27 09:49:16.533 PST LOG:  redo done at 1/10F41900
22110  2012-11-27 09:49:16.533 PST LOG:  last completed transaction
was at log time 2012-11-27 09:37:59.998496-08
22110  2012-11-27 09:49:16.537 PST LOG:  selected new timeline ID: 2
22110  2012-11-27 09:49:16.584 PST LOG:  archive recovery complete
22113  2012-11-27 09:49:16.599 PST LOG:  checkpoint starting:
end-of-recovery immediate wait
22113  2012-11-27 09:49:17.815 PST LOG:  checkpoint complete: wrote
8336 buffers (12.7%); 0 transaction log file(s) added, 0 removed, 0
recycled; write=0.097 s, sync=1.115 s, total=1.230 s; sync files=14,
longest=0.578 s, average=0.079 s
22110  2012-11-27 09:49:17.929 PST FATAL:  could not access status of
transaction 1014015
22110  2012-11-27 09:49:17.929 PST DETAIL:  Could not read from file
pg_clog/ at offset 245760: Success.
22109  2012-11-27 09:49:17.932 PST LOG:  startup process (PID 22110)
exited with exit code 1
22109  2012-11-27 09:49:17.932 PST LOG:  terminating any other active
server processes

Cheers,

Jeff


-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs


Re: [BUGS] BUG #6068: automatic analyze runs endlessly

2011-06-19 Thread Jeff Janes
On 6/19/11, Tom Lane t...@sss.pgh.pa.us wrote:
 Jeff Janes jeff.ja...@gmail.com writes:
 Starting with commit b4b6923e03f4d29636a94f6f4cc2f5cf6298b8c8,
 Fix VACUUM so that it always updates pg_class.reltuples/relpages.

 After running make installcheck, the tables regression.public.slow_emp4000
 and regression.public.fast_emp4000 get analyzed once a minute even though
 they have no activity.  None of the other installcheck tables, just those
 two.

 This has no direct negative effect that I know of, but it seems to be a
 worrying symptom that pg_class or some related entity may not be getting
 updated correctly.

 Hmm ... seems worrisome, all right, but I don't see that happening here.
 At least I don't see any indication of it in
 pg_stat_all_tables.last_autoanalyze.  Are you looking at that, or some
 other evidence?  Do you have any nondefault settings?


Just log_autovacuum_min_duration=0.

That covers both questions: the autovacuum log entries it produces are
the evidence, and it is the only nondefault setting.

Cheers, Jeff

-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs


[BUGS] BUG #6068: automatic analyze runs endlessly

2011-06-18 Thread Jeff Janes

The following bug has been logged online:

Bug reference:  6068
Logged by:  Jeff Janes
Email address:  jeff.ja...@gmail.com
PostgreSQL version: 9.1beta1
Operating system:   Linux
Description:automatic analyze runs endlessly
Details: 

Starting with commit b4b6923e03f4d29636a94f6f4cc2f5cf6298b8c8,
Fix VACUUM so that it always updates pg_class.reltuples/relpages.

After running make installcheck, the tables regression.public.slow_emp4000
and regression.public.fast_emp4000 get analyzed once a minute even though
they have no activity.  None of the other installcheck tables, just those
two.

This has no direct negative effect that I know of, but it seems to be a
worrying symptom that pg_class or some related entity may not be getting
updated correctly.

-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs


Re: [BUGS] BUG #5157: Hash index not concurrency safe

2009-11-01 Thread Jeff Janes
On Sun, Nov 1, 2009 at 8:52 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Jeff Janes jeff.ja...@gmail.com writes:
 Hash index is not concurrency safe, starting in REL8_4_0 and up to HEAD.

 Ouch.  This used to be okay, because adding new entries to a hash page
 always added them at the end.  The 8.4 changes to keep individual hash
 pages sorted by hashcode broke it :-(.

 I think we could recover by having the hashgettuple code path
 re-synchronize by looking for the heap TID it previously returned.
 That must be at the same or higher index TID as we had stopped at.
 (Deletions are not possible, so we only have to search forward,
 and the TID must be there someplace.)

Can it get pushed to another page (an overflow page)?  My quick
reading of the code suggests it can't get pushed, which makes the fix
easier.
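
To illustrate that approach outside the server, here is a standalone toy
model in C.  Plain ints stand in for index entries, and matching on the
value stands in for matching on the heap TID (which is what the real fix
must do, since hashcodes can repeat within a page):

/*
 * Toy model of re-synchronizing a scan: remember the last item
 * returned, and after concurrent inserts may have shifted items within
 * the (sorted) page, re-find it by searching forward from the old
 * position.
 */
#include <assert.h>
#include <stdio.h>

#define PAGE_CAP 16

typedef struct
{
    int items[PAGE_CAP];    /* kept sorted, like a hash page sorted by hashcode */
    int nitems;
} Page;

/* Inserts cannot move an existing item backward, so search forward only. */
static int
resync(const Page *p, int old_pos, int last_returned)
{
    int pos;

    for (pos = old_pos; pos < p->nitems; pos++)
    {
        if (p->items[pos] == last_returned)
            return pos;
    }
    assert(!"last-returned item must still be present");
    return -1;
}

static void
insert_sorted(Page *p, int val)
{
    int i = p->nitems++;    /* toy code: no overflow check */

    while (i > 0 && p->items[i - 1] > val)
    {
        p->items[i] = p->items[i - 1];
        i--;
    }
    p->items[i] = val;
}

int
main(void)
{
    Page p = {{10, 30, 50}, 3};
    int pos = 1;            /* scan paused after returning items[1] == 30 */
    int last = p.items[pos];

    insert_sorted(&p, 20);  /* concurrent insert shifts 30 to items[2] */
    pos = resync(&p, pos, last);
    printf("scan resumes at position %d (item %d)\n", pos, p.items[pos]);
    return 0;
}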

I'll work on a fix for it.  But if 8.4.2 is coming out in the next
couple of weeks and we want the fix to be in it, then we might want
someone more proficient than me to work on it.

Cheers,

Jeff

-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs


[BUGS] BUG #5157: Hash index not concurrency safe

2009-10-31 Thread Jeff Janes

The following bug has been logged online:

Bug reference:  5157
Logged by:  Jeff Janes
Email address:  jeff.ja...@gmail.com
PostgreSQL version: 8.4.1
Operating system:   Linux
Description:Hash index not concurrency safe
Details: 

Hash index is not concurrency safe, starting in REL8_4_0 and up to HEAD.

T1:

create table foo (id int, x text);
create index asdlfkjsdf on foo using hash (id);
insert into foo select 1, 'xxx' from generate_series(1,100);
set enable_seqscan =off;
set enable_bitmapscan =off;
\timing on
select count(pg_sleep(.3)) from foo where id=1;
 count
---
   100
(1 row)

Time: 30897.835 ms

select count(pg_sleep(.3)) from foo where id=1;

While that is running, switch to T2:
insert into foo select generate_series, 'xxx' from
generate_series(1,10);

Back in T1:

 count
---
 8
(1 row)

Time: 2474.709 ms


The pg_sleep is simply there to give me time to force the two transactions
to overlap.

The problem is that hashgettuple releases the buffer content share lock when
it returns a tuple, so when it comes back to get another tuple the block may
have been rearranged by concurrent inserts.  But the state of the scan
object, specifically so->hashso_curpos, makes no attempt to detect or
correct for this rearrangement.

-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs


[BUGS] BUG #4965: missing tests in tools/fsync/test_fsync.c

2009-08-05 Thread Jeff Janes

The following bug has been logged online:

Bug reference:  4965
Logged by:  Jeff Janes
Email address:  jeff.ja...@gmail.com
PostgreSQL version: 8.4.0
Operating system:   Linux
Description:missing tests in tools/fsync/test_fsync.c
Details: 

In the part that implements "Compare file sync methods with one 8k
write", the #ifdef OPEN_SYNC_FLAG code is nested within the #ifdef
OPEN_DATASYNC_FLAG code.

This causes o_sync to be skipped if o_dsync is unavailable, but only in
this particular section (the section with two 8k writes doesn't have this
problem).

Also, the statement that prints the "Compare file sync methods with one
8k write" section title sits inside the #ifdef block of a previous
section, where it might be omitted on systems without o_sync.
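
Schematically, the fix is to make the two #ifdef blocks siblings and to
print the section title unconditionally.  A compilable sketch of that
structure (not the actual test_fsync.c code):

/* Sketch of the corrected structure, not the actual test_fsync.c. */
#include <stdio.h>

int
main(void)
{
    /* Title printed unconditionally, not from inside a prior #ifdef. */
    printf("Compare file sync methods with one 8k write:\n");

#ifdef OPEN_DATASYNC_FLAG
    printf("\topen_datasync timing would run here\n");
#endif

#ifdef OPEN_SYNC_FLAG           /* sibling of, not nested inside, the above */
    printf("\topen_sync timing would run here\n");
#endif

    return 0;
}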

-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs


[BUGS] BUG #4952: commit_delay ignored because CountActiveBackends always returns zero

2009-07-29 Thread Jeff Janes

The following bug has been logged online:

Bug reference:  4952
Logged by:  Jeff Janes
Email address:  jeff.ja...@gmail.com
PostgreSQL version: 8.4.0
Operating system:   Linux 2.4.21-15.0.3.ELsmp
Description:commit_delay ignored because CountActiveBackends always
returns zero
Details: 

I've found that commit_delay never has an effect.

By instrumenting src/backend/access/transam/xact.c, I see that this is
because CountActiveBackends always returns zero.  

This seems to be a bug introduced by: 
http://archives.postgresql.org/pgsql-committers/2009-03/msg00259.php

into file src/backend/storage/ipc/procarray.c

I believe the source of the bug is the addition to that file of

+   if (proc != NULL)
+   continue;

The sense of this test should be inverted: it is NULLs, not non-NULLs,
that need to be skipped.

Due to this test, all truly active backends get skipped, so
CountActiveBackends() always returns zero.  I also suspect, though I have
not reproduced it, that the change fails to correct the (hard to
reproduce) bug it was originally introduced to fix, since a proc that is
NULL does not get skipped and goes on to be dereferenced.

If I change this code to:

+   if (proc == NULL)
+   continue;

Then I find that commit_delay now does have an effect. 

(The effect is to make pgbench -c 15 slower when commit_delay is set to
the max value of 10.  This is what I thought would happen, and was
surprised when it originally did not.)
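
For reference, here is a minimal standalone model of the corrected loop;
the real CountActiveBackends() in procarray.c is more involved than this
sketch, but the NULL handling is the point:

/*
 * Model of the corrected loop: NULL slots are skipped, every valid slot
 * is counted, and no NULL pointer is ever dereferenced.
 */
#include <stddef.h>
#include <stdio.h>

typedef struct
{
    int pid;                    /* stand-in for the real PGPROC contents */
} Proc;

static int
count_active_backends(Proc *const *procs, int n)
{
    int count = 0;
    int i;

    for (i = 0; i < n; i++)
    {
        Proc *proc = procs[i];

        if (proc == NULL)       /* the broken code had proc != NULL here */
            continue;
        count++;
    }
    return count;
}

int
main(void)
{
    Proc a = {101};
    Proc b = {102};
    Proc *slots[] = {&a, NULL, &b};

    printf("active backends: %d\n", count_active_backends(slots, 3));
    return 0;
}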

thanks,

Jeff

-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs