Re: [HACKERS] [CORE] postpone next week's release

2015-05-29 Thread Noah Misch
On Fri, May 29, 2015 at 04:01:00PM -0400, Tom Lane wrote:
> Stephen Frost  writes:
> > * Bruce Momjian (br...@momjian.us) wrote:
> >> I am unclear if we are anywhere near ready for beta1 even in June.  Are
> >> we?
> 
> > I'm all about having that discussion...  but can we do it on another
> > thread or at least wait til we've decided about the back-branch
> > releases?  They are clearly the more important issue to consider.
> 
> It's the same discussion though, ie what releases are we expecting to
> get out in the next couple of weeks.

+1 for Stephen's thought to decide about back-branch releases first and to
Magnus's sentiment upthread that beta has to stand back while we schedule
them.  In other words, the feedback between these two scheduling decisions
ought to be one-way: bringing today's supported branches to a state we can be
content about deserves first pick from the calendar.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [CORE] [HACKERS] postpone next week's release

2015-05-29 Thread Andres Freund
On May 29, 2015 9:08:07 PM PDT, Tom Lane  wrote:
>I think your position is completely nuts. 

Yeehaa.

> The GROUPING SETS code is
>desperately in need of testing.  The custom-plan code is desperately
>in need of fixing and testing.  The multixact code is desperately
>in need of testing.  

And the array/plpgsql changes and upsert, and...

Andres

--- 
Please excuse brevity and formatting - I am writing this on my mobile phone.




Re: [CORE] [HACKERS] postpone next week's release

2015-05-29 Thread Andres Freund
On May 29, 2015 8:56:40 PM PDT, Robert Haas  wrote:
>On Fri, May 29, 2015 at 6:33 PM, Andres Freund 
>wrote:
>> On 2015-05-29 18:02:36 -0400, Robert Haas wrote:
>>> Well, I think we ought to take at least a few weeks to try to do a
>bit
>>> of code review and clean up what we can from the open items list.
>>
>> Why? A large portion of the input required to go from beta towards a
>> release is from actual users. To see when things break, what confuses
>> them and such.
>
>I have two concerns:
>
>1. I'm concerned that once we release beta, any idea about reverting a
>feature or fixing something that is broken will get harder, because
>people will say "well, we can't do that after we've released a beta".
>I confess to particularly wanting a solution to the item listed as
>"custom-join has no way to construct Plan nodes of child Path nodes",
>the history of which I'll avoid recapitulating until I'm sure I can do
>it while maintaining my blood pressure at safe levels.

I think we should just document that this is a beta and that changes are to be
expected. And have a release candidate once that's not the case.

I agree that it'd be very good if the custom join issue gets fixed. But I don't
see a beta prohibiting it.  Independently from that, I'm going to ask a Citus
colleague to make sure that pg-shard can use this.


>2. Also, if we're going to make significant multixact-related changes
>to 9.5 to try to improve reliability, as you proposed on the other
>thread, then it would be nice to do that before beta, so that it gets
>tested.  Of course, someone is bound to point out that we could make
>those changes in time for beta2, and people could test that.  But in
>practice I think that'll just mean that stuff is only out there for
>let's say 2 months before we put it in a major release, which ain't
>much.


There seems to be enough other stuff in dire need of testing that I don't think
that's sufficient cause, even though I understand the sentiment.

Andres

--- 
Please excuse brevity and formatting - I am writing this on my mobile phone.




Re: [CORE] [HACKERS] postpone next week's release

2015-05-29 Thread Tom Lane
Robert Haas  writes:
> On Fri, May 29, 2015 at 6:33 PM, Andres Freund  wrote:
>> Why? A large portion of the input required to go from beta towards a
>> release is from actual users. To see when things break, what confuses
>> them and such.

> I have two concerns:

> 1. I'm concerned that once we release beta, any idea about reverting a
> feature or fixing something that is broken will get harder, because
> people will say "well, we can't do that after we've released a beta".
> I confess to particularly wanting a solution to the item listed as
> "custom-join has no way to construct Plan nodes of child Path nodes",
> the history of which I'll avoid recapitulating until I'm sure I can do
> it while maintaining my blood pressure at safe levels.

> 2. Also, if we're going to make significant multixact-related changes
> to 9.5 to try to improve reliability, as you proposed on the other
> thread, then it would be nice to do that before beta, so that it gets
> tested.  Of course, someone is bound to point out that we could make
> those changes in time for beta2, and people could test that.  But in
> practice I think that'll just mean that stuff is only out there for
> let's say 2 months before we put it in a major release, which ain't
> much.

I think your position is completely nuts.  The GROUPING SETS code is
desperately in need of testing.  The custom-plan code is desperately
in need of fixing and testing.  The multixact code is desperately
in need of testing.  The open-items list has several other problems
besides those.  All of those problems are independent.  If we insist
on tackling them serially rather than in parallel, 9.5 might not come
out till 2017.

I agree that we are not in a position to promise features won't change.
So let's call it an alpha not a beta --- but for heaven's sake let's
try to move forward on all these issues, not just some of them.

regards, tom lane




Re: [HACKERS] [Proposal] More Vacuum Statistics

2015-05-29 Thread Alvaro Herrera
Andres Freund wrote:
> On 2015-05-29 21:30:57 -0500, Jim Nasby wrote:
> > It occurs to me that there's no good reason for vacuum-derived stats to be
> > in the stats file; it's not like users run vacuum anywhere near as often as
> > other commands. Its stats could be kept in pg_class; we're already keeping
> > things like relallvisible there.
> 
> While it might be viable to store them somewhere other than the stats files, I
> don't think pg_class is a good place. Its size is not any less critical
> than the stats files. I.e. reading it sits in several rather hot paths,
> and we keep tuples from it in memory in a lot of places.

Greg Smith had this idea about "timing events",
https://www.postgresql.org/message-id/50A4BC4E.4030007%402ndQuadrant.com
Sounds like this thread is related.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Re: [GENERAL] 9.4.1 -> 9.4.2 problem: could not access status of transaction 1

2015-05-29 Thread Alvaro Herrera
Bruce Momjian wrote:

> I think we need to step back and look at the brain power required to
> unravel the mess we have made regarding multi-xact and fixes.  (I bet
> few people can even remember which multi-xact fixes went into which
> releases --- I can't.)  Instead of working on actual features, we are
> having to do this complex diagnosis because we didn't do a thorough
> analysis at the time a pattern of multi-xact bugs started to appear. 
> Many projects deal with this compound bug debt regularly, but we have
> mostly avoided this fate.

Well, it's pretty obvious that if we had had a glimpse of the nature of
the issues back then, we wouldn't have committed the patch.  The number
of ends that we left loose we now know to be huge, but we didn't know
that back then.  (I, at least, certainly didn't.)

Simon told me when this last one showed up that what we need at this
point is a way to turn the whole thing off to stop it from affecting
users anymore.  I would love to be able to do that, because the whole
situation has become stressful, but I don't see a way.  Heck, if we
could implement Heikki's TED idea or something similar, I would be up
for back-patching it so that people can pg_upgrade from postgresql-9.3
to postgresql-ted-9.3, and just forget any further multixact pain.
Don't think that's really doable, though.  As far as I can see, for
branches 9.3 and 9.4 the best we can do is soldier on and get these bugs
fixed, hoping that this time they are really the last [serious] ones.

For 9.5, I concur with Andres that we'd do well to change the way
truncations are done by WAL-logging more stuff and keep more data in
pg_control, to avoid all these nasty games.  And for 9.6, find a better
representation of the data so that the durable data is stored separately
from the volatile data.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Re: [GENERAL] 9.4.1 -> 9.4.2 problem: could not access status of transaction 1

2015-05-29 Thread Alvaro Herrera
Andres Freund wrote:

> I considered for a second whether the solution for that could be to not
> truncate while inconsistent - but I think that doesn't solve anything as
> then we can end up with directories where every single offsets/member
> file exists.

Hang on a minute.  We don't need to scan any files to determine the
truncate point for offsets; we have the valid range for them in
pg_control, as nextMulti and oldestMulti.  And using those end points, we
can look for the offsets corresponding to each, and determine the member
files corresponding to the whole set; it doesn't matter what other files
exist, we just remove them all.  In other words, maybe we can get away
with considering truncation separately for offset and members on
recovery: do it like today for offsets (i.e. at each restartpoint), but
do it only in TrimMultiXact for members.

One argument against this idea is that we may not want to keep a full
set of member files on standbys (due to disk space usage), but that's
what will happen unless we truncate during replay.
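
A rough sketch of the idea above: given only the two end points kept in pg_control, decide which offsets segments are still needed, without looking at what the directory contains. The arithmetic assumes the default 8 kB block size (2048 four-byte offsets per page, 32 pages per SLRU segment). The function names are illustrative, not PostgreSQL's, and the corner case where both end points land in the same segment after a nearly complete wraparound is ignored here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define OFFSETS_PER_PAGE   2048u   /* assumes BLCKSZ = 8192 and 4-byte offsets */
#define PAGES_PER_SEGMENT  32u
#define MULTIS_PER_SEGMENT (OFFSETS_PER_PAGE * PAGES_PER_SEGMENT)
#define MAX_SEGMENT        (UINT32_MAX / MULTIS_PER_SEGMENT)

/* Offsets segment number that holds the entry for a given multixact ID. */
static uint32_t offsets_segment(uint32_t multi)
{
    return multi / MULTIS_PER_SEGMENT;
}

/* True if segment 'seg' lies in the range spanned by [oldest_multi, next_multi]. */
static bool offsets_segment_is_live(uint32_t seg,
                                    uint32_t oldest_multi,
                                    uint32_t next_multi)
{
    assert(seg <= MAX_SEGMENT);

    uint32_t lo = offsets_segment(oldest_multi);
    uint32_t hi = offsets_segment(next_multi);

    if (lo <= hi)
        return seg >= lo && seg <= hi;
    /* The live range wraps around the end of the 32-bit ID space. */
    return seg >= lo || seg <= hi;
}
```

Any offsets file whose segment number fails this test could be removed regardless of how it got there; the member files would then be derived from the offsets stored at the two end points.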

> I think at least for 9.5+ we should a) invent proper truncation records
> for pg_multixact b) start storing oldestValidMultiOffset in pg_control.
> The current hack of scanning the directories to get knowledge we should
> have is a pretty bad hack, and we should not continue using it forever.
> I think we might end up needing to do a) even in the backbranches.

Definitely agree with WAL-logging truncations; also +1 on backpatching
that to 9.3.  We already have experience with adding extra WAL records
on minor releases, and it didn't seem to have bitten too hard.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [CORE] [HACKERS] postpone next week's release

2015-05-29 Thread Robert Haas
On Fri, May 29, 2015 at 6:33 PM, Andres Freund  wrote:
> On 2015-05-29 18:02:36 -0400, Robert Haas wrote:
>> Well, I think we ought to take at least a few weeks to try to do a bit
>> of code review and clean up what we can from the open items list.
>
> Why? A large portion of the input required to go from beta towards a
> release is from actual users. To see when things break, what confuses
> them and such.

I have two concerns:

1. I'm concerned that once we release beta, any idea about reverting a
feature or fixing something that is broken will get harder, because
people will say "well, we can't do that after we've released a beta".
I confess to particularly wanting a solution to the item listed as
"custom-join has no way to construct Plan nodes of child Path nodes",
the history of which I'll avoid recapitulating until I'm sure I can do
it while maintaining my blood pressure at safe levels.

2. Also, if we're going to make significant multixact-related changes
to 9.5 to try to improve reliability, as you proposed on the other
thread, then it would be nice to do that before beta, so that it gets
tested.  Of course, someone is bound to point out that we could make
those changes in time for beta2, and people could test that.  But in
practice I think that'll just mean that stuff is only out there for
let's say 2 months before we put it in a major release, which ain't
much.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Re: [GENERAL] 9.4.1 -> 9.4.2 problem: could not access status of transaction 1

2015-05-29 Thread Thomas Munro
On Sat, May 30, 2015 at 1:46 PM, Andres Freund  wrote:
> On 2015-05-29 15:08:11 -0400, Robert Haas wrote:
>> It seems pretty clear that we can't effectively determine anything
>> about member wraparound until the cluster is consistent.
>
> I wonder if this doesn't actually hint at a bigger problem.  Currently,
> to determine where we need to truncate SlruScanDirectory() is
> used. That, afaics, could actually be a problem during recovery when
> we're not consistent.
>
> Consider the scenario where a very large database is copied while
> running. At the start of the backup we'll determine at which checkpoint
> recovery will start and store it in the label. After that the copy will
> start, copying everything slowly. That works because we expect recovery
> to fix things up.  The problem I see WRT multixacts is that the copied
> state of pg_multixact could be wildly different from the one at the
> label's checkpoint. During recovery, before reaching the first
> checkpoint, we'll create multixact files as used at the time of the
> checkpoint. But the rest of pg_multixact may be ahead 2**31 xacts.

Yes, I think the code in TruncateMultiXact that scans for the earliest
multixact only works when the segment files span at most 2^31 of
multixact space. If they span more than that, MultiXactIdPrecedes is
no longer able to provide a total ordering, so the result of the scan may be wrong,
depending on the order that it encounters the files.
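
To make the ordering problem concrete, here is a minimal sketch of a wraparound-aware comparison of 32-bit IDs in the style of MultiXactIdPrecedes (a signed test on the modulo-2^32 difference). The names mxid_precedes and scan_for_earliest and the sample IDs are made up for illustration; this is not the PostgreSQL source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * True if a logically precedes b under modulo-2^32 arithmetic.  Testing the
 * top bit of (a - b) is equivalent to (int32_t)(a - b) < 0 on
 * two's-complement platforms.
 */
static bool mxid_precedes(uint32_t a, uint32_t b)
{
    return ((uint32_t)(a - b) & UINT32_C(0x80000000)) != 0;
}

/*
 * Walk the IDs in the order given and keep whichever one "precedes" the
 * current candidate, the way a directory scan for the oldest segment would.
 */
static uint32_t scan_for_earliest(const uint32_t *ids, size_t n)
{
    assert(n > 0);

    uint32_t earliest = ids[0];
    for (size_t i = 1; i < n; i++)
    {
        if (mxid_precedes(ids[i], earliest))
            earliest = ids[i];
    }
    return earliest;
}
```

With the IDs 0, 0x50000000 and 0xA0000000, which together span more than 2^31, each one precedes the next and the last precedes the first. Scanning them in that order reports 0xA0000000 as the earliest, while scanning in the reverse order reports 0.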

Incidentally, your description of that scenario gave me an idea for
how to reproduce a base backup that 9.4.2 or master can't start.  I
tried this first:

1.  Set up with max_wal_senders = 1, wal_level = hot_standby, initdb
2.  Create enough multixacts to fill a couple of segments in
pg_multixact/offsets using "explode_mxact_members 99 1000" (create
foo table first)
3.  Start a base backup with logs, but break in
src/backend/replication/basebackup.c after
sendFileWithContent(BACKUP_LABEL_FILE, labelfile); and before sending
the contents of the data dir (including pg_multixact)... (or just put
a big sleep in there)
4.  UPDATE pg_database SET datallowconn = true; vacuumdb --freeze
--all; CHECKPOINT;, see that offsets/ is now gone and
oldestMultiXid is 98001 in pg_control
5.  ... allow the server backend to continue; the basebackup completes.

Inspecting the new data directory, I see that offsets/ is not
present as expected, and pg_control contains the oldestMultiXid 98001.

Since pg_control was copied after pg_multixact and my database didn't
move between those copies, it points to a valid multixact (unlike the
pg_upgrade scenario) and is able to start up, but does something
different again which may or may not be good, I'm not sure:

LOG:  database system was interrupted; last known up at 2015-05-30 14:30:23 NZST
LOG:  file "pg_multixact/offsets/" doesn't exist, reading as zeroes
LOG:  redo starts at 0/728
LOG:  consistent recovery state reached at 0/70C8898
LOG:  redo done at 0/70C8898
LOG:  last completed transaction was at log time 2015-05-30 14:30:17.261436+12
LOG:  database system is ready to accept connections

My next theory about how to get a FATAL during startup is something
like this:  Break in basebackup.c in between copying pg_multixact and
copying pg_control (simulating a very large/slow file copy, perhaps if
'base' happens to get copied after 'pg_multixact', though I don't
know if that's possible), and while it's stopped, generate some
offsets segments, vacuum --freeze --all, checkpoint and then create a
few more multixacts, then checkpoint again (so that oldestMultiXact is
not equal to nextMultiXact).  Continue.  pg_control's
oldestMultiXactId now points at a segment file that didn't exist when
pg_multixact was copied.  I haven't managed to get this to work (ie
produce a FATAL) and I'm out of time for a little while, but wanted to
share this idea in case it helps someone.

-- 
Thomas Munro
http://www.enterprisedb.com




Re: [HACKERS] Re: [GENERAL] 9.4.1 -> 9.4.2 problem: could not access status of transaction 1

2015-05-29 Thread Robert Haas
On Fri, May 29, 2015 at 9:46 PM, Andres Freund  wrote:
> On 2015-05-29 15:08:11 -0400, Robert Haas wrote:
>> It seems pretty clear that we can't effectively determine anything
>> about member wraparound until the cluster is consistent.
>
> I wonder if this doesn't actually hint at a bigger problem.  Currently,
> to determine where we need to truncate SlruScanDirectory() is
> used. That, afaics, could actually be a problem during recovery when
> we're not consistent.

I agree.  I actually meant to mention this in my previous email, but,
owing to exhaustion and burnout, didn't.

> I think at least for 9.5+ we should a) invent proper truncation records
> for pg_multixact b) start storing oldestValidMultiOffset in pg_control.
> The current hack of scanning the directories to get knowledge we should
> have is a pretty bad hack, and we should not continue using it forever.
> I think we might end up needing to do a) even in the backbranches.

That may be the right thing to do.  I'm concerned that changing the
behavior of master too much will make every subsequent fix twice as
hard, because we'll have to do one fix in master and another fix in
the back-branches.  I'm also concerned that it will create even more
convoluted failure scenarios. The failure-to-start problem discussed
on this thread requires a chain of four (maybe three) different
PostgreSQL versions in order to create it, and the more things we go
change, the harder it's going to be to reason about this stuff.

The diseased and rotting elephant in the room here is that clusters
with bogus relminmxid, datminmxid, and/or oldestMultiXid values may
exist in the wild and we really have no plan to get rid of them.
78db307bb may have helped somewhat - although I haven't grokked what
it's about well enough to be sure - but it's certainly not a complete
solution, as this bug report itself illustrates rather well.  Unless
we figure out some clever solution that is not now apparent to me, or
impose a hard pg_upgrade compatibility break at some point, we
basically can't count on pg_control's "oldest multixact" information
to be correct ever again.  We may be running into clusters 15 years
from now that have problems that are just holdovers from what was
fixed in 9.3.5.

One thing I think we should definitely do is add one or two additional
fields to pg_controldata that get filled in by pg_upgrade.  One of
them should be "the oldest known catversion in the lineage of this
cluster" and the other should be "the most recent catversion in the
lineage of this cluster before this one".   Or maybe we should store
PG_VERSION_NUM values.  Or store both things.  I think that would make
troubleshooting this kind of problem a lot easier - just from the
pg_controldata output, you'd be able to tell whether the cluster had
been pg_upgraded, whether it had been pg_upgraded once or multiple
times, and at least some of the versions involved, without relying on
the user's memory of what they did and when.  Fortunately, Steve
Kellet had a pretty clear idea of what his history was, but not all
users know that kind of thing, and I've wanted it more than once while
troubleshooting.
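
As a sketch of how such lineage fields might behave, assuming two hypothetical catversion slots that only pg_upgrade writes (the struct, field and function names are invented for illustration and are not part of pg_control):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical lineage fields; 0 means "this cluster was never pg_upgraded". */
typedef struct
{
    uint32_t oldest_catversion;    /* oldest known catversion in the lineage */
    uint32_t previous_catversion;  /* catversion of the cluster upgraded from */
} ClusterLineage;

/* Compute the new cluster's lineage from the old cluster's lineage. */
static ClusterLineage lineage_after_upgrade(ClusterLineage old_lineage,
                                            uint32_t old_catversion)
{
    assert(old_catversion != 0);

    ClusterLineage result;
    result.oldest_catversion = (old_lineage.oldest_catversion != 0)
        ? old_lineage.oldest_catversion
        : old_catversion;
    result.previous_catversion = old_catversion;
    return result;
}

/*
 * Two different catversions in the lineage imply at least two upgrades.
 * Repeated upgrades within one catversion cannot be told apart this way.
 */
static bool upgraded_more_than_once(ClusterLineage lineage)
{
    return lineage.oldest_catversion != 0 &&
           lineage.oldest_catversion != lineage.previous_catversion;
}
```

Storing PG_VERSION_NUM values instead, as suggested above, would work the same way with different numbers in the two slots.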

Another thing I think we should do is add a field to pg_class that is
propagated by pg_upgrade and stores the most recent PG_VERSION_NUM
that is known to have performed a scan_all vacuum of the table.  This
would allow us to do things in the future like (a) force a full-table
vacuum of any table that hasn't been vacuumed since $BUGGYRELEASE or
(b) advise users to manually inspect the values and manually perform
said vacuum or (c) only believe that certain information about a table
is accurate if it's been full-scanned by a vacuum newer than
$BUGGYRELEASE.  It could also be used as part of a strategy for
reclaiming HEAP_MOVED_IN/HEAP_MOVED_OFF; e.g. you can't upgrade to
10.5, which repurposes those bits, unless you've done a scan_all
vacuum on every table with a release new enough to guarantee that
they're not used for their historical purpose.
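
Option (a) boils down to a version comparison. A minimal sketch, assuming the per-table field holds a PG_VERSION_NUM-style integer in the three-part scheme of the 9.x releases (9.3.5 is 90305) and 0 when no scan_all vacuum has been recorded; the function names are hypothetical:

```c
#include <assert.h>
#include <stdbool.h>

/* PG_VERSION_NUM for a three-part 9.x-style version, e.g. 9.3.5 -> 90305. */
static int version_num(int major, int minor, int patch)
{
    return major * 10000 + minor * 100 + patch;
}

/*
 * A table needs a forced full-table vacuum if its last recorded scan_all
 * vacuum was done by the buggy release or anything older (or never).
 */
static bool needs_scan_all_vacuum(int last_scan_all_version, int buggy_release)
{
    assert(buggy_release > 0);
    return last_scan_all_version <= buggy_release;
}
```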

> This problem isn't conflicting with most of the fixes you describe, so
> I'll continue with reviewing those.

Thank you.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Re: [GENERAL] 9.4.1 -> 9.4.2 problem: could not access status of transaction 1

2015-05-29 Thread Robert Haas
On Fri, May 29, 2015 at 3:08 PM, Robert Haas  wrote:
> It won't fix the fact that pg_upgrade is putting
> a wrong value into everybody's datminmxid field, which should really
> be addressed too, but I've been working on this for about three days
> virtually non-stop and I don't have the energy to tackle it right now.
> If anyone feels the urge to step into that breach, I think what it
> needs to do is: when upgrading from a 9.3-or-later instance, copy over
> each database's datminmxid into the corresponding database in the new
> cluster.

Bruce was kind enough to spend some time on IM with me this afternoon,
and I think this may actually be OK.  What pg_upgrade does is:

1. First, put next-xid into the relminmxid for all tables, including
catalog tables.  This is the correct behavior for upgrades from a
pre-9.3 release, and is correct for catalog tables in general.

2. Next, restoring the schema dump will set the relminmxid values for
all non-catalog tables to the value dumped from the old cluster.  At
this point, everything is fine provided that we are coming from a
release 9.3 or newer.  But if the old cluster is pre-9.3, it will have
dumped *zero* values for all of its relminmxid values; so all of the
user tables go from the correct value they had after step 1 to an
incorrect value.

3. Finally, if the old cluster is pre-9.3, repeat step 1, undoing the
damage done in step 2.

This is a bit convoluted, but I don't know of a reason why it
shouldn't work.  Sorry for the false alarm.
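
The three steps can be traced with a toy model. This is only a simulation of the sequence described above, with invented types and values, not pg_upgrade code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct
{
    bool     is_catalog;
    uint32_t old_relminmxid;  /* value in the old cluster (meaningless pre-9.3) */
    uint32_t relminmxid;      /* value in the new cluster */
} ToyTable;

static void simulate_upgrade(ToyTable *tables, size_t n,
                             bool old_is_pre_93, uint32_t next_value)
{
    assert(n == 0 || tables != NULL);

    /* Step 1: stamp every table, catalogs included, with the next value. */
    for (size_t i = 0; i < n; i++)
        tables[i].relminmxid = next_value;

    /* Step 2: the schema restore overwrites user tables with dumped values;
     * a pre-9.3 cluster dumps zero for all of them. */
    for (size_t i = 0; i < n; i++)
    {
        if (!tables[i].is_catalog)
            tables[i].relminmxid = old_is_pre_93 ? 0 : tables[i].old_relminmxid;
    }

    /* Step 3: coming from pre-9.3, repeat step 1 to undo the zeroes. */
    if (old_is_pre_93)
    {
        for (size_t i = 0; i < n; i++)
            tables[i].relminmxid = next_value;
    }
}
```

Coming from 9.3 or later, a user table keeps its dumped value and catalogs get the next value; coming from pre-9.3, every table ends up with the next value and none is left at zero.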

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




[HACKERS] session_replication_role origin vs local

2015-05-29 Thread Peter Eisentraut
Does anyone know what the difference between the
session_replication_role settings of 'origin' vs 'local' is supposed to
be?  AFAICT, the code treats them the same and has done since this
feature was initially introduced.
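
A simplified model of the behaviour described above, to show what "treats them the same" means in practice. The enum and function names are invented for this sketch; it is not the trigger-firing code itself.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum { ROLE_ORIGIN, ROLE_REPLICA, ROLE_LOCAL } ReplicationRole;

typedef enum
{
    TRIG_DISABLED,    /* DISABLE TRIGGER */
    TRIG_ON_ORIGIN,   /* plain ENABLE TRIGGER */
    TRIG_ON_REPLICA,  /* ENABLE REPLICA TRIGGER */
    TRIG_ALWAYS       /* ENABLE ALWAYS TRIGGER */
} TriggerMode;

static bool trigger_fires(ReplicationRole role, TriggerMode mode)
{
    assert(role == ROLE_ORIGIN || role == ROLE_REPLICA || role == ROLE_LOCAL);

    if (mode == TRIG_DISABLED)
        return false;
    if (mode == TRIG_ALWAYS)
        return true;
    if (role == ROLE_REPLICA)
        return mode == TRIG_ON_REPLICA;

    /* 'origin' and 'local' both end up here: only the replica role is special. */
    return mode == TRIG_ON_ORIGIN;
}
```

In this model no trigger mode distinguishes the two settings, so any difference would have to come from something outside the server that inspects the setting.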




[HACKERS] cannot set view triggers to replica

2015-05-29 Thread Peter Eisentraut
It appears to be an omission that ALTER TABLE ... ENABLE TRIGGER and
similar commands don't allow acting on views, even though we now have
triggers on views.

Similarly, the ALTER TABLE ... ENABLE RULE commands only allow acting on
tables, even though rules can also exist on views and materialized views.

(Why don't we allow rules on foreign tables?  Is that intentional?)

Attached is a sample patch.  It appears we don't have any regression
tests for this.
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 84dbee0..e530953 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -3341,13 +3341,18 @@ ATPrepCmd(List **wqueue, Relation rel, AlterTableCmd *cmd,
 		case AT_DisableTrig:	/* DISABLE TRIGGER variants */
 		case AT_DisableTrigAll:
 		case AT_DisableTrigUser:
-			ATSimplePermissions(rel, ATT_TABLE | ATT_FOREIGN_TABLE);
+			ATSimplePermissions(rel, ATT_TABLE | ATT_FOREIGN_TABLE | ATT_VIEW);
 			pass = AT_PASS_MISC;
 			break;
 		case AT_EnableRule:		/* ENABLE/DISABLE RULE variants */
 		case AT_EnableAlwaysRule:
 		case AT_EnableReplicaRule:
 		case AT_DisableRule:
+			ATSimplePermissions(rel, ATT_TABLE | ATT_MATVIEW | ATT_VIEW);
+			/* These commands never recurse */
+			/* No command-specific prep needed */
+			pass = AT_PASS_MISC;
+			break;
 		case AT_AddOf:			/* OF */
 		case AT_DropOf: /* NOT OF */
 		case AT_EnableRowSecurity:



Re: [HACKERS] [Proposal] More Vacuum Statistics

2015-05-29 Thread Andres Freund
On 2015-05-29 21:30:57 -0500, Jim Nasby wrote:
> It occurs to me that there's no good reason for vacuum-derived stats to be
> in the stats file; it's not like users run vacuum anywhere near as often as
> other commands. Its stats could be kept in pg_class; we're already keeping
> things like relallvisible there.

While it might be viable to store them somewhere other than the stats files, I
don't think pg_class is a good place. Its size is not any less critical
than the stats files. I.e. reading it sits in several rather hot paths,
and we keep tuples from it in memory in a lot of places.

Greetings,

Andres




Re: [HACKERS] [Proposal] More Vacuum Statistics

2015-05-29 Thread Jim Nasby

On 5/28/15 9:14 AM, Tom Lane wrote:
> Naoya Anzai  writes:
>> In my experience so far, I have an idea that we can add
>> 2 new vacuum statistics into pg_stat_xxx_tables.
>
> Adding new stats in that way requires adding per-table counters, which
> bloat the statistics files (especially in databases with very many tables).
> I do not think you've made a case for these stats being valuable enough
> to justify such overhead for everybody.

It occurs to me that there's no good reason for vacuum-derived stats to
be in the stats file; it's not like users run vacuum anywhere near as
often as other commands. Its stats could be kept in pg_class; we're
already keeping things like relallvisible there.

> As far as the first one goes, I don't even think it's especially useful.
> There might be value in tracking the times of the last few vacuums on a
> table, but knowing the time for only the latest one doesn't sound like it
> would prove much.  So I'd be inclined to think more along the lines of
> scanning the postmaster log for autovacuum runtimes, instead of squeezing
> it into the pg_stats views.

You'd also want to know how many pages were scanned, since any decent
estimation would need to take table size into account.

As for history, that's a problem that exists for *all* our statistics,
so anyone that cares about that is going to set up some system to
periodically capture the contents of pg_stat_*.

> A possible alternative so far as the second one goes is to add a function
> (perhaps in contrib/pg_freespacemap) that simply runs through a table's
> VM and counts the number of set bits.  This would be more accurate (no
> risk of lost counter updates) and very possibly cheaper overall: it would
> take longer to find out the number when you wanted it, but you wouldn't be
> paying the distributed overhead of tracking it when you didn't want it.

Seems like a reasonable addition to that contrib module regardless. As
Jeff Janes mentioned this info is available in pg_class, but it requires
an ANALYZE to update it.
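
The counting itself is cheap. A minimal sketch, assuming the visibility map is available as a plain byte array with one bit per heap page; a real function would read the VM pages through the buffer manager, and the name count_set_bits is made up:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Count the bits set in a bitmap, clearing the lowest set bit each round. */
static size_t count_set_bits(const uint8_t *map, size_t nbytes)
{
    assert(nbytes == 0 || map != NULL);

    size_t count = 0;
    for (size_t i = 0; i < nbytes; i++)
    {
        uint8_t b = map[i];
        while (b != 0)
        {
            b &= (uint8_t)(b - 1);  /* drops the lowest set bit */
            count++;
        }
    }
    return count;
}
```

The cost is paid only when someone asks for the number, which is the trade-off argued for in the quoted text.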

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [HACKERS] nested loop semijoin estimates

2015-05-29 Thread Tomas Vondra



On 05/30/15 01:20, Tomas Vondra wrote:
> Notice the cost - it's way lower than the previous plan (9.2 vs
> ~111k), yet this plan was not chosen. So either the change broke
> something (e.g. by violating some optimizer assumption), or maybe
> there's a bug somewhere else ...


After a bit more investigation, what I think is happening here is 
add_path() does not realize this is a SEMI join and a single tuple is 
enough, and discards the simple indexscan path in favor of the bitmap 
index scan as that seems cheaper when scanning everything.


So not really a bug, but maybe 'consider_startup' would help?
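
A toy version of the comparison, to show why the cheap-to-start path gets thrown away. This is far simpler than the real add_path(), the costs are made-up numbers, and ToyPath and dominates are invented names:

```c
#include <assert.h>
#include <stdbool.h>

typedef struct
{
    double startup_cost;  /* cost to produce the first tuple */
    double total_cost;    /* cost to produce all tuples */
} ToyPath;

/*
 * True if path a makes path b redundant.  Without consider_startup only the
 * total cost counts; with it, b survives if it starts returning rows sooner.
 */
static bool dominates(ToyPath a, ToyPath b, bool consider_startup)
{
    assert(a.startup_cost <= a.total_cost && b.startup_cost <= b.total_cost);

    if (a.total_cost > b.total_cost)
        return false;
    if (consider_startup && a.startup_cost > b.startup_cost)
        return false;
    return true;
}
```

With a plain index scan at startup 0.5 / total 111000 and a bitmap scan at startup 2000 / total 60000, the bitmap scan dominates on total cost alone, but the index scan is kept once startup cost is considered, which is what a semijoin needing a single tuple would want.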


--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Re: [GENERAL] 9.4.1 -> 9.4.2 problem: could not access status of transaction 1

2015-05-29 Thread Andres Freund
On 2015-05-29 15:08:11 -0400, Robert Haas wrote:
> It seems pretty clear that we can't effectively determine anything
> about member wraparound until the cluster is consistent.

I wonder if this doesn't actually hint at a bigger problem.  Currently,
to determine where we need to truncate SlruScanDirectory() is
used. That, afaics, could actually be a problem during recovery when
we're not consistent.

Consider the scenario where a very large database is copied while
running. At the start of the backup we'll determine at which checkpoint
recovery will start and store it in the label. After that the copy will
start, copying everything slowly. That works because we expect recovery
to fix things up.  The problem I see WRT multixacts is that the copied
state of pg_multixact could be wildly different from the one at the
label's checkpoint. During recovery, before reaching the first
checkpoint, we'll create multixact files as used at the time of the
checkpoint. But the rest of pg_multixact may be ahead 2**31 xacts.

For clog and such that's not a problem because the truncation
etc. points are all stored in WAL - during recovery we just replay the
truncations that happened on the master, there's no need to look at the
data directory. And we won't access the clog before being consistent.

But for multixacts it is different. To avoid ending up with
pg_multixact/*/* directories we have to do truncations during
recovery. As there are currently no truncation records we have to do that
by scanning the data directory. But that state could be "from the future".

I considered for a second whether the solution for that could be to not
truncate while inconsistent - but I think that doesn't solve anything as
then we can end up with directories where every single offsets/member
file exists.  We could possibly try to fix that by always truncating
away slru segments in offsets that we know to be too old to exist in a
valid database. But achieving the same for members fries my brain.  It
also seems awfully risky.

I think at least for 9.5+ we should a) invent proper truncation records
for pg_multixact b) start storing oldestValidMultiOffset in pg_control.
The current hack of scanning the directories to get knowledge we should
have is a pretty bad hack, and we should not continue using it forever.
I think we might end up needing to do a) even in the backbranches.


Am I missing something?


This problem isn't conflicting with most of the fixes you describe, so
I'll continue with reviewing those.


Greetings,

Andres Freund


Re: [HACKERS] Re: [GENERAL] 9.4.1 -> 9.4.2 problem: could not access status of transaction 1

2015-05-29 Thread Andres Freund
On 2015-05-29 15:49:53 -0400, Bruce Momjian wrote:
> I think we need to step back and look at the brain power required to
> unravel the mess we have made regarding multi-xact and fixes.  (I bet
> few people can even remember which multi-xact fixes went into which
> releases --- I can't.)  Instead of working on actual features, we are
> having to do this complex diagnosis because we didn't do a thorough
> analysis at the time a pattern of multi-xact bugs started to appear. 
> Many projects deal with this compound bug debt regularly, but we have
> mostly avoided this fate.

What are the consequences of that observation you're imagining?


Re: [CORE] [HACKERS] postpone next week's release

2015-05-29 Thread Stephen Frost
* Andres Freund (and...@anarazel.de) wrote:
> On 2015-05-29 18:02:36 -0400, Robert Haas wrote:
> > Well, I think we ought to take at least a few weeks to try to do a bit
> > of code review and clean up what we can from the open items list.
> 
> Why? A large portion of the input required to go from beta towards a
> release is from actual users. To see when things break, what confuses
> them and such.
> 
> I don't see why that requires that there are no minor entries in the
> open items list - and that's what currently is on it.  Neither does it
> seem to be a problem to do code review concurrently to user beta
> testing.  We obviously can't start a beta if things crash left and
> right, but I don't think that's the situation right now?

Agreed.

Thanks!

Stephen


[HACKERS] nested loop semijoin estimates

2015-05-29 Thread Tomas Vondra

Hi,

while looking at this post from pgsql-performance about plan changes

http://www.postgresql.org/message-id/flat/20150529095117.gb15...@hjp.at

I noticed that initial_cost_nestloop() does this (9.1, mentioned in
the pgsql-performance post, uses the same logic):



    if (jointype == JOIN_SEMI || jointype == JOIN_ANTI)
    {
        double      outer_matched_rows;
        Selectivity inner_scan_frac;

        run_cost += inner_run_cost;

        outer_matched_rows
            = rint(outer_path_rows * semifactors->outer_match_frac);
        inner_scan_frac = 2.0 / (semifactors->match_count + 1.0);

        if (outer_matched_rows > 1)
            run_cost += (outer_matched_rows - 1)
                * inner_rescan_run_cost * inner_scan_frac;

        ...
    }


I wonder whether the

run_cost += inner_run_cost;

is actually correct, because this pretty much means we assume scanning 
the whole inner relation (once). Wouldn't something like this be more 
appropriate?


run_cost += inner_run_cost * inner_scan_frac;

i.e. only counting the proportional part of the inner run cost, just 
like we do for the rescans (i.e. until the first match)?


Imagine a simple EXISTS() query that gets turned into a semijoin, with
an inner index scan. The current code pretty much means we'll scan the
whole result, even though we only really need a single matching row
(to confirm the EXISTS clause).


For cheap index scans (e.g. using a PK) this does not make much 
difference, but the example posted to pgsql-performance results in index 
scans matching ~50% of the table, which makes the whole index scan quite 
expensive, and much higher than the rescan cost (multiplied by 
inner_scan_frac).
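
To put rough numbers on that, here is a small Python sketch of the two
formulas. It models only the lines quoted above (the elided "..." part is
ignored), the function name and the scale_first_scan switch are made up for
illustration, and the input values are hypothetical rather than taken from
the planner.

```python
def semijoin_run_cost(outer_path_rows, outer_match_frac, match_count,
                      inner_run_cost, inner_rescan_run_cost,
                      scale_first_scan=False):
    """Run cost contribution of the quoted SEMI/ANTI branch only.

    scale_first_scan=False follows the quoted code; True applies the
    proposed change (first inner scan charged proportionally as well).
    """
    inner_scan_frac = 2.0 / (match_count + 1.0)
    # round() uses round-half-to-even, like rint() in the default mode
    outer_matched_rows = round(outer_path_rows * outer_match_frac)

    first_scan = inner_run_cost
    if scale_first_scan:
        first_scan *= inner_scan_frac
    run_cost = first_scan

    if outer_matched_rows > 1:
        run_cost += ((outer_matched_rows - 1)
                     * inner_rescan_run_cost * inner_scan_frac)
    return run_cost


# Hypothetical inputs: one outer row, a very large number of inner matches.
args = dict(outer_path_rows=1, outer_match_frac=1.0, match_count=4870496.0,
            inner_run_cost=310000.0, inner_rescan_run_cost=310000.0)

current = semijoin_run_cost(**args)
proposed = semijoin_run_cost(scale_first_scan=True, **args)
print(current, proposed)
```

With a single outer row the rescan term never applies, so under these
assumptions the whole difference comes from how the first inner scan is
charged.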


While investigating the pgsql-performance report I've prepared the
attached testcase, which produces a similar data set. So let's see
how the code change impacts plan choice.


With the current code, the planner produces a plan like this:

                                QUERY PLAN
--------------------------------------------------------------------------
 Nested Loop  (cost=169627.96..169636.03 rows=1 width=74)
   Join Filter: ((t.term)::text = (f.berechnungsart)::text)
   ->  Index Scan using term_facttablename_columnname_idx on term t  (cost=0.55..8.57 rows=1 width=74)
         Index Cond: (((facttablename)::text = 'facttable_stat_fta4'::text) AND ((columnname)::text = 'berechnungsart'::text))
   ->  HashAggregate  (cost=169627.41..169627.43 rows=2 width=2)
         Group Key: (f.berechnungsart)::text
         ->  Seq Scan on facttable_stat_fta4 f  (cost=0.00..145274.93 rows=9740993 width=2)

which seems slightly inefficient, as ~50% of the facttable_stat_fta4 
matches the condition (so building the whole hash is a waste of time). 
Also notice this is a regular join, not a semijoin - that seems to 
confirm the planner does not realize the cost difference, believes both 
plans (regular and semijoin) are equally expensive and simply uses the 
first one.


With the run_cost change, I do get this plan:

                                QUERY PLAN
--------------------------------------------------------------------------
 Nested Loop Semi Join  (cost=111615.33..111623.42 rows=1 width=74)
   ->  Index Scan using term_facttablename_columnname_idx on term t  (cost=0.55..8.57 rows=1 width=74)
         Index Cond: (((facttablename)::text = 'facttable_stat_fta4'::text) AND ((columnname)::text = 'berechnungsart'::text))
   ->  Bitmap Heap Scan on facttable_stat_fta4 f  (cost=111614.78..220360.98 rows=4870496 width=2)
         Recheck Cond: ((berechnungsart)::text = (t.term)::text)
         ->  Bitmap Index Scan on facttable_stat_fta4_berechnungsart_idx  (cost=0.00..110397.15 rows=4870496 width=0)
               Index Cond: ((berechnungsart)::text = (t.term)::text)

and this runs about 3x faster than the original plan (700ms vs.
2000ms), at least when using the testcase dataset on my laptop.


This however is not the whole story, because after disabling the bitmap 
index scan, I do get this plan, running in ~1ms (so ~700x faster than 
the bitmap index scan):


                                QUERY PLAN
--------------------------------------------------------------------------
 Nested Loop Semi Join  (cost=0.99..9.20 rows=1 width=74)
   ->  Index Scan using term_facttablename_columnname_idx on term t  (cost=0.55..8.57 rows=1 width=74)
         Index Cond: (((facttablename)::text = 'facttable_stat_fta4'::text) AND ((columnname)::text = 'berechnungsart'::text))
   ->  Index Only Scan using facttable_stat_fta4_berechnungsart_idx on facttable_stat_fta4 f  (cost=0.43..310525.02 rows=4870496 width=2)
         Index Cond: (berechnungsart = (t.term)::text)

Notice the cost - it's way lower than the previous plan (9.2 vs ~111k), 
yet this plan was not chosen. So either the 

Re: [HACKERS] Re: [GENERAL] 9.4.1 -> 9.4.2 problem: could not access status of transaction 1

2015-05-29 Thread Andres Freund
On 2015-05-30 10:55:30 +1200, Thomas Munro wrote:
> That's the error message, but then further down:

Ooops.

> "I have confirmed that directory "pg_multixact/members" does not
> existing in the restored data directory.
>
> I can see this directory and the file if i restore a few days old
> backup. I have used WAL-E backups a number of time from this server
> but this is the first time I am running into this issue."

That sounds like a WAL-E problem rather than a postgres one then. I
think it's exceedingly unlikely - although I perhaps should be careful
with that given the recent history - that postgres contains a bug that
deletes an entire directory on the standby/while cloning.


Re: [HACKERS] Re: [GENERAL] 9.4.1 -> 9.4.2 problem: could not access status of transaction 1

2015-05-29 Thread Thomas Munro
On Sat, May 30, 2015 at 10:48 AM, Andres Freund  wrote:
> On 2015-05-30 10:41:01 +1200, Thomas Munro wrote:
>> On Sat, May 30, 2015 at 10:29 AM, Robert Haas  wrote:
>> > On Fri, May 29, 2015 at 5:14 PM, Josh Berkus  wrote:
>> >> Just saw what looks like a report of this issue on 9.2.
>> >>
>> >> https://github.com/wal-e/wal-e/issues/177
>> >
>> > Urk.  That looks awfully similar, but I don't think any of the code
>> > that is affected here exists in 9.2, or that any of the fixes involved
>> > were back-patched to 9.2.  So that might be something else altogether.
>>
>> Not only that, the pg_multixact/members *directory* is reported
>> missing, which is a different problem entirely.
>
> I don't read that from the log, what does make you think that that's the
> case. Afaics it's "just" the 0146 member file that is missing?

That's the error message, but then further down:

"I have confirmed that directory "pg_multixact/members" does not
existing in the restored data directory.

I can see this directory and the file if i restore a few days old
backup. I have used WAL-E backups a number of time from this server
but this is the first time I am running into this issue."

-- 
Thomas Munro
http://www.enterprisedb.com


Re: [HACKERS] Re: [GENERAL] 9.4.1 -> 9.4.2 problem: could not access status of transaction 1

2015-05-29 Thread Andres Freund
On 2015-05-30 10:41:01 +1200, Thomas Munro wrote:
> On Sat, May 30, 2015 at 10:29 AM, Robert Haas  wrote:
> > On Fri, May 29, 2015 at 5:14 PM, Josh Berkus  wrote:
> >> Just saw what looks like a report of this issue on 9.2.
> >>
> >> https://github.com/wal-e/wal-e/issues/177
> >
> > Urk.  That looks awfully similar, but I don't think any of the code
> > that is affected here exists in 9.2, or that any of the fixes involved
> > were back-patched to 9.2.  So that might be something else altogether.
> 
> Not only that, the pg_multixact/members *directory* is reported
> missing, which is a different problem entirely.

I don't read that from the log; what makes you think that's the
case? Afaics it's "just" the 0146 member file that is missing?


[HACKERS] Join Filter vs. Index Cond (performance regression 9.1->9.2+/HEAD)

2015-05-29 Thread Andrew Gierth
This is distilled down from a performance regression problem that came
past on IRC earlier today:

create table t1 (a integer, b integer, c integer, primary key (a,b,c));
create table t2 (k2 integer, a integer, primary key (k2,a));
create table t3 (k3 integer, b integer, primary key (k3,b));
create table t4 (k4 integer, c integer, primary key (k4,c));
insert into t1 select i,i,i from generate_series(1,1000,20) i;
insert into t1 select 2,2,i from generate_series(1,500) i;
insert into t2 select i,i from generate_series(1,1000) i;
insert into t3 select i,i from generate_series(1,1000) i;
insert into t4 select i,i from generate_series(1,1000) i;
analyze;

explain analyze
  select * from t4
   left join t3 on (t4.c=t3.k3)
   left join t2 on (t3.b=t2.k2)
   left join t1 on (t1.a=t2.a and t1.b=t3.b and t1.c=t4.c)
   where t4.k4=2;

The plan for this on 9.4.2 comes out like this:

 Nested Loop Left Join  (cost=1.10..17.28 rows=1 width=36) (actual time=0.089..0.448 rows=1 loops=1)
   Join Filter: (t1.c = t4.c)
   Rows Removed by Join Filter: 499
   ->  Nested Loop Left Join  (cost=0.83..16.94 rows=1 width=24) (actual time=0.056..0.059 rows=1 loops=1)
         ->  Nested Loop Left Join  (cost=0.55..16.60 rows=1 width=16) (actual time=0.044..0.046 rows=1 loops=1)
               ->  Index Only Scan using t4_pkey on t4  (cost=0.28..8.29 rows=1 width=8) (actual time=0.024..0.025 rows=1 loops=1)
                     Index Cond: (k4 = 2)
                     Heap Fetches: 1
               ->  Index Only Scan using t3_pkey on t3  (cost=0.28..8.29 rows=1 width=8) (actual time=0.011..0.012 rows=1 loops=1)
                     Index Cond: (k3 = t4.c)
                     Heap Fetches: 1
         ->  Index Only Scan using t2_pkey on t2  (cost=0.28..0.33 rows=1 width=8) (actual time=0.010..0.011 rows=1 loops=1)
               Index Cond: (k2 = t3.b)
               Heap Fetches: 1
   ->  Index Only Scan using t1_pkey on t1  (cost=0.28..0.33 rows=1 width=12) (actual time=0.025..0.281 rows=500 loops=1)
         Index Cond: ((a = t2.a) AND (b = t3.b))
         Heap Fetches: 500

Whereas 9.1 gives this:

 Nested Loop Left Join  (cost=0.00..33.12 rows=1 width=36) (actual time=0.074..0.096 rows=1 loops=1)
   ->  Nested Loop Left Join  (cost=0.00..24.83 rows=1 width=24) (actual time=0.054..0.069 rows=1 loops=1)
         ->  Nested Loop Left Join  (cost=0.00..16.55 rows=1 width=16) (actual time=0.039..0.048 rows=1 loops=1)
               ->  Index Scan using t4_pkey on t4  (cost=0.00..8.27 rows=1 width=8) (actual time=0.020..0.022 rows=1 loops=1)
                     Index Cond: (k4 = 2)
               ->  Index Scan using t3_pkey on t3  (cost=0.00..8.27 rows=1 width=8) (actual time=0.009..0.011 rows=1 loops=1)
                     Index Cond: (t4.c = k3)
         ->  Index Scan using t2_pkey on t2  (cost=0.00..8.27 rows=1 width=8) (actual time=0.008..0.010 rows=1 loops=1)
               Index Cond: (t3.b = k2)
   ->  Index Scan using t1_pkey on t1  (cost=0.00..8.27 rows=1 width=12) (actual time=0.013..0.016 rows=1 loops=1)
         Index Cond: ((a = t2.a) AND (b = t3.b) AND (c = t4.c))

In the real example, the join filter in the 9.4.2 plan was discarding 40
million rows, not just 500, so the performance impact was quite serious.

Obviously it makes little sense to use an (a,b,c) index to look up just
(a,b) and then filter on c; the question is, what is the planner doing
that leads it to get this so wrong? Finding a workaround for it was not
easy, either - the only thing that I found that worked was replacing the
t1 join with a lateral join with an OFFSET 0 clause to nobble the
planner entirely.
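
For anyone who wants to see the effect without a server, here is a rough
Python sketch of the two access strategies on t1, using the same rows as the
test case above. A sorted list of tuples stands in for the (a,b,c) btree;
prefix_scan() is a made-up helper, not anything from the planner or executor.

```python
from bisect import bisect_left, bisect_right

# Same contents as the INSERTs into t1 above, kept sorted like a btree on (a, b, c).
t1 = sorted([(i, i, i) for i in range(1, 1001, 20)] +
            [(2, 2, i) for i in range(1, 501)])


def prefix_scan(index, prefix):
    """Return all index entries whose leading columns equal prefix."""
    pad = (float("inf"),) * (3 - len(prefix))
    lo = bisect_left(index, prefix)
    hi = bisect_right(index, prefix + pad)
    return index[lo:hi]


# 9.4.2-style plan: index condition on (a, b) only, then filter on c.
two_col = prefix_scan(t1, (2, 2))
kept = [row for row in two_col if row[2] == 2]
removed = len(two_col) - len(kept)

# 9.1-style plan: all three columns used as the index condition.
three_col = prefix_scan(t1, (2, 2, 2))

print(len(two_col), removed, len(three_col))
```

In this model the two-column lookup fetches 500 entries and discards 499 of
them, which lines up with "Rows Removed by Join Filter: 499" above, while the
three-column lookup touches a single entry.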

-- 
Andrew (irc:RhodiumToad)


Re: [HACKERS] Re: [GENERAL] 9.4.1 -> 9.4.2 problem: could not access status of transaction 1

2015-05-29 Thread Thomas Munro
On Sat, May 30, 2015 at 10:29 AM, Robert Haas  wrote:
> On Fri, May 29, 2015 at 5:14 PM, Josh Berkus  wrote:
>> Just saw what looks like a report of this issue on 9.2.
>>
>> https://github.com/wal-e/wal-e/issues/177
>
> Urk.  That looks awfully similar, but I don't think any of the code
> that is affected here exists in 9.2, or that any of the fixes involved
> were back-patched to 9.2.  So that might be something else altogether.

Not only that, the pg_multixact/members *directory* is reported
missing, which is a different problem entirely.

-- 
Thomas Munro
http://www.enterprisedb.com


Re: [CORE] [HACKERS] postpone next week's release

2015-05-29 Thread Andres Freund
On 2015-05-29 18:02:36 -0400, Robert Haas wrote:
> Well, I think we ought to take at least a few weeks to try to do a bit
> of code review and clean up what we can from the open items list.

Why? A large portion of the input required to go from beta towards a
release is from actual users. To see when things break, what confuses
them and such.

I don't see why that requires that there are no minor entries in the
open items list - and that's what currently is on it.  Neither does it
seem to be a problem to do code review concurrently to user beta
testing.  We obviously can't start a beta if things crash left and
right, but I don't think that's the situation right now?


Re: [HACKERS] Re: [GENERAL] 9.4.1 -> 9.4.2 problem: could not access status of transaction 1

2015-05-29 Thread Robert Haas
On Fri, May 29, 2015 at 5:14 PM, Josh Berkus  wrote:
> Just saw what looks like a report of this issue on 9.2.
>
> https://github.com/wal-e/wal-e/issues/177

Urk.  That looks awfully similar, but I don't think any of the code
that is affected here exists in 9.2, or that any of the fixes involved
were back-patched to 9.2.  So that might be something else altogether.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: [CORE] [HACKERS] postpone next week's release

2015-05-29 Thread Bruce Momjian
On Fri, May 29, 2015 at 05:37:13PM -0400, Tom Lane wrote:
> Bruce Momjian  writes:
> > Do we need release notes for an alpha?  Once I do the release notes, it
> > is possible to miss subtle changes in the code that aren't mentioned in
> > commit messages.
> 
> If the commit message isn't clear about something, you'd likely miss the
> issue anyway, no?  Anyway, once the release notes are in the tree, we

I often do research in the git tree to get details on the feature beyond
just looking at the commit or the patch.

> could expect that anyone committing a user-visible semantics change should
> update the release notes themselves.

Yes, that would be nice.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


Re: [HACKERS] RFC: Remove contrib entirely

2015-05-29 Thread Josh Berkus
On 05/29/2015 02:54 PM, Tom Lane wrote:
> Peter Geoghegan  writes:
>> The problem here is that these ranges are controlled by a
>> decentralized patchwork of national standards bodies, and the ranges
>> are always subject to revision. I think that it's egregious that
>> contrib/isn imagines it can track that with a static array.
> 
> Well, that module has already been rewritten once (which proves that
> there's an audience out there for it).  Perhaps somebody will rewrite it
> again to support a non-hardwired set of ranges.  Now that we have the
> concept of an extension configuration table, that'd be one possible
> way to fix it ...

FWIW, neither of the projects I know of which use ISBN has had any
issues with the ranges since the 2010 updates.  While the ranges can be
updated in theory, in practice it happens glacially slowly.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


Re: [HACKERS] RFC: Remove contrib entirely

2015-05-29 Thread Peter Geoghegan
On Fri, May 29, 2015 at 2:54 PM, Tom Lane  wrote:
> Well, that module has already been rewritten once (which proves that
> there's an audience out there for it).  Perhaps somebody will rewrite it
> again to support a non-hardwired set of ranges.  Now that we have the
> concept of an extension configuration table, that'd be one possible
> way to fix it ...

I wouldn't bother.

ISBNs already have a UPC-style weighted sum check digit that will
catch the vast majority of errors, including all transposition errors.
You'd have to try hard to fatfinger an ISBN in a way that produced
something that accidentally had a valid check digit.
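
As a rough illustration of how much the check digit catches, here is a
Python sketch of the two standard checks (ISBN-10: weights 10..1 modulo 11;
ISBN-13: alternating weights 1 and 3 modulo 10). The function names are made
up and this is not how contrib/isn is implemented. One caveat the sketch makes
visible: the modulo-11 scheme detects every swap of two different adjacent
digits, while the 1/3-weighted scheme misses swaps of adjacent digits that
differ by exactly 5.

```python
def isbn10_valid(s: str) -> bool:
    """ISBN-10 check: sum of weight * digit, weights 10..1, divisible by 11."""
    if len(s) != 10:
        return False
    total = 0
    for pos, ch in enumerate(s):
        if ch == "X" and pos == 9:      # 'X' stands for 10, last position only
            value = 10
        elif ch in "0123456789":
            value = int(ch)
        else:
            return False
        total += (10 - pos) * value
    return total % 11 == 0


def isbn13_valid(s: str) -> bool:
    """ISBN-13 check: alternating weights 1 and 3, sum divisible by 10."""
    if len(s) != 13 or any(ch not in "0123456789" for ch in s):
        return False
    total = sum(int(ch) * (1 if pos % 2 == 0 else 3)
                for pos, ch in enumerate(s))
    return total % 10 == 0


def adjacent_swaps(s: str):
    """Yield every string made by swapping two different adjacent characters."""
    for i in range(len(s) - 1):
        if s[i] != s[i + 1]:
            yield s[:i] + s[i + 1] + s[i] + s[i + 2:]


# Swapped variants that still pass the check, i.e. undetected typos.
missed10 = [t for t in adjacent_swaps("0306406152") if isbn10_valid(t)]
missed13 = [t for t in adjacent_swaps("9780306406157") if isbn13_valid(t)]
print(missed10, missed13)
```

For these two sample numbers no swapped ISBN-10 variant passes, and exactly
one swapped ISBN-13 variant does (the adjacent 6 and 1).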

contrib/isn suffers from a bad case of protecting against Machiavelli
rather than Murphy. The enforcement isn't just obviously wrong, it's
also ridiculous in principle.

You're right, though -- we have better things to worry about.
-- 
Peter Geoghegan


Re: [CORE] [HACKERS] postpone next week's release

2015-05-29 Thread Robert Haas
On Fri, May 29, 2015 at 4:37 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> I'm personally kind of astonished that we're even thinking about beta
>> so soon.  I mean, we at least need to go through the stuff listed
>> here, I think:
>> https://wiki.postgresql.org/wiki/PostgreSQL_9.5_Open_Items
>
> Well, maybe we ought to call it an alpha not a beta, but I think we ought
> to put out some kind of release that we can encourage people to test.
> What you are suggesting is that we serialize resolution of the known
> issues with discovery of new issues, and that's not an efficient use of
> time.  Especially seeing that we're approaching the summer season where
> we won't get much input at all.

Well, I think we ought to take at least a few weeks to try to do a bit
of code review and clean up what we can from the open items list.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: [HACKERS] RFC: Remove contrib entirely

2015-05-29 Thread Tom Lane
Peter Geoghegan  writes:
> The problem here is that these ranges are controlled by a
> decentralized patchwork of national standards bodies, and the ranges
> are always subject to revision. I think that it's egregious that
> contrib/isn imagines it can track that with a static array.

Well, that module has already been rewritten once (which proves that
there's an audience out there for it).  Perhaps somebody will rewrite it
again to support a non-hardwired set of ranges.  Now that we have the
concept of an extension configuration table, that'd be one possible
way to fix it ...

regards, tom lane


Re: [HACKERS] RFC: Remove contrib entirely

2015-05-29 Thread Peter Geoghegan
On Fri, May 29, 2015 at 2:35 PM, Tom Lane  wrote:
> It made us realize that extensions that create types
> that are physically equivalent to int8 or float8 were broken when we made
> those types potentially pass-by-value; we had to add a CREATE TYPE option
> to allow that to still work (cf commit 3f936aacc057e4b3).  If contrib/isn
> had not been around and been getting built by the buildfarm, we would have
> found that out only much later and with much more pain.

Interesting.

FWIW, my concerns with contrib/isn are limited to the ISBN type and
related types. These types enforce that ISBNs are within certain
ranges known by the module to be valid. The first patch I reviewed for
Postgres back in 2010 extended this range, and I first raised the
issue then -- how many such patches can we expect in the future?

The problem here is that these ranges are controlled by a
decentralized patchwork of national standards bodies, and the ranges
are always subject to revision. I think that it's egregious that
contrib/isn imagines it can track that with a static array.

Since contrib is a place that example code is supposed to live,
perhaps contrib/isn could be held up as an example of what not to
do...
-- 
Peter Geoghegan


Re: [CORE] [HACKERS] postpone next week's release

2015-05-29 Thread Tom Lane
Bruce Momjian  writes:
> Do we need release notes for an alpha?  Once I do the release notes, it
> is possible to miss subtle changes in the code that aren't mentioned in
> commit messages.

If the commit message isn't clear about something, you'd likely miss the
issue anyway, no?  Anyway, once the release notes are in the tree, we
could expect that anyone committing a user-visible semantics change should
update the release notes themselves.

regards, tom lane


Re: [HACKERS] RFC: Remove contrib entirely

2015-05-29 Thread Tom Lane
Josh Berkus  writes:
> On 05/29/2015 02:08 PM, Peter Geoghegan wrote:
>> I always liked the idea of organizing contrib along these lines.
>> 
>> I know that I will never be successful in convincing people to remove,
>> say, contrib/isn, which is total garbage, but the next best thing is
>> to categorize it in a way that sets expectations very low.

> Well, contrib/isn is still useful (I use it).  But there's no good
> reason it couldn't be on pgxn.

We already did one round of removal of low-grade contrib items.
Admittedly that was in 2006, and maybe some of the stuff that survived
that cut no longer looks good enough.  But I don't think there's all
that much low-hanging fruit there.

But let's get to the point: the real reason for keeping most of these
contrib modules in the core distribution is that they are essential test
cases for core's extensibility features.  contrib/isn is actually a good
example of that.  It made us realize that extensions that create types
that are physically equivalent to int8 or float8 were broken when we made
those types potentially pass-by-value; we had to add a CREATE TYPE option
to allow that to still work (cf commit 3f936aacc057e4b3).  If contrib/isn
had not been around and been getting built by the buildfarm, we would have
found that out only much later and with much more pain.

You could imagine some other way to address that, like generalizing the
buildfarm so that it can pull in extensions from other source repos for
testing purposes.  But that's going to be a lot of work and I'm not even
real sure we want it --- it'd increase the trust problem for buildfarm
owners quite a bit, for one thing.

I'm not particularly on board with renaming things just to get rid of the
term "contrib".  We have much better things to do with our time.

regards, tom lane


Re: [HACKERS] Re: [GENERAL] 9.4.1 -> 9.4.2 problem: could not access status of transaction 1

2015-05-29 Thread Steve Kehlet
On Fri, May 29, 2015 at 12:08 PM Robert Haas  wrote:

> OK, here's a patch.
>

I grabbed branch REL9_4_STABLE from git, and Robert got me a 9.4-specific
patch. I rebuilt, installed, and postgres started up successfully!  I did a
bunch of checks, had our app run several thousand SQL queries against it,
had a colleague check it out, and it looks good. Looking at top and ps, I
don't see anything funny (e.g. no processes spinning cpu, etc), things look
normal. Let me know if I can provide anything else.


Re: [CORE] [HACKERS] postpone next week's release

2015-05-29 Thread Andres Freund
On May 29, 2015 2:12:24 PM PDT, Bruce Momjian  wrote:
>On Fri, May 29, 2015 at 11:04:59PM +0200, Andres Freund wrote:
>> On 2015-05-29 16:37:00 -0400, Tom Lane wrote:
>> > Well, maybe we ought to call it an alpha not a beta, but I think we ought
>> > to put out some kind of release that we can encourage people to test.
>> 
>> I also do think it's important that we put out a beta (or alpha)
>> relatively soon. Both because we actually need input to find out what
>> works and what doesn't and also because it pushes us to tie up loose
>> ends.
>> 
>> A beta with open items isn't that bad a thing? There are many bigger
>> projects doing 4-8 beta releases before a major one; and most of them
>> have open items at the individual beta's release times.
>> 
>> I think we should define/document it so that there's no hard goal of
>> being compatible for beta releases and that the compatibility goal
>> starts with the first release candidate, and not the betas.
>
>Do we need release notes for an alpha?  Once I do the release notes, it
>is possible to miss subtle changes in the code that aren't mentioned in
>commit messages.

Yes I think so. Otherwise it's pretty useless for people not following closely. 
I see little point in explicitly delaying release note work any further.

Andres



--- 
Please excuse brevity and formatting - I am writing this on my mobile phone.


Re: [HACKERS] Re: [GENERAL] 9.4.1 -> 9.4.2 problem: could not access status of transaction 1

2015-05-29 Thread Josh Berkus
All,

Just saw what looks like a report of this issue on 9.2.

https://github.com/wal-e/wal-e/issues/177

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


Re: [HACKERS] RFC: Remove contrib entirely

2015-05-29 Thread Peter Geoghegan
On Fri, May 29, 2015 at 11:47 AM, Josh Berkus  wrote:
> A. Extra commands and tools which aren't considered general enough, or
> reliable enough, to be included by default, e.g. pg_standby, pgbench and
> vacuumlo.
>
> B. Developer tools, like spi, start-scripts, and oid2name.
>
> C. "Core Extensions", which fall into four further groups:
> C1: encryption extensions we can't include in core
> for legal reasons (pgcrypto)
> C2: example extensions which show useful things about
> how to build an extension
> C3: Admin extensions which are not core because they carry
> risks (e.g. pgstattuple, auto_explain)
> C4: Extensions which are generally useful, used, and
> maintained with Postgres (e.g. hstore, citext)

I always liked the idea of organizing contrib along these lines.

I know that I will never be successful in convincing people to remove,
say, contrib/isn, which is total garbage, but the next best thing is
to categorize it in a way that sets expectations very low.


-- 
Peter Geoghegan


Re: [CORE] [HACKERS] postpone next week's release

2015-05-29 Thread Bruce Momjian
On Fri, May 29, 2015 at 11:04:59PM +0200, Andres Freund wrote:
> On 2015-05-29 16:37:00 -0400, Tom Lane wrote:
> > Well, maybe we ought to call it an alpha not a beta, but I think we ought
> > to put out some kind of release that we can encourage people to test.
> 
> I also do think it's important that we put out a beta (or alpha)
> relatively soon. Both because we actually need input to find out what
> works and what doesn't and also because it pushes us to tie up loose
> ends.
> 
> A beta with open items isn't that bad a thing? There are many bigger
> projects doing 4-8 beta releases before a major one; and most of them
> have open items at the individual beta's release times.
> 
> I think we should define/document it so that there's no hard goal of
> being compatible for beta releases and that the compatibility goal
> starts with the first release candidate, and not the betas.

Do we need release notes for an alpha?  Once I do the release notes, it
is possible to miss subtle changes in the code that aren't mentioned in
commit messages.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


Re: [HACKERS] RFC: Remove contrib entirely

2015-05-29 Thread Josh Berkus
On 05/29/2015 02:08 PM, Peter Geoghegan wrote:
> On Fri, May 29, 2015 at 11:47 AM, Josh Berkus  wrote:
>> A. Extra commands and tools which aren't considered general enough, or
>> reliable enough, to be included by default, e.g. pg_standby, pgbench and
>> vacuumlo.
>>
>> B. Developer tools, like spi, start-scripts, and oid2name.
>>
>> C. "Core Extensions", which fall into three further groups:
>> C1: encryption extensions we can't include in core
>> for legal reasons (pg_crypto)
>> C2: example extensions which show useful things about
>> how to build an extension
>> C3: Admin extensions which are not core because they carry
>> risks (e.g. pgstattuple, auto_explain)
>> C4: Extensions which are generally useful, used, and
>> maintained with Postgres (e.g. hstore, citext)
> 
> I always liked the idea of organizing contrib along these lines.
> 
> I know that I will never be successful in convincing people to remove,
> say, contrib/isn, which is total garbage, but the next best thing is
> to categorize it in a way that sets expectations very low.

Well, contrib/isn is still useful (I use it).  But there's no good
reason it couldn't be on pgxn.


-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




[HACKERS] initdb -S versus superuser check and Windows restricted mode

2015-05-29 Thread Tom Lane
I noticed that if you use "initdb -S", the code does its thing and
exits without ever calling get_restricted_token().  It doesn't get
to get_id() where the no-superuser check is, either.  Is this OK,
or should we reorder the operations so that fsyncing is done with
the usual restricted privileges?

You could argue that it's harmless to let root do a bunch of fsyncs,
and that's probably true, but on the other hand this doesn't meet
our usual expectations that no significant PG code runs as root.

Thoughts?

regards, tom lane




Re: [CORE] [HACKERS] postpone next week's release

2015-05-29 Thread Andres Freund
On 2015-05-29 16:37:00 -0400, Tom Lane wrote:
> Well, maybe we ought to call it an alpha not a beta, but I think we ought
> to put out some kind of release that we can encourage people to test.

I also do think it's important that we put out a beta (or alpha)
relatively soon. Both because we actually need input to find out what
works and what doesn't and also because it pushes us to tie up loose
ends.

A beta with open items isn't that bad a thing? There are many bigger
projects doing 4-8 beta releases before a major one; and most of them
have open items at the individual betas' release times.

I think we should define/document it so that there's no hard goal of
being compatible for beta releases and that the compatibility goal
starts with the first release candidate, and not the betas.




[HACKERS] Re: 9.5 release notes may need ON CONFLICT DO NOTHING compatibility notice for FDW authors

2015-05-29 Thread Peter Geoghegan
On Mon, May 25, 2015 at 1:28 AM, Simon Riggs  wrote:
> My earlier summary was that the support for multiple constraints has been
> poorly thought through. This is an example of the breakage I have been
> complaining about when we are forced to specify the constraint
> (conflict-target).
>
> This is not just related to FDWs and should not be fixed solely for FDWs.
> This was already an open item for me in 9.5, now even more so.

I agree that the decision to change the current behavior has nothing
to do with FDWs. There is no reason to treat foreign tables
differently to local ones in this regard, which implies that ON
CONFLICT DO UPDATE cannot work with postgres_fdw unless and until
someone invents foreign constraints on foreign tables (I think), or
unless we change our mind generally (for other reasons). So,
certainly, the rationale for mandating (or not mandating) an inference
specification with ON CONFLICT DO UPDATE ought to come from balancing
concerns about safety, compatibility, flexibility, and so on.

I did not mean to imply that your comments were unreasonable/too late.
However, I don't see a lot of demand for changing the behavior. There
is at least some demand for accepting as arbiters multiple unique
constraints (that are not more or less equivalent), from Andres for
example, but that's a different question. It's also something that
could reasonably be added later.

-- 
Peter Geoghegan




Re: [HACKERS] Need Force flag for pg_drop_replication_slot()

2015-05-29 Thread Simon Riggs
On 29 May 2015 at 18:15, Josh Berkus  wrote:


> While I'm just doing this during testing


That part is good. I'm sure you will find something in need of improvement.

-- 
Simon Riggs  http://www.2ndQuadrant.com/

PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: [HACKERS] Re: 9.5 release notes may need ON CONFLICT DO NOTHING compatibility notice for FDW authors

2015-05-29 Thread Peter Geoghegan
On Thu, May 28, 2015 at 1:20 AM, Etsuro Fujita
 wrote:
> I think that those are interesting problems.  Wouldn't we need some
> additional hacks for the core or FDW to perform an operation that is
> equivalent to dynamically switching the ExecInsert/ExecForeignInsert
> processing to the ExecUpdate/ExecForeignUpdate processing in case of a
> conflict?

I did not imagine so. Rather, I thought that it was a matter of simply
introducing a way that foreign tables can have foreign constraints
recognizable by the local Postgres optimizer. The decision to insert
or update must belong to the foreign server, since the feature could
be useful for systems like MySQL, and not just Postgres. I may be
mistaken.

-- 
Peter Geoghegan




Re: [HACKERS] Need Force flag for pg_drop_replication_slot()

2015-05-29 Thread Simon Riggs
On 29 May 2015 at 18:15, Josh Berkus  wrote:


> pg_drop_replication_slot() can be a time-critical function when the
> master is running out of disk space because the replica is falling
> behind.  So I was a little startled by this:
>
> cio=# select
> pg_drop_replication_slot('bdr_24577_6147720645156311471_1_25383__');
> ERROR:  replication slot "bdr_24577_6147720645156311471_1_25383__" is
> already active
>
> You have to first terminate the replication connection before you can
> delete the slot ... and do it fast enough that the replica doesn't
> reconnect before you drop the slot.
>

Why would you not stop the receiver first, then drop the slot?

Dropping the slot destroys any chance you have of recovering the downstream
server, so should not be done lightly.

That sounds like a critical fail to me, so making it easier to do that
doesn't sound cool. I oppose this suggestion.


> While I'm just doing this during testing, it could be a critical fail in
> production.  I think the simplest way to resolve this would be to add a
> boolean flag to pg_drop_replication_slot(), which would terminate the
> replication connection and delete the slot as a single operation.
>

If you really want it you can write a function to do that for private use.
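
A minimal sketch of what such a private helper might look like, written
client-side in Python. It assumes a DB-API 2.0 cursor using the "%s"
parameter style (psycopg2 is one example) and the active_pid column that
pg_replication_slots has in 9.5. The function name is made up for
illustration, and nothing here is part of PostgreSQL itself.

```python
def force_drop_replication_slot(cur, slot_name):
    """Terminate the backend using slot_name (if any), then drop the slot.

    `cur` is assumed to be a DB-API 2.0 cursor with the "%s" paramstyle.
    Returns the pid that was terminated, or None if the slot was idle.
    """
    # active_pid is NULL when no connection is currently using the slot.
    cur.execute(
        "SELECT active_pid FROM pg_replication_slots WHERE slot_name = %s",
        (slot_name,),
    )
    row = cur.fetchone()
    if row is None:
        raise LookupError("no such replication slot: %s" % slot_name)

    active_pid = row[0]
    if active_pid is not None:
        # This only sends a signal; the backend may take a moment to exit.
        cur.execute("SELECT pg_terminate_backend(%s)", (active_pid,))

    # Still racy: the receiver can reconnect before this statement runs,
    # in which case the drop fails with the "already active" error.
    cur.execute("SELECT pg_drop_replication_slot(%s)", (slot_name,))
    return active_pid
```

Because the two steps are separate statements, this does not remove the
race described above; a caller would probably need to retry. Dropping the
slot remains destructive for the downstream server either way.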

-- 
Simon Riggs  http://www.2ndQuadrant.com/

PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: [HACKERS] RFC: Remove contrib entirely

2015-05-29 Thread Pavel Stehule
2015-05-29 21:59 GMT+02:00 Joshua D. Drake :

> On 05/29/2015 12:30 PM, Pavel Stehule wrote:
>
>>> Contrib made sense years ago. It does not any longer. Let's put the
>>> old horse down and raise a new herd of ponies on a new pasture.
>>
>> Still there is strong sense - it is a referential implementation of our
>> extension API. We need it to find regressions, changes. I don't believe
>
> No, then we need a proper test suite for the extension API.

maybe partially, but it is.

>> so external extensions can do it. Only PostGIS is massively accepted and
>> developed by more than few people. Personally I am thinking so removing
>> contrib is not good idea.
>
> Is there an extension/contrib module in the last decade that more than
> once has shown to help us with that?

As far as I know: the 9.5 transformations, for testing on more platforms.

It is hard to calculate how often the code in contrib helps, but for the
last four years no feature has been allowed to break the contrib tests, so
I believe it enforces better API stability.

It is hard to imagine designing and maintaining any extension API without
a platform like contrib. It can be renamed or divided, but something like
contrib must exist in the core code base if PostgreSQL is to be an
extensible database.


>
>
> Sincerely,
>
>
> JD
>
>
>
> --
> Command Prompt, Inc. - http://www.commandprompt.com/  503-667-4564
> PostgreSQL Centered full stack support, consulting and development.
> Announcing "I'm offended" is basically telling the world you can't
> control your own emotions, so everyone else should do it for you.
>


Re: [CORE] [HACKERS] postpone next week's release

2015-05-29 Thread Tom Lane
Robert Haas  writes:
> I'm personally kind of astonished that we're even thinking about beta
> so soon.  I mean, we at least need to go through the stuff listed
> here, I think:
> https://wiki.postgresql.org/wiki/PostgreSQL_9.5_Open_Items

Well, maybe we ought to call it an alpha not a beta, but I think we ought
to put out some kind of release that we can encourage people to test.
What you are suggesting is that we serialize resolution of the known
issues with discovery of new issues, and that's not an efficient use of
time.  Especially seeing that we're approaching the summer season where
we won't get much input at all.

regards, tom lane




Re: [HACKERS] Need Force flag for pg_drop_replication_slot()

2015-05-29 Thread Stephen Frost
* Josh Berkus (j...@agliodbs.com) wrote:
> On 05/29/2015 11:30 AM, Stephen Frost wrote:
> > I know how big my WAL partition is.  Let me tell PG how big it is and to
> > not do anything that'll end up going over that amount, and we'll never
> > see a crash due to out of disk space for WAL again.
> 
> H.  Do we have a clear idea anywhere in server memory how many WAL
> segments there are?

Why does it need to be in shared memory..?

Clearly, when we're looking at cleaning up the WAL files, we know if the
archive command is failing and what file we're trying to archive, or if
we're not able to recycle a given file because we have logical
replication slots that want it, etc.

We certainly know where we're currently at in the WAL stream and we know
how big each WAL file is..

We just need a knob to be able to say "alright, this WAL file might
still be desired by something, but we're running out of room for *new*
WAL and, therefore, that's just too bad for those processes that want it"
and recycle it anyway.  There are probably error conditions we have to
consider for replication slots when that happens, etc, but I don't think
we lack the info to make the decision, except for what value to set the
knob to, which is clearly system-dependent.
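
To make the arithmetic behind such a knob concrete, here is a rough sketch
in Python. The limit parameter and the function name are hypothetical (no
such setting is being quoted from PostgreSQL); the only inputs are the
things listed above: how many segments are being kept, how big each one
is, and the system-dependent limit.

```python
def must_recycle_oldest(retained_segments, segment_bytes, wal_limit_bytes):
    """Decide whether the oldest retained WAL segment has to be recycled.

    Returns True when keeping every retained segment would leave no room
    for one more new segment under the (hypothetical) size limit, even if
    an archiver or replication slot still wants the oldest one.
    """
    used = retained_segments * segment_bytes
    return used + segment_bytes > wal_limit_bytes


# Example: 16 MB segments on a 1 GB WAL partition.
SEGMENT = 16 * 1024 * 1024
LIMIT = 1024 * 1024 * 1024
print(must_recycle_oldest(63, SEGMENT, LIMIT))  # room for exactly one more
print(must_recycle_oldest(64, SEGMENT, LIMIT))  # partition would overflow
```

The sketch deliberately ignores the error handling a slot or archiver
would need once its segment is gone.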

Thanks!

Stephen




Re: [CORE] [HACKERS] postpone next week's release

2015-05-29 Thread Robert Haas
On Fri, May 29, 2015 at 4:01 PM, Tom Lane  wrote:
> It's possible that we ought to give up on a pre-conference beta.
> Certainly a whole lot of time that I'd hoped would go into reviewing
> 9.5 feature commits has instead gone into back-branch bug chasing this
> week.

I'm personally kind of astonished that we're even thinking about beta
so soon.  I mean, we at least need to go through the stuff listed
here, I think:

https://wiki.postgresql.org/wiki/PostgreSQL_9.5_Open_Items

The bigger issue is: what's NOT on that list that should be?  I think
we need to devote some cycles to figuring that out, and I sure haven't
had any this week.

In any case, I think the negative PR that we're going to get from not
getting this multixact stuff taken care of is going to far outweigh
any positive PR from getting 9.5beta1 out a little sooner, especially
if 9.5beta1 is bug-ridden because we gave it no time to settle.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Need Force flag for pg_drop_replication_slot()

2015-05-29 Thread Josh Berkus
On 05/29/2015 11:30 AM, Stephen Frost wrote:
> I know how big my WAL partition is.  Let me tell PG how big it is and to
> not do anything that'll end up going over that amount, and we'll never
> see a crash due to out of disk space for WAL again.

H.  Do we have a clear idea anywhere in server memory how many WAL
segments there are?

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [CORE] [HACKERS] postpone next week's release

2015-05-29 Thread Joshua D. Drake

On 05/29/2015 01:03 PM, Stephen Frost wrote:

> * Tom Lane (t...@sss.pgh.pa.us) wrote:
>> It's possible that we ought to give up on a pre-conference beta.
>> Certainly a whole lot of time that I'd hoped would go into reviewing
>> 9.5 feature commits has instead gone into back-branch bug chasing this
>> week.
>
> I guess that's what I'm getting at.  We need to take care of the
> back-branches and that means pushing beta back.


+1

JD


--
The most kicking donkey PostgreSQL Infrastructure company in existence.
The oldest, the most experienced, the consulting company to the stars.
Command Prompt, Inc. http://www.commandprompt.com/ +1 -503-667-4564 -
24x7 - 365 - Proactive and Managed Professional Services!




Re: [CORE] [HACKERS] postpone next week's release

2015-05-29 Thread Bruce Momjian
On Fri, May 29, 2015 at 04:01:00PM -0400, Tom Lane wrote:
> Stephen Frost  writes:
> > * Bruce Momjian (br...@momjian.us) wrote:
> >> I am unclear if we are anywhere near ready for beta1 even in June.  Are
> >> we?
> 
> > I'm all about having that discussion...  but can we do it on another
> > thread or at least wait til we've decided about the back-branch
> > releases?  They are clearly the more important issue to consider.
> 
> It's the same discussion though, ie what releases are we expecting to
> get out in the next couple of weeks.

Agreed.  If we want to put out beta1 before PGCon, I need to start on
the release notes on Monday.

> It's possible that we ought to give up on a pre-conference beta.
> Certainly a whole lot of time that I'd hoped would go into reviewing
> 9.5 feature commits has instead gone into back-branch bug chasing this
> week.

Based on what has transpired in the past two weeks, I am thinking we
need to move _slower_, not faster.  I am concerned we have focused so
much on new features that we have taken our eye off of reliability.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +




Re: [CORE] [HACKERS] postpone next week's release

2015-05-29 Thread Stephen Frost
* Tom Lane (t...@sss.pgh.pa.us) wrote:
> It's possible that we ought to give up on a pre-conference beta.
> Certainly a whole lot of time that I'd hoped would go into reviewing
> 9.5 feature commits has instead gone into back-branch bug chasing this
> week.

I guess that's what I'm getting at.  We need to take care of the
back-branches and that means pushing beta back.  I fully expect a good
discussion on when to release beta when we get closer on that, but we're
not going to be close while we have outstanding big back-branch bugs.

Thanks!

Stephen




Re: [CORE] [HACKERS] postpone next week's release

2015-05-29 Thread Tom Lane
Stephen Frost  writes:
> * Bruce Momjian (br...@momjian.us) wrote:
>> I am unclear if we are anywhere near ready for beta1 even in June.  Are
>> we?

> I'm all about having that discussion...  but can we do it on another
> thread or at least wait til we've decided about the back-branch
> releases?  They are clearly the more important issue to consider.

It's the same discussion though, ie what releases are we expecting to
get out in the next couple of weeks.

It's possible that we ought to give up on a pre-conference beta.
Certainly a whole lot of time that I'd hoped would go into reviewing
9.5 feature commits has instead gone into back-branch bug chasing this
week.

regards, tom lane




Re: [HACKERS] RFC: Remove contrib entirely

2015-05-29 Thread Joshua D. Drake


On 05/29/2015 12:30 PM, Pavel Stehule wrote:


>> Contrib made sense years ago. It does not any longer. Let's put the
>> old horse down and raise a new herd of ponies on a new pasture.
>
> Still there is strong sense - it is a referential implementation of our
> extension API. We need it to find regressions, changes. I don't believe

No, then we need a proper test suite for the extension API.

> so external extensions can do it. Only PostGIS is massively accepted and
> developed by more than few people. Personally I am thinking so removing
> contrib is not good idea.


Is there an extension/contrib module in the last decade that more than 
once has shown to help us with that?



Sincerely,

JD



--
Command Prompt, Inc. - http://www.commandprompt.com/  503-667-4564
PostgreSQL Centered full stack support, consulting and development.
Announcing "I'm offended" is basically telling the world you can't
control your own emotions, so everyone else should do it for you.




Re: [CORE] [HACKERS] postpone next week's release

2015-05-29 Thread Stephen Frost
* Bruce Momjian (br...@momjian.us) wrote:
> On Fri, May 29, 2015 at 03:32:57PM -0400, Tom Lane wrote:
> > I know Josh doesn't like to do beta1 releases concurrently with back
> > branches because it confuses the PR messaging.  But we could make an
> > exception perhaps; or do all those releases the same week but announce
> > the beta the day after the bugfix releases.
> > 
> > Or we just let the beta slide till after PGCon, but then I think we're
> > missing some excitement factor.
> 
> I am unclear if we are anywhere near ready for beta1 even in June.  Are
> we?

I'm all about having that discussion...  but can we do it on another
thread or at least wait til we've decided about the back-branch
releases?  They are clearly the more important issue to consider.

Thanks!

Stephen




Re: [CORE] [HACKERS] postpone next week's release

2015-05-29 Thread Bruce Momjian
On Fri, May 29, 2015 at 03:32:57PM -0400, Tom Lane wrote:
> I know Josh doesn't like to do beta1 releases concurrently with back
> branches because it confuses the PR messaging.  But we could make an
> exception perhaps; or do all those releases the same week but announce
> the beta the day after the bugfix releases.
> 
> Or we just let the beta slide till after PGCon, but then I think we're
> missing some excitement factor.

I am unclear if we are anywhere near ready for beta1 even in June.  Are
we?

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +




Re: [CORE] [HACKERS] postpone next week's release

2015-05-29 Thread Stephen Frost
* Magnus Hagander (mag...@hagander.net) wrote:
> On Fri, May 29, 2015 at 9:46 PM, Stephen Frost  wrote:
> 
> > * Tom Lane (t...@sss.pgh.pa.us) wrote:
> > > Magnus Hagander  writes:
> > > > On Fri, May 29, 2015 at 9:32 PM, Tom Lane  wrote:
> > > >> I think there's no way that we wait more than one additional week to push
> > > >> the fsync fix.  So the problem is not with scheduling the update releases,
> > > >> it's with whether we can also fit in a 9.5 beta release before PGCon.
> > >
> > > > I think 9.5 beta has to stand back. The question is what we do with the
> > > > potentially two minor releases. Then we can slot in the beta whenever.
> > >
> > > > If we do the minor as currently planned, can we do another one the week
> > > > after to deal with the multixact issues? (scheduling wise we're going to
> > > > have to do one the week after *regardless*, the question is if we can make
> > > > two different ones, or if we need to fold them into one)
> > >
> > > I suppose we could, but it doubles the amount of release gruntwork
> > > involved, and it doesn't exactly make us look good to our users either.
> >
> > Agreed.  Makes it look like we can't manage to figure out our bugs and
> > put fixes for them together in sensible releases..
> >
> 
> The flipside of that is that we have a fix for a bug that's preventing people's
> databases from starting, and we're then intentionally delaying the shipment
> of it. Though I guess a mitigating fact there is that it is very easy to
> manually recover from that. But it's painful if your db server restarts
> when you're not around...

And we have *another* fix for a *data corruption* bug which is coming in
the following *week*.

Yes, I think delaying a week to get both in is better than putting out a
fix for one bug when we *know* there's a data corruption bug sitting in
that code, and we're putting out a fix for it the following week.

If we were talking about a month-long delay, that'd be one thing, but
that isn't the impression I've got about what we're talking about.

Thanks!

Stephen




Re: [HACKERS] Re: [GENERAL] 9.4.1 -> 9.4.2 problem: could not access status of transaction 1

2015-05-29 Thread Bruce Momjian
On Thu, May 28, 2015 at 07:24:26PM -0400, Robert Haas wrote:
> On Thu, May 28, 2015 at 4:06 PM, Joshua D. Drake  
> wrote:
> > FTR: Robert, you have been a Samurai on this issue. Our many thanks.
> 
> Thanks!  I really appreciate the kind words.
> 
> So, in thinking through this situation further, it seems to me that
> the situation is pretty dire:
> 
> 1. If you pg_upgrade to 9.3 before 9.3.5, then you may have relminmxid
> or datminmxid values which are 1 instead of the correct value.
> Setting the value to 1 was too far in the past if your MXID counter is
> < 2B, and too far in the future if your MXID counter is > 2B.
> 
> 2. If you pg_upgrade to 9.3.7 or 9.4.2, then you may have datminmxid
> values which are equal to the next-mxid counter instead of the correct
> value; in other words, they are too new.
> 
> 3. If you pg_upgrade to 9.3.5, 9.3.6, 9.4.0, or 9.4.1, then you will
> have the first problem for tables in template databases, and the
> second one for the rest. (See 866f3017a.)

I think we need to step back and look at the brain power required to
unravel the mess we have made regarding multi-xact and fixes.  (I bet
few people can even remember which multi-xact fixes went into which
releases --- I can't.)  Instead of working on actual features, we are
having to do this complex diagnosis because we didn't do a thorough
analysis at the time a pattern of multi-xact bugs started to appear. 
Many projects deal with this compound bug debt regularly, but we have
mostly avoided this fate.
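
As a memory aid only, the three cases quoted above can be written down as
a small lookup. This is a sketch in Python: the function and label names
are made up, and it encodes nothing beyond the version numbers in that
quoted summary. An empty result means "not covered by the summary", not
"known to be safe".

```python
CASE_1 = "relminmxid/datminmxid set to 1"
CASE_2 = "datminmxid equal to next-mxid (too new)"


def _parse(version):
    """Turn '9.3.6' into (9, 3, 6); missing parts are padded with zeros."""
    parts = [int(p) for p in version.split(".")]
    return tuple(parts + [0] * (3 - len(parts)))


def minmxid_cases(upgraded_to):
    """List the quoted problem cases for the release pg_upgrade targeted."""
    v = _parse(upgraded_to)
    if (9, 3, 0) <= v < (9, 3, 5):
        # Case 1: pg_upgrade to 9.3 before 9.3.5.
        return [CASE_1]
    if v in ((9, 3, 7), (9, 4, 2)):
        # Case 2: pg_upgrade to 9.3.7 or 9.4.2.
        return [CASE_2]
    if v in ((9, 3, 5), (9, 3, 6), (9, 4, 0), (9, 4, 1)):
        # Case 3: the first problem for template databases,
        # the second one for the rest.
        return [CASE_1 + " (template databases)",
                CASE_2 + " (other databases)"]
    return []


for release in ("9.3.4", "9.3.6", "9.4.2"):
    print(release, minmxid_cases(release))
```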

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +




Re: [CORE] [HACKERS] postpone next week's release

2015-05-29 Thread Magnus Hagander
On Fri, May 29, 2015 at 9:46 PM, Stephen Frost  wrote:

> * Tom Lane (t...@sss.pgh.pa.us) wrote:
> > Magnus Hagander  writes:
> > > On Fri, May 29, 2015 at 9:32 PM, Tom Lane  wrote:
> > >> I think there's no way that we wait more than one additional week to push
> > >> the fsync fix.  So the problem is not with scheduling the update releases,
> > >> it's with whether we can also fit in a 9.5 beta release before PGCon.
> >
> > > I think 9.5 beta has to stand back. The question is what we do with the
> > > potentially two minor releases. Then we can slot in the beta whenever.
> >
> > > If we do the minor as currently planned, can we do another one the week
> > > after to deal with the multixact issues? (scheduling wise we're going to
> > > have to do one the week after *regardless*, the question is if we can make
> > > two different ones, or if we need to fold them into one)
> >
> > I suppose we could, but it doubles the amount of release gruntwork
> > involved, and it doesn't exactly make us look good to our users either.
>
> Agreed.  Makes it look like we can't manage to figure out our bugs and
> put fixes for them together in sensible releases..
>

The flipside of that is that we have a fix for a bug that's preventing people's
databases from starting, and we're then intentionally delaying the shipment
of it. Though I guess a mitigating fact there is that it is very easy to
manually recover from that. But it's painful if your db server restarts
when you're not around...

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: [CORE] [HACKERS] postpone next week's release

2015-05-29 Thread Stephen Frost
* Tom Lane (t...@sss.pgh.pa.us) wrote:
> Magnus Hagander  writes:
> > On Fri, May 29, 2015 at 9:32 PM, Tom Lane  wrote:
> >> I think there's no way that we wait more than one additional week to push
> >> the fsync fix.  So the problem is not with scheduling the update releases,
> >> it's with whether we can also fit in a 9.5 beta release before PGCon.
> 
> > I think 9.5 beta has to stand back. The question is what we do with the
> > potentially two minor releases. Then we can slot in the beta whenever.
> 
> > If we do the minor as currently planned, can we do another one the week
> > after to deal with the multixact issues? (scheduling wise we're going to
> > have to do one the week after *regardless*, the question is if we can make
> > two different ones, or if we need to fold them into one)
> 
> I suppose we could, but it doubles the amount of release gruntwork
> involved, and it doesn't exactly make us look good to our users either.

Agreed.  Makes it look like we can't manage to figure out our bugs and
put fixes for them together in sensible releases..

Thanks!

Stephen




Re: [CORE] [HACKERS] postpone next week's release

2015-05-29 Thread Tom Lane
Magnus Hagander  writes:
> On Fri, May 29, 2015 at 9:32 PM, Tom Lane  wrote:
>> I think there's no way that we wait more than one additional week to push
>> the fsync fix.  So the problem is not with scheduling the update releases,
>> it's with whether we can also fit in a 9.5 beta release before PGCon.

> I think 9.5 beta has to stand back. The question is what we do with the
> potentially two minor releases. Then we can slot in the beta whenever.

> If we do the minor as currently planned, can we do another one the week
> after to deal with the multixact issues? (scheduling wise we're going to
> have to do one the week after *regardless*, the question is if we can make
> two different ones, or if we need to fold them into one)

I suppose we could, but it doubles the amount of release gruntwork
involved, and it doesn't exactly make us look good to our users either.

I believe Christoph indicated that he was going to cherry-pick the fsync
patch and push out an intermediate Debian package with that fix, so at
least for that community there is not an urgent reason to get out a set
of releases with only the fsync fixes and not the multixact fixes.  I'm
not clear though on how many of the other reports we heard came from
Debian users.  (Some of them did, but maybe not all.)

regards, tom lane




Re: [HACKERS] Need Force flag for pg_drop_replication_slot()

2015-05-29 Thread Josh Berkus
On 05/29/2015 12:27 PM, Andres Freund wrote:
> On 2015-05-29 12:08:24 -0700, Josh Berkus wrote:
>> Now, BDR is good because it sets an application_name which lets me
>> figure out what's using the replication slot.  But that's by no means
>> required; other LC plug-ins, I expect, do not do so.  So there's no way
>> for the user to figure out which replication connection relates to which
>> slots, as far as I can tell.
>>
>> In this test, it's easy because there's only one replication connection
>> and one slot.  But imagine the case of 14 replication connections with
>> their own slots.  How could you possibly figure out which one was the
>> laggy one?
> 
> 9.5 shows the pid.

OK, will test, thanks.

--Josh Berkus


-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [CORE] [HACKERS] postpone next week's release

2015-05-29 Thread Stephen Frost
* Tom Lane (t...@sss.pgh.pa.us) wrote:
> (I can't see doing a beta *during* PGCon week.  I for one am going to be
> on an airplane at the time I'd normally have to be Doing Release Stuff.)
[...]
> Or we just let the beta slide till after PGCon, but then I think we're
> missing some excitement factor.

Personally, I'd be all for a "watch Tom do the 9.5 beta release!"
Unconference slot...

:)

(mostly kidding, but I'm 100% sure it'd draw a huge crowd..)

Thanks!

Stephen




Re: [CORE] [HACKERS] postpone next week's release

2015-05-29 Thread Magnus Hagander
On Fri, May 29, 2015 at 9:32 PM, Tom Lane  wrote:

> Magnus Hagander  writes:
> > On Fri, May 29, 2015 at 8:54 PM, Stephen Frost wrote:
> >> I just caution that we appreciate PGCon coming up and that we do our
> >> best to avoid running into a case where we have to push it further due
> >> to everyone being at the conference.
>
> > If we plan it, we certainly *can* make a release during pgcon. If that's
> > what the reasonable timing comes down to, I think getting these fixes out
> > definitely has to be considered more important than the conference, so a
> > few of us will just have to take a break...
>
> I think there's no way that we wait more than one additional week to push
> the fsync fix.  So the problem is not with scheduling the update releases,
> it's with whether we can also fit in a 9.5 beta release before PGCon.
>

I think 9.5 beta has to stand back. The question is what we do with the
potentially two minor releases. Then we can slot in the beta whenever.

If we do the minor as currently planned, can we do another one the week
after to deal with the multixact issues? (scheduling wise we're going to
have to do one the week after *regardless*, the question is if we can make
two different ones, or if we need to fold them into one)


> (I can't see doing a beta *during* PGCon week.  I for one am going to be
> on an airplane at the time I'd normally have to be Doing Release Stuff.)
>

Agreed. We can push a *minor* during pgcon, but not beta.


> I know Josh doesn't like to do beta1 releases concurrently with back
> branches because it confuses the PR messaging.  But we could make an
> exception perhaps; or do all those releases the same week but announce
> the beta the day after the bugfix releases.
>


I can't comment on the PR parts, I'll leave that to Josh.



>
> Or we just let the beta slide till after PGCon, but then I think we're
> missing some excitement factor.
>

Well, most of the people going to pgcon know it already. And most of the
excitement affects people who are not at pgcon (simply because most
of our users are not at pgcon). If doing it the week after pgcon is what
ends up making sense once we've figured out what to do with the minors, then
so be it, IMNSHO.


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: [CORE] [HACKERS] postpone next week's release

2015-05-29 Thread Tom Lane
Magnus Hagander  writes:
> On Fri, May 29, 2015 at 8:54 PM, Stephen Frost  wrote:
>> I just caution that we appreciate PGCon coming up and that we do our
>> best to avoid running into a case where we have to push it further due
>> to everyone being at the conference.

> If we plan it, we certainly *can* make a release during pgcon. If that's
> what the reasonable timing comes down to, I think getting these fixes out
> definitely has to be considered more important than the conference, so a
> few of us will just have to take a break...

I think there's no way that we wait more than one additional week to push
the fsync fix.  So the problem is not with scheduling the update releases,
it's with whether we can also fit in a 9.5 beta release before PGCon.

(I can't see doing a beta *during* PGCon week.  I for one am going to be
on an airplane at the time I'd normally have to be Doing Release Stuff.)

I know Josh doesn't like to do beta1 releases concurrently with back
branches because it confuses the PR messaging.  But we could make an
exception perhaps; or do all those releases the same week but announce
the beta the day after the bugfix releases.

Or we just let the beta slide till after PGCon, but then I think we're
missing some excitement factor.

regards, tom lane




Re: [HACKERS] [CORE] postpone next week's release

2015-05-29 Thread Joshua D. Drake


On 05/29/2015 12:18 PM, Robert Haas wrote:
> On Fri, May 29, 2015 at 3:09 PM, Magnus Hagander  wrote:
>> Do you have any feeling of how likely people are to actually hit the
>> multixact one? I've followed some of that impressive debugging you guys did,
>> and I know it's a pretty critical bug if you hit it, but how wide-spread
>> will it be?
>
> That precise problem has been reported a few times, but it may not be
> widespread.  I don't know.  My bigger concern is that, at present,
> taking a base backup is broken.

This I think is the bigger issue. They both are horrible but basebackup
being broken is rather... egregious.


JD


--
Command Prompt, Inc. - http://www.commandprompt.com/  503-667-4564
PostgreSQL Centered full stack support, consulting and development.
Announcing "I'm offended" is basically telling the world you can't
control your own emotions, so everyone else should do it for you.




Re: [HACKERS] RFC: Remove contrib entirely

2015-05-29 Thread Pavel Stehule
2015-05-29 21:20 GMT+02:00 Joshua D. Drake :

>
> On 05/29/2015 11:27 AM, Jeff Janes wrote:
>
>> It would be less confusing for users. Contrib modules seem to be
>> first class extensions, leaving doubt on other extensions.
>>
>>
>> Presumably there are still going to be some extensions maintained by
>> -hackers, and some not.  I don't think we are going to change that, so
>> the difference will still need to be explained, regardless of what words
>> are used.  And people *should* have doubts about other extensions.
>> Couldn't any talented programmer write an extension which gives them a
>> backdoor into the deployer's system, and then upload it to github?
>>
>
> Yes, it is called Open Source development.
>
>
>> I would certainly be cautious about installing any old extension I found
>> some place on the internet.
>>
>> But the fact they aren't in core makes them not fully trusted by some
>> users.
>>
>
> No. This is completely wrong thinking. If you are compiling this stuff
> from source you are taking that risk on yourself.
>
> Most people are not compiling from source, they are installing from a
> distribution (or downloading a distribution package).
>
>
>> Trying to explain all that in a training is a PITA. It would be much
>> less confusing if they were either in core or in their own repository.
>>
>> Several of the contrib modules should be kept in tight sync with src or
>> else testing and debugging would be a disaster. Putting them in
>> different git repositories wouldn't work well.  Unless those would be among
>> the ones moved to "core".
>>
>>
> Note: I actually don't care if the current contrib gets pushed into core
> proper and is default installed.
>
> I care about this idea that contrib exists. It isn't needed and leads to a
> discussion like this one (or the pg_audit), almost every release.
>
> Contrib made sense years ago. It does not any longer. Let's put the old
> horse down and raise a new herd of ponies on a new pasture.
>

There is still a strong reason to keep it: contrib is a reference
implementation of our extension API. We need it to catch regressions and
API changes. I don't believe external extensions can do that. Only PostGIS
is widely adopted and developed by more than a few people. Personally, I
think removing contrib is not a good idea.

Pavel


>
> JD
>
> --
> Command Prompt, Inc. - http://www.commandprompt.com/  503-667-4564
> PostgreSQL Centered full stack support, consulting and development.
> Announcing "I'm offended" is basically telling the world you can't
> control your own emotions, so everyone else should do it for you.
>


Re: [CORE] [HACKERS] postpone next week's release

2015-05-29 Thread Stephen Frost
* Magnus Hagander (mag...@hagander.net) wrote:
> On Fri, May 29, 2015 at 8:54 PM, Stephen Frost  wrote:
> 
> > * Robert Haas (robertmh...@gmail.com) wrote:
> > > I think we should postpone next week's release.  I have been hard at
> > > work on the multixact-related bugs that were reported in 9.4.2 and
> > > 9.3.7, and the subsequent bugs found by code-reading, but getting them
> > > all fixed by Monday doesn't seem realistic.  Such fixes should have
> > > careful review, and not be dashed into the tree under time pressure.
> > >
> > > We could do the release anyway to relieve the pain caused by the
> > > fsync-pgdata hard-failure problem, but it seems to me that if we do
> > > that, we're just going to end up having to do yet another release
> > > almost right away.  I think it would be better to wait and do one
> > > release that fixes both sets of issues.
> >
> > Agreed.
> >
> > I just caution that we appreciate PGCon coming up and that we do our
> > best to avoid running into a case where we have to push it further due
> > to everyone being at the conference.
> 
> If we plan it, we certainly *can* make a release during pgcon. If that's
> what the reasonable timing comes down to, I think getting these fixes out
> definitely has to be considered more important than the conference, so a
> few of us will just have to take a break...

I don't disagree with you about any of that, just wanted to make mention
of the timing.

Thanks!

Stephen




Re: [HACKERS] Need Force flag for pg_drop_replication_slot()

2015-05-29 Thread Andres Freund
On 2015-05-29 12:08:24 -0700, Josh Berkus wrote:
> Now, BDR is good because it sets an application_name which lets me
> figure out what's using the replication slot.  But that's by no means
> required; other LC plug-ins, I expect, do not do so.  So there's no way
> for the user to figure out which replication connection relates to which
> slots, as far as I can tell.
> 
> In this test, it's easy because there's only one replication connection
> and one slot.  But imagine the case of 14 replication connections with
> their own slots.  How could you possibly figure out which one was the
> laggy one?

9.5 shows the pid.
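
For reference, the pid in question is the active_pid column that 9.5 adds
to pg_replication_slots; it can be matched against pg_stat_replication.pid.
A minimal sketch of that matching in Python, assuming both views have
already been fetched as lists of dicts. The function name and the sample
active_pid value are hypothetical; the other sample values come from the
9.4 session Josh posted in this thread.

```python
def match_slots_to_connections(slots, connections):
    """Map each slot name to the replication connection using it, or None.

    slots: rows from pg_replication_slots (9.5+, so each row has active_pid).
    connections: rows from pg_stat_replication.
    """
    by_pid = {conn["pid"]: conn for conn in connections}
    return {
        slot["slot_name"]: by_pid.get(slot.get("active_pid"))
        for slot in slots
    }


# Sample rows: values are taken from the session in this thread, except
# active_pid, which is a hypothetical value used for illustration.
slots = [{"slot_name": "bdr_24577_6147720645156311471_1_26507__",
          "active_pid": 28481}]
connections = [{"pid": 28481, "client_addr": "172.17.0.11",
                "application_name": "bdr (6147720645156311471,1,26507,):receive"}]

matched = match_slots_to_connections(slots, connections)
```

With 14 slots the same lookup gives one connection (or None for an
inactive slot) per slot name, without relying on application_name.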




Re: [HACKERS] Need Force flag for pg_drop_replication_slot()

2015-05-29 Thread Andres Freund
On 2015-05-29 14:39:02 -0400, Stephen Frost wrote:
> * Andres Freund (and...@anarazel.de) wrote:
> > How is this measurably worse than trying to truncate a log table that
> > has grown too large? That's often harder to fight actually, because
> > there's dozens of other processes that might be using the relation?  In
> > one case you don't have wait ordering, but only one locker, in the other
> > case you have multiple waiters, and to benefit from wait ordering you
> > need multiple sessions.
> 
> Because we don't fall over if we can't extend a relation.
> 
> We do fall over if we can't write WAL.

As nearly everybody uses the same filesystem for pg_xlog and the actual
databases, that distinction isn't worth much. You'll still fail when
writing the WAL, even if the disk space has been used by a relation
instead of WAL.
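
Whether that applies to a given installation can be checked by comparing
device IDs. A small sketch, assuming a POSIX-like system; the function
names are made up for illustration and nothing here is PostgreSQL code.

```python
import os


def same_filesystem(path_a, path_b):
    """Return True if both paths report the same device ID."""
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev


def wal_shares_disk_with_data(pgdata):
    """Check whether PGDATA/pg_xlog sits on the same filesystem as PGDATA.

    os.stat follows symlinks, so a pg_xlog that is symlinked to another
    volume is judged by the device of the symlink target.
    """
    return same_filesystem(pgdata, os.path.join(pgdata, "pg_xlog"))
```

If this returns True, running out of space because of a bloated relation
stops WAL writes just as surely as running out because of retained WAL.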




Re: [HACKERS] Need Force flag for pg_drop_replication_slot()

2015-05-29 Thread Joshua D. Drake


On 05/29/2015 12:08 PM, Josh Berkus wrote:
> Now, BDR is good because it sets an application_name which lets me
> figure out what's using the replication slot.  But that's by no means
> required; other LC plug-ins, I expect, do not do so.  So there's no way
> for the user to figure out which replication connection relates to which
> slots, as far as I can tell.
>
> In this test, it's easy because there's only one replication connection
> and one slot.  But imagine the case of 14 replication connections with
> their own slots.  How could you possibly figure out which one was the
> laggy one?

The client_addr?

JD

--
Command Prompt, Inc. - http://www.commandprompt.com/  503-667-4564
PostgreSQL Centered full stack support, consulting and development.
Announcing "I'm offended" is basically telling the world you can't
control your own emotions, so everyone else should do it for you.




Re: [HACKERS] RFC: Remove contrib entirely

2015-05-29 Thread Joshua D. Drake


On 05/29/2015 11:27 AM, Jeff Janes wrote:
> It would be less confusing for users. Contrib modules seem to be
> first class extensions, leaving doubt on other extensions.
>
> Presumably there are still going to be some extensions maintained by
> -hackers, and some not.  I don't think we are going to change that, so
> the difference will still need to be explained, regardless of what words
> are used.  And people *should* have doubts about other extensions.
> Couldn't any talented programmer write an extension which gives them a
> backdoor into the deployer's system, and then upload it to github?

Yes, it is called Open Source development.

> I would certainly be cautious about installing any old extension I found
> some place on the internet.
>
> But the fact they aren't in core makes them not fully trusted by some
> users.

No. This is completely wrong thinking. If you are compiling this stuff
from source you are taking that risk on yourself.

Most people are not compiling from source, they are installing from a
distribution (or downloading a distribution package).

> Trying to explain all that in a training is a PITA. It would be much
> less confusing if they were either in core or in their own repository.
>
> Several of the contrib modules should be kept in tight sync with src or
> else testing and debugging would be a disaster. Putting them in
> different git repositories wouldn't work well.  Unless those would be among
> the ones moved to "core".

Note: I actually don't care if the current contrib gets pushed into core
proper and is default installed.

I care about this idea that contrib exists. It isn't needed and leads to
a discussion like this one (or the pg_audit), almost every release.

Contrib made sense years ago. It does not any longer. Let's put the old
horse down and raise a new herd of ponies on a new pasture.


JD

--
Command Prompt, Inc. - http://www.commandprompt.com/  503-667-4564
PostgreSQL Centered full stack support, consulting and development.
Announcing "I'm offended" is basically telling the world you can't
control your own emotions, so everyone else should do it for you.




Re: [HACKERS] [CORE] postpone next week's release

2015-05-29 Thread Robert Haas
On Fri, May 29, 2015 at 3:09 PM, Magnus Hagander  wrote:
> Do you have any feeling of how likely people are to actually hit the
> multixact one? I've followed some of that impressive debugging you guys did,
> and I know it's a pretty critical bug if you hit it, but how wide-spread
> will it be?

That precise problem has been reported a few times, but it may not be
widespread.  I don't know.  My bigger concern is that, at present,
taking a base backup is broken.  I haven't figured out the exact
reproduction scenario, but I think it's something like this:

- begin base backup
- checkpoint happens, truncating pg_multixact
- at this point pg_multixact gets copied
- end base backup

I think what will happen on replay is that replaying the checkpoint,
it will try to reference pg_multixact files that don't exist any more
and die with a fatal error.
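
The suspected sequence can be modelled in a few lines. This is only a toy
sketch of the scenario as described above, not PostgreSQL code: segment
numbers stand in for pg_multixact files, and both function names are
invented for illustration.

```python
def base_backup_with_truncation(segments_at_start, truncate_below):
    """Toy model of the suspected sequence; not PostgreSQL code.

    segments_at_start: pg_multixact segment numbers present when the
        backup begins.
    truncate_below: a checkpoint during the backup removes every segment
        below this number before pg_multixact is copied.
    Returns (oldest segment the backup's starting point refers to,
             set of segments actually captured in the backup).
    """
    oldest_referenced = min(segments_at_start)
    # The checkpoint truncates pg_multixact before the directory is copied.
    copied = {seg for seg in segments_at_start if seg >= truncate_below}
    return oldest_referenced, copied


def replay_can_start(oldest_referenced, copied):
    """Replay needs the oldest referenced segment to be in the restored copy."""
    return oldest_referenced in copied
```

In this model, a backup over segments {1, 2, 3, 4} with a checkpoint that
truncates below 3 captures only {3, 4}, so replay cannot find segment 1.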

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] RFC: Remove contrib entirely

2015-05-29 Thread Joshua D. Drake


On 05/29/2015 11:02 AM, Jeff Janes wrote:
>> Also, removing a feature is a regression, and someone is always
>> bound to complain... What is the real benefit? ISTM that it is a
>> solution that fixes no important problem. Reaching a consensus about
>> what to move here or there will consume valuable time that could be
>> spent on more important tasks... Is it worth it?
>
> Yeah, I don't really see the benefit either.  Some could be moved to
> core, and some could just be removed, but for many of them it just seems
> like we would end up inventing a new 'contrib' which is the same as
> the old, but with a different name.

Name one.

Sincerely,

JD
--
Command Prompt, Inc. - http://www.commandprompt.com/  503-667-4564
PostgreSQL Centered full stack support, consulting and development.
Announcing "I'm offended" is basically telling the world you can't
control your own emotions, so everyone else should do it for you.




Re: [CORE] [HACKERS] postpone next week's release

2015-05-29 Thread Magnus Hagander
On Fri, May 29, 2015 at 8:54 PM, Stephen Frost  wrote:

> * Robert Haas (robertmh...@gmail.com) wrote:
> > I think we should postpone next week's release.  I have been hard at
> > work on the multixact-related bugs that were reported in 9.4.2 and
> > 9.3.7, and the subsequent bugs found by code-reading, but getting them
> > all fixed by Monday doesn't seem realistic.  Such fixes should have
> > careful review, and not be dashed into the tree under time pressure.
> >
> > We could do the release anyway to relieve the pain caused by the
> > fsync-pgdata hard-failure problem, but it seems to me that if we do
> > that, we're just going to end up having to do yet another release
> > almost right away.  I think it would be better to wait and do one
> > release that fixes both sets of issues.
>
> Agreed.
>
> I just caution that we appreciate PGCon coming up and that we do our
> best to avoid running into a case where we have to push it further due
> to everyone being at the conference.
>

If we plan it, we certainly *can* make a release during pgcon. If that's
what the reasonable timing comes down to, I think getting these fixes out
definitely has to be considered more important than the conference, so a
few of us will just have to take a break...


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: [HACKERS] [CORE] postpone next week's release

2015-05-29 Thread Magnus Hagander
On Fri, May 29, 2015 at 8:02 PM, Robert Haas  wrote:

> Hi,
>
> I think we should postpone next week's release.  I have been hard at
> work on the multixact-related bugs that were reported in 9.4.2 and
> 9.3.7, and the subsequent bugs found by code-reading, but getting them
> all fixed by Monday doesn't seem realistic.  Such fixes should have
> careful review, and not be dashed into the tree under time pressure.
>
> We could do the release anyway to relieve the pain caused by the
> fsync-pgdata hard-failure problem, but it seems to me that if we do
> that, we're just going to end up having to do yet another release
> almost right away.  I think it would be better to wait and do one
> release that fixes both sets of issues.
>
> Thoughts?
>

I'm a bit split on this.

We *definitely* don't want to release the multixact fix without it being
carefully reviewed, that's the part I'm not split about :) And I fully
appreciate we can't have that done by Monday.

However, the file-permission thing seems to hit quite a few people (have we
ever had this many bug reports after a minor release?), which means we'd
really want to get that out quickly.

Do you have any feeling of how likely people are to actually hit the
multixact one? I've followed some of that impressive debugging you guys
did, and I know it's a pretty critical bug if you hit it, but how
wide-spread will it be?

I guess one option we could do is encourage packagers to push updated
packages (-2 versions) basically. But if we do that, perhaps we might as
well release anyway?

AIUI, the permission thing won't actually be very likely to affect Windows
users. And Windows packages are the ones that take by far the most work to
make. Perhaps we should consider skipping making packages of that version
on Windows, and then plan to push yet another minor one or two weeks later,
that goes out on all platforms?

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: [HACKERS] Re: [GENERAL] 9.4.1 -> 9.4.2 problem: could not access status of transaction 1

2015-05-29 Thread Robert Haas
On Fri, May 29, 2015 at 12:43 PM, Robert Haas  wrote:
> Working on that now.

OK, here's a patch.  Actually two patches, differing only in
whitespace, for 9.3 and for master (ha!).  I now think that the root
of the problem here is that DetermineSafeOldestOffset() and
SetMultiXactIdLimit() were largely ignorant of the possibility that
they might be called at points in time when the cluster was
inconsistent.  SetMultiXactIdLimit() bracketed certain parts of its
logic with if (!InRecovery), but those guards were ineffective because
it gets called before InRecovery is set in the first place.

It seems pretty clear that we can't effectively determine anything
about member wraparound until the cluster is consistent.  Before then,
there might be files missing from the offsets or members SLRUs which
get put back during replay.  There could even be gaps in the sequence
of files, with some things having made it to disk before the crash (or
having made it into the backup) and others not.  So all the work of
determining what the safe stop points and vacuum thresholds for
members are needs to be postponed until TrimMultiXact() time.  And
that's fine, because we don't need this information in recovery anyway
- it only affects behavior in normal running.

So this patch does the following:

1. Moves the call to DetermineSafeOldestOffset() that appears in
StartupMultiXact() to TrimMultiXact(), so that we don't try to do this
until we're consistent.  Also, instead of passing
MultiXactState->oldestMultiXactId, pass the newer of that value and
the earliest offset that exists on disk.  That way, it won't try to
read data that's not there.  Note that the second call to
DetermineSafeOldestOffset() in TruncateMultiXact() doesn't need a
similar guard, because we already bail out of that function early if
the multixacts we're going to truncate away don't exist.

2. Adds a new flag MultiXactState->didTrimMultiXact to indicate whether
we've finished TrimMultiXact(), and arranges for SetMultiXactIdLimit()
to use that rather than InRecovery to test whether it's safe to do
complicated things that might require that the cluster is consistent.
This is a slight behavior change, since formerly we would have tried
to do that stuff very early in the startup process, and now it won't
happen until somebody completes a vacuum operation.  If that's a
problem, we could consider doing it in TrimMultiXact(), but I don't
think it's safe the way it was.  The new flag also prevents
oldestOffset from being set while in recovery; I think it would be
safe to do that in recovery once we've reached consistency, but I
don't believe it's necessary.

3. Arranges for TrimMultiXact() to set oldestOffset.  This is
necessary because, previously, we relied on SetMultiXactIdLimit doing
that during early startup or during recovery, and that's no longer
true.  Here too we set oldestOffset keeping in mind that our notion of
the oldest multixact may point to something that doesn't exist; if so,
we use the oldest MXID that does.

4. Modifies TruncateMultiXact() so that it doesn't re-scan the SLRU
directory on every call to find the oldest file that exists.  Instead,
it arranges to remember the value from the first scan and then updates
it thereafter to reflect its own truncation activity.  This isn't
absolutely necessary, but because this oldest-file logic is used in
multiple places (TrimMultiXact, SetMultiXactIdLimit, and
TruncateMultiXact all need it directly or indirectly) caching the
value seems like a better idea than recomputing it frequently.
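
The gating described in items 1 to 3 can be illustrated with a toy model.
This is a sketch under stated assumptions, not the patch itself: the class
and its method names are invented, only did_trim corresponds to a name
from the patch (MultiXactState->didTrimMultiXact), and plain max() stands
in for "the newer of", which ignores the ID wraparound that the real code
has to handle.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MultiXactModel:
    """Toy stand-in for the shared multixact state; not PostgreSQL code."""
    oldest_multixact_id: int
    did_trim: bool = False              # models MultiXactState->didTrimMultiXact
    oldest_offset_from: Optional[int] = None

    def trim(self, earliest_on_disk: int) -> None:
        """Models TrimMultiXact(): only runs once the cluster is consistent."""
        # Use the newer of the two so we never read data that is not there.
        # Plain max() ignores ID wraparound, which the real code must handle.
        self.oldest_offset_from = max(self.oldest_multixact_id, earliest_on_disk)
        self.did_trim = True

    def set_limit(self, new_oldest: int) -> bool:
        """Models SetMultiXactIdLimit(); True if the offset work was done."""
        self.oldest_multixact_id = new_oldest
        if not self.did_trim:
            return False                # too early: cluster may be inconsistent
        self.oldest_offset_from = max(self.oldest_offset_from, new_oldest)
        return True
```

Calling set_limit() before trim() records the new oldest ID but skips the
consistency-dependent part, which is the behavior change noted in item 2.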

I have tested that this patch fixes Steve Kehlet's problem, or at
least what I believe to be Steve Kehlet's problem based on the
reproduction scenario I described upthread.  I believe it will also
fix the problems with starting up from a base backup that Alvaro
mentioned upthread.  It won't fix the fact that pg_upgrade is putting
a wrong value into everybody's datminmxid field, which should really
be addressed too, but I've been working on this for about three days
virtually non-stop and I don't have the energy to tackle it right now.
If anyone feels the urge to step into that breach, I think what it
needs to do is: when upgrading from a 9.3-or-later instance, copy over
each database's datminmxid into the corresponding database in the new
cluster.

Aside from that, it's very possible that despite my best efforts this
has serious bugs.  Review and testing would be very much appreciated.

Thanks,

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
diff --git a/src/backend/access/transam/multixact.c b/src/backend/access/transam/multixact.c
index 699497c..8d28a5c 100644
--- a/src/backend/access/transam/multixact.c
+++ b/src/backend/access/transam/multixact.c
@@ -197,8 +197,9 @@ typedef struct MultiXactStateData
 	MultiXactOffset nextOffset;
 
 	/*
-	 * Oldest multixact that is still on disk.  Anything older than this
-	 * should not be consulted.  These values are updated by vacuum.
+	 * Old

Re: [HACKERS] Need Force flag for pg_drop_replication_slot()

2015-05-29 Thread Josh Berkus
So, here's an example of why it's hard to give our users a workaround.

cio=# select * from pg_replication_slots;
                slot_name                | plugin | slot_type | datoid | database | active | xmin | catalog_xmin | restart_lsn
-----------------------------------------+--------+-----------+--------+----------+--------+------+--------------+-------------
 bdr_24577_6147720645156311471_1_26507__ | bdr    | logical   |  24577 | cio      | t      |      |          906 | 0/1C4F410
(1 row)

cio=# select pg_drop_replication_slot('bdr_24577_6147720645156311471_1_26507__');
ERROR:  replication slot "bdr_24577_6147720645156311471_1_26507__" is already active
cio=# select * from pg_stat_replication;
-[ RECORD 1 ]----+-------------------------------------------
pid              | 28481
usesysid         | 10
usename          | postgres
application_name | bdr (6147720645156311471,1,26507,):receive
client_addr      | 172.17.0.11
client_hostname  |
client_port      | 44583
backend_start    | 2015-05-29 18:10:34.601796+00
backend_xmin     |
state            | streaming
sent_location    | 0/1C4F448
write_location   | 0/1C4F448
flush_location   | 0/1C4F448
replay_location  | 0/1C4F448
sync_priority    | 0
sync_state       | async

Now, BDR is good because it sets an application_name which lets me
figure out what's using the replication slot.  But that's by no means
required; other LC plug-ins, I expect, do not do so.  So there's no way
for the user to figure out which replication connection relates to which
slots, as far as I can tell.

In this test, it's easy because there's only one replication connection
and one slot.  But imagine the case of 14 replication connections with
their own slots.  How could you possibly figure out which one was the
laggy one?

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] fsync-pgdata-on-recovery tries to write to more files than previously

2015-05-29 Thread Christoph Berg
Re: Tom Lane 2015-05-29 <13871.1432921...@sss.pgh.pa.us>
> Why can't the user stop it?  We won't be bleating about the case of a
> symlink to a non-writable file someplace else, which is the Debian use
> case.  I don't see a very good excuse to have a non-writable file right
> in the data directory.

I've repeatedly seen PGDATA or pg_xlog being put directly on a
mountpoint, which means there will be a non-writable lost+found
directory there. (A case with pg_xlog was also reported as a support
case at credativ.) I'm usually advising against using the top level
directory directly, but it's not uncommon to encounter it.

> In any case, if the cost of such a file is one more line of log output
> during a crash restart, most people would have no problem at all in
> ignoring that log output.

Nod.
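
The "log it and carry on" behavior can be sketched as follows. This is an
illustration in Python, not the server's code; the function name is made
up, and opening files read-write is an assumption chosen so that
unwritable entries fail at open time.

```python
import os


def fsync_tree(root):
    """Try to fsync every file under root; record failures instead of raising.

    Returns (synced_paths, skipped), where skipped is a list of
    (path, reason) pairs that a caller could write to the log.
    """
    synced, skipped = [], []

    def note_walk_error(err):
        # Called by os.walk for directories it cannot list (e.g. lost+found).
        skipped.append((err.filename, err.strerror))

    for dirpath, _dirnames, filenames in os.walk(root, onerror=note_walk_error):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # Assumption: open read-write, so unwritable files fail here.
                fd = os.open(path, os.O_RDWR)
            except OSError as err:
                skipped.append((path, err.strerror))
                continue
            try:
                os.fsync(fd)
                synced.append(path)
            except OSError as err:
                skipped.append((path, err.strerror))
            finally:
                os.close(fd)
    return synced, skipped
```

A non-writable lost+found or a stray read-only file then costs one entry
in skipped (one log line) rather than a failed startup.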

Christoph
-- 
c...@df7cb.de | http://www.df7cb.de/




Re: [CORE] [HACKERS] postpone next week's release

2015-05-29 Thread Bruce Momjian
On Fri, May 29, 2015 at 02:54:31PM -0400, Stephen Frost wrote:
> * Robert Haas (robertmh...@gmail.com) wrote:
> > I think we should postpone next week's release.  I have been hard at
> > work on the multixact-related bugs that were reported in 9.4.2 and
> > 9.3.7, and the subsequent bugs found by code-reading, but getting them
> > all fixed by Monday doesn't seem realistic.  Such fixes should have
> > careful review, and not be dashed into the tree under time pressure.
> > 
> > We could do the release anyway to relieve the pain caused by the
> > fsync-pgdata hard-failure problem, but it seems to me that if we do
> > that, we're just going to end up having to do yet another release
> > almost right away.  I think it would be better to wait and do one
> > release that fixes both sets of issues.
> 
> Agreed.
> 
> I just caution that we appreciate PGCon coming up and that we do our
> best to avoid running into a case where we have to push it further due
> to everyone being at the conference.

This brings up the issue of when we want to do 9.5 beta.  Ideas?

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +




Re: [HACKERS] postpone next week's release

2015-05-29 Thread Stephen Frost
* Robert Haas (robertmh...@gmail.com) wrote:
> I think we should postpone next week's release.  I have been hard at
> work on the multixact-related bugs that were reported in 9.4.2 and
> 9.3.7, and the subsequent bugs found by code-reading, but getting them
> all fixed by Monday doesn't seem realistic.  Such fixes should have
> careful review, and not be dashed into the tree under time pressure.
> 
> We could do the release anyway to relieve the pain caused by the
> fsync-pgdata hard-failure problem, but it seems to me that if we do
> that, we're just going to end up having to do yet another release
> almost right away.  I think it would be better to wait and do one
> release that fixes both sets of issues.

Agreed.

I just caution that we appreciate PGCon coming up and that we do our
best to avoid running into a case where we have to push it further due
to everyone being at the conference.

Thanks!

Stephen




Re: [HACKERS] postpone next week's release

2015-05-29 Thread Bruce Momjian
On Fri, May 29, 2015 at 02:02:43PM -0400, Robert Haas wrote:
> Hi,
> 
> I think we should postpone next week's release.  I have been hard at
> work on the multixact-related bugs that were reported in 9.4.2 and
> 9.3.7, and the subsequent bugs found by code-reading, but getting them
> all fixed by Monday doesn't seem realistic.  Such fixes should have
> careful review, and not be dashed into the tree under time pressure.
> 
> We could do the release anyway to relieve the pain caused by the
> fsync-pgdata hard-failure problem, but it seems to me that if we do
> that, we're just going to end up having to do yet another release
> almost right away.  I think it would be better to wait and do one
> release that fixes both sets of issues.

It does seem wise to make sure we have all these items fixed.  We have
PR'ed the recovery failure issue so I think we are good at this point. 
I see having to put out another multi-xact-only fix release the week
after as being a bigger negative.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +




Re: [HACKERS] RFC: Remove contrib entirely

2015-05-29 Thread Josh Berkus
All,

So there are currently three kinds of things in contrib:

A. Extra commands and tools which aren't considered general enough, or
reliable enough, to be included by default, e.g. pg_standby, pgbench and
vacuumlo.

B. Developer tools, like spi, start-scripts, and oid2name.

C. "Core Extensions", which fall into four further groups:
C1: encryption extensions we can't include in core
for legal reasons (pgcrypto)
C2: example extensions which show useful things about
how to build an extension
C3: Admin extensions which are not core because they carry
risks (e.g. pgstattuple, auto_explain)
C4: Extensions which are generally useful, used, and
maintained with Postgres (e.g. hstore, citext)

So A and B need to stay somewhere in the source distribution.  I would
like to see them go into /admin-tools and /developer-tools directories;
I believe that Greg Smith came up with a patch to do just this at
some point in the past.  C2 could also go into /developer-tools, and C3
are really just more admin-tools.  C1 would need to stick around as a
special case.

To move C4, you'd have to solve the problem of "how do we turn a former
external extension into a core feature", which nobody yet has solved.

All of this ignores the critical part of this, which is packaging.
Right now packagers ship a "contrib" package which includes everything
in /contrib.  Shifting to having any other arrangement is going to
involve working with them and convincing them that a change to packaging
is worthwhile.  And then getting the news to our users.

Given that, there needs to be significant benefit to our users in the
change.  So, what's the benefit?

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


Re: [HACKERS] Need Force flag for pg_drop_replication_slot()

2015-05-29 Thread Stephen Frost
* Andres Freund (and...@anarazel.de) wrote:
> How is this measurably worse than trying to truncate a log table that
> has grown too large? That's often harder to fight actually, because
> there's dozens of other processes that might be using the relation?  In
> one case you don't have wait ordering, but only one locker, in the other
> case you have multiple waiters, and to benefit from wait ordering you
> need multiple sessions.

Because we don't fall over if we can't extend a relation.

We do fall over if we can't write WAL.

Thanks!

Stephen


Re: [HACKERS] Need Force flag for pg_drop_replication_slot()

2015-05-29 Thread Josh Berkus
On 05/29/2015 11:07 AM, Andres Freund wrote:
> On 2015-05-29 10:53:30 -0700, Josh Berkus wrote:
>> On 05/29/2015 10:45 AM, Stephen Frost wrote:
>> So, here's the scenario:
>>
>> 1. you're almost out of disk space due to a replica falling behind, like
>> down to 16mb left.  Or maybe you are out of disk space.
>>
>> 2. You need to drop the laggy replication slots in a hurry to get your
>> master working again.
>>
>> 3. Now you have to do this timing-sensitive two-stage drop to make it work.
> 
> How is this measurably worse than trying to truncate a log table that
> has grown too large? That's often harder to fight actually, because
> there's dozens of other processes that might be using the relation?  In
> one case you don't have wait ordering, but only one locker, in the other
> case you have multiple waiters, and to benefit from wait ordering you
> need multiple sessions.
> 
> Again, I'm not against improving either situation, it's just that the
> urgency argument doesn't seem worth its weight.

Well, I wouldn't mind a solution for drop table and drop database,
either. I'm pretty sure that's on our TODO list.

Oh, I see the confusion.  When I said "time-critical", I was referring to
the situation where someone is running out of disk space.  Not coming up
with a patch.  AFAIK, hardly anyone is using replication slots, still.

> 
> Note that all of this is 9.4 code, not 9.5.

Yes, but I'm not suggesting backporting it, just maybe a backported doc
patch.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


Re: [HACKERS] Need Force flag for pg_drop_replication_slot()

2015-05-29 Thread Stephen Frost
* Josh Berkus (j...@agliodbs.com) wrote:
> On 05/29/2015 11:01 AM, Stephen Frost wrote:
> > * Josh Berkus (j...@agliodbs.com) wrote:
> >> > 1. you're almost out of disk space due to a replica falling behind, like
> >> > down to 16mb left.  Or maybe you are out of disk space.
> > This right here is a real issue.  What I'd personally like to see is an
> > option which says "you have X GB of disk space.  Once it's gone, forget
> > about all replicas or failing archive commands or whatever, and just
> > stop holding on to ancient WAL that you no longer need to operate."
> 
> The substantial challenge here is how do we determine that you're
> "almost out of disk space"?

Eh?  That "X GB" above was intended to be the value of a GUC.

I know how big my WAL partition is.  Let me tell PG how big it is and to
not do anything that'll end up going over that amount, and we'll never
see a crash due to out of disk space for WAL again.

Thanks!

Stephen


Re: [HACKERS] fsync-pgdata-on-recovery tries to write to more files than previously

2015-05-29 Thread Stephen Frost
* Andres Freund (and...@anarazel.de) wrote:
> On 2015-05-29 13:49:16 -0400, Tom Lane wrote:
> > > That sounds like a potentially nontrivial amount of repetitive log bleat
> > > after every crash start? One which the user can't really stop?
> > 
> > Why can't the user stop it?
> 
> Because it makes a good amount of sense to have e.g. certificates not
> owned by postgres and not writeable? You don't necessarily want to
> symlink them somewhere else, because that makes moving clusters around
> harder than when they're self contained.

A certain other file might be non-writable by PG too... (*cough* .auto
*cough*).

> > I'd say it's a pretty damn-fool arrangement: for starters, it's an
> > unnecessary security hazard.
> 
> I don't buy the security argument at all. You likely have
> postgresql.conf in the data directory. You can write to at least .auto,
> which will definitely reside in the data directory. That contains
> archive_command.

I'm not sure that I see the security issue here either..  We're not
talking about setuid shell scripts or anything that isn't running as the
PG user, which a superuser could take over anyway..

Thanks!

Stephen


Re: [HACKERS] RFC: Remove contrib entirely

2015-05-29 Thread Jeff Janes
On Thu, May 28, 2015 at 11:26 PM, Guillaume Lelarge 
wrote:

> Le 29 mai 2015 8:01 AM, "Fabien COELHO"  a écrit :
> >
> >
> >> FWIW, I don't mind which one we put in core and which one we put out of
> >> core. But I like Joshua's idea of getting rid of contribs and pushing them
> >> out as any other extensions.
> >
> >
> > Hmmm.
> >
> > I like the contrib directory as a living example of "how to do an
> > extension" directly available in the source tree. It also allows to test
> > in-tree that the extension mechanism works. So I think it should be kept at
> > least with a minimum set of dummy examples for this purpose, even if all
> > current extensions are moved out.
> >
>
> Agreed.
>
> > Also, removing a feature is a regression, and someone is always bound to
> > complain... What is the real benefit? ISTM that it is a solution that fixes
> > no important problem. Reaching a consensus about what to move here or there
> > will consume valuable time that could be spent on more important tasks...
> > Is it worth it?
> >
>
> It would be less confusing for users. Contrib modules seem to be first
> class extensions, leaving doubt on other extensions.
>

Presumably there are still going to be some extensions maintained by
-hackers, and some not.  I don't think we are going to change that, so the
difference will still need to be explained, regardless of what words are
used.  And people *should* have doubts about other extensions.  Couldn't
any talented programmer write an extension which gives them a backdoor into
the deployer's system, and then upload it to github?

I would certainly be cautious about installing any old extension I found
some place on the internet.


> But the fact they aren't in core make them not fully trusted by some
> users.
>
Would it help if we called it "base" or "minimal" rather than "core" in the
docs?  (And called 'contrib' something different as well?  The docs already
do call it "Additional Supplied Modules" and use "contrib" only when
referring to the directory, not the concept.)


> Trying to explain all that in a training is a PITA. It would be much less
> confusing if they were either in core or in their own repository.
>

Several of the contrib modules should be kept in tight sync with src or
else testing and debugging would be a disaster. Putting them in different
git repositories wouldn't work well.  Unless those would be among the ones
moved to "core".

Cheers,

Jeff


Re: [HACKERS] pgindent vs emacs

2015-05-29 Thread Andrew Dunstan


On 05/29/2015 01:49 PM, Andres Freund wrote:
> On 2015-05-29 13:37:40 -0400, Andrew Dunstan wrote:
>> One of the annoying inconsistencies between emacs and pgindent is that emacs
>> refuses to offset a block following a case label, while pgindent does. Is
>> there anything we can do to induce emacs to do what pgindent does?
>
> Are you using the logic from src/tools/editors/emacs.samples?
>
> I don't see that problem here. I've further tuned my emacs for PG, but
> afaics nothing relevant for this but the above.




Hmm, yes, you're right, I was missing something. It also turns out it 
depends on stuff we can't put in .dir-locals.el.


Sorry for the noise.

cheers

andrew


Re: [HACKERS] fsync-pgdata-on-recovery tries to write to more files than previously

2015-05-29 Thread Andres Freund
On 2015-05-29 14:15:48 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > On 2015-05-29 13:49:16 -0400, Tom Lane wrote:
> >> Why can't the user stop it?
> 
> > Because it makes a good amount of sense to have e.g. certificates not
> > owned by postgres and not writeable? You don't necessarily want to
> > symlink them somewhere else, because that makes moving clusters around
> > harder than when they're self contained.
> 
> Meh.  Well, I'm willing to yield on the EACCES point, but I still find
> the exclusion for ETXTBSY to be ugly and inappropriate.

Ok.


Re: [HACKERS] fsync-pgdata-on-recovery tries to write to more files than previously

2015-05-29 Thread Tom Lane
Andres Freund  writes:
> On 2015-05-29 13:49:16 -0400, Tom Lane wrote:
>> Why can't the user stop it?

> Because it makes a good amount of sense to have e.g. certificates not
> owned by postgres and not writeable? You don't necessarily want to
> symlink them somewhere else, because that makes moving clusters around
> harder than when they're self contained.

Meh.  Well, I'm willing to yield on the EACCES point, but I still find
the exclusion for ETXTBSY to be ugly and inappropriate.

>> I'd say it's a pretty damn-fool arrangement: for starters, it's an
>> unnecessary security hazard.

> I don't buy the security argument at all. You likely have
> postgresql.conf in the data directory. You can write to at least .auto,
> which will definitely reside in the data directory. That contains
> archive_command.

The fact that a superuser might have multiple ways to subvert things
doesn't make it a good idea to add another one: the attack surface
could be larger, or at least different.  But even if you don't buy
that it's a security hazard, why would it be a good idea to have
executables inside $PGDATA?  That would for example lead to them getting
copied by pg_basebackup, which seems unlikely to be a good thing.
And if you did have such executables there, why would they be active
during a postmaster restart?

I really seriously doubt that this is either common enough or useful
enough to justify suppressing warning messages about it.
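
To make the disagreement concrete, here is a minimal sketch of the policy being
argued over: walk the data directory, fsync each regular file, and log failures
unless the errno is on a "quiet" list. It is not the server's code. The names
fsync_tree, should_log and QUIET_ERRNOS are hypothetical, and the sketch assumes
files are opened read-write before the fsync, which is why an unwritable
certificate would fail with EACCES. The quiet list reflects the position above:
yield on EACCES, but still report ETXTBSY.

```python
import errno
import os
from typing import FrozenSet, List, Tuple

# Hypothetical policy: stay silent about EACCES only. ETXTBSY and every
# other errno is still reported.
QUIET_ERRNOS: FrozenSet[int] = frozenset({errno.EACCES})


def should_log(err_no: int, quiet: FrozenSet[int] = QUIET_ERRNOS) -> bool:
    """Return True if a failure with this errno deserves a log line."""
    return err_no not in quiet


def fsync_tree(root: str, quiet: FrozenSet[int] = QUIET_ERRNOS) -> List[Tuple[str, int]]:
    """fsync every regular file under root; return the failures worth logging.

    Failures are collected rather than raised, mirroring the idea that the
    recovery-time fsync pass only logs problems and carries on.
    """
    failures: List[Tuple[str, int]] = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue  # this sketch does not follow symlinks
            try:
                # Assumption: read-write open, so unwritable files fail here.
                fd = os.open(path, os.O_RDWR)
            except OSError as exc:
                if should_log(exc.errno, quiet):
                    failures.append((path, exc.errno))
                continue
            try:
                os.fsync(fd)
            except OSError as exc:
                if should_log(exc.errno, quiet):
                    failures.append((path, exc.errno))
            finally:
                os.close(fd)
    return failures
```

Passing quiet=frozenset() gives the "log a complaint for any unwritable file"
behaviour; adding errno.ETXTBSY to the set gives the exclusion objected to above.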

regards, tom lane


Re: [HACKERS] Need Force flag for pg_drop_replication_slot()

2015-05-29 Thread Andres Freund
On 2015-05-29 10:53:30 -0700, Josh Berkus wrote:
> On 05/29/2015 10:45 AM, Stephen Frost wrote:
> So, here's the scenario:
> 
> 1. you're almost out of disk space due to a replica falling behind, like
> down to 16mb left.  Or maybe you are out of disk space.
> 
> 2. You need to drop the laggy replication slots in a hurry to get your
> master working again.
> 
> 3. Now you have to do this timing-sensitive two-stage drop to make it work.

How is this measurably worse than trying to truncate a log table that
has grown too large? That's often harder to fight actually, because
there's dozens of other processes that might be using the relation?  In
one case you don't have wait ordering, but only one locker, in the other
case you have multiple waiters, and to benefit from wait ordering you
need multiple sessions.

Again, I'm not against improving either situation, it's just that the
urgency argument doesn't seem worth its weight.


Note that all of this is 9.4 code, not 9.5.


Re: [HACKERS] RFC: Remove contrib entirely

2015-05-29 Thread Jeff Janes
On Thu, May 28, 2015 at 11:01 PM, Fabien COELHO  wrote:

>
>  FWIW, I don't mind which one we put in core and which one we put out of
>> core. But I like Joshua's idea of getting rid of contribs and pushing them
>> out as any other extensions.
>>
>
> Hmmm.
>
> I like the contrib directory as a living example of "how to do an
> extension" directly available in the source tree. It also allows to test
> in-tree that the extension mechanism works. So I think it should be kept at
> least with a minimum set of dummy examples for this purpose, even if all
> current extensions are moved out.
>

It is mostly an example of "how to do a contrib module" rather than "how
to do an extension".  There are differences between those things regarding
USE_PGXS and some other things.  If we want to keep it as an
example of what we want people to do in the future, it needs to be a really
good example.  And if we want to stop new things from going into contrib,
we wouldn't want to provide an example of how to put new things into it.


>
> Also, removing a feature is a regression, and someone is always bound to
> complain... What is the real benefit? ISTM that it is a solution that fixes
> no important problem. Reaching a consensus about what to move here or there
> will consume valuable time that could be spent on more important tasks...
> Is it worth it?
>

Yeah, I don't really see the benefit either.  Some could be moved to core,
and some could just be removed, but for many of them it just seems like we
would end up inventing a new 'contrib' which is the same as the old, but
with a different name.

Cheers,

Jeff


Re: [HACKERS] Need Force flag for pg_drop_replication_slot()

2015-05-29 Thread Josh Berkus
On 05/29/2015 11:01 AM, Stephen Frost wrote:
> * Josh Berkus (j...@agliodbs.com) wrote:
>> > 1. you're almost out of disk space due to a replica falling behind, like
>> > down to 16mb left.  Or maybe you are out of disk space.
> This right here is a real issue.  What I'd personally like to see is an
> option which says "you have X GB of disk space.  Once it's gone, forget
> about all replicas or failing archive commands or whatever, and just
> stop holding on to ancient WAL that you no longer need to operate."

The substantial challenge here is how do we determine that you're
"almost out of disk space"?

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


[HACKERS] postpone next week's release

2015-05-29 Thread Robert Haas
Hi,

I think we should postpone next week's release.  I have been hard at
work on the multixact-related bugs that were reported in 9.4.2 and
9.3.7, and the subsequent bugs found by code-reading, but getting them
all fixed by Monday doesn't seem realistic.  Such fixes should have
careful review, and not be dashed into the tree under time pressure.

We could do the release anyway to relieve the pain caused by the
fsync-pgdata hard-failure problem, but it seems to me that if we do
that, we're just going to end up having to do yet another release
almost right away.  I think it would be better to wait and do one
release that fixes both sets of issues.

Thoughts?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: [HACKERS] Need Force flag for pg_drop_replication_slot()

2015-05-29 Thread Stephen Frost
* Josh Berkus (j...@agliodbs.com) wrote:
> 1. you're almost out of disk space due to a replica falling behind, like
> down to 16mb left.  Or maybe you are out of disk space.

This right here is a real issue.  What I'd personally like to see is an
option which says "you have X GB of disk space.  Once it's gone, forget
about all replicas or failing archive commands or whatever, and just
stop holding on to ancient WAL that you no longer need to operate."

Perhaps there would be a warning threshold there too, where you start
getting complaints in the log if things are falling too far behind.
Ideally, you'd have a monitoring system which is checking for that, but
it'd be trivial to include and could be useful for environments that
don't have proper monitoring yet.

Having this work on the replicas would be nice too.  I realize we have
time-based constraints there which say "kill off queries which are
blocking us from moving forward after X time", but it'd be awful nice to
have a size-based way too, to avoid having PG crash when it runs out of
space.  I have to admit that I'm getting quite tired of the ways in
which PG can crash due to out of memory (yes, I know, it's the OOM
killer because of a misconfigured Linux box, but still), out of disk
space on the master, out of space on the replica, etc, etc.
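
As a rough illustration of the size-based idea above (not an existing PostgreSQL
feature), the sketch below takes a configured WAL budget and a warning fraction
and decides which old segments to stop retaining. Everything in it is
hypothetical: WalSegment, plan_wal_retention, max_wal_bytes and warn_fraction
are made-up names, and the needed_for_recovery flag stands in for "WAL you still
need to operate".

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class WalSegment:
    name: str
    size_bytes: int
    needed_for_recovery: bool  # hypothetical: the server itself still needs it


def plan_wal_retention(
    segments: List[WalSegment],
    max_wal_bytes: int,
    warn_fraction: float = 0.8,
) -> Tuple[List[str], bool]:
    """Decide which segments to drop so retained WAL fits the budget.

    segments must be ordered oldest first. Returns (names_to_remove, warn),
    where warn is True once usage reaches warn_fraction of the budget.
    """
    total = sum(seg.size_bytes for seg in segments)
    warn = total >= warn_fraction * max_wal_bytes

    to_remove: List[str] = []
    for seg in segments:
        if total <= max_wal_bytes:
            break
        if seg.needed_for_recovery:
            continue  # never drop WAL the server needs for its own recovery
        # Retained only for a lagging replica or failing archive command:
        # give it up rather than run out of disk.
        to_remove.append(seg.name)
        total -= seg.size_bytes
    return to_remove, warn
```

With four 16 MB segments and a 40 MB budget, the two oldest segments that are
not needed for recovery are selected for removal and the warning flag is set.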

Thanks!

Stephen


Re: [HACKERS] fsync-pgdata-on-recovery tries to write to more files than previously

2015-05-29 Thread Andres Freund
On 2015-05-29 13:49:16 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > On 2015-05-29 13:14:18 -0400, Tom Lane wrote:
> >> Abhijit Menon-Sen  writes:
> >> As I mentioned yesterday, I'm not really on board with ignoring EACCES,
> >> except for the directories-on-Windows case.  Since we're only logging
> >> the failures anyway, I think it is reasonable to log a complaint for any
> >> unwritable file in the data directory.
> 
> > That sounds like a potentially nontrivial amount of repetitive log bleat
> > after every crash start? One which the user can't really stop?
> 
> Why can't the user stop it?

Because it makes a good amount of sense to have e.g. certificates not
owned by postgres and not writeable? You don't necessarily want to
symlink them somewhere else, because that makes moving clusters around
harder than when they're self contained.

> I'd say it's a pretty damn-fool arrangement: for starters, it's an
> unnecessary security hazard.

I don't buy the security argument at all. You likely have
postgresql.conf in the data directory. You can write to at least .auto,
which will definitely reside in the data directory. That contains
archive_command.

Greetings,

Andres Freund

