date:20140604

Re: [HACKERS] Need to backpatch 2985e16 to 9.3 and further (HS regression test out)

2014-06-04 Thread Amit Langote

On Wed, Jun 4, 2014 at 11:10 PM, Fujii Masao  wrote:
> On Wed, Jun 4, 2014 at 3:26 PM, Amit Langote  wrote:
>> Hi,
>>
>> Following (commit 2985e16) has not been backpatched, I guess.
>>
>>  ANALYZE hs1;
>> -ERROR:  cannot execute VACUUM during recovery
>> +ERROR:  cannot execute ANALYZE during recovery
>>
>> Attached is a patch for this.
>
> Why did you cut off the following part? ISTM that also needs to be 
> back-patched.
> So we should just do "git cherry-pick 2985e16" on 9.3.
>
> -begin transaction isolation level serializable;
> +begin transaction isolation level repeatable read;
>

You are right, I did not pay attention to that at all.
Please find attached = 2985e16 ;-)

--
Amit


hs-regression-test-backpatch-fix.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Sigh, we need an initdb

2014-06-04 Thread Tom Lane

David G Johnston  writes:
> If we are planning on keeping this rule, which it seems like at least a few
> people feel is too stringent, maybe we can consider releasing an Alpha
> version and communicate the expectation that an initdb will be required to
> go from the alpha to beta1.  Then hopefully, but not certainly, no initdb
> needed once in the beta phase.  Basically convert beta1 into an alpha with
> that single policy/expectation change.

I think that would just amount to adding a month of dead time in what is
already a very long beta cycle.  Our past experience with releasing things
called "alphas" has been that people don't test them.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Sigh, we need an initdb

2014-06-04 Thread David G Johnston

Stefan Kaltenbrunner wrote
> On 06/04/2014 08:56 PM, Joshua D. Drake wrote:
>> 
>> On 06/04/2014 11:52 AM, Tom Lane wrote:
>> 
>>> I think we could possibly ship 9.4 without fixing this, but it would be
>>> imprudent.  Anyone think differently?
>>>
>>> Of course, if we do fix this then the door opens for pushing other
>>> initdb-forcing fixes into 9.4beta2, such as the LOBLKSIZE addition
>>> that I was looking at when I noticed this, or the pg_lsn catalog
>>> additions that were being discussed a couple weeks ago.
>> 
>> It certainly seems that if we are going to initdb anyway, let's do it
>> with approved features that got kicked (assuming) only because they
>> would cause an initdb.
> 
> agreed there - I dont think the "no initdb rule during BETA" really buys
> us that much these days. If people test our betas at all they do on
> scratch boxes in development/staging, i really doubt that (especially
> given the .0 history we had in the last years) people really move -BETA
> installs to production or expect to do so.

If we are planning on keeping this rule, which it seems like at least a few
people feel is too stringent, maybe we can consider releasing an Alpha
version and communicate the expectation that an initdb will be required to
go from the alpha to beta1.  Then hopefully, but not certainly, no initdb
needed once in the beta phase.  Basically convert beta1 into an alpha with
that single policy/expectation change.

David J.




--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/Sigh-we-need-an-initdb-tp5806058p5806137.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] slotname vs slot_name

2014-06-04 Thread Amit Langote

Hi,

On Thu, Jun 5, 2014 at 10:57 AM, Fujii Masao  wrote:
> I like using "slot_name" everywhere, i.e, even in recovery.conf.
> primary_slot_name seems not so long name.
>
> BTW, what about also renaming pg_llog directory? I'm afraid that
> a user can confuse pg_log with pg_llog.
>

Recently I came across this while tab-completing pg_log ;-)
I remember asking to document pg_llog elsewhere.

--
Amit


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] New pg_lsn type doesn't have hash/btree opclasses

2014-06-04 Thread Michael Paquier

On Thu, Jun 5, 2014 at 9:54 AM, Tom Lane  wrote:
> Andres Freund  writes:
>> On 2014-05-09 22:14:25 +0900, Michael Paquier wrote:
>>> [ patch ]
>
> I've committed a revised version of Andres' patch.
Thanks!

> I thought even that was kind of overkill; but a bigger problem is the
> output was sensitive to hash values which are not portable across
> different architectures.  With a bit of experimentation I found that
> a SELECT DISTINCT ... ORDER BY query would exercise both hashing and
> sorting, so that's what got committed.  (I'm not entirely sure though
> whether the plan will be stable across architectures; we might have
> to tweak the test case based on buildfarm feedback.)
Yeah this was a bit too much, and came up with some more light-weight
regression tests instead in the patch here:
http://www.postgresql.org/message-id/CAB7nPqSgZVrHRgsgUg63SCFY+AwH-=3judy7moq-_fo7wi4...@mail.gmail.com
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] slotname vs slot_name

2014-06-04 Thread Fujii Masao

On Thu, Jun 5, 2014 at 8:24 AM, Andres Freund  wrote:
> Hi,
>
> Due to the opened window of the pg_control/catalog version bump a chance
> has opened to fix a inconsistency I've recently been pointed
> towards:
> Namely that replication slots are named 'slot_name' in one half of the
> cases and 'slotname' in the other. That's in views, SRF columns,
> function parameters and the primary_slotname recovery.conf parameter.
>
> My personal tendency would be to make it slot_name everywhere except the
> primary_slotname recovery.conf parameter. There we already have
> precedent for shortening names.
>
> Other opinions?

I like using "slot_name" everywhere, i.e, even in recovery.conf.
primary_slot_name seems not so long name.

BTW, what about also renaming pg_llog directory? I'm afraid that
a user can confuse pg_log with pg_llog.

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [BUGS] BUG #9652: inet types don't support min/max

2014-06-04 Thread Haribabu Kommi

On Thu, Jun 5, 2014 at 9:12 AM, Andres Freund  wrote:
> Hi,
>
> On 2014-06-04 10:37:48 +1000, Haribabu Kommi wrote:
>> Thanks for the test. Patch is re-based to the latest head.
>
> Did you look at the source of the conflict? Did you intentionally mark
> the functions as leakproof and reviewed that that truly is the case? Or
> was that caused by copy & paste?

Yes it is copy & paste mistake. I didn't know much about that parameter.
Thanks for the information. I changed it.

Regards,
Hari Babu
Fujitsu Australia


inet_agg_v4.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] tests for client programs

2014-06-04 Thread Tom Lane

Peter Eisentraut  writes:
> On Wed, 2014-05-07 at 03:08 +0200, Andres Freund wrote:
>> As an additional issue it currently doesn't seem to work in VPATH
>> builds. That's imo a must fix.
>> A "cd $(srcdir) && .." in prove_installcheck and prove_check seems to do
>> the trick.

> Here is my proposed patch for this.

BTW, my Salesforce colleagues were complaining to me that this stuff
doesn't work at all on older Perl versions; apparently IPC::Run has
changed significantly since Perl 5.8 or so.  Can we do anything about
that?

They were also not too happy that the checks get skipped if IPC::Run isn't
installed (as it is not on stock RHEL, for instance).  It'd be better if
we could avoid depending on stuff that isn't in a pretty-vanilla Perl
installation.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] New pg_lsn type doesn't have hash/btree opclasses

2014-06-04 Thread Tom Lane

Andres Freund  writes:
> On 2014-05-09 22:14:25 +0900, Michael Paquier wrote:
>> [ patch ]

I've committed a revised version of Andres' patch.  Mostly cosmetic
fixes, but the hash implementation was still wrong:

return DirectFunctionCall1(hashint8, PG_GETARG_LSN(0));

DirectFunctionCallN takes Datums, not de-Datumified values.  This coding
would've crashed on 32-bit machines, where hashint8 would be expecting
to receive a Datum containing a pointer to int64.

We could have done it correctly like this:

return DirectFunctionCall1(hashint8, PG_GETARG_DATUM(0));

but I thought it was better, and microscopically more efficient, to
follow the existing precedent in timestamp_hash:

return hashint8(fcinfo);

since that avoids an extra FunctionCallInfoData setup.

>> On Fri, May 9, 2014 at 10:01 PM, Fujii Masao  wrote:
>>> The patch looks good to me except the name of index operator class.
>>> I think that "pg_lsn_ops" is better than "pglsn_ops" because it's for 
>>> "pg_lsn"
>>> data type.

I concur, and changed this.

> I honestly don't really see much point in the added tests. If at all I'd
> rather see my tests from
> http://archives.postgresql.org/message-id/20140506230722.GE24808%40awork2.anarazel.de
> addded. With some EXPLAIN (COSTS OFF) they'd test more.

I thought even that was kind of overkill; but a bigger problem is the
output was sensitive to hash values which are not portable across
different architectures.  With a bit of experimentation I found that
a SELECT DISTINCT ... ORDER BY query would exercise both hashing and
sorting, so that's what got committed.  (I'm not entirely sure though
whether the plan will be stable across architectures; we might have
to tweak the test case based on buildfarm feedback.)

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] tests for client programs

2014-06-04 Thread Peter Eisentraut

On Wed, 2014-05-07 at 03:08 +0200, Andres Freund wrote:
> > As an additional issue it currently doesn't seem to work in VPATH
> > builds. That's imo a must fix.
> 
> A "cd $(srcdir) && .." in prove_installcheck and prove_check seems to do
> the trick.

Here is my proposed patch for this.
diff --git a/src/Makefile.global.in b/src/Makefile.global.in
index 14119a1..93be859 100644
--- a/src/Makefile.global.in
+++ b/src/Makefile.global.in
@@ -301,13 +301,13 @@ PG_PROVE_FLAGS = --ext='.pl' -I $(top_srcdir)/src/test/perl/
 PROVE_FLAGS = --verbose
 
 define prove_installcheck
-PATH="$(bindir):$$PATH" PGPORT='6$(DEF_PGPORT)' $(PROVE) $(PG_PROVE_FLAGS) $(PROVE_FLAGS)
+cd $(srcdir) && CURDIR='$(CURDIR)' PATH="$(bindir):$$PATH" PGPORT='6$(DEF_PGPORT)' $(PROVE) $(PG_PROVE_FLAGS) $(PROVE_FLAGS)
 endef
 
 define prove_check
 $(MKDIR_P) tmp_check/log
 $(MAKE) -C $(top_builddir) DESTDIR=$(CURDIR)/tmp_check/install install >$(CURDIR)/tmp_check/log/install.log 2>&1
-PATH="$(CURDIR)/tmp_check/install$(bindir):$$PATH" PGPORT='6$(DEF_PGPORT)' $(PROVE) $(PG_PROVE_FLAGS) $(PROVE_FLAGS)
+cd $(srcdir) && CURDIR='$(CURDIR)' PATH="$(CURDIR)/tmp_check/install$(bindir):$$PATH" PGPORT='6$(DEF_PGPORT)' $(PROVE) $(PG_PROVE_FLAGS) $(PROVE_FLAGS)
 endef
 
 # Installation.
diff --git a/src/test/perl/TestLib.pm b/src/test/perl/TestLib.pm
index 8a31110..78622fa 100644
--- a/src/test/perl/TestLib.pm
+++ b/src/test/perl/TestLib.pm
@@ -62,7 +62,7 @@ $ENV{PGPORT} = int($ENV{PGPORT}) % 65536;
 
 sub tempdir
 {
-	return File::Temp::tempdir('test', DIR => cwd(), CLEANUP => 1);
+	return File::Temp::tempdir('test', DIR => $ENV{CURDIR} || cwd(), CLEANUP => 1);
 }
 
 my ($test_server_datadir, $test_server_logfile);

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Allowing join removals for more join types

2014-06-04 Thread Andres Freund

On 2014-06-04 20:04:07 -0400, Noah Misch wrote:
> On Wed, Jun 04, 2014 at 10:14:42AM -0400, Tom Lane wrote:
> > It's possible that we could apply the optimization only to queries that
> > have been issued directly by a client, but that seems rather ugly and
> > surprise-filled.
> 
> ... such as this idea.  It's a good start to a fairly-hard problem.  FKs are
> also always valid when afterTriggers->query_depth == -1, such as when all
> ongoing queries qualified for EXEC_FLAG_SKIP_TRIGGERS.  What else?  We could
> teach trigger.c to efficiently report whether a given table has a queued RI
> trigger.  Having done that, when plancache.c is building a custom plan, the
> planner could ignore FKs with pending RI checks and use the rest.  At that
> point, the surprises will be reasonably-isolated.

A bit more crazy, but how about trying trying to plan joins with a added
one-time qual that checks the size of the deferred trigger queue? Then
we wouldn't even need special case plans.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Allowing join removals for more join types

2014-06-04 Thread Noah Misch

On Wed, Jun 04, 2014 at 10:14:42AM -0400, Tom Lane wrote:
> David Rowley  writes:
> > On Wed, Jun 4, 2014 at 11:50 AM, Noah Misch  wrote:
> >> When a snapshot can see modifications that queued referential integrity
> >> triggers for some FK constraint, that constraint is not guaranteed to hold
> >> within the snapshot until those triggers have fired.
> 
> > I remember reading about some concerns with that here:
> > http://www.postgresql.org/message-id/51e2d505.5010...@2ndquadrant.com
> > But I didn't quite understand the situation where the triggers are delayed.
> > I just imagined that the triggers would have fired by the time the command
> > had completed. If that's not the case, when do the triggers fire? on

Normally, they fire in AfterTriggerEndQuery(), which falls at the end of a
command.  The trouble arises there when commands nest.  (If the constraint is
deferred, they fire just before COMMIT.)

> > commit? Right now I've no idea how to check for this in order to disable
> > the join removal.
> 
> I'm afraid that this point destroys your entire project :-( ... even
> without deferred constraints, there's no good way to be sure that you're
> not planning a query that will be run inside some function that can see
> the results of partially-completed updates.  The equivalent problem for
> unique indexes is tolerable because the default choice is immediate
> uniqueness enforcement, but there's no similar behavior for FKs.

Let's not give up just yet.  There's considerable middle ground between
ignoring the hazard and ignoring all FK constraints in the planner, ...

> It's possible that we could apply the optimization only to queries that
> have been issued directly by a client, but that seems rather ugly and
> surprise-filled.

... such as this idea.  It's a good start to a fairly-hard problem.  FKs are
also always valid when afterTriggers->query_depth == -1, such as when all
ongoing queries qualified for EXEC_FLAG_SKIP_TRIGGERS.  What else?  We could
teach trigger.c to efficiently report whether a given table has a queued RI
trigger.  Having done that, when plancache.c is building a custom plan, the
planner could ignore FKs with pending RI checks and use the rest.  At that
point, the surprises will be reasonably-isolated.

-- 
Noah Misch
EnterpriseDB http://www.enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Turning recovery.conf into GUCs

2014-06-04 Thread Joshua D. Drake



On 06/04/2014 04:35 PM, Andres Freund wrote:


Hi,

On 2014-06-04 16:32:33 -0700, Josh Berkus wrote:

Can we get this patch going again for 9.5?


A patch gets going by somebody working on it.


Well yes, but it is also great to have someone remind others that it is 
of interest.


JD




Greetings,

Andres Freund




--
Command Prompt, Inc. - http://www.commandprompt.com/  509-416-6579
PostgreSQL Support, Training, Professional Services and Development
High Availability, Oracle Conversion, Postgres-XC, @cmdpromptinc
Political Correctness is for cowards.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Turning recovery.conf into GUCs

2014-06-04 Thread Andres Freund

Hi,

On 2014-06-04 16:32:33 -0700, Josh Berkus wrote:
> Can we get this patch going again for 9.5?

A patch gets going by somebody working on it.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Turning recovery.conf into GUCs

2014-06-04 Thread Josh Berkus

All,

Can we get this patch going again for 9.5?


-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] btree_gist valgrind warnings about uninitialized memory

2014-06-04 Thread Tom Lane

Andres Freund  writes:
> On 2014-05-14 12:20:55 -0400, Tom Lane wrote:
>> Yeah, I don't think we want to bump the WAL version code post-beta1.
>> 
>> Probably better to assign the modified spgist record a new xl_info ID
>> number, so that a beta1 slave would throw an error for it.

> Since that ship has now sailed...? It's imo bad form to release a new
> version that overwrites the stack and heap, even if we can't see a
> concrete danger.

Yeah, no longer much reason to avoid changing the WAL version code.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Sigh, we need an initdb

2014-06-04 Thread Tom Lane

I wrote:
> Robert Haas  writes:
>> I think that's worth considering.  Another idea is: suppose we set up
>> a PostgreSQL database somewhere that contained information about what
>> controldata layout corresponded to what control version:

> That seems possibly workable.

BTW, a possibly-easier-to-implement idea is to git diff pg_control.h
against the previous release branch.  Any non-comment diffs (or, really,
anything at all except a copyright date change) should excite a warning
unless a change to the PG_CONTROL_VERSION line is included.  We'd probably
need a way to shut it up after a simple comment change, but otherwise
not a lot of infrastructure needed except a shell script.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] btree_gist valgrind warnings about uninitialized memory

2014-06-04 Thread Andres Freund

On 2014-05-14 12:20:55 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > On 2014-05-14 10:07:18 -0400, Tom Lane wrote:
> >> I think that's an OK restriction as long as we warn people about it
> >> (you could update a replication pair as long as you shut them both
> >> down cleanly at the same time, right?).  Can the WAL replay routine
> >> be made to detect incompatible records?
> 
> > We could just bump the wal version. Somewhat surprisingly that works if
> > both nodes are shutdown cleanly (primary first)... But the errors about
> > it are really ugly (will moan about unusable checkpoints), so it's
> > probably not a good idea. Especially as it'll make it an issue for all
> > users, not just the ones creating spgist indexes.
> 
> Yeah, I don't think we want to bump the WAL version code post-beta1.
> 
> Probably better to assign the modified spgist record a new xl_info ID
> number, so that a beta1 slave would throw an error for it.

Since that ship has now sailed...? It's imo bad form to release a new
version that overwrites the stack and heap, even if we can't see a
concrete danger.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] slotname vs slot_name

2014-06-04 Thread Andres Freund

Hi,

Due to the opened window of the pg_control/catalog version bump a chance
has opened to fix a inconsistency I've recently been pointed
towards:
Namely that replication slots are named 'slot_name' in one half of the
cases and 'slotname' in the other. That's in views, SRF columns,
function parameters and the primary_slotname recovery.conf parameter.

My personal tendency would be to make it slot_name everywhere except the
primary_slotname recovery.conf parameter. There we already have
precedent for shortening names.

Other opinions?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Sigh, we need an initdb

2014-06-04 Thread Tom Lane

Robert Haas  writes:
> On Wed, Jun 4, 2014 at 4:37 PM, Andrew Dunstan  wrote:
>> I agree it's scary but in a few minutes thinking about it I haven't been
>> able to come up with a good way of checking it. Maybe we could build
>> sizeof(ControlData) into the version number, so instead of 937 we'd have
>> 937n. Then we could check the n against what we know we is the size.
>> I realize this isn't perfect, but might be better than nothing.

> I think that's worth considering.  Another idea is: suppose we set up
> a PostgreSQL database somewhere that contained information about what
> controldata layout corresponded to what control version:

> CREATE TABLE control_formats (version_number integer, data_types text[]);

> Every time it runs, it checks out the latest source code.  It checks
> whether the control version is already present in the table; if so, it
> verifies that the data types match.  If they don't, it makes something
> turn red.  If the control version isn't present yet, it inserts
> whatever types it sees as the definitive record of what the new
> version number means.

That seems possibly workable.  Merely checking sizeof(ControlData) isn't
real helpful since (1) it might not catch field additions because of
alignment padding, and (2) it's not clear that that number is, or should
be, entirely architecture-independent.  But I think a list of the C data
type names of the fields (and maybe the fields' own names, for good
measure) would be reasonably robust.

Not sure how we'd scale this idea to catversion or WAL version, but
I think both of those are less significant hazards.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Sigh, we need an initdb

2014-06-04 Thread Robert Haas

On Wed, Jun 4, 2014 at 4:37 PM, Andrew Dunstan  wrote:
> On 06/04/2014 03:50 PM, Robert Haas wrote:
>> On Wed, Jun 4, 2014 at 2:52 PM, Tom Lane  wrote:
>>> I just noticed that we had not one, but two commits in 9.4 that added
>>> fields to pg_control.  And neither one changed PG_CONTROL_VERSION.
>>> This is inexcusable sloppiness on the part of the committers involved,
>>> but the question is what do we do now?
>>
>> I think it would be an awfully good idea to think about what we could
>> put into the buildfarm, the git repository, or the source tree to get
>> some automatic notification when somebody screws up this way (or the
>> xlog header magic, or catversion).  The first of those two screw-ups
>> (by me) was 11 months ago today; it's pretty scary that we're only
>> just now noticing.
>>
>
> I agree it's scary but in a few minutes thinking about it I haven't been
> able to come up with a good way of checking it. Maybe we could build
> sizeof(ControlData) into the version number, so instead of 937 we'd have
> 937n. Then we could check the n against what we know we is the size.
> I realize this isn't perfect, but might be better than nothing.

I think that's worth considering.  Another idea is: suppose we set up
a PostgreSQL database somewhere that contained information about what
controldata layout corresponded to what control version:

CREATE TABLE control_formats (version_number integer, data_types text[]);

Every time it runs, it checks out the latest source code.  It checks
whether the control version is already present in the table; if so, it
verifies that the data types match.  If they don't, it makes something
turn red.  If the control version isn't present yet, it inserts
whatever types it sees as the definitive record of what the new
version number means.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [BUGS] BUG #9652: inet types don't support min/max

2014-06-04 Thread Andres Freund

Hi,

On 2014-06-04 10:37:48 +1000, Haribabu Kommi wrote:
> Thanks for the test. Patch is re-based to the latest head.

Did you look at the source of the conflict? Did you intentionally mark
the functions as leakproof and reviewed that that truly is the case? Or
was that caused by copy & paste?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] "pivot aggregation" with a patched intarray

2014-06-04 Thread Ali Akbar

2014-06-01 20:48 GMT+07:00 Marc Mamin :
>
> >On Sat, May 31, 2014 at 12:31 AM, Marc Mamin  wrote:
> >> I have patched intarray with 3 additional functions in order to 
> >> count[distinct] event IDs
> >> into arrays, whereas the array position correspond to the integer values. 
> >> (mimic column oriented storage)
> >
> >I didn't look at the feature itself, but here are some comments about
> >the format of the patch:
> >- Be careful the newlines on the file you posted use ¥r¥n, which is
> >purely Windows stuff... This will generate unnecessary diffs with the
> >source code
> I don't mean to suggests this directly as a patch,
> I'm first interested to see if there are some interest
> for such an aggregation type.

From what i see, the icount_to_array is complementary to standard
count() aggregates, but it produces array. If the values are not
sparse, i think the performance and memory/storage benefit you
mentioned will be true. But if the values are sparse, there will be
many 0's, how it will perform?

I'm interested to benchmark it with some use cases, to confirm the
performance benefits of it.

--
Ali Akbar


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [BUGS] BUG #9652: inet types don't support min/max

2014-06-04 Thread Tom Lane

Andres Freund  writes:
> On 2014-06-03 10:37:53 -0400, Tom Lane wrote:
>> It hasn't even got a comment saying why changes here should
>> receive any scrutiny; moreover, it's not in a file where changes would be
>> likely to excite suspicion.  (Probably it should be in opr_sanity, if
>> we're going to have such a thing at all.)

> I've written up the attached patch that moves the test to opr_sanity and
> adds a littlebit of commentary. Will apply unless somebody protests in
> the next 24h or so.

+1, but as long as we're touching this, could we make the output be

SELECT oid::regprocedure, prorettype::regtype FROM pg_proc ...

Same information, but more readable IMO.  (I'm not really sure why
we need to show prorettype here at all, btw.)

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Sigh, we need an initdb

2014-06-04 Thread Noah Misch

On Wed, Jun 04, 2014 at 09:16:36PM +0200, Stefan Kaltenbrunner wrote:
> On 06/04/2014 08:56 PM, Joshua D. Drake wrote:
> > On 06/04/2014 11:52 AM, Tom Lane wrote:
> >> I think we could possibly ship 9.4 without fixing this, but it would be
> >> imprudent.  Anyone think differently?
> >>
> >> Of course, if we do fix this then the door opens for pushing other
> >> initdb-forcing fixes into 9.4beta2, such as the LOBLKSIZE addition
> >> that I was looking at when I noticed this, or the pg_lsn catalog
> >> additions that were being discussed a couple weeks ago.
> > 
> > It certainly seems that if we are going to initdb anyway, let's do it
> > with approved features that got kicked (assuming) only because they
> > would cause an initdb.
> 
> agreed there - I dont think the "no initdb rule during BETA" really buys
> us that much these days. If people test our betas at all they do on
> scratch boxes in development/staging, i really doubt that (especially
> given the .0 history we had in the last years) people really move -BETA
> installs to production or expect to do so.

+1.  You need a microscope to see the gain from imposing that rule.  Even if
people do move beta installs to production, that's just a pg_upgrade away.

-- 
Noah Misch
EnterpriseDB http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [BUGS] BUG #9652: inet types don't support min/max

2014-06-04 Thread Andres Freund

On 2014-06-03 10:37:53 -0400, Tom Lane wrote:
> It hasn't even got a comment saying why changes here should
> receive any scrutiny; moreover, it's not in a file where changes would be
> likely to excite suspicion.  (Probably it should be in opr_sanity, if
> we're going to have such a thing at all.)

I've written up the attached patch that moves the test to opr_sanity and
adds a littlebit of commentary. Will apply unless somebody protests in
the next 24h or so.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
>From 61339023f026be3e61acc5e29038e0b3484b4ddc Mon Sep 17 00:00:00 2001
From: Andres Freund 
Date: Thu, 5 Jun 2014 00:52:49 +0200
Subject: [PATCH] Move regression test listing of builtin leakproof functions
 to opr_sanity.sql.

The original location in create_function_3.sql didn't invite the close
structinity warranted for adding new leakproof functions. Add comments
to the test explaining that functions should only be added after
careful consideration and understanding what a leakproof function is.

Per complaint from Tom Lane after 5eebb8d954ad.
---
 src/test/regress/expected/create_function_3.out | 238 --
 src/test/regress/expected/opr_sanity.out| 249 +++-
 src/test/regress/sql/create_function_3.sql  |  14 --
 src/test/regress/sql/opr_sanity.sql |  25 ++-
 4 files changed, 268 insertions(+), 258 deletions(-)

diff --git a/src/test/regress/expected/create_function_3.out b/src/test/regress/expected/create_function_3.out
index a4b2d99..6a4352c 100644
--- a/src/test/regress/expected/create_function_3.out
+++ b/src/test/regress/expected/create_function_3.out
@@ -149,244 +149,6 @@ CREATE FUNCTION functext_E_3(int) RETURNS bool LANGUAGE 'sql'
LEAKPROOF AS 'SELECT $1 < 200';	-- failed
 ERROR:  only superuser can define a leakproof function
 RESET SESSION AUTHORIZATION;

--- list of built-in leakproof functions

--- temporarily disable fancy output, so catalog changes create less diff noise
-\a\t
-SELECT proname, prorettype::regtype, proargtypes::regtype[]
-   FROM pg_proc JOIN pg_namespace ON pronamespace = pg_namespace.oid
-   WHERE nspname = 'pg_catalog' AND proleakproof ORDER BY proname;
-abstimeeq|boolean|[0:1]={abstime,abstime}
-abstimege|boolean|[0:1]={abstime,abstime}
-abstimegt|boolean|[0:1]={abstime,abstime}
-abstimele|boolean|[0:1]={abstime,abstime}
-abstimelt|boolean|[0:1]={abstime,abstime}
-abstimene|boolean|[0:1]={abstime,abstime}
-biteq|boolean|[0:1]={bit,bit}
-bitge|boolean|[0:1]={bit,bit}
-bitgt|boolean|[0:1]={bit,bit}
-bitle|boolean|[0:1]={bit,bit}
-bitlt|boolean|[0:1]={bit,bit}
-bitne|boolean|[0:1]={bit,bit}
-booleq|boolean|[0:1]={boolean,boolean}
-boolge|boolean|[0:1]={boolean,boolean}
-boolgt|boolean|[0:1]={boolean,boolean}
-boolle|boolean|[0:1]={boolean,boolean}
-boollt|boolean|[0:1]={boolean,boolean}
-boolne|boolean|[0:1]={boolean,boolean}
-bpchareq|boolean|[0:1]={character,character}
-bpcharne|boolean|[0:1]={character,character}
-byteaeq|boolean|[0:1]={bytea,bytea}
-byteage|boolean|[0:1]={bytea,bytea}
-byteagt|boolean|[0:1]={bytea,bytea}
-byteale|boolean|[0:1]={bytea,bytea}
-bytealt|boolean|[0:1]={bytea,bytea}
-byteane|boolean|[0:1]={bytea,bytea}
-cash_eq|boolean|[0:1]={money,money}
-cash_ge|boolean|[0:1]={money,money}
-cash_gt|boolean|[0:1]={money,money}
-cash_le|boolean|[0:1]={money,money}
-cash_lt|boolean|[0:1]={money,money}
-cash_ne|boolean|[0:1]={money,money}
-chareq|boolean|[0:1]={"\"char\"","\"char\""}
-charge|boolean|[0:1]={"\"char\"","\"char\""}
-chargt|boolean|[0:1]={"\"char\"","\"char\""}
-charle|boolean|[0:1]={"\"char\"","\"char\""}
-charlt|boolean|[0:1]={"\"char\"","\"char\""}
-charne|boolean|[0:1]={"\"char\"","\"char\""}
-cideq|boolean|[0:1]={cid,cid}
-circle_eq|boolean|[0:1]={circle,circle}
-circle_ge|boolean|[0:1]={circle,circle}
-circle_gt|boolean|[0:1]={circle,circle}
-circle_le|boolean|[0:1]={circle,circle}
-circle_lt|boolean|[0:1]={circle,circle}
-circle_ne|boolean|[0:1]={circle,circle}
-date_eq|boolean|[0:1]={date,date}
-date_ge|boolean|[0:1]={date,date}
-date_gt|boolean|[0:1]={date,date}
-date_le|boolean|[0:1]={date,date}
-date_lt|boolean|[0:1]={date,date}
-date_ne|boolean|[0:1]={date,date}
-float48eq|boolean|[0:1]={real,"double precision"}
-float48ge|boolean|[0:1]={real,"double precision"}
-float48gt|boolean|[0:1]={real,"double precision"}
-float48le|boolean|[0:1]={real,"double precision"}
-float48lt|boolean|[0:1]={real,"double precision"}
-float48ne|boolean|[0:1]={real,"double precision"}
-float4eq|boolean|[0:1]={real,real}
-float4ge|boolean|[0:1]={real,real}
-float4gt|boolean|[0:1]={real,real}
-float4le|boolean|[0:1]={real,real}
-float4lt|boolean|[0:1]={real,real}
-float4ne|boolean|[0:1]={real,real}
-float84eq|boolean|[0:1]={"double precision",real}
-float84ge|boolean|[0:1]={"double precision",real}
-float84gt|boolean|[0:1]={"double precision",real}
-float84le|boolean|[0

Re: [HACKERS] pg_control is missing a field for LOBLKSIZE

2014-06-04 Thread Tom Lane

Stephen Frost  writes:
> * Tom Lane (t...@sss.pgh.pa.us) wrote:
>> There are at least two places in inv_api.c where we have
>> "Assert(pagelen <= LOBLKSIZE)" that is protecting a subsequent memcpy
>> into a local variable of size LOBLKSIZE, so that the only thing standing
>> between us and a stack-smash security issue that's trivially exploitable
>> in production builds is that on-disk data conforms to our expectation
>> about LOBLKSIZE.  I think it's definitely worth promoting these checks
>> to regular runtime-if-test-and-elog.

> Agreed.  Promoting that to a run-time check seems well worth it to me.

Here's a draft patch for this.  Barring objections I'll commit the whole
thing to HEAD, and the inv_api.c changes to the back branches as well.

regards, tom lane

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index d675560..a61878e 100644
*** a/src/backend/access/transam/xlog.c
--- b/src/backend/access/transam/xlog.c
***
*** 49,54 
--- 49,55 
  #include "storage/bufmgr.h"
  #include "storage/fd.h"
  #include "storage/ipc.h"
+ #include "storage/large_object.h"
  #include "storage/latch.h"
  #include "storage/pmsignal.h"
  #include "storage/predicate.h"
*** WriteControlFile(void)
*** 4352,4357 
--- 4353,4359 
  	ControlFile->indexMaxKeys = INDEX_MAX_KEYS;
  
  	ControlFile->toast_max_chunk_size = TOAST_MAX_CHUNK_SIZE;
+ 	ControlFile->loblksize = LOBLKSIZE;
  
  #ifdef HAVE_INT64_TIMESTAMP
  	ControlFile->enableIntTimes = true;
*** ReadControlFile(void)
*** 4545,4550 
--- 4547,4559 
  " but the server was compiled with TOAST_MAX_CHUNK_SIZE %d.",
  			  ControlFile->toast_max_chunk_size, (int) TOAST_MAX_CHUNK_SIZE),
   errhint("It looks like you need to recompile or initdb.")));
+ 	if (ControlFile->loblksize != LOBLKSIZE)
+ 		ereport(FATAL,
+ (errmsg("database files are incompatible with server"),
+ 		  errdetail("The database cluster was initialized with LOBLKSIZE %d,"
+ 	" but the server was compiled with LOBLKSIZE %d.",
+ 	ControlFile->loblksize, (int) LOBLKSIZE),
+  errhint("It looks like you need to recompile or initdb.")));
  
  #ifdef HAVE_INT64_TIMESTAMP
  	if (ControlFile->enableIntTimes != true)
diff --git a/src/backend/storage/large_object/inv_api.c b/src/backend/storage/large_object/inv_api.c
index 57ec1c2..0918142 100644
*** a/src/backend/storage/large_object/inv_api.c
--- b/src/backend/storage/large_object/inv_api.c
*** myLargeObjectExists(Oid loid, Snapshot s
*** 173,185 
  }
  
  
! static int32
! getbytealen(bytea *data)
  {
! 	Assert(!VARATT_IS_EXTENDED(data));
! 	if (VARSIZE(data) < VARHDRSZ)
! 		elog(ERROR, "invalid VARSIZE(data)");
! 	return (VARSIZE(data) - VARHDRSZ);
  }
  
  
--- 173,210 
  }
  
  
! /*
!  * Extract data field from a pg_largeobject tuple, detoasting if needed
!  * and verifying that the length is sane.  Returns data pointer (a bytea *),
!  * data length, and an indication of whether to pfree the data pointer.
!  */
! static void
! getdatafield(Form_pg_largeobject tuple,
! 			 bytea **pdatafield,
! 			 int *plen,
! 			 bool *pfreeit)
  {
! 	bytea	   *datafield;
! 	int			len;
! 	bool		freeit;
! 
! 	datafield = &(tuple->data); /* see note at top of file */
! 	freeit = false;
! 	if (VARATT_IS_EXTENDED(datafield))
! 	{
! 		datafield = (bytea *)
! 			heap_tuple_untoast_attr((struct varlena *) datafield);
! 		freeit = true;
! 	}
! 	len = VARSIZE(datafield) - VARHDRSZ;
! 	if (len < 0 || len > LOBLKSIZE)
! 		ereport(ERROR,
! (errcode(ERRCODE_DATA_CORRUPTED),
!  errmsg("pg_largeobject entry for OID %u, page %d has invalid data field size %d",
! 		tuple->loid, tuple->pageno, len)));
! 	*pdatafield = datafield;
! 	*plen = len;
! 	*pfreeit = freeit;
  }
  
  
*** inv_getsize(LargeObjectDesc *obj_desc)
*** 366,385 
  	{
  		Form_pg_largeobject data;
  		bytea	   *datafield;
  		bool		pfreeit;
  
  		if (HeapTupleHasNulls(tuple))	/* paranoia */
  			elog(ERROR, "null field found in pg_largeobject");
  		data = (Form_pg_largeobject) GETSTRUCT(tuple);
! 		datafield = &(data->data);		/* see note at top of file */
! 		pfreeit = false;
! 		if (VARATT_IS_EXTENDED(datafield))
! 		{
! 			datafield = (bytea *)
! heap_tuple_untoast_attr((struct varlena *) datafield);
! 			pfreeit = true;
! 		}
! 		lastbyte = (uint64) data->pageno * LOBLKSIZE + getbytealen(datafield);
  		if (pfreeit)
  			pfree(datafield);
  	}
--- 391,404 
  	{
  		Form_pg_largeobject data;
  		bytea	   *datafield;
+ 		int			len;
  		bool		pfreeit;
  
  		if (HeapTupleHasNulls(tuple))	/* paranoia */
  			elog(ERROR, "null field found in pg_largeobject");
  		data = (Form_pg_largeobject) GETSTRUCT(tuple);
! 		getdatafield(data, &datafield, &len, &pfreeit);
! 		lastbyte = (uint64) data->pageno * LOBLKSIZE + len;
  		if (pfreeit)
  			pfree(datafield);
  	}
*** inv_read(LargeObjectDesc *obj_desc, char
*** 506,520

Re: [HACKERS] Fix xpath() to return namespace definitions

2014-06-04 Thread Ali Akbar

2014-06-05 2:37 GMT+07:00 Robert Haas :

> Please add your patch here so we don't forget about it:
>
> https://commitfest.postgresql.org/action/commitfest_view/open
>

Added: https://commitfest.postgresql.org/action/patch_view?id=1461


> Please see also:
>
> https://wiki.postgresql.org/wiki/Submitting_a_Patch
>

Thanks, i've read it.
As for the suggestion there: "Start out by reviewing a patch or responding
to email on the lists. Even if it is unrelated to what you're doing", i'll
look around other posts in this list.

For back versions, i think because this patch changes xpath() behavior, we
will only apply this to future versions. The old behavior is wrong
(according to XPath standard) for not including namespaces, but maybe there
are some application that depends on the old behavior.

Thanks!
--
Ali Akbar

Re: [HACKERS] Sigh, we need an initdb

2014-06-04 Thread Tom Lane

Greg Stark  writes:
> Fwiw I tried to use the pg_lsn data type the other day and pretty much
> immediately ran into the lack of the btree operator class. Pretty much
> the first thing I did when I loaded the data was "select ... order by
> lsn".

Agreed, now that we're going to force an initdb anyway, that fix is
high priority.  I'll take a look at the patch shortly.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Sigh, we need an initdb

2014-06-04 Thread Andres Freund

On 2014-06-04 17:03:52 -0400, Tom Lane wrote:
> Actually, that statement makes me realize that if we fix
> PG_CONTROL_VERSION then it's a good idea to *also* do some regular catalog
> changes, or at least bump catversion.  Otherwise pg_upgrade won't be able to
> cope with upgrading non-default tablespaces in beta1 installations.

Heh. That's not a particularly nice property, although I can see where
it's coming from. Probably not really problematic because catversion
updates are so much more frequent.

> For the moment I'll just go bump PG_CONTROL_VERSION, assuming that we have
> enough other things on the table that at least one of them will result in
> a catversion bump before beta2.

The slot_name vs slotname thing seems uncontroversial enough since
slot_name is the thing that already appears everywhere in the docs and
it's what we'd agreed upon onlist. It's just that not everything got the
message.

> I have no objection to these as long as we can get some consensus on the
> new names (and personally I don't much care what those are, but I agree
> "xmin" for a user column is a bad idea).

I won't do anything about this one though until we've agreed upon
something.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Sigh, we need an initdb

2014-06-04 Thread Greg Stark

On Wed, Jun 4, 2014 at 9:52 PM, Andres Freund  wrote:
> Other things I'd like to change in that case:


Fwiw I tried to use the pg_lsn data type the other day and pretty much
immediately ran into the lack of the btree operator class. Pretty much
the first thing I did when I loaded the data was "select ... order by
lsn".

-- 
greg


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Sigh, we need an initdb

2014-06-04 Thread Tom Lane

Andres Freund  writes:
> On 2014-06-04 14:52:35 -0400, Tom Lane wrote:
>> I think we could possibly ship 9.4 without fixing this, but it would be
>> imprudent.  Anyone think differently?

> Agreed. Additionally I also agree with Stefan that the price of a initdb
> during beta isn't that high these days.

Yeah, if nothing else it gives testers another opportunity to exercise
pg_upgrade ;-).  The policy about post-beta1 initdb is "avoid if
practical", not "avoid at all costs".

Actually, that statement makes me realize that if we fix
PG_CONTROL_VERSION then it's a good idea to *also* do some regular catalog
changes, or at least bump catversion.  Otherwise pg_upgrade won't be able to
cope with upgrading non-default tablespaces in beta1 installations.

For the moment I'll just go bump PG_CONTROL_VERSION, assuming that we have
enough other things on the table that at least one of them will result in
a catversion bump before beta2.

> Other things I'd like to change in that case:

I have no objection to these as long as we can get some consensus on the
new names (and personally I don't much care what those are, but I agree
"xmin" for a user column is a bad idea).

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Sigh, we need an initdb

2014-06-04 Thread Andres Freund

Hi,

On 2014-06-04 14:52:35 -0400, Tom Lane wrote:
> I think we could possibly ship 9.4 without fixing this, but it would be
> imprudent.  Anyone think differently?

Agreed. Additionally I also agree with Stefan that the price of a initdb
during beta isn't that high these days.

> Of course, if we do fix this then the door opens for pushing other
> initdb-forcing fixes into 9.4beta2, such as the LOBLKSIZE addition
> that I was looking at when I noticed this, or the pg_lsn catalog
> additions that were being discussed a couple weeks ago.

Other things I'd like to change in that case:

* rename pg_replication_slots.xmin to something else. After the
  replication slot patch went in, in another patch's review you/Tom
  objected that xmin isn't the best name. The only problem being that I
  still don't have a better idea than my original 'data_xmin' which
  Robert disliked.

* Standardize on either slot_name for the replication slot's
  name. Currently some functions have a parameter named 'slotname' but
  all columnnames (from SRFs) are slot_name. That's not really bad since
  the parameter names don't really mean much, but if we can we should
  fix it imo.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Sigh, we need an initdb

2014-06-04 Thread Andrew Dunstan



On 06/04/2014 03:50 PM, Robert Haas wrote:

On Wed, Jun 4, 2014 at 2:52 PM, Tom Lane  wrote:

I just noticed that we had not one, but two commits in 9.4 that added
fields to pg_control.  And neither one changed PG_CONTROL_VERSION.
This is inexcusable sloppiness on the part of the committers involved,
but the question is what do we do now?

I think it would be an awfully good idea to think about what we could
put into the buildfarm, the git repository, or the source tree to get
some automatic notification when somebody screws up this way (or the
xlog header magic, or catversion).  The first of those two screw-ups
(by me) was 11 months ago today; it's pretty scary that we're only
just now noticing.



I agree it's scary but in a few minutes thinking about it I haven't been 
able to come up with a good way of checking it. Maybe we could build 
sizeof(ControlData) into the version number, so instead of 937 we'd have 
937n. Then we could check the n against what we know we is the 
size. I realize this isn't perfect, but might be better than nothing.


cheers

andrew


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Sigh, we need an initdb

2014-06-04 Thread David G Johnston

Robert Haas wrote
> On Wed, Jun 4, 2014 at 2:52 PM, Tom Lane <

> tgl@.pa

> > wrote:
>> I just noticed that we had not one, but two commits in 9.4 that added
>> fields to pg_control.  And neither one changed PG_CONTROL_VERSION.
>> This is inexcusable sloppiness on the part of the committers involved,
>> but the question is what do we do now?
> 
> I think it would be an awfully good idea to think about what we could
> put into the buildfarm, the git repository, or the source tree to get
> some automatic notification when somebody screws up this way (or the
> xlog header magic, or catversion).  The first of those two screw-ups
> (by me) was 11 months ago today; it's pretty scary that we're only
> just now noticing.

Not withstanding Tom's comments on the topic a regression test could work
here.

There was just a recent "leakproof" function discovery that resulted from a
regression test that compared all known leakproof functions to those in the
current catalog.

When the test fails there should be additional instruction - like "Please
alter this output file AND bump PG_CONTROL_VERSION!"

David J.




--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/Sigh-we-need-an-initdb-tp5806058p5806071.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] recovery testing for beta

2014-06-04 Thread Andres Freund

Hi Jeff,

On 2014-05-29 09:39:56 -0700, Jeff Janes wrote:
> What features in 9.4 need more beta testing for recovery?

Another thing I'd like to add to the list is wal_level=logical. Not such
much the logical decoding side, but that we haven't screwed up normal
crash recovery/wal replay.

> I also applied the foreign key test retroactively to 9.3, and it quickly
> re-found the multixact bugs up until commit 9a57858f1103b89a5674.

But you haven't found 1a917ae8610d44985f (master) / c0bd128c81c2b23a
(REL9_3_STABLE)? That's somewhat interesting...

Thanks for all the testing,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Hide 'Execution time' in EXPLAIN (COSTS OFF)

2014-06-04 Thread Andres Freund

On 2014-06-03 15:08:15 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > In 9.4. COSTS OFF for EXPLAIN prevents 'Planning time' to be
> > printed. Should we perhaps do the same for 'Execution time'? That'd make
> > it possible to use EXPLAIN (ANALYZE, COSTS OFF, TIMING OFF) in
> > regression tests.
> 
> > Currently the output for that is:
> > postgres=# EXPLAIN (ANALYZE, TIMING OFF, COSTS OFF) SELECT 1;
> >QUERY PLAN
> > 
> >  Result (actual rows=1 loops=1)
> >  Total runtime: 0.035 ms
> > (2 rows)
> 
> > Leaving off the total runtime doesn't seem bad to me.
> 
> It seems a little weird to call it a "cost" ... but maybe that
> ship has sailed given how we're treating the planning-time item.
> 
> I'm unconvinced that this'd add much to our regression testing capability,
> though.  The standard thing is to do an EXPLAIN to check the plan shape
> and then run the query to see if it gets the right answer.  Checking row
> counts is pretty well subsumed by the latter, and is certainly not an
> adequate substitute for it.
> 
> So on the whole, -1 ... this is an unintuitive and
> non-backwards-compatible change that doesn't look like it buys much.

I've added the regression test I want this for.

0001 is the bugfix making me look into it
0002 is COSTS OFF removing the display of execution time
0003 is the regression test

Note that 0003 will require a kill -9 without 0001.

I am not sure myself if the test is really worth it. On one hand it's an
area that had seen several hard to find bugs over the years and is
likely to see further changes (e.g. CSN stuff) in the near future, on
the other hand the tests are tricky and require specific ordering.

Opinions?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
>From 621a99a666ba1a27b852dc5ddc0e1b224c388f53 Mon Sep 17 00:00:00 2001
From: Andres Freund 
Date: Wed, 4 Jun 2014 21:36:19 +0200
Subject: [PATCH 1/3] Fix longstanding bug in HeapTupleSatisfiesVacuum().

HeapTupleSatisfiesVacuum() didn't properly discern between
DELETE_IN_PROGRESS and INSERT_IN_PROGRESS for rows that have been
inserted in the current transaction and deleted in a aborted
subtransaction of the current backend. At the very least that caused
problems for CLUSTER and CREATE INDEX in transactions that had
aborting subtransactions producing rows, leading to warnings like:
WARNING:  concurrent delete in progress within table "..."
possibly in an endless, uninterruptible, loop.

Instead of treating *InProgress xmins the same as *IsCurrent ones,
treat them as being distinct like the other visibility routines. As
implemented this separatation can cause a behaviour change for rows
that have been inserted and deleted in another, still running,
transaction. HTSV will now return INSERT_IN_PROGRESS instead of
DELETE_IN_PROGRESS for those. That's both, more in line with the other
visibility routines and arguably more correct. The latter because a
INSERT_IN_PROGRESS will make callers look at/wait for xmin, instead of
xmax.
The only current caller where that's possibly worse than the old
behaviour is heap_prune_chain() which now won't mark the page as
prunable if a row has concurrently been inserted and deleted. That's
harmless enough.

As a cautionary measure also insert a interrupt check before the gotos
in IndexBuildHeapScan() that lead to the uninterruptible loop. There
are other possible causes, like a row that several sessions try to
update and all fail, for repeated loops and the cost of doing so in
the retry case is low.

As this bug goes back all the way to the introduction of
subtransactions in 573a71a5da backpatch to all supported releases.

Reported-By: Sandro Santilli
---
 src/backend/catalog/index.c|  2 ++
 src/backend/utils/time/tqual.c | 19 +--
 2 files changed, 19 insertions(+), 2 deletions(-)

diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 80acc0e..a5a204e 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -2298,6 +2298,7 @@ IndexBuildHeapScan(Relation heapRelation,
 			XactLockTableWait(xwait, heapRelation,
 			  &heapTuple->t_data->t_ctid,
 			  XLTW_InsertIndexUnique);
+			CHECK_FOR_INTERRUPTS();
 			goto recheck;
 		}
 	}
@@ -2346,6 +2347,7 @@ IndexBuildHeapScan(Relation heapRelation,
 			XactLockTableWait(xwait, heapRelation,
 			  &heapTuple->t_data->t_ctid,
 			  XLTW_InsertIndexUnique);
+			CHECK_FOR_INTERRUPTS();
 			goto recheck;
 		}

diff --git a/src/backend/utils/time/tqual.c b/src/backend/utils/time/tqual.c
index 75cd53e..96874ab 100644
--- a/src/backend/utils/time/tqual.c
+++ b/src/backend/utils/time/tqual.c
@@ -1166,7 +1166,7 @@ HeapTupleSatisfiesVacuum(HeapTuple htup, TransactionId OldestXmin,
 return HEAPTUPLE_DEAD;
 			}
 		}
-		else if (TransactionIdIsInProgress(HeapTupleHeaderGetRaw

Re: [HACKERS] Sigh, we need an initdb

2014-06-04 Thread Robert Haas

On Wed, Jun 4, 2014 at 2:52 PM, Tom Lane  wrote:
> I just noticed that we had not one, but two commits in 9.4 that added
> fields to pg_control.  And neither one changed PG_CONTROL_VERSION.
> This is inexcusable sloppiness on the part of the committers involved,
> but the question is what do we do now?

I think it would be an awfully good idea to think about what we could
put into the buildfarm, the git repository, or the source tree to get
some automatic notification when somebody screws up this way (or the
xlog header magic, or catversion).  The first of those two screw-ups
(by me) was 11 months ago today; it's pretty scary that we're only
just now noticing.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Fix xpath() to return namespace definitions

2014-06-04 Thread Robert Haas

On Fri, May 30, 2014 at 5:04 AM, Ali Akbar  wrote:
> While developing some XML processing queries, i stumbled on an old bug
> mentioned in http://wiki.postgresql.org/wiki/Todo#XML: Fix Nested or
> repeated xpath() that apparently mess up namespaces.
>
> Source of the bug is that libxml2's xmlNodeDump won't output XML namespace
> definitions that declared in the node's parents. As per
> https://bug639036.bugzilla-attachments.gnome.org/attachment.cgi?id=177858,
> the behavior is intentional.
>
> This patch uses function xmlCopyNode that copies a node, including its
> namespace definitions as required (without unused namespace in the node or
> its children). When the copy dumped, the resulting XML is complete with its
> namespaces. Calling xmlCopyNode will need additional memory to execute, but
> reimplementing its routine to handle namespace definition will introduce
> much complexity to the code.
>
> Note: This is my very first postgresql patch.

Please add your patch here so we don't forget about it:

https://commitfest.postgresql.org/action/commitfest_view/open

Please see also:

https://wiki.postgresql.org/wiki/Submitting_a_Patch

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Sigh, we need an initdb

2014-06-04 Thread Magnus Hagander

On Jun 4, 2014 8:52 PM, "Tom Lane"  wrote:
>
> I just noticed that we had not one, but two commits in 9.4 that added
> fields to pg_control.  And neither one changed PG_CONTROL_VERSION.
> This is inexcusable sloppiness on the part of the committers involved,
> but the question is what do we do now?
>
> Quick experimentation says that you don't really get to the point of
> noticing this if you try to start a 9.4 postmaster against 9.3 database or
> vice versa, because the postmaster will look at the PG_VERSION file first.
> However, pg_resetxlog and pg_controldata don't do that, so you could get
> misleading results from the wrong pg_controldata version or potentially
> screw yourself entirely with the wrong pg_resetxlog, if you failed to
> interpret their warnings about wrong CRC values correctly.
>
> I think we could possibly ship 9.4 without fixing this, but it would be
> imprudent.  Anyone think differently?

Shipping it without fixing that seems like a horrible idea. We should
either force an initdb or revert those changes.

I'd vote for forcing initdb. If we push it off we'll be paying for it for
the coming 5 years.

> Of course, if we do fix this then the door opens for pushing other
> initdb-forcing fixes into 9.4beta2, such as the LOBLKSIZE addition
> that I was looking at when I noticed this, or the pg_lsn catalog
> additions that were being discussed a couple weeks ago.

+1. We should of course evaluate those patches individually  but the initdb
argument against them isn't valid, so if they are otherwise good, we should
accept them.

/Magnus

Re: [HACKERS] Sigh, we need an initdb

2014-06-04 Thread Stefan Kaltenbrunner

On 06/04/2014 08:56 PM, Joshua D. Drake wrote:
> 
> On 06/04/2014 11:52 AM, Tom Lane wrote:
> 
>> I think we could possibly ship 9.4 without fixing this, but it would be
>> imprudent.  Anyone think differently?
>>
>> Of course, if we do fix this then the door opens for pushing other
>> initdb-forcing fixes into 9.4beta2, such as the LOBLKSIZE addition
>> that I was looking at when I noticed this, or the pg_lsn catalog
>> additions that were being discussed a couple weeks ago.
> 
> It certainly seems that if we are going to initdb anyway, let's do it
> with approved features that got kicked (assuming) only because they
> would cause an initdb.

agreed there - I dont think the "no initdb rule during BETA" really buys
us that much these days. If people test our betas at all they do on
scratch boxes in development/staging, i really doubt that (especially
given the .0 history we had in the last years) people really move -BETA
installs to production or expect to do so.


Stefan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Sigh, we need an initdb

2014-06-04 Thread Joshua D. Drake



On 06/04/2014 11:52 AM, Tom Lane wrote:


I think we could possibly ship 9.4 without fixing this, but it would be
imprudent.  Anyone think differently?

Of course, if we do fix this then the door opens for pushing other
initdb-forcing fixes into 9.4beta2, such as the LOBLKSIZE addition
that I was looking at when I noticed this, or the pg_lsn catalog
additions that were being discussed a couple weeks ago.


It certainly seems that if we are going to initdb anyway, let's do it 
with approved features that got kicked (assuming) only because they 
would cause an initdb.


JD




regards, tom lane





--
Command Prompt, Inc. - http://www.commandprompt.com/  509-416-6579
PostgreSQL Support, Training, Professional Services and Development
High Availability, Oracle Conversion, Postgres-XC, @cmdpromptinc
Political Correctness is for cowards.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] Sigh, we need an initdb

2014-06-04 Thread Tom Lane

I just noticed that we had not one, but two commits in 9.4 that added
fields to pg_control.  And neither one changed PG_CONTROL_VERSION.
This is inexcusable sloppiness on the part of the committers involved,
but the question is what do we do now?

Quick experimentation says that you don't really get to the point of
noticing this if you try to start a 9.4 postmaster against 9.3 database or
vice versa, because the postmaster will look at the PG_VERSION file first.
However, pg_resetxlog and pg_controldata don't do that, so you could get
misleading results from the wrong pg_controldata version or potentially
screw yourself entirely with the wrong pg_resetxlog, if you failed to
interpret their warnings about wrong CRC values correctly.

I think we could possibly ship 9.4 without fixing this, but it would be
imprudent.  Anyone think differently?

Of course, if we do fix this then the door opens for pushing other
initdb-forcing fixes into 9.4beta2, such as the LOBLKSIZE addition
that I was looking at when I noticed this, or the pg_lsn catalog
additions that were being discussed a couple weeks ago.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Proposing pg_hibernate

2014-06-04 Thread Andres Freund

On 2014-06-04 14:50:39 -0400, Robert Haas wrote:
> The thing I was concerned about is that the system might have been in
> recovery for months.  What was hot at the time the base backup was
> taken seems like a poor guide to what will be hot at the time of
> promotion. Consider a history table, for example: the pages at the
> end, which have just been written, are much more likely to be useful
> than anything earlier.

I'd assumed that the hibernation files would simply be excluded from the
basebackup...

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Proposing pg_hibernate

2014-06-04 Thread Robert Haas

On Wed, Jun 4, 2014 at 9:56 AM, Andres Freund  wrote:
> On 2014-06-04 09:51:36 -0400, Robert Haas wrote:
>> On Wed, Jun 4, 2014 at 2:08 AM, Andres Freund  wrote:
>> > On 2014-06-04 10:24:13 +0530, Amit Kapila wrote:
>> >> Incase of recovery, the shared buffers saved by this utility are
>> >> from previous shutdown which doesn't seem to be of more use
>> >> than buffers loaded by recovery.
>> >
>> > Why? The server might have been queried if it's a hot standby one?
>>
>> I think that's essentially the same point Amit is making.  Gurjeet is
>> arguing for reloading the buffers from the previous shutdown at end of
>> recovery; IIUC, Amit, you, and I all think this isn't a good idea.
>
> I think I am actually arguing for Gurjeet's position. If the server is
> actively being queried (i.e. hot_standby=on and actually used for
> queries) it's quite reasonable to expect that shared_buffers has lots of
> content that is *not* determined by WAL replay.
>
> There's not that much read IO going on during WAL replay anyway - after
> a crash/start from a restartpoint most of it is loaded via full page
> anyway. So it's only disadvantageous to fault in pages via pg_hibernate
> if that causes pages that already have been read in via FPIs to be
> thrown out.

The thing I was concerned about is that the system might have been in
recovery for months.  What was hot at the time the base backup was
taken seems like a poor guide to what will be hot at the time of
promotion. Consider a history table, for example: the pages at the
end, which have just been written, are much more likely to be useful
than anything earlier.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_control is missing a field for LOBLKSIZE

2014-06-04 Thread Stephen Frost

* Tom Lane (t...@sss.pgh.pa.us) wrote:
> There are at least two places in inv_api.c where we have
> "Assert(pagelen <= LOBLKSIZE)" that is protecting a subsequent memcpy
> into a local variable of size LOBLKSIZE, so that the only thing standing
> between us and a stack-smash security issue that's trivially exploitable
> in production builds is that on-disk data conforms to our expectation
> about LOBLKSIZE.  I think it's definitely worth promoting these checks
> to regular runtime-if-test-and-elog.

Agreed.  Promoting that to a run-time check seems well worth it to me.

Thanks,

Stephen


signature.asc
Description: Digital signature

Re: [HACKERS] pg_control is missing a field for LOBLKSIZE

2014-06-04 Thread Tom Lane

Robert Haas  writes:
> On Wed, Jun 4, 2014 at 1:21 PM, Tom Lane  wrote:
>> BTW, just comparing the handling of TOAST_MAX_CHUNK_SIZE and LOBLKSIZE,
>> I noticed that the tuptoaster.c functions are reasonably paranoid about
>> checking that toast chunks are the expected size, but the large object
>> functions are not: the latter have either no check at all, or just an
>> Assert that the size is not more than expected.  So we could provide at
>> least a partial guard against a wrong LOBLKSIZE configuration by making
>> all the large-object functions throw elog(ERROR) if the length of a LO
>> chunk is more than LOBLKSIZE.  Unfortunately, length *less* than LOBLKSIZE
>> is an expected case, so this would only help in one direction.  Still,
>> it'd be an easy and back-patchable change that would provide at least some
>> defense, so I'm thinking of doing it.

> This seems like a pretty weak argument for adding run-time overhead.
> Maybe it can be justified on the grounds that it would catch corrupted
> TOAST data, but I've never heard of anyone changing LOBLKSIZE and see
> no reason to get agitated about it.

One if-test per fetched tuple hardly seems likely to add measurable
overhead.  As for "never heard of", see today's thread in pgsql-general:
http://www.postgresql.org/message-id/flat/CAGou9Mg=9qpytdh18ndo3ltjtwqn8urdtwabfkcymrut6d_...@mail.gmail.com
There was a similar gripe a few months ago:
http://www.postgresql.org/message-id/cacg6vwxy_84shccxzncsz9xlfwnx5szvqru6ancrr0c3xw1...@mail.gmail.com

There are at least two places in inv_api.c where we have
"Assert(pagelen <= LOBLKSIZE)" that is protecting a subsequent memcpy
into a local variable of size LOBLKSIZE, so that the only thing standing
between us and a stack-smash security issue that's trivially exploitable
in production builds is that on-disk data conforms to our expectation
about LOBLKSIZE.  I think it's definitely worth promoting these checks
to regular runtime-if-test-and-elog.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_control is missing a field for LOBLKSIZE

2014-06-04 Thread Robert Haas

On Wed, Jun 4, 2014 at 1:21 PM, Tom Lane  wrote:
> Stephen Frost  writes:
>> * Tom Lane (t...@sss.pgh.pa.us) wrote:
>>> I've not heard one either, but there was just somebody asking in
>>> pgsql-general about changing LOBLKSIZE, so he's going to be at risk.
>>> That's not a big enough sample size to make me panic about getting a
>>> hasty fix into 9.4, but I do think we should fix this going forward.
>
>> Agreed.
>
> BTW, just comparing the handling of TOAST_MAX_CHUNK_SIZE and LOBLKSIZE,
> I noticed that the tuptoaster.c functions are reasonably paranoid about
> checking that toast chunks are the expected size, but the large object
> functions are not: the latter have either no check at all, or just an
> Assert that the size is not more than expected.  So we could provide at
> least a partial guard against a wrong LOBLKSIZE configuration by making
> all the large-object functions throw elog(ERROR) if the length of a LO
> chunk is more than LOBLKSIZE.  Unfortunately, length *less* than LOBLKSIZE
> is an expected case, so this would only help in one direction.  Still,
> it'd be an easy and back-patchable change that would provide at least some
> defense, so I'm thinking of doing it.

This seems like a pretty weak argument for adding run-time overhead.
Maybe it can be justified on the grounds that it would catch corrupted
TOAST data, but I've never heard of anyone changing LOBLKSIZE and see
no reason to get agitated about it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_control is missing a field for LOBLKSIZE

2014-06-04 Thread Tom Lane

Stephen Frost  writes:
> * Tom Lane (t...@sss.pgh.pa.us) wrote:
>> I've not heard one either, but there was just somebody asking in
>> pgsql-general about changing LOBLKSIZE, so he's going to be at risk.
>> That's not a big enough sample size to make me panic about getting a
>> hasty fix into 9.4, but I do think we should fix this going forward.

> Agreed.

BTW, just comparing the handling of TOAST_MAX_CHUNK_SIZE and LOBLKSIZE,
I noticed that the tuptoaster.c functions are reasonably paranoid about
checking that toast chunks are the expected size, but the large object
functions are not: the latter have either no check at all, or just an
Assert that the size is not more than expected.  So we could provide at
least a partial guard against a wrong LOBLKSIZE configuration by making
all the large-object functions throw elog(ERROR) if the length of a LO
chunk is more than LOBLKSIZE.  Unfortunately, length *less* than LOBLKSIZE
is an expected case, so this would only help in one direction.  Still,
it'd be an easy and back-patchable change that would provide at least some
defense, so I'm thinking of doing it.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] backup_label revisited

2014-06-04 Thread Greg Stark

On Mon, Jun 2, 2014 at 2:10 PM, Fujii Masao  wrote:
> Could you let me know the link to the page explaining this?
>
>> That would even protect against another
>> restore on the same host.
>
> What about the case where we restore the backup to another server and
> start the recovery? In this case, ISTM inode can be the same. No?

Hm, I was about to comment that that seems very unlikely. Even on the
same server if you delete the old database root and then unpack a
backup it could possibly reuse the exact same inode again. But it's
really not likely to happen.

However in the brave new world of filesystem snapshots there is the
possibility someone has "restored" a backup by opening one of their
snapshots read-write. In which case the backup-label would have the
same inode number. That would still be fine if the snapshot is
consistent but if they have tablespaces and those tablespaces were
snapshotted separately then it wouldn't be ok.

I think that's a fatal flaw unless anyone can see a way to distinguish
a filesystem from a snapshot of the filesystem.

-- 
greg

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] psql: show only failed queries

2014-06-04 Thread Pavel Stehule

2014-06-04 18:16 GMT+02:00 Peter Eisentraut :

> On 6/4/14, 11:54 AM, Pavel Stehule wrote:
> > updated patch - only one change: query is prefixed by "QUERY:  "
>
> In the backend server log, this is called "STATEMENT:  ".
>

good idea

updated patch

Pavel
commit f752566830032438739b7e5ab1f55149c737e6d8
Author: Pavel Stehule 
Date:   Wed Jun 4 17:44:54 2014 +0200

initial

diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index ee6ec3a..cf0e78b 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -2812,7 +2812,8 @@ bar
 queries,
 psql merely prints all queries as
 they are sent to the server. The switch for this is
--e.
+-e. If set to error then only
+failed queries are displayed.
 
 
   
diff --git a/src/bin/psql/common.c b/src/bin/psql/common.c
index 60169a2..fed3ee5 100644
--- a/src/bin/psql/common.c
+++ b/src/bin/psql/common.c
@@ -995,6 +995,12 @@ SendQuery(const char *query)
 		results = NULL;			/* PQclear(NULL) does nothing */
 	}
 
+	if (!OK && pset.echo == PSQL_ECHO_ERROR)
+	{
+		printf("STATEMENT:  %s\n", query);
+		fflush(stdout);
+	}
+
 	/* If we made a temporary savepoint, possibly release/rollback */
 	if (on_error_rollback_savepoint)
 	{
diff --git a/src/bin/psql/settings.h b/src/bin/psql/settings.h
index 0a60e68..e4a18f0 100644
--- a/src/bin/psql/settings.h
+++ b/src/bin/psql/settings.h
@@ -31,6 +31,7 @@ typedef enum
 {
 	PSQL_ECHO_NONE,
 	PSQL_ECHO_QUERIES,
+	PSQL_ECHO_ERROR,
 	PSQL_ECHO_ALL
 } PSQL_ECHO;
 
diff --git a/src/bin/psql/startup.c b/src/bin/psql/startup.c
index 45653a1..b59bd59 100644
--- a/src/bin/psql/startup.c
+++ b/src/bin/psql/startup.c
@@ -720,6 +720,8 @@ echo_hook(const char *newval)
 		pset.echo = PSQL_ECHO_NONE;
 	else if (strcmp(newval, "queries") == 0)
 		pset.echo = PSQL_ECHO_QUERIES;
+	else if (strcmp(newval, "error") == 0)
+		pset.echo = PSQL_ECHO_ERROR;
 	else if (strcmp(newval, "all") == 0)
 		pset.echo = PSQL_ECHO_ALL;
 	else
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 3bb727f..e5e1558 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -3479,6 +3479,23 @@ psql_completion(const char *text, int start, int end)
 	{
 		matches = complete_from_variables(text, "", "");
 	}
+	else if (strcmp(prev2_wd, "\\set") == 0)
+	{
+		if (strcmp(prev_wd, "ECHO") == 0)
+		{
+			static const char *const my_list[] =
+			{"none", "error", "queries", "all", NULL};
+
+			COMPLETE_WITH_LIST_CS(my_list);
+		}
+		else if (strcmp(prev_wd, "ECHO_HIDDEN") == 0)
+		{
+			static const char *const my_list[] =
+			{"noexec", "off", "on", NULL};
+
+			COMPLETE_WITH_LIST_CS(my_list);
+		}
+	}
 	else if (strcmp(prev_wd, "\\sf") == 0 || strcmp(prev_wd, "\\sf+") == 0)
 		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_functions, NULL);
 	else if (strcmp(prev_wd, "\\cd") == 0 ||

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] psql: show only failed queries

2014-06-04 Thread Peter Eisentraut

On 6/4/14, 11:54 AM, Pavel Stehule wrote:
> updated patch - only one change: query is prefixed by "QUERY:  "

In the backend server log, this is called "STATEMENT:  ".



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [GENERAL] Migrating from 9.2.4 to 9.3.0 with XML DOCTYPE

2014-06-04 Thread Tim Kane

> 
> 
> From:  Tom Lane 
> 
> Hm, can you restore it into 9.2 either?
> 
> AFAICS, pg_dump has absolutely no idea that it should be worried about the
> value of xmloption, despite the fact that that setting affects what is
> considered valid XML data.  What's worse, even if it were attempting to do
> something about xmloption, I don't see how it could deal with a case like
> this where you have different values inside the same database that require
> two different settings in order to parse.
> 
> This isn't a 9.3.x bug, it's an aboriginal misdesign of the XML datatype.
> Not sure what we can do about it at this point.  Perhaps we could invent
> a "document_or_content" setting that would tell xml_in to accept either
> case?  And then have pg_dump force that setting to be used during restore?


This sounds reasonable. My use case is purely as a document store, with the
ability to perform xml parse functions against it – as such, I’m not
concerned wether it’s a document or content – hence why we have both types
recorded against that field.

For the minute, I’m getting around the restore problem by mangling the dump
such that the table is created using the text type rather than xml.  This at
least gets the data onto a 9.3 cluster, even if it’s cosmetically
represented as text instead of xml.  I can worry about the document vs
content problem at a later stage.

> 
> PS: BTW, I agree with the advice expressed by David J: under no
> circumstances put any data you care about on 9.3.0.  That release
> was rather a disaster from a quality-control standpoint :-(
> But that's unrelated to your XML issue.


Ack. Thanks for the info. I’ll push the upgrade-path agenda a little harder.

Re: [HACKERS] psql: show only failed queries

2014-06-04 Thread Pavel Stehule

Hello

updated patch - only one change: query is prefixed by "QUERY:  "

current state:

[pavel@localhost ~]$ src/postgresql/src/bin/psql/psql postgres -q -f
data.sql
psql:data.sql:6: ERROR:  value too long for type character varying(3)

Show only errors mode:

[pavel@localhost ~]$ src/postgresql/src/bin/psql/psql postgres -q -v
ECHO=error -f data.sql
psql:data.sql:6: ERROR:  value too long for type character varying(3)
QUERY:  INSERT INTO bubu VALUES('Ahoj');

Now, when I am thinking about these results, I am thinking, so second
variant is more practical and can be default.

Opinions, notes?

Regards

Pavel




2014-03-04 8:52 GMT+01:00 Pavel Stehule :

>
>
>
> 2014-03-04 6:35 GMT+01:00 Fabrízio de Royes Mello  >:
>
>
>>
>>
>> On Sat, Mar 1, 2014 at 8:01 AM, Pavel Stehule 
>> wrote:
>> >
>> > Hello
>> >
>> > I was asked, how can be showed only failed queries in psql.
>> >
>> > I am thinking, so it is not possible now. But implementation is very
>> simple
>> >
>> > What do you think about it?
>> >
>> > bash-4.1$ psql postgres -v ECHO=error -f data.sql
>> > INSERT 0 1
>> > Time: 27.735 ms
>> > INSERT 0 1
>> > Time: 8.303 ms
>> > psql:data.sql:3: ERROR:  value too long for type character varying(2)
>> > insert into foo values('bbb');
>> > Time: 0.178 ms
>> > INSERT 0 1
>> > Time: 8.285 ms
>> > psql:data.sql:5: ERROR:  value too long for type character varying(2)
>> > insert into foo values('ss');
>> > Time: 0.422 ms
>> >
>>
>> The patch works fine, but I think we must add some prefix to printed
>> query. Like that:
>>
>> fabrizio=# \set ECHO error
>> fabrizio=# insert into foo values ('XXX');
>>
>> ERROR:  value too long for type character varying(2)
>> DETAIL:  insert into foo values ('XXX');
>>
>> or
>>
>> fabrizio=# \set ECHO error
>> fabrizio=# insert into foo values ('XXX');
>>
>> ERROR:  value too long for type character varying(2)
>> QUERY:  insert into foo values ('XXX');
>>
>> This may help to filter the output with some tool like 'grep'.
>>
>
> sure, good idea.
>
> I add link to your notice to commitfest app
>
> Regards
>
> Pavel
>
>
>>
>> Regards,
>>
>> --
>> Fabrízio de Royes Mello
>> Consultoria/Coaching PostgreSQL
>> >> Timbira: http://www.timbira.com.br
>> >> Blog sobre TI: http://fabriziomello.blogspot.com
>> >> Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
>> >> Twitter: http://twitter.com/fabriziomello
>>
>
>
commit 47b8944a5d3afb6763096bdd993e256ecae6691d
Author: Pavel Stehule 
Date:   Wed Jun 4 17:44:54 2014 +0200

initial

diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index ee6ec3a..cf0e78b 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -2812,7 +2812,8 @@ bar
 queries,
 psql merely prints all queries as
 they are sent to the server. The switch for this is
--e.
+-e. If set to error then only
+failed queries are displayed.
 
 
   
diff --git a/src/bin/psql/common.c b/src/bin/psql/common.c
index 60169a2..5c94c03 100644
--- a/src/bin/psql/common.c
+++ b/src/bin/psql/common.c
@@ -995,6 +995,12 @@ SendQuery(const char *query)
 		results = NULL;			/* PQclear(NULL) does nothing */
 	}
 
+	if (!OK && pset.echo == PSQL_ECHO_ERROR)
+	{
+		printf("QUERY:  %s\n", query);
+		fflush(stdout);
+	}
+
 	/* If we made a temporary savepoint, possibly release/rollback */
 	if (on_error_rollback_savepoint)
 	{
diff --git a/src/bin/psql/settings.h b/src/bin/psql/settings.h
index 0a60e68..e4a18f0 100644
--- a/src/bin/psql/settings.h
+++ b/src/bin/psql/settings.h
@@ -31,6 +31,7 @@ typedef enum
 {
 	PSQL_ECHO_NONE,
 	PSQL_ECHO_QUERIES,
+	PSQL_ECHO_ERROR,
 	PSQL_ECHO_ALL
 } PSQL_ECHO;
 
diff --git a/src/bin/psql/startup.c b/src/bin/psql/startup.c
index 45653a1..b59bd59 100644
--- a/src/bin/psql/startup.c
+++ b/src/bin/psql/startup.c
@@ -720,6 +720,8 @@ echo_hook(const char *newval)
 		pset.echo = PSQL_ECHO_NONE;
 	else if (strcmp(newval, "queries") == 0)
 		pset.echo = PSQL_ECHO_QUERIES;
+	else if (strcmp(newval, "error") == 0)
+		pset.echo = PSQL_ECHO_ERROR;
 	else if (strcmp(newval, "all") == 0)
 		pset.echo = PSQL_ECHO_ALL;
 	else
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 3bb727f..e5e1558 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -3479,6 +3479,23 @@ psql_completion(const char *text, int start, int end)
 	{
 		matches = complete_from_variables(text, "", "");
 	}
+	else if (strcmp(prev2_wd, "\\set") == 0)
+	{
+		if (strcmp(prev_wd, "ECHO") == 0)
+		{
+			static const char *const my_list[] =
+			{"none", "error", "queries", "all", NULL};
+
+			COMPLETE_WITH_LIST_CS(my_list);
+		}
+		else if (strcmp(prev_wd, "ECHO_HIDDEN") == 0)
+		{
+			static const char *const my_list[] =
+			{"noexec", "off", "on", NULL};
+
+			COMPLETE_WITH_LIST_CS(my_list);
+		}
+	}
 	else if (strcmp(prev_wd, "\\sf") == 0 || strcmp(prev_wd, "\\sf+") == 0)
 		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_functions, NULL);

Re: [HACKERS] [BUGS] BUG #9652: inet types don't support min/max

2014-06-04 Thread Keith Fiske

On Tue, Jun 3, 2014 at 8:37 PM, Haribabu Kommi 
wrote:

> On Wed, Jun 4, 2014 at 5:46 AM, Keith Fiske  wrote:
> >
> > Andres's changes on June 3rd to
> >
> https://github.com/postgres/postgres/commits/master/src/test/regress/expected/create_function_3.out
> > are causing patch v2 to fail for that regression test file.
> >
> > postgres $ patch -p1 -i ../inet_agg_v2.patch
> > patching file src/backend/utils/adt/network.c
> > patching file src/include/catalog/pg_aggregate.h
> > patching file src/include/catalog/pg_proc.h
> > patching file src/include/utils/builtins.h
> > patching file src/test/regress/expected/create_function_3.out
> > Hunk #1 FAILED at 153.
> > 1 out of 1 hunk FAILED -- saving rejects to file
> > src/test/regress/expected/create_function_3.out.rej
> > patching file src/test/regress/expected/inet.out
> > patching file src/test/regress/sql/inet.sql
> >
> > Otherwise it applies without any issues to the latest HEAD. I built and
> > started a new instance, and I was able to test at least the basic min/max
> > functionality
> >
> > keith=# create table test_inet (id serial, ipaddress inet);
> > CREATE TABLE
> > Time: 25.546 ms
> > keith=# insert into test_inet (ipaddress) values ('192.168.1.1');
> > INSERT 0 1
> > Time: 3.143 ms
> > keith=# insert into test_inet (ipaddress) values ('192.168.1.2');
> > INSERT 0 1
> > Time: 2.932 ms
> > keith=# insert into test_inet (ipaddress) values ('127.0.0.1');
> > INSERT 0 1
> > Time: 1.786 ms
> > keith=# select min(ipaddress) from test_inet;
> > min
> > ---
> >  127.0.0.1
> > (1 row)
> >
> > Time: 3.371 ms
> > keith=# select max(ipaddress) from test_inet;
> >  max
> > -
> >  192.168.1.2
> > (1 row)
> >
> > Time: 1.104 ms
>
> Thanks for the test. Patch is re-based to the latest head.
>
> Regards,
> Hari Babu
> Fujitsu Australia
>


Applying patch against latest HEAD
(654e8e444749f053c3bf3fd543d10deb6aa6dd09) with no issues

$ patch -p1 -i ../inet_agg_v3.patch
patching file src/backend/utils/adt/network.c
patching file src/include/catalog/pg_aggregate.h
patching file src/include/catalog/pg_proc.h
patching file src/include/utils/builtins.h
patching file src/test/regress/expected/create_function_3.out
patching file src/test/regress/expected/inet.out
patching file src/test/regress/sql/inet.sql

Ran same min/max test as before and worked without issues.

--
Keith Fiske
Database Administrator
OmniTI Computer Consulting, Inc.
http://www.keithf4.com

Re: [HACKERS] Could not open file pg_multixact/offsets/ ERROR on 9.3.4

2014-06-04 Thread Alvaro Herrera

Andres Freund wrote:

> On 2014-06-04 16:59:10 +0200, Alexey Klyukin wrote:
> > I've recently discovered a peculiar problem on one of our big databases
> > (more than 1TB). The database has been upgraded from 9.2 to 9.3.4 (using
> > hardlinks to speedup the process)  on April 7th around 22:00 local time.
> > When doing  vacuum on any table, the system fails with the following error:
> > 
> > ERROR:  could not access status of transaction 2035987419
> > DETAIL:  Could not open file "pg_multixact/offsets/795A": No such file or
> > directory.

> Looks like you're hitting the issue described in
> http://archives.postgresql.org/message-id/20140530121631.GE25431%40alap3.anarazel.de

Will look into this.  Thanks.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Proposing pg_hibernate

2014-06-04 Thread Amit Kapila

On Wed, Jun 4, 2014 at 7:26 PM, Andres Freund 
wrote:
> On 2014-06-04 09:51:36 -0400, Robert Haas wrote:
> > On Wed, Jun 4, 2014 at 2:08 AM, Andres Freund 
wrote:
> > > On 2014-06-04 10:24:13 +0530, Amit Kapila wrote:
> > >> Incase of recovery, the shared buffers saved by this utility are
> > >> from previous shutdown which doesn't seem to be of more use
> > >> than buffers loaded by recovery.
> > >
> > > Why? The server might have been queried if it's a hot standby one?

Consider the case, crash (force kill or some other way) occurs when
BGSaver is saving the buffers, now I think it is possible that it has
saved partial information (information about some buffers is correct
and others is missing) and it is also possible by that time checkpoint
record is not written (which means recovery will start from previous
restart point).  So whats going to happen is that pg_hibernate might
load some less used buffers/blocks (which have lower usage count)
and WAL replayed blocks will be sacrificed.  So the WAL data from
previous restart point and some more due to delay in start of
standby (changes occured in master during that time) will be
sacrificed.

Another case is of standalone server in which case there is always
high chance that blocks recovered by recovery are the active one's.
Now I agree that case of standalone servers is less, but still some
small applications might be using it.  Also I think same is true if
the crashed server is master.

> > I think that's essentially the same point Amit is making.  Gurjeet is
> > arguing for reloading the buffers from the previous shutdown at end of
> > recovery; IIUC, Amit, you, and I all think this isn't a good idea.
>
> I think I am actually arguing for Gurjeet's position. If the server is
> actively being queried (i.e. hot_standby=on and actually used for
> queries) it's quite reasonable to expect that shared_buffers has lots of
> content that is *not* determined by WAL replay.

Yes, that's quite possible, however there can be situations where it
is not true as explained above.

> There's not that much read IO going on during WAL replay anyway - after
> a crash/start from a restartpoint most of it is loaded via full page
> anyway.

> So it's only disadvantageous to fault in pages via pg_hibernate
> if that causes pages that already have been read in via FPIs to be
> thrown out.

So for such cases, pages loaded by pg_hibernate turn out to be loss.

Overall I think there can be both kind of cases when it is beneficial
to load buffers after recovery and before recovery, thats why I
mentioned above that either it can be a parameter from user to
decide the same or may be we can have a new API which will
load buffers by BGworker without evicting any existing buffer
(use buffers from free list only).

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: [HACKERS] pg_basebackup failed to back up large file

2014-06-04 Thread Tom Lane

Magnus Hagander  writes:
> On Tue, Jun 3, 2014 at 6:38 PM, Tom Lane  wrote:
>> Another thought is we could make pg_basebackup simply skip any files that
>> exceed RELSEG_SIZE, on the principle that you don't really need/want
>> enormous log files to get copied anyhow.  We'd still need the pax
>> extension if the user had configured large RELSEG_SIZE, but having a
>> compatible tar could be documented as a requirement of doing that.

> I think going all the way to pax is the proper long-term thing to do, at
> least if we can confirm it works in the main tar implementations.

> For backpatchable that seems more reasonable. It doesn't work today, and we
> just need to document that it doesn't, with larger RELSEG_SIZE. And then
> fix it properly for the future.

Agreed, this would be a reasonable quick fix that we could replace in
9.5 or later.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Could not open file pg_multixact/offsets/ ERROR on 9.3.4

2014-06-04 Thread Andres Freund

Hi,

On 2014-06-04 16:59:10 +0200, Alexey Klyukin wrote:
> I've recently discovered a peculiar problem on one of our big databases
> (more than 1TB). The database has been upgraded from 9.2 to 9.3.4 (using
> hardlinks to speedup the process)  on April 7th around 22:00 local time.
> When doing  vacuum on any table, the system fails with the following error:
> 
> ERROR:  could not access status of transaction 2035987419
> DETAIL:  Could not open file "pg_multixact/offsets/795A": No such file or
> directory.
> 
> The erorr doesn't depend on the table being vacuumed, or even database, i.e:
> 
> postgres=# create database x;
> CREATE DATABASE
> postgres=# \c x
> You are now connected to database "x" as user "postgres".
> x=# create table test();
> CREATE TABLE
> x=# vacuum test;
> ERROR:  could not access status of transaction 2035987419
> DETAIL:  Could not open file "pg_multixact/offsets/795A": No such file or
> directory.
> 
> The content of pg_multixact/offsets is:
>
> pg_multixact$ ls -lR
> ./members:
> -rw--- 1 postgres postgres 8192 Apr 16 18:20 
> ./offsets:
> -rw--- 1 postgres postgres   8192 Apr  7 22:51 
> -rw--- 1 postgres postgres 262144 Apr 16 18:20 79A6
> 
> the select mutlixact from pg_database gives me:

> and the 2035987419 = 0x795AB3DB belongs to 795A segment.
> The  file just contains a single page of all zeroes. Neither the 9.3.4
> replica of this database, nor the original 9.2 cluster data directory
> contain this file.

Looks like you're hitting the issue described in
http://archives.postgresql.org/message-id/20140530121631.GE25431%40alap3.anarazel.de

> I'm tempted to just remove the  file from master and restart the
> database, since it's effectively impossible to run vacuum now, but I'd like
> to understand what's happening first. Below is the result of pg_filedump
> for the master:

Yes, that's fine in this specific case.

Note that the  segment isn't yused by anything between the oldest
and newest multixact:

> Latest checkpoint's NextMultiXactId:  2040987419
> Latest checkpoint's NextMultiOffset:  3
> Latest checkpoint's oldestXID:1038291920
> Latest checkpoint's oldestXID's DB:   16415
> Latest checkpoint's oldestActiveXID:  1655189767
> Latest checkpoint's oldestMultiXid:   2040987417
> Latest checkpoint's oldestMulti's DB: 0

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_basebackup failed to back up large file

2014-06-04 Thread Magnus Hagander

On Tue, Jun 3, 2014 at 6:38 PM, Tom Lane  wrote:

> Magnus Hagander  writes:
> > Yeah, that is a clear advantage of that method. Didn't read up on pax
> > format backwards compatibility, does it have some trick to achieve
> > something similar?
>
> I didn't read the fine print but it sounded like the extended header
> would look like a separate file entry to a non-aware tar implementation,
> which would write it out as a file and then get totally confused when
> the length specified in the overlength file's entry didn't match the
> amount of data following.  So it's a nice solution for some properties
> but doesn't fail-soft for file length.  Not clear that there's any way
> to achieve that though.
>

Well, there is no way to make it fail completely soft on the client side I
think. But at least we should make sure that our implementation doesn't go
do something really bad if you were to upgrade your server but not the
client. And make sure that we test with at least GNU and BSD tar to see
they don't break things badly either.

Another thought is we could make pg_basebackup simply skip any files that
> exceed RELSEG_SIZE, on the principle that you don't really need/want
> enormous log files to get copied anyhow.  We'd still need the pax
> extension if the user had configured large RELSEG_SIZE, but having a
> compatible tar could be documented as a requirement of doing that.
>

I think going all the way to pax is the proper long-term thing to do, at
least if we can confirm it works in the main tar implementations.

For backpatchable that seems more reasonable. It doesn't work today, and we
just need to document that it doesn't, with larger RELSEG_SIZE. And then
fix it properly for the future.

Not sure we want to go change the file format even for 9.4 at this time, it
seems we're quite overdue for that - so I think that's a 9.5 feature.

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

Re: [HACKERS] pg_control is missing a field for LOBLKSIZE

2014-06-04 Thread Tom Lane

Andres Freund  writes:
> Btw, I had wondered before if we shouldn't also add sizeof(long) to
> pg_control to catch cases where a database is copied between a LLP64
> (64bit windows) and an LP64 (nearly every other 64bit system) system. I
> have my doubts that we're completely clean about the size
> difference. Not to speak of extension datatypes.

I don't believe that this is necessary.  It's certainly true that some
in-memory structures will be laid out differently, but not on-disk.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [GENERAL] Migrating from 9.2.4 to 9.3.0 with XML DOCTYPE

2014-06-04 Thread Tom Lane

Tim Kane  writes:
> Iâm migrating a database from 9.2.4 to 9.3.0 and encountering an issue with
> an XML field failing to restore.

Hm, can you restore it into 9.2 either?

AFAICS, pg_dump has absolutely no idea that it should be worried about the
value of xmloption, despite the fact that that setting affects what is
considered valid XML data.  What's worse, even if it were attempting to do
something about xmloption, I don't see how it could deal with a case like
this where you have different values inside the same database that require
two different settings in order to parse.

This isn't a 9.3.x bug, it's an aboriginal misdesign of the XML datatype.
Not sure what we can do about it at this point.  Perhaps we could invent
a "document_or_content" setting that would tell xml_in to accept either
case?  And then have pg_dump force that setting to be used during restore?

regards, tom lane

PS: BTW, I agree with the advice expressed by David J: under no
circumstances put any data you care about on 9.3.0.  That release
was rather a disaster from a quality-control standpoint :-(
But that's unrelated to your XML issue.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] Could not open file pg_multixact/offsets/ ERROR on 9.3.4

2014-06-04 Thread Alexey Klyukin

Greetings,

I've recently discovered a peculiar problem on one of our big databases
(more than 1TB). The database has been upgraded from 9.2 to 9.3.4 (using
hardlinks to speedup the process)  on April 7th around 22:00 local time.
When doing  vacuum on any table, the system fails with the following error:

ERROR:  could not access status of transaction 2035987419
DETAIL:  Could not open file "pg_multixact/offsets/795A": No such file or
directory.

The erorr doesn't depend on the table being vacuumed, or even database, i.e:

postgres=# create database x;
CREATE DATABASE
postgres=# \c x
You are now connected to database "x" as user "postgres".
x=# create table test();
CREATE TABLE
x=# vacuum test;
ERROR:  could not access status of transaction 2035987419
DETAIL:  Could not open file "pg_multixact/offsets/795A": No such file or
directory.

The content of pg_multixact/offsets is:

pg_multixact$ ls -lR
./members:
-rw--- 1 postgres postgres 8192 Apr 16 18:20 
./offsets:
-rw--- 1 postgres postgres   8192 Apr  7 22:51 
-rw--- 1 postgres postgres 262144 Apr 16 18:20 79A6

the select mutlixact from pg_database gives me:

postgres=# select oid, datminmxid from pg_database;
  oid  | datminmxid
---+
 12078 | 2040987419
 12073 | 2040987419
 16414 | 2035987419
 16415 | 2040987418
 1 | 2040987419
(5 rows)

and the 2035987419 = 0x795AB3DB belongs to 795A segment.
The  file just contains a single page of all zeroes. Neither the 9.3.4
replica of this database, nor the original 9.2 cluster data directory
contain this file.

I guess what happens is that at the end of vacuum PostgreSQL checks the
status of the earliest available multixact (perhaps in order to truncate
it), and fails because the file has been already removed.

I also have a replica of this database, (which, as I said, doesn't have the
 segment). When I shut down this replica, copied the data directory,
promoted the copied directory (effectively making a copy of the master) and
issued vacuum I observed no errors. However, when I manually created this
multixact/offset/ with dd (populating 8192 bytes with 0), exactly the
same problem re-appeared as on the master.

I'm tempted to just remove the  file from master and restart the
database, since it's effectively impossible to run vacuum now, but I'd like
to understand what's happening first. Below is the result of pg_filedump
for the master:

data$ pg_controldata .
pg_control version number:937
Catalog version number:   201306121
Database system identifier:   5999656352803688379
Database cluster state:   in production
pg_control last modified: Wed 04 Jun 2014 04:49:45 PM CEST
Latest checkpoint location:   25D0/D38A8DF0
Prior checkpoint location:25D0/BFA7B068
Latest checkpoint's REDO location:25D0/C27A54A0
Latest checkpoint's REDO WAL file:000125D000C2
Latest checkpoint's TimeLineID:   1
Latest checkpoint's PrevTimeLineID:   1
Latest checkpoint's full_page_writes: on
Latest checkpoint's NextXID:  0/1655213806
Latest checkpoint's NextOID:  16795470
Latest checkpoint's NextMultiXactId:  2040987419
Latest checkpoint's NextMultiOffset:  3
Latest checkpoint's oldestXID:1038291920
Latest checkpoint's oldestXID's DB:   16415
Latest checkpoint's oldestActiveXID:  1655189767
Latest checkpoint's oldestMultiXid:   2040987417
Latest checkpoint's oldestMulti's DB: 0
Time of latest checkpoint:Wed 04 Jun 2014 04:45:45 PM CEST
Fake LSN counter for unlogged rels:   0/1
Minimum recovery ending location: 0/0
Min recovery ending loc's timeline:   0
Backup start location:0/0
Backup end location:  0/0
End-of-backup record required:no
Current wal_level setting:hot_standby
Current max_connections setting:  500
Current max_prepared_xacts setting:   0
Current max_locks_per_xact setting:   64
Maximum data alignment:   8
Database block size:  8192
Blocks per segment of large relation: 131072
WAL block size:   8192
Bytes per WAL segment:16777216
Maximum length of identifiers:64
Maximum columns in an index:  32
Maximum size of a TOAST chunk:1996
Date/time type storage:   64-bit integers
Float4 argument passing:  by value
Float8 argument passing:  by value
Data page checksum version:   0

There were no wraparound-related messages in the logs of either old or new
cluster, nor we observed any other errors except this one (we originally
discovered it in the server logs, likely because the autovacuum was also
failing). Any hints on what's going on (and whether the data corruption is
a possibility)?


Cheers,
-- 
Alexey Klyukin

Re: [HACKERS] pg_control is missing a field for LOBLKSIZE

2014-06-04 Thread Andres Freund

On 2014-06-04 10:03:09 -0400, Tom Lane wrote:
> I just chanced to notice that if someone were to change the value for
> LOBLKSIZE and recompile, there'd be nothing to stop him from starting
> that postmaster against an existing database, even though it would
> completely misinterpret and mangle any data in pg_largeobject.
> 
> I think there ought to be a guard for that, for exactly the same reasons
> that we check TOAST_MAX_CHUNK_SIZE: correct interpretation of on-disk
> data requires that this value match the original database configuration.
> 
> Obviously it's too late to do anything about this in existing branches,
> but I propose to add a field to pg_control after we branch off 9.4.

Btw, I had wondered before if we shouldn't also add sizeof(long) to
pg_control to catch cases where a database is copied between a LLP64
(64bit windows) and an LP64 (nearly every other 64bit system) system. I
have my doubts that we're completely clean about the size
difference. Not to speak of extension datatypes.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_control is missing a field for LOBLKSIZE

2014-06-04 Thread Tom Lane

Stephen Frost  writes:
> * Andrew Dunstan (and...@dunslane.net) wrote:
>> On 06/04/2014 10:03 AM, Tom Lane wrote:
>>> I just chanced to notice that if someone were to change the value for
>>> LOBLKSIZE and recompile, there'd be nothing to stop him from starting
>>> that postmaster against an existing database, even though it would
>>> completely misinterpret and mangle any data in pg_largeobject.

> Then again, I've never heard of a field complaint regarding this, so
> pehraps it's not worth it.

I've not heard one either, but there was just somebody asking in
pgsql-general about changing LOBLKSIZE, so he's going to be at risk.
That's not a big enough sample size to make me panic about getting a
hasty fix into 9.4, but I do think we should fix this going forward.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_control is missing a field for LOBLKSIZE

2014-06-04 Thread Stephen Frost

* Tom Lane (t...@sss.pgh.pa.us) wrote:
> Stephen Frost  writes:
> > * Andrew Dunstan (and...@dunslane.net) wrote:
> >> On 06/04/2014 10:03 AM, Tom Lane wrote:
> >>> I just chanced to notice that if someone were to change the value for
> >>> LOBLKSIZE and recompile, there'd be nothing to stop him from starting
> >>> that postmaster against an existing database, even though it would
> >>> completely misinterpret and mangle any data in pg_largeobject.
> 
> > Then again, I've never heard of a field complaint regarding this, so
> > pehraps it's not worth it.
> 
> I've not heard one either, but there was just somebody asking in
> pgsql-general about changing LOBLKSIZE, so he's going to be at risk.
> That's not a big enough sample size to make me panic about getting a
> hasty fix into 9.4, but I do think we should fix this going forward.

Agreed.

Thanks,

Stephen


signature.asc
Description: Digital signature

Re: [HACKERS] pg_control is missing a field for LOBLKSIZE

2014-06-04 Thread Andrew Dunstan



On 06/04/2014 10:27 AM, Andres Freund wrote:

On 2014-06-04 10:25:07 -0400, Andrew Dunstan wrote:

If we did an initdb-requiring change for 9.4 could we piggy-back this onto
it?

Do you know of a problem requiring that?




No, just thinking ahead.

cheers

andrew



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_control is missing a field for LOBLKSIZE

2014-06-04 Thread Stephen Frost

* Andrew Dunstan (and...@dunslane.net) wrote:
> On 06/04/2014 10:03 AM, Tom Lane wrote:
> >I just chanced to notice that if someone were to change the value for
> >LOBLKSIZE and recompile, there'd be nothing to stop him from starting
> >that postmaster against an existing database, even though it would
> >completely misinterpret and mangle any data in pg_largeobject.
> >
> >I think there ought to be a guard for that, for exactly the same reasons
> >that we check TOAST_MAX_CHUNK_SIZE: correct interpretation of on-disk
> >data requires that this value match the original database configuration.
> >
> >Obviously it's too late to do anything about this in existing branches,
> >but I propose to add a field to pg_control after we branch off 9.4.
> >
> > 
> 
> If we did an initdb-requiring change for 9.4 could we piggy-back
> this onto it?

I was thinking more-or-less the same thing...

Then again, I've never heard of a field complaint regarding this, so
pehraps it's not worth it.

Thanks,

Stephen


signature.asc
Description: Digital signature

Re: [HACKERS] pg_control is missing a field for LOBLKSIZE

2014-06-04 Thread Andres Freund

On 2014-06-04 10:25:07 -0400, Andrew Dunstan wrote:
> If we did an initdb-requiring change for 9.4 could we piggy-back this onto
> it?

Do you know of a problem requiring that?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pass Form_pg_attribute to examine_attribute rather than Relation structure.

2014-06-04 Thread Tom Lane

amul sul  writes:
> For more granularity, I think passing Form_pg_attribute to 
> examine_attribute() function  rather than passing Relation will be more 
> relevant & makes it simple to understand.

I don't find that to be a good idea at all.  It makes examine_attribute
inconsistent with most other functions in analyze.c, and it limits our
ability to add logic inside that function that might want to look at
other properties of the relation.

Even without that argument, moving the responsibility for initializing
stats->tupattnum to the callers of examine_attribute is certainly a
net loss in readability and reliability.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_control is missing a field for LOBLKSIZE

2014-06-04 Thread Andrew Dunstan



On 06/04/2014 10:03 AM, Tom Lane wrote:

I just chanced to notice that if someone were to change the value for
LOBLKSIZE and recompile, there'd be nothing to stop him from starting
that postmaster against an existing database, even though it would
completely misinterpret and mangle any data in pg_largeobject.

I think there ought to be a guard for that, for exactly the same reasons
that we check TOAST_MAX_CHUNK_SIZE: correct interpretation of on-disk
data requires that this value match the original database configuration.

Obviously it's too late to do anything about this in existing branches,
but I propose to add a field to pg_control after we branch off 9.4.




If we did an initdb-requiring change for 9.4 could we piggy-back this 
onto it?



cheers

andrew


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Allowing join removals for more join types

2014-06-04 Thread Tom Lane

David Rowley  writes:
> On Wed, Jun 4, 2014 at 11:50 AM, Noah Misch  wrote:
>> When a snapshot can see modifications that queued referential integrity
>> triggers for some FK constraint, that constraint is not guaranteed to hold
>> within the snapshot until those triggers have fired.

> I remember reading about some concerns with that here:
> http://www.postgresql.org/message-id/51e2d505.5010...@2ndquadrant.com
> But I didn't quite understand the situation where the triggers are delayed.
> I just imagined that the triggers would have fired by the time the command
> had completed. If that's not the case, when do the triggers fire? on
> commit? Right now I've no idea how to check for this in order to disable
> the join removal.

I'm afraid that this point destroys your entire project :-( ... even
without deferred constraints, there's no good way to be sure that you're
not planning a query that will be run inside some function that can see
the results of partially-completed updates.  The equivalent problem for
unique indexes is tolerable because the default choice is immediate
uniqueness enforcement, but there's no similar behavior for FKs.

It's possible that we could apply the optimization only to queries that
have been issued directly by a client, but that seems rather ugly and
surprise-filled.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Need to backpatch 2985e16 to 9.3 and further (HS regression test out)

2014-06-04 Thread Fujii Masao

On Wed, Jun 4, 2014 at 3:26 PM, Amit Langote  wrote:
> Hi,
>
> Following (commit 2985e16) has not been backpatched, I guess.
>
>  ANALYZE hs1;
> -ERROR:  cannot execute VACUUM during recovery
> +ERROR:  cannot execute ANALYZE during recovery
>
> Attached is a patch for this.

Why did you cut off the following part? ISTM that also needs to be back-patched.
So we should just do "git cherry-pick 2985e16" on 9.3.

-begin transaction isolation level serializable;
+begin transaction isolation level repeatable read;

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] pg_control is missing a field for LOBLKSIZE

2014-06-04 Thread Tom Lane

I just chanced to notice that if someone were to change the value for
LOBLKSIZE and recompile, there'd be nothing to stop him from starting
that postmaster against an existing database, even though it would
completely misinterpret and mangle any data in pg_largeobject.

I think there ought to be a guard for that, for exactly the same reasons
that we check TOAST_MAX_CHUNK_SIZE: correct interpretation of on-disk
data requires that this value match the original database configuration.

Obviously it's too late to do anything about this in existing branches,
but I propose to add a field to pg_control after we branch off 9.4.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Proposing pg_hibernate

2014-06-04 Thread Andres Freund

On 2014-06-04 09:51:36 -0400, Robert Haas wrote:
> On Wed, Jun 4, 2014 at 2:08 AM, Andres Freund  wrote:
> > On 2014-06-04 10:24:13 +0530, Amit Kapila wrote:
> >> Incase of recovery, the shared buffers saved by this utility are
> >> from previous shutdown which doesn't seem to be of more use
> >> than buffers loaded by recovery.
> >
> > Why? The server might have been queried if it's a hot standby one?
> 
> I think that's essentially the same point Amit is making.  Gurjeet is
> arguing for reloading the buffers from the previous shutdown at end of
> recovery; IIUC, Amit, you, and I all think this isn't a good idea.

I think I am actually arguing for Gurjeet's position. If the server is
actively being queried (i.e. hot_standby=on and actually used for
queries) it's quite reasonable to expect that shared_buffers has lots of
content that is *not* determined by WAL replay.

There's not that much read IO going on during WAL replay anyway - after
a crash/start from a restartpoint most of it is loaded via full page
anyway. So it's only disadvantageous to fault in pages via pg_hibernate
if that causes pages that already have been read in via FPIs to be
thrown out.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Proposing pg_hibernate

2014-06-04 Thread Robert Haas

On Wed, Jun 4, 2014 at 2:08 AM, Andres Freund  wrote:
> On 2014-06-04 10:24:13 +0530, Amit Kapila wrote:
>> On Tue, Jun 3, 2014 at 5:43 PM, Gurjeet Singh  wrote:
>> > On Tue, Jun 3, 2014 at 7:57 AM, Robert Haas  wrote:
>> > > It seems like it would be best to try to do this at cluster startup
>> > > time, rather than once recovery has reached consistency.  Of course,
>> > > that might mean doing it with a single process, which could have its
>> > > own share of problems.  But I'm somewhat inclined to think that if
>> > > recovery has already run for a significant period of time, the blocks
>> > > that recovery has brought into shared_buffers are more likely to be
>> > > useful than whatever pg_hibernate would load.
>> >
>> > I am not absolutely sure of the order of execution between recovery
>> > process and the BGWorker, but ...
>> >
>> > For sizeable shared_buffers size, the restoration of the shared
>> > buffers can take several seconds.
>>
>> Incase of recovery, the shared buffers saved by this utility are
>> from previous shutdown which doesn't seem to be of more use
>> than buffers loaded by recovery.
>
> Why? The server might have been queried if it's a hot standby one?

I think that's essentially the same point Amit is making.  Gurjeet is
arguing for reloading the buffers from the previous shutdown at end of
recovery; IIUC, Amit, you, and I all think this isn't a good idea.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] idle_in_transaction_timeout

2014-06-04 Thread Robert Haas

On Tue, Jun 3, 2014 at 11:37 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> I thought the reason why this hasn't been implemented before now is
>> that sending an ErrorResponse to the client will result in a loss of
>> protocol sync.
>
> Hmm ... you are right that this isn't as simple as an ereport(ERROR),
> but I'm not sure it's impossible.  We could for instance put the backend
> into skip-till-Sync state so that it effectively ignored the next command
> message.  Causing that to happen might be impracticably messy, though.

Another thing we could maybe do is AbortCurrentTransaction() and send
the client a NoticeResponse saying "hey, expect all of your future
commands to fail with complaints about the transaction being aborted".

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] pg_xlogdump --stats

2014-06-04 Thread Abhijit Menon-Sen

Hi.

Here's a patch to make pg_xlogdump print summary statistics instead of
individual records.

By default, for each rmgr it shows the number of records, the size of
rmgr-specific records, the size of full-page images, and the combined
size. With --stats=record it shows these figures for each rmgr/xl_info
combination (omitting those with zero counts for readability).

Here's an example of the output after a few pgbench runs with the
default checkpoint settings. I raised wal_keep_segments, resulting
in 3.5GB of WAL data in pg_xlog.

As you can see in the "Total" line, 96.83% of this is full-page images.
As Andres observed, this is a good demonstration of why one should not
use the default checkpoint_segments in production.

$ ../bin/pg_xlogdump --stats=record 00010001 
000100DE
Type   N  (%)  Record size  
(%) FPI size  (%)Combined size  (%)
   -  ---  ---  
---   ----  ---
XLOG/CHECKPOINT_SHUTDOWN  16 (  0.00) 1152 
(  0.00)0 (  0.00) 1152 (  0.00)
XLOG/CHECKPOINT_ONLINE80 (  0.00) 5760 
(  0.01)0 (  0.00) 5760 (  0.00)
XLOG/NEXTOID  12 (  0.00)   48 
(  0.00)0 (  0.00)   48 (  0.00)
Transaction/COMMIT71 (  0.00) 4708 
(  0.00)0 (  0.00) 4708 (  0.00)
Transaction/COMMIT_COMPACT413956 ( 14.82)  4967472 
(  4.35)0 (  0.00)  4967472 (  0.14)
Storage/CREATE   231 (  0.01) 3696 
(  0.00)0 (  0.00) 3696 (  0.00)
Storage/TRUNCATE   1 (  0.00)   16 
(  0.00)0 (  0.00)   16 (  0.00)
CLOG/ZEROPAGE 13 (  0.00)   52 
(  0.00)0 (  0.00)   52 (  0.00)
Database/CREATE3 (  0.00)   48 
(  0.00)0 (  0.00)   48 (  0.00)
RelMap/UPDATE 14 (  0.00) 7336 
(  0.01)0 (  0.00) 7336 (  0.00)
Heap2/CLEAN   369312 ( 13.22) 10769122 
(  9.43)   2698910088 ( 77.33)   2709679210 ( 75.17)
Heap2/FREEZE_PAGE 53 (  0.00) 3276 
(  0.00)   327732 (  0.01)   331008 (  0.01)
Heap2/VISIBLE  58160 (  2.08)  1163200 
(  1.02)   599768 (  0.02)  1762968 (  0.05)
Heap2/MULTI_INSERT 1 (  0.00)   59 
(  0.00)0 (  0.00)   59 (  0.00)
Heap2/MULTI_INSERT+INIT7 (  0.00)38874 
(  0.03)0 (  0.00)38874 (  0.00)
Heap/INSERT   425472 ( 15.23) 22392664 
( 19.61)  6081712 (  0.17) 28474376 (  0.79)
Heap/DELETE 1638 (  0.06)42588 
(  0.04)19800 (  0.00)62388 (  0.00)
Heap/UPDATE53912 (  1.93)  7145531 
(  6.26)390264760 ( 11.18)397410291 ( 11.03)
Heap/HOT_UPDATE  1185607 ( 42.43) 59538947 
( 52.13) 48724168 (  1.40)108263115 (  3.00)
Heap/LOCK 199320 (  7.13)  4983000 
(  4.36)  1656812 (  0.05)  6639812 (  0.18)
Heap/INPLACE 469 (  0.02)66676 
(  0.06)   558604 (  0.02)   625280 (  0.02)
Heap/INSERT+INIT2992 (  0.11)   272895 
(  0.24)0 (  0.00)   272895 (  0.01)
Heap/UPDATE+INIT1184 (  0.04)   146420 
(  0.13)  6611352 (  0.19)  6757772 (  0.19)
Btree/INSERT_LEAF  81058 (  2.90)  2224916 
(  1.95)336444372 (  9.64)338669288 (  9.40)
Btree/INSERT_UPPER   128 (  0.00) 5272 
(  0.00)16368 (  0.00)21640 (  0.00)
Btree/SPLIT_L 48 (  0.00)   171384 
(  0.15)46712 (  0.00)   218096 (  0.01)
Btree/

[HACKERS] pass Form_pg_attribute to examine_attribute rather than Relation structure.

2014-06-04 Thread amul sul

Hi,

examine_attribute() function accept Relation type argument, and extract 
attribute from it.


For more granularity, I think passing Form_pg_attribute to examine_attribute() 
function  rather than passing Relation will be more relevant & makes it simple 
to understand.

Thinking to change examine_attribute as,

- examine_attribute(Relation onerel, int attnum, Node *index_expr)
+examine_attribute(Form_pg_attribute attr, Node *index_expr)
 

I think there wont any loss or gain except little optimization of function call 
stack space. 

Thoughts? comments?

Please have look attached patch.

Regards,

Amul Sul

0001-examine_attribute-function-s-arguments-changed.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Allowing join removals for more join types

2014-06-04 Thread David Rowley

On Wed, Jun 4, 2014 at 11:50 AM, Noah Misch  wrote:

> On Wed, May 28, 2014 at 08:39:32PM +1200, David Rowley wrote:
> > The attached patch allows join removals for both sub queries with left
> > joins and also semi joins where a foreign key can prove the existence of
> > the record.
>
> When a snapshot can see modifications that queued referential integrity
> triggers for some FK constraint, that constraint is not guaranteed to hold
> within the snapshot until those triggers have fired.  For example, a query
> running within a VOLATILE function f() in a statement like "UPDATE t SET c
> =
> f(c)" may read data that contradicts FK constraints involving table "t".
> Deferred UNIQUE constraints, which we also do not yet use for deductions in
> the planner, have the same problem; see commit 0f39d50.  This project will
> need a design accounting for that hazard.
>
>
I remember reading about some concerns with that here:
http://www.postgresql.org/message-id/51e2d505.5010...@2ndquadrant.com
But I didn't quite understand the situation where the triggers are delayed.
I just imagined that the triggers would have fired by the time the command
had completed. If that's not the case, when do the triggers fire? on
commit? Right now I've no idea how to check for this in order to disable
the join removal.

For the deferred unique constraints I'm protecting against that the same
way as the left join removal does...  It's in the
relation_has_foreign_key_for() function where I'm matching the foreign keys
up to the indexes on the other relation.

As a point of procedure, I recommend separating the semijoin support into
> its
> own patch.  Your patch is already not small; delaying non-essential parts
> will
> make the essential parts more accessible to reviewers.
>
>
That's a good idea. I think the left join additions would be realistic to
get in early in the 9.5 cycle, but the semi and anti joins stuff I know
that I'm going to need some more advice for. It makes sense to split them
out and get what I can in sooner rather than delaying it for no good reason.

Regards

David Rowley

80 matches

Mail list logo