Re: [HACKERS] Streaming replication for psycopg2

2015-10-04 Thread Craig Ringer
On 30 June 2015 at 22:42, Shulgin, Oleksandr
 wrote:
> On Thu, Jun 4, 2015 at 5:49 PM, Shulgin, Oleksandr
>  wrote:
>>
>> On Tue, Jun 2, 2015 at 2:23 PM, Shulgin, Oleksandr
>>  wrote:
>> >
>> > I've submitted a patch to psycopg2 to support streaming replication
>> > protocol (COPY_BOTH): https://github.com/psycopg/psycopg2/pull/322
>
> Hello again,
>
> I have updated the pull request above to address the feedback I've gathered
> from using this construct internally and from other sources.  Now included is
> a lower-level asynchronous interface that gives the user more control, at the
> price of doing select() and keeping track of keepalive timeouts manually.
>
> It would be nice if someone could review this.  The final look can be found
> by using this link (docs included):
> https://github.com/psycopg/psycopg2/pull/322/files

Per the pull request's history, I've given this a fairly detailed
review. With a few minor changes I think it's really good and will be
a valuable addition to psycopg2.

Before merging I think the connection should preferably be split into
a proper subclass, though it might not be worth the verbosity involved
when working with the CPython API.

I suspect that the row-by-row COPY support should be pushed down to be
shared with copy_expert, too. There's so much logic that'd be shared:
the push (callback) or pull oriented read modes, the support for
reading raw bytes or decoded unicode text for rows, the result object,
etc.

Otherwise I think this is great, and it was a real pleasure to read a
patch where the first half is good documentation and examples.

Since psycopg2 is libpq-based, it can't be treated as an
independent re-implementation of the protocol for testing purposes.
For that it'd probably be necessary to add replication protocol
support to PgJDBC. But it's still really handy for prototyping logical
decoding clients, testing output plugins, adapting logical decoding
output to feed into other systems, etc., and I can see this as
something that could be really handy for optional extra tests for
PostgreSQL itself - validating the replication protocol, etc.
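For readers who want to try that kind of prototyping, here is a minimal sketch of
a logical decoding client in Python. It is written against the replication support
as it eventually shipped in psycopg2.extras, so the names may differ from the pull
request under review here; the connection string, slot name and output plugin are
just placeholders.

# Minimal logical decoding client; assumes the psycopg2.extras replication
# API as released (which may differ from the PR under review here).
import psycopg2
import psycopg2.extras

conn = psycopg2.connect(
    "dbname=test",
    connection_factory=psycopg2.extras.LogicalReplicationConnection)
cur = conn.cursor()

# 'test_slot' and 'test_decoding' are placeholder slot/plugin names.
cur.create_replication_slot('test_slot', output_plugin='test_decoding')
cur.start_replication(slot_name='test_slot', decode=True)

def consume(msg):
    print(msg.payload)                                   # decoded change data
    msg.cursor.send_feedback(flush_lsn=msg.data_start)   # report progress

cur.consume_stream(consume)   # blocks; keepalives are handled for us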

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] [PATCH] postgres_fdw extension support

2015-10-04 Thread Michael Paquier
On Sun, Oct 4, 2015 at 11:40 AM, Paul Ramsey  wrote:
> I put all changes relative to your review here if you want a nice colorized
> place to check
>
> https://github.com/pramsey/postgres/commit/ed33e7489601e659f436d6afda3cce28304eba50

-/* updatable is available on both server and table */
+/* updatable option is available on both server and table */
This is just noise (perhaps I am the one who introduced it, oops).
-- 
Michael




Re: [HACKERS] ALTER TABLE behind-the-scenes effects' CONTEXT

2015-10-04 Thread Pavel Stehule
2015-10-05 0:08 GMT+02:00 Marko Tiikkaja :

> Hi,
>
> In the past I've found the error message in cases such as this somewhat
> less helpful than it could be:
>
> =# CREATE TABLE qqq (a int);
> =# CREATE UNIQUE INDEX IF NOT EXISTS qqq_a_idx ON qqq(a);
> =# ALTER TABLE qqq ALTER COLUMN a TYPE json USING NULL;
> ERROR:  data type json has no default operator class for access method
> "btree"
> HINT:  You must specify an operator class for the index or define a
> default operator class for the data type.
>
> The attached patch adds a CONTEXT line to index and constraint rebuilds,
> e.g.:
>
>   CONTEXT:  while rebuilding index qqq_a_idx
>
> Any feedback welcome.
>

I prefer using the DETAIL field for this case.

Regards

Pavel




Re: [HACKERS] Speed up Clog Access by increasing CLOG buffers

2015-10-04 Thread Amit Kapila
On Mon, Oct 5, 2015 at 6:34 AM, Jeff Janes  wrote:

> On Fri, Sep 11, 2015 at 8:01 PM, Amit Kapila 
> wrote:
>>
>>
>> If I am not wrong we need 1048576 number of transactions difference
>> for each record to make each CLOG access a disk access, so if we
>> increment XID counter by 100, then probably every 1th (or multiplier
>> of 1) transaction would go for disk access.
>>
>> The number 1048576 is derived by below calc:
>> #define CLOG_XACTS_PER_BYTE 4
>> #define CLOG_XACTS_PER_PAGE (BLCKSZ * CLOG_XACTS_PER_BYTE)
>>
>
>> Transaction difference required for each transaction to go for disk
>> access:
>> CLOG_XACTS_PER_PAGE * num_clog_buffers.
>>
>
>
> That guarantees that every xid occupies its own 32-contiguous-pages chunk
> of clog.
>
> But clog pages are not pulled in and out in 32-page chunks, but one page
> chunks.  So you would only need 32,768 differences to get every real
> transaction to live on its own clog page, which means every look up of a
> different real transaction would have to do a page replacement.
>


Agreed, but that doesn't affect the test result with the test done above.


>  (I think your references to disk access here are misleading.  Isn't the
> issue here the contention on the lock that controls the page replacement,
> not the actual IO?)
>
>
The point is that if no I/O is needed, then all the read accesses for
transaction status will just use Shared locks; however, if there is an I/O,
then it would need an Exclusive lock.


> I've attached a patch that allows you to set the GUC "JJ_xid", which makes it
> burn the given number of xids every time a new one is asked for.  (The
> patch introduces lots of other stuff as well, but I didn't feel like
> ripping the irrelevant parts out--if you don't set any of the other gucs it
> introduces from their defaults, they shouldn't cause you trouble.)  I think
> there are other tools around that do the same thing, but this is the one I
> know about.  It is easy to drive the system into wrap-around shutdown with
> this, so lowering autovacuum_vacuum_cost_delay is a good idea.
>
> Actually I haven't attached it, because then the commitfest app will list
> it as the patch needing review, instead I've put it here
> https://drive.google.com/file/d/0Bzqrh1SO9FcERV9EUThtT3pacmM/view?usp=sharing
>
>
Thanks, I think probably this could also be used for testing.


> I think reducing to every 100th access for transaction status as disk
>> access
>> is sufficient to prove that there is no regression with the patch for the
>> scenario
>> asked by Andres or do you think it is not?
>>
>> Now another possibility here could be that we try by commenting out fsync
>> in CLOG path to see how much it impact the performance of this test and
>> then for pgbench test.  I am not sure there will be any impact because
>> even if
>> every 100th transaction goes to disk access, that is still less than the
>> WAL fsync which we have to perform for each transaction.
>>
>
> You mentioned that your clog is not on ssd, but surely at this scale of
> hardware, the hdd the clog is on has a bbu in front of it, no?
>
>
Yes.


> But I thought Andres' concern was not about fsync, but about the fact that
> the SLRU does linear scans (repeatedly) of the buffers while holding the
> control lock?  At some point, scanning more and more buffers under the lock
> is going to cause more contention than scanning fewer buffers and just
> evicting a page will.
>
>

Yes, at some point that could matter, but I could not see the impact
at 64 or 128 CLOG buffers.


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] Confusing remark about UPSERT in fdwhandler.sgml

2015-10-04 Thread Etsuro Fujita

On 2015/10/03 5:57, Robert Haas wrote:

On Fri, Oct 2, 2015 at 4:04 AM, Peter Geoghegan  wrote:

On Fri, Oct 2, 2015 at 1:00 AM, Etsuro Fujita
 wrote:

ISTM that the sentence "as remote constraints are not locally known" is
somewhat confusing, because check constraints on remote tables can be
defined locally in 9.5.  How about "unique constraints or exclusion
constraints on remote tables are not locally known"?  Attached is a
patch for that.



Makes sense to me.



Me, too.  Committed.


Thanks!

Best regards,
Etsuro Fujita





Re: [HACKERS] Speed up Clog Access by increasing CLOG buffers

2015-10-04 Thread Jeff Janes
On Fri, Sep 11, 2015 at 8:01 PM, Amit Kapila 
wrote:

> On Fri, Sep 11, 2015 at 9:21 PM, Robert Haas 
> wrote:
> >
> > On Fri, Sep 11, 2015 at 10:31 AM, Amit Kapila 
> wrote:
> > > > Could you perhaps try to create a testcase where xids are accessed
> that
> > > > are so far apart on average that they're unlikely to be in memory?
> And
> > > > then test that across a number of client counts?
> > > >
> > >
> > > Now about the test, create a table with large number of rows (say
> 11617457,
> > > I have tried to create larger, but it was taking too much time (more
> than a day))
> > > and have each row with different transaction id.  Now each transaction
> should
> > > update rows that are at least 1048576 (number of transactions whose
> status can
> > > be held in 32 CLog buffers) distance apart, that way ideally for each
> update it will
> > > try to access Clog page that is not in-memory, however as the value to
> update
> > > is getting selected randomly and that leads to every 100th access as
> disk access.
> >
> > What about just running a regular pgbench test, but hacking the
> > XID-assignment code so that we increment the XID counter by 100 each
> > time instead of 1?
> >
>
> If I am not wrong we need 1048576 number of transactions difference
> for each record to make each CLOG access a disk access, so if we
> increment XID counter by 100, then probably every 1th (or multiplier
> of 1) transaction would go for disk access.
>
> The number 1048576 is derived by below calc:
> #define CLOG_XACTS_PER_BYTE 4
> #define CLOG_XACTS_PER_PAGE (BLCKSZ * CLOG_XACTS_PER_BYTE)
>

> Transaction difference required for each transaction to go for disk access:
> CLOG_XACTS_PER_PAGE * num_clog_buffers.
>


That guarantees that every xid occupies its own 32-contiguous-pages chunk
of clog.

But clog pages are not pulled in and out in 32-page chunks, but one page
chunks.  So you would only need 32,768 differences to get every real
transaction to live on its own clog page, which means every look up of a
different real transaction would have to do a page replacement.  (I think
your references to disk access here are misleading.  Isn't the issue here
the contention on the lock that controls the page replacement, not the
actual IO?)
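To make the arithmetic behind these two numbers explicit, here is a small sketch
(assuming the default BLCKSZ of 8192 and the 32 CLOG buffers under discussion):

# Derivation of the 32,768 and 1,048,576 figures above, assuming
# BLCKSZ = 8192 and 32 CLOG buffers (the defaults under discussion).
BLCKSZ = 8192
CLOG_XACTS_PER_BYTE = 4                        # two status bits per transaction
CLOG_XACTS_PER_PAGE = BLCKSZ * CLOG_XACTS_PER_BYTE
num_clog_buffers = 32

print(CLOG_XACTS_PER_PAGE)                     # 32768: xid spacing that puts each xid on its own clog page
print(CLOG_XACTS_PER_PAGE * num_clog_buffers)  # 1048576: spacing that puts each xid in its own 32-page window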

I've attached a patch that allows you to set the GUC "JJ_xid", which makes it
burn the given number of xids every time a new one is asked for.  (The
patch introduces lots of other stuff as well, but I didn't feel like
ripping the irrelevant parts out--if you don't set any of the other gucs it
introduces from their defaults, they shouldn't cause you trouble.)  I think
there are other tools around that do the same thing, but this is the one I
know about.  It is easy to drive the system into wrap-around shutdown with
this, so lowering autovacuum_vacuum_cost_delay is a good idea.

Actually I haven't attached it, because then the commitfest app will list
it as the patch needing review, instead I've put it here
https://drive.google.com/file/d/0Bzqrh1SO9FcERV9EUThtT3pacmM/view?usp=sharing

> I think reducing to every 100th access for transaction status as disk access
> is sufficient to prove that there is no regression with the patch for the
> scenario
> asked by Andres or do you think it is not?
>
> Now another possibility here could be that we try by commenting out fsync
> in CLOG path to see how much it impact the performance of this test and
> then for pgbench test.  I am not sure there will be any impact because even if
> every 100th transaction goes to disk access, that is still less than the
> WAL fsync which we have to perform for each transaction.
>

You mentioned that your clog is not on ssd, but surely at this scale of
hardware, the hdd the clog is on has a bbu in front of it, no?

But I thought Andres' concern was not about fsync, but about the fact that
the SLRU does linear scans (repeatedly) of the buffers while holding the
control lock?  At some point, scanning more and more buffers under the lock
is going to cause more contention than scanning fewer buffers and just
evicting a page will.

Cheers,

Jeff


Re: [HACKERS] Less than ideal error reporting in pg_stat_statements

2015-10-04 Thread Peter Geoghegan
On Wed, Sep 23, 2015 at 1:41 PM, Jim Nasby  wrote:
> max was set to 1. I don't know about average query text size, but the
> command that was causing the error was a very large number of individual
> INSERT ... VALUES statements all in one command.
>
> The machine had plenty of free memory and no ulimit, so I don't see how this
> could have been anything but MaxAllocSize, unless there's some other failure
> mode in malloc I don't know about.

The patch that Tom committed does __nothing__ to actually address
*how* you were able to get into this position in the first place.
Sure, now you might be able to recover if you're not affected by the
MaxAllocSize limitation, but: What did it take in the first place to
need memory > MaxAllocSize to do a garbage collection? Whatever it was
that allowed things to get that bad before a garbage collection was
attempted is not even considered in the patch committed. The patch
just committed is, in short, nothing more than a band-aid, and may do
more harm than good due to allocating huge amounts of memory while a
very contended exclusive LWLock is held.

You say you have plenty of memory and no ulimit, and I believe that
you're right that there was never a conventional OOM where malloc()
returns NULL. If we conservatively assume that you were only barely
over the MaxAllocSize limitation before a garbage collection was first
attempted, then the average query text length would have to be about
50 kilobytes, since only then 20,000 texts (pg_stat_statements.max *
2) would put us over.
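A quick back-of-the-envelope check of that figure (assuming the 20,000 texts and
the roughly 1GB MaxAllocSize limit discussed in this thread):

# Sanity check of the "about 50 kilobytes" estimate above.
MAX_ALLOC_SIZE = 1 << 30            # ~1GB palloc limit discussed in this thread
num_texts = 20000                   # pg_stat_statements.max * 2, as stated above
print(MAX_ALLOC_SIZE // num_texts)  # 53687 bytes, i.e. roughly 50 kB per query text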

Does anyone actually think that the *average* SQL query on Jim's
client's application was in excess of 50% the size of the ~3,000 line
file pg_stat_statements.c? It beggars belief.

I'm annoyed and disappointed that the patch committed does not even
begin to address the underlying problem -- it just adds an escape
hatch, and fixes another theoretical issue that no one was affected
by. Honestly, next time I won't bother.

-- 
Peter Geoghegan




Re: [HACKERS] CustomScan support on readfuncs.c

2015-10-04 Thread Kouhei Kaigai
> -Original Message-
> From: Robert Haas [mailto:robertmh...@gmail.com]
> Sent: Saturday, October 03, 2015 5:44 AM
> To: Kaigai Kouhei(海外 浩平)
> Cc: Amit Kapila; Andres Freund; pgsql-hackers
> Subject: ##freemail## Re: [HACKERS] CustomScan support on readfuncs.c
> 
> On Tue, Sep 29, 2015 at 6:19 PM, Kouhei Kaigai  wrote:
> >> On Mon, Sep 28, 2015 at 8:31 PM, Kouhei Kaigai  
> >> wrote:
> >> >> Instead of doing this:
> >> >>
> >> >> +/* Dump library and symbol name instead of raw pointer */
> >> >>  appendStringInfoString(str, " :methods ");
> >> >> -_outToken(str, node->methods->CustomName);
> >> >> +_outToken(str, node->methods->methods_library_name);
> >> >> +appendStringInfoChar(str, ' ');
> >> >> +_outToken(str, node->methods->methods_symbol_name);
> >> >>
> >> >> Suppose we just make library_name and symbol_name fields in the node
> >> >> itself, that are dumped and loaded like any others.
> >> >>
> >> >> Would that be better?
> >> >>
> >> > I have no preference here.
> >> >
> >> > Even if we dump library_name/symbol_name as if field in CustomScan,
> >> > not CustomScanMethods, in a similar way, we cannot use WRITE_STRING_FIELD
> >> > here, because its 'fldname' assumes these members are direct field of
> >> > CustomScan.
> >> >
> >> >   /* Write a character-string (possibly NULL) field */
> >> >   #define WRITE_STRING_FIELD(fldname) \
> >> >   (appendStringInfo(str, " :" CppAsString(fldname) " "), \
> >> >_outToken(str, node->fldname))
> >>
> >> Well that's exactly what I was suggesting: making them a direct field
> >> of CustomScan.
> >>
> > Let me confirm. Are you suggesting to have library_name/symbol_name
> > in CustomScan, not CustomScanMethods?
> > I prefer these fields are in CustomScanMethods because we don't need
> > to setup them again once PG_init set up these symbols. CustomScan has
> > to be created every time when it is chosen by planner.
> 
> True.  But that doesn't cost much.  I'm not sure it's better, so if
> you don't like it, don't worry about it.
>
Yep, I like having library_name and symbol_name in CustomScanMethods,
because the table of callbacks and its identifiers are not separable.

> >> > One other question I have. Do we have a portable way to lookup
> >> > a pair of library and symbol by address?
> >> > Glibc has dladdr() functions that returns these information,
> >> > however, manpage warned it is not defined by POSIX.
> >> > If we would be able to have any portable way, it may make the
> >> > interface simpler.
> >>
> >> Yes: load_external_function.
> >>
> It looks up an address by a pair of library and symbol name.
> What I'm looking for is a portable function that looks up a pair
> of library and symbol name by an address, like dladdr() in glibc.
> I don't know whether other *nix systems or Windows have this infrastructure.
> 
> No, that doesn't exist, and the chances of us trying add that
> infrastructure for this feature are nil.
>
OK; probably the only way is to expect the extension to put the correct
values into the pair of library and symbol names.

I will try to check whether the current patch is workable in this direction
using my extension. Please wait for a few days.

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei 




Re: [HACKERS] debbugs

2015-10-04 Thread Stephen Frost
Andrew,

* Andrew Dunstan (and...@dunslane.net) wrote:
> Could someone please point me at the documentation that says how to
> stand up and administer an instance of debbugs? If it doesn't exist
> I would say that is a very heavy mark against it. If it does, it's
> well enough hidden that a quick use of Google doesn't make it stand
> out, which doesn't fill me with confidence either.

https://bugs.debian.org/debbugs-source/

README.md

Thanks!

Stephen




Re: [HACKERS] No Issue Tracker - Say it Ain't So!]

2015-10-04 Thread Nathan Wagner
On Sun, Oct 04, 2015 at 04:30:49PM -0700, Josh Berkus wrote:
> That would be the key part, wouldn't it?  Nice that you have [code to
> store and parse email messages].

Yeah.  It actually made most of the work pretty easy.  It's available
with a bunch of other code at https://pd.if.org/git/ if anyone wants it.
I did find a bug in my header processing though, so I'll need to commit
that fix.

> We'd also want a way to link a bug fix to a commit, and probably a way
> to give the bug a list of searchable keywords (and add to that list).

I've been thinking of hooking it up to the fti machinery and providing
a search box.  I've never really used fti before, so this might be a
good opportunity to learn it for real.

> > it probably makes sense to just close up front bugs that are marked
> > against unsupported pg versions, or haven't had any activity for too
> > long, perhaps two years.
 
> I'm reluctant to close all of those unexamined, since part of the
> purpose of this is to find bugs which were never fixed.  Probably we
> should organize a posse to comb through all of the old bugs and
> hand-close them.

I'm doing some of that as I poke at the bugs pages.  Perhaps it would
make sense to have a closed status of 'stale' or the like (perhaps
that's what you meant by 'timed out') which could be used to get bugs
out of the main list but still be marked as 'not human reviewed'.  These
could then be reviewed by the posse.

> Yeah, fixing this [email's without a bug id] would probably be tied to
> the possible change to mailman.  Unless someone already has a way to
> get majordomo to append a bug ID.

How are bug ids assigned now?  From the evidence, I assume there is a
sequence in a database that the web submission form queries to format a
resulting email to the bugs mailing list.  Do the mailing lists live on
the same server?  If so, perhaps it would be easy to assign a new bug id
to a new thread on -bugs.  But perhaps that's too aggressive in creating
bugs.

> > 5: How can we use email to update the status of a bug?
> > 
> > I suggest using email headers to do this.  'X-PGBug-Fixed:
> > ' and the like.  I assume here that everyone who might
> > want to do such a thing uses an MUA that would allow this, and they
> > know how.
> 
> I guess that depends on who we expect to use this, at least for
> closing stuff.

I could certainly support more than one mechanism.  A web interface for
those who would prefer such and emails would seem to be the basic
requirements.

> > 6: Does there need to be any security on updating the status?
> > 
> > Probably not.  I don't think it's the sort of thing that would
> > attract malicious adjustments.  If I'm wrong, I'd need to rethink
> > this.  I realize I'm making security an afterthought, which makes my
> > teeth itch, but I think layers of security would make it much less
> > likely to be actually adopted.
> 
> I think there needs to be some kind of administrative access which
> allows, for example, an issue to be closed so that it can't be
> reopened.

I guess technically we have that now since I'm the only one who can
close or open a bug in the db I've set up :)

Seriously though, I think it probably makes the most sense to tie the
system in with the existing pg community accounts if it goes that far.

> Anyway, I'm not convinced we want to reinvent this particular wheel, but
> if we do, you've done a yeoman's job.

Thank you.  (Assuming I've interpreted the phrase correctly, I'm not
familiar with it, and a web search was only semi-enlightening).

-- 
nw




[HACKERS] debbugs

2015-10-04 Thread Andrew Dunstan
Could someone please point me at the documentation that says how to 
stand up and administer an instance of debbugs? If it doesn't exist I 
would say that is a very heavy mark against it. If it does, it's well 
enough hidden that a quick use of Google doesn't make it stand out, 
which doesn't fill me with confidence either.


cheers

andrew




Re: [HACKERS] No Issue Tracker - Say it Ain't So!]

2015-10-04 Thread Josh Berkus
On 10/04/2015 03:42 PM, Nathan Wagner wrote:
> I downloaded the archives for pgsql-bugs, and fed them into a database.  This
> part was easy, since I have already written a pg backed usenet server and had
> the code handy for storing and parsing out bits of rfc 2822 messages.

That would be the key part, wouldn't it?  Nice that you have that.

So this would be the other option if adopting debbugs doesn't work out.
 I think it's likely that we'll end up recreating most of debbugs in the
process (or bugzilla or something else) but whatever.  As long as we
have some kind of bug tracker, I'm happy.

> It's dirt simple.  If the system sees a message with 'Bug #(\d+)' in the
> subject line, it creates an entry in a bugs table with that bug number (if
> needed), and then marks the message as belonging to that bug.  If there seems
> to be metadata about the bug in the format of the (unquoted)
> 
> Bug reference:
> Logged by:  
> Email address:
> PostgreSQL version:
> Operating system:
> Description:
> Details:
> 
> it pulls that out and puts it in the bugs table.  There's also an "open"
> boolean in the table, defaulting to true.
> 
> The results can be found at https://granicus.if.org/pgbugs/
> 
> Ok.  So now we have a bug tracker, but...
> 
> Some open questions that I don't think have really been addressed, with my
> commentary interspersed:
> 
> 1: Can a bug be more than "open" or "closed"?
> 
> I think yes.  At least we probably want to know why a bug is closed.  Is it 
> not
> a bug at all, not our bug, a duplicate submission, a duplicate of another bug,
> something we won't fix for some reason (e.g. a bug against version 7)?

We'd want the usual statuses:

* fixed
* duplicate
* unreproduceable
* timed out
* not a bug
* won't fix
* reopened

We'd also want a way to link a bug fix to a commit, and probably a way
to give the bug a list of searchable keywords (and add to that list).

> 2: Who can declare a bug closed.
> 
> Ugh.  I'm going to close some of them if it seems obvious to me that they
> should be closed.  But what if it's not obvious?  I could probably maintain it
> to some extent, but I don't know how much time that would actually take.
> 
> Related to the next point, it probably makes sense to just close up front
> bugs that are marked against unsupported pg versions, or haven't had
> any activity for too long, perhaps two years.  Just closing bugs with no
> mailing list activity for two years closes 5280 of 6376 bugs.

I'm reluctant to close all of those unexamined, since part of the
purpose of this is to find bugs which were never fixed.  Probably we
should organize a posse to comb through all of the old bugs and
hand-close them.

> 3: How far back should I actually import data from the bugs list?
> 
> I have imported each archived month from December of 1998.  It looks like the
> bug sequence was started at 1000 in December of 2003.  Emails with no bug id 
> in
> the subject line don't get associated with any bug, they're in the DB but not
> really findable.
> 
> 4: What should I do with emails that don't reference a bug id but seem to be
> talking about a bug?
> 
> I suggest we do nothing with them as far as the bug tracker is concerned.  If
> people want to mark their message as pertaining to a bug, they can put that in
> the subject line.  However, I don't think a bug id can be assigned via email,
> that is, I think you have to use a web form to create a bug report with a bug
> id.  Presumably that could change if whoever runs the bug counter wants it to.

Yeah, fixing this would probably be tied to the possible change to
mailman.  Unless someone already has a way to get majordomo to append a
bug ID.

> 5: How can we use email to update the status of a bug?
> 
> I suggest using email headers to do this.  'X-PGBug-Fixed: ' and the
> like.  I assume here that everyone who might want to do such a thing uses an
> MUA that would allow this, and they know how.

I guess that depends on who we expect to use this, at least for closing
stuff.

> 6: Does there need to be any security on updating the status?
> 
> Probably not.  I don't think it's the sort of thing that would attract
> malicious adjustments.  If I'm wrong, I'd need to rethink this.  I realize I'm
> making security an afterthought, which makes my teeth itch, but I think layers
> of security would make it much less likely to be actually adopted.

I think there needs to be some kind of administrative access which
allows, for example, an issue to be closed so that it can't be reopened.

Anyway, I'm not convinced we want to reinvent this particular wheel, but
if we do, you've done a yeoman's job.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] Less than ideal error reporting in pg_stat_statements

2015-10-04 Thread Peter Geoghegan
On Sun, Oct 4, 2015 at 3:16 PM, Tom Lane  wrote:
> Peter Geoghegan  writes:
>> To be clear: I wasn't sure why you thought I falsely count entries with
>> dropped texts within entry_dealloc().
>
> In the existing^H^H^Hprevious code, dropped-text entries would essentially
> act as length-zero summands in the average calculation, whereas I think
> we agree that they ought to be ignored; otherwise they decrease the
> computed mean and thereby increase the probability of (useless) GC cycles.
> In the worst case where the hashtable is mostly dropped-text entries,
> which would for instance be the prevailing situation shortly after a GC
> failure, we'd be calculating ridiculously small mean values and that'd
> prompt extra GC cycles no?

Yes, but my patch changed that, too. I suggested that first.

-- 
Peter Geoghegan




Re: [HACKERS] Less than ideal error reporting in pg_stat_statements

2015-10-04 Thread Peter Geoghegan
On Sun, Oct 4, 2015 at 3:10 PM, Tom Lane  wrote:
>> That seems perfectly reasonable, yes. Should I leave that to you?
>
> After a closer look I decided that wasn't reasonable at all.  Discounting
> sticky texts would then mean that after a GC cycle, we might still think
> the query texts file is bloated and issue another GC request, which of
> course would not shrink the file, so that we'd be doing GC cycles every
> time we added a new query.

That seems unlikely to me. Firstly, there is a generic check of 512
bytes per entry within need_gc_qtexts() -- we never GC if that generic
limit is not exceeded, and that's far from tiny. Secondly, how long do
you think those sticky entries will stay around in the hashtable to
continue to cause trouble, and dominate over regular entries? There'd
have to be constant evictions/cache pressure for this situation to
occur, because the threshold within need_gc_qtexts() is based on
pg_stat_statements.max, and yet many evictions aggressively evict
sticky entries very quickly. Seems like a very unusual workload to me.
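For context, the garbage-collection trigger being discussed can be paraphrased
roughly like this (a Python sketch of the checks described in this thread, not
the actual C code in pg_stat_statements.c):

# Rough paraphrase of need_gc_qtexts() as described above; 'extent' is the
# current size of the query-texts file and pgss_max is pg_stat_statements.max.
def need_gc_qtexts(extent, mean_query_len, pgss_max):
    if extent < 512 * pgss_max:                 # generic 512-bytes-per-entry floor
        return False
    if extent < mean_query_len * pgss_max * 2:  # file not yet ~50% bloat
        return False
    return True                                 # file looks bloated: garbage collect

This is why an artificially low mean_query_len makes the second threshold easier
to cross and garbage collection correspondingly more frequent.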

I think that you're overestimating the cost of discounting sticky
entries, which are usually very much in the minority (at least when it
matters, with cache pressure), and underestimating the cost of
continuing to weigh sticky entries' contribution to mean query text.

As I said, my proposal to not have sticky entries contribute to
overall mean query text length is based on a problem report involving
a continually failing data integration process. That in particular is
what I hope to stop having a negative impact on mean query length -- a
unique queryid makes for an entry that is bound to remain useless
forever (as opposed to just failing the first few times). With the
larger limit of MaxAllocHugeSize, I worry about swapping with the
exclusive lock held.

The fact that there is a big disparity between mean query length for
sticky and non-sticky entries is weird. It was seen to happen in the
wild only because the sticky entries that clogged things up were not
really distinct to a human -- only to the fingerprinting/jumbling
code, more or less by accident, which in a practical sense caused a
distortion.  That's what I'm targeting. I have a hard time imagining
any harm from discounting sticky entries with a realistic case.

> The mean query len recorded by gc_qtexts()
> *has to* match the mean length of what it actually put in the file, not
> the mean length of what we might wish it would put in the file.

Why? Why not simply care about whether or not the file was
unreasonably large relative to available, useful query statistics? I
think that mean_query_len isn't something that swings all that much
with realistic workloads and 5,000 representative entries -- it
certainly should not swing wildly. The problem to an extent was that
that accidentally stopped being true.

> By the same token, I'm back to believing that it's fundamentally bogus for
> entry_dealloc() to compute mean_query_len that way.  The most likely
> result of that is useless GC cycles.  The only thing that will actually
> free memory when you've got a lot of dead sticky entries is getting rid of
> the sticky hashtable entries, and the only way to do that is to wait for
> entry_dealloc() to get rid of 'em.  You were unenthused about making that
> removal more aggressive, which is fine, but you can't have it both ways.

Meanwhile, mean_query_len grows as more "distinct" entries are
created, pushing out their garbage collection further and further.
Especially when there is a big flood of odd queries that stay sticky.
Once again, I'm having a really hard time imagining a minority of
current, non-sticky hashtable entries dominating a majority of
current, sticky hashtable entries come garbage collection time.
Garbage collection is linked to creation of entries, which is also
linked to eviction, which aggressively evicts sticky entries in this
thrashing scenario that you describe.

> It does strike me that when we do get rid of the sticky entries, cleanup
> of the texts file might lag a bit longer than it needs to because
> mean_query_len is computed before not after deleting whatever entries
> we're going to delete.  On average, that shouldn't matter ... but if we
> are tossing a bunch of dead sticky entries, maybe they would have a higher
> mean length than the rest?

Yes, that is something that was observed to happen in the problem case
I looked into. You know my solution already.

-- 
Peter Geoghegan




Re: [HACKERS] Less than ideal error reporting in pg_stat_statements

2015-10-04 Thread Tom Lane
Andrew Dunstan  writes:
> Sorry, I'm a bit late to this party. Does what you have committed mean 
> people are less likely to see "Out of Memory" coming from 
> pg_stat_statements? If not, what can be done about them short of a 
> restart? And what bad effects follow from an event generating them?

The main thing we've done that will alleviate that is increase the size of
the query text file that the garbage-collection routine can cope with from
MaxAllocSize (1GB) to MaxAllocHugeSize (at least 2GB, lots more on 64-bit
machines, though on 32-bit you probably can't get to 2GB anyway ...).

Also, what will now happen if you do get an out-of-memory is that the code
will discard stored query texts and truncate the file, so that the problem
doesn't recur (at least not till you build up a new set of stored query
texts).  At this point you still have statistics, but they can only be
identified by query ID since the text has been forgotten.  I'm not sure
how useful that situation really is ...

> The docs seem to be quite silent on these points.

The docs probably ought to describe this situation and recommend reducing
pg_stat_statements.max if you want to preserve query texts.

regards, tom lane




Re: [HACKERS] Less than ideal error reporting in pg_stat_statements

2015-10-04 Thread Andrew Dunstan



On 10/04/2015 06:16 PM, Tom Lane wrote:

Peter Geoghegan  writes:

To be clear: I wasn't sure why you thought I falsely count entries with
dropped texts within entry_dealloc().

In the existing^H^H^Hprevious code, dropped-text entries would essentially
act as length-zero summands in the average calculation, whereas I think
we agree that they ought to be ignored; otherwise they decrease the
computed mean and thereby increase the probability of (useless) GC cycles.
In the worst case where the hashtable is mostly dropped-text entries,
which would for instance be the prevailing situation shortly after a GC
failure, we'd be calculating ridiculously small mean values and that'd
prompt extra GC cycles no?





Sorry, I'm a bit late to this party. Does what you have committed mean 
people are less likely to see "Out of Memory" coming from 
pg_stat_statements? If not, what can be done about them short of a 
restart? And what bad effects follow from an event generating them?


The docs seem to be quite silent on these points.

cheers

andrew




Re: [HACKERS] No Issue Tracker - Say it Ain`t So!]

2015-10-04 Thread Greg Sabino Mullane



> Comments are welcome, and no, I don't really expect that this will be what 
> gets
> adopted, mainly I wanted to show that we can probably just build something
> rather effective off our existing infrastructure

+1, good job.

> The bugs have 3.5 messages each on average, with 2 being the most common
> number, and 113 at the most, for bug 12990.  1284 bugs have only one message
> associated with them.

For anyone who is dying to know, as I was, what the winning bug report was:

"Missing pg_multixact/members files (appears to have wrapped, then truncated)"

http://www.postgresql.org/message-id/flat/20150406192130.2573.22...@wrigleys.postgresql.org#20150406192130.2573.22...@wrigleys.postgresql.org
or:
http://goo.gl/4lKYOC


-- 
Greg Sabino Mullane g...@turnstep.com
End Point Corporation http://www.endpoint.com/
PGP Key: 0x14964AC8 201510041854
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8






Re: [HACKERS] No Issue Tracker - Say it Ain't So!]

2015-10-04 Thread Nathan Wagner
I don't have the original message for this thread, so I arbitrarily picked a
message to reply to.

Since what has been asked for is a bug *tracker*, and we already have a bugs
mailing list, I put together something.

I downloaded the archives for pgsql-bugs, and fed them into a database.  This
part was easy, since I have already written a pg backed usenet server and had
the code handy for storing and parsing out bits of rfc 2822 messages.

It's dirt simple.  If the system sees a message with 'Bug #(\d+)' in the
subject line, it creates an entry in a bugs table with that bug number (if
needed), and then marks the message as belonging to that bug.  If there seems
to be metadata about the bug in the format of the (unquoted)

Bug reference:
Logged by:  
Email address:
PostgreSQL version:
Operating system:
Description:
Details:

it pulls that out and puts it in the bugs table.  There's also an "open"
boolean in the table, defaulting to true.
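A minimal sketch of that matching step (the function and field handling here are
hypothetical illustrations of the rule just described, not Nathan's actual code):

# Hypothetical sketch of the subject/metadata matching described above.
import re

BUG_RE = re.compile(r'Bug #(\d+)')
META_FIELDS = ('Bug reference', 'Logged by', 'Email address',
               'PostgreSQL version', 'Operating system',
               'Description', 'Details')

def match_message(subject, body):
    """Return (bug_number, metadata) for a pgsql-bugs message, or (None, {})."""
    m = BUG_RE.search(subject)
    if m is None:
        return None, {}   # no bug id: the message is stored but not linked to a bug
    metadata = {}
    for line in body.splitlines():
        field, sep, value = line.partition(':')
        if sep and field in META_FIELDS:
            metadata[field] = value.strip()
    return int(m.group(1)), metadata

The caller would then create the row in the bugs table if needed (with "open"
defaulting to true) and mark the message as belonging to that bug.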

The results can be found at https://granicus.if.org/pgbugs/

Ok.  So now we have a bug tracker, but...

Some open questions that I don't think have really been addressed, with my
commentary interspersed:

1: Can a bug be more than "open" or "closed"?

I think yes.  At least we probably want to know why a bug is closed.  Is it not
a bug at all, not our bug, a duplicate submission, a duplicate of another bug,
something we won't fix for some reason (e.g. a bug against version 7)?

2: Who can declare a bug closed.

Ugh.  I'm going to close some of them if it seems obvious to me that they
should be closed.  But what if it's not obvious?  I could probably maintain it
to some extent, but I don't know how much time that would actually take.

Related to the next point, it probably makes sense to just close up front
bugs that are marked against unsupported pg versions, or haven't had
any activity for too long, perhaps two years.  Just closing bugs with no
mailing list activity for two years closes 5280 of 6376 bugs.

3: How far back should I actually import data from the bugs list?

I have imported each archived month from December of 1998.  It looks like the
bug sequence was started at 1000 in December of 2003.  Emails with no bug id in
the subject line don't get associated with any bug, they're in the DB but not
really findable.

4: What should I do with emails that don't reference a bug id but seem to be
talking about a bug?

I suggest we do nothing with them as far as the bug tracker is concerned.  If
people want to mark their message as pertaining to a bug, they can put that in
the subject line.  However, I don't think a bug id can be assigned via email,
that is, I think you have to use a web form to create a bug report with a bug
id.  Presumably that could change if whoever runs the bug counter wants it to.

5: How can we use email to update the status of a bug?

I suggest using email headers to do this.  'X-PGBug-Fixed: ' and the
like.  I assume here that everyone who might want to do such a thing uses an
MUA that would allow this, and they know how.

6: Does there need to be any security on updating the status?

Probably not.  I don't think it's the sort of thing that would attract
malicious adjustments.  If I'm wrong, I'd need to rethink this.  I realize I'm
making security an afterthought, which makes my teeth itch, but I think layers
of security would make it much less likely to be actually adopted.

Just to be clear, this is both a work in progress and a proof of concept.  It's
slow, it's ugly.  I haven't created any indexes, written any css or javascript,
or implemented any caching.  I may work on that before you read this though.

Comments are welcome, and no, I don't really expect that this will be what gets
adopted, mainly I wanted to show that we can probably just build something
rather effective off our existing infrastructure, and I had Saturday free (as
of this writing, I've got maybe 5 hours into it total, albeit with lots of
code re-use from other projects).  There are some obvious todo items,
filtering and searching being the most salient.

Some data import issues:

March 10, 2003, bad Date header, looked like junk anyway, so I deleted it.

Time zone offsets of --0400 and -0500 (at least), I took these as being -0400
(or whathaveyou).

Date header of Sat, 31 May 2008 12:12:18 +1930, judging by the name on the
email, this is probably posted from Venezuela, which switched to time zone -0430
in 2007, which could also be thought of as +1930 if you ignore the implied date
change.

And, by way of some statistics, since I've got the archives in a database:

Emails: 43870
Bugs: 6376
Distinct 'From' headers: 10643

The bugs have 3.5 messages each on average, with 2 being the most common
number, and 113 at the most, for bug 12990.  1284 bugs have only one message
associated with them.

-- 
nw




Re: [HACKERS] Less than ideal error reporting in pg_stat_statements

2015-10-04 Thread Tom Lane
Peter Geoghegan  writes:
> To be clear: I wasn't sure why you thought I falsely count entries with
> dropped texts within entry_dealloc().

In the existing^H^H^Hprevious code, dropped-text entries would essentially
act as length-zero summands in the average calculation, whereas I think
we agree that they ought to be ignored; otherwise they decrease the
computed mean and thereby increase the probability of (useless) GC cycles.
In the worst case where the hashtable is mostly dropped-text entries,
which would for instance be the prevailing situation shortly after a GC
failure, we'd be calculating ridiculously small mean values and that'd
prompt extra GC cycles no?

regards, tom lane




Re: [HACKERS] Less than ideal error reporting in pg_stat_statements

2015-10-04 Thread Tom Lane
Peter Geoghegan  writes:
> On Sun, Oct 4, 2015 at 1:01 PM, Tom Lane  wrote:
>> Hm.  The problem I've got with this is that then mean_query_len means
>> something significantly different after entry_dealloc than it does
>> after gc_texts.
>> 
>> I'd be okay with changing *both* of those functions to ignore sticky
>> entries in the calculation, if that seems reasonable to you.

> That seems perfectly reasonable, yes. Should I leave that to you?

After a closer look I decided that wasn't reasonable at all.  Discounting
sticky texts would then mean that after a GC cycle, we might still think
the query texts file is bloated and issue another GC request, which of
course would not shrink the file, so that we'd be doing GC cycles every
time we added a new query.  The mean query len recorded by gc_qtexts()
*has to* match the mean length of what it actually put in the file, not
the mean length of what we might wish it would put in the file.

By the same token, I'm back to believing that it's fundamentally bogus for
entry_dealloc() to compute mean_query_len that way.  The most likely
result of that is useless GC cycles.  The only thing that will actually
free memory when you've got a lot of dead sticky entries is getting rid of
the sticky hashtable entries, and the only way to do that is to wait for
entry_dealloc() to get rid of 'em.  You were unenthused about making that
removal more aggressive, which is fine, but you can't have it both ways.

It does strike me that when we do get rid of the sticky entries, cleanup
of the texts file might lag a bit longer than it needs to because
mean_query_len is computed before not after deleting whatever entries
we're going to delete.  On average, that shouldn't matter ... but if we
are tossing a bunch of dead sticky entries, maybe they would have a higher
mean length than the rest?  Not sure about it.  I put a comment about
this into entry_dealloc() for the moment, but we can revisit whether it
is worth adding code/cycles to get a more up-to-date mean length.

Anyway, I've committed the aspects of this that we're agreed on.

regards, tom lane




[HACKERS] ALTER TABLE behind-the-scenes effects' CONTEXT

2015-10-04 Thread Marko Tiikkaja

Hi,

In the past I've found the error message in cases such as this somewhat 
less helpful than it could be:


=# CREATE TABLE qqq (a int);
=# CREATE UNIQUE INDEX IF NOT EXISTS qqq_a_idx ON qqq(a);
=# ALTER TABLE qqq ALTER COLUMN a TYPE json USING NULL;
ERROR:  data type json has no default operator class for access method 
"btree"
HINT:  You must specify an operator class for the index or define a 
default operator class for the data type.


The attached patch adds a CONTEXT line to index and constraint rebuilds, 
e.g.:


  CONTEXT:  while rebuilding index qqq_a_idx

Any feedback welcome.


.m
*** a/src/backend/commands/tablecmds.c
--- b/src/backend/commands/tablecmds.c
***
*** 82,87 
--- 82,88 
  #include "storage/smgr.h"
  #include "utils/acl.h"
  #include "utils/builtins.h"
+ #include "utils/elog.h"
  #include "utils/fmgroids.h"
  #include "utils/inval.h"
  #include "utils/lsyscache.h"
***
*** 309,314  static void ATController(AlterTableStmt *parsetree,
--- 310,316 
 Relation rel, List *cmds, bool recurse, LOCKMODE 
lockmode);
  static void ATPrepCmd(List **wqueue, Relation rel, AlterTableCmd *cmd,
  bool recurse, bool recursing, LOCKMODE lockmode);
+ static void ATRewriteSubcommandErrorCallback(void *arg);
  static void ATRewriteCatalogs(List **wqueue, LOCKMODE lockmode);
  static void ATExecCmd(List **wqueue, AlteredTableInfo *tab, Relation rel,
  AlterTableCmd *cmd, LOCKMODE lockmode);
***
*** 3373,3378  ATPrepCmd(List **wqueue, Relation rel, AlterTableCmd *cmd,
--- 3375,3399 
tab->subcmds[pass] = lappend(tab->subcmds[pass], cmd);
  }
  
+ static void
+ ATRewriteSubcommandErrorCallback(void *arg)
+ {
+   AlterTableCmd *subcmd = (AlterTableCmd *) arg;
+   
+   switch (subcmd->subtype)
+   {
+   case AT_ReAddIndex:
+   errcontext("while rebuilding index %s", subcmd->name);
+   break;
+   case AT_ReAddConstraint:
+   errcontext("while rebuilding constraint %s", 
subcmd->name);
+   break;
+   default:
+   /* keep compiler quiet */
+   (void) 0;
+   }
+ }
+ 
  /*
   * ATRewriteCatalogs
   *
***
*** 3402,3407  ATRewriteCatalogs(List **wqueue, LOCKMODE lockmode)
--- 3423,3429 
List   *subcmds = tab->subcmds[pass];
Relationrel;
ListCell   *lcmd;
+   ErrorContextCallback errcallback;
  
if (subcmds == NIL)
continue;
***
*** 3411,3418  ATRewriteCatalogs(List **wqueue, LOCKMODE lockmode)
 */
rel = relation_open(tab->relid, NoLock);
  
foreach(lcmd, subcmds)
!   ATExecCmd(wqueue, tab, rel, (AlterTableCmd *) 
lfirst(lcmd), lockmode);
  
/*
 * After the ALTER TYPE pass, do cleanup work (this is 
not done in
--- 3433,3451 
 */
rel = relation_open(tab->relid, NoLock);
  
+   errcallback.callback = ATRewriteSubcommandErrorCallback;
+   errcallback.previous = error_context_stack;
+   error_context_stack = &errcallback;
+ 
foreach(lcmd, subcmds)
!   {
!   AlterTableCmd *subcmd = (AlterTableCmd *) 
lfirst(lcmd);
! 
!   errcallback.arg = subcmd;
!   ATExecCmd(wqueue, tab, rel, subcmd, lockmode);
!   }
! 
!   error_context_stack = errcallback.previous;
  
/*
 * After the ALTER TYPE pass, do cleanup work (this is 
not done in
***
*** 8682,8687  ATPostAlterTypeParse(Oid oldId, Oid oldRelId, Oid refRelId, 
char *cmd,
--- 8715,8721 
  
newcmd = makeNode(AlterTableCmd);
newcmd->subtype = AT_ReAddIndex;
+   newcmd->name = stmt->idxname;
newcmd->def = (Node *) stmt;
tab->subcmds[AT_PASS_OLD_INDEX] =
lappend(tab->subcmds[AT_PASS_OLD_INDEX], 
newcmd);
***
*** 8712,8717  ATPostAlterTypeParse(Oid oldId, Oid oldRelId, Oid refRelId, 
char *cmd,
--- 8746,8752 

 RelationRelationId, 0);
  
cmd->subtype = AT_ReAddIndex;
+   cmd->name = indstmt->idxname;
tab->subcmds[AT_PASS_OLD_INDEX

Re: [HACKERS] Less than ideal error reporting in pg_stat_statements

2015-10-04 Thread Peter Geoghegan
On Sun, Oct 4, 2015 at 1:12 PM, Peter Geoghegan  wrote:
> On Sun, Oct 4, 2015 at 1:01 PM, Tom Lane  wrote:
>> Ah, right, sorry.  I meant to make its result match what gc_texts would
>> get, by not falsely counting entries with dropped texts.  That's not
>> what you have in your patch but it seems like an easy enough fix.
>
> I'm trying to make mean_query_len representative of *useful* entry
> query length. I guess I don't have that within gc_texts in my patch,
> but I do have it within entry_dealloc (up to and including considering
> dropped texts), which FWIW is far more important.

To be clear: I wasn't sure why you thought I falsely count entries with
dropped texts within entry_dealloc(). I suppose my sense was that
dropped texts ought to not make garbage collection occur too
frequently, which could also be a problem.

Garbage collection ought to occur when the size of the query text file
becomes excessive relative to useful entries. I was worried about the
thrashing risk from dropped text entries. Maybe we could, as an
alternative, not forget the original size of dropped query texts,
relying only on their offset to indicate the text is invalid. Dropped
query texts would then not be special in that sense, which seems like
a good thing all around.

-- 
Peter Geoghegan




Re: [HACKERS] Less than ideal error reporting in pg_stat_statements

2015-10-04 Thread Tom Lane
Peter Geoghegan  writes:
> On Sun, Oct 4, 2015 at 1:01 PM, Tom Lane  wrote:
>> Hm.  The problem I've got with this is that then mean_query_len means
>> something significantly different after entry_dealloc than it does
>> after gc_texts.
>> 
>> I'd be okay with changing *both* of those functions to ignore sticky
>> entries in the calculation, if that seems reasonable to you.

> That seems perfectly reasonable, yes. Should I leave that to you?

Sure, I can take it.

regards, tom lane




Re: [HACKERS] Less than ideal error reporting in pg_stat_statements

2015-10-04 Thread Peter Geoghegan
On Sun, Oct 4, 2015 at 1:01 PM, Tom Lane  wrote:
> Ah, right, sorry.  I meant to make its result match what gc_texts would
> get, by not falsely counting entries with dropped texts.  That's not
> what you have in your patch but it seems like an easy enough fix.

I'm trying to make mean_query_len representative of *useful* entry
query length. I guess I don't have that within gc_texts in my patch,
but I do have it within entry_dealloc (up to and including considering
dropped texts), which FWIW is far more important.

>> I'd be quite happy if you did everything listed, and also left the
>> extra discrimination against sticky entries within entry_dealloc in --
>> consider what happens when a huge malloc() ends up swapping with an
>> exclusive lock held, and consider that repeated, failed data
>> integration transactions are implicated in this in a big way when a
>> problem appears in the wild. A big part of the problem here was that
>> garbage collection did not run often enough.
>
> Hm.  The problem I've got with this is that then mean_query_len means
> something significantly different after entry_dealloc than it does
> after gc_texts.
>
> I'd be okay with changing *both* of those functions to ignore sticky
> entries in the calculation, if that seems reasonable to you.

That seems perfectly reasonable, yes. Should I leave that to you?

>> In other words, I'd be fine with *not* doing the query size filter
>> thing for now, since that is something that seems like an extra
>> defense and not core to the problem. I was kind of ambivalent about
>> doing that part myself, actually.
>
> Agreed on that part.

We're in full agreement on what needs to happen for the next point
release, then. Excellent.

-- 
Peter Geoghegan




Re: [HACKERS] Potential GIN vacuum bug

2015-10-04 Thread Jeff Janes
On Thu, Sep 3, 2015 at 10:42 AM, Jeff Janes  wrote:

> On Mon, Aug 31, 2015 at 12:10 AM, Jeff Janes  wrote:
>
>> On Sun, Aug 30, 2015 at 3:57 PM, Tom Lane  wrote:
>>
>>> Jeff Janes  writes:
>>> > On Sun, Aug 30, 2015 at 11:11 AM, Tom Lane  wrote:
>>>
>> > But we would still have to deal with the
>>> > fact that unconditional acquire attempt by the backends will cause a
>>> vacuum
>>> > to cancel itself, which is undesirable.
>>>
>>> Good point.
>>>
>>> > If we define a new namespace for
>>> > this lock (like the relation extension lock has its own namespace) then
>>> > perhaps the cancellation code could be made to not cancel on that
>>> > condition.  But that too seems like a lot of work to backpatch.
>>>
>>> We could possibly teach the autocancel logic to distinguish this lock
>>> type
>>> from others without using a new namespace.  That seems a bit klugy, but
>>> maybe better than adding a new namespace.  (For example, there are
>>> probably only a couple of modes in which we take page-level locks at
>>> present.  Choosing a currently unused, but self-exclusive, mode for
>>> taking
>>> such a lock might serve.)
>>>
>>
>> Like the attached?  (The conditional variant for user backends was
>> unceremoniously yanked out.)
>>
>
> A problem here is that now we have the user backends waiting on vacuum to
> do the clean up, but vacuum is using throttled IO and so taking its sweet
> time at it.  Under the old code, the user backend could pass up the vacuum
> while it was sleeping.
>
> Maybe we could have the vacuum detect when someone is waiting on it, and
> temporarily suspend throttling and just run at full speed.  I don't believe
> there is any precedence for that, but I do think there are other places
> where such a technique could be useful.  That is kind of a scary change to
> backpatch.
>
> I am running out of ideas.
>


Teodor published a patch in another thread:
http://www.postgresql.org/message-id/56041b26.2040...@sigaev.ru but I
thought it would be best to discuss it here.

It is similar to my most recent patch.

He removes the parts of the code that anticipate concurrent cleanup and
replaces them with asserts, which I was too lazy to do until we have a
final design.

He uses a different lock mode (ExclusiveLock, rather than
ShareUpdateExclusiveLock) when heavy-locking the metapage.  It doesn't make
a difference, as long as it is self-exclusive and no one else uses it in a
way that causes false sharing (which is currently the case--the only other
user of PageLocks is the hash index code).

He always does conditional locks in regular backends.  That means he
doesn't have to hack the lmgr to prevent vacuum from canceling itself, the
way I did.  It also means there is not the "priority inversion" I mention
above, where a user backend blocks on vacuum, but vacuum is intentionally
throttling itself.

On the other hand, using conditional locks for normal backends does mean
that the size of the pending list can increase without limit, as there is
nothing to throttle the user backends from adding tuples faster than they
are cleaned up.  Perhaps worse, it can pin down a vacuum worker without
limit, as it keeps finding that more pages have been added by the time it
finishes the prior set of pages.  I actually do see this in my (not very
realistic) testing.

I think that for correctness, vacuum only needs to clean the part of the
pending list which existed as of the time vacuum started.  So as long as it
gets that far, it can just be done even if more pages have since been
added.  I'm not sure of the best way to implement that; I guess you would
remember the blkno of the tail page from when you started, and set a flag
once you truncated away a page with that same blkno.  That would solve the
problem of pinning down a vacuum worker for an unlimited amount of time,
but would not solve the unlimited growth of the pending list.
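
To make that concrete, something roughly like this -- schematic only, with
the metapage locking, WAL-logging and the actual tuple-moving elided, and
not code from either patch ("index" is the GIN index relation, "metadata"
the GinMetaPageData read from the metapage):

/* Remember the tail of the pending list as of the start of this vacuum. */
BlockNumber stop_blkno = metadata->tail;
BlockNumber blkno = metadata->head;
bool        hit_original_tail = false;

while (blkno != InvalidBlockNumber && !hit_original_tail)
{
    Buffer      buffer = ReadBuffer(index, blkno);
    Page        page = BufferGetPage(buffer);
    BlockNumber next = GinPageGetOpaque(page)->rightlink;

    /* ... move this page's pending tuples into the main index ... */

    if (blkno == stop_blkno)
        hit_original_tail = true;  /* pages appended later are left for a future pass */

    /* ... unlink and delete the processed page, updating the metapage ... */

    ReleaseBuffer(buffer);
    blkno = next;
}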

Cheers,

Jeff


Re: [HACKERS] check fails on Fedora 23

2015-10-04 Thread Andrew Dunstan



On 10/04/2015 12:52 PM, Pavel Stehule wrote:




Isn't this arguably a Fedora regression? What did they change
in F23 to make it fail? I note that F23 is still in Beta.


It works on F22 - so it looks like a regression in some Fedora
component.


Can somebody repeat the check on FC23?


Yes, I have reproduced it.

cheers

andrew



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Less than ideal error reporting in pg_stat_statements

2015-10-04 Thread Tom Lane
Peter Geoghegan  writes:
> I'm not clear on what you actually propose to do to "make
> entry_dealloc's recomputation of mean_query_len sane", but I think you
> are talking about something distinct from what I've proposed

Ah, right, sorry.  I meant to make its result match what gc_texts would
get, by not falsely counting entries with dropped texts.  That's not
what you have in your patch but it seems like an easy enough fix.

> I'd be quite happy if you did everything listed, and also left the
> extra discrimination against sticky entries within entry_dealloc in --
> consider what happens when a huge malloc() ends up swapping with an
> exclusive lock held, and consider that repeated, failed data
> integration transactions are implicated in this in a big way when a
> problem appears in the wild. A big part of the problem here was that
> garbage collection did not run often enough.

Hm.  The problem I've got with this is that then mean_query_len means
something significantly different after entry_dealloc than it does
after gc_texts.

I'd be okay with changing *both* of those functions to ignore sticky
entries in the calculation, if that seems reasonable to you.
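
Roughly what I have in mind, i.e. both call sites computing the mean the
same way -- schematic and untested, using the existing pg_stat_statements
field names:

Size        totlen = 0;
int         nvalid = 0;
pgssEntry  *entry;
HASH_SEQ_STATUS hash_seq;

hash_seq_init(&hash_seq, pgss_hash);
while ((entry = hash_seq_search(&hash_seq)) != NULL)
{
    /* skip sticky entries (never executed) so they don't skew the mean */
    if (entry->counters.calls == 0)
        continue;
    totlen += entry->query_len + 1;     /* +1 for terminating NUL */
    nvalid++;
}
mean_query_len = (nvalid > 0) ? totlen / nvalid : ASSUMED_LENGTH_INIT;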

> In other words, I'd be fine with *not* doing the query size filter
> thing for now, since that is something that seems like an extra
> defense and not core to the problem. I was kind of ambivalent about
> doing that part myself, actually.

Agreed on that part.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Less than ideal error reporting in pg_stat_statements

2015-10-04 Thread Peter Geoghegan
On Sun, Oct 4, 2015 at 9:27 AM, Tom Lane  wrote:
> Hm.  I'm unconvinced by the aspects of this that involve using
> mean_query_len as a filter on which texts will be accepted.  While that's
> not necessarily bad in the abstract, there are way too many implementation
> artifacts here that will result in random-seeming decisions about whether
> to normalize.

There are already plausible race conditions that can make query text
normalization not occur. I think that's much more likely in practice
to cause a failure to normalize than anything proposed here; I've
personally observed such things in the wild a few times already. Also,
note that mean_query_len is not used directly there -- I decided on
Max(ASSUMED_LENGTH_INIT, mean_query_len) instead. mean_query_len is
just for cases with very large query texts.

> * mean_query_len only gets recomputed in entry_dealloc(), which is only
> run if we exceed pgss_max, and gc_qtexts(), which is only run if we decide
> the query texts file is more than 50% bloat.  So there could be quite a
> long startup transient before the value gets off its initialization
> minimum, and I'm suspicious that there might be plausible use-cases where
> it never does.  So it's not so much "restrict to a multiple of the mean
> query len" as "restrict to some number that might once upon a time have
> had some relation to the mean query len, or maybe not".

ASSUMED_LENGTH_INIT * 5 is a pretty conservative lower bound, I'd say.
mean_query_len is only really used for cases where query texts are
much longer on average. So in order for that to be a problem, you'd
have to have what are, in an absolute sense, very large query texts. I
think I noticed no more than a handful of changes in the regression
tests, for example.

> * One could expect that after changing mean_query_len, the population of
> query texts would change character as a result of the filter behavior
> changing, so that convergence to stable behavior over the long haul is
> not exactly self-evident.

FWIW, I think that there is a feedback loop today, and that in problem
cases that was what allowed it to get out of hand.

> * As you've got it here, entry_dealloc() and gc_qtexts() don't compute
> mean_query_len the same way, because only one of them discriminates
> against sticky entries.  So the value would bounce about rather randomly
> based on which one had run last.

entry_dealloc() will naturally run far more frequently than
gc_qtexts(). That said, it would be better if they matched.

> * I'm not exactly convinced that sticky entries should be ignored for
> this purpose anyway.

I think that data integration transactions that fail repeatedly are
strongly implicated here in practice. That's behind the query size
filter thing that you may also take issue with, as well as this.

> Taking a step back, ISTM the real issue you're fighting here is lots of
> orphaned sticky entries, but the patch doesn't do anything directly to fix
> that problem.  I wonder if we should do something like making
> entry_dealloc() and/or gc_qtexts() aggressively remove sticky entries,
> or at least those with "large" texts.

Sticky entries are (almost by definition) always aggressively removed,
and I hesitate to give certain ones a lower usage_count to begin with,
which is the only way to directly be more aggressive that might work
better.

> I think the aspects of this patch that are reasonably uncontroversial are
> increasing the allowed malloc attempt size in gc_qtexts, flushing the
> query text file on malloc failure, fixing the missing cleanup steps after
> a gc failure, and making entry_dealloc's recomputation of mean_query_len
> sane (which I'll define for the moment as "the same as gc_qtexts would
> get").  Since we're hard against a release deadline, I propose to commit
> just those changes, and we can consider the idea of a query size filter
> and/or redefining mean_query_len at leisure.

I'm not clear on what you actually propose to do to "make
entry_dealloc's recomputation of mean_query_len sane", but I think you
are talking about something distinct from what I've proposed based on
your separate remarks about entry_dealloc and the extra discrimination
against sticky entries there (vis-a-vis calculating mean query
length). I can't decide exactly what you mean, though: neither
entry_dealloc nor gc_qtexts care about orphaned query texts in my
patch (or in master). Please clarify.

I'd be quite happy if you did everything listed, and also left the
extra discrimination against sticky entries within entry_dealloc in --
consider what happens when a huge malloc() ends up swapping with an
exclusive lock held, and consider that repeated, failed data
integration transactions are implicated in this in a big way when a
problem appears in the wild. A big part of the problem here was that
garbage collection did not run often enough.

In other words, I'd be fine with *not* doing the query size filter
thing for now, since that is something that seems like an extra
defense and not core to the problem. I was kind of ambivalent about
doing that part myself, actually.

Re: [HACKERS] about fsync in CLOG buffer write

2015-10-04 Thread Andres Freund
On 2015-10-04 12:14:05 -0700, Jeff Janes wrote:
> My (naive) expectation is that no additional locking is needed.
> 
> Once we decide to consult the clog, we already know the transaction is no
> longer in progress, so it can't be in-flight to change that clog entry we
> care about because it was required to have done that already.

Other xids on the same page can still be in progress and those
concurrently might need to be written to.

> Once we have verified (under existing locking) that the relevant page is
> already not in memory, we know it can't be dirty in memory.  If someone
> pulls it into memory after we observe it to be not there, it doesn't matter
> to us as whatever transaction they are about to change can't be the one we
> care about.

The read of the page from disk by a concurrent process might have happened
before our write, i.e. it would contain an unmodified page, and future
writes of that buffer would then overwrite the entry we wrote directly.  I
think there are a bunch of related issues.

Such things are currently prevented by the IO locks in slru.c.

> Is there a chance that, if we read a byte from the kernel when someone is
> in the process of writing adjacent bytes (or writing the same byte, with
> changes only to bits in it which we don't care about), the kernel will
> deliver us something which is neither the old value nor the new value, but
> some monstrosity?

Depends on the granularity of the write/read and the OS IIRC.


I don't think it's worth investing time and complexity to bypass SLRU in
certain cases. We should rather rewrite the thing completely.


Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] about fsync in CLOG buffer write

2015-10-04 Thread Jeff Janes
On Sat, Sep 12, 2015 at 5:21 PM, Andres Freund  wrote:

> On September 12, 2015 5:18:28 PM PDT, Jeff Janes 
> wrote:
> >On Wed, Sep 2, 2015 at 5:32 AM, Andres Freund 
> >wrote:
> >
> >> On 2015-09-10 19:39:59 +0800, 张广舟(明虚) wrote:
> >> > We found there is a fsync call when CLOG buffer
> >> > is written out in SlruPhysicalWritePage(). It is often called when
> >a
> >> backend
> >> > needs to check transaction status with SimpleLruReadPage().
> >>
> >> That's when there's not enough buffers available some other, and your
> >> case dirty, needs to be written out.
> >>
> >
> >Why bother to find a place to store the page in shared memory at all?
> >If
> >we just want to read it, and it isn't already in shared memory, then
> >why
> >not just ask the kernel for the specific byte we need?  The byte we
> >want to
> >read can't differ between shared memory and kernel, because it doesn't
> >exist in shared memory.
>
> I doubt that'd help - the next access would be more expensive, and we'd
> need to have a more complex locking regime. These pages aren't necessarily
> read only at that point.
>

My (naive) expectation is that no additional locking is needed.

Once we decide to consult the clog, we already know the transaction is no
longer in progress, so it can't be in-flight to change that clog entry we
care about because it was required to have done that already.

Once we have verified (under existing locking) that the relevant page is
already not in memory, we know it can't be dirty in memory.  If someone
pulls it into memory after we observe it to be not there, it doesn't matter
to us as whatever transaction they are about to change can't be the one we
care about.

Perhaps someone will want the same page later so that they can write to it
and so will have to pull it in.  But we have to play the odds, and the odds
are that a page already dirty in memory is more likely to need to be
written again in the near future than another page which was not already
dirty and is only needed with read intent.

If we are wrong, all that happens is someone later on has to do the same
work that we would have had to do anyway, at no greater cost than if we did
it now.  If we are right, we avoid an fsync to make room for a new page, and
then later on avoid someone else having to shove out the page we brought in
(or a different one) only to replace it with the same page we just wrote,
fsynced, and shoved out.
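
Just to make the byte arithmetic concrete, here is a standalone sketch of
the sort of direct read I have in mind.  It assumes the stock 8kB block
size and the usual 2-bits-per-transaction clog layout, and is obviously
nowhere near patch quality:

#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

#define BLCKSZ                  8192
#define CLOG_BITS_PER_XACT      2
#define CLOG_XACTS_PER_BYTE     4
#define CLOG_XACTS_PER_PAGE     (BLCKSZ * CLOG_XACTS_PER_BYTE)
#define SLRU_PAGES_PER_SEGMENT  32

/* Returns 0=in progress, 1=committed, 2=aborted, 3=subcommitted, -1=error */
static int
clog_status_from_file(const char *clog_dir, uint32_t xid)
{
    uint32_t    pageno = xid / CLOG_XACTS_PER_PAGE;
    uint32_t    segno = pageno / SLRU_PAGES_PER_SEGMENT;
    uint32_t    byteno = (xid % CLOG_XACTS_PER_PAGE) / CLOG_XACTS_PER_BYTE;
    uint32_t    bshift = (xid % CLOG_XACTS_PER_BYTE) * CLOG_BITS_PER_XACT;
    off_t       offset = (off_t) (pageno % SLRU_PAGES_PER_SEGMENT) * BLCKSZ + byteno;
    char        path[1024];
    unsigned char statusbyte;
    int         fd;
    int         status = -1;

    snprintf(path, sizeof(path), "%s/%04X", clog_dir, (unsigned int) segno);
    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    if (pread(fd, &statusbyte, 1, offset) == 1)
        status = (statusbyte >> bshift) & ((1 << CLOG_BITS_PER_XACT) - 1);
    close(fd);
    return status;
}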

Is there a chance that, if we read a byte from the kernel when someone is
in the process of writing adjacent bytes (or writing the same byte, with
changes only to bits in it which we don't care about), the kernel will
deliver us something which is neither the old value nor the new value, but
some monstrosity?

Cheers,

Jeff


Re: [HACKERS] DBT-3 with SF=20 got failed

2015-10-04 Thread Tom Lane
Tomas Vondra  writes:
> Anyway, I think you're right we're going in circles here. I think we 
> both presented all the arguments we had and we still disagree. I'm not 
> going to continue with this - I'm unlikely to win an argument against 
> two committers if that didn't happen until now. Thanks for the 
> discussion though.

Since we're hard up against the 9.5beta1 deadline, I've made an executive
decision to commit just the minimal change, which I view as being to
constrain the array size to MaxAllocSize where it has been all along.

I found a second rather serious bug in the new hash code while doing that
--- it failed to ensure nbuckets was actually a power of 2 --- which did
not improve my opinion of it one bit.

It's clear from this discussion that there's room for further improvement
in the hashtable sizing behavior, but I think that's 9.6 material at
this point.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] check fails on Fedora 23

2015-10-04 Thread Pavel Stehule
> Isn't this arguably a Fedora regression? What did they change in F23 to
>> make it fail? I note that F23 is still in Beta.
>>
>
It works on F22 - so it looks like a regression in some Fedora
component.

Can somebody repeat the check on FC23?

Regards

Pavel


Re: [HACKERS] Odd query execution behavior with extended protocol

2015-10-04 Thread Tom Lane
Shay Rojansky  writes:
>> Try adding a sync before the second execute.

> I tried inserting a Sync right before the second Execute, this caused an
> error with the message 'portal "MQ1" does not exist'.
> This seems like problematic behavior on its own, regardless of my issues
> here (Sync shouldn't be causing an implicit close of the portal, should
> it?).

Sync results in closing the transaction, if you've not explicitly executed
a BEGIN.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Odd query execution behavior with extended protocol

2015-10-04 Thread Tom Lane
Shay Rojansky  writes:
>> To my mind there is not a lot of value in performing Bind until you
>> are ready to do Execute.  The only reason the operations are separated
>> in the protocol is so that you can do multiple Executes with a row limit
>> on each one, to retrieve a large query result in chunks.

> So you would suggest changing my message chain to send Bind right after
> Execute, right? This would yield the following messages:

> P1/P2/D1/B1/E1/D2/B2/E2/S (rather than the current
> P1/D1/B1/P2/D2/B2/E1/C1/E2/C2/S)

> This would mean that I would switch to using named statements and the
> unnamed portal, rather than the current unnamed statement
> and named portals. If I recall correctly, I was under the impression that
> there are some PostgreSQL performance benefits to using the
> unnamed statement over named statements, although I admit I can't find any
> documentation backing that. Can you confirm that the two
> are equivalent performance-wise?

Hmm.  I do not recall exactly what performance optimizations apply to
those two cases; they're probably not "equivalent", though I do not think
the difference is major in either case.  TBH I was a bit surprised on
reading your message to hear that the system would take that sequence at
all; it's not obvious that it should be allowed to replace a statement,
named or not, while there's an open portal that depends on it.

I think you might have more issues with lifespans, since portals go away
at commit whereas named statements don't.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Less than ideal error reporting in pg_stat_statements

2015-10-04 Thread Tom Lane
Peter Geoghegan  writes:
> Attached, revised patch deals with the issues around removing the
> query text file when garbage collection encounters trouble. There is
> no reason to be optimistic about any error within gc_qtexts() not
> recurring during a future garbage collection. OOM might be an
> exception, but probably not, since gc_qtexts() is reached when a new
> entry is created with a new query text, which in general makes OOM
> progressively more likely.

Hm.  I'm unconvinced by the aspects of this that involve using
mean_query_len as a filter on which texts will be accepted.  While that's
not necessarily bad in the abstract, there are way too many implementation
artifacts here that will result in random-seeming decisions about whether
to normalize.  For instance:

* mean_query_len only gets recomputed in entry_dealloc(), which is only
run if we exceed pgss_max, and gc_qtexts(), which is only run if we decide
the query texts file is more than 50% bloat.  So there could be quite a
long startup transient before the value gets off its initialization
minimum, and I'm suspicious that there might be plausible use-cases where
it never does.  So it's not so much "restrict to a multiple of the mean
query len" as "restrict to some number that might once upon a time have
had some relation to the mean query len, or maybe not".

* One could expect that after changing mean_query_len, the population of
query texts would change character as a result of the filter behavior
changing, so that convergence to stable behavior over the long haul is
not exactly self-evident.

* As you've got it here, entry_dealloc() and gc_qtexts() don't compute
mean_query_len the same way, because only one of them discriminates
against sticky entries.  So the value would bounce about rather randomly
based on which one had run last.

* I'm not exactly convinced that sticky entries should be ignored for
this purpose anyway.


Taking a step back, ISTM the real issue you're fighting here is lots of
orphaned sticky entries, but the patch doesn't do anything directly to fix
that problem.  I wonder if we should do something like making
entry_dealloc() and/or gc_qtexts() aggressively remove sticky entries,
or at least those with "large" texts.

I think the aspects of this patch that are reasonably uncontroversial are
increasing the allowed malloc attempt size in gc_qtexts, flushing the
query text file on malloc failure, fixing the missing cleanup steps after
a gc failure, and making entry_dealloc's recomputation of mean_query_len
sane (which I'll define for the moment as "the same as gc_qtexts would
get").  Since we're hard against a release deadline, I propose to commit
just those changes, and we can consider the idea of a query size filter
and/or redefining mean_query_len at leisure.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] check fails on Fedora 23

2015-10-04 Thread Pavel Stehule
2015-10-04 17:52 GMT+02:00 Andrew Dunstan :

>
>
> On 10/04/2015 11:35 AM, Pavel Stehule wrote:
>
>>
>>
>>
>> > fails on assert
>>
>> Works for me ... what locale/collation are you running in?
>>
>>
>> LANG=cs_CZ.UTF-8
>>
>>
>> it depends on locale - it is working with C or en_US.UTF-8, but
>> doesn't work with Czech locale
>>
>>
>> and fails with Hungarian locales too
>>
>>
>>
>>
>
> Isn't this arguably a Fedora regression? What did they change in F23 to
> make it fail? I note that F23 is still in Beta.
>

Hard to say what can be wrong:

* locale
* gcc
* glibc

Regards

Pavel


>
> cheers
>
> andrew
>


Re: [HACKERS] check fails on Fedora 23

2015-10-04 Thread Andrew Dunstan



On 10/04/2015 11:35 AM, Pavel Stehule wrote:




> fails on assert

Works for me ... what locale/collation are you running in?


LANG=cs_CZ.UTF-8


it depends on locale - it is working with C or en_US.UTF-8, but
doesn't work with Czech locale


and fails with Hungarian locales too






Isn't this arguably a Fedora regression? What did they change in F23 to 
make it fail? I note that F23 is still in Beta.


cheers

andrew


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Odd query execution behavior with extended protocol

2015-10-04 Thread Shay Rojansky
>
> I'm fairly sure that the query snapshot is established at Bind time,
> which means that this SELECT will run with a snapshot that indeed
> does not see the effects of the UPDATE.
>
> To my mind there is not a lot of value in performing Bind until you
> are ready to do Execute.  The only reason the operations are separated
> in the protocol is so that you can do multiple Executes with a row limit
> on each one, to retrieve a large query result in chunks.
>

So you would suggest changing my message chain to send Bind right after
Execute, right? This would yield the following messages:

P1/P2/D1/B1/E1/D2/B2/E2/S (rather than the current
P1/D1/B1/P2/D2/B2/E1/C1/E2/C2/S)

This would mean that I would switch to using named statements and the
unnamed portal, rather than the current unnamed statement
and named portals. If I recall correctly, I was under the impression that
there are some PostgreSQL performance benefits to using the
unnamed statement over named statements, although I admit I can't find any
documentation backing that. Can you confirm that the two
are equivalent performance-wise?

Shay


Re: [HACKERS] Odd query execution behavior with extended protocol

2015-10-04 Thread Shay Rojansky
>
> Try adding a sync before the second execute.
>

I tried inserting a Sync right before the second Execute, this caused an
error with the message 'portal "MQ1" does not exist'.
This seems like problematic behavior on its own, regardless of my issues
here (Sync shouldn't be causing an implicit close of the portal, should
it?).


Re: [HACKERS] check fails on Fedora 23

2015-10-04 Thread Pavel Stehule
>>> > fails on assert
>>>
>>> Works for me ... what locale/collation are you running in?
>>>
>>
>> LANG=cs_CZ.UTF-8
>>
>
> it depends on locale - it is working with C or en_US.UTF-8, but doesn't
> work with Czech locale
>

and fails with Hungarian locales too


>
> Pavel
>
>
>>
>> Regards
>>
>> Pavel
>>
>>
>>
>>>
>>> regards, tom lane
>>>
>>
>>
>


Re: [HACKERS] check fails on Fedora 23

2015-10-04 Thread Pavel Stehule
2015-10-04 17:07 GMT+02:00 Pavel Stehule :

>
>
> 2015-10-04 16:37 GMT+02:00 Tom Lane :
>
>> Pavel Stehule  writes:
>> > I am testing PostgreSQL (master) on Fedora 23. The query
>>
>> > SELECT p1.oid, p1.proname, p2.oid, p2.proname
>> > FROM pg_proc AS p1, pg_proc AS p2
>> > WHERE p1.oid < p2.oid AND
>> > p1.prosrc = p2.prosrc AND
>> > p1.prolang = 12 AND p2.prolang = 12 AND
>> > (p1.proisagg = false OR p2.proisagg = false) AND
>> > (p1.prolang != p2.prolang OR
>> >  p1.proisagg != p2.proisagg OR
>> >  p1.prosecdef != p2.prosecdef OR
>> >  p1.proleakproof != p2.proleakproof OR
>> >  p1.proisstrict != p2.proisstrict OR
>> >  p1.proretset != p2.proretset OR
>> >  p1.provolatile != p2.provolatile OR
>> >  p1.pronargs != p2.pronargs);
>>
>> > fails on assert
>>
>> Works for me ... what locale/collation are you running in?
>>
>
> LANG=cs_CZ.UTF-8
>

it depends on locale - it is working with C or en_US.UTF-8, but doesn't
work with Czech locale

Pavel


>
> Regards
>
> Pavel
>
>
>
>>
>> regards, tom lane
>>
>
>


Re: [HACKERS] check fails on Fedora 23

2015-10-04 Thread Pavel Stehule
2015-10-04 16:37 GMT+02:00 Tom Lane :

> Pavel Stehule  writes:
> > I am testing PostgreSQL (master) on Fedora 23. The query
>
> > SELECT p1.oid, p1.proname, p2.oid, p2.proname
> > FROM pg_proc AS p1, pg_proc AS p2
> > WHERE p1.oid < p2.oid AND
> > p1.prosrc = p2.prosrc AND
> > p1.prolang = 12 AND p2.prolang = 12 AND
> > (p1.proisagg = false OR p2.proisagg = false) AND
> > (p1.prolang != p2.prolang OR
> >  p1.proisagg != p2.proisagg OR
> >  p1.prosecdef != p2.prosecdef OR
> >  p1.proleakproof != p2.proleakproof OR
> >  p1.proisstrict != p2.proisstrict OR
> >  p1.proretset != p2.proretset OR
> >  p1.provolatile != p2.provolatile OR
> >  p1.pronargs != p2.pronargs);
>
> > fails on assert
>
> Works for me ... what locale/collation are you running in?
>

LANG=cs_CZ.UTF-8

Regards

Pavel



>
> regards, tom lane
>


Re: [HACKERS] Odd query execution behavior with extended protocol

2015-10-04 Thread Tom Lane
Shay Rojansky  writes:
> Npgsql supports sending multiple SQL statements in a single packet via the
> extended protocol. This works fine, but when the second query SELECTs a
> value modified by the first's UPDATE, I'm getting a result as if the UPDATE
> hasn't yet occurred.

> The exact messages send by Npgsql are:

> Parse (UPDATE data SET name='foo' WHERE id=1), statement=unnamed
> Describe (statement=unnamed)
> Bind (statement=unnamed, portal=MQ0)
> Parse (SELECT * FROM data WHERE id=1), statement=unnamed
> Describe (statement=unnamed)
> Bind (statement=unnamed, portal=MQ1)
> Execute (portal=MQ0)
> Close (portal=MQ0)
> Execute (portal=MQ1)
> Close (portal=MQ1)
> Sync

I'm fairly sure that the query snapshot is established at Bind time,
which means that this SELECT will run with a snapshot that indeed
does not see the effects of the UPDATE.

To my mind there is not a lot of value in performing Bind until you
are ready to do Execute.  The only reason the operations are separated
in the protocol is so that you can do multiple Executes with a row limit
on each one, to retrieve a large query result in chunks.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] check fails on Fedora 23

2015-10-04 Thread Tom Lane
Pavel Stehule  writes:
> I am testing PostgreSQL (master) on Fedora 23. The query

> SELECT p1.oid, p1.proname, p2.oid, p2.proname
> FROM pg_proc AS p1, pg_proc AS p2
> WHERE p1.oid < p2.oid AND
> p1.prosrc = p2.prosrc AND
> p1.prolang = 12 AND p2.prolang = 12 AND
> (p1.proisagg = false OR p2.proisagg = false) AND
> (p1.prolang != p2.prolang OR
>  p1.proisagg != p2.proisagg OR
>  p1.prosecdef != p2.prosecdef OR
>  p1.proleakproof != p2.proleakproof OR
>  p1.proisstrict != p2.proisstrict OR
>  p1.proretset != p2.proretset OR
>  p1.provolatile != p2.provolatile OR
>  p1.pronargs != p2.pronargs);

> fails on assert

Works for me ... what locale/collation are you running in?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] row_security GUC, BYPASSRLS

2015-10-04 Thread Stephen Frost
* Noah Misch (n...@leadboat.com) wrote:
> On Mon, Sep 28, 2015 at 05:13:56PM -0400, Stephen Frost wrote:
> > * Noah Misch (n...@leadboat.com) wrote:
> > > In schema reviews, I will raise a red flag for use of this feature; the 
> > > best
> > > designs will instead use additional roles.  I forecast that PostgreSQL 
> > > would
> > > fare better with no owner-constrained-by-RLS capability.  Even so, others 
> > > want
> > > it, and FORCE ROW SECURITY would deliver it with an acceptable risk 
> > > profile.
> > 
> > I've attached a patch to implement it.  It's not fully polished but it's
> > sufficient for comment, I believe.  Additional comments, documentation
> > and regression tests are to be added, if we have agreement on the
> > grammer and implementation approach.
> 
> This patch has FORCE ROW LEVEL SECURITY take precedence over row_security=off,
> which thwarts pg_dump use of row_security=off to ensure dump completeness.

Fixed.

> Should this be a table-level flag, or should it be a policy-level flag?  A
> policy-level flag is more powerful.  If nobody really anticipates using that
> power, this table-level flag works for me.

Table-level seems the right level to me, and no one is calling for
policy-level.  Further, policy-level could be added later if there ends
up being significant interest.

> > > SECURITY_ROW_LEVEL_DISABLED could have been okay.  I removed an incomplete
> > > implementation (e.g. didn't affect CASCADE constraints).  Writing a full 
> > > one
> > > would be a mammoth job, and for what?  Setting the wrong SELECT policies 
> > > can
> > > disrupt _any_ application logic; no foreign keys or FORCE ROW SECURITY 
> > > need be
> > > involved.  Protecting just foreign keys brings some value, but it will not
> > > materially reduce the vigilance demanded of RLS policy authors and 
> > > reviewers.
> > 
> > I have a hard time with this.  We're not talking about the application
> > logic in this case, we're talking about the guarantees which the
> > database is making to the user, be it an application or an individual.
> 
> If disabling policies has an effect, table owners must be feeding conflicting
> requirements into the system.  Violating policies during referential integrity
> queries is not, in general, less serious than violating referential integrity
> itself.  Rules and triggers pose the same threat, and we let those break
> referential integrity.  I think the ideal in all of these cases is rather to
> detect the conflict and raise an error.  SECURITY_ROW_LEVEL_DISABLED and
> SECURITY_NOFORCE_RLS send policies in a third direction, neither the beaten
> path of rules/triggers nor the ideal.

The agreement between the user and the system with regard to permissions
and referential integrity is that referential integrity takes priority
over permissions.  Prior to FORCE and SECURITY_NOFORCE_RLS that is true
for RLS.  I don't believe it makes sense that adding FORCE would change
that agreement, nor do I agree that the ideal would be for the system to
throw errors when the permissions system would deny access during RI
checks.

While I appreciate that rules and triggers can break RI, the way RLS
works is consistent with the ACL system and FORCE should be consistent
with the ACL system and normal/non-FORCE RLS with regard to referential
integrity.

> > I've included a patch (the second in the set attached) which adds a
> > SECURITY_NOFORCE_RLS bit which has a much tighter scope- it only applies
> > after the regular owner check is done.  This reduces the risk of it
> > being set mistakenly dramatically, I believe.
> 
> Yes, that's far safer than SECURITY_ROW_LEVEL_DISABLED.  I assume the final
> design will let table owners completely bypass FORCE ROW LEVEL SECURITY under
> "SET row_security = off".  If so, SECURITY_NOFORCE_RLS poses negligible risk.

I've made that change.

> Functions differing only in s/ = true/ = false/?  ATExecEnableDisableTrigger()
> is a better model for this.

I've changed that to be one function.  As an independent patch, I'll do
the same for ATExecEnable/DisableRowSecurity.

Apologies about the timing, I had intended to get this done yesterday.

Barring further concerns, I'll push the attached later today with the
necessary catversion bump.

Thanks!

Stephen
From 810c8f6ea717303a537b1c80337f98d3ad282645 Mon Sep 17 00:00:00 2001
From: Stephen Frost 
Date: Mon, 28 Sep 2015 11:28:15 -0400
Subject: [PATCH] ALTER TABLE .. FORCE ROW LEVEL SECURITY

To allow users to force RLS to always be applied, even for table owners,
add ALTER TABLE .. FORCE ROW LEVEL SECURITY.

row_security=off overrides FORCE ROW LEVEL SECURITY, to ensure pg_dump
output is complete (by default).

Also add SECURITY_NOFORCE_RLS context to avoid data corruption when
ALTER TABLE .. FORCE ROW SECURITY is being used. The
SECURITY_NOFORCE_RLS security context is used only during referential
integrity checks and is only considered in check_enable_rls() after we
have already checked that the current u

Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2015-10-04 Thread Amir Rohan
On 10/04/2015 04:29 PM, Andres Freund wrote:
> On October 4, 2015 3:27:00 PM GMT+02:00, Amir Rohan  
> wrote:
> 
>> Perhaps it would help a little if you posted the latest patch here as
>> well? So that at least the app picks it up again.
> 
> You can add additional threads in the cf app.
> 

Done, thank you.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2015-10-04 Thread Andres Freund
On October 4, 2015 3:27:00 PM GMT+02:00, Amir Rohan  wrote:

>Perhaps it would help a little if you posted the latest patch here as
>well? So that at least the app picks it up again.

You can add additional threads in the cf app.

-- 
Please excuse brevity and formatting - I am writing this on my mobile phone.

Andres Freund  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: Rework access method interface

2015-10-04 Thread Amit Kapila
On Sat, Oct 3, 2015 at 5:07 PM, Petr Jelinek  wrote:

> On 2015-10-03 08:27, Amit Kapila wrote:
>
>> On Fri, Oct 2, 2015 at 8:14 PM, Alexander Korotkov
>> mailto:a.korot...@postgrespro.ru>> wrote:
>>  >
>>  >
>>  > I agree about staying with one SQL-visible function.
>>
>
> Okay, this does not necessarily mean there should be only one validation
> function in the C struct though. I wonder if it would be more future proof
> to name the C interface as something else than the current generic
> amvalidate. Especially considering that it basically only does opclass
> validation at the moment (It's IMHO saner in terms of API evolution to
> expand the struct with more validator functions in the future compared to
> adding arguments to the existing function).
>
>
I also agree with you that adding more arguments in the future might
not be a good idea for an exposed API.  I don't know how much improvement
we can get if we use a structure and then keep adding more members
to it based on future needs, but at least that way it will be less prone to
breakage.

I think adding multiple validator functions is another option, but that
doesn't sound like a good approach either, as it can make it harder to
understand which version of the API should be used.


>
>> Few assorted comments:
>>
>> 1.
>> +  * Get IndexAmRoutine structure from access method oid.
>> +  */
>> + IndexAmRoutine *
>> + GetIndexAmRoutine(Oid
>> amoid)
>> ...
>> + if (!RegProcedureIsValid
>> (amhandler))
>> + elog(ERROR, "invalid %u regproc", amhandler);
>>
>> I have noticed that currently, the above kind of error is reported
>> slightly
>> differently as in below code:
>> if (!RegProcedureIsValid(procOid)) \
>> elog(ERROR, "invalid %s regproc", CppAsString
>> (pname)); \
>>
>> If you feel it is better to do the way as it is in current code, then you
>> can change accordingly.
>>
>
> It's completely different use-case from existing code. And tbh I think it
> should have completely different and more informative error message
> something in the style of "index access method %s does not have a handler"
> (see for example GetFdwRoutineByServerId or transformRangeTableSample how
> this is handled for similar cases currently).
>
>
makes sense to me, but in that case isn't it better to use ereport
(as used in GetFdwRoutineByServerId()) rather than elog?


> This however brings another comment - I think it would be better if the
> GetIndexAmRoutine would be split into two interfaces. The GetIndexAmRoutine
> itself would accept the amhandler Oid and should just do the
> OidFunctionCall and then check the result is not NULL and possibly that it
> is an IndexAmRoutine node. And then all the
>
> (IndexAmRoutine*)DatumGetPointer(!OidFunctionCall0(accessMethodForm->amhandler));
> calls in the code should be replaced with calls to the GetIndexAmRoutine
> instead.
>
> The other routine (let's call it GetIndexAmRoutineByAmId for example)
> would get IndexAmRoutine from amoid by looking up catalog, doing that
> validation of amhandler Oid/regproc and calling the GetIndexAmRoutine.
>
>
+1, I think that will make this API's design closer to what we have
for the corresponding FDW API.
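
Something on the lines of the below sketch, loosely following the FDW
pattern -- names and error wording are placeholders, and it assumes the
patch's pg_am.amhandler column:

/* Call the handler function and sanity-check what it returns. */
IndexAmRoutine *
GetIndexAmRoutine(Oid amhandler)
{
    Datum           datum = OidFunctionCall0(amhandler);
    IndexAmRoutine *routine = (IndexAmRoutine *) DatumGetPointer(datum);

    if (routine == NULL || !IsA(routine, IndexAmRoutine))
        elog(ERROR, "index access method handler function %u did not return an IndexAmRoutine struct",
             amhandler);

    return routine;
}

/* Look up pg_am by OID, validate amhandler, then call the above. */
IndexAmRoutine *
GetIndexAmRoutineByAmId(Oid amoid)
{
    HeapTuple   tuple;
    Form_pg_am  amform;
    regproc     amhandler;

    tuple = SearchSysCache1(AMOID, ObjectIdGetDatum(amoid));
    if (!HeapTupleIsValid(tuple))
        elog(ERROR, "cache lookup failed for access method %u", amoid);
    amform = (Form_pg_am) GETSTRUCT(tuple);

    amhandler = amform->amhandler;
    if (!RegProcedureIsValid(amhandler))
        ereport(ERROR,
                (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
                 errmsg("index access method \"%s\" does not have a handler",
                        NameStr(amform->amname))));

    ReleaseSysCache(tuple);

    return GetIndexAmRoutine(amhandler);
}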



With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


[HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2015-10-04 Thread Amir Rohan
On 10/02/2015 03:33 PM, Michael Paquier wrote:
>
>

Michael, I'm afraid my email bungling has damaged your thread.

I didn't include an "In-reply-To" header when I posted:

trinity-b4a8035d-59af-4c42-a37e-258f0f28e44a-1443795007012@3capp-mailcom-lxa08.

And we subsequently had our discussion over there instead of here, where
the commitfest app is tracking it.

https://commitfest.postgresql.org/6/197/

Perhaps it would help a little if you posted the latest patch here as
well? So that at least the app picks it up again.

Apologies for my ML n00bness,
Amir





-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Odd query execution behavior with extended protocol

2015-10-04 Thread Andres Freund
On October 4, 2015 2:50:10 PM GMT+02:00, Shay Rojansky  wrote:
>>
>> > Npgsql supports sending multiple SQL statements in a single packet
>via
>> the extended protocol. This works fine, but when the second query
>SELECTs a
>> value modified by the first's UPDATE, I'm getting a result as if the
>> > UPDATE hasn't yet occurred.
>>
>> Looks like the first updating statement is not committed, assuming
>that
>> the two statements run in different transactions.
>>
>
>I did try to prefix the message chain with an explicit transaction
>BEGIN
>(with the several different isolation levels) without a difference in
>behavior.
>
>> The exact messages send by Npgsql are:
>> >
>> > Parse (UPDATE data SET name='foo' WHERE id=1), statement=unnamed
>> > Describe (statement=unnamed)
>> > Bind (statement=unnamed, portal=MQ0)
>> > Parse (SELECT * FROM data WHERE id=1), statement=unnamed
>> > Describe (statement=unnamed)
>> > Bind (statement=unnamed, portal=MQ1)
>> > Execute (portal=MQ0)
>> > Close (portal=MQ0)
>> > Execute (portal=MQ1)
>> > Close (portal=MQ1)
>> > Sync
>>
>> I never used Npgsql so I don't know if there is something missing
>there.
>> Would you need an explicit commit before closing MQ0?
>>
>
>I guess this is exactly my question to PostgreSQL... But unless I'm
>misunderstanding the transaction semantics I shouldn't need to commit
>the
>first UPDATE in order to see its effect in the second SELECT...

Try adding a sync before the second execute.

Andres

--- 
Please excuse brevity and formatting - I am writing this on my mobile phone.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Odd query execution behavior with extended protocol

2015-10-04 Thread Shay Rojansky
>
> > Npgsql supports sending multiple SQL statements in a single packet via
> the extended protocol. This works fine, but when the second query SELECTs a
> value modified by the first's UPDATE, I'm getting a result as if the
> > UPDATE hasn't yet occurred.
>
> Looks like the first updating statement is not committed, assuming that
> the two statements run in different transactions.
>

I did try to prefix the message chain with an explicit transaction BEGIN
(with the several different isolation levels) without a difference in
behavior.

> The exact messages send by Npgsql are:
> >
> > Parse (UPDATE data SET name='foo' WHERE id=1), statement=unnamed
> > Describe (statement=unnamed)
> > Bind (statement=unnamed, portal=MQ0)
> > Parse (SELECT * FROM data WHERE id=1), statement=unnamed
> > Describe (statement=unnamed)
> > Bind (statement=unnamed, portal=MQ1)
> > Execute (portal=MQ0)
> > Close (portal=MQ0)
> > Execute (portal=MQ1)
> > Close (portal=MQ1)
> > Sync
>
> I never used Npgsql so I don't know if there is something missing there.
> Would you need an explicit commit before closing MQ0?
>

I guess this is exactly my question to PostgreSQL... But unless I'm
misunderstanding the transaction semantics I shouldn't need to commit the
first UPDATE in order to see its effect in the second SELECT...

> Also I am not clear on what "statement=unnamed" means, but it is used
> twice. Is it possible that the update is overwritten with select before it
> executes?
>

statement=unnamed means that the destination statement is the unnamed
prepared statement (as described in
http://www.postgresql.org/docs/current/static/protocol-message-formats.html).
Right after the Parse I bind the unnamed statement, which I just parsed, to
portal MQ0. In other words, Npgsql first parses the two queries and binds
them to portals MQ0 and MQ1, and only then executes both portals.

BTW: Do you see the change after update in your DB if you look into it with
> another tool (e.g. psql)?
>

That's a good suggestion, I'll try to check it out, thanks!


[HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2015-10-04 Thread Michael Paquier
On Sat, Oct 3, 2015 at 10:47 PM, Michael Paquier wrote:
> On Sat, Oct 3, 2015 at 9:50 PM, Amir Rohan wrote:
 Block until recovery is finished, before testing. eliminate races, and
 avoid the stupid sleep(3) I used.
>>>
>>> TODO
>
> Well. I just recalled this item in the list of things you mentioned. I
> marked it but forgot to address it. It sounds right that we may want
> something using pg_isready in a loop as a node in recovery would
> reject connections.

I just hacked up an updated version with the following things:
- Optional argument for stop_node to define the stop mode of pg_ctl
- Addition of wait_for_node where pg_isready is used to wait until a
node is ready to accept queries
- Addition of a local lookup variable to track the last port assigned.
This accelerates get_free_port.
Regards,
-- 
Michael
diff --git a/src/bin/pg_rewind/RewindTest.pm b/src/bin/pg_rewind/RewindTest.pm
index a4c1737..ea219d7 100644
--- a/src/bin/pg_rewind/RewindTest.pm
+++ b/src/bin/pg_rewind/RewindTest.pm
@@ -125,38 +125,6 @@ sub check_query
 	}
 }
 
-# Run a query once a second, until it returns 't' (i.e. SQL boolean true).
-sub poll_query_until
-{
-	my ($query, $connstr) = @_;
-
-	my $max_attempts = 30;
-	my $attempts = 0;
-	my ($stdout, $stderr);
-
-	while ($attempts < $max_attempts)
-	{
-		my $cmd = [ 'psql', '-At', '-c', "$query", '-d', "$connstr" ];
-		my $result = run $cmd, '>', \$stdout, '2>', \$stderr;
-
-		chomp($stdout);
-		$stdout =~ s/\r//g if $Config{osname} eq 'msys';
-		if ($stdout eq "t")
-		{
-			return 1;
-		}
-
-		# Wait a second before retrying.
-		sleep 1;
-		$attempts++;
-	}
-
-	# The query result didn't change in 30 seconds. Give up. Print the stderr
-	# from the last attempt, hopefully that's useful for debugging.
-	diag $stderr;
-	return 0;
-}
-
 sub append_to_file
 {
 	my ($filename, $str) = @_;
diff --git a/src/test/Makefile b/src/test/Makefile
index b713c2c..d6e51eb 100644
--- a/src/test/Makefile
+++ b/src/test/Makefile
@@ -17,7 +17,7 @@ SUBDIRS = regress isolation modules
 # We don't build or execute examples/, locale/, or thread/ by default,
 # but we do want "make clean" etc to recurse into them.  Likewise for ssl/,
 # because the SSL test suite is not secure to run on a multi-user system.
-ALWAYS_SUBDIRS = examples locale thread ssl
+ALWAYS_SUBDIRS = examples locale thread ssl recovery
 
 # We want to recurse to all subdirs for all standard targets, except that
 # installcheck and install should not recurse into the subdirectory "modules".
diff --git a/src/test/perl/RecoveryTest.pm b/src/test/perl/RecoveryTest.pm
new file mode 100644
index 000..aa8998c
--- /dev/null
+++ b/src/test/perl/RecoveryTest.pm
@@ -0,0 +1,401 @@
+package RecoveryTest;
+
+# Set of common routines for recovery regression tests for a PostgreSQL
+# cluster. This includes global variables and methods that can be used
+# by the various set of tests present to set up cluster nodes and
+# configure them according to the test scenario wanted.
+#
+# Cluster nodes can be freely created using initdb or using the existing
+# base backup of another node, with minimum configuration done when the
+# node is created for the first time like having a proper port number.
+# It is then up to the test to decide what to do with the newly-created
+# node.
+#
+# Environment configuration of each node is available through a set
+# of global variables provided by this package, hashed depending on the
+# port number of a node:
+# - connstr_nodes connection string to connect to this node
+# - datadir_nodes to get the data folder of a given node
+# - archive_nodes for the location of the WAL archives of a node
+# - backup_nodes for the location of base backups of a node
+# - applname_nodes, application_name to use for a standby
+#
+# Nodes are identified by their port number, which should be unique
+# for each node of the cluster as it is run locally.
+
+use Cwd;
+use TestLib;
+use Test::More;
+
+use Archive::Tar;
+use IPC::Run qw(run start);
+
+use Exporter 'import';
+
+our @EXPORT = qw(
+	%connstr_nodes
+	%datadir_nodes
+	%backup_nodes
+	%archive_nodes
+	%applname_nodes
+
+	append_to_file
+	backup_node
+	disable_node
+	dump_node_info
+	enable_archiving
+	enable_node
+	enable_restoring
+	enable_streaming
+	get_free_port
+	init_node
+	init_node_from_backup
+	make_master
+	make_warm_standby
+	make_hot_standby
+	restart_node
+	start_node
+	stop_node
+	teardown_node
+);
+
+# Global variables for node data
+%datadir_nodes = ();	# PGDATA folders
+%backup_nodes = ();		# Backup base folder
+%archive_nodes = ();	# Archive base folder
+%connstr_nodes = ();	# Connection strings
+%applname_nodes = ();	# application_name used for standbys
+
+# Tracking of last port value assigned to accelerate free port lookup.
+# XXX: Should this part use PG_VERSION_NUM?
+my $last_port_assigned =  90600 % 16384 + 49152;
+
+# Database used for each connection attempt via psql
+$ENV{PGDATABASE} = "postgres";
+
+# Tracker of active nodes
+my @active_nodes = ()

Re: [HACKERS] check fails on Fedora 23

2015-10-04 Thread Pavel Stehule
#15 0x00469376 in main (argc=8, argv=0x16a45e0) at main.c:223
>>
>> Linux yen 4.2.1-300.fc23.x86_64+debug #1 SMP Mon Sep 21 21:58:30 UTC 2015
>> x86_64 x86_64 x86_64 GNU/Linux
>> gcc (GCC) 5.1.1 20150618 (Red Hat 5.1.1-4)
>>
>> Postgres 9.4.4 is working well
>>
>
>
configured with defaults - only --enable-cassert

Regards

Pavel


Re: [HACKERS] check fails on Fedora 23

2015-10-04 Thread Pavel Stehule
2015-10-04 10:50 GMT+02:00 Pavel Stehule :

> Hi
>
> I am testing PostgreSQL (master) on Fedora 23. The query
>
> SELECT p1.oid, p1.proname, p2.oid, p2.proname
> FROM pg_proc AS p1, pg_proc AS p2
> WHERE p1.oid < p2.oid AND
> p1.prosrc = p2.prosrc AND
> p1.prolang = 12 AND p2.prolang = 12 AND
> (p1.proisagg = false OR p2.proisagg = false) AND
> (p1.prolang != p2.prolang OR
>  p1.proisagg != p2.proisagg OR
>  p1.prosecdef != p2.prosecdef OR
>  p1.proleakproof != p2.proleakproof OR
>  p1.proisstrict != p2.proisstrict OR
>  p1.proretset != p2.proretset OR
>  p1.provolatile != p2.provolatile OR
>  p1.pronargs != p2.pronargs);
>
> fails on assert
>
> Program terminated with signal SIGABRT, Aborted.
> #0  0x7f3e1dfe5a98 in __GI_raise (sig=sig@entry=6) at
> ../sysdeps/unix/sysv/linux/raise.c:55
> 55  return INLINE_SYSCALL (tgkill, 3, pid, selftid, sig);
> (gdb) bt
> #0  0x7f3e1dfe5a98 in __GI_raise (sig=sig@entry=6) at
> ../sysdeps/unix/sysv/linux/raise.c:55
> #1  0x7f3e1dfe769a in __GI_abort () at abort.c:89
> #2  0x007c5401 in ExceptionalCondition
> (conditionName=conditionName@entry=0x935157 "!(compareResult < 0)",
> errorType=errorType@entry=0x802217 "FailedAssertion",
> fileName=fileName@entry=0x935147 "nodeMergejoin.c",
> lineNumber=lineNumber@entry=942) at assert.c:54
> #3  0x005eba9f in ExecMergeJoin (node=node@entry=0x175f120) at
> nodeMergejoin.c:942
> #4  0x005d3958 in ExecProcNode (node=node@entry=0x175f120) at
> execProcnode.c:480
> #5  0x005cfe87 in ExecutePlan (dest=0x177d1e0,
> direction=<optimized out>, numberTuples=0, sendTuples=<optimized out>,
> operation=CMD_SELECT, planstate=0x175f120, estate=0x175f008) at
> execMain.c:1562
> #6  standard_ExecutorRun (queryDesc=0x16c7e88, direction=<optimized out>,
> count=0) at execMain.c:342
> #7  0x006dd038 in PortalRunSelect (portal=portal@entry=0x16bed38,
> forward=forward@entry=1 '\001', count=0,
> count@entry=9223372036854775807, dest=dest@entry=0x177d1e0) at
> pquery.c:942
> #8  0x006de57e in PortalRun (portal=portal@entry=0x16bed38,
> count=count@entry=9223372036854775807,
> isTopLevel=isTopLevel@entry=1 '\001', dest=dest@entry=0x177d1e0,
> altdest=altdest@entry=0x177d1e0,
> completionTag=completionTag@entry=0x7ffe4f8236f0 "") at pquery.c:786
> #9  0x006db29b in exec_simple_query (
> query_string=0x1715318 "SELECT p1.oid, p1.proname, p2.oid,
> p2.proname\nFROM pg_proc AS p1, pg_proc AS p2\nWHERE p1.oid < p2.oid
> AND\np1.prosrc = p2.prosrc AND\np1.prolang = 12 AND p2.prolang = 12
> AND\n(p1.proisagg = f"...) at postgres.c:1105
> #10 PostgresMain (argc=, argv=argv@entry=0x16a57a0,
> dbname=0x16a5500 "regression", username=)
> at postgres.c:4033
> #11 0x0046810f in BackendRun (port=0x16c5f50) at postmaster.c:4204
> #12 BackendStartup (port=0x16c5f50) at postmaster.c:3880
> #13 ServerLoop () at postmaster.c:1683
> #14 0x0067e98b in PostmasterMain (argc=argc@entry=8,
> argv=argv@entry=0x16a45e0) at postmaster.c:1292
> #15 0x00469376 in main (argc=8, argv=0x16a45e0) at main.c:223
>
> Linux yen 4.2.1-300.fc23.x86_64+debug #1 SMP Mon Sep 21 21:58:30 UTC 2015
> x86_64 x86_64 x86_64 GNU/Linux
> gcc (GCC) 5.1.1 20150618 (Red Hat 5.1.1-4)
>
> Postgres 9.4.4 is working well
>

git bisect shows

4ea51cdfe85ceef8afabceb03c446574daa0ac23 is the first bad commit
commit 4ea51cdfe85ceef8afabceb03c446574daa0ac23
Author: Robert Haas 
Date:   Mon Jan 19 15:20:31 2015 -0500

Use abbreviated keys for faster sorting of text datums.

This commit extends the SortSupport infrastructure to allow operator
classes the option to provide abbreviated representations of Datums;
in the case of text, we abbreviate by taking the first few characters
of the strxfrm() blob.  If the abbreviated comparison is insufficent
to resolve the comparison, we fall back on the normal comparator.
This can be much faster than the old way of doing sorting if the
first few bytes of the string are usually sufficient to resolve the
comparison.

There is the potential for a performance regression if all of the
strings to be sorted are identical for the first 8+ characters and
differ only in later positions; therefore, the SortSupport machinery
now provides an infrastructure to abort the use of abbreviation if
it appears that abbreviation is producing comparatively few distinct
keys.  HyperLogLog, a streaming cardinality estimator, is included in
this commit and used to make that determination for text.

Peter Geoghegan, reviewed by me.
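
For anybody who wants to check the locale behaviour outside the server,
here is a small standalone check (just a sketch, not from any patch) --
abbreviated keys assume that strxfrm() key ordering matches strcoll()
ordering:

#include <locale.h>
#include <stdio.h>
#include <string.h>

static int
sign(int x)
{
    return (x > 0) - (x < 0);
}

int
main(int argc, char **argv)
{
    char        ka[8192];
    char        kb[8192];

    if (argc != 4)
    {
        fprintf(stderr, "usage: %s LOCALE STRING1 STRING2\n", argv[0]);
        return 2;
    }
    if (setlocale(LC_COLLATE, argv[1]) == NULL)
    {
        fprintf(stderr, "setlocale() failed for \"%s\"\n", argv[1]);
        return 2;
    }

    /* ordering according to strcoll() */
    int         c = sign(strcoll(argv[2], argv[3]));

    /* ordering according to strxfrm() keys, as abbreviated keys rely on */
    if (strxfrm(ka, argv[2], sizeof(ka)) >= sizeof(ka) ||
        strxfrm(kb, argv[3], sizeof(kb)) >= sizeof(kb))
    {
        fprintf(stderr, "strings too long for this toy\n");
        return 2;
    }
    int         x = sign(strcmp(ka, kb));

    printf("strcoll: %d, strxfrm: %d -> %s\n", c, x,
           (c == x) ? "consistent" : "MISMATCH");
    return (c == x) ? 0 : 1;
}

If it prints MISMATCH for strings coming out of the failing merge join,
that would point at the glibc locale code rather than at the new sorting
code.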



>
> Regards
>
> Pavel
>
>
>
>
>


[HACKERS] check fails on Fedora 23

2015-10-04 Thread Pavel Stehule
Hi

I am testing PostgreSQL (master) on Fedora 23. The query

SELECT p1.oid, p1.proname, p2.oid, p2.proname
FROM pg_proc AS p1, pg_proc AS p2
WHERE p1.oid < p2.oid AND
p1.prosrc = p2.prosrc AND
p1.prolang = 12 AND p2.prolang = 12 AND
(p1.proisagg = false OR p2.proisagg = false) AND
(p1.prolang != p2.prolang OR
 p1.proisagg != p2.proisagg OR
 p1.prosecdef != p2.prosecdef OR
 p1.proleakproof != p2.proleakproof OR
 p1.proisstrict != p2.proisstrict OR
 p1.proretset != p2.proretset OR
 p1.provolatile != p2.provolatile OR
 p1.pronargs != p2.pronargs);

fails on assert

Program terminated with signal SIGABRT, Aborted.
#0  0x7f3e1dfe5a98 in __GI_raise (sig=sig@entry=6) at
../sysdeps/unix/sysv/linux/raise.c:55
55  return INLINE_SYSCALL (tgkill, 3, pid, selftid, sig);
(gdb) bt
#0  0x7f3e1dfe5a98 in __GI_raise (sig=sig@entry=6) at
../sysdeps/unix/sysv/linux/raise.c:55
#1  0x7f3e1dfe769a in __GI_abort () at abort.c:89
#2  0x007c5401 in ExceptionalCondition
(conditionName=conditionName@entry=0x935157 "!(compareResult < 0)",
errorType=errorType@entry=0x802217 "FailedAssertion",
fileName=fileName@entry=0x935147 "nodeMergejoin.c",
lineNumber=lineNumber@entry=942) at assert.c:54
#3  0x005eba9f in ExecMergeJoin (node=node@entry=0x175f120) at
nodeMergejoin.c:942
#4  0x005d3958 in ExecProcNode (node=node@entry=0x175f120) at
execProcnode.c:480
#5  0x005cfe87 in ExecutePlan (dest=0x177d1e0, direction=<optimized out>, numberTuples=0, sendTuples=<optimized out>,
operation=CMD_SELECT, planstate=0x175f120, estate=0x175f008) at
execMain.c:1562
#6  standard_ExecutorRun (queryDesc=0x16c7e88, direction=<optimized out>,
count=0) at execMain.c:342
#7  0x006dd038 in PortalRunSelect (portal=portal@entry=0x16bed38,
forward=forward@entry=1 '\001', count=0,
count@entry=9223372036854775807, dest=dest@entry=0x177d1e0) at
pquery.c:942
#8  0x006de57e in PortalRun (portal=portal@entry=0x16bed38,
count=count@entry=9223372036854775807,
isTopLevel=isTopLevel@entry=1 '\001', dest=dest@entry=0x177d1e0,
altdest=altdest@entry=0x177d1e0,
completionTag=completionTag@entry=0x7ffe4f8236f0 "") at pquery.c:786
#9  0x006db29b in exec_simple_query (
query_string=0x1715318 "SELECT p1.oid, p1.proname, p2.oid,
p2.proname\nFROM pg_proc AS p1, pg_proc AS p2\nWHERE p1.oid < p2.oid
AND\np1.prosrc = p2.prosrc AND\np1.prolang = 12 AND p2.prolang = 12
AND\n(p1.proisagg = f"...) at postgres.c:1105
#10 PostgresMain (argc=<optimized out>, argv=argv@entry=0x16a57a0,
dbname=0x16a5500 "regression", username=<optimized out>)
at postgres.c:4033
#11 0x0046810f in BackendRun (port=0x16c5f50) at postmaster.c:4204
#12 BackendStartup (port=0x16c5f50) at postmaster.c:3880
#13 ServerLoop () at postmaster.c:1683
#14 0x0067e98b in PostmasterMain (argc=argc@entry=8,
argv=argv@entry=0x16a45e0)
at postmaster.c:1292
#15 0x00469376 in main (argc=8, argv=0x16a45e0) at main.c:223

Linux yen 4.2.1-300.fc23.x86_64+debug #1 SMP Mon Sep 21 21:58:30 UTC 2015
x86_64 x86_64 x86_64 GNU/Linux
gcc (GCC) 5.1.1 20150618 (Red Hat 5.1.1-4)

Postgres 9.4.4 is working well

Regards

Pavel