Re: [HACKERS] experimental: replace s_lock spinlock code with pthread_mutex on linux
On Wed, Jun 27, 2012 at 12:58:47AM +0200, Nils Goroll wrote: So it looks like using pthread_mutexes could at least be an option on Linux. Using futexes directly could be even cheaper. Note that below this you only have the futex(2) system call. Futexes require all counter manipulation to happen in userspace, just like now, so all the per-architecture stuff remains. On Linux, pthread mutexes are really just a thin wrapper on top of this. The futex(2) system call merely provides an interface for handling the blocking and waking of other processes and releasing locks on process exit (so everything can still work after a kill -9). So it's more a replacement for the SysV semaphores than anything else. Have a nice day, -- Martijn van Oosterhout klep...@svana.org http://svana.org/kleptog/ He who writes carelessly confesses thereby at the very outset that he does not attach much importance to his own thoughts. -- Arthur Schopenhauer
Re: [HACKERS] Schema version management
Robert, thank you for keeping this thread alive. Hopefully some more will join the discussion. I'm still hopeful the community can manage to agree upon acceptable tradeoffs and work-arounds to make this possible. I think the benefits clearly outweigh the minor issues of filenames, dumping order, etc. On Tue, Jun 26, 2012 at 6:04 PM, Robert Haas robertmh...@gmail.com wrote: I don't think either of these problems ought to be a complete show-stopper. It seems to me that the trade-off is that when object names are long, contain special characters, or are overloaded, we'll have to munge the names in some way to prevent collisions. That could mean that the names are not 100% stable, which would possibly produce some annoyance if you're using a VCS to track changes, but maybe that's an acceptable trade-off, because it shouldn't happen very often. If we could guarantee that identifiers less than 64 characters which are not overloaded and contain no special characters requiring quoting end up in an eponymous file, I think that would be good enough to make most of our users pretty happy. In other cases, I think the point would just be to make it work (with a funny name) rather than fail. I agree. It's not a problem if the filename is not identical to the name of the object, as long as the same name generates the same filename on all architectures. URL escaping would work, but converting all non-ascii characters to ascii would be nicer, and dropping any problematic characters, or replacing them with _ or any other suitable character. For the small fraction of users who have managed to find a good reason to name a function this/is\a/good.name/of/a\function.. the filename of such a function would be this_is_a_good_name_of_a_function. As long as the objects are dumped in the same order, there will be no merge problems when two developers commit changes of the same file. I think pg_dump does a reasonable job already making sure the order is always the same. 
How big is the problem, really? It would of course be a little easier to keep track of changes and do merging if all overloaded functions would be kept in separate files, but I see that as a minor feature request. As long as all objects with the same name are kept in separate files, that's good enough for my needs, and I have _a lot_ of functions, of which quite a few are overloaded. \i /home/postgres/database/gluepay-split/aml/FUNCTION/get_linkid.sql \i /home/postgres/database/gluepay-split/aml/FUNCTION/set_address.sql It would be better to use \ir here rather than hard-code path names, I think. Then you'd only need to require that all the files be in the same directory, rather than requiring them to be at a certain hard-coded place in the filesystem. I fully agree! I didn't know about the \ir feature. Best regards, Joel Jacobson
Re: [HACKERS] new --maintenance-db options
Tom Lane t...@sss.pgh.pa.us writes: Amit Kapila amit.kap...@huawei.com writes: [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Tom Lane The implementation I've wanted to see for some time is that you can start a standalone backend, but it speaks FE/BE protocol to its caller (preferably over pipes, so that there is no issue whatsoever of where you can securely put a socket or anything like that). Can't it be done by following the FE/BE protocol, but calling the server APIs directly at the required places? That wouldn't be easier, nor cleaner, and it would open us up to client-induced database corruption (from failure to follow APIs, crashes in the midst of an operation, memory stomps, etc). We decided long ago that we would never support truly embedded operation in the sense of PG executing in the client's process/address space. Okay. I like the design suggested above because it has many of the good properties of an embedded database (in particular, no need to manage or contact a server) but still keeps the client code at arm's length. In such a case, will that standalone backend manage other processes (wal writer, checkpoint, ...), or will there be no background processes, as in the current --single mode? Can there be any performance advantage in such a mode compared to the current case where client and server are on the same machine and use a Unix domain socket? With Regards, Amit Kapila. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] plpython issue with Win64 (PG 9.2)
Hi, On Windows 7 64bit, plpython is causing a server crash with the following test case:

CREATE PROCEDURAL LANGUAGE 'plpython3u';
CREATE OR REPLACE FUNCTION pymax (a integer, b integer)
RETURNS integer
AS $$
if a > b:
    return a
return b
$$ LANGUAGE plpython3u;
SELECT pymax(1, 2);

The server exits with the following exception:

Unhandled exception at 0x777a3483 in postgres.exe: 0xC00000FD: Stack overflow.
plpython3.dll!PLy_traceback(char * * xmsg, char * * tbmsg, int * tb_depth) Line 174 C
plpython3.dll!PLy_elog(int elevel, const char * fmt, ...) Line 67 C
plpython3.dll!PLyUnicode_AsString(_object * unicode) Line 96 C
plpython3.dll!PLy_traceback(char * * xmsg, char * * tbmsg, int * tb_depth) Line 176 + 0x8 bytes C
plpython3.dll!PLy_elog(int elevel, const char * fmt, ...) Line 67 C
plpython3.dll!PLyUnicode_AsString(_object * unicode) Line 96 C
...
...
plpython3.dll!PLy_traceback(char * * xmsg, char * * tbmsg, int * tb_depth) Line 176 + 0x8 bytes C
plpython3.dll!PLy_elog(int elevel, const char * fmt, ...) Line 67 C
plpython3.dll!PLyUnicode_AsString(_object * unicode) Line 96 C
plpython3.dll!PLy_traceback(char * * xmsg, char * * tbmsg, int * tb_depth) Line 176 + 0x8 bytes C

The server gets stuck in the following call loop:

... PLy_elog() -> PLy_traceback() -> PLyUnicode_AsString() -> PLyUnicode_Bytes() -> PLy_elog() ...

I think the primary reason that triggers this issue is that when PLyUnicode_Bytes() calls PyUnicode_AsEncodedString( ,WIN1252 /*Server encoding*/, ), it fails with NULL. I built the latest PG 9.2 source code with Python 3.2.2.3 using Visual Studio 2010. Thanks. Best Regards, Muhammad Asif Naeem
Re: [HACKERS] experimental: replace s_lock spinlock code with pthread_mutex on linux
Using futexes directly could be even cheaper. Note that below this you only have the futex(2) system call. I was only referring to the fact that we could save one function and one library call, which could make a difference for the uncontended case.
Re: [HACKERS] [PATCH 01/16] Overhaul walsender wakeup handling
On Tuesday, June 26, 2012 04:06:08 PM Andres Freund wrote: On Tuesday, June 26, 2012 04:01:26 PM Robert Haas wrote: On Fri, Jun 22, 2012 at 12:35 PM, Andres Freund and...@2ndquadrant.com wrote: Can you elaborate on that a bit? What scenarios did you play around with, and what does win mean in this context? I had two machines connected locally and set up HS and my prototype between them (not at once, obviously). The patch reduced the average latency between both nodes (measured by 'ticker' rows arriving in a table on the standby), the jitter in latency, and the amount of load I had to put on the master before the standby couldn't keep up anymore. I played with different loads: * multiple concurrent ~50MB COPYs * multiple concurrent ~50MB COPYs, pgbench * pgbench All three had a ticker running concurrently with synchronous_commit=off (because it shouldn't cause any difference in the replication pattern itself). The difference in average lag and cutoff was smallest with just pgbench running alone and biggest with COPY running alone. High jitter was most visible with just pgbench running alone, but that's likely just because the average lag was smaller. OK, that sounds pretty promising. I'd like to run a few performance tests on this just to convince myself that it doesn't lead to a significant regression in other scenarios. Assuming that doesn't turn up anything major, I'll go ahead and commit this. Independent testing would be great; it's definitely possible that I overlooked something, although I obviously don't think so ;). Can you provide a rebased version? It seems that one of the hunks in xlog.c no longer applies. Will do so. Not sure if I can finish it today though, I am in the midst of redoing the ilist and xlogreader patches. I guess tomorrow will suffice otherwise... Ok, attached are two patches: The first is the rebased version of the original patch with WalSndWakeupProcess renamed to WalSndWakeupProcessRequests (seems clearer). 
The second changes WalSndWakeupRequest and WalSndWakeupProcessRequests into macros as you requested before. I am not sure if it's a good idea or not. Anything else? Greetings, Andres -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services

From 164276afc60fa2451f28de996fcc54567e6168e2 Mon Sep 17 00:00:00 2001
From: Andres Freund and...@anarazel.de
Date: Tue, 29 May 2012 20:00:16 +0200
Subject: [PATCH 1/2] Overhaul walsender wakeup handling

The previous coding could miss xlog writeouts at several places. E.g. when wal was written out by the background writer or even after a commit if synchronous_commit=off. This could lead to delays in sending data to the standby of up to 7 seconds. To fix this, move the responsibility of notification to the layer where the necessary information is actually present. We take some care not to do the notification while we hold contended locks like WALInsertLock or WalWriteLock. Document the preexisting fact that we rely on SetLatch to be safe from within signal handlers and critical sections. 
This removes the temporary bandaid from 2c8a4e9be2730342cbca85150a2a9d876aa77ff6
---
 src/backend/access/transam/twophase.c | 21 -
 src/backend/access/transam/xact.c     |  7 --
 src/backend/access/transam/xlog.c     | 25 ++--
 src/backend/port/unix_latch.c         |  3 +++
 src/backend/port/win32_latch.c        |  4
 src/backend/replication/walsender.c   | 41 -
 src/include/replication/walsender.h   |  2 ++
 7 files changed, 67 insertions(+), 36 deletions(-)

diff --git a/src/backend/access/transam/twophase.c b/src/backend/access/transam/twophase.c
index e8fb78b..7f198c2 100644
--- a/src/backend/access/transam/twophase.c
+++ b/src/backend/access/transam/twophase.c
@@ -1042,13 +1042,6 @@ EndPrepare(GlobalTransaction gxact)
 	/* If we crash now, we have prepared: WAL replay will fix things */
-	/*
-	 * Wake up all walsenders to send WAL up to the PREPARE record immediately
-	 * if replication is enabled
-	 */
-	if (max_wal_senders > 0)
-		WalSndWakeup();
-
 	/* write correct CRC and close file */
 	if ((write(fd, &statefile_crc, sizeof(pg_crc32))) != sizeof(pg_crc32))
 	{
@@ -2045,13 +2038,6 @@ RecordTransactionCommitPrepared(TransactionId xid,
 	/* Flush XLOG to disk */
 	XLogFlush(recptr);
-	/*
-	 * Wake up all walsenders to send WAL up to the COMMIT PREPARED record
-	 * immediately if replication is enabled
-	 */
-	if (max_wal_senders > 0)
-		WalSndWakeup();
-
 	/* Mark the transaction committed in pg_clog */
 	TransactionIdCommitTree(xid, nchildren, children);
@@ -2133,13 +2119,6 @@ RecordTransactionAbortPrepared(TransactionId xid,
 	XLogFlush(recptr);
-	/*
-	 * Wake up all walsenders to send WAL up to the ABORT PREPARED record
-	 * immediately if replication is enabled
-	 */
-
Re: [HACKERS] [v9.3] Row-Level Security
On Jun27, 2012, at 07:18 , Kohei KaiGai wrote: The problem is the way to implement it. If we would have permission checks on planner stage, it cannot handle a case when user-id would be switched prior to executor stage, thus it needs something remedy to handle the scenario correctly. Instead of a unique plan per query, it might be a solution to generate multiple plans depending on user-id, and choose a proper one in executor stage. Which type of implementation is what everybody is asking for? I think you need to a) Determine the user-id at planning time, and insert the matching RLS clause b1) Either re-plan the query if the user-id changes between planning and execution time, which means making the user-id a part of the plan-cache key. b2) Or decree that for RLS purposes, it's the user-id at planning time, not execution time, that counts. best regards, Florian Pflug
Re: [HACKERS] Posix Shared Mem patch
On Wed, Jun 27, 2012 at 12:00 AM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: So, here's a patch. Instead of using POSIX shmem, I just took the expedient of using mmap() to map a block of MAP_SHARED|MAP_ANONYMOUS memory. The sysv shm is still allocated, but it's just a copy of PGShmemHeader; the real shared memory is the anonymous block. This won't work if EXEC_BACKEND is defined so it just falls back on straight sysv shm in that case. Um. I hadn't thought about the EXEC_BACKEND interaction, but that seems like a bit of a showstopper. I would not like to give up the ability to debug EXEC_BACKEND mode on Unixen. Would Posix shmem help with that at all? Why did you choose not to use the Posix API, anyway? It seemed more complicated. If we use the POSIX API, we've got to have code to find a non-colliding name for the shm, and we've got to arrange to clean it up at process exit. Anonymous shm doesn't require a name and goes away automatically when it's no longer in use. With respect to EXEC_BACKEND, I wasn't proposing to kill it, just to make it continue to use a full-sized sysv shm. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] proof concept - access to session variables on client side
2012/6/27 Tom Lane t...@sss.pgh.pa.us: Pavel Stehule pavel.steh...@gmail.com writes: Another thing I don't care for is the unannounced protocol extension. yes, it needs a protocol extension and a version increase too. But I'm not afraid of desynchronisation - the server doesn't send the 'v' message when the client doesn't support it. And you would know that how, exactly? A minor protocol version could be used: http://archives.postgresql.org/pgsql-hackers/2011-12/msg00025.php I don't know if this topic is done; I only remember this thread. Regards Pavel Stehule regards, tom lane
Re: [HACKERS] Posix Shared Mem patch
On Wed, Jun 27, 2012 at 3:50 AM, Tom Lane t...@sss.pgh.pa.us wrote: A.M. age...@themactionfaction.com writes: On 06/26/2012 07:30 PM, Tom Lane wrote: I solved this via fcntl locking. No, you didn't, because fcntl locks aren't inherited by child processes. Too bad, because they'd be a great solution otherwise. You claimed this last time and I replied: http://archives.postgresql.org/pgsql-hackers/2011-04/msg00656.php I address this race condition by ensuring that a lock-holding violator is the postmaster or a postmaster child. If such a condition is detected, the child exits immediately without touching the shared memory. POSIX shmem is inherited via file descriptors. This is possible because the locking API allows one to request which PID violates the lock. The child expects the lock to be held and checks that the PID is the parent. If the lock is not held, that means that the postmaster is dead, so the child exits immediately. OK, I went back and re-read the original patch, and I now agree that something like this is possible --- but I don't like the way you did it. The dependence on particular PIDs seems both unnecessary and risky. The key concept here seems to be that the postmaster first stakes a claim on the data directory by exclusive-locking a lock file. If successful, it reduces that lock to shared mode (which can be done atomically, according to the SUS fcntl specification), and then holds the shared lock until it exits. Spawned children will not initially have a lock, but what they can do is attempt to acquire shared lock on the lock file. If that fails, exit. If successful, *check to see that the parent postmaster is still alive* (ie, getppid() != 1). If so, the parent must have been continuously holding the lock, and the child has successfully joined the pool of shared lock holders. Otherwise bail out without having changed anything. It is the "parent is still alive" check, not any test on individual PIDs, that makes this work. 
There are two concrete reasons why I don't care for the GetPIDHoldingLock() way. Firstly, the fact that you can get a blocking PID from F_GETLK isn't an essential part of the concept of file locking IMO --- it's just an incidental part of this particular API. May I remind you that the reason we're stuck on SysV shmem in the first place is that we decided to depend on an incidental part of that API, namely nattch? I would like to not require file locking to have any semantics more specific than a process can hold an exclusive or a shared lock on a file, which is auto-released at process exit. Secondly, in an NFS world I don't believe that the returned l_pid value can be trusted for anything. If it's a PID from a different machine then it might accidentally conflict with one on our machine, or not. Reflecting on this further, it seems to me that the main remaining failure modes are (1) file locking doesn't work, or (2) idiot DBA manually removes the lock file. Both of these could be ameliorated with some refinements to the basic idea. For (1), I suggest that we tweak the startup process (only) to attempt to acquire exclusive lock on the lock file. If it succeeds, we know that file locking is broken, and we can complain. (This wouldn't help for cases where cross-machine locking is broken, but I see no practical way to detect that.) For (2), the problem really is that the proposed patch conflates the PID file with the lock file, but people are conditioned to think that PID files are removable. I suggest that we create a separate, permanently present file that serves only as the lock file and doesn't ever get modified (it need have no content other than the string "Don't remove this!"). It'd be created by initdb, not by individual postmaster runs; indeed the postmaster should fail if it doesn't find the lock file already present. 
The postmaster PID file should still exist with its current contents, but it would serve mostly as documentation and as server-contact information for pg_ctl; it would not be part of the data directory locking mechanism. I wonder whether this design can be adapted to Windows? IIRC we do not have a bulletproof data directory lock scheme for Windows. It seems like this makes few enough demands on the lock mechanism that there ought to be suitable primitives available there too. I assume you're saying we need to make changes in the internal API, right? Because we already have a Windows-native implementation of shared memory that AFAIK works, so if the new Unix stuff can be done with the same internal APIs, it shouldn't need to be changed. (Sorry, haven't followed the thread in detail) If so - can we define exactly what properties it is we *need*? (A native API worth looking at is e.g. http://msdn.microsoft.com/en-us/library/windows/desktop/aa365203(v=vs.85).aspx - but there are probably others as well if that one doesn't do) -- Magnus Hagander Me:
Re: [HACKERS] plpython issue with Win64 (PG 9.2)
On 27/06/12 11:51, Asif Naeem wrote: Hi, On Windows 7 64bit, plpython is causing a server crash with the following test case:

CREATE PROCEDURAL LANGUAGE 'plpython3u';
CREATE OR REPLACE FUNCTION pymax (a integer, b integer)
RETURNS integer
AS $$
if a > b:
    return a
return b
$$ LANGUAGE plpython3u;
SELECT pymax(1, 2);

I think the primary reason that triggers this issue is that when PLyUnicode_Bytes() calls PyUnicode_AsEncodedString( ,WIN1252 /*Server encoding*/, ), it fails with NULL. I built the latest PG 9.2 source code with Python 3.2.2.3 using Visual Studio 2010. Thanks. I'll try to reproduce this on Linux, which should be possible given the results of your investigation. Cheers, Jan
Re: [HACKERS] pg_terminate_backend for same-role
On Tue, Jun 26, 2012 at 10:58 PM, Robert Haas robertmh...@gmail.com wrote: On Thu, Mar 29, 2012 at 3:04 AM, Daniel Farina dan...@heroku.com wrote: On Mon, Mar 26, 2012 at 12:14 AM, Jeff Davis pg...@j-davis.com wrote: On Tue, 2012-03-20 at 01:38 -0700, Daniel Farina wrote: On Thu, Mar 15, 2012 at 9:39 PM, Fujii Masao masao.fu...@gmail.com wrote: On Fri, Mar 16, 2012 at 8:14 AM, Daniel Farina dan...@heroku.com wrote: Parallel to pg_cancel_backend, it'd be nice to allow the user to just outright kill a backend that they own (politely, with a SIGTERM), aborting any transactions in progress, including the idle transaction, and closing the socket. +1 Here's a patch implementing the simple version, with no more guards against signal racing than have been seen previously. The more elaborate variants to close those races is being discussed in a parallel thread, but I thought I'd get this simple version out there. Review: After reading through the threads, it looks like there was no real objection to this approach -- pid recycling is not something we're concerned about. I think you're missing a doc update though, in func.sgml: For the less restrictive <function>pg_cancel_backend</function>, the role of an active backend can be found from the <structfield>usename</structfield> column of the <structname>pg_stat_activity</structname> view. Also, high-availability.sgml makes reference to pg_cancel_backend(), and it might be worthwhile to add an ...and pg_terminate_backend() there. Other than that, it looks good to me. Good comments. Patch attached to address the doc issues. The only iffy thing is the paragraph For the less restrictive..., which I have opted to remove in its entirety. I think the documents are already pretty clear about the same-user rule, and I'm not sure if this is the right place for a crash-course on attributes in pg_stat_activity (but maybe it is). ...and pg_terminate_backend seems exactly right. 
And I think now that the system post-patch doesn't have such a strange contrast between the ability to send SIGINT vs. SIGTERM, such a contrast may not be necessary. I'm also not sure what the policy is about filling paragraphs in the manual. I filled one, which increases the sgml churn a bit. git (show|diff) --word-diff helps clean it up. I went ahead and committed this. I kinda think we should back-patch this into 9.2. It doesn't involve a catalog change, and would make the behavior consistent between the two releases, instead of changing in 9.1 and then changing again in 9.2. Thoughts? +1. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Re: [HACKERS] [v9.3] Row-Level Security
2012/6/27 Florian Pflug f...@phlo.org: On Jun27, 2012, at 07:18 , Kohei KaiGai wrote: The problem is the way to implement it. If we would have permission checks on planner stage, it cannot handle a case when user-id would be switched prior to executor stage, thus it needs something remedy to handle the scenario correctly. Instead of a unique plan per query, it might be a solution to generate multiple plans depending on user-id, and choose a proper one in executor stage. Which type of implementation is what everybody is asking for? I think you need to a) Determine the user-id at planning time, and insert the matching RLS clause b1) Either re-plan the query if the user-id changes between planning and execution time, which means making the user-id a part of the plan-cache key. b2) Or decree that for RLS purposes, it's the user-id at planning time, not execution time, that counts. My preference is b1, because the b2 approach introduces user-visible changes in the concept of permission checks. Probably, the plan-cache should also be invalidated when the user's properties are modified or a grant/revoke is issued, in addition to changes to the table itself. Thanks, -- KaiGai Kohei kai...@kaigai.gr.jp
Re: [HACKERS] [PATCH] Lazy hashaggregate when no aggregation is needed
Sorry about the delay in answering. I have been swamped with non-PG related things lately. On Tue, Jun 26, 2012 at 11:08 PM, Robert Haas robertmh...@gmail.com wrote: On Fri, Jun 22, 2012 at 10:12 AM, Robert Haas robertmh...@gmail.com wrote: On Tue, Jun 19, 2012 at 5:41 AM, Etsuro Fujita fujita.ets...@lab.ntt.co.jp wrote: I'm confused by this remark, because surely the query planner does it this way only if there's no LIMIT. When there is a LIMIT, we choose based on the startup cost plus the estimated fraction of the total cost we expect to pay based on dividing the LIMIT by the overall row count estimate. Or is this not what you're talking about? My reasoning was that query_planner returns the cheapest-total path and the cheapest fractional presorted path (by the aggregation pathkeys). When evaluating hash-aggregates with this patch these two are indeed compared considering the estimated fraction of the total cost, but this might miss the cheapest fractional unordered path for lazy hashaggregates. Reviewing the code now, I discovered this path could be picked out from the pathlist, just like it is done by get_cheapest_fractional_path_for_pathkeys when pathkeys is nil. This would need to be returned in addition to the other two paths. To minimize overhead, this should only be done when we possibly want to consider lazy hash-aggregation (there is a group clause with no aggregates and grouping is hashable). But this is starting to get pretty crufty considering that there don't seem to be any really compelling use cases for this. Ants, do you intend to update this patch for this CommitFest? Or at all? It seems nobody's too excited about this, so I'm not sure whether it makes sense for you to put more work on it. But please advise as to your plans. If anyone thinks that this patch might be worth considering, then I'm prepared to do minor cleanup this CF (I saw some possibly unnecessary cruft in agg_fill_hash_and_retrieve). 
On the other hand, if you think the use case is too marginal to consider for inclusion then I won't shed a tear if this gets rejected. For me this was mostly a learning experience for poking around in the planner. Ants Aasma -- Cybertec Schönig Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt Web: http://www.postgresql-support.de
Re: [HACKERS] [v9.3] Row-Level Security
On Wed, Jun 27, 2012 at 7:21 AM, Florian Pflug f...@phlo.org wrote: On Jun27, 2012, at 07:18 , Kohei KaiGai wrote: The problem is the way to implement it. If we would have permission checks on planner stage, it cannot handle a case when user-id would be switched prior to executor stage, thus it needs something remedy to handle the scenario correctly. Instead of a unique plan per query, it might be a solution to generate multiple plans depending on user-id, and choose a proper one in executor stage. Which type of implementation is what everybody is asking for? I think you need to a) Determine the user-id at planning time, and insert the matching RLS clause b1) Either re-plan the query if the user-id changes between planning and execution time, which means making the user-id a part of the plan-cache key. b2) Or decree that for RLS purposes, it's the user-id at planning time, not execution time, that counts. Or b3, flag plans that depend on the user ID inside the plan-cache, and just flush all of those (but only those) when the user ID changes. In the common case where RLS is not used, that might ease the sting. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] pg_terminate_backend for same-role
On Wed, Jun 27, 2012 at 8:13 AM, Magnus Hagander mag...@hagander.net wrote: I went ahead and committed this. I kinda think we should back-patch this into 9.2. It doesn't involve a catalog change, and would make the behavior consistent between the two releases, instead of changing in 9.1 and then changing again in 9.2. Thoughts? +1. Three votes with no objections is good enough for me, so, done. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
[HACKERS] Reporting hba lines
When debugging strange and complex pg_hba lines, it can often be quite useful to know which line is matching a particular connection that failed for some reason. Because more often than not, it's actually not using the line in pg_hba.conf that's expected. The easiest way to do this is to emit an errdetail for the login failure, per this patch. Question is - is that leaking information to the client that we shouldn't be leaking? And if it is, what would be the preferred way to deal with it? We could put that as a detail to basically every single error message coming out of the auth system, but that seems like a bad idea. Or we could make a separate ereport(LOG) before sending it to the client, perhaps? Thoughts? -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Re: [HACKERS] [v9.3] Row-Level Security
2012/6/27 Robert Haas robertmh...@gmail.com: On Wed, Jun 27, 2012 at 7:21 AM, Florian Pflug f...@phlo.org wrote: On Jun27, 2012, at 07:18 , Kohei KaiGai wrote: The problem is the way to implement it. If we would have permission checks on planner stage, it cannot handle a case when user-id would be switched prior to executor stage, thus it needs something remedy to handle the scenario correctly. Instead of a unique plan per query, it might be a solution to generate multiple plans depending on user-id, and choose a proper one in executor stage. Which type of implementation is what everybody is asking for? I think you need to a) Determine the user-id at planning time, and insert the matching RLS clause b1) Either re-plan the query if the user-id changes between planning and execution time, which means making the user-id a part of the plan-cache key. b2) Or decree that for RLS purposes, it's the user-id at planning time, not execution time, that counts. Or b3, flag plans that depend on the user ID inside the plan-cache, and just flush all of those (but only those) when the user ID changes. In the common case where RLS is not used, that might ease the sting. Probably, PlannedStmt->invalItems allows handling invalidation of the plan-cache without big code changes. I'll try to add a user-id flag to track query plans that assume RLS, or InvalidOid if no RLS was applied in the plan. I'll investigate the implementation in more detail. Do we have any other scenario that runs a query plan under a different user privilege than the one at planner stage? Thanks, -- KaiGai Kohei kai...@kaigai.gr.jp
Re: [HACKERS] Reporting hba lines
On Wednesday, June 27, 2012 14:54:15, Magnus Hagander wrote: When debugging strange and complex pg_hba lines, it can often be quite useful to know which line is matching a particular connection that failed for some reason. Because more often than not, it's actually not using the line in pg_hba.conf that's expected. The easiest way to do this is to emit an errdetail for the login failure, per this patch. Question is - is that leaking information to the client that we shouldn't be leaking? I fear that is exactly the problem. Too bad, because the feature is interesting. And if it is, what would be the preferred way to deal with it? We could put that as a detail to basically every single error message coming out of the auth system, but that seems like a bad idea. Or we could make a separate ereport(LOG) before sending it to the client, perhaps? The problem is also that 1/ you're not sure how exactly you connected and 2/ you're not connecting the way you wanted to (my case with IPv4 vs IPv6 just below). Providing more information on what we receive from the client, and telling it to the client, sounds good, no? Thoughts? I just spent some time fixing an IPv6 vs IPv4 login failure; the matched line from pg_hba.conf would have been very useful. Maybe it is enough to report what happens on the connection: from which host, TCP/socket (and which one, given that we should have multiple options here soon), dbname, user. Inspecting pg_hba.conf will remain difficult if a map or some other +group option is used. -- Cédric Villemain +33 (0)6 20 30 22 52 http://2ndQuadrant.fr/ PostgreSQL: Support 24x7 - Développement, Expertise et Formation signature.asc Description: This is a digitally signed message part.
Re: [HACKERS] Posix Shared Mem patch
Magnus Hagander mag...@hagander.net writes: On Wed, Jun 27, 2012 at 3:50 AM, Tom Lane t...@sss.pgh.pa.us wrote: I wonder whether this design can be adapted to Windows? IIRC we do not have a bulletproof data directory lock scheme for Windows. It seems like this makes few enough demands on the lock mechanism that there ought to be suitable primitives available there too. I assume you're saying we need to make changes in the internal API, right? Because we already have a Windows-native implementation of shared memory that AFAIK works, Right, but does it provide honest protection against starting two postmasters in the same data directory? Or more to the point, does it prevent starting a new postmaster when the old postmaster crashed but there are still orphaned backends making changes? AFAIR we basically punted on those problems for the Windows port, for lack of an equivalent to nattch. regards, tom lane
Re: [HACKERS] Visual Studio 2012 RC
On 10.06.2012 11:41, Magnus Hagander wrote: On Sun, Jun 10, 2012 at 3:16 AM, Brar Pieningb...@gmx.de wrote: Magnus Hagander wrote: I don't have too much hope for them actually changing it - they seem hell-bent on forcing everybody into metro, and this seems to be yet another way to do that. But we can always hope... Looks like they've learnt their lesson... http://blogs.msdn.com/b/visualstudio/archive/2012/06/08/visual-studio-express-2012-for-windows-desktop.aspx Yeah. Didn't expect that to happen, but happy to see that it did! :-) I don't think we can realistically support VS2012 until Microsoft releases the gratis Visual Studio Express 2012 for Windows Desktop. We'll hardly find anyone willing to run a buildfarm machine with VS2012 until that, and I don't think any of the committers have access to VS2012 to test with. So, I'm bumping this to the next commitfest. By the time that begins, let's see if Microsoft has released it yet. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Posix Shared Mem patch
Robert Haas robertmh...@gmail.com writes: On Wed, Jun 27, 2012 at 12:00 AM, Tom Lane t...@sss.pgh.pa.us wrote: Would Posix shmem help with that at all? Why did you choose not to use the Posix API, anyway? It seemed more complicated. If we use the POSIX API, we've got to have code to find a non-colliding name for the shm, and we've got to arrange to clean it up at process exit. Anonymous shm doesn't require a name and goes away automatically when it's no longer in use. I see. Those are pretty good reasons ... With respect to EXEC_BACKEND, I wasn't proposing to kill it, just to make it continue to use a full-sized sysv shm. Well, if the ultimate objective is to get out from under the SysV APIs entirely, we're not going to get there if we still have to have all that code for the EXEC_BACKEND case. Maybe it's time to decide that we don't need to support EXEC_BACKEND on Unix. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Posix Shared Mem patch
All, * Tom Lane (t...@sss.pgh.pa.us) wrote: Robert Haas robertmh...@gmail.com writes: On Wed, Jun 27, 2012 at 12:00 AM, Tom Lane t...@sss.pgh.pa.us wrote: Would Posix shmem help with that at all? Why did you choose not to use the Posix API, anyway? It seemed more complicated. If we use the POSIX API, we've got to have code to find a non-colliding name for the shm, and we've got to arrange to clean it up at process exit. Anonymous shm doesn't require a name and goes away automatically when it's no longer in use. I see. Those are pretty good reasons ... After talking to Magnus a bit this morning regarding this, it sounds like what we're doing on Windows is closer to Anonymous shm, except that they use an intentionally specific name, which also allows them to detect if any children are still alive by using a create-if-not-exists approach on the shm segment and failing if it still exists. There were some corner cases around restarts due to it taking a few seconds for the Windows kernel to pick up on the fact that all the children are dead and that the shm segment should go away, but they were able to work around that, and failure to start is surely much better than possible corruption. What this all boils down to is- can you have a shm segment that goes away when no one is still attached to it, but actually give it a name and then detect if it already exists atomically on startup on Linux/Unixes? If so, perhaps we could use the same mechanism on both.. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] Posix Shared Mem patch
* Tom Lane (t...@sss.pgh.pa.us) wrote: Right, but does it provide honest protection against starting two postmasters in the same data directory? Or more to the point, does it prevent starting a new postmaster when the old postmaster crashed but there are still orphaned backends making changes? AFAIR we basically punted on those problems for the Windows port, for lack of an equivalent to nattch. See my other mail, but, after talking to Magnus, it's my understanding that we had that problem initially, but it was later solved by using a named shared memory segment which the kernel will clean up when all children are gone. That, combined with a 'create-if-exists' call, allows detection of lost children to be done. Thanks, Stephen signature.asc Description: Digital signature
[HACKERS] Re: [PATCH 07/16] Log enough data into the wal to reconstruct logical changes from it if wal_level=logical
On 13.06.2012 14:28, Andres Freund wrote:

@@ -2584,6 +2610,73 @@ l1:
 	rdata[1].buffer_std = true;
 	rdata[1].next = NULL;
+	/*
+	 * XXX: We could decide not to log changes when the origin is not the
+	 * local node, that should reduce redundant logging.
+	 */
+	if (need_tuple) {
+		xl_heap_header xlhdr;
...
+		relationFindPrimaryKey(relation, &indexoid, &pknratts, pkattnum, pktypoid, pkopclass);
+
+		if (!indexoid) {
+			elog(WARNING, "Could not find primary key for table with oid %u",
+				 relation->rd_id);
+			goto no_index_found;
+		}
+
+		index_rel = index_open(indexoid, AccessShareLock);
+
+		indexdesc = RelationGetDescr(index_rel);
+
+		for (natt = 0; natt < indexdesc->natts; natt++) {
+			idxvals[natt] =
+				fastgetattr(tp, pkattnum[natt], desc, &idxisnull[natt]);
+			Assert(!idxisnull[natt]);
+		}
+
+		idxtuple = heap_form_tuple(indexdesc, idxvals, idxisnull);
+
+		xlhdr.t_infomask2 = idxtuple->t_data->t_infomask2;
+		xlhdr.t_infomask = idxtuple->t_data->t_infomask;
+		xlhdr.t_hoff = idxtuple->t_data->t_hoff;
+
+		rdata[1].next = &(rdata[2]);
+		rdata[2].data = (char *) &xlhdr;
+		rdata[2].len = SizeOfHeapHeader;
+		rdata[2].buffer = InvalidBuffer;
+		rdata[2].next = NULL;
+
+		rdata[2].next = &(rdata[3]);
+		rdata[3].data = (char *) idxtuple->t_data + offsetof(HeapTupleHeaderData, t_bits);
+		rdata[3].len = idxtuple->t_len - offsetof(HeapTupleHeaderData, t_bits);
+		rdata[3].buffer = InvalidBuffer;
+		rdata[3].next = NULL;
+
+		heap_close(index_rel, NoLock);
+	no_index_found:
+		;
+	}
+
 	recptr = XLogInsert(RM_HEAP_ID, XLOG_HEAP_DELETE, rdata);
 	PageSetLSN(page, recptr);

It's not cool to do all that primary key lookup stuff within the critical section, while holding a lock on the page. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] Reporting hba lines
Magnus Hagander mag...@hagander.net writes: When debugging strange and complex pg_hba lines, it can often be quite useful to know which line is matching a particular connection that failed for some reason. Because more often than not, it's actually not using the line in pg_hba.conf that's expected. The easiest way to do this is to emit an errdetail for the login failure, per this patch. Question is - is that leaking information to the client that we shouldn't be leaking? Yes. And if it is, what would be the preferred way to deal with it? Report to the postmaster log only. errdetail_log should do. BTW, are you sure that auth_failed is only called in cases where an hba line has already been identified? Even if true today, it seems fairly risky to assume that. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Reporting hba lines
On Wed, Jun 27, 2012 at 3:55 PM, Tom Lane t...@sss.pgh.pa.us wrote: Magnus Hagander mag...@hagander.net writes: When debugging strange and complex pg_hba lines, it can often be quite useful to know which line is matching a particular connection that failed for some reason. Because more often than not, it's actually not using the line in pg_hba.conf that's expected. The easiest way to do this is to emit an errdetail for the login failure, per this patch. Question is - is that leaking information to the client that we shouldn't be leaking? Yes. And if it is, what would be the preferred way to deal with it? Report to the postmaster log only. errdetail_log should do. Oh, I wasn't aware we had that :) You learn something new every day. BTW, are you sure that auth_failed is only called in cases where an hba line has already been identified? Even if true today, it seems fairly risky to assume that. It is true today, but yes, it might be safe to guard against it with something like this? I also fixed the error message to follow the guidelines better - I think :) -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ hba_line.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Posix Shared Mem patch
On Wed, Jun 27, 2012 at 3:40 PM, Tom Lane t...@sss.pgh.pa.us wrote: Magnus Hagander mag...@hagander.net writes: On Wed, Jun 27, 2012 at 3:50 AM, Tom Lane t...@sss.pgh.pa.us wrote: I wonder whether this design can be adapted to Windows? IIRC we do not have a bulletproof data directory lock scheme for Windows. It seems like this makes few enough demands on the lock mechanism that there ought to be suitable primitives available there too. I assume you're saying we need to make changes in the internal API, right? Because we alreayd have a windows native implementation of shared memory that AFAIK works, Right, but does it provide honest protection against starting two postmasters in the same data directory? Or more to the point, does it prevent starting a new postmaster when the old postmaster crashed but there are still orphaned backends making changes? AFAIR we basically punted on those problems for the Windows port, for lack of an equivalent to nattch. No, we spent a lot of time trying to *fix* it, and IIRC we did. We create a shared memory segment with a fixed name based on the data directory. This shared memory segment is inherited by all children. It will automatically go away only when all processes that have an open handle to it go away (in fact, it can even take a second or two more, if they go away by crash and not by cleanup - we have a workaround in the code for that). But as long as there is an orphaned backend around, the shared memory segment stays around. We don't have nattch. But we do have nattch==0. Or something like that. You can work around it if you find two different paths to the same data directory (e.g. using junctions), but you are really actively trying to break the system if you do that... -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Re: [HACKERS] warning handling in Perl scripts
On 06/24/2012 04:05 PM, Robert Haas wrote: On Sun, Jun 24, 2012 at 2:40 PM, Peter Eisentrautpete...@gmx.net wrote: Every time I make a change to the structure of the catalog files, genbki.pl produces a bunch of warnings (like Use of uninitialized value in string eq at genbki.pl line ...), and produces corrupted output files, that are then (possibly) detected later by the compiler. Also, getting out of that is difficult because due to the complicated dependency relationship between the involved files, you need to remove a bunch of files manually, or clean everything. So error handling could be better. It seems that adding diff --git a/src/backend/catalog/genbki.pl b/src/backend/catalog/genbki.pl index ebc4825..7d66da9 100644 --- a/src/backend/catalog/genbki.pl +++ b/src/backend/catalog/genbki.pl @@ -19,6 +19,8 @@ use strict; use warnings; +local $SIG{__WARN__} = sub { die $_[0] }; + my @input_files; our @include_path; my $output_path = ''; would address that. Could that cause any other problems? Should it be added to all Perl scripts? This seems like a band-aid. How about if we instead add whatever error-handling the script is missing, so that it produces an appropriate, human-readable error message? I realise I'm late to this party, but I'm with Robert. The root cause of the errors should be fixed. That's not to say that making warnings fatal might not also be a good idea as a general defense mechanism. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Reporting hba lines
Magnus Hagander mag...@hagander.net writes: On Wed, Jun 27, 2012 at 3:55 PM, Tom Lane t...@sss.pgh.pa.us wrote: BTW, are you sure that auth_failed is only called in cases where an hba line has already been identified? Even if true today, it seems fairly risky to assume that. It is true today, but yes, it might be safe to guard against it with something like this? FWIW, the usual approach for conditionally emitting bits of an ereport is more like

ereport(FATAL,
		(errcode(errcode_return),
		 errmsg(errstr, port->user_name),
		 port->hba ? errdetail_log("Connection matched pg_hba.conf line %d",
								   port->hba->linenumber) : 0));

but that's just a nitpick. A bigger issue is that I'm not convinced that a line number will be tremendously helpful: it's easy to miscount lines, and a line number will certainly not be helpful in the frequent cases where people are modifying the wrong hba file. Can we show the source text of the hba line? regards, tom lane
[HACKERS] Regarding WAL Format Changes
While reading patch-1 ( http://archives.postgresql.org/pgsql-hackers/2012-06/txtFskHiYakjO.txt 1-use-uint64-got-segno.patch) of the WAL Format Changes (http://archives.postgresql.org/message-id/4FDA5136.6080206@enterprisedb.com), I had a few observations, summarized below:

1. Function headers for the following functions still contain references to log, seg: a. InstallXLogFileSegment() b. RemoveOldXlogFiles() c. XLogFileCopy() d. XLogGetLastRemoved() e. UpdateLastRemovedPtr() f. RemoveOldXlogFiles()

2. @@ -2680,8 +2645,8 @@ InstallXLogFileSegment(uint32 *log, uint32 *seg, char *tmppath,
 	LWLockRelease(ControlFileLock);
 	ereport(LOG,
 			(errcode_for_file_access(),
-			 errmsg("could not link file \"%s\" to \"%s\" (initialization of log file %u, segment %u): %m",
-					tmppath, path, *log, *seg)));
+			 errmsg("could not link file \"%s\" to \"%s\" (initialization of log file): %m",
+					tmppath, path)));

If the changed error message could contain the log file and segment number, it would be clearer. That should be easily deducible from the segment number.

3. -RemoveOldXlogFiles(uint32 log, uint32 seg, XLogRecPtr endptr)
+RemoveOldXlogFiles(XLogSegNo segno, XLogRecPtr endptr)
...
@@ -4016,8 +3953,9 @@ retry:
 	if (!(((XLogPageHeader) readBuf)->xlp_info & XLP_FIRST_IS_CONTRECORD))
 	{
 		ereport(emode_for_corrupt_record(emode, *RecPtr),
-				(errmsg("there is no contrecord flag in log file %u, segment %u, offset %u",
-						readId, readSeg, readOff)));
+				(errmsg("there is no contrecord flag in log segment %s, offset %u",
+						XLogFileNameP(curFileTLI, readSegNo),
+						readOff)));
 		goto next_record_is_invalid;
 	}
 	pageHeaderSize = XLogPageHeaderSize((XLogPageHeader) readBuf);
@@ -4025,10 +3963,13 @@ retry:
 	if (contrecord->xl_rem_len == 0 || total_len != (contrecord->xl_rem_len + gotlen))
 	{
+		char		fname[MAXFNAMELEN];
+
+		XLogFileName(fname, curFileTLI, readSegNo);
 		ereport(emode_for_corrupt_record(emode, *RecPtr),
-				(errmsg("invalid contrecord length %u in log file %u, segment %u, offset %u",
+				(errmsg("invalid contrecord length %u in log segment %s, offset %u",
 						contrecord->xl_rem_len,
-						readId, readSeg, readOff)));
+						XLogFileNameP(curFileTLI, readSegNo),
+						readOff)));
 		goto next_record_is_invalid;
 	}

For the above two changed error messages, 'log segment' is used for the filename. However, other similar changes use 'log file' for the filename. Some places use 'log segment' and others 'log file' — is there any particular reason for that?

4. @@ -533,33 +533,17 @@ pg_xlog_location_diff(PG_FUNCTION_ARGS)
-	/*
-	 * Sanity check
-	 */
-	if (loc1.xrecoff > XLogFileSize)
-		ereport(ERROR,
-				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
-				 errmsg("xrecoff \"%X\" is out of valid range, 0..%X", loc1.xrecoff, XLogFileSize)));
-	if (loc2.xrecoff > XLogFileSize)
-		ereport(ERROR,
-				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
-				 errmsg("xrecoff \"%X\" is out of valid range, 0..%X", loc2.xrecoff, XLogFileSize)));
+	bytes1 = (((uint64) loc1.xlogid) << 32L) + loc1.xrecoff;
+	bytes2 = (((uint64) loc2.xlogid) << 32L) + loc2.xrecoff;

Is there no chance that it can be out of valid range after the new changes, just a doubt?

5. --- a/src/backend/replication/walreceiver.c
+++ b/src/backend/replication/walreceiver.c
@@ -69,11 +69,12 @@ walrcv_disconnect_type walrcv_disconnect = NULL;
 /*
  * These variables are used similarly to openLogFile/Id/Seg/Off,
- * but for walreceiver to write the XLOG.
+ * but for walreceiver to write the XLOG. recvFileTLI is the TimeLineID

In the above comments, there is still a reference to Id/Seg/Off.

With Regards, Amit Kapila.
Re: [HACKERS] Posix Shared Mem patch
Magnus Hagander mag...@hagander.net writes: On Wed, Jun 27, 2012 at 3:40 PM, Tom Lane t...@sss.pgh.pa.us wrote: AFAIR we basically punted on those problems for the Windows port, for lack of an equivalent to nattch. No, we spent a lot of time trying to *fix* it, and IIRC we did. OK, in that case this isn't as interesting as I thought. If we do go over to a file-locking-based solution on Unix, it might be worthwhile changing to something similar on Windows. But it would be more about reducing coding differences between the platforms than plugging any real holes. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Reporting hba lines
On Wed, Jun 27, 2012 at 4:14 PM, Tom Lane t...@sss.pgh.pa.us wrote: Magnus Hagander mag...@hagander.net writes: On Wed, Jun 27, 2012 at 3:55 PM, Tom Lane t...@sss.pgh.pa.us wrote: BTW, are you sure that auth_failed is only called in cases where an hba line has already been identified? Even if true today, it seems fairly risky to assume that. It is true today, but yes, it might be safe to guard against it with something like this? FWIW, the usual approach for conditionally emitting bits of an ereport is more like ereport(FATAL, (errcode(errcode_return), errmsg(errstr, port-user_name), port-hba ? errdetail_log(Connection matched pg_hba.conf line %d, port-hba-linenumber) : 0)); Hmm. Ok. So it treats a 0/NULL there as a way to ignore it. I tried something with the NULL inside the errdetail, which obviously failed. but that's just a nitpick. A bigger issue is that I'm not convinced that a line number will be tremendously helpful: it's easy to miscount lines, and a line number will certainly not be helpful in the frequent Editors will help you count the lines, no? :-) cases where people are modifying the wrong hba file. Can we show the source text of the hba line? We don't currently keep the full source text around - but we certainly could do that if we wanted to. I'm not sure how much it helps - usually, you're going to end up on a line that's completely irrelevant if you get the wrong hba file (e.g. a comment or a line that's not even in the file at all due to size). Maybe we should just include the *name* of the HBA file in the error message? -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Posix Shared Mem patch
On Wed, Jun 27, 2012 at 9:44 AM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: On Wed, Jun 27, 2012 at 12:00 AM, Tom Lane t...@sss.pgh.pa.us wrote: Would Posix shmem help with that at all? Why did you choose not to use the Posix API, anyway? It seemed more complicated. If we use the POSIX API, we've got to have code to find a non-colliding name for the shm, and we've got to arrange to clean it up at process exit. Anonymous shm doesn't require a name and goes away automatically when it's no longer in use. I see. Those are pretty good reasons ... With respect to EXEC_BACKEND, I wasn't proposing to kill it, just to make it continue to use a full-sized sysv shm. Well, if the ultimate objective is to get out from under the SysV APIs entirely, we're not going to get there if we still have to have all that code for the EXEC_BACKEND case. Maybe it's time to decide that we don't need to support EXEC_BACKEND on Unix. I don't personally see a need to do anything that drastic at this point. Admittedly, I rarely compile with EXEC_BACKEND, but I don't think it's bad to have the option available. Adjusting shared memory limits isn't really a big problem for PostgreSQL developers; what we're trying to avoid is the need for PostgreSQL *users* to concern themselves with it. And surely anyone who is using EXEC_BACKEND on Unix is a developer, not a user. If and when we come up with a substitute for the nattch interlock, then this might be worth thinking a bit harder about. At that point, if we still want to support EXEC_BACKEND on Unix, then we'd need the EXEC_BACKEND case at least to use POSIX shm rather than anonymous shared mmap. Personally I think that would be not that hard and probably worth doing, but there doesn't seem to be any point in writing that code now, because for the simple case of just reducing the amount of shm that we allocate, an anonymous mapping seems better all around. We shouldn't overthink this. 
Our shared memory code has allocated a bunch of crufty hacks over the years to work around various platform-specific issues, but it's still not a lot of code, so I don't see any reason to worry unduly about making a surgical fix without having a master plan. Nothing we want to do down the road will require moving the earth. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Reporting hba lines
Magnus Hagander mag...@hagander.net writes: On Wed, Jun 27, 2012 at 4:14 PM, Tom Lane t...@sss.pgh.pa.us wrote: cases where people are modifying the wrong hba file. Can we show the source text of the hba line? We don't currently keep the full source text around - but we certainly could do that if we wanted to. If we're concerned about providing a message like this, I think it'd be well worthwhile. We presumably would only store the active lines not comment lines, so the extra memory space would be negligible in just about any real use-case. I'm not sure how much it helps - usually, you're going to end up on a line that's completely irrelevant if you get the wrong hba file (e.g. a comment or a line that's not even in the file at all due to size). Only if you count accurately, and aren't fooled into miscounting by the expectation that you must arrive on a non-comment line. I don't buy the the editor can do it for you argument either, at least not since noticing that recent versions of emacs don't count lines the way I do. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Posix Shared Mem patch
On Wed, Jun 27, 2012 at 9:52 AM, Stephen Frost sfr...@snowman.net wrote: What this all boils down to is- can you have a shm segment that goes away when no one is still attached to it, but actually give it a name and then detect if it already exists atomically on startup on Linux/Unixes? If so, perhaps we could use the same mechanism on both.. As I understand it, no. You can either have anonymous shared mappings, which go away when no longer in use but do not have a name. Or you can have POSIX or sysv shm, which have a name but do not automatically go away when no longer in use. There seems to be no method for setting up a segment that both has a name and goes away automatically. POSIX shm in particular tries to look like a file, whereas anonymous memory tries to look more like malloc (except that you can share the mapping with child processes). -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Regarding WAL Format Changes
On 27.06.2012 17:14, Amit Kapila wrote:

1. Function headers for the following functions still contain references to log, seg: a. InstallXLogFileSegment() b. RemoveOldXlogFiles() c. XLogFileCopy() d. XLogGetLastRemoved() e. UpdateLastRemovedPtr() f. RemoveOldXlogFiles()

Thanks, fixed.

2. @@ -2680,8 +2645,8 @@ InstallXLogFileSegment(uint32 *log, uint32 *seg, char *tmppath,
 	LWLockRelease(ControlFileLock);
 	ereport(LOG,
 			(errcode_for_file_access(),
-			 errmsg("could not link file \"%s\" to \"%s\" (initialization of log file %u, segment %u): %m",
-					tmppath, path, *log, *seg)));
+			 errmsg("could not link file \"%s\" to \"%s\" (initialization of log file): %m",
+					tmppath, path)));

If the changed error message could contain the log file and segment number, it would be clearer. That should be easily deducible from the segment number.

That seems redundant. The target file name is calculated from the segment number, and we're now using the file name instead of log+seg in other messages too.

3. -RemoveOldXlogFiles(uint32 log, uint32 seg, XLogRecPtr endptr)
+RemoveOldXlogFiles(XLogSegNo segno, XLogRecPtr endptr)
...
@@ -4016,8 +3953,9 @@ retry:
 	if (!(((XLogPageHeader) readBuf)->xlp_info & XLP_FIRST_IS_CONTRECORD))
 	{
 		ereport(emode_for_corrupt_record(emode, *RecPtr),
-				(errmsg("there is no contrecord flag in log file %u, segment %u, offset %u",
-						readId, readSeg, readOff)));
+				(errmsg("there is no contrecord flag in log segment %s, offset %u",
+						XLogFileNameP(curFileTLI, readSegNo),
+						readOff)));
 		goto next_record_is_invalid;
 	}
 	pageHeaderSize = XLogPageHeaderSize((XLogPageHeader) readBuf);
@@ -4025,10 +3963,13 @@ retry:
 	if (contrecord->xl_rem_len == 0 || total_len != (contrecord->xl_rem_len + gotlen))
 	{
+		char		fname[MAXFNAMELEN];
+
+		XLogFileName(fname, curFileTLI, readSegNo);
 		ereport(emode_for_corrupt_record(emode, *RecPtr),
-				(errmsg("invalid contrecord length %u in log file %u, segment %u, offset %u",
+				(errmsg("invalid contrecord length %u in log segment %s, offset %u",
 						contrecord->xl_rem_len,
-						readId, readSeg, readOff)));
+						XLogFileNameP(curFileTLI, readSegNo),
+						readOff)));
 		goto next_record_is_invalid;
 	}

For the above two changed error messages, 'log segment' is used for the filename. However, other similar changes use 'log file' for the filename. Is there any particular reason for it?

Not really. There are several messages that use "log file %s", and also several places that use "log segment %s". Should we make it consistent and use either "log segment" or "log file" everywhere?

4. @@ -533,33 +533,17 @@ pg_xlog_location_diff(PG_FUNCTION_ARGS)
-	/*
-	 * Sanity check
-	 */
-	if (loc1.xrecoff > XLogFileSize)
-		ereport(ERROR,
-				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
-				 errmsg("xrecoff \"%X\" is out of valid range, 0..%X", loc1.xrecoff, XLogFileSize)));
-	if (loc2.xrecoff > XLogFileSize)
-		ereport(ERROR,
-				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
-				 errmsg("xrecoff \"%X\" is out of valid range, 0..%X", loc2.xrecoff, XLogFileSize)));
+	bytes1 = (((uint64) loc1.xlogid) << 32L) + loc1.xrecoff;
+	bytes2 = (((uint64) loc2.xlogid) << 32L) + loc2.xrecoff;

Is there no chance that it can be out of valid range after the new changes, just a doubt?

No. Not in the sense it used to be, anyway; the XLogFileSize check is no longer relevant. Perhaps we should check for InvalidXLogRecPtr, or that the pointer doesn't point e.g. into the middle of a page header. But then again, this calculation works fine with both of those cases, so I see no reason to make it stricter.

5. --- a/src/backend/replication/walreceiver.c
+++ b/src/backend/replication/walreceiver.c
@@ -69,11 +69,12 @@
[HACKERS] Re: [COMMITTERS] pgsql: Move WAL continuation record information to WAL page header.
On 27.06.2012 17:53, Andres Freund wrote: I had noticed one thing when reviewing the patches before:

@@ -717,6 +719,15 @@ XLogInsert(RmgrId rmid, uint8 info, XLogRecData *rdata)
 	bool		doPageWrites;
 	bool		isLogSwitch = (rmid == RM_XLOG_ID && info == XLOG_SWITCH);
 	uint8		info_orig = info;
+	static XLogRecord *rechdr;
+
+	if (rechdr == NULL)
+	{
+		rechdr = malloc(SizeOfXLogRecord);
+		if (rechdr == NULL)
+			elog(ERROR, "out of memory");
+		MemSet(rechdr, 0, SizeOfXLogRecord);
+	}

 	/* cross-check on whether we should be here or not */
 	if (!XLogInsertAllowed())

Why do you allocate this dynamically? XLogRecord is 32 bytes, there doesn't seem to be much point in this?

On 64-bit architectures, the struct needs padding at the end to make the size MAXALIGNed to 32 bytes; a simple "XLogRecord rechdr" local variable would not include that. You could do something like:

union
{
	XLogRecord	rechdr;
	char		bytes[SizeOfXLogRecord];
}

but that's quite ugly too. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] warning handling in Perl scripts
On Jun 27, 2012, at 4:07 PM, Andrew Dunstan wrote: I realise I'm late to this party, but I'm with Robert. The root cause of the errors should be fixed. That's not to say that making warnings fatal might not also be a good idea as a general defense mechanism. ISTM that if they are fatal, one will be more motivated to fix them. I think that was the point. David
Re: [HACKERS] foreign key locks
Excerpts from Kevin Grittner's message of mié jun 27 08:40:58 -0400 2012: Alvaro Herrera wrote: here's fklocks v14, which also merges to new master as there were several other conflicts. It passes make installcheck-world now. Recent commits broke it again, so here's a rebased v15. (No changes other than attempting to merge your work with the pg_controldata and pg_resetxlog changes and wrapping that FIXME in a comment.) Oooh, so sorry -- I was supposed to have git-stashed that before producing the patch. This change cannot have caused the pg_dumpall problem below though, because it only touched the multixact cache code. Using that patch, pg_upgrade completes, but pg_dumpall fails: Hmm, something changed enough that requires more than just a code merge. I'll look into it. Thanks for the merge. -- Álvaro Herrera alvhe...@commandprompt.com The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [HACKERS] Regarding WAL Format Changes
Excerpts from Heikki Linnakangas's message of mié jun 27 10:56:13 -0400 2012: On 27.06.2012 17:14, Amit Kapila wrote: For the above 2 changed error messages, 'log segment' is used for the filename. However, all similar changes have 'log file' for the filename. There are some places where 'log segment' is used and others where it is 'log file'. So is there any particular reason for it? Not really. There are several messages that use log file %s, and also several places that use log segment %s. Should we make it consistent and use either log segment or log file everywhere? I think it would be better to use log segment for WAL segments. That way we don't cause confusion with the regular text/csv log output files. Heck, maybe even WAL segment instead of log. As a translator, I can't have a single, clear explanation of what log file is because there are multiple meanings. It would be better not to depend on context. -- Álvaro Herrera alvhe...@commandprompt.com The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [HACKERS] Regarding WAL Format Changes
On Wed, Jun 27, 2012 at 11:56 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: On 27.06.2012 17:14, Amit Kapila wrote: 1. Function header for following functions still contains reference to log, seg a. InstallXLogFileSegment() b. RemoveOldXlogFiles() c. XLogFileCopy() d. XLogGetLastRemoved() e. UpdateLastRemovedPtr() f. RemoveOldXlogFiles() Thanks, fixed. There is still a reference to log/seg in the comment of InstallXLogFileSegment(). The attached patch should be applied? Regards, -- Fujii Masao logseg2segno_v1.patch Description: Binary data
Re: [HACKERS] [v9.3] Row-Level Security
On Jun 27, 2012, at 15:07, Kohei KaiGai wrote: Probably, PlannedStmt->invalItems allows to handle invalidation of plan-cache without big code changes. I'll try to put a flag of user-id to track the query plan with RLS assumed, or InvalidOid if no RLS was applied in this plan. I'll investigate the implementation for more details. Do we have any other scenario that runs a query plan under a different user privilege than at planner stage? Hm, what happens if a SECURITY DEFINER function returns a refcursor? Actually, I wonder how we handle that today. If the executor is responsible for permission checks, wouldn't we apply the calling function's privilege level in that case, at least if the cursor isn't fetched from in the SECURITY DEFINER function? If I find some time, I'll check... best regards, Florian Pflug
Re: [HACKERS] [COMMITTERS] pgsql: Move WAL continuation record information to WAL page header.
On Wednesday, June 27, 2012 05:09:46 PM Heikki Linnakangas wrote: On 27.06.2012 17:53, Andres Freund wrote: I had noticed one thing when reviewing the patches before:

@@ -717,6 +719,15 @@ XLogInsert(RmgrId rmid, uint8 info, XLogRecData *rdata)
 	bool		doPageWrites;
 	bool		isLogSwitch = (rmid == RM_XLOG_ID && info == XLOG_SWITCH);
 	uint8		info_orig = info;
+	static XLogRecord *rechdr;
+
+	if (rechdr == NULL)
+	{
+		rechdr = malloc(SizeOfXLogRecord);
+		if (rechdr == NULL)
+			elog(ERROR, "out of memory");
+		MemSet(rechdr, 0, SizeOfXLogRecord);
+	}

 	/* cross-check on whether we should be here or not */
 	if (!XLogInsertAllowed())

Why do you allocate this dynamically? XLogRecord is 32 bytes, there doesn't seem to be much point in this? On 64-bit architectures, the struct needs padding at the end to make the size MAXALIGNed to 32 bytes; a simple "XLogRecord rechdr" local variable would not include that. You could do something like:

union
{
	XLogRecord	rechdr;
	char		bytes[SizeOfXLogRecord];
}

but that's quite ugly too. Ah, yes. Makes sense. Btw, the XLogRecord struct is directly laid out with 32 bytes here (64-bit, Linux) because xl_prev needs to be 8-byte aligned... Andres -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Regarding WAL Format Changes
On 27.06.2012 18:55, Fujii Masao wrote: On Wed, Jun 27, 2012 at 11:56 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: On 27.06.2012 17:14, Amit Kapila wrote: 1. Function header for following functions still contains reference to log, seg a. InstallXLogFileSegment() b. RemoveOldXlogFiles() c. XLogFileCopy() d. XLogGetLastRemoved() e. UpdateLastRemovedPtr() f. RemoveOldXlogFiles() Thanks, fixed. There is still a reference to log/seg in the comment of InstallXLogFileSegment(). The attached patch should be applied? Thanks, applied. Sorry for being so sloppy with this.. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] Regarding WAL Format Changes
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes: So is there any particular reason for it? Not really. There are several messages that use log file %s, and also several places that use log segment %s Should we make it consistent and use either log segment or log file everywhere? +1 for uniformity. I think I'd vote for using file and eliminating the segment terminology altogether, but the other direction would be okay too, and might require fewer changes. IIRC, in the original coding segment meant 16MB worth of WAL while file was sometimes used to denote 4GB worth (ie, the boundaries where we had to increment the high half of the LSN struct). Now that 4GB boundaries are not special, there's no reason to retain the file concept or terminology. regards, tom lane
Re: [HACKERS] [ADMIN] pg_basebackup blocking all queries with horrible performance
On Thu, Jun 21, 2012 at 3:18 AM, Robert Haas robertmh...@gmail.com wrote: On Wed, Jun 20, 2012 at 7:18 AM, Magnus Hagander mag...@hagander.net wrote: You agreed to add something like NOSYNC option into START_REPLICATION command? I'm on the fence. I was hoping somebody else would chime in with an opinion as well. +1 Nobody else with any opinion on this? :( I don't think we really need a NOSYNC flag at this point. Just not setting the flush location in clients that make a point of flushing in a timely fashion seems fine. Okay, I'm in the minority, so I'm writing the patch that way. WIP patch attached. In the patch, pg_basebackup background process and pg_receivexlog always return invalid location as flush one, and will never become sync standby even if their name is in synchronous_standby_names. The timing of their sending the reply depends on the standby_message_timeout specified in -s option. So the write position may lag behind the true position. pg_receivexlog accepts new option -S (better option character?). If this option is specified, pg_receivexlog returns true flush position, and can become sync standby. It sends back the reply to the master each time the write position changes or the timeout passes. If synchronous_commit is set to remote_write, synchronous replication to pg_receivexlog would work well. The patch needs more documentation. But I think that it's worth reviewing the code in advance, so I attached the WIP patch. Comments? Objections? The patch is based on current HEAD, i.e., 9.3dev. If the patch is applied, we need to write the backport version of the patch for 9.2. Regards, -- Fujii Masao pg_receivexlog_syncstandby_v1.patch Description: Binary data
Re: [HACKERS] Regarding WAL Format Changes
On Thu, Jun 28, 2012 at 1:13 AM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: On 27.06.2012 18:55, Fujii Masao wrote: On Wed, Jun 27, 2012 at 11:56 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: On 27.06.2012 17:14, Amit Kapila wrote: 1. Function header for following functions still contains reference to log, seg a. InstallXLogFileSegment() b. RemoveOldXlogFiles() c. XLogFileCopy() d. XLogGetLastRemoved() e. UpdateLastRemovedPtr() f. RemoveOldXlogFiles() Thanks, fixed. There is still a reference to log/seg in the comment of InstallXLogFileSegment(). The attached patch should be applied? Thanks, applied. Sorry for being so sloppy with this.. No problem. The WAL format change you did is really nice! Regards, -- Fujii Masao
Re: [HACKERS] [ADMIN] pg_basebackup blocking all queries with horrible performance
On 27 June 2012 18:24, Fujii Masao masao.fu...@gmail.com wrote: will never become sync standby even if their name is in synchronous_standby_names. I don't understand why you'd want that. What is wrong with removing the name from synchronous_standby_names if you don't like it? -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Posix Shared Mem patch
On Jun 27, 2012, at 7:34 AM, Robert Haas wrote: On Wed, Jun 27, 2012 at 12:00 AM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: So, here's a patch. Instead of using POSIX shmem, I just took the expedient of using mmap() to map a block of MAP_SHARED|MAP_ANONYMOUS memory. The sysv shm is still allocated, but it's just a copy of PGShmemHeader; the real shared memory is the anonymous block. This won't work if EXEC_BACKEND is defined so it just falls back on straight sysv shm in that case. Um. I hadn't thought about the EXEC_BACKEND interaction, but that seems like a bit of a showstopper. I would not like to give up the ability to debug EXEC_BACKEND mode on Unixen. Would Posix shmem help with that at all? Why did you choose not to use the Posix API, anyway? It seemed more complicated. If we use the POSIX API, we've got to have code to find a non-colliding name for the shm, and we've got to arrange to clean it up at process exit. Anonymous shm doesn't require a name and goes away automatically when it's no longer in use. With respect to EXEC_BACKEND, I wasn't proposing to kill it, just to make it continue to use a full-sized sysv shm. I solved this by unlinking the posix shared memory segment immediately after creation. The file descriptor to the shared memory is inherited, so, by definition, only the postmaster children can access the memory. This ensures that shared memory cleanup is immediate after the postmaster and all children close, as well. The fcntl locking is not required to protect the posix shared memory; it can protect itself. Cheers, M
Re: [HACKERS] Patch: add conversion from pg_wchar to multibyte
On Thu, May 24, 2012 at 12:04 AM, Alexander Korotkov aekorot...@gmail.com wrote: Thanks. I rewrote inverse conversion from pg_wchar to mule. New version of patch is attached. Review: It looks to me like pg_wchar2utf_with_len will not work, because unicode_to_utf8 returns its second argument unmodified - not, as your code seems to assume, the byte following what was already written. MULE also looks problematic. The code that you've written isn't symmetric with the opposite conversion, unlike what you did in all other cases, and I don't understand why. I'm also somewhat baffled by the reverse conversion: it treats a multi-byte sequence beginning with a byte for which IS_LCPRV1(x) returns true as invalid if there are less than 3 bytes available, but it only reads two; similarly, for IS_LCPRV2(x), it demands 4 bytes but converts only 3. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Uh, I change my mind about commit_delay + commit_siblings (sort of)
On Sat, Jun 2, 2012 at 10:44 AM, Kevin Grittner kevin.gritt...@wicourts.gov wrote: Tom Lane wrote: Simon Riggs writes: On 31 May 2012 15:00, Tom Lane wrote: If we want to finish the beta cycle in a reasonable time period and get back to actual development, we have to refrain from adding more possibly-destabilizing development work to 9.2. And that is what this is. In what way is it possibly destabilising? I'm prepared to believe that it only affects performance, but it could be destabilizing to that. It needs proper review and testing, and the next CF is the right environment for that to happen. +1 This is not a bug fix or even a fix for a performance regression. The train has left the station; the next one will be along shortly. And here it is. There are a couple of outstanding issues here upon which it would be helpful to get input from a few more people: 1. Are there any call sites from which this should be disabled, either because the optimization won't help, or because the caller is holding a lock that we don't want them sitting on for a moment longer than necessary? I think the worst case is that we're doing something like splitting the root page of a btree, and now nobody can walk the btree until the flush finishes, and here we are doing an unnecessary sleep. I am not sure where I can construct a workload where this is a real as opposed to a theoretical problem, though. So maybe we should just not worry about it. It suffices for this to be an improvement over the status quo; it doesn't have to be perfect. Thoughts? 2. Should we rename the GUCs, since this patch will cause them to control WAL flush in general, as opposed to commit specifically? Peter Geoghegan and Simon were arguing that we should retitle it to group_commit_delay rather than just commit_delay, but that doesn't seem to be much of an improvement in describing what the new behavior will actually be, and I am doubtful that it is worth creating a naming incompatibility with previous releases for a cosmetic change. I suggested wal_flush_delay, but there's no consensus on that. Opinions? Also, I think I see a straightforward, if minor, improvement. Right now, the patch has this:

/*
 * Sleep before flush! By adding a delay here, we may give further
 * backends the opportunity to join the backlog of group commit
 * followers; this can significantly improve transaction throughput, at
 * the risk of increasing transaction latency.
 *
 * We do not sleep if enableFsync is not turned on, nor if there are
 * fewer than CommitSiblings other backends with active transactions.
 */
if (CommitDelay > 0 && enableFsync &&
	MinimumActiveBackends(CommitSiblings))
	pg_usleep(CommitDelay);

/* Got the lock */
LogwrtResult = XLogCtl->LogwrtResult;
if (!XLByteLE(record, LogwrtResult.Flush))
{
	/* try to write/flush later additions to XLOG as well */
	if (LWLockConditionalAcquire(WALInsertLock, LW_EXCLUSIVE))

The XLByteLE test four lines from the bottom should happen before we consider whether to sleep, because it's possible that we'll discover that our flush request has already been satisfied and exit without doing anything, in which case the sleep is a waste. The attached version adjusts the logic accordingly. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company move_delay_2012_06_27.patch Description: Binary data
Re: [HACKERS] Uh, I change my mind about commit_delay + commit_siblings (sort of)
On 27 June 2012 23:01, Robert Haas robertmh...@gmail.com wrote: 1. Are there any call sites from which this should be disabled, either because the optimization won't help, or because the caller is holding a lock that we don't want them sitting on for a moment longer than necessary? I think the worst case is that we're doing something like splitting the root page of a btree, and now nobody can walk the btree until the flush finishes, and here we are doing an unnecessary sleep. I am not sure where I can construct a workload where this is a real as opposed to a theoretical problem, though. So maybe we should just not worry about it. It suffices for this to be an improvement over the status quo; it doesn't have to be perfect. Thoughts? I also find it hard to believe this is a concern. As I've said before though, it hasn't been adequately explained just how you're supposed to skip the delay if you're in this situation. It's highly unlikely that you'll be the one doing the delaying anyway. Skipping the delay iff you're the leader when this might possibly be a problem will only *conceivably* be effective in a tiny minority of cases, so this seems to make little sense to me. Look at the original benchmark - there are latency numbers there too. The patch actually performs slightly better than baseline in terms of latency (worst and average cases) as well as throughput. To get back to bus analogies again, if people start treating buses like taxis, you get slightly better minimum latency figures (not included in any benchmark performed), because one or two people immediately hijacked a bus and left on an hour-long journey rather than waiting another 15 minutes for more passengers to come along. That doesn't seem like it should be something to concern ourselves with.
More to the point, I think that there are laws-of-physics reasons why these backends that might be particularly adversely affected by a delay (due to some eventuality like the one you described) cannot *really* avoid a delay. Of course this isn't an absolute, and certainly won't apply when there are relatively few clients, when the delay really is useless, but then we trust the dba to set commit_siblings (and indeed commit_delay) correctly, and it's hardly that big of a deal, since it's typically only a fraction of a single fsync() longer at worst - certainly not a total wait that is longer than the total wait the backend could reasonably expect to have. Incidentally, I'm guessing someone is going to come up with a rule-of-thumb for setting commit_delay that sounds something like "half the latency of your wal_sync_method as reported by pg_test_fsync()". 2. Should we rename the GUCs, since this patch will cause them to control WAL flush in general, as opposed to commit specifically? Peter Geoghegan and Simon were arguing that we should retitle it to group_commit_delay rather than just commit_delay, but that doesn't seem to be much of an improvement in describing what the new behavior will actually be, and I am doubtful that it is worth creating a naming incompatibility with previous releases for a cosmetic change. I suggested wal_flush_delay, but there's no consensus on that. Opinions? My main problem with the existing name is that every single article on commit_delay since it was introduced over ten years ago has been on balance very negative. Greg Smith's book, the docs, my talk at PgCon, and any other material in existence about commit_delay that I'm aware of say the same thing. commit_delay in master is essentially the same now as it was in 2000; it's just that commit_siblings is a bit smarter, in that it is now based on the number of active transactions specifically.
Now, based on the fact that the doc changes concerning the practical details of setting commit_delay survived your editorialisations, I take it that you accept my view that this is going to fundamentally alter the advice that surrounds commit_delay, from "just don't touch it" to something like "this is something that you should probably consider; it may be very compelling in certain situations". Let's not shoot ourselves in the foot by keeping the old, disreputable name. You felt that wal_flush_delay more accurately reflected what the new setting actually does. You were right, and I'd be perfectly happy to go along with that. The XLByteLE test four lines from the bottom should happen before we consider whether to sleep, because it's possible that we'll discover that our flush request has already been satisfied and exit without doing anything, in which case the sleep is a waste. The attached version adjusts the logic accordingly. Seems like a minor point, unlikely to make any appreciable difference, but there's no reason to think that it will have any negative effect compared to what I've proposed either, so that's fine with me. -- Peter Geoghegan http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training and Services
[HACKERS] pg_signal_backend() asymmetry
Hi all, I have one nitpick related to the recent changes for pg_cancel_backend() and pg_terminate_backend(). If you use these functions as an unprivileged user, and try to signal a nonexistent PID, you get: ERROR: must be superuser or have the same role to cancel queries running in other server processes Whereas if you do the same thing as a superuser, you get: WARNING: PID 123 is not a PostgreSQL server process pg_cancel_backend --- f (1 row) The comment above the WARNING generated for the latter case says:

/*
 * This is just a warning so a loop-through-resultset will not abort
 * if one backend terminated on its own during the run
 */

which is nice, but wouldn't unprivileged users want the same behavior? Not to mention, the ERROR above is misleading, as it claims the nonexistent PID really belongs to another user. A simple fix is attached. The existing code called both BackendPidGetProc(pid) and IsBackendPid(pid) while checking a non-superuser's permissions, which really means two separate calls to BackendPidGetProc(pid). I simplified that down to a single BackendPidGetProc(pid) call. Josh pg_signal_backend_asymmetry.diff Description: Binary data
Re: [HACKERS] WIP patch for Todo Item : Provide fallback_application_name in contrib/pgbench, oid2name, and dblink
On Thu, Jun 14, 2012 at 10:55 PM, Robert Haas robertmh...@gmail.com wrote: On Wed, Jun 13, 2012 at 12:07 AM, Amit Kapila amit.kap...@huawei.com wrote: To achieve the same in dblink, we need to parse the passed connection string and check if it contains fallback_application_name; if yes then it's okay, otherwise we need to append fallback_application_name to the connection string. That seems undesirable. I don't think this is important enough to be worth reparsing the connection string for. I'd just forget about doing it for dblink if there's no cheaper way. Indeed, reparsing the connection string is not cheap, but dblink already does it in dblink_connstr_check to check the password requirement when the local user is not a superuser. So Amit's idea doesn't seem unreasonable to me, if we can avoid an extra PQconninfoParse call. Just an idea, but how about pushing fallback_application_name handling into dblink_connstr_check? We reparse the connection string unless the local user is a superuser, so it would not be serious overhead in most cases. Although it might require changes in the DBLINK_GET_CONN macro... Regards, -- Shigeru Hanada
[HACKERS] We probably need autovacuum_max_wraparound_workers
Folks, Yeah, I can't believe I'm calling for *yet another* configuration variable either. Suggested workaround fixes very welcome. The basic issue is that autovacuum_max_workers is set by most users based on autovac's fairly lightweight action most of the time: analyze, vacuuming pages not on the visibility list, etc. However, when XID wraparound kicks in, then autovac starts reading entire tables from disk ... and those tables may be very large. This becomes a downtime issue if you've set autovacuum_max_workers to, say, 5 and several large tables hit the wraparound threshold at the same time (as they tend to do if you're using the default settings). Then you have 5 autovacuum processes concurrently doing heavy IO and getting in each others' way. I've seen this at two sites now, and my conclusion is that a single autovacuum_max_workers isn't sufficient to cover the case of wraparound vacuum. Nor can we just single-thread the wraparound vacuum (i.e. just one worker) since that would hurt users who have thousands of small tables. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Re: [HACKERS] WIP patch for Todo Item : Provide fallback_application_name in contrib/pgbench, oid2name, and dblink
On Wed, Jun 27, 2012 at 9:57 PM, Shigeru HANADA shigeru.han...@gmail.com wrote: On Thu, Jun 14, 2012 at 10:55 PM, Robert Haas robertmh...@gmail.com wrote: On Wed, Jun 13, 2012 at 12:07 AM, Amit Kapila amit.kap...@huawei.com wrote: To achieve the same in dblink, we need to parse the passed connection string and check if it contains fallback_application_name, if yes then its okay, otherwise we need to append fallback_application_name in connection string. That seems undesirable. I don't think this is important enough to be worth reparsing the connection string for. I'd just forget about doing it for dblink if there's no cheaper way. Indeed reparsing connection string is not cheap, but dblink does it for checking password requirement for non-in dblink_connstr_check when the local user was not a superuser. So Amit's idea doesn't seem unreasonable to me, if we can avoid extra PQconninfoParse call. Just an idea, but how about pushing fallback_application_name handling into dblink_connstr_check? We reparse connection string unless local user was a superuser, so it would not be serious overhead in most cases. Although it might require changes in DBLINK_GET_CONN macro... *shrug* If it can be done without costing anything meaningful, I don't object, but I would humbly suggest that this is not hugely important one way or the other. application_name is primarily a monitoring convenience, so it's not hugely important to have it set anyway, and fallback_application_name is only going to apply in cases where the user doesn't care enough to bother setting application_name. Let's not knock ourselves out to solve a problem that may not be that important to begin with. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] We probably need autovacuum_max_wraparound_workers
Josh Berkus j...@agliodbs.com writes: Yeah, I can't believe I'm calling for *yet another* configuration variable either. Suggested workaround fixes very welcome. The basic issue is that autovacuum_max_workers is set by most users based on autovac's fairly lightweight action most of the time: analyze, vacuuming pages not on the visibility list, etc. However, when XID wraparound kicks in, then autovac starts reading entire tables from disk ... and those tables may be very large. It doesn't seem to me that this has much of anything to do with wraparound; that just happens to be one possible trigger condition for a lot of vacuuming activity to be happening. (Others are bulk data loads or bulk updates, for instance.) Nor am I convinced that changing the max_workers setting is an appropriate fix anyway. I think what you've really got here is inappropriate autovacuum cost delay settings, and/or the logic in autovacuum.c to try to divvy up the available I/O capacity by tweaking workers' delay settings isn't working very well. It's hard to propose improvements without a lot more detail than you've provided, though. regards, tom lane
Re: [HACKERS] We probably need autovacuum_max_wraparound_workers
On Jun 27, 2012, at 22:00, Josh Berkus j...@agliodbs.com wrote: Folks, Yeah, I can't believe I'm calling for *yet another* configuration variable either. Suggested workaround fixes very welcome. The basic issue is that autovacuum_max_workers is set by most users based on autovac's fairly lightweight action most of the time: analyze, vacuuming pages not on the visibility list, etc. However, when XID wraparound kicks in, then autovac starts reading entire tables from disk ... and those tables may be very large. This becomes a downtime issue if you've set autovacuum_max_workers to, say, 5 and several large tables hit the wraparound threshold at the same time (as they tend to do if you're using the default settings). Then you have 5 autovacuum processes concurrently doing heavy IO and getting in each others' way. I've seen this at two sites now, and my conclusion is that a single autovacuum_max_workers isn't sufficient if to cover the case of wraparound vacuum. Nor can we just single-thread the wraparound vacuum (i.e. just one worker) since that would hurt users who have thousands of small tables. Would there be enough benefit to setting up separate small/medium?/large thresholds with user-changeable default table size boundaries so that you can configure 6 workers where 3 handle the small tables, 2 handle the medium tables, and 1 handles the large tables. Or alternatively a small worker consumes 1, medium 2, and large 3 'units' from whatever size pool has been defined. So you could have 6 small tables or two large tables in-progress simultaneously. David J.
Re: [HACKERS] We probably need autovacuum_max_wraparound_workers
I think what you've really got here is inappropriate autovacuum cost delay settings, and/or the logic in autovacuum.c to try to divvy up the available I/O capacity by tweaking workers' delay settings isn't working very well. It's hard to propose improvements without a lot more detail than you've provided, though. Wait, we *have* that logic? If so, that's the problem ... it's not working very well. What detail do you want? -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Re: [HACKERS] We probably need autovacuum_max_wraparound_workers
Josh, all, * Josh Berkus (j...@agliodbs.com) wrote: Yeah, I can't believe I'm calling for *yet another* configuration variable either. Suggested workaround fixes very welcome. As I suggested on IRC, my thought would be to have a goal-based system for autovacuum which is similar to our goal-based commit system. We don't need autovacuum sucking up all the I/O in the box, nor should we ask the users to manage that. Instead, let's decide when the autovacuum on a given table needs to finish and then plan to keep on working at a rate that'll allow us to get done well in advance of that deadline. Just my 2c. Thanks, Stephen
Re: [HACKERS] Server crash while trying to fetch EXPLAIN query results with a cursor
Rushabh Lathia rushabh.lat...@gmail.com writes: The above testcase ends up with a server crash. The crash is due to the following assert in ScanQueryForLocks(): Assert(parsetree->commandType != CMD_UTILITY); Meh, yeah, more fallout from the CREATE TABLE AS representation change. PFA patch for the same. Applied with some editorialization. Thanks! regards, tom lane
Re: [HACKERS] We probably need autovacuum_max_wraparound_workers
Josh Berkus j...@agliodbs.com writes: I think what you've really got here is inappropriate autovacuum cost delay settings, and/or the logic in autovacuum.c to try to divvy up the available I/O capacity by tweaking workers' delay settings isn't working very well. It's hard to propose improvements without a lot more detail than you've provided, though. Wait, we *have* that logic? If so, that's the problem ... it's not working very well. What detail do you want? What's it doing? What do you think it should do instead? regards, tom lane
Re: [HACKERS] We probably need autovacuum_max_wraparound_workers
Stephen Frost sfr...@snowman.net writes: * Josh Berkus (j...@agliodbs.com) wrote: Yeah, I can't believe I'm calling for *yet another* configuration variable either. Suggested workaround fixes very welcome. As I suggested on IRC, my thought would be to have a goal-based system for autovacuum which is similar to our goal-based commit system. We don't need autovacuum sucking up all the I/O in the box, nor should we ask the users to manage that. Instead, let's decide when the autovacuum on a given table needs to finish and then plan to keep on working at a rate that'll allow us to get done well in advance of that deadline. If we allow individual vacuum operations to stretch out just because they don't need to be completed right away, we will need more concurrent vacuum workers (so that we can respond to vacuum requirements for other tables). So I submit that this would only move the problem around: the number of active workers would increase to the point where things are just as painful, plus or minus a bit. The intent of the autovacuum cost delay features is to ensure that autovacuum doesn't suck an untenable fraction of the machine's I/O capacity, even when it's running flat out. So I think Josh's complaint indicates that we have a problem with cost-delay tuning; hard to tell what exactly without more info. It might only be that the defaults are bad for these particular users, or it could be more involved. regards, tom lane
Re: [HACKERS] Regarding WAL Format Changes
From: Heikki Linnakangas [mailto:heikki.linnakan...@enterprisedb.com] Sent: Wednesday, June 27, 2012 8:26 PM On 27.06.2012 17:14, Amit Kapila wrote: 2. @@ -2680,8 +2645,8 @@ InstallXLogFileSegment(uint32 *log, uint32 *seg, char *tmppath, LWLockRelease(ControlFileLock); ereport(LOG, (errcode_for_file_access(), - errmsg("could not link file \"%s\" to \"%s\" (initialization of log file %u, segment %u): %m", -tmppath, path, *log, *seg))); + errmsg("could not link file \"%s\" to \"%s\" (initialization of log file): %m", +tmppath, path))); If the changed error message can contain the log file and segment number, it would be more clear. That should be easily deducible from the segment number. That seems redundant. The target file name is calculated from the segment number, and we're now using the file name instead of log+seg in other messages too. errmsg("could not link file \"%s\" to \"%s\" (initialization of log file): %m", +tmppath, path))); In this, if we try to get the meaning of the second part of the message (initialization of log file), it was much better previously, as this message refers to 2 files and previously it was clear initialization of which log file failed. So we can mention the file name in the second part of the message (initialization of log file) as well. 3. -RemoveOldXlogFiles(uint32 log, uint32 seg, XLogRecPtr endptr) For the above 2 changed error messages, 'log segment' is used for the filename. However, all similar changes have 'log file' for the filename. There are some places where 'log segment' is used and other places it is 'log file'. So is there any particular reason for it? Not really. There are several messages that use "log file %s", and also several places that use "log segment %s". Should we make it consistent and use either log segment or log file everywhere? 'file' seems to be the better option, as some users may not even be aware of segments; they would be using default values of segments and they can relate to 'file' easily.
Also, using 'WAL' instead of 'log', as suggested by Alvaro, is good if others also think the same. With Regards, Amit Kapila.
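For readers following along: the file name Heikki refers to is derived from the timeline and segment numbers alone, roughly as below. This is a sketch modeled on the backend's pre-9.3 XLogFileName macro (xlog_internal.h); the function name here is illustrative:

```c
#include <stdio.h>

/* Sketch of how a WAL segment file name is computed from the
 * timeline ID, log ID, and segment number: three 8-hex-digit
 * fields concatenated, as in the backend's XLogFileName macro. */
static void wal_file_name(char *buf, size_t buflen,
                          unsigned tli, unsigned log, unsigned seg)
{
    snprintf(buf, buflen, "%08X%08X%08X", tli, log, seg);
}
```

So timeline 1, log 2, segment 3 yields a 24-character name, which is why the message can print the file name instead of separate log/seg numbers.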
Re: [HACKERS] We probably need autovacuum_max_wraparound_workers
On Wed, Jun 27, 2012 at 11:38 PM, Tom Lane t...@sss.pgh.pa.us wrote: Stephen Frost sfr...@snowman.net writes: * Josh Berkus (j...@agliodbs.com) wrote: Yeah, I can't believe I'm calling for *yet another* configuration variable either. Suggested workaround fixes very welcome. As I suggested on IRC, my thought would be to have a goal-based system for autovacuum which is similar to our goal-based commit system. We don't need autovacuum sucking up all the I/O in the box, nor should we ask the users to manage that. Instead, let's decide when the autovacuum on a given table needs to finish and then plan to keep on working at a rate that'll allow us to get done well in advance of that deadline. If we allow individual vacuum operations to stretch out just because they don't need to be completed right away, we will need more concurrent vacuum workers (so that we can respond to vacuum requirements for other tables). So I submit that this would only move the problem around: the number of active workers would increase to the point where things are just as painful, plus or minus a bit. The intent of the autovacuum cost delay features is to ensure that autovacuum doesn't suck an untenable fraction of the machine's I/O capacity, even when it's running flat out. So I think Josh's complaint indicates that we have a problem with cost-delay tuning; hard to tell what exactly without more info. It might only be that the defaults are bad for these particular users, or it could be more involved. I've certainly come across many reports of the cost delay settings being difficult to tune, both on pgsql-hackers/performance and in various private EnterpriseDB correspondence. I think Stephen's got it exactly right: the system needs to figure out the rate at which vacuum needs to happen, not rely on the user to provide that information. 
For checkpoints, we estimate the percentage of the checkpoint that ought to be completed and the percentage that actually is completed; if the latter is less than the former, we speed things up until we're back on track. For autovacuum, the trick is to speed things up when the rate at which tables are coming due for autovacuum exceeds the rate at which we are vacuuming them; or, when we anticipate that a whole bunch of wraparound vacuums are going to come due simultaneously, to start doing them sooner so that they are more spread out. For example, suppose that 26 tables, each of which is 4GB in size, are going to simultaneously come due for an anti-wraparound vacuum in 26 hours. For the sake of simplicity suppose that each will take 1 hour to vacuum. What we currently do is wait for 26 hours and then start vacuuming them all at top speed, thrashing the I/O system. What we ought to do is start vacuuming them much sooner and do them consecutively. Of course, the trick is to design a mechanism that does something intelligent if we think we're on track and then all of a sudden the rate of XID consumption changes dramatically, and now we've got to vacuum faster or with more workers. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
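The checkpoint-style pacing described above could be sketched roughly like this. All names and the exact pacing rule are hypothetical, not from autovacuum.c:

```c
/* Illustrative pacing rule in the spirit of the checkpoint code:
 * if the fraction of pages vacuumed lags the fraction of the
 * deadline that has elapsed, shrink the cost delay to catch up.
 * Everything here is a hypothetical sketch. */
static int paced_cost_delay_ms(double pages_done, double pages_total,
                               double elapsed_s, double deadline_s,
                               int base_delay_ms)
{
    double work_frac = pages_done / pages_total;
    double time_frac = elapsed_s / deadline_s;

    if (work_frac >= time_frac)
        return base_delay_ms;              /* on schedule: keep throttling */

    /* behind schedule: reduce the delay proportionally, toward zero */
    double ratio = work_frac / time_frac;  /* < 1.0 when behind */
    return (int) (base_delay_ms * ratio);
}
```

A worker that has done half the work at half the deadline keeps the base 20 ms nap; one that has done only a quarter of the work by then naps 10 ms, doubling its throughput until it is back on track.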
Re: [HACKERS] We probably need autovacuum_max_wraparound_workers
Robert Haas robertmh...@gmail.com writes: For example, suppose that 26 tables each of which is 4GB in size are going to simultaneously come due for an anti-wraparound vacuum in 26 hours. For the sake of simplicity suppose that each will take 1 hour to vacuum. What we currently do is wait for 26 hours and then start vacuuming them all at top speed, thrashing the I/O system. This is a nice description of a problem that has nothing to do with reality. In the first place, we don't vacuum them all at once; we can only vacuum max_workers of them at a time. In the second place, the cost-delay features ought to be keeping autovacuum from thrashing the I/O, entirely independently of what the reason was for starting the vacuums. Clearly, since people are complaining, there's something that needs work there. But not the work you're proposing. regards, tom lane
Re: [HACKERS] Posix Shared Mem patch
On Wed, Jun 27, 2012 at 9:44 AM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: On Wed, Jun 27, 2012 at 12:00 AM, Tom Lane t...@sss.pgh.pa.us wrote: Would Posix shmem help with that at all? Why did you choose not to use the Posix API, anyway? It seemed more complicated. If we use the POSIX API, we've got to have code to find a non-colliding name for the shm, and we've got to arrange to clean it up at process exit. Anonymous shm doesn't require a name and goes away automatically when it's no longer in use. I see. Those are pretty good reasons ... So, should we do it this way? I did a little research and discovered that Linux 2.3.51 (released 3/11/2000) apparently returns EINVAL for MAP_SHARED|MAP_ANONYMOUS. That combination is documented to work beginning in Linux 2.4.0. How worried should we be about people trying to run PostgreSQL 9.3 on pre-2.4 kernels? If we want to worry about it, we could try mapping a one-page shared MAP_SHARED|MAP_ANONYMOUS segment first. If that works, we could assume that we have a working MAP_SHARED|MAP_ANONYMOUS facility and try to allocate the whole segment plus a minimal sysv shm. If the single page allocation fails with EINVAL, we could fall back to allocating the entire segment as sysv shm. A related question is - if we do this - should we enable it only on ports where we've verified that it works, or should we just turn it on everywhere and fix breakage if/when it's reported? I lean toward the latter. If we find that there are platforms where (a) mmap is not supported or (b) MAP_SHARED|MAP_ANON works but has the wrong semantics, we could either shut off this optimization on those platforms by fiat, or we could test not only that the call succeeds, but that it works properly: create a one-page mapping and fork a child process; in the child, write to the mapping and exit; in the parent, wait for the child to exit and then test that we can read back the correct contents. 
This would protect against a hypothetical system where the flags are accepted but fail to produce the correct behavior. I'm inclined to think this is over-engineering in the absence of evidence that there are platforms that work this way. Thoughts? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] pl/perl and utf-8 in sql_ascii databases
Hello, thank you for your suggestion. I agree; that is the fundamental question. I coded it just for fun, but I don't see much significance in doing that. We might omit the test for this, which is non-critical and covers corner cases. We need these tests to work on Windows too, so fancy gmake tricks are probably not the way to deal with varying results. Hmm. I understand you are suggesting that we should do this in the normal regression test. Ok. Since there turned out to be only two patterns in the regression test, the fancy thing is no longer needed. I will unfold them and make sure they work on the MinGW build environment. And for one more environment, the one with VC++, I'll need a bit more time to make out what vcregress.pl does. On the other points, I will decide as follows and send to a committer as soon as the above is finished. - The main patch, which fixes the sql-ascii handling itself, should be ported into 9.2 and 9.1. Someone should work on this. (me?) - The remainder of the patch, which fixes the easily fixable leaks of palloc'ed memory, won't be ported into 9.1. This is only for 9.3dev. - The patch for 9.3dev will be provided with the new regression test. It could be easily ported into 9.1 and 9.2, and there seems to be no problem technically, but I'm a bit unsure from other points of view... regards, -- Kyotaro Horiguchi NTT Open Source Software Center == My e-mail address has been changed since Apr. 1, 2012.
Re: [HACKERS] We probably need autovacuum_max_wraparound_workers
On Thu, Jun 28, 2012 at 12:51 AM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: For example, suppose that 26 tables each of which is 4GB in size are going to simultaneously come due for an anti-wraparound vacuum in 26 hours. For the sake of simplicity suppose that each will take 1 hour to vacuum. What we currently do is wait for 26 hours and then start vacuuming them all at top speed, thrashing the I/O system. This is a nice description of a problem that has nothing to do with reality. In the first place, we don't vacuum them all at once; we can only vacuum max_workers of them at a time. In the second place, the cost-delay features ought to be keeping autovacuum from thrashing the I/O, entirely independently of what the reason was for starting the vacuums. I don't think it works that way. The point is that the workload imposed by autovac is intermittent and spikey. If you configure the cost limit too low, or the delay too high, or the number of autovac workers is too low, then autovac can't keep up, which causes all of your tables to bloat and is a total disaster. You have to make sure that isn't going to happen, so you naturally configure the settings aggressively enough that you're sure autovac will be able to stay ahead of your bloat problem. But then autovac is more resource-intensive ALL the time, not just when there's a real need for it. This is like giving a kid a $20 bill to buy lunch and having them walk around until they find a restaurant sufficiently expensive that lunch there costs $20. The point of handing over $20 was that you were willing to spend that much *if needed*, not that the money was burning a hole in your pocket. To make that more concrete, suppose that a table has an update rate such that it hits the autovac threshold every 10 minutes. 
If you set the autovac settings such that an autovacuum of that table takes 9 minutes to complete, you are hosed: there will eventually be some 10-minute period where the update rate is ten times the typical amount, and the table will gradually become horribly bloated. But if you set the autovac settings such that an autovacuum of the table can finish in 1 minute, so that you can cope with a spike, then whenever there isn't a spike you are processing the table ten times faster than necessary, and now one minute out of every ten carries a heavier I/O load than the other 9, leading to uneven response times. It's just ridiculous to assert that it doesn't matter if all the anti-wraparound vacuums start simultaneously. It does matter. For one thing, once every single autovacuum worker is pinned down doing an anti-wraparound vacuum of some table, then a table that needs an ordinary vacuum may have to wait quite some time before a worker is available. Depending on the order in which workers iterate through the tables, you could end up finishing all of the anti-wraparound vacuums before doing any of the regular vacuums. If the wraparound vacuums had been properly spread out, then there would at all times have been workers available for regular vacuums as needed. For another thing, you can't possibly think that three or five workers running simultaneously, each reading a different table, is just as efficient as having one worker grind through them consecutively. Parallelism is not free, ever, and particularly not here, where it has the potential to yank the disk head around between five different files, seeking like crazy, instead of a nice sequential I/O pattern on each file in turn. Josh wouldn't keep complaining about this if it didn't suck. 
-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
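For context on the knobs being debated in this thread: with the defaults of that era (vacuum_cost_limit = 200, autovacuum_vacuum_cost_delay = 20 ms, vacuum_cost_page_miss = 10), a worker reading only cold pages is throttled to roughly 1000 pages/s, about 8 MB/s with 8 kB pages. A back-of-envelope sketch of that arithmetic:

```c
/* Back-of-envelope throughput implied by the cost-delay settings:
 * a worker accrues cost per page until it hits the limit, then
 * naps for the delay.  Worst case assumes every page is a miss.
 * Numbers below match the defaults discussed upthread. */
static double vacuum_pages_per_second(int cost_limit, int delay_ms,
                                      int page_cost)
{
    double pages_per_round = (double) cost_limit / page_cost;  /* pages per nap */
    double rounds_per_second = 1000.0 / delay_ms;
    return pages_per_round * rounds_per_second;
}
```

Plugging in the defaults gives vacuum_pages_per_second(200, 20, 10) = 1000 pages/s, which is the "resource-intensive all the time" floor Robert objects to: the rate is fixed by the settings, not by how much vacuuming actually needs to happen.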