date:20141022

Re: [HACKERS] [PATCH] add ssl_protocols configuration option

2014-10-22 Thread Martijn van Oosterhout

On Wed, Oct 22, 2014 at 09:36:59PM +0200, Dag-Erling Smørgrav wrote:
> Martijn van Oosterhout  writes:
> > Dag-Erling Smørgrav  writes:
> > > If I understand correctly, imaps has been shown to be vulnerable as
> > > well, so I wouldn't be so sure.
> > Reference?
> 
> Sorry, no reference.  I was told that Thunderbird was vulnerable to
> POODLE when talking imaps.

Ugh, found it. It does the same connection fallback stuff as firefox.

https://securityblog.redhat.com/2014/10/20/can-ssl-3-0-be-fixed-an-analysis-of-the-poodle-attack/

> > Since you can already specify the cipher list, couldn't you just add
> > -SSLv3 to the cipher list and be done?
> 
> I didn't want to change the existing behavior; all I wanted was to give
> users a way to do so if they wish.

I think we should just disable SSL3.0 altogether. The only way this
could cause problems is if people are using PostgreSQL with an OpenSSL
library from last century.  As for client libraries, even Windows XP
supports TLS1.0.

Have a nice day,
-- 
Martijn van Oosterhout  http://svana.org/kleptog/
> He who writes carelessly confesses thereby at the very outset that he does
> not attach much importance to his own thoughts.
   -- Arthur Schopenhauer


signature.asc
Description: Digital signature

Re: [HACKERS] BUG: *FF WALs under 9.2 (WAS: .ready files appearing on slaves)

2014-10-22 Thread Fujii Masao

On Mon, Oct 20, 2014 at 3:26 PM, Michael Paquier
 wrote:
> On Fri, Oct 17, 2014 at 10:37 PM, Michael Paquier
>  wrote:
>>
>> On Fri, Oct 17, 2014 at 9:23 PM, Fujii Masao  wrote:
>>>
>>> In this case, the patch seems to make the restartpoint recycle even WAL 
>>> files
>>> which have .ready files and will have to be archived later. Thought?
>>
>> The real problem currently is that it is possible to have a segment file not 
>> marked as .done during recovery when stream connection is abruptly cut when 
>> this segment is switched, marking it as .ready in archive_status and simply 
>> letting this segment in pg_xlog because it will neither be recycled nor 
>> removed. I have not been able to look much at this code these days, so I am 
>> not sure how invasive it would be in back-branches, but perhaps we should 
>> try to improve code such as when a segment file is switched and connection 
>> to the is cut, we guarantee that this file is completed and marked as .done.
>
> I have spent more time on that, with a bit more of underground work...
> First, the problem can be reproduced most of the time by running this
> simple command:
> psql -c 'select pg_switch_xlog()'; pg_ctl restart -m immediate

What about fixing this problem directly? That is, we can make walreceiver
check whether the end of last received WAL data is the end of current WAL file
or not, and then close the WAL file and create .done file if the test is true.

This is not a perfect solution. If the standby crashes during very
short interval
(i.e., after closing the WAL file and before creating .done file), the problem
would happen again. But it can really rarely happen, so I don't think that it's
worth fixing the corner case at least in back-branches. Of course, we can
find out the "perfect" solution for the master, though.

Regards,

-- 
Fujii Masao

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Function array_agg(array)

2014-10-22 Thread Pavel Stehule

2014-10-23 4:00 GMT+02:00 Ali Akbar :

> 2014-10-22 22:48 GMT+07:00 Pavel Stehule :
>
>> 2014-10-22 16:58 GMT+02:00 Ali Akbar :
>>
>>> Thanks for the review
>>>
>>> 2014-10-22 20:51 GMT+07:00 Pavel Stehule :
>>>
 I agree with your proposal. I have a few comments to design:

>>>
 1. patch doesn't hold documentation and regress tests, please append it.

>>> OK, i'll add the documentation and regression test
>>>
>>>
 2. this functionality (multidimensional aggregation) can be interesting
 more times, so maybe some interface like array builder should be preferred.

>>> We already have accumArrayResult and makeArrayResult/makeMdArrayResult,
>>> haven't we?
>>>
>>> Actually array_agg(anyarray) can be implemented by using
>>> accumArrayResult and makeMdArrayResult, but in the process we will
>>> deconstruct the array data and null bit-flags into ArrayBuildState->dvalues
>>> and dnulls. And we will reconstruct it through makeMdArrayResult. I think
>>> this way it's not efficient, so i decided to reimplement it and memcpy the
>>> array data and null flags as-is.
>>>
>>
>> aha, so isn't better to fix a performance for accumArrayResult ?
>>
>
> Ok, i'll go this route. While reading the code of array(subselect) as you
> pointed below, i think the easiest way to implement support for both
> array_agg(anyarray) and array(subselect) is to change accumArrayResult so
> it operates both with scalar datum(s) and array datums, with performance
> optimization for the latter.
>
> In other places, i think it's clearer if we just use accumArrayResult and
>>> makeArrayResult/makeMdArrayResult as API for building (multidimensional)
>>> arrays.
>>>
>>>
 3. array_agg was consistent with array(subselect), so it should be
 fixed too

 postgres=# select array_agg(a) from test;
array_agg
 ---
  {{1,2,3,4},{1,2,3,4}}
 (1 row)

 postgres=# select array(select a from test);
 ERROR:  could not find array type for data type integer[]

>>>
>>> I'm pretty new in postgresql development. Can you point out where is
>>> array(subselect) implemented?
>>>
>>
>> where you can start?
>>
>> postgres=# explain select array(select a from test);
>> ERROR:  42704: could not find array type for data type integer[]
>> LOCATION:  exprType, nodeFuncs.c:116
>>
>> look to code ... your magic keyword is a ARRAY_SUBLINK .. search in
>> postgresql sources this keyword
>>
>
> Found it. Thanks.
>
> On a side note in postgres array type for integer[] is still integer[],
> but in pg_type, integer[] has no array type. I propose we consider to
> change it so array type of anyarray is itself (not in this patch, of
> course, because it is a big change). Consider the following code in
> nodeFuncs.c:109
>

yes, it is true - this is really big change and maybe needs separate
discuss - ***if we allow cycle there. I am not sure about possible side
effects***.

Maybe this change is not necessary, you can fix a check only ... "if type
is not array or if get_array_type is null raise a error"


>
> if (sublink->subLinkType == ARRAY_SUBLINK)
> {
> type = get_array_type(type);
> if (!OidIsValid(type))
> ereport(ERROR,
> (errcode(ERRCODE_UNDEFINED_OBJECT),
>  errmsg("could not find array type for data type %s",
> format_type_be(exprType((Node *) tent->expr);
> }
>
> to implement array(subselect) you pointed above, we must specially checks
> for array types. Quick search on get_array_type returns 10 places.
>

attention: probably we don't would to allow arrays everywhere.


>
> I'll think more about this later. For this patch, i'll go without changes
> in pg_type.h.
>
> 4. why you use a magic constant (64) there?

 + astate->abytes = 64 * (ndatabytes == 0 ? 1 : ndatabytes);
 + astate->aitems = 64 * nitems;

 + astate->nullbitmap = (bits8 *)
 + repalloc(astate->nullbitmap, (astate->aitems + 7) /
 8);

>>>
>>> just follow the arbitrary size choosen in accumArrayResult
>>> (arrayfuncs.c):
>>> astate->alen = 64; /* arbitrary starting array size */
>>>
>>> it can be any number not too small and too big. Too small, and we will
>>> realloc shortly. Too big, we will end up wasting memory.
>>>
>>
>> you can try to alloc 1KB instead as start -- it is used more times in
>> Postgres. Then a overhead is max 1KB per agg call - what is acceptable.
>>
>> You take this value from accumArrayResult, but it is targeted for shorted
>> scalars - you should to expect so any array will be much larger.
>>
>
> Ok.
>
>
> Regards,
> --
> Ali Akbar
>

Re: [HACKERS] Function array_agg(array)

2014-10-22 Thread Ali Akbar

2014-10-22 22:48 GMT+07:00 Pavel Stehule :

> 2014-10-22 16:58 GMT+02:00 Ali Akbar :
>
>> Thanks for the review
>>
>> 2014-10-22 20:51 GMT+07:00 Pavel Stehule :
>>
>>> I agree with your proposal. I have a few comments to design:
>>>
>>
>>> 1. patch doesn't hold documentation and regress tests, please append it.
>>>
>> OK, i'll add the documentation and regression test
>>
>>
>>> 2. this functionality (multidimensional aggregation) can be interesting
>>> more times, so maybe some interface like array builder should be preferred.
>>>
>> We already have accumArrayResult and makeArrayResult/makeMdArrayResult,
>> haven't we?
>>
>> Actually array_agg(anyarray) can be implemented by using accumArrayResult
>> and makeMdArrayResult, but in the process we will deconstruct the array
>> data and null bit-flags into ArrayBuildState->dvalues and dnulls. And we
>> will reconstruct it through makeMdArrayResult. I think this way it's not
>> efficient, so i decided to reimplement it and memcpy the array data and
>> null flags as-is.
>>
>
> aha, so isn't better to fix a performance for accumArrayResult ?
>

Ok, i'll go this route. While reading the code of array(subselect) as you
pointed below, i think the easiest way to implement support for both
array_agg(anyarray) and array(subselect) is to change accumArrayResult so
it operates both with scalar datum(s) and array datums, with performance
optimization for the latter.

In other places, i think it's clearer if we just use accumArrayResult and
>> makeArrayResult/makeMdArrayResult as API for building (multidimensional)
>> arrays.
>>
>>
>>> 3. array_agg was consistent with array(subselect), so it should be fixed
>>> too
>>>
>>> postgres=# select array_agg(a) from test;
>>>array_agg
>>> ---
>>>  {{1,2,3,4},{1,2,3,4}}
>>> (1 row)
>>>
>>> postgres=# select array(select a from test);
>>> ERROR:  could not find array type for data type integer[]
>>>
>>
>> I'm pretty new in postgresql development. Can you point out where is
>> array(subselect) implemented?
>>
>
> where you can start?
>
> postgres=# explain select array(select a from test);
> ERROR:  42704: could not find array type for data type integer[]
> LOCATION:  exprType, nodeFuncs.c:116
>
> look to code ... your magic keyword is a ARRAY_SUBLINK .. search in
> postgresql sources this keyword
>

Found it. Thanks.

On a side note in postgres array type for integer[] is still integer[], but
in pg_type, integer[] has no array type. I propose we consider to change it
so array type of anyarray is itself (not in this patch, of course, because
it is a big change). Consider the following code in nodeFuncs.c:109

if (sublink->subLinkType == ARRAY_SUBLINK)
{
type = get_array_type(type);
if (!OidIsValid(type))
ereport(ERROR,
(errcode(ERRCODE_UNDEFINED_OBJECT),
 errmsg("could not find array type for data type %s",
format_type_be(exprType((Node *) tent->expr);
}

to implement array(subselect) you pointed above, we must specially checks
for array types. Quick search on get_array_type returns 10 places.

I'll think more about this later. For this patch, i'll go without changes
in pg_type.h.

4. why you use a magic constant (64) there?
>>>
>>> + astate->abytes = 64 * (ndatabytes == 0 ? 1 : ndatabytes);
>>> + astate->aitems = 64 * nitems;
>>>
>>> + astate->nullbitmap = (bits8 *)
>>> + repalloc(astate->nullbitmap, (astate->aitems + 7) / 8);
>>>
>>
>> just follow the arbitrary size choosen in accumArrayResult (arrayfuncs.c):
>> astate->alen = 64; /* arbitrary starting array size */
>>
>> it can be any number not too small and too big. Too small, and we will
>> realloc shortly. Too big, we will end up wasting memory.
>>
>
> you can try to alloc 1KB instead as start -- it is used more times in
> Postgres. Then a overhead is max 1KB per agg call - what is acceptable.
>
> You take this value from accumArrayResult, but it is targeted for shorted
> scalars - you should to expect so any array will be much larger.
>

Ok.


Regards,
--
Ali Akbar

Re: [HACKERS] Simplify calls of pg_class_aclcheck when multiple modes are used

2014-10-22 Thread Peter Eisentraut

On 10/21/14 6:19 PM, Michael Paquier wrote:
> On Wed, Oct 22, 2014 at 5:03 AM, Peter Eisentraut  > wrote:
> 
> While looking at this, I wrote a few tests cases for sequence
> privileges, because that was not covered at all.  That patch is
> attached.
> 
> +1 for those tests.
> -- 
> Michael

Committed your patch and tests.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Moving of INT64_FORMAT to c.h

2014-10-22 Thread Steve Singer


On 10/22/2014 04:17 PM, Robert Haas wrote:



Slonik used to include postgre_fe.h but back in 2011 or so we stopped doing
so because it was causing issues (I think on win32 builds)

That seems like something we ought to consider fixing, but obviously
we'd need more details.

When I'll try to find sometime to get a windows environment running and 
try to figure out what the problems were.
It's also possible that I removed the postgres_fe include at the same 
time as I was fixing other win32 issues and the postgres_fe include 
wasn't causing a problem it was just removed because it was unnecessary.


At the moment slony only includes the server include dir if we are 
building with pgport (which we normally do on windows)
We have had reports of issues like described 
(http://www.slony.info/bugzilla/show_bug.cgi?id=315) where some installs 
make us pick up server and client includes from different versions of PG.




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Getting rid of "accept incoming network connections" prompts on OS X

2014-10-22 Thread Tom Lane

Robert Haas  writes:
> On Tue, Oct 21, 2014 at 1:16 PM, Tom Lane  wrote:
>> If you do any Postgres development on OS X, you've probably gotten
>> seriously annoyed by the way that, every single time you reinstall the
>> postmaster executable, you get a dialog box asking whether you'd like
>> to allow it to accept incoming network connections.

> Ugh.  This must be new in Mavericks, because I don't get any such
> behavior on 10.8.5.

Hm, I thought it went further back than that ... I remember having put up
with it for some time now.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_background (and more parallelism infrastructure patches)

2014-10-22 Thread Joshua D. Drake





What about moving it the opposite direction, from contrib all the way
into core?  The main advantage of that in my book is that it might
make it easier to reduce the code duplication between pg_background
and exec_simple_query; the main disadvantage is that it reduces the
super-user's control, because now the functionality is always
available instead of being something that the super-user can choose to
install, or not.

But these are all judgement calls, of course.


I lean toward a push into core because:

1. "I expect that eventually all of this stuff will be
used in core - for parallelism."

2. Contrib is already bloated out. (which is a whole other discussion)

3. I am not sure I buy into the super-user argument. Just because the 
functionality is there, doesn't mean it has to be used.


4. "The main advantage of that in my book is that it might
make it easier to reduce the code duplication between pg_background
and exec_simple_query;"

JD



--
Command Prompt, Inc. - http://www.commandprompt.com/  503-667-4564
PostgreSQL Support, Training, Professional Services and Development
High Availability, Oracle Conversion, @cmdpromptinc
"If we send our children to Caesar for their education, we should
 not be surprised when they come back as Romans."


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Getting rid of "accept incoming network connections" prompts on OS X

2014-10-22 Thread Robert Haas

On Tue, Oct 21, 2014 at 1:16 PM, Tom Lane  wrote:
> If you do any Postgres development on OS X, you've probably gotten
> seriously annoyed by the way that, every single time you reinstall the
> postmaster executable, you get a dialog box asking whether you'd like
> to allow it to accept incoming network connections.  (At least, you
> do unless you disable the OS firewall, which is not a great idea.)
> It's particularly awful to run "make check-world" in this environment,
> because you get a pop-up for each test install.

Ugh.  This must be new in Mavericks, because I don't get any such
behavior on 10.8.5.

What an awful, awful behavior.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_background (and more parallelism infrastructure patches)

2014-10-22 Thread Robert Haas

On Wed, Oct 22, 2014 at 7:12 PM, Joshua D. Drake  wrote:
>> I don't think pg_background is merely a test program: I think it's a
>> quite useful piece of functionality.  It can be used for running
>> VACUUM from a procedure, which is something people have asked for more
>> than once, or for simulating an autonomous transaction.  Granted,
>> it'll be a lot slower than a real autonomous transaction, but it's
>> still better than doing it via dblink, because you don't have to futz
>> with authentication.
>
> I think this is a very useful program but I wonder if it makes more sense to
> push it out to pgxn? Why do we need an ever growing contrib? Further, if it
> is good enough to go into contrib, why isn't it just in core?

Sure, there's always this debate.  Certainly, there's no reason it has
to be in contrib rather than core or PGXN.  From my point of view,
putting it on PGXN would be a less-than-awesome development because
then the infrastructure that the patch introduces (patches 1-5 of the
series) would have no users in core or contrib, which means that (A)
if anyone is thinking of modifying that code in the future, they'll
have more difficulty knowing whether their changes break anything and
(B) if someone breaks something unwittingly, the buildfarm will not
tell us.

Now, admittedly, I expect that eventually all of this stuff will be
used in core - for parallelism.  But in the meantime I think having at
least one example in the PostgreSQL source tree of how each piece of
extensibility infrastructure that we have can be used is handy and
useful and generally something that we ought to encourage.  And this
one has the advantage - unlike some stuff in contrib - of actually
being somewhat useful for real work, which a decent amount of the
stuff in contrib isn't.

What about moving it the opposite direction, from contrib all the way
into core?  The main advantage of that in my book is that it might
make it easier to reduce the code duplication between pg_background
and exec_simple_query; the main disadvantage is that it reduces the
super-user's control, because now the functionality is always
available instead of being something that the super-user can choose to
install, or not.

But these are all judgement calls, of course.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Proposal : REINDEX SCHEMA

2014-10-22 Thread Alvaro Herrera

Sawada Masahiko wrote:

> Thank you for reviewing.
> I agree 2) - 5).
> Attached patch is latest version patch I modified above.
> Also, I noticed I had forgotten to add the patch regarding document of
> reindexdb.

Please don't use pg_catalog in the regression test.  That way we will
need to update the expected file whenever a new catalog is added, which
seems pointless.  Maybe create a schema with a couple of tables
specifically for this, instead.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] superuser() shortcuts

2014-10-22 Thread Alvaro Herrera

Brightwell, Adam wrote:
> All,
> 
> 
> > Thanks!  Please add it to the next commitfest.
> >
> >
> > Sounds good.  I'll update the patch and add accordingly.
> >
> 
> Attached is an updated patch.

I noticed something strange while perusing this patch, but the issue
predates the patch.  Some messages say "must be superuser or replication
role to foo", but our longstanding practice is to say "permission denied
to foo".  Why do we have this inconsistency?  Should we remove it?  If
we do want to keep the old wording this patch should use "permission
denied to" in the places that it touches.

Other than that, since we already agreed that it's something we want,
the only comment I have about this patch is an empty line in variable
declarations here which should be removed:

> diff --git a/src/backend/commands/alter.c b/src/backend/commands/alter.c
> new file mode 100644
> index c9a9baf..ed89b23
> *** a/src/backend/commands/alter.c

> --- 807,848 
>   bool   *nulls;
>   bool   *replaces;
>   
> ! AclObjectKind aclkind = get_object_aclkind(classId);
> ! 


-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_background (and more parallelism infrastructure patches)

2014-10-22 Thread Joshua D. Drake



On 10/22/2014 04:03 PM, Robert Haas wrote:

On Wed, Oct 8, 2014 at 6:32 PM, Andres Freund  wrote:

I got to ask: Why is it helpful that we have this in contrib? I have a
good share of blame to bear for that, but I think we need to stop
dilluting contrib evermore with test programs. These have a place, but I
don't think it should be contrib.


I don't think pg_background is merely a test program: I think it's a
quite useful piece of functionality.  It can be used for running
VACUUM from a procedure, which is something people have asked for more
than once, or for simulating an autonomous transaction.  Granted,
it'll be a lot slower than a real autonomous transaction, but it's
still better than doing it via dblink, because you don't have to futz
with authentication.


I think this is a very useful program but I wonder if it makes more 
sense to push it out to pgxn? Why do we need an ever growing contrib? 
Further, if it is good enough to go into contrib, why isn't it just in core?


Sincerely,

Joshua D. Drake


--
Command Prompt, Inc. - http://www.commandprompt.com/  503-667-4564
PostgreSQL Support, Training, Professional Services and Development
High Availability, Oracle Conversion, @cmdpromptinc
"If we send our children to Caesar for their education, we should
 not be surprised when they come back as Romans."


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_background (and more parallelism infrastructure patches)

2014-10-22 Thread Robert Haas

On Wed, Oct 8, 2014 at 6:32 PM, Andres Freund  wrote:
> I got to ask: Why is it helpful that we have this in contrib? I have a
> good share of blame to bear for that, but I think we need to stop
> dilluting contrib evermore with test programs. These have a place, but I
> don't think it should be contrib.

I don't think pg_background is merely a test program: I think it's a
quite useful piece of functionality.  It can be used for running
VACUUM from a procedure, which is something people have asked for more
than once, or for simulating an autonomous transaction.  Granted,
it'll be a lot slower than a real autonomous transaction, but it's
still better than doing it via dblink, because you don't have to futz
with authentication.

I would be all in favor of moving things like test_decoding,
test_parser, and test_shm_mq to src/test or contrib/test or wherever
else we want to put them, but I think pg_background belongs in
contrib.

>> +/* Fixed-size data passed via our dynamic shared memory segment. */
>> +typedef struct pg_background_fixed_data
>> +{
>> + Oid database_id;
>> + Oid authenticated_user_id;
>> + Oid current_user_id;
>> + int sec_context;
>> + char database[NAMEDATALEN];
>> + char authenticated_user[NAMEDATALEN];
>> +} pg_background_fixed_data;
>
> Why not NameData?

No particular reason.  Changed.

> whitespace damage.

I went through and fixed everything that git diff --check complained
about.  Let me know if you see other problems yet.

>> +static HTAB *worker_hash;
>
> Hm. So the lifetime of this hash's contents is managed via
> on_dsm_detach(), do I understand that correctly?

More or less, yes.

> Hm. So every user can do this once the extension is created as the
> functions are most likely to be PUBLIC. Is this a good idea?

Why not?  If they can log in, they could start separate sessions with
similar effect.

>> + /*
>> +  * Whether we succeed or fail, a future invocation of this 
>> function
>> +  * may not try to read from the DSM once we've begun to do so.
>> +  * Accordingly, make arrangements to clean things up at end of 
>> query.
>> +  */
>> + dsm_unkeep_mapping(info->seg);
>> +
>> + /* Set up tuple-descriptor based on colum definition list. */
>> + if (get_call_result_type(fcinfo, NULL, &tupdesc) != 
>> TYPEFUNC_COMPOSITE)
>> + ereport(ERROR,
>> + 
>> (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
>> +  errmsg("function returning record 
>> called in context "
>> + "that cannot accept 
>> type record"),
>> +  errhint("Try calling the function in 
>> the FROM clause "
>> +  "using a column 
>> definition list.")));
>
> Hm, normally we don't add linebreaks inside error messages.

I copied it from dblink.

> I'm unsure right now about the rules surrounding this, but shouldn't we
> check that the user is allowed to execute these? And shouldn't we fall
> back to non binary functions if no binary ones are available?

I can't see any reason to do either of those things.  I'm not aware
that returning data in binary format is in any way intended to be a
security-restricted operation, or that we have any data types that
actually matter without send and receive functions.  If we do, I think
the solution is to add them, not make this more complicated.

>> + /*
>> +  * Limit the maximum error level to 
>> ERROR.  We don't want
>> +  * a FATAL inside the background 
>> worker to kill the user
>> +  * session.
>> +  */
>> + if (edata.elevel > ERROR)
>> + edata.elevel = ERROR;
>
> Hm. But we still should report that it FATALed? Maybe add error context
> notice about it? Not nice, but I don't have a immediately better idea. I
> think it generally would be rather helpful to add the information that
> this wasn't originally an error triggered by this process. The user
> might otherwise be confused when looking for the origin of the error in
> the log.

Yeah, I was wondering if we needed some kind of annotation here.  What
I'm wondering about is appending something to the errcontext, perhaps
"background worker, PID %d".

>> + case 'A':
>> + {
>> + /* Propagate NotifyResponse. */
>> + pq_putmessage(msg.data[0], 
>> &msg.data[1], nbytes - 1);
>> + break;
>
> Hm. Are we sure to be in a situation where the client expects these? And
> are we sur

Re: [HACKERS] delta relations in AFTER triggers

2014-10-22 Thread Tom Lane

Heikki Linnakangas  writes:
> On 10/22/2014 11:10 PM, Robert Haas wrote:
>> Studying this proposed design a bit further, I am a little fuzzy on
>> what is supposed to happen in this design during parse analysis.  In
>> Kevin's current code, after checking whether a RangeVar might be a CTE
>> reference and before deciding that it must be a table reference,
>> scanNameSpaceForTsr() is used to check whether there's a tuplestore by
>> that name and, if so, then we insert a RTE with type RTE_TUPLESTORE
>> which references the tuplestore by name.  To me, the obvious thing to
>> do here seems to be to instead call p_tableref_hook and let it return
>> a suitable RangeTblRef (or NULL if it wishes to take no action).  In
>> the case where the hook wants to redirect the use of that name to a
>> tuplestore, it can add a range-table entry of type RTE_TUPLESTORE, and
>> that entry can store a parameter-index rather than, as in the current
>> design, a name.  But then where does Heikki's notion of a
>> RelationParam get used?

> I was thinking that the hook would return a RelationParam. When parse 
> analysis sees the returned RelationParam, it adds an entry for that to 
> the range table, and creates the RangeTblRef for it. The way you 
> describe it works too, but manipulating the range table directly in the 
> hook seems a bit too low-level.

The problem with that idea is that then the API for the hook has to cover
every possible sort of RTE that hooks might wish to create; I see no
reason to restrict them to creating just one kind.  I agree that the hook
should avoid *physically* manipulating the rangetable, but it seems
reasonable to expect that it can call one of the addRangeTableEntryXXX
functions provided by parse_relation.c, and then return a RangeTblEntry*
gotten that way.  So hooks would have an API more or less equivalent
to, eg, transformRangeFunction().

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] delta relations in AFTER triggers

2014-10-22 Thread Heikki Linnakangas


On 10/22/2014 11:10 PM, Robert Haas wrote:

On Tue, Sep 23, 2014 at 1:51 PM, Tom Lane  wrote:

Add a new p_tableref_hook function pointer, similar to p_paramref_hook.
Whenever the parser sees a RangeVar that it doesn't recognize (or actually,
I think it should call it *before* resolving regular tables, but let's
ignore that for now), it calls the p_tableref_hook. It can return a new
RelationParam node (similar to regular Param), which contains a numeric ID
for the table/tuplestore, as well as its tuple descriptor.

For the execution phase, add a new array of Tuplestorestates to
ParamListInfo. Similar to the existing array of ParamExternalDatas.



I haven't been following this issue closely, but this sounds like a
really nice design.


I'm on board with the parser hooks part of that.  I don't especially agree
with the idea of a new sub-structure for ParamListInfo: if you do that you
will need a whole bunch of new boilerplate infrastructure to allocate,
copy, and generally manage that structure, for darn little gain.  What I'd
suggest is just saying that some Params might have type INTERNAL with
Datum values that are pointers to tuplestores; then all you need to do is
remember which Param number has been assigned to the particular tuplestore
you want.  There is already precedent for that in the recursive CTE code,
IIRC.


Studying this proposed design a bit further, I am a little fuzzy on
what is supposed to happen in this design during parse analysis.  In
Kevin's current code, after checking whether a RangeVar might be a CTE
reference and before deciding that it must be a table reference,
scanNameSpaceForTsr() is used to check whether there's a tuplestore by
that name and, if so, then we insert a RTE with type RTE_TUPLESTORE
which references the tuplestore by name.  To me, the obvious thing to
do here seems to be to instead call p_tableref_hook and let it return
a suitable RangeTblRef (or NULL if it wishes to take no action).  In
the case where the hook wants to redirect the use of that name to a
tuplestore, it can add a range-table entry of type RTE_TUPLESTORE, and
that entry can store a parameter-index rather than, as in the current
design, a name.  But then where does Heikki's notion of a
RelationParam get used?


I was thinking that the hook would return a RelationParam. When parse 
analysis sees the returned RelationParam, it adds an entry for that to 
the range table, and creates the RangeTblRef for it. The way you 
describe it works too, but manipulating the range table directly in the 
hook seems a bit too low-level.


- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Moving of INT64_FORMAT to c.h

2014-10-22 Thread Tom Lane

Robert Haas  writes:
> On Wed, Oct 22, 2014 at 4:12 PM, Steve Singer  wrote:
>> So the header of c.h says "Note that the definitions here are not intended
>> to be exposed to clients"
>> but
>> postgres_fe.h says "This should be the first file included by PostgreSQL
>> client libraries and"
>> 
>> Should client programs that live outside the postgres source tree be
>> including postgres_fe.h ?  I have a feeling the answer is no. If the answer
>> is no, then why does a make install install postgres_fe.h ?

> I think the answer is yes.

Yeah.  In particular, postgres_fe.h sees to it that FRONTEND is #define'd;
without that there is *not* a guarantee that c.h will work for non-backend
compiles.  We would much rather you were including postgres_fe.h than c.h
directly.  Having said that, there is not and never will be a guarantee
that everything that postgres_fe.h declares is immutable, and that
certainly applies to pg_config.h in particular.  There's a reason why
libpq doesn't include that into its public headers ...

>> Slonik used to include postgre_fe.h but back in 2011 or so we stopped doing
>> so because it was causing issues (I think on win32 builds)

> That seems like something we ought to consider fixing, but obviously
> we'd need more details.

Indeed.  But I suspect the core of it is going to be that clients of
postgres_fe.h are expected to be linking in libpgport.a ...

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Moving of INT64_FORMAT to c.h

2014-10-22 Thread Robert Haas

On Wed, Oct 22, 2014 at 4:12 PM, Steve Singer  wrote:
> So the header of c.h says "Note that the definitions here are not intended
> to be exposed to clients"
> but
> postgres_fe.h says "This should be the first file included by PostgreSQL
> client libraries and"
>
> Should client programs that live outside the postgres source tree be
> including postgres_fe.h ?  I have a feeling the answer is no. If the answer
> is no, then why does a make install install postgres_fe.h ?

I think the answer is yes.

> Slonik used to include postgre_fe.h but back in 2011 or so we stopped doing
> so because it was causing issues (I think on win32 builds)

That seems like something we ought to consider fixing, but obviously
we'd need more details.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Moving of INT64_FORMAT to c.h

2014-10-22 Thread Steve Singer


On 10/16/2014 08:08 AM, Andres Freund wrote:

On 2014-10-16 08:04:17 -0400, Jan Wieck wrote:

Hi,

PostgreSQL has for ages defined INT64_FORMAT and UINT64_FORMAT in
pg_config.h. This commit

http://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=ce486056ecd28050

moved those definitions to c.h, which breaks compilation of all released
Slony-I versions against current master. Can those be moved back to where
they used to be?

Well, you could add additional configure stuff to also emit what you
want.


Slony uses the definitions in external tools, like slon and slonik, to
format sequence numbers in log output.

Then it should include c.h/postgres_fe.h?


So the header of c.h says "Note that the definitions here are not 
intended to be exposed to clients"

but
postgres_fe.h says "This should be the first file included by PostgreSQL 
client libraries and"


Should client programs that live outside the postgres source tree be 
including postgres_fe.h ?  I have a feeling the answer is no. If the 
answer is no, then why does a make install install postgres_fe.h ?


Slonik used to include postgre_fe.h but back in 2011 or so we stopped 
doing so because it was causing issues (I think on win32 builds)


Maybe slony client programs shouldn't be trying to steal portability 
definitions from postgres headers, but I doubt we are the only ones 
doing that.  It isn't a big deal for slony to define it's own 
INT64_FORMAT for 9.5+ but third party projects that have been including 
pg_config.h will hit similar issues.  if there was good reason for the 
change then fine (Postgres isn't intended to be a general purpose C 
portability layer).




Steve


Greetings,

Andres Freund





--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] delta relations in AFTER triggers

2014-10-22 Thread Robert Haas

On Tue, Sep 23, 2014 at 1:51 PM, Tom Lane  wrote:
>>> Add a new p_tableref_hook function pointer, similar to p_paramref_hook.
>>> Whenever the parser sees a RangeVar that it doesn't recognize (or actually,
>>> I think it should call it *before* resolving regular tables, but let's
>>> ignore that for now), it calls the p_tableref_hook. It can return a new
>>> RelationParam node (similar to regular Param), which contains a numeric ID
>>> for the table/tuplestore, as well as its tuple descriptor.
>>>
>>> For the execution phase, add a new array of Tuplestorestates to
>>> ParamListInfo. Similar to the existing array of ParamExternalDatas.
>
>> I haven't been following this issue closely, but this sounds like a
>> really nice design.
>
> I'm on board with the parser hooks part of that.  I don't especially agree
> with the idea of a new sub-structure for ParamListInfo: if you do that you
> will need a whole bunch of new boilerplate infrastructure to allocate,
> copy, and generally manage that structure, for darn little gain.  What I'd
> suggest is just saying that some Params might have type INTERNAL with
> Datum values that are pointers to tuplestores; then all you need to do is
> remember which Param number has been assigned to the particular tuplestore
> you want.  There is already precedent for that in the recursive CTE code,
> IIRC.

Studying this proposed design a bit further, I am a little fuzzy on
what is supposed to happen in this design during parse analysis.  In
Kevin's current code, after checking whether a RangeVar might be a CTE
reference and before deciding that it must be a table reference,
scanNameSpaceForTsr() is used to check whether there's a tuplestore by
that name and, if so, then we insert a RTE with type RTE_TUPLESTORE
which references the tuplestore by name.  To me, the obvious thing to
do here seems to be to instead call p_tableref_hook and let it return
a suitable RangeTblRef (or NULL if it wishes to take no action).  In
the case where the hook wants to redirect the use of that name to a
tuplestore, it can add a range-table entry of type RTE_TUPLESTORE, and
that entry can store a parameter-index rather than, as in the current
design, a name.  But then where does Heikki's notion of a
RelationParam get used?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [PATCH] add ssl_protocols configuration option

2014-10-22 Thread Dag-Erling Smørgrav

Martijn van Oosterhout  writes:
> Dag-Erling Smørgrav  writes:
> > If I understand correctly, imaps has been shown to be vulnerable as
> > well, so I wouldn't be so sure.
> Reference?

Sorry, no reference.  I was told that Thunderbird was vulnerable to
POODLE when talking imaps.

> Since you can already specify the cipher list, couldn't you just add
> -SSLv3 to the cipher list and be done?

I didn't want to change the existing behavior; all I wanted was to give
users a way to do so if they wish.

DES
-- 
Dag-Erling Smørgrav - d...@des.no


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] BUG: *FF WALs under 9.2 (WAS: .ready files appearing on slaves)

2014-10-22 Thread Heikki Linnakangas


On 10/22/2014 04:24 PM, Michael Paquier wrote:

On Wed, Oct 22, 2014 at 1:53 PM, Heikki Linnakangas 
wrote:



I think we should take a more wholesale approach to this. We should
enforce the rule that the server only ever archives WAL files belonging to
the same timeline that the server generates. IOW, the server only archives
the WAL that it has generated.


Hm?! Would that be really back-patchable? There may be in the wild tools or
users that rely on the fact a node archives segment files from all
timelines.


Hmm, so it would be a tool or user that manually copies a file to the 
pg_xlog directory of a standby server, and expects the standby to 
archive the file after promotion. That seems a bit far-fetched, although 
I've seen people do strange things. I think it would be acceptable as 
long as we document the change in behavior in the release notes. I don't 
have much hope that we'll ever be able to nail down the correct behavior 
with the current approach.


- Heikki



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [PATCH] add ssl_protocols configuration option

2014-10-22 Thread Martijn van Oosterhout

On Wed, Oct 22, 2014 at 03:14:26PM +0200, Dag-Erling Smørgrav wrote:
> > In a case like POODLE we probably wouldn't have done it anyway, as it
> > doesn't affect our connections (we're not http)
> 
> If I understand correctly, imaps has been shown to be vulnerable as
> well, so I wouldn't be so sure.

Reference? It's an SSL3 specific attack, so most servers are not
vulnerable because TLS will negotiate to the highest supported
protocol.  So unless one of the client/server doesn't support TLS1.0
there's no issue.  The only reason http is vulnerable is because
browsers do protocol downgrading, something strictly forbidden by the
spec.

Additionally, the man-in-the-middle must be able to control the padding
in the startup packet, which just isn't possible (no scripting language
in the client).

Since you can already specify the cipher list, couldn't you just add
-SSLv3 to the cipher list and be done?

Have a nice day,
-- 
Martijn van Oosterhout  http://svana.org/kleptog/
> He who writes carelessly confesses thereby at the very outset that he does
> not attach much importance to his own thoughts.
   -- Arthur Schopenhauer

signature.asc
Description: Digital signature

Re: [HACKERS] idea: allow AS label inside ROW constructor

2014-10-22 Thread Pavel Stehule

2014-10-22 18:35 GMT+02:00 Merlin Moncure :

> On Wed, Oct 22, 2014 at 11:21 AM, Pavel Stehule 
> wrote:
> > Hi
> >
> > with new functions row_to_json(b), there is more often usage of ROW
> > constructor. Using names in fields is relative difficult. Because ROW has
> > special clause in parser, I am thinking so we can enable labeling inside
> ROW
> > constructor
> >
> > so instead currently supported:
> >
> > select row_to_json(r) from (select 10 as a, 20 as b) r;
> >
> > users can to write:
> >
> > select row_to_json(row(10 as a,20 as b));
> >
> > labeling will be enabled "only" inside ROW constructor. I don't propose
> > enable it everywhere.
> >
> > What do you think about it?
>
> It's a neat idea -- maybe a better alternative to what I was thinking
> here:
> http://postgresql.1045698.n5.nabble.com/Support-UPDATE-table-SET-tp5823073p5823944.html
>
> Some questions:
> *) What would the parser transformation resolve to
>

row:ROW '(' expr_list ')' { $$
= $3; }
| ROW '('
')'   { $$ = NIL; }
| '(' expr_list ',' a_expr ')'  {
$$ = lappend($2, $4); }
;

we can replace a expr_list by target_list. I know only so it doesn't
enforce a problems with gramatic  - bison doesn't raise any warning.


*) Are we ok with SQL standard
>

SQL standard doesn't think named attributes in row - so it is out of range
ANSI. But it is not in conflict with standard. "AS name" is used more in
SQL/MM, SQL/XML -- and function named parameters has different syntax
"parameter_name <= value" - I checked it against SQL99.


> *) Do you think this (or some similar variant) would work?
>
> select row_to_json(row(foo.*)) from foo;
>

It looks like independent feature and can work too - it is more natural for
user.


>
> merlin
>

Re: [HACKERS] Question about RI checks

2014-10-22 Thread Kevin Grittner

Florian Pflug  wrote:

>> This should not be considered a problem for repeatable read
>> transactions because the change in visible rows meet the
>> definition of phantom reads, which are allowed in repeatable
>> read: "A transaction re-executes a query returning a set of rows
>> that satisfy a search condition and finds that the set of rows
>> satisfying the condition has changed due to another
>> recently-committed transaction."
>
> Now I'm confused. Isn't the whole point of REPEATABLE READ to
> provide, well, repeatable reads?

What the standard requires for REPEATABLE READ is that if you
re-read the same row later in a transaction it will containt the
same values -- the read of any particular *row* is repeatable (if
it exists at both times); it does not guarantee that the same
selection criteria will return the same set of rows every time.
The standard only requires that of SERIALIZABLE transactions.

PostgreSQL has historically provided more rigorous protections at
REPEATABLE READ than the standard requires.  Our docs claim:

| When you select the level Read Uncommitted you really get Read
| Committed, and phantom reads are not possible in the PostgreSQL
| implementation of Repeatable Read, so the actual isolation level
| might be stricter than what you select. This is permitted by the
| SQL standard: the four isolation levels only define which
| phenomena must not happen, they do not define which phenomena
| must happen.

As you have shown, our FK/RI implementation exposes phantom reads
to clients, so at a minimum our docs are wrong.

> Also, note that after the DELETE FROM parent, further SELECTS in
> the same transaction will use the original snapshot again, und
> thus will see the conflicting child rows again that were ignored
> by the RI trigger. But they won't, of course, see the parent row.
>
> IOW, transaction A will, after the delete, see a state of the
> database in which the PK constraint is broken. I don't think
> that's acceptable in any isolation level.

Good point.  Based on that observation, I agree that our RI is
broken at both the REPEATABLE READ and SERIALIZABLE isolation
levels.  I think that READ COMMITTED is OK, because it will see the
child row as deleted in time to prevent problems.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Question about RI checks

2014-10-22 Thread Florian Pflug

> This should not be considered a problem for repeatable read
> transactions because the change in visible rows meet the definition
> of phantom reads, which are allowed in repeatable read: "A
> transaction re-executes a query returning a set of rows that
> satisfy a search condition and finds that the set of rows
> satisfying the condition has changed due to another
> recently-committed transaction." 

Now I'm confused. Isn't the whole point of REPEATABLE READ to provide, well, 
repeatable reads? Also, note that after the DELETE FROM parent, further SELECTS 
in the same transaction will use the original snapshot again, und thus will see 
the conflicting child rows again that were ignored by the RI trigger. But they 
won't, of course, see the parent row.

IOW, transaction A will, after the delete, see a state of the database in which 
the PK constraint is broken. I don't think that's acceptable in any isolation 
level.

Best regards,
Florian Pflug

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] idea: allow AS label inside ROW constructor

2014-10-22 Thread Merlin Moncure

On Wed, Oct 22, 2014 at 11:21 AM, Pavel Stehule  wrote:
> Hi
>
> with new functions row_to_json(b), there is more often usage of ROW
> constructor. Using names in fields is relative difficult. Because ROW has
> special clause in parser, I am thinking so we can enable labeling inside ROW
> constructor
>
> so instead currently supported:
>
> select row_to_json(r) from (select 10 as a, 20 as b) r;
>
> users can to write:
>
> select row_to_json(row(10 as a,20 as b));
>
> labeling will be enabled "only" inside ROW constructor. I don't propose
> enable it everywhere.
>
> What do you think about it?

It's a neat idea -- maybe a better alternative to what I was thinking
here: 
http://postgresql.1045698.n5.nabble.com/Support-UPDATE-table-SET-tp5823073p5823944.html

Some questions:
*) What would the parser transformation resolve to
*) Are we ok with SQL standard
*) Do you think this (or some similar variant) would work?

select row_to_json(row(foo.*)) from foo;

merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] idea: allow AS label inside ROW constructor

2014-10-22 Thread Pavel Stehule

here is a motivation, why I propose this feature

http://dba.stackexchange.com/questions/27732/set-names-to-attributes-when-creating-json-with-row-to-json

same query I have in Czech postgres users mailing list

Pavel

2014-10-22 18:21 GMT+02:00 Pavel Stehule :

> Hi
>
> with new functions row_to_json(b), there is more often usage of ROW
> constructor. Using names in fields is relative difficult. Because ROW has
> special clause in parser, I am thinking so we can enable labeling inside
> ROW constructor
>
> so instead currently supported:
>
> select row_to_json(r) from (select 10 as a, 20 as b) r;
>
> users can to write:
>
> select row_to_json(row(10 as a,20 as b));
>
> labeling will be enabled "only" inside ROW constructor. I don't propose
> enable it everywhere.
>
> What do you think about it?
>
> Regards
>
> Pavel
>
> Currently supported syntax is natural for long time PostgreSQL user, but
> it is relative strange for usual user.
>
>

[HACKERS] idea: allow AS label inside ROW constructor

2014-10-22 Thread Pavel Stehule

Hi

with new functions row_to_json(b), there is more often usage of ROW
constructor. Using names in fields is relative difficult. Because ROW has
special clause in parser, I am thinking so we can enable labeling inside
ROW constructor

so instead currently supported:

select row_to_json(r) from (select 10 as a, 20 as b) r;

users can to write:

select row_to_json(row(10 as a,20 as b));

labeling will be enabled "only" inside ROW constructor. I don't propose
enable it everywhere.

What do you think about it?

Regards

Pavel

Currently supported syntax is natural for long time PostgreSQL user, but it
is relative strange for usual user.

Re: [HACKERS] Function array_agg(array)

2014-10-22 Thread Pavel Stehule

2014-10-22 16:58 GMT+02:00 Ali Akbar :

> Thanks for the review
>
> 2014-10-22 20:51 GMT+07:00 Pavel Stehule :
>
>> I agree with your proposal. I have a few comments to design:
>>
>
>> 1. patch doesn't hold documentation and regress tests, please append it.
>>
> OK, i'll add the documentation and regression test
>
>
>> 2. this functionality (multidimensional aggregation) can be interesting
>> more times, so maybe some interface like array builder should be preferred.
>>
> We already have accumArrayResult and makeArrayResult/makeMdArrayResult,
> haven't we?
>
> Actually array_agg(anyarray) can be implemented by using accumArrayResult
> and makeMdArrayResult, but in the process we will deconstruct the array
> data and null bit-flags into ArrayBuildState->dvalues and dnulls. And we
> will reconstruct it through makeMdArrayResult. I think this way it's not
> efficient, so i decided to reimplement it and memcpy the array data and
> null flags as-is.
>

aha, so isn't better to fix a performance for accumArrayResult ?


>
> In other places, i think it's clearer if we just use accumArrayResult and
> makeArrayResult/makeMdArrayResult as API for building (multidimensional)
> arrays.
>
>
>> 3. array_agg was consistent with array(subselect), so it should be fixed
>> too
>>
>> postgres=# select array_agg(a) from test;
>>array_agg
>> ---
>>  {{1,2,3,4},{1,2,3,4}}
>> (1 row)
>>
>> postgres=# select array(select a from test);
>> ERROR:  could not find array type for data type integer[]
>>
>
> I'm pretty new in postgresql development. Can you point out where is
> array(subselect) implemented?
>

where you can start?

postgres=# explain select array(select a from test);
ERROR:  42704: could not find array type for data type integer[]
LOCATION:  exprType, nodeFuncs.c:116

look to code ... your magic keyword is a ARRAY_SUBLINK .. search in
postgresql sources this keyword


>
>
>
>> 4. why you use a magic constant (64) there?
>>
>> + astate->abytes = 64 * (ndatabytes == 0 ? 1 : ndatabytes);
>> + astate->aitems = 64 * nitems;
>>
>> + astate->nullbitmap = (bits8 *)
>> + repalloc(astate->nullbitmap, (astate->aitems + 7) / 8);
>>
>
> just follow the arbitrary size choosen in accumArrayResult (arrayfuncs.c):
> astate->alen = 64; /* arbitrary starting array size */
>
> it can be any number not too small and too big. Too small, and we will
> realloc shortly. Too big, we will end up wasting memory.
>

you can try to alloc 1KB instead as start -- it is used more times in
Postgres. Then a overhead is max 1KB per agg call - what is acceptable.

You take this value from accumArrayResult, but it is targeted for shorted
scalars - you should to expect so any array will be much larger.





> Regards,
> --
> Ali Akbar
>

Re: [HACKERS] Question about RI checks

2014-10-22 Thread Kevin Grittner

Florian Pflug  wrote:
> Florian Pflug  wrote:

> But that's wrong. The transaction's snapshot *would* see that row, so we
> ought to raise an error. Note that this applies also to mode SERIALIZABLE, and
> breaks true serializability in some cases, since we don't do conflict 
> detection
> for RI enforcement queries.
>
> Here's a test case, involving two transaction A and B. I tried this on
> REL9_4_STABLE.
>
> Setup
> -
> CREATE TABLE parent (id int NOT NULL PRIMARY KEY);
> CREATE TABLE child (id int NOT NULL PRIMARY KEY,
> parent_id int NOT NULL REFERENCES parent (id));
> INSERT INTO parent (id) VALUES (1);
> INSERT INTO child (id, parent_id) VALUES (1, 1);
>
> Failure Case
> 
> A:: set default_transaction_isolation='serializable';
> A:: begin;
> A:: select * from child;
> ->  id | parent_id
> +---
>   1 |1
> B:: set default_transaction_isolation='serializable';
> B:: begin;
> B:: delete from child;
> -> DELETE 1
> B:: commit;
> A:: delete from parent;
> -> DELETE 1
> A:: commit;
>
> A can neither come before B in any serializable history (since the DELETE
> should fail then), nor after it (since it shouldn't see the row in the child
> table then). Yet we don't complain, even though both transaction run in
> SERIALIZABLE mode.

Simplifying the display of this a bit:

  Tables parent and child each have one row.

  Transaction A
  =
  select * from child;
  [it sees the child row]
Transaction B
=
delete from child;
[child row is deleted]
  delete from parent;
  [parent row deleted]

TA sees the row that TB deletes, creating a rw-dependency that
implies that TA ran first.  On the other hand, the FK trigger
fired by the delete from parent *doesn't* see the row deleted by
TB, which implies that TB ran first.  I agree, this is a problem
for serializable transactions.

This should not be considered a problem for repeatable read
transactions because the change in visible rows meet the definition
of phantom reads, which are allowed in repeatable read: "A
transaction re-executes a query returning a set of rows that
satisfy a search condition and finds that the set of rows
satisfying the condition has changed due to another
recently-committed transaction."  Phantom reads are not *required*
to occur in repeatable read transactions, and in PostgreSQL they
generally don't, so we *might* want to change this behavior; I'm
just saying that we are conforming to requirements of the standard
even if we don't.  Leaving this alone for repeatable read
transactions would require a change to our documentation, though,
since we currently assert that we don't allow phantom reads in our
repeatable read implementation.

Either SSI needs to include the RI checking queries in its tracking
*or* TA needs to throw an error if there are child rows visible
according to its transaction snapshot -- at least in serializable
transactions.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Proposal for better support of time-varying timezone abbreviations

2014-10-22 Thread Tom Lane

Michael Meskes  writes:
> On Wed, Oct 15, 2014 at 09:50:16AM -0400, Tom Lane wrote:
>> The same thought had occurred to me.  Probably the main use of the
>> datetime parsing code in ecpg is for interpreting outputs from the
>> server, and (at least by default) the server doesn't use timezone
>> abbreviations when printing timestamps.  So maybe that's largely
>> dead code anyhow.  I would not propose back-patching such a change,
>> but we could try it in 9.5 and see if anyone complains.

> Agreed on all accounts.

>> A less drastic remedy would be to remove just those abbreviations
>> whose meaning has actually changed over time.  Eventually that
>> might be all of them ... but in the meantime, we could at least
>> argue that we weren't breaking any case that worked well before.

> This is what your patch did, right?

No, I did not touch ecpg's set of tokens at all, just changed the
representation of datetktbl to match the new backend coding.
I figured we could discuss behavioral changes separately.

I don't have a strong opinion about which of the above things to do ...
what's your preference?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Support UPDATE table SET(*)=...

2014-10-22 Thread Merlin Moncure

On Fri, Oct 17, 2014 at 10:47 AM, Merlin Moncure  wrote:
> On Fri, Oct 17, 2014 at 10:30 AM, Tom Lane  wrote:
>> Merlin Moncure  writes:
>>> On Fri, Oct 17, 2014 at 10:16 AM, Tom Lane  wrote:
 I think it wouldn't; Merlin is proposing that f would be taken as the
 field name.  A more realistic objection goes like this:

 create table foo(f int, g int);
 update foo x set x = (1,2);  -- works
 alter table foo add column x int;
 update foo x set x = (1,2,3);  -- no longer works

 It's not a real good thing if a column addition or renaming can
 so fundamentally change the nature of a query.
>>
>>> That's exactly how SELECT works.  I also dispute that the user should
>>> be surprised in such cases.
>>
>> Well, the reason we have a problem in SELECT is that we support an
>> unholy miscegenation of SQL-spec and PostQUEL syntax: the idea that
>> "SELECT foo FROM foo" could represent a whole-row selection is nowhere
>> to be found in the SQL standard, for good reason IMO.  But we've never
>> had the courage to break cleanly with this Berkeley leftover and
>> insist that people spell it SQL-style as "SELECT ROW(foo.*) FROM foo".
>> I'd just as soon not introduce the same kind of ambiguity into UPDATE
>> if we have a reasonable alternative.
>
> Ah, interesting point (I happen to like the terse syntax and use it
> often).  This is for posterity anyways since you guys seem to like
> Atri's proposal, which surprised me.  However, I think you're over
> simplifying things here.  Syntax aside: I think
> SELECT f FROM foo f;
> and a hypothetical
> SELECT row(f.*) FROM foo f;
>
> give different semantics.  The former gives an object of type 'f' and
> the latter gives type 'row'.  To get parity, you'd have to add an
> extra cast which means you'd have to play tricky games to avoid extra
> performance overhead besides being significantly more verbose.  I'm
> aware some of the other QUELisms are pretty dodgy and have burned us
> in the past (like that whole function as record reference thing) but
> pulling a record as a field in select is pretty nice.  It's also
> widely used and quite useful in json serialization.

Been thinking about this in the back of my mind the last couple of
days.  There are some things you can do with the QUEL 'table as
column' in select syntax that are impossible otherwise, at least
today, and its usage is going to proliferate because of that.  Row
construction via () or row() needs to be avoided whenever the column
names are important and there is no handy type to cast to. For
example, especially during json serialization it's convenient to do
things like:

select
  array_agg((select q from (select a, b) q) order by ...)
from foo;

...where a,b are fields of foo.  FWICT, this is the only way to do
json serialization of arbitrary runtime row constructions in a way
that does not anonymize the type.  Until I figured out this trick I
used to create lots of composite types that served no purpose other
than to give me a type to cast to which is understandably annoying.

if:
select (x).* from (select (1, 2) as x) q;

worked and properly expanded x to given names should they exist and:
SELECT row(f.*) FROM foo f;

worked and did same, and:
SELECT (row(f.*)).* FROM foo f;

was as reasonably performant and gave the same results as:
SELECT (f).* FROM foo f;

...then IMSNHO you'd have a reasonable path to deprecating the QUEL
inspired syntax.  Food for thought.

merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Function array_agg(array)

2014-10-22 Thread Ali Akbar

Thanks for the review

2014-10-22 20:51 GMT+07:00 Pavel Stehule :

> I agree with your proposal. I have a few comments to design:
>

> 1. patch doesn't hold documentation and regress tests, please append it.
>
OK, i'll add the documentation and regression test


> 2. this functionality (multidimensional aggregation) can be interesting
> more times, so maybe some interface like array builder should be preferred.
>
We already have accumArrayResult and makeArrayResult/makeMdArrayResult,
haven't we?

Actually array_agg(anyarray) can be implemented by using accumArrayResult
and makeMdArrayResult, but in the process we will deconstruct the array
data and null bit-flags into ArrayBuildState->dvalues and dnulls. And we
will reconstruct it through makeMdArrayResult. I think this way it's not
efficient, so i decided to reimplement it and memcpy the array data and
null flags as-is.

In other places, i think it's clearer if we just use accumArrayResult and
makeArrayResult/makeMdArrayResult as API for building (multidimensional)
arrays.


> 3. array_agg was consistent with array(subselect), so it should be fixed
> too
>
> postgres=# select array_agg(a) from test;
>array_agg
> ---
>  {{1,2,3,4},{1,2,3,4}}
> (1 row)
>
> postgres=# select array(select a from test);
> ERROR:  could not find array type for data type integer[]
>

I'm pretty new in postgresql development. Can you point out where is
array(subselect) implemented?



> 4. why you use a magic constant (64) there?
>
> + astate->abytes = 64 * (ndatabytes == 0 ? 1 : ndatabytes);
> + astate->aitems = 64 * nitems;
>
> + astate->nullbitmap = (bits8 *)
> + repalloc(astate->nullbitmap, (astate->aitems + 7) / 8);
>

just follow the arbitrary size choosen in accumArrayResult (arrayfuncs.c):
astate->alen = 64; /* arbitrary starting array size */

it can be any number not too small and too big. Too small, and we will
realloc shortly. Too big, we will end up wasting memory.

Regards,
-- 
Ali Akbar

Re: [HACKERS] from_collapse_limit considerations

2014-10-22 Thread Antonin Houska

[ I think I responded earlier but don't see my mail in the archive... ]

Tom Lane  wrote:
> Antonin Houska  writes:
> > I noticed that - unlike join_collapse_limit - the from_collapse_limit does 
> > not
> > enforce maximum length of the top-level list. Shouldn't it do?
>
> How would it do that?  You want it to fail outright if there are more than
> N tables?  That seems unhelpful.

Sure, truncation of the FROM list would be crazy, and that's not what I tried 
to propose.

--
Antonin Houska
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de, http://www.cybertec.at


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Proposal for better support of time-varying timezone abbreviations

2014-10-22 Thread Michael Meskes

On Wed, Oct 15, 2014 at 09:50:16AM -0400, Tom Lane wrote:
> The same thought had occurred to me.  Probably the main use of the
> datetime parsing code in ecpg is for interpreting outputs from the
> server, and (at least by default) the server doesn't use timezone
> abbreviations when printing timestamps.  So maybe that's largely
> dead code anyhow.  I would not propose back-patching such a change,
> but we could try it in 9.5 and see if anyone complains.

Agreed on all accounts.

> A less drastic remedy would be to remove just those abbreviations
> whose meaning has actually changed over time.  Eventually that
> might be all of them ... but in the meantime, we could at least
> argue that we weren't breaking any case that worked well before.

This is what your patch did, right?

Michael
-- 
Michael Meskes
Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org)
Michael at BorussiaFan dot De, Meskes at (Debian|Postgresql) dot Org
Jabber: michael.meskes at gmail dot com
VfL Borussia! Força Barça! Go SF 49ers! Use Debian GNU/Linux, PostgreSQL


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Wait free LW_SHARED acquisition - v0.2

2014-10-22 Thread Andres Freund

On 2014-10-21 19:56:05 +0530, Amit Kapila wrote:
> On Wed, Oct 8, 2014 at 6:17 PM, Andres Freund 
> wrote:
> >
> > On 2014-06-25 19:06:32 +0530, Amit Kapila wrote:
> > > 2.
> > > LWLockWakeup()
> > > {
> > > ..
> > > #ifdef LWLOCK_STATS
> > > lwstats->spin_delay_count += SpinLockAcquire(&lock->mutex);
> > > #else
> > > SpinLockAcquire(&lock->mutex);
> > > #endif
> > > ..
> > > }
> > >
> > > Earlier while releasing lock, we don't count it towards LWLock stats
> > > spin_delay_count.  I think if we see other places in lwlock.c, it only
> gets
> > > counted when we try to acquire it in a loop.
> >
> > I think the previous situation was clearly suboptimal. I've now modified
> > things so all spinlock acquirations are counted.
> 
> Code has mainly 4 stats (sh_acquire_count, ex_acquire_count,
> block_count, spin_delay_count) to track, if I try to see
> all stats together to understand the contention situation,
> the unpatched code makes sense.

I don't think it does. It completely disregards that the contention may
actually be in LWLockRelease(). That contributes to to spinlock
contention just as much as LWLockAcquire().

> spin_delay_count gives
> how much delay has happened to acquire spinlock which when
> combined with other stats gives the clear situation about
> the contention around aquisation of corresponding LWLock.
> Now if we want to count the spin lock delay for Release call
> as well, then the meaning of the stat is getting changed.
> It might be that new meaning of spin_delay_count stat is more
> useful in some situations, however the older one has its own
> benefits, so I am not sure if changing this as part of this
> patch is the best thing to do.

In which case does the old definition make sense, where the new one
doesn't? I don't think it exists.

And changing it here seems to make sense because spinlock contention
fundamentally changes it meaning for lwlocks anyway as in most paths we
don't take a spinlock anymore.

> > > 5.
> > > LWLockWakeup()
> > > {
> > > ..
> > > dlist_foreach_modify(iter, (dlist_head *) &wakeup)
> > > {
> > > PGPROC *waiter = dlist_container(PGPROC, lwWaitLink, iter.cur);
> > > LOG_LWDEBUG("LWLockRelease", l, mode, "release waiter");
> > > dlist_delete(&waiter->lwWaitLink);
> > > pg_write_barrier();
> > > waiter->lwWaiting = false;
> > > PGSemaphoreUnlock(&waiter->sem);
> > > }
> > > ..
> > > }
> > >
> > > Why can't we decrement the nwaiters after waking up? I don't think
> > > there is any major problem even if callers do that themselves, but
> > > in some rare cases LWLockRelease() might spuriously assume that
> > > there are some waiters and tries to call LWLockWakeup().  Although
> > > this doesn't create any problem, keeping the value sane is good unless
> > > there is some problem in doing so.
> >
> > That won't work because then LWLockWakeup() wouldn't be called when
> > necessary - precisely because nwaiters is 0.
> 
> > The reason I've done so is that it's otherwise much harder to debug
> > issues where there are backend that have been woken up already, but
> > haven't rerun yet. Without this there's simply no evidence of that
> > state. I can't see this being relevant for performance, so I'd rather
> > have it stay that way.
> 
> I am not sure what useful information we can get during debugging by not
> doing this in LWLockWakeup()

It's useful because you can detect backends that have been scheduled to
acquire the lock, but haven't yet. They're otherwise "invisible".

> and w.r.t performance it can lead extra
> function call, few checks and I think in some cases even can
> acquire/release spinlock.

I fail to see how that could be the case. And again, this is code that's
only executed around a couple syscalls. And the cacheline will be
touched around there *anyway*.

> > > 6.
> > > LWLockWakeup()
> > > {
> > > ..
> > > dlist_foreach_modify(iter, (dlist_head *) &l->waiters)
> > > {
> > > ..
> > > if (wokeup_somebody && waiter->lwWaitMode == LW_EXCLUSIVE)
> > > continue;
> > > ..
> > > if (waiter->lwWaitMode != LW_WAIT_UNTIL_FREE)
> > > {
> > > ..
> > > wokeup_somebody = true;
> > > }
> > > ..
> > > }
> > > ..
> > > }
> > >
> > > a.
> > > IIUC above logic, if the waiter queue is as follows:
> > > (S-Shared; X-Exclusive) S X S S S X S S
> > >
> > > it can skip the exclusive waiters and release shared waiter.
> > >
> > > If my understanding is right, then I think instead of continue, there
> > > should be *break* in above logic.
> >
> > No, it looks correct to me. What happened is that the first S was woken
> > up. So there's no point in waking up an exclusive locker, but further
> > non-exclusive lockers can be woken up.
> 
> Okay, even then it makes the current logic of wakingup
> different which I am not sure is what this patch is intended
> for.

It's already done in a separate patch...

> > > b.
> > > Consider below sequence of waiters:
> > > (S-Shared; X-Exclusive) S S X S S
> > >
> > > I think as per un-patched code, it will wakeup waiters uptill
> (includin

Re: [HACKERS] Reducing lock strength of adding foreign keys

2014-10-22 Thread Tom Lane

Andreas Karlsson  writes:
> I have attached a proof of concept 
> patch which reduces the lock strength to ShareLock.

You're kidding, right?  ShareLock isn't even self-exclusive.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Function array_agg(array)

2014-10-22 Thread Pavel Stehule

Hi

2014-10-19 8:02 GMT+02:00 Ali Akbar :

> So, is there any idea how we will handle NULL and empty array in
>> array_agg(anyarray)?
>> I propose we just reject those input because the output will make no
>> sense:
>> - array_agg(NULL::int[]) --> the result will be indistinguished from
>> array_agg of NULL ints.
>> - array_agg('{}'::int[]) --> how we determine the dimension of the
>> result? is it 0? Or the result will be just an empty array {} ?
>>
>
> This updated patch rejects NULL and {} arrays as noted above.
>
>

I agree with your proposal. I have a few comments to design:

1. patch doesn't hold documentation and regress tests, please append it.

2. this functionality (multidimensional aggregation) can be interesting
more times, so maybe some interface like array builder should be preferred.

3. array_agg was consistent with array(subselect), so it should be fixed too

postgres=# select array_agg(a) from test;
   array_agg
---
 {{1,2,3,4},{1,2,3,4}}
(1 row)

postgres=# select array(select a from test);
ERROR:  could not find array type for data type integer[]

4. why you use a magic constant (64) there?

+ astate->abytes = 64 * (ndatabytes == 0 ? 1 : ndatabytes);
+ astate->aitems = 64 * nitems;

+ astate->nullbitmap = (bits8 *)
+ repalloc(astate->nullbitmap, (astate->aitems + 7) / 8);

Regards

Pavel



> Regards,
> --
> Ali Akbar
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>
>

Re: [HACKERS] pg_receivexlog --status-interval add fsync feedback

2014-10-22 Thread Simon Riggs

On 22 October 2014 14:26, Heikki Linnakangas  wrote:

> We seem to be going in circles. You suggested having two options,
> --feedback, and --fsync, which is almost exactly what Furuya posted
> originally. I objected to that, because I think that user interface is too
> complicated. Instead, I suggested having just a single option called
> --synchronous, or even better, have no option at all and have the server
> tell the client if it's participating in synchronous replication, and have
> pg_receivexlog automatically fsync when it is, and not otherwise [1]. That
> way you don't need to expose any new options to the user. What did you think
> of that idea?

Sorry, if we're going in circles.

Yes, I like the idea.

The master setting of synchronous_standby_names defines which
standbys/rep clients will have their feedback used to release waiting
COMMITs. That uses application_name, which is set at the replication
client connection. (That has a default value, but lets assume the user
sets this). So when a replication client connects, the WALSender knows
its priority, as defined in sync_standby_names.

I guess we can have WALSender send a protocol message to the client
every time the sync priority changes (as a result of a changed value
of sync_standby_names). Which is the function SyncRepInitConfig()

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Wait free LW_SHARED acquisition - v0.2

2014-10-22 Thread Andres Freund

On 2014-10-22 13:32:07 +0530, Amit Kapila wrote:
> On Tue, Oct 21, 2014 at 7:56 PM, Amit Kapila 
> wrote:
> > On Wed, Oct 8, 2014 at 6:17 PM, Andres Freund 
> wrote:
> > >
> > > On 2014-06-25 19:06:32 +0530, Amit Kapila wrote:
> 
> Today, I have verified all previous comments raised by
> me and looked at new code and below are my findings:
> 
> >>
> >> 4.
> >> LWLockAcquireCommon()
> >> {
> >> ..
> >> if (!LWLockDequeueSelf(l))
> >> {
> >>  for (;;)
> >> {
> >> PGSemaphoreLock(&proc->sem, false);
> >>  if (!proc->lwWaiting)
> >> break;
> >> extraWaits++;
> >>  }
> >> lock->releaseOK = true;
> >> ..
> >> }
> >>
> >> Setting releaseOK in above context might not be required  because if the
> >> control comes in this part of code, it will not retry to acquire another
> >> time.
> 
> > Hm. You're probably right.
> 
> You have agreed to fix this comment, but it seems you have forgot
> to change it.

After I've thought more about it, it's is actually required. This
essentially *is* a retry. Someobdy woke us up, which is where releaseOK
is supposed to be set.

> >> 11.
> >> LWLockRelease()
> >> {
> >> ..
> >> PRINT_LWDEBUG("LWLockRelease", lock, mode);
> >> }
> >>
> >> Shouldn't this be in begining of LWLockRelease function rather than
> >> after processing held_lwlocks array?
> 
> > Ok.
> 
> You have agreed to fix this comment, but it seems you have forgot
> to change it.



> Below comment doesn't seem to be adressed?
> 
> > LWLockAcquireOrWait()
> > {
> > ..
> > LOG_LWDEBUG("LWLockAcquireOrWait", lockid, mode, "succeeded");
> > ..
> > }
> 
> > a. such a log is not there in any other LWLock.. variants,
> >  if we want to introduce it, then shouldn't it be done at
> >  other places as well.

I think you're placing unneccessarily high consistency constraints on a
debugging feature here.

> Below point is yet to be resolved.
> 
> > > 12.
> > > #ifdef LWLOCK_DEBUG
> > > lock->owner = MyProc;
> > > #endif
> > >
> > > Shouldn't it be reset in LWLockRelease?
> >
> > That's actually intentional. It's quite useful to know the last owner
> > when debugging lwlock code.
> 
> Won't it cause any problem if the last owner process exits?

No. PGPROCs aren't deallocated or anything. And it's a debugging only
variable.

> Can you explain how pg_read_barrier() in below code makes this
> access safe?
> 
> LWLockWakeup()
> {
> ..
> + pg_read_barrier(); /* pairs with nwaiters-- */
> + if (!BOOL_ACCESS_ONCE(lock->releaseOK))
> ..
> }

What's the concern you have? Full memory barriers (the atomic
nwaiters--) pair with read memory barriers.

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] compress method for spgist

2014-10-22 Thread Teodor Sigaev

When we developed SP-GiST we missed analogue of GiST's compress method. There 
was two reasons for that: lack of imagination to imagine case with different 
types of indexed value and column, and we didn't want call some method while 
index fit in one page. Some discussion on that

http://www.postgresql.org/message-id/27542.1323534...@sss.pgh.pa.us
http://www.postgresql.org/message-id/4ee77ea1.6030...@sigaev.ru

But we was wrong: PostGIS guys found an example: polygon indexing with storing 
just a bounding box. Actually, I don't know index structure for boxes suitable 
for SP-GiST but I'm not a geometer. They are.


Attached patch provides support of optional compress method for SP-GiST.

--
Teodor Sigaev   E-mail: teo...@sigaev.ru
   WWW: http://www.sigaev.ru/


spgist_compress_method-1.patch.gz
Description: GNU Zip compressed data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [PATCH] add ssl_protocols configuration option

2014-10-22 Thread Michael Paquier

On Wed, Oct 22, 2014 at 3:12 PM, Dag-Erling Smørgrav  wrote:

> Tom Lane  writes:
> > This looks to me like re-fighting the last war.  Such a GUC has zero
> value
> > *unless* some situation exactly like the POODLE bug comes up again, and
> > the odds of that are not high.
>
> Many people would have said the exact same thing before POODLE, and
> BEAST, and CRIME, and Heartbleed.  You never know what sort of bugs or
> weaknesses will show up or when; all you know is that there are a lot of
> people working very hard to find these things and exploit them, and that
> they *will* succeeded, again and again and again.  You can gamble that
> PostgreSQL will not be vulnerable due to specific details of its
> protocol or how it uses TLS, but that's a gamble which you will
> eventually lose.
>
There are some companies, without naming them, that have increased the
resources dedicated to analyze existing security protocols and libraries,
so even if the chances are low, IMO the occurence that similar problems
show up are getting to increase wit the time.


> > Moreover, the GUC could easily be misused to decrease rather than
> increase
> > one's security, if it's carelessly set.
>
> That's the user's responsibility.
>
I actually just had a user knocking about having a way to control the
protocols used by server... So, changing my opinion on the matter, that
would be nice to have even such a parameter on back-branches, with
'default' to let the server decide which one is better.
-- 
Michael

Re: [HACKERS] pg_receivexlog --status-interval add fsync feedback

2014-10-22 Thread Heikki Linnakangas


On 10/17/2014 01:59 PM, Simon Riggs wrote:

On 17 October 2014 09:55,   wrote:


A new parameter to send feedback should be called --feedback



A second parameter to decide whether to fsync should be called --fsync


I think keep using "--reply-fsync" and "--fsync-interval" is better than make 
new options.
Thought?


We already have hot_standby_feedback, so using the name feedback is best idea.

I am suggesting that we send feedback even if we do not fsync, to
allow the master to track our progress. Hence the name of the second
parameter was just fsync.

So both names were suggested because of links to those terms already
being used for similar reasons elsewhere in Postgres.


We seem to be going in circles. You suggested having two options, 
--feedback, and --fsync, which is almost exactly what Furuya posted 
originally. I objected to that, because I think that user interface is 
too complicated. Instead, I suggested having just a single option called 
--synchronous, or even better, have no option at all and have the server 
tell the client if it's participating in synchronous replication, and 
have pg_receivexlog automatically fsync when it is, and not otherwise 
[1]. That way you don't need to expose any new options to the user. What 
did you think of that idea?


[1] http://www.postgresql.org/message-id/5434e0ef.9050...@vmware.com

- Heikki



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [PATCH] add ssl_protocols configuration option

2014-10-22 Thread Dag-Erling Smørgrav

Magnus Hagander  writes:
> Yes, it does that. Though it only does it on 9.4,but with the facts we
> know now, what 9.4+ does is perfectly safe.

Agreed.

DES
-- 
Dag-Erling Smørgrav - d...@des.no


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [PATCH] add ssl_protocols configuration option

2014-10-22 Thread Dag-Erling Smørgrav

Tom Lane  writes:
> As far as protocol version goes, I think our existing coding basically
> says "prefer newest available version, but at least TLS 1.0".  I think
> that's probably a reasonable approach.

The client side forces TLS 1.0:

SSL_context = SSL_CTX_new(TLSv1_method());

In typical OpenSSL fashion, this does *not* mean 1.0 or higher.  It
means 1.0 exactly.

> If the patch exposed a GUC that set a "minimum" version, rather than
> calling out specific acceptable protocols, it might be less risky.

Not necessarily.  Someone might find a weakness in TLS 1.1 which is not
present in 1.0 because it involves a specific algorithm or mode that 1.0
does not support.

DES
-- 
Dag-Erling Smørgrav - d...@des.no

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] BUG: *FF WALs under 9.2 (WAS: .ready files appearing on slaves)

2014-10-22 Thread Michael Paquier

On Fri, Oct 17, 2014 at 2:23 PM, Fujii Masao  wrote:

> I found one problem in the 0002 patch. The patch changes the recovery so
> that
> it creates .done files for every WAL files which exist in pg_xlog
> directory at
> the end of recovery. But even WAL files which will have to be archived
> later
> can exist in pg_xlog at that moment. For example, the latest, recycled and
> fully-written-but-not-archived-yet (i.e., maybe having .ready files) WAL
> files.
> The patch wrongly prevents them from being archived at all.
>
Re-looking at patch 2, yes you are right. Even if it was mentioned that we
should do that for a node that had promotion triggered it was not done this
way as a check on CheckForStandbyTrigger() is actually missing. Attached is
an updated patch.


> ISTM that the 0001 patch has the similar problem. Please imagine the
> following
> scenario.
> In this case, the patch seems to make the restartpoint recycle even WAL
> files
> which have .ready files and will have to be archived later. Thought?
>
Right, that's really backward. This was the approach taken before c9cc7e0,
and this commit actually prevents removal of unarchived WAL files during
recovery.
-- 
Michael
From fdf1f4a46a2f870ee2aeda11f909c465ed12450a Mon Sep 17 00:00:00 2001
From: Michael Paquier 
Date: Wed, 22 Oct 2014 15:18:33 +0200
Subject: [PATCH] Enforce all WAL segment files to be marked as .done at  node
 promotion

This is a safety mechanism to ensure that there are no files that are not
considered as .done as some segments may have been missed particularly in
the case of partially written files or files being written when a disconnection
occurred between a streaming standby and its root node. This makes the node
reaching promotion having a state consistent with what is expected using
the assumption that all the WAL segment files that are done being streamed
should be always considered as archived by the node.
---
 src/backend/access/transam/xlog.c| 10 
 src/backend/access/transam/xlogarchive.c | 39 +++-
 src/include/access/xlog_internal.h   |  1 +
 3 files changed, 49 insertions(+), 1 deletion(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 235b442..7179003 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -6862,6 +6862,16 @@ StartupXLOG(void)
 	}
 
 	/*
+	 * Create a .done entry for each WAL file present in pg_xlog that has
+	 * not been yet marked as such for a node that has been promoted. In some
+	 * cases where for example a streaming replica has had a connection to a
+	 * remote node cut abruptly, such WAL files may have been only partially
+	 * written or even not flagged correctly with .done.
+	 */
+	if (InRecovery && CheckForStandbyTrigger())
+		XLogArchiveForceDoneAll();
+
+	/*
 	 * Kill WAL receiver, if it's still running, before we continue to write
 	 * the startup checkpoint record. It will trump over the checkpoint and
 	 * subsequent records if it's still alive when we start writing WAL.
diff --git a/src/backend/access/transam/xlogarchive.c b/src/backend/access/transam/xlogarchive.c
index 047efa2..931106f 100644
--- a/src/backend/access/transam/xlogarchive.c
+++ b/src/backend/access/transam/xlogarchive.c
@@ -554,6 +554,40 @@ XLogArchiveNotifySeg(XLogSegNo segno)
 }
 
 /*
+ * XLogArchiveForceDoneAll
+ *
+ * Wrapper of XLogArchiveForceDone scanning all the XLOG segments files in
+ * XLOGDIR, switching them forcibly to .done.
+ */
+void
+XLogArchiveForceDoneAll(void)
+{
+	DIR			   *xldir;
+	struct dirent  *xlde;
+
+
+	xldir = AllocateDir(XLOGDIR);
+	if (xldir == NULL)
+		ereport(ERROR,
+(errcode_for_file_access(),
+ errmsg("could not open transaction log directory \"%s\": %m",
+		XLOGDIR)));
+
+	/*
+	 * Scan all the WAL segments present in the archives and switch them to
+	 * .done.
+	 */
+	while ((xlde = ReadDir(xldir, XLOGDIR)) != NULL)
+	{
+		if (strlen(xlde->d_name) == 24 &&
+			strspn(xlde->d_name, "0123456789ABCDEF") == 24)
+			XLogArchiveForceDone(xlde->d_name);
+	}
+
+	FreeDir(xldir);
+}
+
+/*
  * XLogArchiveForceDone
  *
  * Emit notification forcibly that an XLOG segment file has been successfully
@@ -582,7 +616,6 @@ XLogArchiveForceDone(const char *xlog)
 	(errcode_for_file_access(),
 	 errmsg("could not rename file \"%s\" to \"%s\": %m",
 			archiveReady, archiveDone)));
-
 		return;
 	}
 
@@ -604,6 +637,10 @@ XLogArchiveForceDone(const char *xlog)
 		archiveDone)));
 		return;
 	}
+
+	ereport(DEBUG2,
+			(errmsg("archive status file of \"%s\" has been forcibly switched from \"%s\" to \"%s\"",
+	xlog, archiveReady, archiveDone)));
 }
 
 /*
diff --git a/src/include/access/xlog_internal.h b/src/include/access/xlog_internal.h
index 27b9899..04a0de3 100644
--- a/src/include/access/xlog_internal.h
+++ b/src/include/access/xlog_internal.h
@@ -286,6 +286,7 @@ extern void ExecuteRecoveryCommand(char *command, char *commandName,
 extern void KeepFileRe

Re: [HACKERS] [PATCH] add ssl_protocols configuration option

2014-10-22 Thread Dag-Erling Smørgrav

Tom Lane  writes:
> And in the end, if we set values like this from PG --- whether
> hard-wired or via a GUC --- the SSL library people will have exactly
> the same perspective with regards to *our* values.  And not without
> reason; we were forcing very obsolete settings up till recently,
> because nobody had looked at the issue for a decade.  I see no reason
> to expect that that history won't repeat itself.

I'm not sure what you're saying here, but - I'm not sure how familiar
you are with the OpenSSL API, but it's insecure by default.  There is
*no other choice* for an application than to explicitly select which
protocols it wants to use (or at least which protocols it wants to
avoid).  And you can't change OpenSSL, because a ton of old crappy
software is going to break.

DES
-- 
Dag-Erling Smørgrav - d...@des.no

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] speedup tidbitmap patch: hash BlockNumber

2014-10-22 Thread Teodor Sigaev

Just replace tag_hash in tidbitmap which uses hash_any to direct call of 
hash_uint32, it saves ~5% of execution time.


An example:
# create extension btree_gin;
# select (v / 10)::int4 as i into t from generate_series(1, 500) as v;
# create index idx on t using gin (i);
# set enable_seqscan  = off;

# explain analyze select * from t where i >= 0;
without patch: Execution time: 2427.692 ms
with patch:Execution time: 2319.376 ms

# explain analyze select * from t where i >= 100 and i<= 100;
without patch: Execution time: 524.441 ms
with patch:Execution time: 504.478 ms
--
Teodor Sigaev   E-mail: teo...@sigaev.ru
   WWW: http://www.sigaev.ru/


tbm_blocknumber-2.patch.gz
Description: GNU Zip compressed data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [PATCH] add ssl_protocols configuration option

2014-10-22 Thread Dag-Erling Smørgrav

Magnus Hagander  writes:
> If anything, I think the default should be "default", and then we have
> that map out to something. Because once you've initdb'ed, the config
> file wil be stuck with a default and we can't change that in a minor
> release *if* something like POODLE shows up again.

Actually, I had that in an earlier version of the patch, but I thought
it was too obscure / circular.  I can easily re-add it.

> In a case like POODLE we probably wouldn't have done it anyway, as it
> doesn't affect our connections (we're not http)

If I understand correctly, imaps has been shown to be vulnerable as
well, so I wouldn't be so sure.

> Having the guc could certainly be useful in some cases. If we do, we
> should of course *also* have a corresponding configuration option in
> libpq, so I'd say this patch is incomplete if we do want to do it.

I can update the patch to include the client side.

DES
-- 
Dag-Erling Smørgrav - d...@des.no


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [PATCH] add ssl_protocols configuration option

2014-10-22 Thread Dag-Erling Smørgrav

Tom Lane  writes:
> This looks to me like re-fighting the last war.  Such a GUC has zero value
> *unless* some situation exactly like the POODLE bug comes up again, and
> the odds of that are not high.

Many people would have said the exact same thing before POODLE, and
BEAST, and CRIME, and Heartbleed.  You never know what sort of bugs or
weaknesses will show up or when; all you know is that there are a lot of
people working very hard to find these things and exploit them, and that
they *will* succeeded, again and again and again.  You can gamble that
PostgreSQL will not be vulnerable due to specific details of its
protocol or how it uses TLS, but that's a gamble which you will
eventually lose.

> Moreover, the GUC could easily be misused to decrease rather than increase
> one's security, if it's carelessly set.

That's the user's responsibility.

DES
-- 
Dag-Erling Smørgrav - d...@des.no

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Obsolete reference to _bt_tuplecompare() within tuplesort.c

2014-10-22 Thread Heikki Linnakangas


On 10/19/2014 11:29 AM, Peter Geoghegan wrote:

On Fri, Oct 10, 2014 at 12:33 AM, Peter Geoghegan  wrote:

Attached


Have you looked at this?


Committed now, thanks.

- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [Windows,PATCH] Use faster, higher precision timer API

2014-10-22 Thread Craig Ringer

Here's an updated patch addressing David's points.

I haven't had a chance to test it yet, on win2k8 or win2k12 due to
pgconf.eu .


-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
>From d5cd8c82bdc3caa9d18586eac8e4e083a2729d61 Mon Sep 17 00:00:00 2001
From: Craig Ringer 
Date: Thu, 18 Sep 2014 23:02:14 +0800
Subject: [PATCH 2/2] Use GetSystemTimePreciseAsFileTime when available

This will cause PostgreSQL on Windows 8 or Windows Server 2012 to
obtain high-resolution timestamps while allowing the same binaries
to run without problems on older releases.
---
 src/backend/main/main.c |  6 +
 src/include/port.h  |  2 ++
 src/port/gettimeofday.c | 59 +++--
 3 files changed, 65 insertions(+), 2 deletions(-)

diff --git a/src/backend/main/main.c b/src/backend/main/main.c
index c51b391..73c30c5 100644
--- a/src/backend/main/main.c
+++ b/src/backend/main/main.c
@@ -260,6 +260,12 @@ startup_hacks(const char *progname)
 
 		/* In case of general protection fault, don't show GUI popup box */
 		SetErrorMode(SEM_FAILCRITICALERRORS | SEM_NOGPFAULTERRORBOX);
+
+#ifndef HAVE_GETTIMEOFDAY
+		/* Figure out which syscall to use to capture timestamp information */
+		init_win32_gettimeofday();
+#endif
+
 	}
 #endif   /* WIN32 */
 
diff --git a/src/include/port.h b/src/include/port.h
index 9f8465e..4f8af0a 100644
--- a/src/include/port.h
+++ b/src/include/port.h
@@ -328,6 +328,8 @@ extern FILE *pgwin32_popen(const char *command, const char *type);
 #ifndef HAVE_GETTIMEOFDAY
 /* Last parameter not used */
 extern int	gettimeofday(struct timeval * tp, struct timezone * tzp);
+/* On windows we need to call some backend start setup for accurate timing */
+extern void init_win32_gettimeofday(void);
 #endif
 #else			/* !WIN32 */
 
diff --git a/src/port/gettimeofday.c b/src/port/gettimeofday.c
index 73ec406..b8871d9 100644
--- a/src/port/gettimeofday.c
+++ b/src/port/gettimeofday.c
@@ -30,14 +30,69 @@
 
 #include 
 
+#ifndef FRONTEND
+#include 
+#endif
+
 
 /* FILETIME of Jan 1 1970 00:00:00. */
 static const unsigned __int64 epoch = UINT64CONST(1164447360);
 
 /*
+ * Both GetSystemTimeAsFileTime and GetSystemTimePreciseAsFileTime share a
+ * signature, so we can just store a pointer to whichever we find. This
+ * is the pointer's type.
+ */
+typedef VOID (WINAPI *PgGetSystemTimeFn)(LPFILETIME);
+/* Storage for the function we pick at runtime */
+static PgGetSystemTimeFn pg_get_system_time = NULL;
+
+/*
+ * During backend startup, determine if GetSystemTimePreciseAsFileTime is
+ * available and use it; if not, fall back to GetSystemTimeAsFileTime.
+ */
+void
+init_win32_gettimeofday(void)
+{
+	/*
+	 * Because it's guaranteed that kernel32.dll will be linked into our
+	 * address space already, we don't need to LoadLibrary it and worry about
+	 * closing it afterwards, so we're not using Pg's dlopen/dlsym() wrapper.
+	 *
+	 * We'll just look up the address of GetSystemTimePreciseAsFileTime if
+	 * present.
+	 *
+	 * While we could look up the Windows version and skip this on Windows
+	 * versions below Windows 8 / Windows Server 2012 there isn't much point,
+	 * and determining the windows version is its self somewhat Windows version
+	 * and development SDK specific...
+	 */
+	pg_get_system_time = (PgGetSystemTimeFn) GetProcAddress(
+			GetModuleHandle(TEXT("kernel32.dll")),
+"GetSystemTimePreciseAsFileTime");
+	if (pg_get_system_time == NULL)
+	{
+		/*
+		 * The expected error is ERROR_PROC_NOT_FOUND, if the function isn't
+		 * present. No other error should occur.
+		 *
+		 * It's too early in startup to elog(...) if we get some unexpected
+		 * error, and not serious enough to warrant a fprintf to stderr about
+		 * it or save the error and report it later. So silently fall back
+		 * to GetSystemTimeAsFileTime unless we're on a cassert build.
+		 */
+		DWORD errcode = GetLastError();
+		Assert(errcode == ERROR_PROC_NOT_FOUND);
+
+		pg_get_system_time = &GetSystemTimeAsFileTime;
+	}
+
+}
+
+/*
  * timezone information is stored outside the kernel so tzp isn't used anymore.
  *
- * Note: this function is not for Win32 high precision timing purpose. See
+ * Note: this function is not for Win32 high precision timing purposes. See
  * elapsed_time().
  */
 int
@@ -46,7 +101,7 @@ gettimeofday(struct timeval * tp, struct timezone * tzp)
 	FILETIME	file_time;
 	ULARGE_INTEGER ularge;
 
-	GetSystemTimeAsFileTime(&file_time);
+	(*pg_get_system_time)(&file_time);
 	ularge.LowPart = file_time.dwLowDateTime;
 	ularge.HighPart = file_time.dwHighDateTime;
 
-- 
1.9.3

>From 810abe48ca9bc6dfa0c4422052ea7817e0fb256a Mon Sep 17 00:00:00 2001
From: Craig Ringer 
Date: Fri, 12 Sep 2014 12:41:35 +0800
Subject: [PATCH 1/2] Use GetSystemTimeAsFileTime directly in windows
 gettimeofday
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

PostgreSQL was calling GetSystemTime

Re: [HACKERS] btree_gin and ranges

2014-10-22 Thread Teodor Sigaev


Shouldn't this be implemented in a more generic manner? An ordinary
Unfortunately I don't see a way for that. GIN is generalized - and it doesn't 
know semantic. Actually, we could fix first five strategy numbers for BTree's 
strategies, and then we could teach GIN core to use BTRee semantic, but what 
about with already existed operator classes?




The documentation describes btree-gin as providing "GIN operator
classes that implement B-tree equivalent behavior", but now the
behavior diverges.


Anyway GIN couldn't be used for ORDER BY clause.


--
Teodor Sigaev   E-mail: teo...@sigaev.ru
   WWW: http://www.sigaev.ru/


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] BUG: *FF WALs under 9.2 (WAS: .ready files appearing on slaves)

2014-10-22 Thread Heikki Linnakangas


On 10/20/2014 09:26 AM, Michael Paquier wrote:

On Fri, Oct 17, 2014 at 10:37 PM, Michael Paquier
 wrote:


On Fri, Oct 17, 2014 at 9:23 PM, Fujii Masao  wrote:


In this case, the patch seems to make the restartpoint recycle even WAL files
which have .ready files and will have to be archived later. Thought?


The real problem currently is that it is possible to have a segment file not 
marked as .done during recovery when stream connection is abruptly cut when 
this segment is switched, marking it as .ready in archive_status and simply 
letting this segment in pg_xlog because it will neither be recycled nor 
removed. I have not been able to look much at this code these days, so I am not 
sure how invasive it would be in back-branches, but perhaps we should try to 
improve code such as when a segment file is switched and connection to the is 
cut, we guarantee that this file is completed and marked as .done.


I have spent more time on that, with a bit more of underground work...
First, the problem can be reproduced most of the time by running this
simple command:
psql -c 'select pg_switch_xlog()'; pg_ctl restart -m immediate

This will enforce a segment file switch and restart the master in
crash recovery. This has as effect to immediately cut the WAL stream
on slave, symbolized by a FATAL in libpqrcv_receive where rawlen == 0.
For example, let's imagine that stream fails when switching from
00010003 to the next segment, then the
the last XLogRecPtr in XLogWalRcvProcessMsg for dataStart is for
example 0/310, and walrcv->latestWalEnd is 0/400. When stream
restarts it will begin once again from 0/400, ignoring that
00010003 should be marked as .done, ultimately marking
it in .ready state when old segment files are recycled or removed.
There is nothing that can really be done to enforce the creation of a
.done file before the FATAL of libpqrcv_receive because we cannot
predict the stream failure..

Now, we can do better than what we have now by looking at WAL start
position used when starting streaming in WAL receiver and enforce
.done if the start position is the last one of previous segment.
Hence, in the case of start position 0/400 that was found
previously, the file that will be enforced to .done is
00010003. I have written the patch attached that
implements this idea and fixes the problem. Now let's see if you guys
see any flaws in this simple logic which uses a sniper gun instead of
a bazooka as in the previous patches sent.


Hmm. This will still miss the .done file if you don't re-establish the 
streaming replication connection after the restart. For example, if you 
shut down the master, and promote the standby server.


I think we should take a more wholesale approach to this. We should 
enforce the rule that the server only ever archives WAL files belonging 
to the same timeline that the server generates. IOW, the server only 
archives the WAL that it has generated.


- Heikki



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] speedup tidbitmap patch: cache page

2014-10-22 Thread Teodor Sigaev

In specific workload postgres could spend a lot of time in tbm_add_tuples,  up 
to 50% of query time. hash_search call is expensive and called twice for each 
ItemPointer to insert. Suggested patch tries to cache last PagetableEntry 
pointer in hope that next ItemPointer will be on the same relation page.


Patch is rather trivial and I don't believe that it could cause performance 
degradation. But it could increase performance on some workload.


An example:
# create extension btree_gin;
# select (v / 10)::int4 as i into t from generate_series(1, 500) as v;
# create index idx on t using gin (i);
# set enable_seqscan  = off;

# explain analyze select * from t where i >= 0;
without patch: Execution time: 2427.692 ms
with patch:Execution time: 2058.718 ms

# explain analyze select * from t where i >= 100 and i<= 100;
without patch: Execution time: 524.441 ms
with patch:Execution time: 195.381 ms
--
Teodor Sigaev   E-mail: teo...@sigaev.ru
   WWW: http://www.sigaev.ru/


tbm_cachepage-1.patch.gz
Description: GNU Zip compressed data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] btree_gin and ranges

2014-10-22 Thread Marti Raudsepp

Hi

On Wed, Oct 22, 2014 at 1:55 PM, Teodor Sigaev  wrote:
> With patch it's possible to rewrite query with ranges
> SELECT * FROM test_int4 WHERE i <@ '[-1, 1]'::int4range
> and GIN index will support this query with single scan from -1 to 1.

Shouldn't this be implemented in a more generic manner? An ordinary
btree index could very well support <@ queries too, but your patch
only adds this capability to btree-gin.

The documentation describes btree-gin as providing "GIN operator
classes that implement B-tree equivalent behavior", but now the
behavior diverges.

Regards,
Marti

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] btree_gin and ranges

2014-10-22 Thread Teodor Sigaev


Suggested patch adds GIN support contains operator for ranges over scalar 
column.

It allows more effective GIN scan. Currently, queries like
SELECT * FROM test_int4 WHERE i <= 1 and i >= 1
will be excuted by GIN with two scans: one is from mines infinity to 1 and 
another is from -1 to plus infinity. That's because GIN is "generalized" and it 
doesn't know a semantics of operation.


With patch it's possible to rewrite query with ranges
SELECT * FROM test_int4 WHERE i <@ '[-1, 1]'::int4range
and GIN index will support this query with single scan from -1 to 1.

Patch provides index support only for existing range types: int4, int8, numeric, 
date and timestamp with and without time zone.




--
Teodor Sigaev   E-mail: teo...@sigaev.ru
   WWW: http://www.sigaev.ru/


btree_gin_range-1.patch.gz
Description: GNU Zip compressed data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [PATCH] Simplify EXISTS subqueries containing LIMIT

2014-10-22 Thread David Rowley

On Tue, Oct 21, 2014 at 11:00 PM, Marti Raudsepp  wrote:

> On Sun, Oct 19, 2014 at 1:22 PM, David Rowley 
> wrote:
> > the argument for this would
> > have been much stronger if anti join support had just been added last
> week.
> > It's been quite a few years now and the argument for this must be getting
> > weaker with every release.
>
> I see your point, but I would put it another way: we've had this for a
> few years, but people haven't learned and are *still* using LIMIT 1.
>
>
I've had a bit of a look at this and here's a couple of things:

/*
+* LIMIT clause can be removed if it's a positive constant or ALL,
to
+* prevent it from being an optimization barrier. It's a common
meme to put
+* LIMIT 1 within EXISTS subqueries.
+*/

I think this comment may be better explained along the lines of:

"A subquery which has a LIMIT clause with a positive value is effectively a
no-op in this scenario. With EXISTS we only care about the first row
anyway, so any positive limit value will have no behavioral change to the
query, so we'll simply remove the LIMIT clause. If we're unable to prove
that the LIMIT value is a positive number then we'd better not touch it."

+ /* Checking for negative values is done later; 0 is just silly */
+ if (! limit->constisnull && DatumGetInt64(limit->constvalue) <= 0)

I'm a bit confused by the comment here. You're saying that we'll check for
negatives later... but you're checking for <= 0 on the next line... Did I
miss something or is this a mistake?

This test:

+select * from int4_tbl o where exists (select 1 limit 0);
+ f1
+
+(0 rows)

I guess here you're just testing to ensure that you've not broken this and
that the new code does not accidentally remove the LIMIT clause. I think it
also might be worth adding some tests to ensure that co-related EXISTS
clauses with a positive LIMIT value get properly changed into a SEMI JOIN
instead of remaining as a subplan. And also make sure that a LIMIT 0 or
LIMIT -1 gets left as a subplan. I'd say such a test would supersede the
above test, and I think it's also the case you were talking about
optimising anyway?

You can use EXPLAIN (COSTS OFF) to get a stable explain output.

Regards

David Rowley

Re: [HACKERS] Question about RI checks

2014-10-22 Thread Florian Pflug

Florian Pflug  wrote:
> So in conclusion, the lock avoids raising constraint violation errors in
> a few cases in READ COMMITTED mode. In REPEATABLE READ mode, it converts some
> constraint violation errors into serialization failures. Or at least that's
> how it looks to me.

I go the REPEATABLE READ case wrong there, and that case is actually broken!.
I initially assumed that trying to lock a row whose latest version is invisible
a transaction in mode REPEATABLE READ or above will result in a serialization
error. But for the queries run by ri_triggers.c, this is not the case.

AFAICS, the only executor node which takes the crosscheck snapshot into account
is nodeModifyTable. So both a plain SELECT and a SELECT FOR [KEY] SHARE | UPDATE
will simply run with the snapshot passed to the SPI, which for the query in
question is always a *current* snapshot (we pass the original snapshot as the
crosscheck snapshot in REPEATABLE READ). Thus, if there's a child row that
is visible to the transaction deleting the parent row, but that child was 
meanwhile
removed, it seems that we currently *won't* complain. The same holds, of course
for mode SERIALIZABLE -- we don't distinguish the two in ri_triggers.c.

But that's wrong. The transaction's snapshot *would* see that row, so we
ought to raise an error. Note that this applies also to mode SERIALIZABLE, and
breaks true serializability in some cases, since we don't do conflict detection
for RI enforcement queries.

Here's a test case, involving two transaction A and B. I tried this on
REL9_4_STABLE.

Setup
-
CREATE TABLE parent (id int NOT NULL PRIMARY KEY);
CREATE TABLE child (id int NOT NULL PRIMARY KEY,
parent_id int NOT NULL REFERENCES parent (id));
INSERT INTO parent (id) VALUES (1);
INSERT INTO child (id, parent_id) VALUES (1, 1);

Failure Case

A:: set default_transaction_isolation='serializable';
A:: begin;
A:: select * from child;
 ->  id | parent_id 
+---
  1 | 1
B:: set default_transaction_isolation='serializable';
B:: begin;
B:: delete from child;
 -> DELETE 1
B:: commit;
A:: delete from parent;
 -> DELETE 1
A:: commit;

A can neither come before B in any serializable history (since the DELETE
should fail then), nor after it (since it shouldn't see the row in the child
table then). Yet we don't complain, even though both transaction run in
SERIALIZABLE mode.

On Oct21, 2014, at 19:27 , Nick Barnes  wrote:
>> On Wed, Oct 22, 2014 at 3:19 AM, Kevin Grittner  wrote:
>> 
>> It doesn't seem like this analysis considers all of the available ON
>> DELETE and ON UPDATE behaviors available.  Besides RESTRICT there is
>> CASCADE, SET NULL, SET DEFAULT, and NO ACTION.  Some of those
>> require updating the referencing rows.
>  
> I think the logic in question is specific to RESTRICT and NO ACTION.
> The other cases don't look like they need to explicitly lock anything;
> the UPDATE / DELETE itself should take care of that.

Exactly. We run the dubious SELECT FROM  ... FOR KEY SHARE query *only*
for RESTRICT and NO ACTION.

I've traced through the revisions of ri_triggers.c a bit. Locking the
conflicting child rows in the parent table's DELETE and UPDATE triggers was
added in commit 0882951b0cdb4c6686e57121d56216cb2044f7eb which is dated
1999-12-08. (This was before we even has row-level SHARE locks, so the code
did a SELECT .. FOR UPDATE back then). This is also the commit which added
FOR UPDATE locking to RI_FKey_check, i.e. which ensured that parent rows are
locked when adding new child rows. It's commit message unfortunately doesn't
offer much in terms of explanations - it just says "Fixed concurrent visibility
bug."

When SHARE locks where implemented in commit 
bedb78d386a47fd66b6cda2040e0a5fb545ee371,
dating from 2005-04-28, it seems that "FOR UPDATE" was simply replaced by
"FOR SHARE" throughout ri_triggers.c. The same seems to have happened when
KEY SHARE locks where introduced on 2013-01-13 with commit
0ac5ad5134f2769ccbaefec73844f8504c4d6182.

best regards,
Florian Pflug

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [Windows,PATCH] Use faster, higher precision timer API

2014-10-22 Thread Craig Ringer

On 10/22/2014 04:12 PM, David Rowley wrote:

> I was just having a quick look at this with the view of testing it on a
> windows 8 machine.

Thankyou. I really appreciate your taking the time to do this, as one of
the barriers to getting Windows-specific patches accepted is that
usually people just don't want to review Windows patches.

I see you've already linked it on the commitfest too, so thanks again.

I'll follow up with a fixed patch and my test code shortly. I'm at PGEU
in Madrid so things are a bit chaotic, but I can make some time.

> +pg_get_system_time = (PgGetSystemTimeFn) GetProcAddress(
> +GetModuleHandle(TEXT("kernel32.dll")),
> +"GetSystemRimePreciseAsFileTime");
> 
> 
> "Rime", not doubt is meant to be "Time".

Hm.

I must've somehow managed to attach and post the earlier version of the
patch before I fixed a few issues and tested it, because I've compiled
and tested this feature on Win2k12.

Apparently just not the version I published.

Thanks for catching that. I'll fix it up.

> +elog(DEBUG1, "GetProcAddress(\"GetSystemRimePreciseAsFileTime\") on
> kernel32.dll failed with error code %d not expected
> ERROR_PROC_NOT_FOUND(127)", errcode);
> 
> But I don't think you'll be able to elog quite that early.  I tested by
> getting rid of the if (errcode != ERROR_PROC_NOT_FOUND) test and I get:
> 
> D:\Postgres\install\bin>postgres -D ..\data
> error occurred at src\port\gettimeofday.c:87 before error message
> processing is available

Thankyou. I didn't consider that logging wouldn't be available there.

This case shouldn't really happen.

> Perhaps we needn't bother with this debug message? Either that you'll
> probably need to cache the error code and do something when the logger
> is initialised. 

In a "shouldn't happen" case like this I think it'll be OK to just print
to stderr.

> I see:
> test=# select current_timestamp;
> NOTICE:  4
> NOTICE:  GetSystemTimePreciseAsFileTime
> 
> Which indicates that this is quite a precise timer.

Great.

Because I was testing on AWS I wasn't getting results that fine, but the
kind of granularity you're seeing is consistent with what I get on my
Linux laptop.

> Whereas if I add a quick hack into init_win32_gettimeofday() so that it
> always chooses GetSystemTimeAsFileTime() I see:
> 
> test=# select current_timestamp;
> NOTICE:  9953
> NOTICE:  GetSystemTimeAsFileTime

I'll publish the test code I was using too. I was doing it from SQL
level with no code changes other than the ones required for timestamp
precision.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [Windows,PATCH] Use faster, higher precision timer API

2014-10-22 Thread David Rowley

On Fri, Oct 10, 2014 at 10:08 PM, Craig Ringer 
wrote:

> On 09/17/2014 08:27 PM, Craig Ringer wrote:
> > Hi all
> >
> > Attached is a patch to switch 9.5 over to using the
> > GetSystemTimeAsFileTime call instead of separate GetSystemTime and
> > SystemTimeToFileTime calls.
>
> Following on from my prior patch that switches to using
> GetSystemTimeAsFileTime, I now attach a two-patch series that also adds
> support for GetFileTimePreciseAsFileTime where it is available.
>
>
Hi Craig,

I was just having a quick look at this with the view of testing it on a
windows 8 machine.

Here's a couple of things I've noticed:

+ pg_get_system_time = (PgGetSystemTimeFn) GetProcAddress(
+ GetModuleHandle(TEXT("kernel32.dll")),
+ "GetSystemRimePreciseAsFileTime");


"Rime", not doubt is meant to be "Time".

Same here:

+ elog(DEBUG1, "GetProcAddress(\"GetSystemRimePreciseAsFileTime\") on
kernel32.dll failed with error code %d not expected
ERROR_PROC_NOT_FOUND(127)", errcode);

But I don't think you'll be able to elog quite that early.  I tested by
getting rid of the if (errcode != ERROR_PROC_NOT_FOUND) test and I get:

D:\Postgres\install\bin>postgres -D ..\data
error occurred at src\port\gettimeofday.c:87 before error message
processing is available

Perhaps we needn't bother with this debug message? Either that you'll
probably need to cache the error code and do something when the logger is
initialised.

Also, just for testing the resolution of the 2 functions, I added some code
into PostgreSQL's gettimeofday()

do{
(*pg_get_system_time)(&ft2);
}while ((file_time.dwLowDateTime - ft2.dwLowDateTime) == 0);

#ifndef FRONTEND
elog(NOTICE, "%d",  ft2.dwLowDateTime - file_time.dwLowDateTime);
if (pg_get_system_time == &GetSystemTimeAsFileTime)
elog(NOTICE, "GetSystemTimeAsFileTime");
else
elog(NOTICE, "GetSystemTimePreciseAsFileTime");
#endif

return 0;


I see:
test=# select current_timestamp;
NOTICE:  4
NOTICE:  GetSystemTimePreciseAsFileTime

Which indicates that this is quite a precise timer.

Whereas if I add a quick hack into init_win32_gettimeofday() so that it
always chooses GetSystemTimeAsFileTime() I see:

test=# select current_timestamp;
NOTICE:  9953
NOTICE:  GetSystemTimeAsFileTime


Also, I've attached gettimeofday.c, which is a small test programme which
just makes 10 million calls to each function, in order to find out which
one of the 3 method performs best. Here I just wanted to ensure that we'd
not get any performance regression in gettimeofday() calls.

Here's the output from running this on this windows 8.1 laptop.

D:\>cl /Ox gettimeofday.c
Microsoft (R) C/C++ Optimizing Compiler Version 17.00.61030 for x64
Copyright (C) Microsoft Corporation.  All rights reserved.

gettimeofday.c
Microsoft (R) Incremental Linker Version 11.00.61030.0
Copyright (C) Microsoft Corporation.  All rights reserved.

/out:gettimeofday.exe
gettimeofday.obj

D:\>gettimeofday.exe
GetSystemTimePreciseAsFileTime() in 0.157480 seconds
GetSystemTimeAsFileTime() in 0.060075 seconds
Current Method in 0.742677 seconds

Regards

David Rowley

#include 
#include 

#define LOOPS 1000

int main(void)
{
	LARGE_INTEGER start, end, frequency;
	FILETIME ft;
	SYSTEMTIME	system_time;
	
	int x;
	if (!QueryPerformanceFrequency(&frequency))
		exit(-1); 
	
	QueryPerformanceCounter(&start);
	
	for (x = 0; x < LOOPS; x++)
	{
		GetSystemTimePreciseAsFileTime(&ft);
	}
	QueryPerformanceCounter(&end);
	
	printf("GetSystemTimePreciseAsFileTime() in %f seconds\n", (end.QuadPart - start.QuadPart) / (double)frequency.QuadPart);
	
	QueryPerformanceCounter(&start);
	
	for (x = 0; x < LOOPS; x++)
	{
		GetSystemTimeAsFileTime(&ft);
	}
	QueryPerformanceCounter(&end);
	
	printf("GetSystemTimeAsFileTime() in %f seconds\n", (end.QuadPart - start.QuadPart) / (double)frequency.QuadPart);
	
	QueryPerformanceCounter(&start);

	for (x = 0; x < LOOPS; x++)
	{
		GetSystemTime(&system_time);
		SystemTimeToFileTime(&system_time, &ft);
	}
	QueryPerformanceCounter(&end);
	
	printf("Current Method in %f seconds\n", (end.QuadPart - start.QuadPart) / (double)frequency.QuadPart);
	
	return 0;
}
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Wait free LW_SHARED acquisition - v0.2

2014-10-22 Thread Amit Kapila

On Tue, Oct 21, 2014 at 7:56 PM, Amit Kapila 
wrote:
> On Wed, Oct 8, 2014 at 6:17 PM, Andres Freund 
wrote:
> >
> > On 2014-06-25 19:06:32 +0530, Amit Kapila wrote:

Today, I have verified all previous comments raised by
me and looked at new code and below are my findings:

>>
>> 4.
>> LWLockAcquireCommon()
>> {
>> ..
>> if (!LWLockDequeueSelf(l))
>> {
>>  for (;;)
>> {
>> PGSemaphoreLock(&proc->sem, false);
>>  if (!proc->lwWaiting)
>> break;
>> extraWaits++;
>>  }
>> lock->releaseOK = true;
>> ..
>> }
>>
>> Setting releaseOK in above context might not be required  because if the
>> control comes in this part of code, it will not retry to acquire another
>> time.

> Hm. You're probably right.

You have agreed to fix this comment, but it seems you have forgot
to change it.

>> 11.
>> LWLockRelease()
>> {
>> ..
>> PRINT_LWDEBUG("LWLockRelease", lock, mode);
>> }
>>
>> Shouldn't this be in begining of LWLockRelease function rather than
>> after processing held_lwlocks array?

> Ok.

You have agreed to fix this comment, but it seems you have forgot
to change it.

Below comment doesn't seem to be adressed?

> LWLockAcquireOrWait()
> {
> ..
> LOG_LWDEBUG("LWLockAcquireOrWait", lockid, mode, "succeeded");
> ..
> }

> a. such a log is not there in any other LWLock.. variants,
>  if we want to introduce it, then shouldn't it be done at
>  other places as well.


Below point is yet to be resolved.

> > 12.
> > #ifdef LWLOCK_DEBUG
> > lock->owner = MyProc;
> > #endif
> >
> > Shouldn't it be reset in LWLockRelease?
>
> That's actually intentional. It's quite useful to know the last owner
> when debugging lwlock code.

Won't it cause any problem if the last owner process exits?


Can you explain how pg_read_barrier() in below code makes this
access safe?

LWLockWakeup()
{
..
+ pg_read_barrier(); /* pairs with nwaiters-- */
+ if (!BOOL_ACCESS_ONCE(lock->releaseOK))
..
}



With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

[HACKERS] Reducing lock strength of adding foreign keys

2014-10-22 Thread Andreas Karlsson


Hi,

I have been thinking about why we need to grab an AccessExclusiveLock on 
the table with the PK when we add a foreign key. Adding new tables with 
foreign keys to old ones is common so it would be really nice if the 
lock strength could be reduced.


A comment in the code claims that we need to lock so no rows are deleted 
under us and that adding a trigger will lock in AccessExclusive anyway. 
But with MVCC catalog access and the removal of SnapshotNow I do not see 
why adding a trigger would require an exclusive lock. Just locking for 
data changes should be enough.


Looking at the code the only see the duplicate name check use the fact 
that we have grabbed an exclusive lock and that should work anyway due 
to the unique constraint, but since I am pretty new to the code base I 
could be missing something obvious. I have attached a proof of concept 
patch which reduces the lock strength to ShareLock.


What do you say? Am I missing something?

My gut feel also says that if we know it is a RI trigger we are adding 
then AccessShare should be enough for the PK table, since we could rely 
on row locks to prevent rows from being deleted. It would be really nice 
though if this was possible since this would make it possible to add a 
new table with foreign keys and data without locking anything more than 
the referred rows.


Andreas
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
new file mode 100644
index ecdff1e..49ad323
*** a/src/backend/commands/tablecmds.c
--- b/src/backend/commands/tablecmds.c
*** ATAddForeignKeyConstraint(AlteredTableIn
*** 6016,6031 
  	ListCell   *old_pfeqop_item = list_head(fkconstraint->old_conpfeqop);
  
  	/*
! 	 * Grab an exclusive lock on the pk table, so that someone doesn't delete
! 	 * rows out from under us. (Although a lesser lock would do for that
! 	 * purpose, we'll need exclusive lock anyway to add triggers to the pk
! 	 * table; trying to start with a lesser lock will just create a risk of
! 	 * deadlock.)
  	 */
  	if (OidIsValid(fkconstraint->old_pktable_oid))
! 		pkrel = heap_open(fkconstraint->old_pktable_oid, AccessExclusiveLock);
  	else
! 		pkrel = heap_openrv(fkconstraint->pktable, AccessExclusiveLock);
  
  	/*
  	 * Validity checks (permission checks wait till we have the column
--- 6016,6028 
  	ListCell   *old_pfeqop_item = list_head(fkconstraint->old_conpfeqop);
  
  	/*
! 	 * Grab an share lock on the pk table, so that someone doesn't delete
! 	 * rows out from under us.
  	 */
  	if (OidIsValid(fkconstraint->old_pktable_oid))
! 		pkrel = heap_open(fkconstraint->old_pktable_oid, ShareLock);
  	else
! 		pkrel = heap_openrv(fkconstraint->pktable, ShareLock);
  
  	/*
  	 * Validity checks (permission checks wait till we have the column
diff --git a/src/backend/commands/trigger.c b/src/backend/commands/trigger.c
new file mode 100644
index f4c0ffa..214755d
*** a/src/backend/commands/trigger.c
--- b/src/backend/commands/trigger.c
*** CreateTrigger(CreateTrigStmt *stmt, cons
*** 157,165 
  referenced;
  
  	if (OidIsValid(relOid))
! 		rel = heap_open(relOid, AccessExclusiveLock);
  	else
! 		rel = heap_openrv(stmt->relation, AccessExclusiveLock);
  
  	/*
  	 * Triggers must be on tables or views, and there are additional
--- 157,165 
  referenced;
  
  	if (OidIsValid(relOid))
! 		rel = heap_open(relOid, ShareLock);
  	else
! 		rel = heap_openrv(stmt->relation, ShareLock);
  
  	/*
  	 * Triggers must be on tables or views, and there are additional
*** CreateTrigger(CreateTrigStmt *stmt, cons
*** 525,532 
  	 * can skip this for internally generated triggers, since the name
  	 * modification above should be sufficient.
  	 *
! 	 * NOTE that this is cool only because we have AccessExclusiveLock on the
! 	 * relation, so the trigger set won't be changing underneath us.
  	 */
  	if (!isInternal)
  	{
--- 525,531 
  	 * can skip this for internally generated triggers, since the name
  	 * modification above should be sufficient.
  	 *
! 	 * NOTE that this is cool only because of the unique contraint.
  	 */
  	if (!isInternal)
  	{
*** RemoveTriggerById(Oid trigOid)
*** 1102,1108 
  	 */
  	relid = ((Form_pg_trigger) GETSTRUCT(tup))->tgrelid;
  
! 	rel = heap_open(relid, AccessExclusiveLock);
  
  	if (rel->rd_rel->relkind != RELKIND_RELATION &&
  		rel->rd_rel->relkind != RELKIND_VIEW &&
--- 1101,1107 
  	 */
  	relid = ((Form_pg_trigger) GETSTRUCT(tup))->tgrelid;
  
! 	rel = heap_open(relid, ShareLock);
  
  	if (rel->rd_rel->relkind != RELKIND_RELATION &&
  		rel->rd_rel->relkind != RELKIND_VIEW &&
*** renametrig(RenameStmt *stmt)
*** 1257,1263 
  	 * Look up name, check permissions, and acquire lock (which we will NOT
  	 * release until end of transaction).
  	 */
! 	relid = RangeVarGetRelidExtended(stmt->relation, AccessExclusiveLock,
  	 false, false,
  	 RangeVarCallbackForRenameTrigger,

64 matches

Mail list logo