Re: [HACKERS] LISTEN / NOTIFY enhancement request for Postgresql

2013-11-19 Thread Pavel Golub
Hello, Sev.

You wrote:

SZ> Thank you all for considering my feature request.

SZ> Dimitri's suggestion is a very good one - I feel it will accomplish the
SZ> goal of allowing more granularity in the "Listen".

SZ> We might also want to add a flag in postgresql.conf to disable this 
SZ> enhancement so that we don't break existing code.

I suppose it should be a GUC variable (not only a global entry) to
allow per-session settings.
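For reference, the wildcard semantics requested below (ActiveMQ-style
'.', '*', '>') can be modeled in a few lines. This Python sketch is
purely illustrative of the proposed matching rules, not an
implementation proposal:

```python
def channel_matches(channel, pattern):
    """Illustrative model of ActiveMQ-style channel matching:
    '.' separates levels, '*' matches exactly one level, and
    '>' matches one or more remaining levels."""
    chan = channel.split(".")
    pat = pattern.split(".")
    for i, part in enumerate(pat):
        if part == ">":
            # '>' recursively matches everything from this level on
            return len(chan) > i
        if i >= len(chan) or (part != "*" and part != chan[i]):
            return False
    return len(chan) == len(pat)

# The examples from the thread:
# PRICE.>              matches PRICE.STOCK.NASDAQ.IBM
# PRICE.STOCK.NASDAQ.* matches PRICE.STOCK.NASDAQ.IBM
# PRICE.STOCK.*.IBM    matches PRICE.STOCK.NASDAQ.IBM
```

The ltree lquery syntax suggested later in the thread covers similar
ground with its own '*' and label quantifiers.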

SZ> On 11/15/2013 8:19 AM, Pavel Golub wrote:
>> Hello, Dimitri.
>>
>> You wrote:
>>
>> DF> Bruce Momjian  writes:
>• is used to separate names in a path
>• * is used to match any name in a path
>• > is used to recursively match any destination starting from this name
>
> For example using the example above, these subscriptions are possible
>
>  Subscription          Meaning
> PRICE.>                Any price for any product on any exchange
> PRICE.STOCK.>          Any price for a stock on any exchange
> PRICE.STOCK.NASDAQ.*   Any stock price on NASDAQ
> PRICE.STOCK.*.IBM      Any IBM stock price on any exchange
>
>
> My request is to implement the same or similar feature in Postgresql.
 This does seem useful and pretty easy to implement.  Should we add a
 TODO?
>> DF> I think we should consider the ltree syntax in that case, as documented
>> DF> in the following link:
>>
>> DF>   http://www.postgresql.org/docs/9.3/interactive/ltree.html
>>
>> Great idea! Thanks for the link.
>>
>> DF> Regards,
>> DF> --
>> DF> Dimitri Fontaine
>> DF> http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
>>
>>
>>
>>
>>




-- 
With best wishes,
 Pavel  mailto:pa...@gf.microolap.com



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: ALTER SYSTEM SET command to change postgresql.conf parameters (RE: [HACKERS] Proposal for Allow postgresql.conf values to be changed via SQL [review])

2013-11-19 Thread Haribabu kommi
On 19 November 2013 09:59 Amit Kapila wrote:
> On Mon, Nov 18, 2013 at 8:31 PM, Haribabu kommi
>  wrote:
> > On 18 November 2013 20:01 Amit Kapila wrote:
> >> > Code changes are fine.
> >> > If two or three errors are present in the configuration files, it
> >> > prints only the configuration file of the last error. Is it
> >> > required to be mentioned in the documentation?
> >>
> >> Do you mean to say parsing errors or some run-time error, could you
> >> explain with example?
> >
> > LOG:  parameter "shared_buffers" cannot be changed without restarting
> > the server
> > LOG:  parameter "port" cannot be changed without restarting the
> server
> > LOG:  configuration file
> > "/home/hari/installation/bin/../../data/postgresql.auto.conf"
> contains
> > errors; unaffected changes were applied
> >
> > The shared_buffers parameter is changed in postgresql.conf and port
> is changed in postgresql.auto.conf.
> > The error file displays the last error occurred file.
> 
> This is only possible if the user changes configuration parameters
> both by the ALTER SYSTEM command and manually, and the case above is
> not an error; it is just information for the user that some
> parameters changed in a config file can only take effect after a
> server restart.
> So the user can certainly check the log files to see which parameters
> didn't take effect if he is expecting some parameter to be changed
> but it isn't changed. I think that naming only the last file
> containing errors in the logs is not a big problem.
>
> > Is it required to mention the above behavior in the document?
>
> It would be better to show both files in the log rather than to
> document it, but I don't think the above case is helpful for the
> user, and handling it in code would complicate the error-handling
> path a bit. So I think we can leave this case as-is.
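As a toy illustration of the "log both files" alternative discussed
above, the reload path could accumulate every file that produced an
error and name them all in one message. This Python sketch uses
invented names and is not PostgreSQL code:

```python
def summarize_config_errors(errors):
    """errors: (filename, message) pairs collected during a reload.
    Returns a single log line naming every file that contained errors,
    instead of only the file of the last error."""
    files = []
    for fname, _msg in errors:
        if fname not in files:
            files.append(fname)  # first-seen order, no duplicates
    if not files:
        return None
    quoted = ", ".join('"%s"' % f for f in files)
    return ("configuration files %s contain errors; "
            "unaffected changes were applied" % quoted)
```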

OK, fine. I marked the patch as ready for committer.

Regards,
Hari babu.





Re: [HACKERS] Re: [BUGS] BUG #7873: pg_restore --clean tries to drop tables that don't exist

2013-11-19 Thread Dean Rasheed
On 12 November 2013 16:00, Pavel Stehule  wrote:
> Hello
>
> here is patch with fault tolerant drop trigger and drop rule support
>
> drop trigger [if exists] trgname on [if exists] tablename;
> drop rule [if exists] trgname on [if exists] tablename;
>
> Regards
>
> Pavel
>

Hi,

I have just started looking at this patch.

It applies cleanly to head, and appears to work as intended. I have a
question though about the syntax. Looking back over this thread, there
seem to have been 3 different possibilities discussed:


1). Keep the existing syntax:

DROP TRIGGER [ IF EXISTS ] name ON table_name [ CASCADE | RESTRICT ];

but make it tolerate a non-existent table when "IF EXISTS" is specified.


2). Support 2 independent levels of "IF EXISTS" using the syntax:

DROP TRIGGER [ IF EXISTS ] name ON table_name [ IF EXISTS ] [ CASCADE
| RESTRICT ]

There was some consensus for this, but then Pavel pointed out that it
is inconsistent with other DROP commands, which all have the "IF
EXISTS" before the object to which it refers.


3). Support 2 independent levels of "IF EXISTS" using the syntax:

DROP TRIGGER [ IF EXISTS ] name ON [ IF EXISTS ] table_name  [ CASCADE
| RESTRICT ]

which is what the latest patch does.


The syntax in option (3) is certainly more consistent with other DROP
commands, but it feels pretty clunky from a grammar point-of-view. It
also feels overly complex for the use cases discussed.

Personally I would prefer option (1). The SQL standard syntax is
simply "DROP TRIGGER name". The only reason we have the "ON
table_name" part is that our trigger names aren't globally unique, so
"trigger_name ON table_name" is required to uniquely identify the
trigger to drop, which would seem to be directly analogous to
specifying a schema in DROP TABLE, and we've already made that
tolerate a non-existent schema if "IF EXISTS" is used.
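For illustration, this is how the existing schema tolerance behaves
today, and how option (1) might extend it to triggers (the trigger
example is hypothetical behavior under the proposed patch, and the
message wording is illustrative):

postgres=# DROP TABLE IF EXISTS no_such_schema.some_table;
NOTICE:  schema "no_such_schema" does not exist, skipping
DROP TABLE

and, hypothetically, with option (1) applied:

postgres=# DROP TRIGGER IF EXISTS some_trigger ON no_such_table;
NOTICE:  relation "no_such_table" does not exist, skipping
DROP TRIGGER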

This seems rather different from ALTER TABLE, which allows multiple
sub-commands on the same table, so naturally lends itself to multiple
independent DROP <object> [IF EXISTS] sub-commands underneath the
top-level ALTER TABLE [IF EXISTS], for example:

ALTER TABLE IF EXISTS table_name
  DROP COLUMN IF EXISTS col_name,
  DROP CONSTRAINT IF EXISTS constr_name;

So what we currently have can be summarised as 2 classes of
commands/sub-commands to which "IF EXISTS" applies:

ALTER <object> [IF EXISTS] ...
DROP <object> [IF EXISTS] ...

We don't yet have multiple levels of "IF EXISTS" within the same DROP,
and I don't think it is necessary. For example, no one seems to be
asking for

DROP TABLE [IF EXISTS] table_name IN [IF EXISTS] schema_name

Anyway, that's just my opinion. Clearly there is at least one person
with a different opinion. What do other people think?

Regards,
Dean




[HACKERS] Windows build patch

2013-11-19 Thread Wim Dumon
Referring to the instructions located here:
http://www.postgresql.org/docs/8.3/static/install-win32-libpq.html

I noticed that the MSVC makefile doesn't work: one file is missing, and
the manifest-embedding step is a bit awkward. The attached patch fixes
these issues.

BR,
Wim.


postgresql-9.3.1.patch
Description: Binary data



[HACKERS] Call flow of btinsert(PG_FUNCTION_ARGS)

2013-11-19 Thread Rohit Goyal
Hi All,

I was reading the B-tree index code and I wanted to know the calling
function which calls btinsert(PG_FUNCTION_ARGS) in the nbtree.c file.
Moreover, my goal behind reading this function was to check how a
tuple is inserted into the B-tree.
I want to understand the code flow for the B-tree index.

Please help!

-- 
Regards,
Rohit Goyal


Re: [HACKERS] Using indices for UNION.

2013-11-19 Thread Kyotaro HORIGUCHI
Thank you for looking at this patch in detail, and for the
thoughtful advice.

> I'm aware that you said you were going to refactor this, but I took a
> quick look through it anyway.  I don't think mere refactoring is going
> to make me happy with it :-(.

Ouch.. Anyway, I've refactored this patch together with the other
'unique index' patch and re-split it into three parts: one is the
'unique-index stuff', the second is the 'adding pathkeys and unique
info into struct Plan' patch, and the third is the
'mainly-in-prepunion stuff'. They will be sent in other messages,
despite the slim chance of being picked up :-)

> The basic problem here, as well as with some of the hackery that's
> been proposed recently in planner.c, is that we can't really get much
> further with improving this layer of the planner until we bite the
> bullet and convert this layer to work with Paths not finished Plans.

That is right, but a hard road. I would be able to put my little
power into the issue, but the planner hardly seems amenable to
being refactored gradually.

> Look at what you're proposing here: do a complete planning cycle on
> the subquery (underneath which both ordered and unordered Paths will
> be considered, and one or the other will get thrown away).  Then do
> *another* complete planning cycle with ordered output forced.  Then
> compare costs of the results.

Exactly.

> This is hugely inefficient, and it produces far-from-ideal
> results too, because you're forced to make a decision right
> there on which subquery plan to use; you can't make the
> globally optimal choice after considering costs of all the
> subqueries.

Umm. Yes, I know I've done it in a somewhat brute-force way, and it
costs a little too much if the subqueries of the UNION are rather
large, but I suppose that is comparatively small against the whole
execution time in most cases.

Putting that aside, from the viewpoint of global optimization, the
current planner also doesn't do such optimizations. Moreover, it
doesn't consider sorted plans that may be available and effective.
Ignoring the additional planning cost (:-), for UNION queries with
LIMITs on large partitioned tables with appropriate indexes (mmm, I
suppose that is not so narrow a gate..), the query runtime should be
dramatically shortened.

>  What we need to do is fix things so we can get multiple Paths
> out of the subquery planning step, some ordered and some not.
> Then we could construct Paths for both the brute force and
> merge-append styles of computing the UNION result, and finally
> choose the cheapest Path at the top level.

Agreed, at a glance; it sounds quite reasonable. I've been a bit
worried that paths and plans seem to be fused to some extent in
grouping_planner, and that prepunion (plan_set_operations) jumps over
the path-generation phase to make plans directly. (And I pushed it
further along that way..)

> The same goes for hacking around in grouping_planner.  That function
> is well past the point of collapsing of its own weight; we need
> to invest some effort in refactoring before we can afford to add
> much more complexity there.  And I think the logical way to refactor
> is to rewrite the code in a Path-generation-and-comparison style.
> (Actually, grouping_planner would need to be fixed first, since
> that's what would have to produce the Paths we're hoping to compare
> in prepunion.)

It seems necessary, but it also seems far, far away and hard going.

> I had hoped to make some progress on that rewrite this past summer,
> but ended up with no time to work on it :-(.  There's going to be
> a lot of boring infrastructure work before we see much in the way
> of user-visible improvement, I'm afraid, so it's kind of hard to
> make time for it.

Anyway, since I happened to pass so close by the issue, I will
think on it for some time (although I can hardly come up with a
spiffy solution in a short time).

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: [HACKERS] Call flow of btinsert(PG_FUNCTION_ARGS)

2013-11-19 Thread Peter Geoghegan
On Tue, Nov 19, 2013 at 2:22 AM, Rohit Goyal  wrote:
>
> I was reading b tree index code and I wanted to the know the calling
> function which calls btinsert(PG_FUNCTION_ARGS) function in nbtree.c file.
> Moreover, my goal behind reading this function was to check how tuple is
> inserted in btree.
> I want to understand the code flow for b tree index

Take a look at this:

http://www.postgresql.org/docs/9.3/static/index-functions.html


-- 
Peter Geoghegan




Re: [HACKERS] Windows build patch

2013-11-19 Thread Craig Ringer
On 11/19/2013 05:36 PM, Wim Dumon wrote:
> Referring to the instructions located here:
> http://www.postgresql.org/docs/8.3/static/install-win32-libpq.html

Are you in fact trying to build 8.3? Or the current version? You linked
to the 8.3 docs, but the patch filename suggests it's against 9.3.1.

In general patches should be against current git master.

It'd be good if you could provide some accompanying detail. What SDK or
Visual Studio are you using? On what OS and version? Exactly what build
steps did you take to encounter the error you report, and what is the
exact text of the error?

Such details will help others find out about an issue by searching
later, and they'll help focus testing of any patch.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Call flow of btinsert(PG_FUNCTION_ARGS)

2013-11-19 Thread Rohit Goyal
On Tue, Nov 19, 2013 at 11:26 AM, Peter Geoghegan  wrote:

> On Tue, Nov 19, 2013 at 2:22 AM, Rohit Goyal  wrote:
> >
> > I was reading b tree index code and I wanted to the know the calling
> > function which calls btinsert(PG_FUNCTION_ARGS) function in nbtree.c
> file.
> > Moreover, my goal behind reading this function was to check how tuple is
> > inserted in btree.
> > I want to understand the code flow for b tree index
>
> Take a look at this:
>
> http://www.postgresql.org/docs/9.3/static/index-functions.html
>
>
> --
> Peter Geoghegan
>
Thanks for the reply. :) However, I am confused about whether to read
the C files in the nbtree folder or the page you linked to.

For example, if I want to read how data is inserted into a B-tree
index, should I read your link or btinsert(PG_FUNCTION_ARGS) in the
nbtree.c file in the nbtree folder?

In future, if I want to modify the B-tree indexing approach, which
files should I look at?
-- 
Regards,
Rohit Goyal


Re: [HACKERS] CLUSTER FREEZE

2013-11-19 Thread David Rowley
On Sat, Oct 26, 2013 at 11:19 AM, Thomas Munro  wrote:

> On 25 October 2013 01:17, Josh Berkus  wrote:
>
>> On 10/24/2013 04:55 PM, Robert Haas wrote:
>> > I wonder if we should go so far as to make this the default behavior,
>> > instead of just making it an option.
>>
>> +1 from me.  Can you think of a reason you *wouldn't* want to freeze?
>>
>
> Ok, I attach an alternative patch that makes CLUSTER *always* freeze,
> without any option (but doesn't affect VACUUM FULL in the same way). I will
> post both alternatives to the commitfest app since there seems to be some
> disagreement about whether tuple freezing should be optional.
>
>
It seems that most people who have voiced an opinion are leaning
towards this being the more desirable behaviour, rather than adding
FREEZE as an option, so I reviewed this patch tonight.

I followed the code around and checked that we do still need the
freeze-age parameters in cluster_rel, and we do, because VACUUM FULL
uses the cluster_rel code and will only perform a freeze if the user
does VACUUM FULL FREEZE.

I have mixed feelings about updating the comment before the call to
vacuum_set_xid_limits. It looks like the previous comment had probably
not been looked at since VACUUM FULL started using cluster_rel, so
perhaps removing it was good; on the other hand, maybe it should
mention VACUUM FULL and VACUUM FULL FREEZE? Though I'm probably
leaning more towards what you've changed it to, as the previous
comment was being a bit too clever and assumed things about the
calling code, which turned out badly as it was out of date and unaware
of VACUUM FULL using it.

I think the patch should include some sort of note in the docs to say
that CLUSTER performs freezing of tuples. I've attached a patch which
adds something there, but I'm not 100% sure it is the right thing.
Perhaps it should just read:

CLUSTER also performs aggressive "freezing" of tuples, similar to
VACUUM FULL FREEZE.

Although it's not exactly the same, as you can perform a VACUUM FULL
FREEZE on a relation which does not have the clustered index set.

I'll delay a bit to see if anyone else has any comments about what the
docs should say, but I think it is pretty much ready for a committer's
eyes.

Regards

David Rowley




> Thanks
> Thomas Munro
>
>
>
>


cluster_freeze_always_with_docs.patch
Description: Binary data



Re: [HACKERS] Call flow of btinsert(PG_FUNCTION_ARGS)

2013-11-19 Thread Amit Langote
On Tue, Nov 19, 2013 at 7:31 PM, Rohit Goyal  wrote:
>
>
>
> On Tue, Nov 19, 2013 at 11:26 AM, Peter Geoghegan  wrote:
>>
>> On Tue, Nov 19, 2013 at 2:22 AM, Rohit Goyal  wrote:
>> >
>> > I was reading b tree index code and I wanted to the know the calling
>> > function which calls btinsert(PG_FUNCTION_ARGS) function in nbtree.c
>> > file.
>> > Moreover, my goal behind reading this function was to check how tuple is
>> > inserted in btree.
>> > I want to understand the code flow for b tree index
>>
>> Take a look at this:
>>
>> http://www.postgresql.org/docs/9.3/static/index-functions.html
>>
>>
>> --
>> Peter Geoghegan
>
> thanks for reply. :) Moreover, I am so confused about reading the nbtree
> folder c files or the file which you gave.
>
> For example, If I want to read how the data is inserted in B tree index.
> Should I read your link or the btinsert(PG_FUNCTION_ARGS) in nbtree.c file
> in nbtree folder.
>
> In future, if i want to modify btree indexing approach, then which files to
> look for.?

You could use gdb to debug a simple INSERT.

Take a look at the following also:

[http://www.postgresql.org/docs/devel/static/catalog-pg-am.html]

"The catalog pg_am stores information about index access methods.
There is one row for each index access method supported by the
system."


For example,

postgres=# \x
Expanded display (expanded) is on.
postgres=# select * from pg_am;
-[ RECORD 1 ]---+--
amname  | btree
amstrategies| 5
amsupport   | 2
amcanorder  | t
amcanorderbyop  | f
amcanbackward   | t
amcanunique | t
amcanmulticol   | t
amoptionalkey   | t
amsearcharray   | t
amsearchnulls   | t
amstorage   | f
amclusterable   | t
ampredlocks | t
amkeytype   | 0
aminsert| btinsert
ambeginscan | btbeginscan
amgettuple  | btgettuple
amgetbitmap | btgetbitmap
amrescan| btrescan
amendscan   | btendscan
ammarkpos   | btmarkpos
amrestrpos  | btrestrpos
ambuild | btbuild
ambuildempty| btbuildempty
ambulkdelete| btbulkdelete
amvacuumcleanup | btvacuumcleanup
amcanreturn | btcanreturn
amcostestimate  | btcostestimate
amoptions   | btoptions
...
...

Likewise you could find access method (am) details for other index types.

So, if you want to study the flow for a btree insert you could set a
gdb breakpoint in btinsert (listed as 'aminsert' for btree above) to
begin with.

--
Amit




Re: [HACKERS] Call flow of btinsert(PG_FUNCTION_ARGS)

2013-11-19 Thread Craig Ringer
On 11/19/2013 06:22 PM, Rohit Goyal wrote:
> Hi All, 
> 
> I was reading b tree index code and I wanted to the know the calling
> function which calls btinsert(PG_FUNCTION_ARGS) function in nbtree.c file.

It's specified as a member of the btree index access method in
include/catalog/pg_am.h . All invocations are done via index access
method lookups; see the documentation for pg_catalog.pg_am, index access
methods, etc.

It's all abstracted so the core code doesn't know or care whether it's
using a b-tree, GiST, or something completely different. Relevant lines:

$ git grep '\<btinsert\>'

backend/access/nbtree/nbtree.c:btinsert(PG_FUNCTION_ARGS)

include/catalog/pg_am.h:DATA(insert OID = 403 (  btree  5 2 t f
t t t t t t f t t 0 btinsert btbeginscan btgettuple btgetbitmap btrescan
btendscan btmarkpos btrestrpos btbuild btbuildempty btbulkdelete
btvacuumcleanup btcanreturn btcostestimate btoptions ));

include/catalog/pg_proc.h:DATA(insert OID = 331 (  btinsert
   PGNSP PGUID 12 1 0 0 0 f f f f t f v 6 0 16 "2281 2281 2281 2281 2281
2281" _null_ _null_ _null_ _null_  btinsert _null_ _null_ _null_ ));


See http://www.postgresql.org/docs/current/static/catalog-pg-am.html

In psql:

regress=> \x
regress=> select * from pg_am where amname = 'btree';
-[ RECORD 1 ]---+
amname  | btree
... snip ...
aminsert| btinsert
ambeginscan | btbeginscan
... snip ...


pg_catalog.pg_index rows refer to a vector of pg_opclass rows, one per
index column, and that in turns refers to pg_am via the opcmethod column.

http://www.postgresql.org/docs/current/static/catalog-pg-index.html
http://www.postgresql.org/docs/current/static/catalog-pg-am.html
http://www.postgresql.org/docs/current/static/catalog-pg-opclass.html



This query demonstrates the relationship. It'd be smarter to properly
separate out each column but I couldn't be bothered, so it just prints
one row per index column in multicolumn indexes, not preserving order.


regress=> select c.relname, oc.opcname, am.amname, am.aminsert
from pg_index i
inner join pg_class c on (i.indexrelid = c.oid)
inner join pg_opclass oc ON (oc.oid = ANY(i.indclass))
inner join pg_am am ON (oc.opcmethod = am.oid);


All the executor knows is how to ask, eg "for this column of this index,
what opclass was specified, and for that opclass, what is the oid of the
function to perform the insert access-method operation".

It has no idea it's calling btinsert.
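The indirection described above can be caricatured in Python: register
each access method's operations in a pg_am-like table, and have the
"executor" call whatever is registered under aminsert. This is a toy
model only, nothing like the real C structure:

```python
# Toy model of index access-method dispatch via a pg_am-like table.
def btinsert(values):
    return "btinsert(%r)" % (values,)

def gininsert(values):
    return "gininsert(%r)" % (values,)

PG_AM = {
    "btree": {"aminsert": btinsert},
    "gin":   {"aminsert": gininsert},
}

def executor_insert(amname, values):
    am = PG_AM[amname]             # catalog lookup, cf. pg_am
    return am["aminsert"](values)  # indirect call; btinsert never named here
```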

Try creating a GiST or GIN index and compare. It might help clarify things.

> I want to understand the code flow for b tree index

It'll take time, but stepping/jumping through it with a debugger will
probably be a good way to do that. Connect a psql session, run "SELECT
pg_backend_pid()", attach gdb to that pid, set breakpoints at the entry
points of interest, cont, then run a command in psql that'll trigger one
of the breakpoints. From there you can step through and watch what it
does, using "next" to skip over function invocations you aren't
interested in and "step" to enter ones you are.
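Concretely, a session along those lines might look like this (the pid
and table name are illustrative):

regress=> SELECT pg_backend_pid();
 pg_backend_pid
----------------
          12345

# in a shell, attach gdb to that backend
$ gdb -p 12345
(gdb) break btinsert
(gdb) cont

# back in psql: trigger the breakpoint
regress=> INSERT INTO some_indexed_table VALUES (1);

# gdb stops at btinsert; inspect and step
(gdb) bt
(gdb) next
(gdb) step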

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Windows build patch

2013-11-19 Thread Wim Dumon
9.3.1 is the version that failed for me, MSVS 2012, Windows 7.
Build commands:
cd src
nmake /f win32.mak

The first build error:

link.exe -lib @C:\Users\wim\AppData\Local\Temp\nmB317.tmp
rc.exe /l 0x409 /fo".\Release\libpq.res" libpq-dist.rc
Microsoft (R) Windows (R) Resource Compiler Version 6.2.9200.16384
Copyright (C) Microsoft Corporation.  All rights reserved.

link.exe @C:\Users\wim\AppData\Local\Temp\nmB3A5.tmp
   Creating library .\Release\libpqdll.lib and object .\Release\libpqdll.exp
libpq.lib(dirmod.obj) : error LNK2019: unresolved external symbol _pstrdup
referenced in function _pgfnames
libpq.lib(dirmod.obj) : error LNK2019: unresolved external symbol _palloc
referenced in function _pgfnames
libpq.lib(dirmod.obj) : error LNK2019: unresolved external symbol _pfree
referenced in function _pgfnames_cleanup
libpq.lib(dirmod.obj) : error LNK2019: unresolved external symbol _repalloc
referenced in function _pgfnames
.\Release\libpq.dll : fatal error LNK1120: 4 unresolved externals
NMAKE : fatal error U1077: '"C:\Program Files (x86)\Microsoft Visual Studio
11.0
\VC\BIN\link.exe"' : return code '0x460'
Stop.
NMAKE : fatal error U1077: '"C:\Program Files (x86)\Microsoft Visual Studio
11.0
\VC\BIN\nmake.EXE"' : return code '0x2'
Stop.



The second build error, after adding the missing .c file:

   Creating library .\Release\libpqdll.lib and object .\Release\libpqdll.exp
mt -manifest .\Release\libpq.dll.manifest
-outputresource:.\Release\libpq.dll;2
Microsoft (R) Manifest Tool version 6.2.9200.16384
Copyright (c) Microsoft Corporation 2012.
All rights reserved.

.\Release\libpq.dll.manifest : general error c1010070: Failed to load and
parse the manifest. The system cannot find the file specified.
NMAKE : fatal error U1077: '"C:\Program Files (x86)\Windows
Kits\8.0\bin\x86\mt.EXE"' : return code '0x1f'
Stop.
NMAKE : fatal error U1077: '"C:\Program Files (x86)\Microsoft Visual Studio
11.0\VC\BIN\nmake.EXE"' : return code '0x2'
Stop.



2013/11/19 Craig Ringer 

> On 11/19/2013 05:36 PM, Wim Dumon wrote:
> > Referring to the instructions located here:
> > http://www.postgresql.org/docs/8.3/static/install-win32-libpq.html
>
> Are you in fact trying to build 8.3? Or the current version? You linked
> to the 8.3 docs, but the patch filename suggests it's against 9.3.1.
>
> In general patches should be against current git master.
>
> It'd be good if you could provide some accompanying detail. What SDK or
> Visual Studio are you using? On what OS and version? Exactly what build
> steps did you take to encounter the error you report, and what is the
> exact text of the error?
>
> Such details will help others find out about an issue by searching
> later, and they'll help focus testing of any patch.
>
> --
>  Craig Ringer   http://www.2ndQuadrant.com/
>  PostgreSQL Development, 24x7 Support, Training & Services
>


Re: [HACKERS] Get more from indices.

2013-11-19 Thread Kyotaro HORIGUCHI
Hello, I've totally refactored the series of patches and cut out
the appropriate portion as 'unique (and non-nullable) index
stuff'.

As in the previous discussion, it got rid of path distinctness. This
patch works only on index 'full-orderedness', i.e., a unique index on
non-nullable columns.

This patch by itself does not do much; it gains its power when combined
with the 'Using indices for UNION' patch.


=== Making test table

create table t (a int not null, b int not null, c int, d text);
create unique index i_t_ab on t (a, b);
insert into t (select a / 5, 4 - (a % 5), a, 't' from generate_series(00, 09) a);


=== Example 1.

not-patched=# explain select * from t order by a, b ,c limit 1;
>   QUERY PLAN  
> --
>  Limit  (cost=2041.00..2041.00 rows=1 width=14)
>->  Sort  (cost=2041.00..2291.00 rows=10 width=14)
>  Sort Key: a, b, c
>  ->  Seq Scan on t  (cost=0.00..1541.00 rows=10 width=14)

patched=# explain select * from t order by a, b ,c limit 1;
>   QUERY PLAN 
> -
>  Limit  (cost=0.29..0.33 rows=1 width=14)
>->  Index Scan using i_t_ab on t  (cost=0.29..3857.04 rows=10 width=14)


=== Example 2.

not-patched=# explain select distinct * from t order by a limit 1;
> QUERY PLAN 
> ---
>  Limit  (cost=1820.46..1820.47 rows=1 width=44)
>->  Sort  (cost=1820.46..1835.34 rows=5951 width=44)
>  Sort Key: a
>  ->  HashAggregate  (cost=1731.20..1790.71 rows=5951 width=44)
>->  Seq Scan on t  (cost=0.00..1136.10 rows=59510 width=44)

patched=# explain select distinct * from t order by a limit 1;
>  QUERY PLAN
>  Limit  (cost=0.29..1.09 rows=1 width=44)
>->  Unique  (cost=0.29..4756.04 rows=5951 width=44)
>  ->  Index Scan using i_t_ab on t  (cost=0.29..4160.94 rows=59510 width=44)

Technically the Unique node could be removed, but that requires taking
care of the distinctness of the path/plan. So that has been moved out
to the "Using indices for UNION" patch.


Any comments?

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
diff --git a/src/backend/optimizer/path/indxpath.c b/src/backend/optimizer/path/indxpath.c
index 606734a..43be0a5 100644
--- a/src/backend/optimizer/path/indxpath.c
+++ b/src/backend/optimizer/path/indxpath.c
@@ -953,7 +953,8 @@ build_index_paths(PlannerInfo *root, RelOptInfo *rel,
 		index_pathkeys = build_index_pathkeys(root, index,
 			  ForwardScanDirection);
 		useful_pathkeys = truncate_useless_pathkeys(root, rel,
-	index_pathkeys);
+	index_pathkeys,
+	index->full_ordered);
 		orderbyclauses = NIL;
 		orderbyclausecols = NIL;
 	}
@@ -1015,7 +1016,8 @@ build_index_paths(PlannerInfo *root, RelOptInfo *rel,
 		index_pathkeys = build_index_pathkeys(root, index,
 			  BackwardScanDirection);
 		useful_pathkeys = truncate_useless_pathkeys(root, rel,
-	index_pathkeys);
+	index_pathkeys,
+	index->full_ordered);
 		if (useful_pathkeys != NIL)
 		{
 			ipath = create_index_path(root, index,
diff --git a/src/backend/optimizer/path/pathkeys.c b/src/backend/optimizer/path/pathkeys.c
index 032b2cd..5d8ee04 100644
--- a/src/backend/optimizer/path/pathkeys.c
+++ b/src/backend/optimizer/path/pathkeys.c
@@ -369,6 +369,29 @@ get_cheapest_path_for_pathkeys(List *paths, List *pathkeys,
 }
 
 /*
+ * path_is_ordered
+ * Return true if the path is apparently ordered on the pathkeys.
+ */
+bool
+path_is_ordered(Path *path, List *pathkeys)
+{
+	if (pathkeys_contained_in(pathkeys, path->pathkeys))
+		return true;
+
+	/*
+	 * If IndexPath is fully ordered, it is sufficiently ordered if index
+	 * pathkeys is a subset of target pathkeys
+	 */
+	if (pathkeys && path->pathkeys &&
+		IsA(path, IndexPath) &&
+		((IndexPath*)path)->indexinfo->full_ordered &&
+		(pathkeys_contained_in(path->pathkeys, pathkeys)))
+		return true;
+
+	return false;
+}
+
+/*
  * get_cheapest_fractional_path_for_pathkeys
  *	  Find the cheapest path (for retrieving a specified fraction of all
  *	  the tuples) that satisfies the given pathkeys and parameterization.
@@ -381,6 +404,8 @@ get_cheapest_path_for_pathkeys(List *paths, List *pathkeys,
  * 'pathkeys' represents a required ordering (in canonical form!)
  * 'required_outer' denotes allowable outer relations for parameterized paths
  * 'fraction' is the fraction of the total tuples expected to be retrieved
+ * paths->pathkeys might be replaced with pathkeys so as to declare that the

Re: [HACKERS] Using indices for UNION.

2013-11-19 Thread Kyotaro HORIGUCHI
Hello, I've totally refactored the series of patches and cut out the
appropriate portion as the 'UNION stuff'.

This patch requires another patch, 'pathkeys expansion using
fully-ordered index', to work.

http://www.postgresql.org/message-id/20131119.203516.251520490.horiguchi.kyot...@lab.ntt.co.jp

1. plan_with_pathkeys_v1_20131119.patch

 This is a patch adding pathkeys (and uniqueness) information to
 struct Plan, so that what the latter part of grouping_planner does
 with current_pathkeys happens in a seemingly autonomous way. As in
 the previous discussion, this is in a sense a bad direction to go.
 But this patch makes the path manipulations occurring in the latter
 part of grouping_planner simple and would reduce extra sorting and
 uniq'ing.

2. union_uses_idx_v3_20131119.patch

 This is made to apply on top of the first patch above. It is the core
 of "Using indices for UNION". This is - as in the previous
 discussion - also not a very smart way to do it. Most of all, this
 patch runs subquery_planner an extra time for a UNION under some
 conditions. Nevertheless, the expected gain should not be small.


regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index 5947e5b..d8a17a0 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -98,11 +98,13 @@ static SeqScan *make_seqscan(List *qptlist, List *qpqual, Index scanrelid);
 static IndexScan *make_indexscan(List *qptlist, List *qpqual, Index scanrelid,
 			   Oid indexid, List *indexqual, List *indexqualorig,
 			   List *indexorderby, List *indexorderbyorig,
+			   List *pathkeys, bool uniquely_ordered,
 			   ScanDirection indexscandir);
 static IndexOnlyScan *make_indexonlyscan(List *qptlist, List *qpqual,
    Index scanrelid, Oid indexid,
    List *indexqual, List *indexorderby,
    List *indextlist,
+   List *pathkeys, bool uniquely_ordered,
    ScanDirection indexscandir);
 static BitmapIndexScan *make_bitmap_indexscan(Index scanrelid, Oid indexid,
 	  List *indexqual,
@@ -150,8 +152,8 @@ static MergeJoin *make_mergejoin(List *tlist,
 			   bool *mergenullsfirst,
 			   Plan *lefttree, Plan *righttree,
 			   JoinType jointype);
-static Sort *make_sort(PlannerInfo *root, Plan *lefttree, int numCols,
-		  AttrNumber *sortColIdx, Oid *sortOperators,
+static Sort *make_sort(PlannerInfo *root, Plan *lefttree, List *pathkeys,
+		  int numCols,  AttrNumber *sortColIdx, Oid *sortOperators,
 		  Oid *collations, bool *nullsFirst,
 		  double limit_tuples);
 static Plan *prepare_sort_from_pathkeys(PlannerInfo *root,
@@ -750,6 +752,8 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path)
 	plan->qual = NIL;
 	plan->lefttree = NULL;
 	plan->righttree = NULL;
+	plan->pathkeys = pathkeys;
+	plan->is_unique = false;
 
 	/* Compute sort column info, and adjust MergeAppend's tlist as needed */
 	(void) prepare_sort_from_pathkeys(root, plan, pathkeys,
@@ -810,7 +814,7 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path)
 
 		/* Now, insert a Sort node if subplan isn't sufficiently ordered */
 		if (!pathkeys_contained_in(pathkeys, subpath->pathkeys))
-			subplan = (Plan *) make_sort(root, subplan, numsortkeys,
+			subplan = (Plan *) make_sort(root, subplan, pathkeys, numsortkeys,
 		 sortColIdx, sortOperators,
 		 collations, nullsFirst,
 		 best_path->limit_tuples);
@@ -1263,6 +1267,8 @@ create_indexscan_plan(PlannerInfo *root,
 fixed_indexquals,
 fixed_indexorderbys,
 			best_path->indexinfo->indextlist,
+best_path->path.pathkeys,
+false,
 best_path->indexscandir);
 	else
 		scan_plan = (Scan *) make_indexscan(tlist,
@@ -1273,6 +1279,8 @@ create_indexscan_plan(PlannerInfo *root,
 			stripped_indexquals,
 			fixed_indexorderbys,
 			indexorderbys,
+			best_path->path.pathkeys,
+			false,
 			best_path->indexscandir);
 
 	copy_path_costsize(&scan_plan->plan, &best_path->path);
@@ -3245,6 +3253,8 @@ make_indexscan(List *qptlist,
 			   List *indexqualorig,
 			   List *indexorderby,
 			   List *indexorderbyorig,
+			   List *pathkeys,
+			   bool  uniquely_ordered,
 			   ScanDirection indexscandir)
 {
 	IndexScan  *node = makeNode(IndexScan);
@@ -3255,6 +3265,8 @@ make_indexscan(List *qptlist,
 	plan->qual = qpqual;
 	plan->lefttree = NULL;
 	plan->righttree = NULL;
+	plan->pathkeys = pathkeys;
+	plan->is_unique = uniquely_ordered;
 	node->scan.scanrelid = scanrelid;
 	node->indexid = indexid;
 	node->indexqual = indexqual;
@@ -3274,6 +3286,8 @@ make_indexonlyscan(List *qptlist,
    List *indexqual,
    List *indexorderby,
    List *indextlist,
+   List *pathkeys,
+   bool  uniquely_ordered,
    ScanDirection indexscandir)
 {
 	IndexOnlyScan *node = makeNode(IndexOnlyScan);
@@ -3284,6 +3298,8 @@ make_indexonlyscan(List *qptli

Re: [HACKERS] Windows build patch

2013-11-19 Thread Craig Ringer
On 11/19/2013 06:55 PM, Wim Dumon wrote:
> 9.3.1 is the version that failed for me, MSVS 2012, Windows 7.
> Build commands:
> cd src
> nmake /f win32.mak

OK, so you're trying to build libpq (and just libpq, not the rest of
PostgreSQL) using the standalone makefile, instead of using build.pl to
compile the whole tree.

That's supposed to be supported, but I won't be too surprised if it's
bit-rotted a bit. Your patch makes sense in this context. You add a
missing source file, and you only process the DLL manifest if one is
generated. That seems reasonable. I'd like to test this patch out
quickly, but I'm fairly happy with it.

BTW, the well-maintained procedure is a full source build, per
http://www.postgresql.org/docs/current/static/install-windows.html . We
should still fix standalone builds of libpq, but in general I tend to
just build the whole tree.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [BUGS] BUG #7873: pg_restore --clean tries to drop tables that don't exist

2013-11-19 Thread Pavel Stehule
Hello

I don't think @2 is a good idea. Using the well-known idiom "IF EXISTS"
once before the table name and a second time after it would be difficult
and messy for users. If you like that approach, please use a different
idiom or a different keyword.

My personal favourite is @1 - the fault-tolerant version - but I
understand the objection (which is the reason I wrote the last version,
@3). Both @1 and @3 are good choices.

Regards

Pavel


2013/11/19 Dean Rasheed 

> On 12 November 2013 16:00, Pavel Stehule  wrote:
> > Hello
> >
> > here is patch with fault tolerant drop trigger and drop rule support
> >
> > drop trigger [if exists] trgname on [if exists] tablename;
> > drop rule [if exists] trgname on [if exists] tablename;
> >
> > Regards
> >
> > Pavel
> >
>
> Hi,
>
> I have just started looking at this patch.
>
> It applies cleanly to head, and appears to work as intended. I have a
> question though about the syntax. Looking back over this thread, there
> seem to have been 3 different possibilities discussed:
>
>
> 1). Keep the existing syntax:
>
> DROP TRIGGER [ IF EXISTS ] name ON table_name [ CASCADE | RESTRICT ];
>
> but make it tolerate a non-existent table when "IF EXISTS" is specified.
>
>
> 2). Support 2 independent levels of "IF EXISTS" using the syntax:
>
> DROP TRIGGER [ IF EXISTS ] name ON table_name [ IF EXISTS ] [ CASCADE
> | RESTRICT ]
>
> There was some consensus for this, but then Pavel pointed out that it
> is inconsistent with other DROP commands, which all have the "IF
> EXISTS" before the object to which it refers.
>
>
> 3). Support 2 independent levels of "IF EXISTS" using the syntax:
>
> DROP TRIGGER [ IF EXISTS ] name ON [ IF EXISTS ] table_name  [ CASCADE
> | RESTRICT ]
>
> which is what the latest patch does.
>
>
> The syntax in option (3) is certainly more consistent with other DROP
> commands, but it feels pretty clunky from a grammar point-of-view. It
> also feels overly complex for the use cases discussed.
>
> Personally I would prefer option (1). The SQL standard syntax is
> simply "DROP TRIGGER name". The only reason we have the "ON
> table_name" part is that our trigger names aren't globally unique, so
> "trigger_name ON table_name" is required to uniquely identify the
> trigger to drop, which would seem to be directly analogous to
> specifying a schema in DROP TABLE, and we've already made that
> tolerate a non-existent schema if "IF EXISTS" is used.
>
> This seems rather different from ALTER TABLE, which allows multiple
> sub-commands on the same table, so naturally lends itself to multiple
> independent DROP  [IF EXISTS] sub-commands underneath the
> top-level ALTER TABLE [IF EXISTS], for example:
>
> ALTER TABLE IF EXISTS table_name
>   DROP COLUMN IF EXISTS col_name,
>   DROP CONSTRAINT IF EXISTS constr_name;
>
> So what we currently have can be summarised as 2 classes of
> commands/sub-commands to which "IF EXISTS" applies:
>
> ALTER  [IF EXISTS] ...
> DROP  [IF EXISTS] ...
>
> We don't yet have multiple levels of "IF EXISTS" within the same DROP,
> and I don't think it is necessary. For example, no one seems to be
> asking for
>
> DROP TABLE [IF EXISTS] table_name IN [IF EXISTS] schema_name
>
> Anyway, that's just my opinion. Clearly there is at least one person
> with a different opinion. What do other people think?
>
> Regards,
> Dean
>


Re: [HACKERS] Call flow of btinsert(PG_FUNCTION_ARGS)

2013-11-19 Thread Rohit Goyal
On Tue, Nov 19, 2013 at 11:52 AM, Craig Ringer wrote:

> On 11/19/2013 06:22 PM, Rohit Goyal wrote:
> > Hi All,
> >
> > I was reading b tree index code and I wanted to the know the calling
> > function which calls btinsert
> > <
> http://doxygen.postgresql.org/nbtree_8c.html#a2b69e4abd6e38fbb317a503b8cf6e00a
> >(PG_FUNCTION_ARGS
> > <
> http://doxygen.postgresql.org/fmgr_8h.html#adf4dec9b7d23f1b4c68477affde8b7ff
> >)
> > function in nbtree.c file.
>
> It's specified as a member of the btree index access method in
> include/catalog/pg_am.h . All invocations are done via index access
> method lookups; see the documentation for pg_catalog.pg_am, index access
> methods, etc.
>
> It's all abstracted so the core code doesn't know or care whether it's
> using a b-tree, GiST, or something completely different. Relevant lines:
>
> $ git grep '\'
>
> backend/access/nbtree/nbtree.c:btinsert(PG_FUNCTION_ARGS)
>
> include/catalog/pg_am.h:DATA(insert OID = 403 (  btree  5 2 t f
> t t t t t t f t t 0 btinsert btbeginscan btgettuple btgetbitmap btrescan
> btendscan btmarkpos btrestrpos btbuild btbuildempty btbulkdelete
> btvacuumcleanup btcanreturn btcostestimate btoptions ));
>
> include/catalog/pg_proc.h:DATA(insert OID = 331 (  btinsert
>PGNSP PGUID 12 1 0 0 0 f f f f t f v 6 0 16 "2281 2281 2281 2281 2281
> 2281" _null_ _null_ _null_ _null_  btinsert _null_ _null_ _null_ ));
>
>
> See http://www.postgresql.org/docs/current/static/catalog-pg-am.html
>
> In psql:
>
> regress=> \x
> regress=> select * from pg_am where amname = 'btree';
> -[ RECORD 1 ]---+
> amname  | btree
> ... snip ...
> aminsert| btinsert
> ambeginscan | btbeginscan
> ... snip ...
>
>
> pg_catalog.pg_index rows refer to a vector of pg_opclass rows, one per
> index column, and that in turns refers to pg_am via the opcmethod column.
>
> http://www.postgresql.org/docs/current/static/catalog-pg-index.html
> http://www.postgresql.org/docs/current/static/catalog-pg-am.html
> http://www.postgresql.org/docs/current/static/catalog-pg-opclass.html
>
>
>
> See:
>
> This query demonstrates the relationship. It'd be smarter to properly
> separate out each column but I couldn't be bothered, so it just prints
> one row per index column in multicolumn indexes, not preserving order.
>
>
> regress=> select c.relname, oc.opcname, am.amname, am.aminsert
> from pg_index i
> inner join pg_class c on (i.indexrelid = c.oid)
> inner join pg_opclass oc ON (oc.oid = ANY(i.indclass))
> inner join pg_am am ON (oc.opcmethod = am.oid);
>
>
> All the executor knows is how to ask, eg "for this column of this index,
> what opclass was specified, and for that opclass, what is the oid of the
> function to perform the insert access-method operation".
>
> It has no idea it's calling btinsert.
>
> Try creating a GiST or GIN index and compare. It might help clarify things.
>
> > I want to understand the code flow for b tree index
>
> It'll take time, but stepping/jumping through it with a debugger will
> probably be a good way to do that. Connect a psql session, run "SELECT
> pg_backend_pid()", attach gdb to that pid, set breakpoints at the entry
> points of interest, cont, then run a command in psql that'll trigger one
> of the breakpoints. From there you can step through and watch what it
> does, using "next" to skip over function invocations you aren't
> interested in and "step" to enter ones you are.
>
> --
>  Craig Ringer   http://www.2ndQuadrant.com/
>  PostgreSQL Development, 24x7 Support, Training & Services
>

Please tell me how to attach gdb to a pid and debug.


Also, what I understood from your reply is that everything is abstracted,
so the access-method layer does not care about the backend implementation.
Now, I want to modify the B-tree indexing scheme. Which files should I
focus on?

Regards,
Rohit Goyal


Re: [HACKERS] New option for pg_basebackup, to specify a different directory for pg_xlog

2013-11-19 Thread Haribabu kommi
On 18 November 2013 23:30 Fujii Masao wrote:
> On Tue, Nov 19, 2013 at 12:01 AM, Haribabu kommi
>  wrote:
> > On 18 November 2013 18:45 Fujii Masao wrote:
> >> On Mon, Nov 18, 2013 at 6:31 PM, Haribabu kommi
> >>  wrote:
> >> >
> >> > On 18 November 2013 11:19 Haribabu kommi wrote:
> >> >> On 17 November 2013 00:55 Fujii Masao wrote:
> >> >> > On Sat, Nov 16, 2013 at 2:27 PM, Haribabu kommi
> >> >> >  wrote:
> >> >> > > on 15 November 2013 17:26 Magnus Hagander wrote:
> >> >> > >
> >> >> > >>On Fri, Nov 15, 2013 at 12:10 PM, Haribabu kommi
> >> >> > >> wrote:
> >> >> > >
> >> >> > >>On 14 November 2013 23:59 Fujii Masao wrote:
> >> >> > >>> On Thu, Nov 14, 2013 at 9:08 PM, Haribabu kommi
> >> >> > >>>  wrote:
> >> >> > >>> > Please find attached the patch, for adding a new option
> >> >> > >>> > for pg_basebackup, to specify a different directory for
> pg_xlog.
> >> >> > 
> >> >> > >>> Sounds good! Here are the review comments:
> >> >> >  Don't we need to prevent users from specifying the same
> >> >> directory
> >> >> >  in both --pgdata and --xlogdir?
> >> >> > >
> >> >> > >>>I feel no need to prevent, even if user specifies both
> >> >> > >>>--pgdata
> >> >> and
> >> >> > >>>--xlogdir as same directory all the transaction log files
> >> >> > >>>will be created in the base directory  instead of pg_xlog
> directory.
> >> >> > >
> >> >> > >
> >> >> > >
> >> >> > >>Given how easy it would be to prevent that, I think we should.
> >> It
> >> >> > >>would be  an easy misunderstanding to specify that when you
> >> >> actually
> >> >> > >>want it in  /pg_xlog. Specifying that would be
> >> >> > >>redundant in the first place,  but people ca do that, but it
> >> >> > >
> >> >> > >>would also be very easy to do it by mistake, and you'd end up
> >> >> > >>with something that's really bad, including a recursive
> symlink.
> >> >> > >
> >> >> > >
> >> >> > >
> >> >> > > Presently with initdb also user can specify both data and
> xlog
> >> >> > > directories as same.
> >> >> > >
> >> >> > > To prevent the data directory and xlog directory as same,
> >> >> > > there is
> >> >> a
> >> >> > > way in windows (_fullpath api) to get absolute path from a
> >> >> > > relative path, but I didn't get any such possibilities in
> linux.
> >> >> > >
> >> >> > > I didn't find any other way to check it, if anyone have any
> >> >> > > idea regarding this please let me know.
> >> >> >
> >> >> > What about make_absolute_path() in miscinit.c?
> >> >>
> >> >> The make_absoulte_patch() function gets the current working
> >> directory
> >> >> and adds The relative path to CWD, this is not giving proper
> >> absolute
> >> >> path.
> >> >>
> >> >> I have added a new function verify_data_and_xlog_dir_same() which
> >> >> will change the Current working directory to data directory and
> >> >> gets the CWD and the same way for transaction log directory.
> >> >> Compare the both data and xlog directories and throw an error.
> >> >> Please check it
> >> once.
> >> >>
> >> >> Is there any other way to identify that both data and xlog
> >> >> directories are pointing to the same Instead of comparing both
> >> absolute paths?
> >> >>
> >> >> Updated patch attached in the mail.
> >> >
> >> > Failure when the xlogdir doesn't exist is fixed in the updated
> >> > patch
> >> attached in the mail.
> >>
> >> Thanks for updating the patch!
> >>
> >> With the patch, when I specify the same directory in both -D and --
> >> xlogdir, I got the following error.
> >>
> >> $ cd /tmp
> >> $ pg_basebackup -D hoge --xlogdir=/tmp/hoge -X fetch
> >> pg_basebackup: could not change directory to "hoge": No such file or
> >> directory
> >
> > Thanks. After finding the xlog directory's absolute path, switching
> > back to the current working directory was missed; because of this, it
> > was failing while finding the absolute path for the data directory.
> > Apart from this change, the code is reorganized a bit.
> >
> > Updated patch is attached in the mail. Please review it once.
> 
> Thanks for newer version of the patch!
> 
> I found that the empty base directory is created and remains even when
> the same directory is specified in both -D and --xlogdir and the error
> occurs.
> I think that it's
> better to throw an error in that case before creating any new directory.
> Thought?

The patch can only find out whether the provided base and xlog
directories are the same by first creating the base directory. To solve
this problem, the following options are possible:

1. Do nothing, as it is just an empty base directory; it can be reused
   the next time, so leave it as it is.
2. Once the error is identified, delete the base directory.
3. Write a new function that removes the parent references from the
   provided path, to derive the absolute paths used for detecting
   whether the base and xlog directories are the same.

Please provide your suggestions.

> +xlogdir = get_absolute_path(xlog_dir);
> 
> xlog_dir must be an absolute path. ISTM we don't do the above. No?

It is required. As us

Re: [HACKERS] Call flow of btinsert(PG_FUNCTION_ARGS)

2013-11-19 Thread Andres Freund
Hi Rohit,

On 2013-11-19 13:11:12 +0100, Rohit Goyal wrote:
> Please tell me how to attach gdb to a pid and debug.

> Also, what I understood from your reply is that everything is
> abstracted, so the access-method layer does not care about the backend
> implementation. Now, I want to modify the B-tree indexing scheme.
> Which files should I focus on?

I am sorry to be harsh, but this type of question doesn't really seem to
be on-topic for -hackers. Several people gave you hints on where to start
reading, but from there you need to take the lead yourself.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Replication Node Identifiers and crashsafe Apply Progress

2013-11-19 Thread Robert Haas
On Thu, Nov 14, 2013 at 12:26 PM, Andres Freund  wrote:
> As you know, the reason we are working on changeset extraction is that
> we want to build logical unidirectional and bidirectional replication
> on top. To use changeset extraction effectively, I think one set of
> related features on top of it is very useful:
>
> When extracting changes using the changeset extraction patchset (latest
> version at [1]) the START_LOGICAL_REPLICATION command is used to stream
> changes from a source system. When started it will continue to send
> changes as long as the connection is up or it is aborted. For obvious
> performance reasons it will *not* wait for an ACK for each transaction
> commit it streams out.
> Instead it relies on the receiver, exactly as in physical replication,
> sending feedback messages containing the LSN up to which data has safely
> been received.
> That means frequently something like:
> walsender: => COMMIT 0/1000
> walsender: => COMMIT 0/1200
> walsender: => COMMIT 0/1400
> walsender: => COMMIT 0/1600
> receiver:  <= ACKNOWLEDGE 0/1270
> walsender: => COMMIT 0/1800
> is possible and important for performance. I.e. the server has streamed
> out more changes than it got confirmation for.
>
> So, when the the replication connection goes down, e.g. because the
> receiving side has crashed, we need to tell the server from where to
> start. Every position between the last ACKed and the end of WAL is
> legal.
> The receiver then can ask the source to start replication from the last
> replayed commit using START_LOGICAL_REPLICATION 'slot_name'
> '0/1600' which would then re-stream all the changes in the
> transaction that committed at 0/1600 and all that follow.
>
> But for that the receiving side needs to know up to where changes have
> been applied. One relatively easy solution for that is that the
> receiving side does something like:
> UPDATE replication_progress SET lsn = '0/1600' WHERE source_id = ...;
> before the end of every replayed transaction. But that obviously will
> quickly cause bloat.
>
> Our solution to that is that a replaying process can tell the backend
> that it is currently doing so and set up three variables for every
> transaction:
> 1) an identifier for the source database
> 2) the LSN at which the replayed transaction has committed remotely
> 3) the time at which the replayed transaction has committed remotely
>
> When the transaction then commits the commit record will set the
> XACT_CONTAINS_ORIGIN flag to ->xinfo and will add that data to the end
> of the commit record. During crash recovery the startup process will
> remember the newest LSN for each remote database in shared memory.
>
> This way, after a crash, restart, disconnect the replay process can look
> into shared memory and check how far it has already replayed and restart
> seamlessly. With minimal effort.

It would be much less invasive for the replication apply code to fsync
its own state on the apply side.  Obviously, that means doubling the
fsync rate, which is not appealing, but I think that's still a useful
way to think about what you're aiming to accomplish here: avoid
doubling the fsync rate when applying remote transactions in a
crash-safe manner.

Although I agree that we need a way to do that, I don't have a
particularly warm and fuzzy feeling about this particular proposal:
there are too many bits of it that feel like entirely arbitrary design
decisions.  If we're going to build a full-fledged logical replication
solution into core, attempting to obsolete Slony and Bucardo and
Londiste and everything that's out there, then I think we have a great
deal of design work that we have to do before we start committing
things, or even finalizing designs.  If we're going to continue with
the philosophy of building a toolkit that can serve as a building
block for multiple solutions, then color me unconvinced that this will
do the job.

If we made the xlog system truly extensible, that seems like it'd
punch your ticket here.  I'm not sure how practical that is, though.

> We previously discussed the topic and some were very adverse to using
> any sort of numeric node identifiers across systems and suggested that
> those should only be used internally. So what the attached patch does is
> to add a new shared system catalog called 'pg_replication_identifier'
> (suggestions for a better name welcome) which translates a number of
> identifying traits into a numeric identifier.
> The set of identifiers currently are:
> * the sysid of the remote system, combined with the remote TLI
> * the oid of the local database
> * the oid of the remote database
> * an optional name
> but that's just what we needed in our multimaster prototype, and not
> what I necessarily think is correct.

The fact that you've included both local and remote database OIDs
seems wrong; shouldn't the replication identifier only serve to
identify the source node, not the replication stream?  What if you
want to replica

Re: [HACKERS] Handling GIN incomplete splits

2013-11-19 Thread Michael Paquier
Hi,

Here is a review of the first three patches:
1) Further gin refactoring:
make check passes (core tests and contrib tests).
Code compiles without warnings.
Then, about the patch: even though I have little experience with the gin
code, moving the flag for search mode out of btree, as well as removing
the logic of PostingTreeScan, really makes the code lighter and easier to
follow. Just wondering: why not also simplify ginTraverseLock in
ginbtree.c at the same time, to something similar to this?
    if (!GinPageIsLeaf(page) || searchMode == TRUE)
        return access;

    /* we should relock our page */
    LockBuffer(buffer, GIN_UNLOCK);
    LockBuffer(buffer, GIN_EXCLUSIVE);

    /* But root can become non-leaf during relock */
    if (!GinPageIsLeaf(page))
    {
        /* restore old lock type (very rare) */
        LockBuffer(buffer, GIN_UNLOCK);
        LockBuffer(buffer, GIN_SHARE);
    }
    else
        access = GIN_EXCLUSIVE;

    return access;
Feel free to discard as I can imagine that changing such code would make
back-branch maintenance more difficult and it would increase conflicts with
patches currently in development.
2) Refactoring of internal gin btree (needs patch 1 applied first):
make check passes (core tests and contrib tests).
Code compiles without warnings.
Yep, removing ginPageGetLinkItup makes sense. Just to be picky, I would
have put the arguments of GinFormInteriorTuple (replacing
ginPageGetLinkItup) on 3 separate lines, just for legibility.
In dataPrepareDownlink in gindatapage.c, depending on whether lpage is a
leaf page or not, isn't it inconsistent with the older code not to use
GinDataPageGetItemPointer and GinDataPageGetPostingItem to set
btree->pitem.key?
In ginContinueSplit in ginxlog.c, could this code be removed? It looks
like its deletion was forgotten:
/*
 * elog(NOTICE,"ginContinueSplit root:%u l:%u r:%u",  split->rootBlkno,
 * split->leftBlkno, split->rightBlkno);
 */
Except for the doubt about dataPrepareDownlink (related to my lack of
knowledge of the code), the patch looks good.
3) More refactoring (needs patches 1 and 2):
make check passes (core tests and contrib tests).
Code compiles without warnings.
Perhaps this patch would have been easier to read with context diffs :) It
just moves code around, so there is nothing more to say.

Then, I have done a small test with all 3 patches applied. Test is done
with pg_trgm by uploading the book "Les Miserables":
=# CREATE TABLE les_miserables (num serial, line text);
CREATE TABLE
=# \copy les_miserables (line) FROM '~/Desktop/pg135.txt';
=# select count(*) from les_miserables;
 count
-------
 68116
(1 row)
=# CREATE INDEX les_miserables_idx ON les_miserables USING gin (line
gin_trgm_ops);
CREATE INDEX

And here is the result of this query (averaged over 5 runs):
=# explain analyse SELECT * FROM les_miserables where line ~~ '%Cosette%';
Vanilla server: 5.289 ms
With patch 1 only: 5.283 ms
With patches 1+2: 5.232 ms
With patches 1+2+3: 5.232 ms
Based on that there is no performance degradation.

I just began reading the 4th patch. As it is a bit more complex and needs
more testing, I'll provide feedback later.
Regards,

On Thu, Nov 14, 2013 at 1:49 AM, Heikki Linnakangas  wrote:

> Here's another part of my crusade against xlog cleanup routines. This
> series of patches gets rid of the gin_cleanup() function, which is
> currently used to finish splits of GIN b-tree pages, if the system crashes
> (or an error occurs) between splitting a page and inserting its downlink to
> the parent.
>
> The first three patches just move code around. IMHO they make the code
> more readable, so they should be committed in any case. The meat is in the
> fourth patch.
>
> Thoughts, objections?
>


-- 
Michael


Re: [HACKERS] Call flow of btinsert(PG_FUNCTION_ARGS)

2013-11-19 Thread Craig Ringer
On 11/19/2013 08:11 PM, Rohit Goyal wrote:
> 
> Also what I understood from your reply is that everything is abstracted,
> so am doesnot care abt backend implementation. Now, I want to modify B
> tree indexing scheme. So, which files I should focus ?

Depends on whether you can do what you want without modifying the index
access method. You still haven't explained what you are trying to
achieve, what your goal is.

I don't think this conversation is productive. There's plenty of
existing documentation to review to help you get started, and you can
find guides on how to use gdb everywhere across the Internet. The code
is full of comments.

If you had a detailed, specific question maybe someone here could help
you. Without that you're just going to have to start learning the
relevant code.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Turning recovery.conf into GUCs

2013-11-19 Thread Michael Paquier
On Tue, Nov 19, 2013 at 2:27 AM, Andres Freund  wrote:
> * --write-standby-enable seems to lose quite some functionality in
>   comparison to --write-recovery-conf since it doesn't seem to set
>   primary_conninfo, standby anymore.
Yes... The idea here might be to generate a new file that is then
included in postgresql.conf, or to add the parameters at the bottom of
postgresql.conf itself. The code for a plain base backup is
straightforward, but it could get ugly for the tar part...

> * CheckRecoveryReadyFile() doesn't seem to be a very descriptive
>   function name.
CheckRecoveryTriggerPresence?

> * Why does StartupXLOG() now use ArchiveRecoveryRequested &&
>   StandbyModeRequested instead of just the former?
> * I am not sure I like "recovery.trigger" as a name. It seems to close
>   to what I've seen people use to trigger failover and too close to
>   trigger_file.
This name was chosen and kept in accordance with the spec of this
feature. Looks fine to me...

> * You made recovery_target_inclusive/pause_at_recovery_target PGC_SIGHUP
>   - did you review that actually works? Imo that should be changed in a
>   separate commit.

Yep, I'd rather see those parameters as PGC_POSTMASTER... As of now,
once recovery is started those parameter values do not change once
readRecoveryCommandFile is kicked. Having them SIGHUP would mean that
you could change them during recovery. Sounds kind of dangerous, no?

> * Maybe we should rename names like pause_at_recovery_target to
>   recovery_pause_at_target? Since we already force everyone to bother
>   changing their setup...

I disagree here. It is true that this patch introduces many changes
with a new configuration file layer, but the idea with this patch was
to allow people to reuse their old recovery.conf as-is. And what is
actually wrong with pause_at_recovery_target?

>
> * the description of archive_cleanup_command seems wrong to me.
> * Why did you change some of the recovery gucs to lowercase names, but
>   left out XLogRestoreCommand?

This was part of the former patch; perhaps you are right, and keeping
the names as close as possible to the old ones would make sense to
facilitate maintenance across versions.

>
> * Why the PG_TRY/PG_CATCH in check_recovery_target_time? Besides being
>   really strangely formatted (multiline :? inside a function?) it
>   doesn't a) seem to be correct to ignore potential memory allocation
>   errors by just switching back into the context that just errored out,
>   and continue to work there b) forgo cleanup by just continuing as if
>   nothing happened. That's unlikely to be acceptable.

Interestingly, that was part of the first versions of the patch as
well. I don't recall modifying anything in this area when I hacked
that... But yes, it should be modified to something like what is in
check_recovery_target_xid.

Regards,
-- 
Michael




Re: [HACKERS] SSL renegotiation

2013-11-19 Thread Robert Haas
On Fri, Nov 15, 2013 at 10:49 AM, Andres Freund  wrote:
>> Another reason I'm not in a hurry is that the problem we're trying
>> to solve doesn't seem to be causing real-world trouble.  So by
>> "awhile", I'm thinking "let's let it get through 9.4 beta testing".
>
> Well, there have been a bunch of customer complaints about it, afair
> that's what made Alvaro look into it in the first place. So it's not a
> victimless bug.

Well, can any of those people try running with this patch?  That'd be
a good way of getting some confidence in it.

Generally, I agree that something needs to be back-patched here.  But
we don't want to create a situation where we fix some people and break
others, and it's not too obvious that we have a way to get there.
Personally, I favor adding some kind of GUC to control the behavior,
but I'm not exactly sure what the shape of it ought to be.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Database disconnection and switch inside a single bgworker

2013-11-19 Thread Robert Haas
On Fri, Nov 15, 2013 at 10:14 AM, Tom Lane  wrote:
> Michael Paquier  writes:
>> Currently, bgworkers offer the possibility to connect to a given
>> database using BackgroundWorkerInitializeConnection in bgworker.h, but
>> there is actually no way to disconnect from a given database inside
>> the same bgworker process.
>
> That's isomorphic to having a backend switch to a different database,
> which occasionally gets requested, but there is no real likelihood
> that we'll ever implement.  The problem is, how can you be sure you
> have flushed all the database-specific state that's been built up?
> The relcache and catcaches are only the tip of the iceberg; we've
> got caches all over the place.  And once you had flushed that data,
> you'd have to recreate it --- but the code for doing so is intimately
> intertwined with connection startup tasks that you'd most likely not
> want to repeat.
>
> And, once you'd done all that work, what would you have?  A database
> switch methodology that would save a fork(), but not much else.
> The time to warm up the caches wouldn't be any better than in a
> fresh process.

Well, you'd have whatever backend-local state you had accumulated
apart from stuff in the caches.  It's clearly not useless, especially
for background workers.  And you might actually save a little bit,
because I think we established previously that a good fraction of the
startup cost was actually page faults, which would not need to be
re-incurred.  But that having been said...

> The cost/benefit ratio for making this work just doesn't look very
> promising.  That's why autovacuum is built the way it is.

...yeah.

From a performance point of view, what's a bit frustrating is that we
have to reload a bunch of information that probably isn't different,
like pg_am entries, and the pg_class and pg_attribute entries for
pg_class and pg_attribute themselves.  If we could figure out some way
to avoid that, the potential performance win here would be bigger.
But it's not obvious to me that it's a good place to spend development
time; even if it worked perfectly, the overhead of forking new
backends just doesn't seem like our biggest problem right now.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] New option for pg_basebackup, to specify a different directory for pg_xlog

2013-11-19 Thread Fujii Masao
On Tue, Nov 19, 2013 at 9:14 PM, Haribabu kommi
 wrote:
> On 18 November 2013 23:30 Fujii Masao wrote:
>> On Tue, Nov 19, 2013 at 12:01 AM, Haribabu kommi
>>  wrote:
>> > On 18 November 2013 18:45 Fujii Masao wrote:
>> >> On Mon, Nov 18, 2013 at 6:31 PM, Haribabu kommi
>> >>  wrote:
>> >> >
>> >> > On 18 November 2013 11:19 Haribabu kommi wrote:
>> >> >> On 17 November 2013 00:55 Fujii Masao wrote:
>> >> >> > On Sat, Nov 16, 2013 at 2:27 PM, Haribabu kommi
>> >> >> >  wrote:
>> >> >> > > on 15 November 2013 17:26 Magnus Hagander wrote:
>> >> >> > >
>> >> >> > >>On Fri, Nov 15, 2013 at 12:10 PM, Haribabu kommi
>> >> >> > >> wrote:
>> >> >> > >
>> >> >> > >>On 14 November 2013 23:59 Fujii Masao wrote:
>> >> >> > >>> On Thu, Nov 14, 2013 at 9:08 PM, Haribabu kommi
>> >> >> > >>>  wrote:
>> >> >> > >>> > Please find attached the patch, for adding a new option
>> >> >> > >>> > for pg_basebackup, to specify a different directory for
>> pg_xlog.
>> >> >> > 
>> >> >> > >>> Sounds good! Here are the review comments:
>> >> >> >  Don't we need to prevent users from specifying the same
>> >> >> directory
>> >> >> >  in both --pgdata and --xlogdir?
>> >> >> > >
>> >> >> > >>>I feel no need to prevent, even if user specifies both
>> >> >> > >>>--pgdata
>> >> >> and
>> >> >> > >>>--xlogdir as same directory all the transaction log files
>> >> >> > >>>will be created in the base directory  instead of pg_xlog
>> directory.
>> >> >> > >
>> >> >> > >
>> >> >> > >
>> >> >> > >>Given how easy it would be to prevent that, I think we should.
>> >> It
>> >> >> > >>would be  an easy misunderstanding to specify that when you
>> >> >> actually
>> >> >> > >>want it in  /pg_xlog. Specifying that would be
>> >> >> > >>redundant in the first place, but people can do that, but it
>> >> >> > >
>> >> >> > >>would also be very easy to do it by mistake, and you'd end up
>> >> >> > >>with something that's really bad, including a recursive
>> symlink.
>> >> >> > >
>> >> >> > >
>> >> >> > >
>> >> >> > > Presently with initdb also user can specify both data and
>> xlog
>> >> >> > > directories as same.
>> >> >> > >
>> >> >> > > To prevent the data directory and xlog directory as same,
>> >> >> > > there is
>> >> >> a
>> >> >> > > way in windows (_fullpath api) to get absolute path from a
>> >> >> > > relative path, but I didn't get any such possibilities in
>> linux.
>> >> >> > >
>> >> >> > > I didn't find any other way to check it, if anyone have any
>> >> >> > > idea regarding this please let me know.
>> >> >> >
>> >> >> > What about make_absolute_path() in miscinit.c?
>> >> >>
>> >> >> The make_absolute_path() function gets the current working
>> >> >> directory and adds the relative path to it, which does not give
>> >> >> a proper absolute path.
>> >> >>
>> >> >> I have added a new function verify_data_and_xlog_dir_same() which
>> >> >> changes the current working directory to the data directory and
>> >> >> gets the CWD, and does the same for the transaction log directory.
>> >> >> It then compares the two absolute paths and throws an error if
>> >> >> they are the same. Please check it once.
>> >> >>
>> >> >> Is there any other way to identify that the data and xlog
>> >> >> directories are pointing to the same location, instead of
>> >> >> comparing their absolute paths?
>> >> >>
>> >> >> Updated patch attached in the mail.
>> >> >
>> >> > Failure when the xlogdir doesn't exist is fixed in the updated
>> >> > patch
>> >> attached in the mail.
>> >>
>> >> Thanks for updating the patch!
>> >>
>> >> With the patch, when I specify the same directory in both -D and --
>> >> xlogdir, I got the following error.
>> >>
>> >> $ cd /tmp
>> >> $ pg_basebackup -D hoge --xlogdir=/tmp/hoge -X fetch
>> >> pg_basebackup: could not change directory to "hoge": No such file or
>> >> directory
>> >
>> > Thanks. After finding the xlog directory's absolute path, switching
>> > back to the original working directory was missed; because of this it
>> > failed while finding the absolute path of the data directory. Apart
>> > from this change the code is reorganized a bit.
>> >
>> > Updated patch is attached in the mail. Please review it once.
>>
>> Thanks for newer version of the patch!
>>
>> I found that the empty base directory is created and remains even when
>> the same directory is specified in both -D and --xlogdir and the error
>> occurs.
>> I think that it's
>> better to throw an error in that case before creating any new directory.
>> Thought?
>
> Only by creating the base directory can the patch find out whether the
> provided base and xlog directories are the same. To solve this problem the
> following options are possible:
>
> 1. Treat it as no problem, since it is just an empty base directory that
>can be reused next time; leave it as it is.
> 2. Once the error is identified, the base directory can be deleted.
> 3. write a new function to remove the parent references from the provided 
> path to identify
>the absolute path used for detecting base and Xlog directories are

Re: [HACKERS] pre-commit triggers

2013-11-19 Thread Andrew Dunstan


On 11/19/2013 12:45 AM, Noah Misch wrote:

On Fri, Nov 15, 2013 at 01:01:48PM -0500, Andrew Dunstan wrote:

The triggers don't fire if there is no real XID, so only actual data
changes should cause the trigger to fire.

What's the advantage of this provision?  Without it, an individual trigger
could make the same check and drop out quickly.  A trigger not wanting it
can't so easily work around its presence, though.  Heretofore, skipping XID
assignment has been an implementation detail that improves performance without
otherwise calling user attention to itself.  This provision would make the
decision to acquire an XID (where optional) affect application behavior.




Mainly speed. How is the trigger (especially if not written in C) going 
to check the same thing?


Conventional triggers don't fire except on data changing events, so this 
seemed consistent with that.


Perhaps my understanding of when XIDs are acquired is insufficient. When 
exactly is it optional?


cheers

andrew


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] better atomics - v0.2

2013-11-19 Thread Robert Haas
On Fri, Nov 15, 2013 at 3:04 PM, Andres Freund  wrote:
> Questions:
> * What do you like/dislike about the API (storage/atomics.h)
> * decide whether it's ok to rely on inline functions or whether we need
>   to provide out-of-line versions. I think we should just bite the
>   bullet and require support. Some very old arches might need to live with
>   warnings. Patch 01 tries to silence them on HP-UX.
> * decide whether we want to keep the semaphore fallback for
>   spinlocks. I strongly vote for removing it. The patchset provides
>   compiler based default implementations for atomics, so it's unlikely
>   to be helpful when bringing up a new platform.

On point #2, I don't personally know of any systems that I care about
where inlining isn't supported.  However, we've gone to quite a bit of
trouble relatively recently to keep things working for platforms where
that is the case, so I feel the need for an obligatory disclaimer: I
will not commit any patch that moves the wood in that area unless
there is massive consensus that this is an acceptable way forward.
Similarly for point #3: Heikki not long ago went to the trouble of
unbreaking --disable-spinlocks, which suggests that there is at least
some interest in continuing to be able to run in that mode.  I'm
perfectly OK with the decision that we don't care about that any more,
but I will not be the one to make that decision, and I think it
requires a greater-than-normal level of consensus.

On point #1, I dunno.  It looks like a lot of rearrangement to me, and
I'm not really sure what the final form of it is intended to be.  I
think it desperately needs a README explaining what the point of all
this is and how to add support for a new platform or compiler if yours
doesn't work.
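For readers following along, the kind of compiler-based default implementation the patchset provides can be sketched roughly as below. The names here are hypothetical, not the actual storage/atomics.h API; the sketch assumes a GCC- or Clang-compatible compiler with the legacy __sync builtins:

```c
#include <stdint.h>

/* Hypothetical sketch of a compiler-based atomics fallback; the real
 * storage/atomics.h names and semantics may differ. */
typedef struct
{
	volatile uint32_t value;
} my_atomic_uint32;

/* Atomically add 'add' to *ptr and return the previous value, using
 * the GCC/Clang legacy __sync builtin (implies a full memory barrier). */
static inline uint32_t
my_atomic_fetch_add_u32(my_atomic_uint32 *ptr, int32_t add)
{
	return __sync_fetch_and_add(&ptr->value, add);
}
```

On compilers without such builtins, this is exactly the spot where a platform-specific or out-of-line implementation would have to slot in, which is the crux of the inline-function question above.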

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Errors on missing pg_subtrans/ files with 9.3

2013-11-19 Thread Robert Haas
On Fri, Nov 15, 2013 at 4:01 PM, J Smith  wrote:
> On Fri, Nov 15, 2013 at 3:21 PM, Robert Haas  wrote:
>> I think what would help the most is if you could arrange to obtain a
>> stack backtrace at the point when the error is thrown.  Maybe put a
>> long sleep call in just before the error happens, and when it gets
>> stuck there, attach gdb and run bt full.
>>
>
> That could potentially be doable. Perhaps I could use something like
> google-coredumper or something similar to have a core dump generated
> if the error comes up? Part of the problem is that the error is so
> sporadic that it's going to be tough to say when the next one will
> occur. For instance, we haven't changed our load on the server, yet
> the error hasn't occurred since Nov 13, 15:01. I'd also like to avoid
> blocking on the server with sleep or anything like that unless
> absolutely necessary, as there are other services we have in
> development that are using other databases on this cluster. (I can as
> a matter of last resort, of course, but if google-coredumper can do
> the job I'd like to give that a shot first.)
>
> Any hints on where I could insert something like this? Should I try
> putting it into the section of elog.c dealing with ENOENT errors, or
> try to find a spot closer to where the file itself is being opened? I
> haven't looked at Postgres internals for a while now so I'm not quite
> sure of the best location for this sort of thing.

I'd look for the specific ereport() call that's firing, and put it
just before that.

(note that setting the error verbosity to 'verbose' will give you the
file and line number where the error is happening, which is useful if
the message can be generated from more than one place)

I'm not familiar with google-coredumper but it sounds like a promising
technique.
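As a self-contained illustration of the "long sleep before the error" trick (a hypothetical helper, not PostgreSQL code): park the process on a volatile flag, attach gdb to the printed pid, run "bt full", then clear the flag to resume.

```c
#include <stdio.h>
#include <unistd.h>

/* Hypothetical debugging aid: call wait_for_debugger() just before the
 * suspect ereport() call.  While it spins, attach with "gdb -p <pid>",
 * run "bt full", then "set var keep_waiting = 0" and "continue". */
static volatile int keep_waiting = 1;

static void
wait_for_debugger(int max_seconds)
{
	fprintf(stderr, "attach now: gdb -p %d\n", (int) getpid());
	while (keep_waiting && max_seconds-- > 0)
		sleep(1);			/* parks until gdb clears the flag or we time out */
}
```

The timeout keeps a forgotten call from blocking the backend forever, which matters on a shared development cluster like the one described above.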

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Replication Node Identifiers and crashsafe Apply Progress

2013-11-19 Thread Greg Stark
On Thu, Nov 14, 2013 at 5:26 PM, Andres Freund wrote:

> But for that the receiving side needs to know up to where changes have
> been applied. One relatively easy solution for that is that the
> receiving side does something like:
> UPDATE replication_progress SET lsn = '0/1600' WHERE source_id = ...;
> before the end of every replayed transaction. But that obviously will
> quickly cause bloat.
>
> Our solution to that is that a replaying process can tell the backend
> that it is currently doing so and setup three variables for every
> transaction:
>

This is a pretty massive design decision to hinge on such a minor
implementation detail of table bloat (which I don't think would actually be
an issue anyway -- isn't that what we have HOT for?)

Fundamentally the question here is where to keep all the book-keeping state
about replicas, in a central repository in the master or locally in each
replica. At first blush it seems obvious to me that locally in each replica
is the more flexible choice.

Replication systems become complex when you start restoring from old
backups and not every node has the same view of the topology as every other
node. I fear what will happen to a central repository when you fail over
the master and it's out of sync with where the slaves have actually
restored up to. Or where you fail over a slave to a standby of the slave
and it needs to redo some of the logical replication to catch up. Or where
you restore all your nodes, both master and slaves from backups taken at
different points in time (presumably with the master ahead of the slaves).

Having a central repository makes the whole system simpler but it also
makes it much more fragile. It's nice to have a single place to go to find
out what the state of every replica is but it should do that by actually
asking the replicas, not by maintaining state that might be out of sync.


-- 
greg


Re: [HACKERS] Data corruption issues using streaming replication on 9.0.14/9.2.5/9.3.1

2013-11-19 Thread Andres Freund
Hi,

On 2013-11-18 23:15:59 +0100, Andres Freund wrote:
> Afaics it's likely a combination/interaction of bugs and fixes between:
> * the initial HS code
> * 5a031a5556ff83b8a9646892715d7fef415b83c3
> * f44eedc3f0f347a856eea8590730769125964597

Yes, the combination of those is guilty.

Man, this is (to a good part my) bad.

> But that'd mean nobody noticed it during 9.3's beta...

It's fairly hard to reproduce artificially since a) there have to be
enough transactions starting and committing from the start of the
checkpoint the standby is starting from to the point it does
LogStandbySnapshot() to cross a 32768 boundary b) hint bits often save
the game by not accessing clog at all anymore and thus not noticing the
corruption.
I've reproduced the issue by having an INSERT ONLY table that's never
read from. It's helpful to disable autovacuum.

Imo something like the attached patch should be done. The description I came
up with is:

Fix Hot-Standby initialization of clog and subtrans.

These bugs can cause data loss on standbys started with hot_standby=on
at the moment they start to accept read-only queries, by marking
committed transactions as uncommitted. The likelihood of such
corruptions is small unless the primary has a high transaction rate.

5a031a5556ff83b8a9646892715d7fef415b83c3 fixed bugs in HS's startup
logic by maintaining less state until at least
STANDBY_SNAPSHOT_PENDING state was reached, missing the fact that both
clog and subtrans are written to before that. This only failed to fail
in common cases because the usage of ExtendCLOG in procarray.c was
superfluous since clog extensions are actually WAL logged.

In f44eedc3f0f347a856eea8590730769125964597, I then tried to fix the
missing extensions of pg_subtrans due to the former commit's changes -
which are not WAL logged - by performing the extensions when switching
to a state > STANDBY_INITIALIZED and not performing xid assignments
before that - again missing the fact that ExtendCLOG is unnecessary -
but screwed up twice: Once because latestObservedXid wasn't updated
anymore in that state due to the earlier commit and once by having an
off-by-one error in the loop performing extensions.
This means that whenever a CLOG_XACTS_PER_PAGE (32768 with default
settings) boundary was crossed between the start of the checkpoint
recovery started from and the first xl_running_xact record, old
transactions' commit bits in pg_clog could be overwritten if they
started and committed in that window.

Fix this mess by not performing ExtendCLOG() in HS at all anymore
since it's unneeded and evidently dangerous and by performing subtrans
extensions even before reaching STANDBY_SNAPSHOT_PENDING.

Imo this warrants an expedited point release :(

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Data corruption issues using streaming replication on 9.0.14/9.2.5/9.3.1

2013-11-19 Thread Andres Freund
On 2013-11-19 15:20:01 +0100, Andres Freund wrote:
> Imo something like the attached patch should be done. The description I came
> up with is:
> 
> Fix Hot-Standby initialization of clog and subtrans.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
From db5fc631102e2bf3be1034981bedb0b99871a2bb Mon Sep 17 00:00:00 2001
From: Andres Freund 
Date: Tue, 19 Nov 2013 13:12:31 +0100
Subject: [PATCH] Fix Hot-Standby initialization of clog and subtrans.

These bugs can cause data loss on standbys started with hot_standby=on
at the moment they start to accept read-only queries, by marking
committed transactions as uncommitted. The likelihood of such
corruptions is small unless the primary has a high transaction rate.

5a031a5556ff83b8a9646892715d7fef415b83c3 fixed bugs in HS's startup
logic by maintaining less state until at least
STANDBY_SNAPSHOT_PENDING state was reached, missing the fact that both
clog and subtrans are written to before that. This only failed to fail
in common cases because the usage of ExtendCLOG in procarray.c was
superfluous since clog extensions are actually WAL logged.

In f44eedc3f0f347a856eea8590730769125964597, I then tried to fix the
missing extensions of pg_subtrans due to the former commit's changes -
which are not WAL logged - by performing the extensions when switching
to a state > STANDBY_INITIALIZED and not performing xid assignments
before that - again missing the fact that ExtendCLOG is unnecessary -
but screwed up twice: Once because latestObservedXid wasn't updated
anymore in that state due to the earlier commit and once by having an
off-by-one error in the loop performing extensions.
This means that whenever a CLOG_XACTS_PER_PAGE (32768 with default
settings) boundary was crossed between the start of the checkpoint
recovery started from and the first xl_running_xact record, old
transactions' commit bits in pg_clog could be overwritten if they
started and committed in that window.

Fix this mess by not performing ExtendCLOG() in HS at all anymore
since it's unneeded and evidently dangerous and by performing subtrans
extensions even before reaching STANDBY_SNAPSHOT_PENDING.
---
 src/backend/access/transam/clog.c   |  2 +-
 src/backend/storage/ipc/procarray.c | 64 ++---
 2 files changed, 39 insertions(+), 27 deletions(-)

diff --git a/src/backend/access/transam/clog.c b/src/backend/access/transam/clog.c
index cb95aa3..6a963b6 100644
--- a/src/backend/access/transam/clog.c
+++ b/src/backend/access/transam/clog.c
@@ -622,7 +622,7 @@ ExtendCLOG(TransactionId newestXact)
 	LWLockAcquire(CLogControlLock, LW_EXCLUSIVE);
 
 	/* Zero the page and make an XLOG entry about it */
-	ZeroCLOGPage(pageno, !InRecovery);
+	ZeroCLOGPage(pageno, true);
 
 	LWLockRelease(CLogControlLock);
 }
diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c
index c2f86ff..4d0bfb8 100644
--- a/src/backend/storage/ipc/procarray.c
+++ b/src/backend/storage/ipc/procarray.c
@@ -484,8 +484,9 @@ ProcArrayInitRecovery(TransactionId initializedUptoXID)
 
 	/*
 	 * we set latestObservedXid to the xid SUBTRANS has been initialized upto
-	 * so we can extend it from that point onwards when we reach a consistent
-	 * state in ProcArrayApplyRecoveryInfo().
+	 * so we can extend it from that point onwards in
+	 * RecordKnownAssignedTransactionIds and when we get consistent in
+	 * ProcArrayApplyRecoveryInfo().
 	 */
 	latestObservedXid = initializedUptoXID;
 	TransactionIdRetreat(latestObservedXid);
@@ -661,17 +662,23 @@ ProcArrayApplyRecoveryInfo(RunningTransactions running)
 	pfree(xids);
 
 	/*
-	 * latestObservedXid is set to the the point where SUBTRANS was started up
-	 * to, initialize subtrans from thereon, up to nextXid - 1.
+	 * latestObservedXid is at least set to the point where SUBTRANS was
+	 * started up to (c.f. ProcArrayInitRecovery()) or to the biggest xid
+	 * RecordKnownAssignedTransactionIds() was called for.  Initialize
+	 * subtrans from thereon, up to nextXid - 1.
+	 *
+	 * We need to duplicate parts of RecordKnownAssignedTransactionId() here,
+	 * because we've just added xids to the known assigned xids machinery that
+	 * haven't gone through RecordKnownAssignedTransactionId().
 	 */
 	Assert(TransactionIdIsNormal(latestObservedXid));
+	TransactionIdAdvance(latestObservedXid);
 	while (TransactionIdPrecedes(latestObservedXid, running->nextXid))
 	{
-		ExtendCLOG(latestObservedXid);
 		ExtendSUBTRANS(latestObservedXid);
-
 		TransactionIdAdvance(latestObservedXid);
 	}
+	TransactionIdRetreat(latestObservedXid);  /* = running->nextXid - 1 */
 
 	/* --
 	 * Now we've got the running xids we need to set the global values that
@@ -756,10 +763,6 @@ ProcArrayApplyXidAssignment(TransactionId topxid,
 
 	Assert(standbyState >= STANDBY_INITIALIZED);
 
-	/* can't do anything useful unless we have more state setup */
-	if (standbyState == 

Re: [HACKERS] Data corruption issues using streaming replication on 9.0.14/9.2.5/9.3.1

2013-11-19 Thread Andrew Dunstan


On 11/19/2013 09:20 AM, Andres Freund wrote:

Imo this warrants an expedited point release :(



I presume anyone who is vulnerable to it would need to recreate their 
secondary servers to get rid of potential problems?


cheers

andrew


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] New option for pg_basebackup, to specify a different directory for pg_xlog

2013-11-19 Thread Tom Lane
Fujii Masao  writes:
> On second thought, I'm thinking that it might be overkill to add
> such complicated code for such a small benefit.

Yeah --- there's a limit to how much code we should expend on detecting
this type of error.  It's not like the case is all that plausible.

One idea that I don't think got discussed is stat'ing the two directories
and verifying that their dev/inode numbers are different.  I don't know
how portable that is though.
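On POSIX systems that check is only a few lines; a hedged sketch follows (dirs_are_same() is a made-up name, and st_ino is not meaningful on all Windows filesystems, which is exactly the portability caveat):

```c
#include <sys/stat.h>

/* Treat two paths as the same directory when their device and inode
 * numbers match.  Returns 1 if same, 0 if different, -1 if either
 * path cannot be stat'ed.  POSIX-only. */
static int
dirs_are_same(const char *a, const char *b)
{
	struct stat sa, sb;

	if (stat(a, &sa) != 0 || stat(b, &sb) != 0)
		return -1;
	return (sa.st_dev == sb.st_dev && sa.st_ino == sb.st_ino) ? 1 : 0;
}
```

Unlike path-string comparison, this also catches symlinked and relative spellings of the same directory, which is where the chdir/getcwd approach discussed upthread gets fiddly.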

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Review: pset autocomplete add missing options

2013-11-19 Thread Fujii Masao
On Sun, Nov 17, 2013 at 8:47 PM, Ian Lawrence Barwick  wrote:
> Review for Pavel Stehule's patch in CF 2013-11:
>
>   https://commitfest.postgresql.org/action/patch_view?id=1253
>
> Patch applies cleanly and works as intended; it's a very straightforward
> patch so no surprises there.
>
> The patch expands the range of completable items for \pset, putting
> them in alphabetical order and syncs them with the list in command.c
> introduced by Gilles Darold's earlier patch for \pset without any
> options ( https://commitfest.postgresql.org/action/patch_view?id=1202 ).
>
> However double-checking the options available to \pset, I see there
> is also 'fieldsep_zero' and 'recordsep_zero', which are special cases
> for 'fieldsep' and 'recordsep' respectively and which are therefore not
> displayed separately by \pset without-any-options, but should nevertheless
> be tab-completable. Modified patch attached to include these.

Thanks! Committed.

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: ALTER SYSTEM SET command to change postgresql.conf parameters (RE: [HACKERS] Proposal for Allow postgresql.conf values to be changed via SQL [review])

2013-11-19 Thread Amit Kapila
On Tue, Nov 19, 2013 at 2:06 PM, Haribabu kommi
 wrote:
> On 19 November 2013 09:59 Amit Kapila wrote:
>> On Mon, Nov 18, 2013 at 8:31 PM, Haribabu kommi
>>  wrote:
>> > On 18 November 2013 20:01 Amit Kapila wrote:
>> >> > Code changes are fine.
>> >> > If two or three errors are present in the configuration file, it
>> >> prints the last error
>
> Ok fine I marked the patch as ready for committer.

   Thanks for review.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] New option for pg_basebackup, to specify a different directory for pg_xlog

2013-11-19 Thread Magnus Hagander
On Tue, Nov 19, 2013 at 2:41 PM, Fujii Masao  wrote:

> On Tue, Nov 19, 2013 at 9:14 PM, Haribabu kommi
>  wrote:
> > On 18 November 2013 23:30 Fujii Masao wrote:
> >> On Tue, Nov 19, 2013 at 12:01 AM, Haribabu kommi
> >>  wrote:
> >> > On 18 November 2013 18:45 Fujii Masao wrote:
> >> >> On Mon, Nov 18, 2013 at 6:31 PM, Haribabu kommi
> >> >>  wrote:
> >> >> >
> >> >> > On 18 November 2013 11:19 Haribabu kommi wrote:
> >> >> >> On 17 November 2013 00:55 Fujii Masao wrote:
> >> >> >> > On Sat, Nov 16, 2013 at 2:27 PM, Haribabu kommi
> >> >> >> >  wrote:
> >> >> >> > > on 15 November 2013 17:26 Magnus Hagander wrote:
> >> >> >> > >
> >> >> >> > >>On Fri, Nov 15, 2013 at 12:10 PM, Haribabu kommi
> >> >> >> > >> wrote:
> >> >> >> > >
> >> >> >> > >>On 14 November 2013 23:59 Fujii Masao wrote:
> >> >> >> > >>> On Thu, Nov 14, 2013 at 9:08 PM, Haribabu kommi
> >> >> >> > >>>  wrote:
> >> >> >> > >>> > Please find attached the patch, for adding a new option
> >> >> >> > >>> > for pg_basebackup, to specify a different directory for
> >> pg_xlog.
> >> >> >> > 
> >> >> >> > >>> Sounds good! Here are the review comments:
> >> >> >> >  Don't we need to prevent users from specifying the same
> >> >> >> directory
> >> >> >> >  in both --pgdata and --xlogdir?
> >> >> >> > >
> >> >> >> > >>>I feel no need to prevent, even if user specifies both
> >> >> >> > >>>--pgdata
> >> >> >> and
> >> >> >> > >>>--xlogdir as same directory all the transaction log files
> >> >> >> > >>>will be created in the base directory  instead of pg_xlog
> >> directory.
> >> >> >> > >
> >> >> >> > >
> >> >> >> > >
> >> >> >> > >>Given how easy it would be to prevent that, I think we should.
> >> >> It
> >> >> >> > >>would be  an easy misunderstanding to specify that when you
> >> >> >> actually
> >> >> >> > >>want it in  /pg_xlog. Specifying that would be
> >> >> >> > >>redundant in the first place, but people can do that, but it
> >> >> >> > >
> >> >> >> > >>would also be very easy to do it by mistake, and you'd end up
> >> >> >> > >>with something that's really bad, including a recursive
> >> symlink.
> >> >> >> > >
> >> >> >> > >
> >> >> >> > >
> >> >> >> > > Presently with initdb also user can specify both data and
> >> xlog
> >> >> >> > > directories as same.
> >> >> >> > >
> >> >> >> > > To prevent the data directory and xlog directory as same,
> >> >> >> > > there is
> >> >> >> a
> >> >> >> > > way in windows (_fullpath api) to get absolute path from a
> >> >> >> > > relative path, but I didn't get any such possibilities in
> >> linux.
> >> >> >> > >
> >> >> >> > > I didn't find any other way to check it, if anyone have any
> >> >> >> > > idea regarding this please let me know.
> >> >> >> >
> >> >> >> > What about make_absolute_path() in miscinit.c?
> >> >> >>
> >> >> >> The make_absolute_path() function gets the current working
> >> >> >> directory and adds the relative path to it, which does not give
> >> >> >> a proper absolute path.
> >> >> >>
> >> >> >> I have added a new function verify_data_and_xlog_dir_same() which
> >> >> >> changes the current working directory to the data directory and
> >> >> >> gets the CWD, and does the same for the transaction log directory.
> >> >> >> It then compares the two absolute paths and throws an error if
> >> >> >> they are the same. Please check it once.
> >> >> >>
> >> >> >> Is there any other way to identify that the data and xlog
> >> >> >> directories are pointing to the same location, instead of
> >> >> >> comparing their absolute paths?
> >> >> >>
> >> >> >> Updated patch attached in the mail.
> >> >> >
> >> >> > Failure when the xlogdir doesn't exist is fixed in the updated
> >> >> > patch
> >> >> attached in the mail.
> >> >>
> >> >> Thanks for updating the patch!
> >> >>
> >> >> With the patch, when I specify the same directory in both -D and --
> >> >> xlogdir, I got the following error.
> >> >>
> >> >> $ cd /tmp
> >> >> $ pg_basebackup -D hoge --xlogdir=/tmp/hoge -X fetch
> >> >> pg_basebackup: could not change directory to "hoge": No such file or
> >> >> directory
> >> >
> >> > Thanks. After finding the xlog directory's absolute path, switching
> >> > back to the original working directory was missed; because of this it
> >> > failed while finding the absolute path of the data directory. Apart
> >> > from this change the code is reorganized a bit.
> >> >
> >> > Updated patch is attached in the mail. Please review it once.
> >>
> >> Thanks for newer version of the patch!
> >>
> >> I found that the empty base directory is created and remains even when
> >> the same directory is specified in both -D and --xlogdir and the error
> >> occurs.
> >> I think that it's
> >> better to throw an error in that case before creating any new directory.
> >> Thought?
> >
> > Only by creating the base directory can the patch find whether the
> provided base and
> > xlog directories are the same. To solve this problem the following
> options are possible:
> >
> > 1. No problem as it is just an empty base directory, so i

Re: [HACKERS] -d option for pg_isready is broken

2013-11-19 Thread Robert Haas
On Fri, Nov 15, 2013 at 9:01 PM, Fabrízio de Royes Mello
 wrote:
> On Wed, Nov 13, 2013 at 9:37 PM, Josh Berkus  wrote:
>> handyrep@john:~/handyrep$ pg_isready --version
>> pg_isready (PostgreSQL) 9.3.1
>>
>> handyrep@john:~/handyrep$ pg_isready -h john -p 5432 -U postgres -d
>> postgres -q
>> pg_isready: missing "=" after "postgres" in connection info string
>>
>> handyrep@john:~/handyrep$ pg_isready --host=john --port=5432
>> --user=postgres --dbname=postgres
>> pg_isready: missing "=" after "postgres" in connection info string
>>
>> handyrep@john:~/handyrep$ pg_isready -h john -p 5432 -U postgres
>> john:5432 - accepting connections
>>
>> so apparently the -d option:
>>
>> a) doesn't work, and
>> b) doesn't do anything
>>
>> I suggest simply removing it from the utility.
>>
>> I'll note that the -U option doesn't appear to do anything relevant
>> either, but at least it doesn't error unnecessarily:
>>
>> handyrep@john:~/handyrep$ pg_isready -h john -p 5432 -U no_such_user
>> john:5432 - accepting connections
>>
>
> The attached patch fix it.

Well, there still seem to be some problems.

[rhaas pgsql]$ src/bin/scripts/pg_isready -d host=sgdasasg
/tmp:5432 - no attempt

I added some debugging instrumentation and it looks like there's a
problem with this code:

if (strcmp(def->keyword, "hostaddr") == 0 ||
strcmp(def->keyword, "host") == 0)

The problem is that we end up executing the code contained in that
block twice, once for host and once for hostaddr.  Thus:

[rhaas pgsql]$ src/bin/scripts/pg_isready -d host=sgdasasg
def->keyword=host pghost_str=sgdasasg
def->keyword=hostaddr pghost_str=/tmp
def->keyword=port pgport_str=5432
/tmp:5432 - no attempt

Oops.
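One way to avoid the double execution described above is to scan the parsed options once, remember both values, and only then decide which one to display, preferring host over hostaddr. This is only a sketch: Option and pick_host() are illustrative stand-ins, not libpq's actual PQconninfoOption handling.

```c
#include <string.h>

/* Illustrative stand-in for libpq's PQconninfoOption. */
typedef struct
{
    const char *keyword;
    const char *val;            /* NULL if the option was not set */
} Option;

/* Scan the option list once and decide afterwards which host string to
 * display: an explicit "host" wins, "hostaddr" is the fallback, and the
 * compiled-in default is used when neither is set.  This avoids the bug
 * where whichever of the two keywords is visited last clobbers the
 * result. */
static const char *
pick_host(const Option *opts, int n, const char *default_host)
{
    const char *host = NULL;
    const char *hostaddr = NULL;

    for (int i = 0; i < n; i++)
    {
        if (strcmp(opts[i].keyword, "host") == 0 && opts[i].val != NULL)
            host = opts[i].val;
        else if (strcmp(opts[i].keyword, "hostaddr") == 0 && opts[i].val != NULL)
            hostaddr = opts[i].val;
    }
    if (host != NULL)
        return host;
    if (hostaddr != NULL)
        return hostaddr;
    return default_host;
}
```

With the options from the failing example (host=sgdasasg, hostaddr unset), pick_host() returns "sgdasasg" rather than the compiled-in default "/tmp".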

Nevertheless, that's kind of a separate problem, so we could address
in a separate commit.  But even ignoring that, this still isn't right:
according to the fine documentation, -d will be expanded not only if
it contains an =, but also if it starts with a valid URI prefix.  This
would break that documented behavior.

http://www.postgresql.org/docs/9.3/static/app-pg-isready.html
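The documented expansion rule — treat the -d argument as a connection string if it contains '=' or begins with a valid URI prefix — can be sketched as a small predicate. dbname_is_conninfo() is a hypothetical name; the two URI prefixes are the ones libpq documents.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical predicate: should the dbname argument be expanded as a
 * conninfo string?  Per the documentation, yes if it contains '=' or
 * starts with a valid URI prefix. */
static bool
dbname_is_conninfo(const char *dbname)
{
    return strchr(dbname, '=') != NULL ||
        strncmp(dbname, "postgresql://", strlen("postgresql://")) == 0 ||
        strncmp(dbname, "postgres://", strlen("postgres://")) == 0;
}
```

So a bare "postgres" is a database name, while "host=sgdasasg" and "postgres://john" are both expanded as connection strings.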

If you submit an updated patch, please fix the comment just above the
code you're changing to more accurately reflect what this does.

The number of bugs in pg_isready certainly seems out of proportion to
the surface area of the problem it solves.  Whee!

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] better atomics - v0.2

2013-11-19 Thread Tom Lane
Robert Haas  writes:
> On point #2, I don't personally know of any systems that I care about
> where inlining isn't supported.  However, we've gone to quite a bit of
> trouble relatively recently to keep things working for platforms where
> that is the case, so I feel the need for an obligatory disclaimer: I
> will not commit any patch that moves the wood in that area unless
> there is massive consensus that this is an acceptable way forward.
> Similarly for point #3: Heikki not long ago went to the trouble of
> unbreaking --disable-spinlocks, which suggests that there is at least
> some interest in continuing to be able to run in that mode.  I'm
> perfectly OK with the decision that we don't care about that any more,
> but I will not be the one to make that decision, and I think it
> requires a greater-than-normal level of consensus.

Hm.  Now that I think about it, isn't Peter proposing to break systems
without working "inline" over here?
http://www.postgresql.org/message-id/1384257026.8059.5.ca...@vanquo.pezone.net

Maybe that patch needs a bit more discussion.  I tend to agree that
significantly moving our portability goalposts shouldn't be undertaken
lightly.  On the other hand, it is 2013, and so it seems not unreasonable
to reconsider choices last made more than a dozen years ago.

Here's an idea that might serve as a useful touchstone for making choices:
compiler inadequacies are easier to fix than hardware/OS inadequacies.
Even a dozen years ago, people could get gcc running on any platform of
interest, and I seriously doubt that's less true today than back then.
So if anyone complains "my compiler doesn't support 'const'", we could
reasonably reply "so install gcc".  On the other hand, if the complaint is
"my hardware doesn't support 64-bit CAS", it's not reasonable to tell them
to buy a new server.

regards, tom lane




Re: [HACKERS] Data corruption issues using streaming replication on 9.0.14/9.2.5/9.3.1

2013-11-19 Thread Andres Freund
On 2013-11-19 09:33:34 -0500, Andrew Dunstan wrote:
> 
> On 11/19/2013 09:20 AM, Andres Freund wrote:
> >IMO this warrants an expedited point release :(
> 
> 
> I presume anyone who is vulnerable to it would need to recreate their
> secondary servers to get rid of potential problems?

Yes. There are less expensive ways to do it, but those seem too complicated
to suggest.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Proof of concept: standalone backend with full FE/BE protocol

2013-11-19 Thread Robert Haas
On Fri, Nov 15, 2013 at 6:51 AM, Simon Riggs  wrote:
> Not enough. This feature is clearly being suggested as a way to offer
> Postgres in embedded mode for users by a back door. Doing that forces
> us to turn off many of the server's features and we will take a huge
> step backwards in features, testing, maintainability of code and
> wasted community time.

That's not clear to me at all.  IIRC, the original idea was Tom's, and
the idea is to make it possible to have, for example, a psql session
connected to a standalone database, which can't be done right now.  I
don't use standalone mode much, but when I do, I'd sure like to have
the psql interface rather than the existing standalone mode interface.
 I'm not aware that there's anything in this patch which targets any
other use case; if there is, sure, rip it out.  But let's not assume
this is going in a bad direction, especially considering who it was
that suggested the idea originally.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Proof of concept: standalone backend with full FE/BE protocol

2013-11-19 Thread Tom Lane
Robert Haas  writes:
> On Fri, Nov 15, 2013 at 6:51 AM, Simon Riggs  wrote:
>> Not enough. This feature is clearly being suggested as a way to offer
>> Postgres in embedded mode for users by a back door. Doing that forces
>> us to turn off many of the server's features and we will take a huge
>> step backwards in features, testing, maintainability of code and
>> wasted community time.

> That's not clear to me at all.  IIRC, the original idea was Tom's, and
> the idea is to make it possible to have, for example, a psql session
> connected to a standalone database, which can't be done right now.  I
> don't use standalone mode much, but when I do, I'd sure like to have
> the psql interface rather than the existing standalone mode interface.

pg_dump from a standalone backend seems like another core use case.
That's not just more convenience, it's functionality you just don't
have right now.

I think that it might someday be interesting to allow a full server to be
started on-demand in this way.  But the patch as proposed is not that,
it's just a nicer interface to the existing standalone mode.

regards, tom lane




Re: [HACKERS] better atomics - v0.2

2013-11-19 Thread Peter Eisentraut
On 11/19/13, 9:57 AM, Tom Lane wrote:
> Hm.  Now that I think about it, isn't Peter proposing to break systems
> without working "inline" over here?
> http://www.postgresql.org/message-id/1384257026.8059.5.ca...@vanquo.pezone.net

No, that's about const, volatile, #, and memcmp.

I don't have an informed opinion about requiring inline support
(although it would surely be nice).





Re: [HACKERS] better atomics - v0.2

2013-11-19 Thread Andres Freund
On 2013-11-19 09:57:20 -0500, Tom Lane wrote:
> Robert Haas  writes:
> > On point #2, I don't personally know of any systems that I care about
> > where inlining isn't supported.  However, we've gone to quite a bit of
> > trouble relatively recently to keep things working for platforms where
> > that is the case, so I feel the need for an obligatory disclaimer: I
> > will not commit any patch that moves the wood in that area unless
> > there is massive consensus that this is an acceptable way forward.
> > Similarly for point #3: Heikki not long ago went to the trouble of
> > unbreaking --disable-spinlocks, which suggests that there is at least
> > some interest in continuing to be able to run in that mode.  I'm
> > perfectly OK with the decision that we don't care about that any more,
> > but I will not be the one to make that decision, and I think it
> > requires a greater-than-normal level of consensus.
> 
> Hm.  Now that I think about it, isn't Peter proposing to break systems
> without working "inline" over here?
> http://www.postgresql.org/message-id/1384257026.8059.5.ca...@vanquo.pezone.net

Hm. Those don't directly seem to affect our inline tests. Our test is
PGAC_C_INLINE which itself uses AC_C_INLINE and then tests that
unreferenced inlines don't generate warnings.

> Maybe that patch needs a bit more discussion.  I tend to agree that
> significantly moving our portability goalposts shouldn't be undertaken
> lightly.  On the other hand, it is 2013, and so it seems not unreasonable
> to reconsider choices last made more than a dozen years ago.

I don't think there's even platforms out there that don't support const
inlines, just some that unnecessarily generate warnings. But since
builds on those platforms are already littered with warnings, that
doesn't seem something we need to majorly be concerned with.

The only animal we have that doesn't support quiet inlines today is
HP-UX/ac++, and I think - as in patch 1 in the series - we might be able
to simply suppress the warning there.

> Here's an idea that might serve as a useful touchstone for making choices:
> compiler inadequacies are easier to fix than hardware/OS inadequacies.
> Even a dozen years ago, people could get gcc running on any platform of
> interest, and I seriously doubt that's less true today than back then.
> So if anyone complains "my compiler doesn't support 'const'", we could
> reasonably reply "so install gcc".

Agreed.

> On the other hand, if the complaint is
> "my hardware doesn't support 64-bit CAS", it's not reasonable to tell them
> to buy a new server.

Agreed. I am even wondering about 32-bit CAS, since it's not actually
that hard to emulate using spinlocks. Certainly less work than
arguing about removing stone-age platforms ;)
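As a sketch of that emulation — this is not PostgreSQL code: a pthread mutex stands in for a real spinlock, and a production version would presumably hash the target address into an array of locks rather than serialize everything through one:

```c
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>

static pthread_mutex_t cas_lock = PTHREAD_MUTEX_INITIALIZER;

/* Compare-and-swap emulated with a lock: atomic with respect to every
 * other access that also goes through this lock, which is exactly the
 * discipline an atomics fallback layer would have to enforce. */
static bool
emulated_cas_u32(volatile uint32_t *ptr, uint32_t expected, uint32_t newval)
{
    bool swapped = false;

    pthread_mutex_lock(&cas_lock);
    if (*ptr == expected)
    {
        *ptr = newval;
        swapped = true;
    }
    pthread_mutex_unlock(&cas_lock);
    return swapped;
}
```

The catch, as noted below, is that all readers and writers of the target word must go through the same lock for the emulation to be correct.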

There's no fundamental issue with continuing to support semaphore-based
spinlocks; I am just wondering about the point of supporting that
configuration, since it will always yield horrible performance.

The only fundamental thing that I don't immediately see how we can
support is the spinlock based memory barrier since that introduces a
circularity (atomics need barrier, barrier needs spinlocks, spinlock
needs atomics).

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Errors on missing pg_subtrans/ files with 9.3

2013-11-19 Thread J Smith
Alright, we'll look into doing that heading into the weekend.
Interestingly, we haven't experienced the issue since our main Java
developer made some modifications to our backend system. I'm not
entirely sure what the changes entail except that it's a one-liner
that involves re-SELECTing a table during a transaction. We'll
rollback this change and re-compile Postgres with google-coredumper
and let it run over the weekend and see where we stand.

Cheers

On Tue, Nov 19, 2013 at 9:14 AM, Robert Haas  wrote:
> On Fri, Nov 15, 2013 at 4:01 PM, J Smith  wrote:
>> On Fri, Nov 15, 2013 at 3:21 PM, Robert Haas  wrote:
>>> I think what would help the most is if you could arrange to obtain a
>>> stack backtrace at the point when the error is thrown.  Maybe put a
>>> long sleep call in just before the error happens, and when it gets
>>> stuck there, attach gdb and run bt full.
>>>
>>
>> That could potentially be doable. Perhaps I could use something like
>> google-coredumper or something similar to have a core dump generated
>> if the error comes up? Part of the problem is that the error is so
>> sporadic that it's going to be tough to say when the next one will
>> occur. For instance, we haven't changed our load on the server, yet
>> the error hasn't occurred since Nov 13, 15:01. I'd also like to avoid
>> blocking on the server with sleep or anything like that unless
>> absolutely necessary, as there are other services we have in
>> development that are using other databases on this cluster. (I can as
>> a matter of last resort, of course, but if google-coredumper can do
>> the job I'd like to give that a shot first.)
>>
>> Any hints on where I could insert something like this? Should I try
>> putting it into the section of elog.c dealing with ENOENT errors, or
>> try to find a spot closer to where the file itself is being opened? I
>> haven't looked at Postgres internals for a while now so I'm not quite
>> sure of the best location for this sort of thing.
>
> I'd look for the specific ereport() call that's firing, and put it
> just before that.
>
> (note that setting the error verbosity to 'verbose' will give you the
> file and line number where the error is happening, which is useful if
> the message can be generated from more than one place)
>
> I'm not familiar with google-coredumper but it sounds like a promising
> technique.
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company




Re: [HACKERS] better atomics - v0.2

2013-11-19 Thread Robert Haas
On Tue, Nov 19, 2013 at 9:57 AM, Tom Lane  wrote:
> Robert Haas  writes:
>> On point #2, I don't personally know of any systems that I care about
>> where inlining isn't supported.  However, we've gone to quite a bit of
>> trouble relatively recently to keep things working for platforms where
>> that is the case, so I feel the need for an obligatory disclaimer: I
>> will not commit any patch that moves the wood in that area unless
>> there is massive consensus that this is an acceptable way forward.
>> Similarly for point #3: Heikki not long ago went to the trouble of
>> unbreaking --disable-spinlocks, which suggests that there is at least
>> some interest in continuing to be able to run in that mode.  I'm
>> perfectly OK with the decision that we don't care about that any more,
>> but I will not be the one to make that decision, and I think it
>> requires a greater-than-normal level of consensus.
>
> Hm.  Now that I think about it, isn't Peter proposing to break systems
> without working "inline" over here?
> http://www.postgresql.org/message-id/1384257026.8059.5.ca...@vanquo.pezone.net
>
> Maybe that patch needs a bit more discussion.  I tend to agree that
> significantly moving our portability goalposts shouldn't be undertaken
> lightly.  On the other hand, it is 2013, and so it seems not unreasonable
> to reconsider choices last made more than a dozen years ago.

Sure.  inline isn't C89, and to date, we've been quite rigorous about
enforcing C89-compliance, so I'm going to try not to be the guy that
casually breaks that.  On the other hand, I haven't personally seen
any systems that don't have inline in a very long time.  The first
systems I did development on were not even C89; they were K&R, and you
had to use old-style function declarations.  It's probably safe to say
that they didn't support inline, either, though I don't recall trying
it.  But the last time I used a system like that was probably in the
early nineties, and it's been a long time since then.  However, I
upgrade my hardware about every 2 years, so I'm not the best guy to
say what the oldest stuff people still care about is.

> Here's an idea that might serve as a useful touchstone for making choices:
> compiler inadequacies are easier to fix than hardware/OS inadequacies.
> Even a dozen years ago, people could get gcc running on any platform of
> interest, and I seriously doubt that's less true today than back then.
> So if anyone complains "my compiler doesn't support 'const'", we could
> reasonably reply "so install gcc".  On the other hand, if the complaint is
> "my hardware doesn't support 64-bit CAS", it's not reasonable to tell them
> to buy a new server.

I generally agree with that principle, but I'd offer the caveat that
some people do still have valid reasons for wanting to use non-GCC
compilers.  I have been told that HP's aCC produces better results on
Itanium than gcc, for example.  Still, I think we could probably make
a list of no more than half-a-dozen compilers that we really care
about - the popular open source ones (is there anything we need to
care about other than gcc and clang?) and the ones supported by large
vendors (IBM's ICC and HP's aCC being the ones that I know about) and
adopt features that are supported by everything on that list.

Another point is that, some releases ago, we began requiring working
64-bit arithmetic as per commit
d15cb38dec01fcc8d882706a3fa493517597c76b.  We might want to go through
our list of nominally-supported platforms and see whether any of them
would fail to compile for that reason.  Based on the commit message
for the aforementioned commit, such platforms haven't been viable for
PostgreSQL since 8.3, so we're not losing anything by canning them.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] better atomics - v0.2

2013-11-19 Thread Robert Haas
On Tue, Nov 19, 2013 at 10:16 AM, Andres Freund  wrote:
>> On the other hand, if the complaint is
>> "my hardware doesn't support 64-bit CAS", it's not reasonable to tell them
>> to buy a new server.
>
> Agreed. I am even wondering about 32-bit CAS, since it's not actually
> that hard to emulate using spinlocks. Certainly less work than
> arguing about removing stone-age platforms ;)
>
> There's no fundamental issue with continuing to support semaphore-based
> spinlocks; I am just wondering about the point of supporting that
> configuration, since it will always yield horrible performance.
>
> The only fundamental thing that I don't immediately see how we can
> support is the spinlock based memory barrier since that introduces a
> circularity (atomics need barrier, barrier needs spinlocks, spinlock
> needs atomics).

We've been pretty much assuming for a long time that calling a
function in another translation unit acts as a compiler barrier.
There's a lot of code that isn't actually safe against global
optimization; we assume, for example, that memory accesses can't move
over an LWLockAcquire(), but that's just using spinlocks internally,
and those aren't guaranteed to be compiler barriers, per previous
discussion.  So one idea for a compiler barrier is just to define a
function call pg_compiler_barrier() in a file by itself, and make that
the fallback implementation.  That will of course fail if someone uses
a globally optimizing compiler, but I think it'd be OK to say that if
you want to do that, you'd better have a real barrier implementation.
Right now, it's probably unsafe regardless.
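A sketch of that fallback, with hypothetical names. In a real build the empty function would live alone in its own translation unit so the optimizer cannot see its body at call sites; it is shown as one file here so the sketch compiles standalone.

```c
/* barrier_fallback.c — would be compiled in a file by itself.  Because
 * the compiler cannot see the body at the call site (absent link-time or
 * global optimization), it must assume the call may read or write any
 * object whose address has escaped, and so cannot reorder memory
 * accesses across the call. */
void
pg_compiler_barrier_impl(void)
{
}

/* barrier.h — the fallback definition of the barrier macro. */
extern void pg_compiler_barrier_impl(void);
#define pg_compiler_barrier() pg_compiler_barrier_impl()
```

As noted above, link-time optimization defeats this, so a platform using a globally optimizing compiler would need a real barrier implementation.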

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] better atomics - v0.2

2013-11-19 Thread Andres Freund
On 2013-11-19 09:12:42 -0500, Robert Haas wrote:
> On point #1, I dunno.  It looks like a lot of rearrangement to me, and
> I'm not really sure what the final form of it is intended to be.

Understandable. I am not that sure what parts we want to rearrange
either. We very well could leave barrier.h and s_lock.h untouched and
just maintain the atomics stuff in parallel and switch over at some
later point.
I've mainly switched over s_lock.h to a) get some good coverage of the
atomics code b) make sure the api works fine for it. We could add the
atomics.h based implementation as a fallback for the cases in which no
native implementation is available.

> I think it desperately needs a README explaining what the point of all
> this is and how to add support for a new platform or compiler if yours
> doesn't work.

Ok, I'll work on that. It would be useful to get feedback on the
"frontend" API in atomics.h though. If people are unhappy with the API
it might very well change how we can implement the API on different
architectures.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] better atomics - v0.2

2013-11-19 Thread Robert Haas
On Tue, Nov 19, 2013 at 10:26 AM, Andres Freund  wrote:
> On 2013-11-19 09:12:42 -0500, Robert Haas wrote:
>> On point #1, I dunno.  It looks like a lot of rearrangement to me, and
>> I'm not really sure what the final form of it is intended to be.
>
> Understandable. I am not that sure what parts we want to rearrange
> either. We very well could leave barrier.h and s_lock.h untouched and
> just maintain the atomics stuff in parallel and switch over at some
> later point.
> I've mainly switched over s_lock.h to a) get some good coverage of the
> atomics code b) make sure the api works fine for it. We could add the
> atomics.h based implementation as a fallback for the cases in which no
> native implementation is available.
>
>> I think it desperately needs a README explaining what the point of all
>> this is and how to add support for a new platform or compiler if yours
>> doesn't work.
>
> Ok, I'll work on that. It would be useful to get feedback on the
> "frontend" API in atomics.h though. If people are unhappy with the API
> it might very well change how we can implement the API on different
> architectures.

Sure, but if people can't understand the API, then it's hard to decide
whether to be unhappy with it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] better atomics - v0.2

2013-11-19 Thread Andres Freund
On 2013-11-19 10:23:57 -0500, Robert Haas wrote:
> > The only fundamental thing that I don't immediately see how we can
> > support is the spinlock based memory barrier since that introduces a
> > circularity (atomics need barrier, barrier needs spinlocks, spinlock
> > needs atomics).
>
> We've been pretty much assuming for a long time that calling a
> function in another translation unit acts as a compiler barrier.
> There's a lot of code that isn't actually safe against global
> optimization; we assume, for example, that memory accesses can't move
> over an LWLockAcquire(), but that's just using spinlocks internally,
> and those aren't guaranteed to be compiler barriers, per previous
> discussion.  So one idea for a compiler barrier is just to define a
> function call pg_compiler_barrier() in a file by itself, and make that
> the fallback implementation.  That will of course fail if someone uses
> a globally optimizing compiler, but I think it'd be OK to say that if
> you want to do that, you'd better have a real barrier implementation.

That works for compiler, but not for memory barriers :/

> Right now, it's probably unsafe regardless.

Yes, I have pretty little trust in the current state. Both from the
infrastructure perspective (spinlocks, barriers) as from individual
pieces of code. To a good part we're probably primarily protected by
x86's black magic and the fact that everyone with sufficient concurrency
to see problems uses x86.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] better atomics - v0.2

2013-11-19 Thread Tom Lane
Peter Eisentraut  writes:
> On 11/19/13, 9:57 AM, Tom Lane wrote:
>> Hm.  Now that I think about it, isn't Peter proposing to break systems
>> without working "inline" over here?
>> http://www.postgresql.org/message-id/1384257026.8059.5.ca...@vanquo.pezone.net

> No, that's about const, volatile, #, and memcmp.

Ah, sorry, not enough caffeine absorbed yet.  Still, we should stop
to think about whether this represents an undesirable move of the
portability goalposts.  The first three of these are certainly
compiler issues, and I personally don't have a problem with blowing
off compilers that still haven't managed to implement all of C89 :-(.
I'm not clear on which systems had the memcmp issue --- do we have
the full story on that?

> I don't have an informed opinion about requiring inline support
> (although it would surely be nice).

inline is C99, and we've generally resisted requiring C99 features.
Maybe it's time to move that goalpost, and maybe not.

regards, tom lane




Re: [HACKERS] better atomics - v0.2

2013-11-19 Thread Robert Haas
On Tue, Nov 19, 2013 at 10:31 AM, Andres Freund  wrote:
> On 2013-11-19 10:23:57 -0500, Robert Haas wrote:
>> > The only fundamental thing that I don't immediately see how we can
>> > support is the spinlock based memory barrier since that introduces a
>> > circularity (atomics need barrier, barrier needs spinlocks, spinlock
>> > needs atomics).
>>
>> We've been pretty much assuming for a long time that calling a
>> function in another translation unit acts as a compiler barrier.
>> There's a lot of code that isn't actually safe against global
>> optimization; we assume, for example, that memory accesses can't move
>> over an LWLockAcquire(), but that's just using spinlocks internally,
>> and those aren't guaranteed to be compiler barriers, per previous
>> discussion.  So one idea for a compiler barrier is just to define a
>> function call pg_compiler_barrier() in a file by itself, and make that
>> the fallback implementation.  That will of course fail if someone uses
>> a globally optimizing compiler, but I think it'd be OK to say that if
>> you want to do that, you'd better have a real barrier implementation.
>
> That works for compiler, but not for memory barriers :/

True, but we already assume that a spinlock is a memory barrier minus
a compiler barrier.  So if you have a working compiler barrier, you
ought to be able to fix spinlocks to be memory barriers.  And then, if
you need a memory barrier for some other purpose, you can always fall
back to acquiring and releasing a spinlock.

Maybe that's too contorted.
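That fallback chain might look like the sketch below, with a pthread mutex standing in for PostgreSQL's slock_t. This is illustrative only — whether a given spinlock implementation is in fact a full memory barrier is exactly what the thread is debating.

```c
#include <pthread.h>

static pthread_mutex_t barrier_lock = PTHREAD_MUTEX_INITIALIZER;

/* Fallback memory barrier: acquire and release a dummy lock.  If the
 * lock operations are full memory barriers, no load or store can move
 * across the pair — at the cost of an uncontended lock round-trip per
 * barrier. */
static void
pg_memory_barrier_fallback(void)
{
    pthread_mutex_lock(&barrier_lock);
    pthread_mutex_unlock(&barrier_lock);
}
```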

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] better atomics - v0.2

2013-11-19 Thread Andres Freund
On 2013-11-19 10:30:24 -0500, Tom Lane wrote:
> > I don't have an informed opinion about requiring inline support
> > (although it would surely be nice).
> 
> inline is C99, and we've generally resisted requiring C99 features.
> Maybe it's time to move that goalpost, and maybe not.

But it's a part of C99 that was very widely implemented before, so even
if we don't want to rely on C99 in its entirety, relying on inline
support is realistic.

I think, independent from atomics, the readability & maintainability win
by relying on inline functions instead of long macros, potentially with
multiple eval hazards, or contortions like ILIST_INCLUDE_DEFINITIONS is
significant.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] better atomics - v0.2

2013-11-19 Thread Tom Lane
Andres Freund  writes:
> The only animal we have that doesn't support quiet inlines today is
> HP-UX/ac++, and I think - as in patch 1 in the series - we might be able
> to simply suppress the warning there.

Or just not worry about it, if it's only a warning?  Or does the warning
mean code bloat (lots of useless copies of the inline function)?

regards, tom lane




Re: [HACKERS] Assertions in PL/PgSQL

2013-11-19 Thread Robert Haas
On Sun, Nov 17, 2013 at 5:10 PM, Tom Lane  wrote:
> Pavel Stehule  writes:
>> [ rebased patch for RAISE WHEN ]
>
> I have to say I do not see the point of this.  It does nothing you
> can't do already with "IF condition THEN RAISE ...".  And frankly
> the RAISE statement has got too darn many options already.  We don't
> need yet more cruft on it that we'll have to maintain forevermore.
>
> If this were improving standards compliance somehow, I'd be okay
> with it; but what other implementation has got this?

This is a fair point.  I think the goal was to get to RAISE ASSERT
WHEN ...; then, if assertions are off, you do nothing; if they're on,
you error.  "IF condition THEN RAISE ..." isn't a suitable surrogate in
that case because you incur the overhead of testing the condition
regardless.
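
A sketch of the two forms under discussion; the RAISE ... WHEN syntax is from the proposed (unreleased) patch, so treat it as illustrative only:

```sql
-- Today: the condition is evaluated every time, whether or not
-- assertions are enabled.
IF balance < 0 THEN
    RAISE EXCEPTION 'balance must not be negative';
END IF;

-- Proposed: with assertions disabled, the whole statement could be
-- skipped, so the condition is never evaluated.
RAISE ASSERT WHEN balance < 0;
```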

Now that having been said, I'm a bit wary of adding every new frammish
someone suggests to PL/pgsql.  Many of the things we've added recently
are things I anticipate that I'll never use.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] better atomics - v0.2

2013-11-19 Thread Andres Freund
On 2013-11-19 10:37:35 -0500, Tom Lane wrote:
> Andres Freund  writes:
> > The only animal we have that doesn't support quiet inlines today is
> > HP-UX/ac++, and I think - as in patch 1 in the series - we might be able
> > to simply suppress the warning there.
> 
> Or just not worry about it, if it's only a warning?  Or does the warning
> mean code bloat (lots of useless copies of the inline function)?

I honestly have no idea whether it causes code bloat - I'd be surprised
if it did since it detects that they are unused, but I cannot rule it
out entirely.
The suggested patch - untested since I have no access to HP-UX - just
adds +W2177 to the compiler's commandline in template/hpux which
supposedly suppresses that warning.

I think removing the quiet inline test is a good idea, but that doesn't
preclude fixing the warnings at the same time.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Wildcard usage enhancements in .pgpass

2013-11-19 Thread Robert Haas
On Mon, Nov 18, 2013 at 1:57 AM, Alexey Klyukin  wrote:
> Hi Martijn,
>
> On Sun, Nov 17, 2013 at 7:56 PM, Martijn van Oosterhout 
> wrote:
>>
>> On Sat, Nov 16, 2013 at 09:26:33PM +0100, Alexey Klyukin wrote:
>> > Hi,
>> >
>> > Attached is the patch that improves usage of '*' wildcard in .pgpass,
>> > particularly in the host part. The use case is below.
>>
>> Looks interesting, though I wonder if you could use fnmatch(3) here. Or
>> would that match more than you expect?  For example, it would allow
>> 'foo*bar' to match 'foo.bar' which your code doesn't.
>
> fnmatch(3) looks like a good fit, and I'd certainly consider it if we go
> down the road of matching regular expressions, although for simpler use
> cases it's overkill, since it forces us to do an extra pass over the string
> to be matched and introduces the performance penalty of using a
> regexp-matching engine.

Also, it seems likely that it's not supported on every platform (e.g.
Windows), possibly then requiring a src/port implementation.  I'd
rather avoid introducing new platform dependencies unless they do
something really key.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Data corruption issues using streaming replication on 9.0.14/9.2.5/9.3.1

2013-11-19 Thread Christophe Pettus

On Nov 19, 2013, at 6:59 AM, Andres Freund  wrote:

> Yes. There are less expensive ways to do it, but those seem too complicated
> to suggest.

If this is something that could be built into a tool, acknowledging the
complexity, I'd be happy to see about building it.

--
-- Christophe Pettus
   x...@thebuild.com





Re: [HACKERS] nested hstore patch

2013-11-19 Thread Bruce Momjian
On Thu, Nov 14, 2013 at 05:50:02PM +0100, Hannu Krosing wrote:
> If we could somehow turn "old json" into a text domain with a json syntax
> check (which is what it really is up to 9.3) via pg_upgrade, that would be
> great.
> 
> It would then require pg_dump to have some switch to output a
> different type name in CREATE TABLE and similar.

I don't think pg_upgrade is in a position to handle this.  I think it
would require a script to be run after pg_upgrade completes.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +




Re: [HACKERS] nested hstore patch

2013-11-19 Thread Robert Haas
On Wed, Nov 13, 2013 at 6:59 PM, Hannu Krosing  wrote:
> I remember strong voices in support of *not* normalising json, so that
> things like
>
> {"a":1, "a":true, "a":"b", "a":null}
>
> would go through the system unaltered, for the claimed standard usage of
> json as "processing instructions". That is, as source code which can
> possibly be converted to a JavaScript object, and not something that
> would come out of serialising any existing JavaScript object.

Yeah, as the guy who wrote the original version of the JSON type,
which works just exactly like the XML type does, I strongly object to
changing the behavior.  And doubly so now that it's released, as we
would be breaking backward compatibility.
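
For the record, the behavioral split being discussed is the one that eventually shipped (json in 9.2, jsonb later in 9.4): json keeps the input text verbatim, duplicate keys and all, while the object-semantics type normalizes:

```sql
-- json stores the source text as-is, duplicates preserved:
SELECT '{"a":1, "a":true}'::json;
--  {"a":1, "a":true}

-- an object-semantics type (jsonb, as later released) keeps only the
-- last value for each key and normalizes whitespace:
SELECT '{"a":1, "a":true}'::jsonb;
--  {"a": true}
```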

> I suggest we add another type, maybe jsobj, which has input and output
> as standard "JSON" but which is defined from the start to be the
> equivalent of an existing object and not "preservable source code" for
> such an object.

I think this was the consensus solution when this was last discussed,
and I support it.  There is similar space for a binary XML data type
if someone feels like implementing it.  I think the names that were
proposed previously were something like jsonb and xmlb.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] nested hstore patch

2013-11-19 Thread Bruce Momjian
On Tue, Nov 19, 2013 at 10:51:06AM -0500, Robert Haas wrote:
> I think this was the consensus solution when this was last discussed,
> and I support it.  There is similar space for a binary XML data type
> if someone feels like implementing it.  I think the names that were
> proposed previously were something like jsonb and xmlb.

The natural name is OBJSON, meaning object JSON, because as PostgreSQL
people, we have to double-use letters wherever possible.  ;-)

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +




Re: [HACKERS] Review: pre-commit triggers

2013-11-19 Thread Robert Haas
On Mon, Nov 18, 2013 at 9:39 AM, Ian Lawrence Barwick  wrote:
>   postgres=# BEGIN ;
>   BEGIN
>   postgres=*# INSERT INTO foo (id) VALUES (1);
>   INSERT 0 1
>   postgres=*# COMMIT ;
>   NOTICE:  Pre-commit trigger called
>   ERROR:  relation "bar" does not exist
>   LINE 1: SELECT foo FROM bar
>  ^
>   QUERY:  SELECT foo FROM bar
>   CONTEXT:  PL/pgSQL function pre_commit_trigger() line 4 at EXECUTE statement
>   postgres=#
>
> I'd expect this to lead to a failed transaction block,
> or at least some sort of notice that the transaction itself
> has been rolled back.

Ending up in a failed transaction block would be wrong.  If the user
does a BEGIN, a bunch of stuff, and a COMMIT, they're entitled to
assume without checking that they are no longer in a transaction
block.  The COMMIT may have actually performed a ROLLBACK, but one way
or the other the transaction block will have ended.  This is important
for things like psql <
my-dumb-script-with-several-begin-commit-blocks.

It is a little less clear whether it's best for the COMMIT to return
an ERROR message or something else, but I think the ERROR is probably
the best solution.  There is already commit-time code that can fail
today, so there should be precedent here.  And I suspect anything
other than ERROR will be really messy to implement.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] nested hstore patch

2013-11-19 Thread Andrew Dunstan


On 11/19/2013 10:51 AM, Robert Haas wrote:



I suggest we add another type, maybe jsobj, which has input and output
as standard "JSON" but which is defined from the start to be the
equivalent of an existing object and not "preservable source code" for
such an object.

I think this was the consensus solution when this was last discussed,
and I support it.  There is similar space for a binary XML data type
if someone feels like implementing it.  I think the names that were
proposed previously were something like jsonb and xmlb.




I think that's the consensus position on a strategy.


JSONB seems to be the current winner in the name stakes.


cheers

andrew





Re: [HACKERS] nested hstore patch

2013-11-19 Thread Robert Haas
On Tue, Nov 19, 2013 at 10:55 AM, Bruce Momjian  wrote:
> On Tue, Nov 19, 2013 at 10:51:06AM -0500, Robert Haas wrote:
>> I think this was the consensus solution when this was last discussed,
>> and I support it.  There is similar space for a binary XML data type
>> if someone feels like implementing it.  I think the names that were
>> proposed previously were something like jsonb and xmlb.
>
> The natural name is OBJSON, meaning object JSON, because as PostgreSQL
> people, we have to double-use letters wherever possible.  ;-)

Personally, I think the patch author should just run ps auxww | md5 |
sed 's/^[^0-9]//' | cut -c1-8 and call the new data type by the
resulting name.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Assertions in PL/PgSQL

2013-11-19 Thread Pavel Stehule
2013/11/19 Robert Haas 

> On Sun, Nov 17, 2013 at 5:10 PM, Tom Lane  wrote:
> > Pavel Stehule  writes:
> >> [ rebased patch for RAISE WHEN ]
> >
> > I have to say I do not see the point of this.  It does nothing you
> > can't do already with "IF condition THEN RAISE ...".  And frankly
> > the RAISE statement has got too darn many options already.  We don't
> > need yet more cruft on it that we'll have to maintain forevermore.
> >
> > If this were improving standards compliance somehow, I'd be okay
> > with it; but what other implementation has got this?
>
> This is a fair point.  I think the goal was to get to RAISE ASSERT
> WHEN ...; then, if assertions are off, you do nothing; if they're on,
> you error.  "IF condition THEN RAISE ..." isn't a suitable surrogate in
> that case because you incur the overhead of testing the condition
> regardless.
>
> Now that having been said, I'm a bit wary of adding every new frammish
> someone suggests to PL/pgsql.  Many of the things we've added recently
> are things I anticipate that I'll never use.
>

A lot of features become popular with some delay. CTEs are very popular
now, but two years ago only a few developers used them. A lot of
applications are still developed for 9.1.

Regards

Pavel


>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>


Re: [HACKERS] nested hstore patch

2013-11-19 Thread Andrew Dunstan


On 11/19/2013 11:00 AM, Robert Haas wrote:

On Tue, Nov 19, 2013 at 10:55 AM, Bruce Momjian  wrote:

On Tue, Nov 19, 2013 at 10:51:06AM -0500, Robert Haas wrote:

I think this was the consensus solution when this was last discussed,
and I support it.  There is similar space for a binary XML data type
if someone feels like implementing it.  I think the names that were
proposed previously were something like jsonb and xmlb.

The natural name is OBJSON, meaning object JSON, because as PostgreSQL
people, we have to double-use letters wherever possible.  ;-)

Personally, I think the patch author should just run ps auxww | md5 |
sed 's/^[^0-9]//' | cut -c1-8 and call the new data type by the
resulting name.




My personal vote goes for "marmaduke". I've been dying to call some 
builtin object that for ages.


cheers

andrew




Re: [HACKERS] Review: pre-commit triggers

2013-11-19 Thread Andrew Dunstan


On 11/19/2013 10:58 AM, Robert Haas wrote:

On Mon, Nov 18, 2013 at 9:39 AM, Ian Lawrence Barwick  wrote:

   postgres=# BEGIN ;
   BEGIN
   postgres=*# INSERT INTO foo (id) VALUES (1);
   INSERT 0 1
   postgres=*# COMMIT ;
   NOTICE:  Pre-commit trigger called
   ERROR:  relation "bar" does not exist
   LINE 1: SELECT foo FROM bar
  ^
   QUERY:  SELECT foo FROM bar
   CONTEXT:  PL/pgSQL function pre_commit_trigger() line 4 at EXECUTE statement
   postgres=#

I'd expect this to lead to a failed transaction block,
or at least some sort of notice that the transaction itself
has been rolled back.

Ending up in a failed transaction block would be wrong.  If the user
does a BEGIN, a bunch of stuff, and a COMMIT, they're entitled to
assume without checking that they are no longer in a transaction
block.  The COMMIT may have actually performed a ROLLBACK, but one way
or the other the transaction block will have ended.  This is important
for things like psql <
my-dumb-script-with-several-begin-commit-blocks.

It is a little less clear whether it's best for the COMMIT to return
an ERROR message or something else, but I think the ERROR is probably
the best solution.  There is already commit-time code that can fail
today, so there should be precedent here.  And I suspect anything
other than ERROR will be really messy to implement.





OK, you've convinced me.

cheers

andrew




Re: [HACKERS] UNNEST with multiple args, and TABLE with multiple funcs

2013-11-19 Thread Tom Lane
I wrote:
> Andrew Gierth  writes:
>> Here is a new patch with the following changes on top of Heikki's
>> version (all the changes in which I've otherwise kept):

> Here is an updated version:

I've been hacking on this patch all day yesterday.  What I'm on about at
the moment is reversing the decision to move range functions' funccoltypes
etc into FuncExpr.  That's a bad idea on the grounds of bloating FuncExpr,
but the real problem with it is this: what happens if the planner decides
to inline or const-simplify the function expression?  You just lost a
critical part of the RTE's infrastructure, that's what.  So that's got to
go, and I've been fooling with different ways to represent the info for
multiple functions within RangeTblEntry.  What I have at the moment is

/*
 * Fields valid for a function RTE (else NIL/zero):
 *
 * There can be multiple function expressions in a function RTE.
 * funccoldeflist is an integer list (of the same length as funcexprs)
 * containing true if function had a column definition list, else false.
 * funccolcounts is an integer list (of the same length as funcexprs)
 * showing the number of RTE output columns produced by each function.
 * The length of eref->colnames must be equal to either the sum of the
 * funccolcounts entries, or one more than the sum if funcordinality is
 * true.  funccoltypes, funccoltypmods, and funccolcollations give type
 * information about each output column (these lists must have the same
 * length as eref->colnames).  Remember that when a function returns a
 * named composite type, any dropped columns in that type will have dummy
 * corresponding entries in these lists.
 *
 * Note: funccoltypes etc are derived from either the functions' declared
 * result types, or their column definition lists in case of functions
 * returning RECORD.  Storing this data in the RTE is redundant in the
 * former case, but for simplicity we store it always.
 */
List   *funcexprs;/* expression trees for func calls */
List   *funccoldeflist;   /* integer list of has-coldeflist booleans */
List   *funccolcounts;/* number of output columns from each func */
List   *funccoltypes; /* OID list of column type OIDs */
List   *funccoltypmods;   /* integer list of column typmods */
List   *funccolcollations;/* OID list of column collation OIDs */
boolfuncordinality;   /* is this called WITH ORDINALITY? */

which has the advantage that the ordinality column is no longer such a
special case, it's right there in the lists.  However, it turns out that
in most places where I thought we could just consult the entry-per-column
lists, we can't.  We still have to do the get_expr_result_type() dance,
because we need up-to-date information about which columns of a
composite-returning function's output have been dropped since the RTE was
made.  That means we'd have to chase the entry-per-function lists in
parallel with the entry-per-column lists, which is a PITA.

I'm thinking possibly it's worth inventing a new Node type that would just
be infrastructure for RTE_FUNCTION RTEs, so that we'd have something like
this in RangeTblEntry:

List   *functions;/* List of RangeTblFunction nodes */
boolfuncordinality;   /* is this called WITH ORDINALITY? */

and a node type RangeTblFunction with fields

Node   *funcexpr; /* executable expression for the function */
int funccolcount; /* number of columns emitted by function */
/* These lists are NIL unless function had a column definition list: */
List   *funccoltypes; /* OID list of column type OIDs */
List   *funccoltypmods;   /* integer list of column typmods */
List   *funccolcollations;/* OID list of column collation OIDs */
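
Collected into a node definition, the fields above might read as follows (NodeTag boilerplate assumed; this is a sketch of the proposal in this email, not committed code):

```c
typedef struct RangeTblFunction
{
    NodeTag     type;
    Node       *funcexpr;       /* executable expression for the function */
    int         funccolcount;   /* number of columns emitted by function */
    /* These lists are NIL unless function had a column definition list: */
    List       *funccoltypes;   /* OID list of column type OIDs */
    List       *funccoltypmods; /* integer list of column typmods */
    List       *funccolcollations;  /* OID list of column collation OIDs */
} RangeTblFunction;
```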

BTW, the reason we need to store the column count explicitly is that we
have to ignore the added columns if a composite type has had an ADD COLUMN
done to it since the RTE was made.  The submitted patch fails rather
nastily in such cases, if the composite type isn't last in the function
list.

Thoughts, better ideas?

regards, tom lane




Re: [HACKERS] additional json functionality

2013-11-19 Thread Robert Haas
On Thu, Nov 14, 2013 at 2:54 PM, Hannu Krosing  wrote:
> I am sure you could also devise an json encoding scheme
> where white space is significant ;)

I don't even have to think hard.  If you want your JSON to be
human-readable, it's entirely possible that you want it stored using
the same whitespace that was present on input.  There is a valid use
case for normalizing whitespace, too, of course.

Everyone on this thread who thinks that there is Only One Right Way To
Do It should take a chill pill.  There is, in fact, more than one
right way to do it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Changing pg_dump default file format

2013-11-19 Thread Robert Haas
On Mon, Nov 18, 2013 at 3:21 PM, Bruce Momjian  wrote:
> On Thu, Nov  7, 2013 at 02:40:17PM -0500, Peter Eisentraut wrote:
>> On 11/7/13, 1:00 PM, Josh Berkus wrote:
>> > If we wanted to change the defaults, I think it would be easier to
>> > create a separate bin name (e.g. pg_backup) than to change the existing
>> > parameters for pg_dump.
>>
>> Note the following code in pg_dump.c:
>>
>> /* Set default options based on progname */
>> if (strcmp(progname, "pg_backup") == 0)
>> format = "c";
>
> Wow, when was that added?  git blame says 2002:
>
>   9f0ae0c8 (Tom Lane   2002-05-10 22:36:27 +   387)   /* Set 
> default options based on progname */
>   9f0ae0c8 (Tom Lane   2002-05-10 22:36:27 +   388)   if 
> (strcmp(progname, "pg_backup") == 0)
>   9f0ae0c8 (Tom Lane   2002-05-10 22:36:27 +   389)   format 
> = "c";
>
> However, pggit log 9f0ae0c82060e3dcd1fa7ac8bbe35a3f9a44dbba does not
> show that line being added by the diff.

I dunno what your pggit script does, but "git log" doesn't normally
show the diff at all.  "git show
9f0ae0c82060e3dcd1fa7ac8bbe35a3f9a44dbba", however, does show that
line being added.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Assertions in PL/PgSQL

2013-11-19 Thread Robert Haas
On Tue, Nov 19, 2013 at 10:59 AM, Pavel Stehule  wrote:
>> Now that having been said, I'm a bit wary of adding every new frammish
>> someone suggests to PL/pgsql.  Many of the things we've added recently
>> are things I anticipate that I'll never use.
>
> A lot of features become popular with some delay. CTEs are very popular now,
> but two years ago only a few developers used them. A lot of applications are
> still developed for 9.1.

I think that's true, but not particularly relevant.  CTEs are
obviously a major feature; a lot of the stuff we've been adding to
PL/pgsql is tinkering around the edges.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Changing pg_dump default file format

2013-11-19 Thread Tom Lane
Robert Haas  writes:
>> However, pggit log 9f0ae0c82060e3dcd1fa7ac8bbe35a3f9a44dbba does not
>> show that line being added by the diff.

> I dunno what your pggit script does, but "git log" doesn't normally
> show the diff at all.  "git show
> 9f0ae0c82060e3dcd1fa7ac8bbe35a3f9a44dbba", however, does show that
> line being added.

It also shows it being removed, ie this is just -u diff format being
unhelpful, as it so often is.

The actual source of the code seems to be commit
e8f69be054e9343b3c41d7e77cc142913ee55439.

regards, tom lane




Re: [HACKERS] pre-commit triggers

2013-11-19 Thread Noah Misch
On Tue, Nov 19, 2013 at 08:54:49AM -0500, Andrew Dunstan wrote:
> 
> On 11/19/2013 12:45 AM, Noah Misch wrote:
> >On Fri, Nov 15, 2013 at 01:01:48PM -0500, Andrew Dunstan wrote:
> >>The triggers don't fire if there is no real XID, so only actual data
> >>changes should cause the trigger to fire.
> >What's the advantage of this provision?  Without it, an individual trigger
> >could make the same check and drop out quickly.  A trigger not wanting it
> >can't so easily work around its presence, though.  Heretofore, skipping XID
> >assignment has been an implementation detail that improves performance 
> >without
> >otherwise calling user attention to itself.  This provision would make the
> >decision to acquire an XID (where optional) affect application behavior.
> >
> 
> 
> Mainly speed. How is the trigger (especially if not written in C)
> going to check the same thing?

Probably through a thin C function calling GetCurrentTransactionIdIfAny().  If
using C is not an option, one could query pg_locks.
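
A sketch of the pg_locks approach: once a backend has been assigned a real XID it holds a lock of type 'transactionid' on that XID, so its presence can be tested from SQL:

```sql
-- True if the current backend has been assigned a (real) transaction id.
SELECT EXISTS (
    SELECT 1
    FROM pg_locks
    WHERE locktype = 'transactionid'
      AND pid = pg_backend_pid()
) AS xact_has_xid;
```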

> Conventional triggers don't fire except on data changing events, so
> this seemed consistent with that.

The definitions of "data changing event" differ, though.  An UPDATE that finds
no rows to change will fire statement-level triggers, but this commit trigger
would not fire.

> Perhaps my understanding of when XIDs are acquired is insufficient.
> When exactly is it optional?

The following commands force XID assignment, but I think that's an
implementation detail rather than a consequence of their identity as
data-changing events:

SELECT ... FOR 
NOTIFY
PREPARE TRANSACTION (gets an XID even if nothing else had done so)

Also, parents of XID-bearing subtransactions always have XIDs, even if all
subtransactions that modified data have aborted.  This, too, is an
implementation artifact.

-- 
Noah Misch
EnterpriseDB http://www.enterprisedb.com




Re: [HACKERS] shared memory message queues

2013-11-19 Thread Robert Haas
On Tue, Nov 19, 2013 at 12:33 AM, Kohei KaiGai  wrote:
> * on-dsm-detach-v2.patch
> It reminded me of the hook registration/invocation mechanism in apache/httpd.
> It defines five levels for invocation order (REALLY_FIRST, FIRST, MIDDLE,
> LAST, REALLY_LAST), but these are aliases for integer values; in other words,
> the hooks are just invoked in order of the priority value associated with
> each function pointer.
> This flexibility may make sense. We may want to control the order in which
> resources are released in a more fine-grained way in the future. For example,
> isn't it a problem if we have only two levels when something in a queue needs
> to be released earlier than the queue itself?
> I'm not 100% certain about this suggestion because on_shmem_exit is a hook
> that does not host so many callbacks, so extensions may of course implement
> their own way on top of the SHMEM_EXIT_EARLY or SHMEM_EXIT_LATE stage.

I don't really see much point in adding more flexibility here until we
need it, but I can imagine that we might someday need it, for reasons
that are not now obvious.

> * shm-toc-v1.patch
>
> From my experience, it makes sense to put a magic number at the tail of
> the toc segment to detect shared-memory usage overruns. It helps to find
> bugs that are hard to find because of concurrent jobs.
> Probably, shm_toc_freespace() is the point to put an assertion.
>
> How many entries is shm_toc_lookup() assumed to chain?
> It assumes a linear search from the head of the shm_toc segment, and that
> is a prerequisite for lock-less reference, isn't it?
> Is it a problem if shm_toc hosts many entries, like 100 or 1000?
> Or is that not an expected usage?

It is not an expected usage.  In typical usage, I expect that the
number of TOC entries will be about N+K, where K is a small constant
(< 10) and N is the number of cooperating parallel workers.  It's
possible that we'll someday be in a position to leverage 100 or 1000
parallel workers on the same task, but I don't expect it to be soon.
And, actually, I doubt that revising the data structure would pay off
at N=100.  At N=1000, maybe.  At N=1, probably.  But we are
*definitely* not going to need that kind of scale any time soon, and I
don't think it makes sense to design a complex data structure to
handle that case when there are so many more basic problems that need
to be solved before we can go there.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] More legacy code: pg_ctl

2013-11-19 Thread Robert Haas
On Mon, Nov 18, 2013 at 8:20 PM, Josh Berkus  wrote:
> On 11/18/2013 05:09 PM, Josh Berkus wrote:
>> Folks,
>>
>> Speaking of legacy code with bad default behaviors: pg_ctl.  The current
>> utility is designed to faithfully reproduce the rather hackish shell
>> script we originally had for postgres startup.   This results in a
>> couple of unintuitive defaults which give sysadmins and config
>> management developers headaches:
>>
>> a) by default, it returns to the caller without waiting for postgres to
>> actually start/stop/restart.  In this mode, it also always returns
>> success regardless of result.
>>
>> b) by default, it remains attached to the caller's tty for stdout and
>> stderr, even after it has switched to the regular postgres log.
>
> Oh, and one more:
>
> c) that "stop" defaults to "smart" mode, instead of "fast" mode.

And that "smart" mode is called "smart" instead of "footgun".
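
Until any defaults change, scripts can opt out of all three behaviors explicitly; a sketch using the long-standing pg_ctl flags (-w wait, -t timeout, -m shutdown mode, -l logfile):

```shell
# Wait for completion (-w), fail after 60 seconds (-t 60), and use a
# fast shutdown instead of the default "smart" mode.
pg_ctl -D "$PGDATA" -w -t 60 -m fast stop

# On start, detach from the tty and send server output to a log file.
pg_ctl -D "$PGDATA" -w -l "$PGDATA/server.log" start
```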

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Re: Suggestion: Issue warning when calling SET TRANSACTION outside transaction block

2013-11-19 Thread Robert Haas
On Mon, Nov 18, 2013 at 9:07 PM, Bruce Momjian  wrote:
> Well, ERROR is what LOCK returns, so if we change SET TRANSACTION to be
> WARNING, we should change LOCK too, so on backward-compatibility
> grounds, ERROR makes more sense.
>
> Personally, I am fine with changing them all to WARNING.

I don't think it's worth breaking backward compatibility.  I'm not
entirely sure what I would have decided here in a vacuum, but at this
point existing precedent seems determinative.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Review: pre-commit triggers

2013-11-19 Thread Tom Lane
Robert Haas  writes:
> On Mon, Nov 18, 2013 at 9:39 AM, Ian Lawrence Barwick  
> wrote:
>> I'd expect this to lead to a failed transaction block,
>> or at least some sort of notice that the transaction itself
>> has been rolled back.

> Ending up in a failed transaction block would be wrong.  If the user
> does a BEGIN, a bunch of stuff, and a COMMIT, they're entitled to
> assume without checking that they are no longer in a transaction
> block.

Absolutely.  There are plenty of ways to fail at COMMIT already,
eg deferred foreign key constraints.  This shouldn't act any
different.

regards, tom lane




Re: [HACKERS] Logging WAL when updating hintbit

2013-11-19 Thread Robert Haas
On Thu, Nov 14, 2013 at 1:02 AM, Sawada Masahiko  wrote:
> The attached patch adds a new wal_level, 'all'.
> If wal_level is set to 'all', the server logs everything it does at
> 'hot_standby', and also logs WAL when updating hint bits.
> That is, we will be able to know all of the changed block numbers by
> reading the WAL.
> This wal_level is infrastructure for fast failback (i.e., without a fresh
> backup).
> It needs to cooperate with pg_rewind.

This needs to be a separate parameter, not something that gets jammed
into wal_level.

I'm also not 100% sure we want it, but let's hear what others think.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] pre-commit triggers

2013-11-19 Thread Andres Freund
Hi,

On 2013-11-15 13:01:48 -0500, Andrew Dunstan wrote:
> Attached is a patch to provide a new event trigger that will fire on
> transaction commit. I have tried to make certain that it fires at a
> sufficiently early stage in the commit process that some of the evils
> mentioned in previous discussions on this topic aren't relevant.
> 
> The triggers don't fire if there is no real XID, so only actual data changes
> should cause the trigger to fire. They also don't fire in single user mode,
> so that if you do something stupid like create a trigger that
> unconditionally raises an error you have a way to recover.
> 
> This is intended to be somewhat similar to the same feature in the Firebird
> database, and the initial demand came from a client migrating from that
> system to Postgres.

Could you explain a bit what the use case of this is and why it's not
sufficient to allow constraint triggers to work on a statement level?
"Just" that there would be multiple ones fired?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] pre-commit triggers

2013-11-19 Thread Robert Haas
On Tue, Nov 19, 2013 at 12:45 AM, Noah Misch  wrote:
> On Fri, Nov 15, 2013 at 01:01:48PM -0500, Andrew Dunstan wrote:
>> The triggers don't fire if there is no real XID, so only actual data
>> changes should cause the trigger to fire.
>
> What's the advantage of this provision?  Without it, an individual trigger
> could make the same check and drop out quickly.  A trigger not wanting it
> can't so easily work around its presence, though.  Heretofore, skipping XID
> assignment has been an implementation detail that improves performance without
> otherwise calling user attention to itself.  This provision would make the
> decision to acquire an XID (where optional) affect application behavior.

Yeah, I agree that that's an ugly wart.  If we want a pre-commit
trigger that's only called for transactions that write data, we at
least need to name it appropriately.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Assertions in PL/PgSQL

2013-11-19 Thread Pavel Stehule
2013/11/19 Robert Haas 

> On Tue, Nov 19, 2013 at 10:59 AM, Pavel Stehule 
> wrote:
> >> Now that having been said, I'm a bit wary of adding every new frammish
> >> someone suggests to PL/pgsql.  Many of the things we've added recently
> >> are things I anticipate that I'll never use.
> >
> > Lots of features become popular with some delay. CTEs are very popular
> > now, and two years ago only a few developers used them. Lots of
> > applications are still developed for 9.1.
>
> I think that's true, but not particularly relevant.  CTEs are
> obviously a major feature; a lot of the stuff we've been adding to
> PL/pgsql is tinkering around the edges.
>
>
I agree that almost all of the recent features are minor rather than major
- but I don't think that is wrong (and structured exception handling is not
a minor feature). Almost all of the work is done.

There are only a few issues that should be solved:

* deeper checking of embedded SQL
* more robust work with nested types - assign statement
* some support for large and complex projects (support for developer tools
like coverage calculation, dependency graphs and assertions)

Regards

Pavel



> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>


Re: [HACKERS] Logging WAL when updating hintbit

2013-11-19 Thread Sawada Masahiko
On Tue, Nov 19, 2013 at 3:54 PM, KONDO Mitsumasa
 wrote:
> (2013/11/15 19:27), Sawada Masahiko wrote:
>>
>> On Thu, Nov 14, 2013 at 7:51 PM, Florian Weimer 
>> wrote:
>>>
>>> On 11/14/2013 07:02 AM, Sawada Masahiko wrote:
>>>
 I attached patch adds new wal_level 'all'.
>>>
>>>
>>>
>>> Shouldn't this be a separate setting?  It's useful for storage which
>>> requires rewriting a partially written sector before it can be read
>>> again.
>>>
>>
>> Thank you for comment.
>> Actually, I had thought to add separate parameter.
>
> I think he said that if you can prove that the amount of WAL is almost the
> same and performance is no worse than before, you might not need a separate
> parameter in your patch.
>

Thanks!
I misunderstood.
I think there is quite a large difference in the amount of WAL.

> Did you test about amount of WAL size in your patch?

Not yet. I will do that.

Regards,

---
Sawada Masahiko




[HACKERS] Re: Suggestion: Issue warning when calling SET TRANSACTION outside transaction block

2013-11-19 Thread David Johnston
Robert Haas wrote
> On Mon, Nov 18, 2013 at 9:07 PM, Bruce Momjian <

> bruce@

> > wrote:
>> Well, ERROR is what LOCK returns, so if we change SET TRANSACTION to be
>> WARNING, we should change LOCK too, so on backward-compatibility
>> grounds, ERROR makes more sense.
>>
>> Personally, I am fine with changing them all to WARNING.
> 
> I don't think it's worth breaking backward compatibility.  I'm not
> entirely sure what I would have decided here in a vacuum, but at this
> point existing precedent seems determinative.

Well, at this point we have already broken backward compatibility by
releasing this.  With Tom's thread necromancy I missed the fact that this
got released in 9.3.

Now, given normal upgrade realities, the people likely to have this bite
them are probably a ways out from upgrading, so I wouldn't expect to have
seen many complaints yet - but at the same time I do not recall seeing any
complaints yet (limited to -bugs and -general).

The referenced patch:

is released
is documented
is consistent with precedent established by similar codepaths
causes an obvious error in what is considered broken code
can be trivially corrected by a user willing and able to update their
application

I'd say leave this as-is and only re-evaluate the decision if complaints are
brought forth.

David J.








Re: [HACKERS] Replication Node Identifiers and crashsafe Apply Progress

2013-11-19 Thread Andres Freund
Hi,

On 2013-11-19 07:40:30 -0500, Robert Haas wrote:
> > This way, after a crash, restart, disconnect the replay process can look
> > into shared memory and check how far it has already replayed and restart
> > seamlessly. With minimal effort.
> 
> It would be much less invasive for the replication apply code to fsync
> its own state on the apply side.  Obviously, that means doubling the
> fsync rate, which is not appealing, but I think that's still a useful
> way to think about what you're aiming to accomplish here: avoid
> doubling the fsync rate when applying remote transactions in a
> crash-safe manner.

Exactly.

> Although I agree that we need a way to do that, I don't have a
> particularly warm and fuzzy feeling about this particular proposal:
> there are too many bits of it that feel like entirely arbitrary design
> decisions.  If we're going to build a full-fledged logical replication
> solution into core, attempting to obsolete Slony and Bucardo and
> Londiste and everything that's out there, then I think we have a great
> deal of design work that we have to do before we start committing
> things, or even finalizing designs.  If we're going to continue with
> the philosophy of building a toolkit that can serve as a building
> block for multiple solutions, then color me unconvinced that this will
> do the job.

Imo we actually want and need both, wanting something builtin doesn't
preclude important usecases that need to be served by other solutions.

I think - while the API certainly needs work - the general idea
integrates pretty well with the fairly generic changeset extraction
mechanism and with possible solutions for replication between postgres
servers.

Note that this really is a draft of what I think is needed, written
after the experience of developing a solution for the problem in a
specific replication solution and talking to some people implementing
replication solutions. Maybe somebody has a far better idea to implement
this: I am all ears!

> If we made the xlog system truly extensible, that seems like it'd
> punch your ticket here.  I'm not sure how practical that is, though.

I don't think it is.

> > We previously discussed the topic and some were very adverse to using
> > any sort of numeric node identifiers across systems and suggested that
> > those should only be used internally. So what the attached patch does is
> > to add a new shared system catalog called 'pg_replication_identifier'
> > (suggestions for a better name welcome) which translates a number of
> > identifying traits into a numeric identifier.
> > The set of identifiers currently are:
> > * the sysid of the remote system, combined with the remote TLI
> > * the oid of the local database
> > * the oid of the remote database
> > * an optional name
> > but that's just what we needed in our multimaster prototype, and not
> > what I necessarily think is correct.
> 
> The fact that you've included both local and remote database OIDs
> seems wrong; shouldn't the replication identifier only serve to
> identify the source node, not the replication stream?  What if you
> want to replicate from table A to table B within the same database?

The reason I chose those parameters is that they avoid the need for a
human to assign identifiers in many situations since they already are
unique. For the cases where they aren't I've included the "name" to
distinguish several streams.

The reason both source and target database are included is that it
avoids manual work if you want to replicate between two databases in
both directions.

> We need some kind of pretty flexible system here, if we're not to box
> ourselves into a corner.

Agreed. As an alternative we could just have a single - probably longer
than NAMEDATALEN - string to identify replication progress and rely on
the users of the facility to build the identifier automatically
themselves using components that are helpful in their system.
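The interning idea behind such a catalog - whether keyed by a tuple of
identifying traits or by a single opaque string - can be sketched as a toy
model in Python (all names here are illustrative, not the actual catalog
definition):

```python
# Toy sketch of a pg_replication_identifier-style catalog: a tuple of
# identifying traits is interned into a small, stable numeric id that is
# cheap to store in per-change replay bookkeeping. Illustrative only.

class ReplicationIdentifiers:
    def __init__(self):
        self._by_traits = {}
        self._next_id = 1

    def intern(self, remote_sysid, remote_tli, local_dboid,
               remote_dboid, name=""):
        # the optional name distinguishes multiple streams between the
        # same pair of databases
        traits = (remote_sysid, remote_tli, local_dboid, remote_dboid, name)
        if traits not in self._by_traits:
            self._by_traits[traits] = self._next_id
            self._next_id += 1
        return self._by_traits[traits]
```

Re-interning the same traits always yields the same small id, which is the
property the apply-progress tracking relies on.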

Thanks,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Re: Suggestion: Issue warning when calling SET TRANSACTION outside transaction block

2013-11-19 Thread Robert Haas
On Tue, Nov 19, 2013 at 11:53 AM, David Johnston  wrote:
> Well, at this point we have already broken backward compatibility by
> releasing this.  With Tom's thread necromancy I missed the fact this got
> released in 9.3

Eh, really?  I don't see it in 9.3.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Large offset optimization and index only scan

2013-11-19 Thread Robert Haas
On Tue, Nov 19, 2013 at 2:03 AM, Maxim Boguk  wrote:
> Case:
> SELECT * FROM very_large_table ORDER BY [some indexed column(s)]  OFFSET
> [some large value] LIMIT N;
>
> Idea:
> Use IOS (index only scan) to skip the first OFFSET values, then switch to
> common index scan to fetch LIMIT N values

Interesting.  This is really a variant of an optimization that I
believe to have been originally proposed by Heikki.  His idea was to
have an Index-Only Scan node that would only ever look at the index,
and the have a Heap Fetch node which would bubble up the plan tree to
a higher level.  So, for example, if you were doing something like
this:

SELECT * FROM medium_size_table a, huge_table b WHERE a.x = b.x AND
rarely_true(a.y, b.y)

...you could implement this as:

Heap Fetch On b
-> Nested Loop
-> Seq Scan on a
-> Index-Only Scan on b

That way, we'd apply the filter condition at the nested-loop level,
using some hypothetical index on x and y, before even testing the
visibility of the rows from b (and thus incurring random I/O).  The
reason why nobody implemented this is that (1) the planner challenges were
formidable and (2) if rarely_true isn't leakproof, the user (or some
other user) might find it unfortunate that it gets passed tuples not
visible to the current scan.  Thus, we stuck with a design where the
index-only scan always performs the heap fetch if that is needed to
determine visibility.

However, in your example, that wouldn't be the right thing anyway,
because you'd need to know whether the tuples were visible before
performing the limit.  So what you'd really want is to have index-only
scan work about the way it does now, except that if it does end up
fetching the heap tuple to check visibility, it somehow returns that.
If not, it returns only the index tuple, and then a Heap Fetch node
higher up the plan tree can fetch those heap tuples not yet fetched
without re-fetching those we've already got.  That seems complicated
to make work even in the executor, though, and the planning problem
makes my head hurt.  If we can sole those problems it might be neat,
though.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] CLUSTER FREEZE

2013-11-19 Thread Robert Haas
On Mon, Nov 18, 2013 at 11:45 AM, Andres Freund  wrote:
>> Yes, we probably should make a decision, unless Robert's idea can be
>> implemented.  We have to balance the ease of debugging MVCC failures
>> with the interface we give to the user community.
>
> Imo that patch really doesn't need too much further work.

Would you care to submit a version you're happy with?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] CLUSTER FREEZE

2013-11-19 Thread Andres Freund
On 2013-11-19 12:23:30 -0500, Robert Haas wrote:
> On Mon, Nov 18, 2013 at 11:45 AM, Andres Freund  
> wrote:
> >> Yes, we probably should make a decision, unless Robert's idea can be
> >> implemented.  We have to balance the ease of debugging MVCC failures
> >> with the interface we give to the user community.
> >
> > Imo that patch really doesn't need too much further work.
> 
> Would you care to submit a version you're happy with?

I'll give it a spin sometime this week.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Re: Suggestion: Issue warning when calling SET TRANSACTION outside transaction block

2013-11-19 Thread Tom Lane
David Johnston  writes:
> Robert Haas wrote
>> I don't think it's worth breaking backward compatibility.  I'm not
>> entirely sure what I would have decided here in a vacuum, but at this
>> point existing precedent seems determinative.

> Well, at this point we have already broken backward compatibility by
> releasing this.  With Tom's thread necromancy I missed the fact this got
> released in 9.3

Uh, what?  The commit I'm objecting to is certainly not in 9.3.
It's this one:

Author: Bruce Momjian 
Branch: master [a54141aeb] 2013-10-04 13:50:28 -0400

Issue error on SET outside transaction block in some cases

Issue error for SET LOCAL/CONSTRAINTS/TRANSACTION outside a transaction
block, as they have no effect.

Per suggestion from Morten Hustveit

I agree that it's too late to reconsider the behavior of pre-existing
cases such as LOCK TABLE, but that doesn't mean I can't complain about
this one.

regards, tom lane




Re: [HACKERS] pre-commit triggers

2013-11-19 Thread Dimitri Fontaine
Andrew Dunstan  writes:
> Perhaps my understanding of when XIDs are acquired is insufficient. When
> exactly is it optional?

My understanding of Noah's comment is that we would be exposing what
used to be an optimisation only implementation detail to the user, and
so we would need to properly document the current situation and would
probably be forbidden to change it in the future.

Then I guess it's back to the use cases: do we have use cases where it
would be interesting for the pre-commit trigger to only get fired when
an XID has been consumed?

I don't think so, because IIRC CREATE TEMP TABLE will consume an XID
even in an otherwise read-only transaction, and maybe the TEMP TABLE
writes will not be considered "actual writes" by the confused user.

What about specifying what notion of "data modifying" transactions
you're interested in and providing an SQL-callable C function that the
trigger user might then use, or even a new WHEN clause?

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support




Re: [HACKERS] additional json functionality

2013-11-19 Thread Robert Haas
On Tue, Nov 19, 2013 at 12:27 PM, David E. Wheeler
 wrote:
> On Nov 19, 2013, at 8:14 AM, Robert Haas  wrote:
>> Everyone on this thread who thinks that there is Only One Right Way To
>> Do It should take a chill pill.  There is, in fact, more than one
>> right way to do it.
>
> You shoulda been a Perl hacker, Robert.

I don't hack on Perl, but I spent about 10 years hacking *in* Perl, so
the phrasing was not a coincidence.  :-)

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Re: [BUGS] BUG #7873: pg_restore --clean tries to drop tables that don't exist

2013-11-19 Thread Robert Haas
On Tue, Nov 19, 2013 at 3:53 AM, Dean Rasheed  wrote:
> 1). Keep the existing syntax:
>
> DROP TRIGGER [ IF EXISTS ] name ON table_name [ CASCADE | RESTRICT ];
>
> but make it tolerate a non-existent table when "IF EXISTS" is specified.

I don't love this option, but I like it better than the other proposals.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] additional json functionality

2013-11-19 Thread David E. Wheeler
On Nov 19, 2013, at 8:14 AM, Robert Haas  wrote:

> Everyone on this thread who thinks that there is Only One Right Way To
> Do It should take a chill pill.  There is, in fact, more than one
> right way to do it.

You shoulda been a Perl hacker, Robert.

D




Re: [HACKERS] better atomics - v0.2

2013-11-19 Thread Robert Haas
On Tue, Nov 19, 2013 at 11:38 AM, Andres Freund  wrote:
> /*-
>  *
>  * atomics.h
>  *Generic atomic operations support.
>  *
>  * Hardware and compiler dependent functions for manipulating memory
>  * atomically and dealing with cache coherency. Used to implement locking
>  * facilities and other concurrency safe constructs.
>  *
>  * There are three data types which can be manipulated atomically:
>  *
>  * * pg_atomic_flag - supports atomic test/clear operations, useful for
>  *  implementing spinlocks and similar constructs.
>  *
>  * * pg_atomic_uint32 - unsigned 32bit integer that can be correctly
>  *  manipulated by several processes at the same time.
>  *
>  * * pg_atomic_uint64 - optional 64bit variant of pg_atomic_uint32. Support
>  *  can be tested by checking for PG_HAVE_64_BIT_ATOMICS.
>  *
>  * The values stored in these cannot be accessed using the API provided in
>  * this file; i.e., it is not possible to write statements like `var +=
>  * 10`. Instead, for this example, one would have to use
>  * `pg_atomic_fetch_add_u32(&var, 10)`.
>  *
>  * This restriction has several reasons: primarily, it prevents non-atomic
>  * math from being performed on these values, which wouldn't be concurrency
>  * safe. Secondly, it allows implementing these types on top of facilities -
>  * like C11's atomics - that don't provide direct access. Lastly, it serves as
>  * a useful hint as to what code should be looked at more carefully because of
>  * concurrency concerns.
>  * concurrency concerns.
>  *
>  * To be useful they usually will need to be placed in memory shared between
>  * processes or threads, most frequently by embedding them in structs. Be
>  * careful to align atomic variables to their own size!

What does that mean exactly?

>  * Before variables of one of these types can be used, they need to be
>  * initialized using the type's initialization function (pg_atomic_init_flag,
>  * pg_atomic_init_u32 and pg_atomic_init_u64) or in case of global
>  * pg_atomic_flag variables using the PG_ATOMIC_INIT_FLAG macro. These
>  * initializations have to be performed before any (possibly concurrent)
>  * access is possible, otherwise the results are undefined.
>  *
>  * For several mathematical operations two variants exist: one that returns
>  * the old value before the operation was performed, and one that returns
>  * the new value. *_fetch_<op> variants will return the old value,
>  * *_<op>_fetch the new one.

Ugh.  Do we really need this much complexity?

>  * Use higher level functionality (lwlocks, spinlocks, heavyweight locks)
>  * whenever possible. Writing correct code using these facilities is hard.
>  *
>  * For an introduction to using memory barriers within the PostgreSQL backend,
>  * see src/backend/storage/lmgr/README.barrier

The extent to which these primitives themselves provide ordering
guarantees should be documented.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Replication Node Identifiers and crashsafe Apply Progress

2013-11-19 Thread Robert Haas
On Tue, Nov 19, 2013 at 11:57 AM, Andres Freund  wrote:
> Agreed. As an alternative we could just have a single - probably longer
> than NAMEDATALEN - string to identify replication progress and rely on
> the users of the facility to build the identifier automatically
> themselves using components that are helpful in their system.

I tend to feel like a generic identifier would be better.  I'm not
sure why something like a UUID wouldn't be enough, though.
Arbitrary-length identifiers will be bad for performance, and 128 bits
ought to be enough for anyone.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Data corruption issues using streaming replication on 9.0.14/9.2.5/9.3.1

2013-11-19 Thread Josh Berkus
On 11/19/2013 06:20 AM, Andres Freund wrote:
> Hi,
> 
> On 2013-11-18 23:15:59 +0100, Andres Freund wrote:
>> Afaics it's likely a combination/interaction of bugs and fixes between:
>> * the initial HS code
>> * 5a031a5556ff83b8a9646892715d7fef415b83c3
>> * f44eedc3f0f347a856eea8590730769125964597
> 
> Yes, the combination of those is guilty.
> 
> Man, this is (to a good part my) bad.
> 
>> But that'd mean nobody noticed it during 9.3's beta...

Ah, so this affected 9.3.0 as well?

Maybe it's worth it now to devise some automated replication testing?

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] better atomics - v0.2

2013-11-19 Thread Andres Freund
On 2013-11-19 12:43:44 -0500, Robert Haas wrote:
> >  * To be useful they usually will need to be placed in memory shared between
> >  * processes or threads, most frequently by embedding them in structs. Be
> >  * careful to align atomic variables to their own size!
> 
> What does that mean exactly?

The alignment part? A pg_atomic_flag needs to be aligned to
sizeof(pg_atomic_flag), pg_atomic_uint32,64s to their sizeof()
respectively. When embedding them inside a struct the whole struct needs
to be allocated at least at that alignment.
Not sure how to describe that concisely.

I've added wrapper functions around the implementation that test for
alignment to make sure it's easy to spot such mistakes which could end
up having ugly results on some platforms.

> >  * For several mathematical operations two variants exist: One that returns
> >  * the old value before the operation was performed, and one that that 
> > returns
> >  * the new value. *_fetch_ variants will return the old value,
> >  * *__fetch the new one.
> 
> Ugh.  Do we really need this much complexity?

Well, based on my experience it's really what you need when writing code
using it. The *_<op>_fetch variants are implemented generically using the
*_fetch_<op> operations, so leaving them out wouldn't win us much. What
would you like to remove?
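The naming convention, and the generic implementation of one variant on top
of the other, can be sketched as a lock-based toy model in Python (real
atomics use hardware instructions, not a lock; this only illustrates the
old-value vs. new-value semantics):

```python
import threading

class AtomicU32:
    """Toy model of a pg_atomic_uint32-style type. Illustrative only."""

    def __init__(self, value=0):
        self._value = value & 0xFFFFFFFF
        self._lock = threading.Lock()

    def fetch_add(self, n):
        # *_fetch_<op> convention: returns the OLD value
        with self._lock:
            old = self._value
            self._value = (old + n) & 0xFFFFFFFF
            return old

    def add_fetch(self, n):
        # *_<op>_fetch convention: returns the NEW value; implemented
        # generically on top of fetch_add, as described in the thread
        return (self.fetch_add(n) + n) & 0xFFFFFFFF
```

Starting from 10, `fetch_add(5)` returns 10 while the stored value becomes
15; `add_fetch(5)` then returns 20.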

> >  * Use higher level functionality (lwlocks, spinlocks, heavyweight locks)
> >  * whenever possible. Writing correct code using these facilities is hard.
> >  *
> >  * For an introduction to using memory barriers within the PostgreSQL 
> > backend,
> >  * see src/backend/storage/lmgr/README.barrier
> 
> The extent to which these primitives themselves provide ordering
> guarantees should be documented.

Every function should have a comment to that regard, although two have
an XXX where I am not sure what we want to mandate, specifically whether
pg_atomic_test_set_flag() should be a full or acquire barrier and
whether pg_atomic_clear_flag should be full or release.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Data corruption issues using streaming replication on 9.0.14/9.2.5/9.3.1

2013-11-19 Thread Andres Freund
On 2013-11-19 09:51:28 -0800, Josh Berkus wrote:
> On 11/19/2013 06:20 AM, Andres Freund wrote:
> > Hi,
> > 
> > On 2013-11-18 23:15:59 +0100, Andres Freund wrote:
> >> Afaics it's likely a combination/interaction of bugs and fixes between:
> >> * the initial HS code
> >> * 5a031a5556ff83b8a9646892715d7fef415b83c3
> >> * f44eedc3f0f347a856eea8590730769125964597
> > 
> > Yes, the combination of those is guilty.
> > 
> > Man, this is (to a good part my) bad.
> > 
> >> But that'd mean nobody noticed it during 9.3's beta...
> 
> Ah, so this affected 9.3.0 as well?

Yes.

> Maybe it's worth it now to devise some automated replication testing?

It'd be a good idea, but I am not sure where to get resources for it
from.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] additional json functionality

2013-11-19 Thread Josh Berkus
On 11/19/2013 08:14 AM, Robert Haas wrote:
> On Thu, Nov 14, 2013 at 2:54 PM, Hannu Krosing  wrote:
>> I am sure you could also devise an json encoding scheme
>> where white space is significant ;)
> 
> I don't even have to think hard.  If you want your JSON to be
> human-readable, it's entirely possible that you want it stored using
> the same whitespace that was present on input.  There is a valid use
> case for normalizing whitespace, too, of course.

Given that JSON is a data interchange format, I suspect that there is an
extremely large number of combinations of factors, which would result in
an unimplementably large number of parser settings.  For example, I
personally would have use for a type which allowed the storage of JSON
*fragments*.  Therefore I am interested in supporting only two:

a) the legacy behavior from 9.2 and 9.3 so we don't destroy people's
apps, and

b) the optimal behavior for Hstore2/JSONB.

(a) is defined as:
* complete documents only (no fragments)
* whitespace not significant
* no reordering of keys
* duplicate keys allowed

(b) is defined as:
* complete documents only (no fragments)
* whitespace not significant
* reordering of keys
* duplicate keys prohibited 

If people want other manglings of JSON, they can use TEXT fields and
custom parsers written in PL/v8, the same way I do.
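The practical difference between (a) and (b) for key handling can be
illustrated with Python's json module (a stand-in only; the behaviors above
concern PostgreSQL's json type and the proposed jsonb, not Python):

```python
import json

doc = '{"b": 1, "a": 2, "a": 3}'

# (a)-style parse: keep key order and duplicate keys as they appeared
# in the input text, by collecting raw pairs instead of building a dict
legacy = json.loads(doc, object_pairs_hook=lambda pairs: pairs)

# (b)-style parse: duplicate keys collapse (last one wins) and the
# value can be re-serialized in a normalized key order
binary = json.loads(doc)
normalized = json.dumps(binary, sort_keys=True, separators=(',', ':'))
```

Here `legacy` retains all three pairs in input order, while `binary` keeps
only the last `"a"` and `normalized` reorders the keys - the same trade-off
the two proposed behaviors make.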

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



