Re: [HACKERS] a raft of parallelism-related bug fixes

2016-02-17 Thread Craig Ringer
On 9 February 2016 at 03:00, Joshua D. Drake  wrote:


>
> I think this further points to the need for more reviewers and fewer
> feature pushes. There are fundamental features that we could use, and this
> is one of them. It is certainly more important than, say, pglogical or BDR
> (not that those aren't useful, but we do have external solutions for that
> problem).



Well, with pglogical and BDR most of the work has been along similar
lines - getting the infrastructure in place: commit timestamps, logical
decoding, and other features that are useful well beyond pglogical/BDR.
Logical decoding in particular is rapidly becoming a really significant
feature as people start to see its potential in integration and ETL
processes.

I'm not sure anyone takes the pglogical downstream submission as a serious
attempt at inclusion in 9.6, and even the upstream submission was largely
an RFC as far as 9.6 is concerned. I don't think the downstream submission
took any significant time or attention away from other work.

The main result has been useful discussion of the remaining pieces needed
for DDL replication etc., and some greater awareness among others in the
community of what's going on in the area. I think that's a generally
useful thing.

>
>
> Oh: another thing that I would like to do is commit the isolation
>> tests I wrote for the deadlock detector a while back, which nobody has
>> reviewed either, though Tom and Alvaro seemed reasonably positive
>> about the concept.  Right now, the deadlock.c part of this patch isn't
>> tested at all by any of our regression test suites, because NOTHING in
>> deadlock.c is tested by any of our regression test suites.  You can
>> blow it up with dynamite and the regression tests are perfectly happy,
>> and that's pretty scary.
>>
>
> Test test test. Please commit.
>
>
Yeah. Enhancing the isolation tests would be useful. Please commit those
changes. Even if they broke something in the isolation tester - which isn't
likely - forward movement in test infrastructure is important and we should
IMO have a lower bar for committing changes there. They won't directly
affect code end users are running.

I should resurrect Abhijit's patch to allow the isolationtester to talk to
multiple servers. We'll want that when we're doing tests like "assert that
this change isn't visible on the replica before it becomes visible on the
master". (Well, except we violate that one with our funky
synchronous_commit implementation...)

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] WAL logging problem in 9.4.3?

2016-02-17 Thread Michael Paquier
On Thu, Feb 4, 2016 at 3:24 PM, Heikki Linnakangas  wrote:
> I dropped the ball on this one back in July, so here's an attempt to revive
> this thread.
>
> I spent some time fixing the remaining issues with the prototype patch I
> posted earlier, and rebased that on top of current git master. See attached.
>
> Some review of that would be nice. If there are no major issues with it, I'm
> going to create backpatchable versions of this for 9.4 and below.

I am going to look into that very soon. For now, so as not to forget
about this bug, I have added an entry in the CF app:
https://commitfest.postgresql.org/9/528/
-- 
Michael




Re: [HACKERS] Declarative partitioning

2016-02-17 Thread Amit Langote
On 2016/02/16 21:57, Robert Haas wrote:
> On Fri, Jan 15, 2016 at 5:48 AM, Amit Langote
>  wrote:
>> If we have a CREATE statement for each partition, how do we generalize
>> that to partitions at different levels? For example, if we use something
>> like the following to create a partition of parent_name:
>>
>> CREATE PARTITION partition_name OF parent_name FOR VALUES ...
>> WITH ... TABLESPACE ...
>>
>> Do we then say:
>>
>> CREATE PARTITION subpartition_name OF partition_name ...
>>
>> to create a level 2 partition (sub-partition) of parent_name?
> 
> Yes, exactly.
> 
> Personally, I would be more inclined to make this a CREATE TABLE statement, 
> like
> 
> CREATE TABLE partition_name PARTITION OF parent_name FOR VALUES ...
> CREATE TABLE subpartition_name PARTITION OF partition_name FOR VALUES ...

I guess the first of those would actually be:

CREATE TABLE partition_name PARTITION OF parent_name FOR VALUES ...
PARTITION BY ...

Some might think that writing potentially the same PARTITION BY clause 100
times for 100 level-1 partitions could be cumbersome. That is where the
SUBPARTITION BY notation could serve as a shorthand.

> I think if you've got SUBPARTITION as a keyword in the syntax
> anywhere, you're doing it wrong.  The toplevel object shouldn't really
> care whether its children are themselves partitioned or not.

This is fine for the internals. SUBPARTITION BY is merely syntactic sugar;
it might as well be just cascaded PARTITION BYs. The point is to specify as
many of those with CREATE TABLE toplevel as there are levels of
partitioning we want. That does, however, fix the number of levels in
advance.

To give some details of the tuple routing implementation in the patch I
have posted: the top-level parent knows only of its immediate partitions.
Once a level-1 partition from that list is determined using a tuple's
level-1 key, the tuple is passed down to choose one of its own partitions
using the level-2 key. The key descriptor list is associated with the
top-level parent alone, and the recursive code knows to iterate over that
list, applying the nth key at the nth level. The recursion happens in the
partition module.

There are also functions to obtain, say, *all* or only *leaf* partitions
(OIDs) of a top-level parent, but they are used only in certain DDL
scenarios. As far as DML is concerned, the level-at-a-time recursive
approach described above is used. Queries do not use it yet, because the
plan is a flattened append of leaf partitions anyway.

If such notational convenience is not worth the loss of generality, I'm
willing to go ahead and implement the SUBPARTITION-less syntax below (a
concrete sketch follows the list):

CREATE TABLE toplevel() PARTITION BY
CREATE TABLE partition PARTITION OF toplevel FOR VALUES ... PARTITION BY
CREATE TABLE subpartition PARTITION OF partition FOR VALUES
ALTER TABLE partitioned ATTACH PARTITION name FOR VALUES ... USING TABLE
ALTER TABLE partitioned DETACH PARTITION name [ USING newname ]
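
To make that concrete, here is a minimal sketch of a two-level setup under
that syntax (all names and bounds are illustrative, and the exact FOR
VALUES form is still under discussion):

CREATE TABLE measurements (logdate date, city text, reading int)
    PARTITION BY RANGE (logdate);

-- level-1 partition, itself partitioned further
CREATE TABLE measurements_2016 PARTITION OF measurements
    FOR VALUES START ('2016-01-01') END ('2017-01-01')
    PARTITION BY LIST (city);

-- level-2 partition (leaf)
CREATE TABLE measurements_2016_tokyo PARTITION OF measurements_2016
    FOR VALUES IN ('Tokyo');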

While we are on the syntax story, how about the FOR VALUES syntax for range
partitions (sorry for piggybacking it here in this message)? At least some
people dislike the LESS THAN notation. Corey Huinker says we should be
using range type literals for that. It's not clear, though, that using
range type literals directly is the way ahead. For one, the range type API
expects a range type to exist for the given element type, whereas all we
require for range partitioning proper is for the column type to have a
btree operator class. Should we require it to have an associated range type
as well? I don't think an API exists to check that either. All in all,
range types are good for implementing things in applications but not so
much within the backend (unless I'm missing something). I know reinventing
the wheel is disliked as well, but perhaps we could offer something like
the following, since Corey offered some examples that would benefit from
the flexibility:

START [ EXCL ] (startval) END [ INCL ] (endval)

That is, in range type notation, '[startval, endval)' is the default
behavior. So for each partition, there are at least the following pieces
of metadata:

Datum  *startval;
bool    startexcl;
Datum  *endval;
bool    endincl;
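
Applied to the above, a sketch of what the proposed notation would allow
(names and values purely illustrative):

CREATE TABLE p1 PARTITION OF parent
    FOR VALUES START (1) END (100);        -- the default: '[1, 100)'
CREATE TABLE p2 PARTITION OF parent
    FOR VALUES START (100) END INCL (200); -- '[100, 200]'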

That requires quite some gymnastics during DDL (with no help from the range
type infrastructure) to maintain the invariant that no two range partitions
overlap. Even gaps can result, which are considered undesirable, so maybe
the invariant to maintain should include no gaps in addition to no overlap.
That would have us looking for a definition of a "gap" for all sorts of
btree-supporting data types.

Some people said something akin to interval partitioning would be good, like:

PARTITION BY RANGE ON (columns) INCREMENT BY (INTERVAL '1 month' ) START
WITH value;

But that could be just a UI addition to the design where each partition
has [startval, endval) bounds. In any case, it's not version 1 material,
I'd think.

>> Do we want this at all? It seems difficult to generalize this to
>> multi-level hiera

Re: [HACKERS] ALTER ROLE SET/RESET for multiple options

2016-02-17 Thread Michael Paquier
On Wed, Feb 17, 2016 at 7:23 PM, Masahiko Sawada  wrote:
> On Wed, Feb 17, 2016 at 7:14 PM, Robert Haas  wrote:
>> On Wed, Feb 17, 2016 at 3:22 PM, Masahiko Sawada  
>> wrote:
>>> On Sat, Feb 13, 2016 at 2:45 PM, Robert Haas  wrote:
 On Fri, Feb 12, 2016 at 1:35 PM, Payal Singh  wrote:
> The feature seems to work as described, but is it necessary to enclose
> multiple GUC settings in parentheses? This seems a deviation from the
> usual syntax of altering multiple settings separated by commas.

 Well, note that you can say:

 ALTER USER bob SET search_path = a, b, c;

 I'm not sure how the parentheses help exactly; it seems like there is
 an inherent ambiguity either way.

>>>
>>> I thought it would be useful for a user who wants to set several GUC
>>> parameters per user, especially the case of changing logging parameters
>>> for each user. But it might not bring us fantastic usability.
>>
>> Yeah, it doesn't really seem like it's worth trying to figure out a
>> syntax for this that can work.  It just doesn't buy us very much vs.
>> issuing one ALTER command per setting.
>>
>
> Yeah, please mark this patch as 'rejected'.
> If I come up with another good idea, I will post it.

Done so.
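
For the record, the per-user logging configuration mentioned upthread is
already achievable by issuing one ALTER command per setting, e.g. (role
name and settings illustrative):

ALTER ROLE bob SET log_statement = 'all';
ALTER ROLE bob SET log_min_duration_statement = 0;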
-- 
Michael




Re: [HACKERS] proposal: function parse_ident

2016-02-17 Thread Pavel Stehule
Hi

2016-02-17 14:02 GMT+01:00 Teodor Sigaev :

> I missed some of my edits. Updated patch with those in place attached.
>>
>
> I'm sorry, but special characters still aren't escaped correctly in error
> messages:
>
> % select parse_ident(E'X\rXX');
> XX""X
> Time: 0,510 ms
>
>
:( ... I'll fix it today or tomorrow, when I have free time.

Pavel


>
> --
> Teodor Sigaev   E-mail: teo...@sigaev.ru
>                 WWW: http://www.sigaev.ru/


Re: [HACKERS] proposal: PL/Pythonu - function ereport

2016-02-17 Thread Pavel Stehule
2016-02-17 16:54 GMT+01:00 Catalin Iacob :

> On Wed, Feb 17, 2016 at 3:32 PM, Pavel Stehule 
> wrote:
> >> Python 3 has keyword-only arguments. It occurs to me they're exactly
> >> for "optional extra stuff" like detail, hint etc.
> >> Python 2 doesn't have that notion but you can kind of fake it since
> >> you get an args tuple and a kwargs dictionary.
> >
> >
> > I'd prefer the possibility to use both ways - the positional form is
> > shorter, while keywords can help with some parameters.
> >
> > But I cannot quite imagine your idea; can you show it in detail?
>
> Sure, what I mean is:
>
> plpy.error('msg')  # as before, produces message 'msg'
> plpy.error(42)  # as before, produces message '42', including the
>                 # conversion of the int to str
> plpy.error('msg', 'arg 2 is still part of msg')
>     # as before, produces message "('msg', 'arg 2 is still part of msg')"
> # and so on for as many positional arguments, nothing changes
> # I still think allowing more than one positional argument is
> # unfortunate, but for compatibility we keep allowing more
>
> # to pass detail you MUST use keyword args to disambiguate "I really
> # want detail" vs. "I have argument 2 which is part of the message
> # tuple for compatibility"
> plpy.error('msg', 42, detail='a detail')
>     # produces message "('msg', 42)" and detail 'a detail'
> plpy.error('msg', detail=77)
>     # produces message 'msg' and detail '77', so detail is also
>     # converted to str just like message, for consistency
> # and so on for the others
> plpy.error('msg', 42, detail='a detail', hint='a hint')
> plpy.error('msg', 42, schema='sch')
>
> Only keyword arguments are treated specially and we know no existing
> code has keyword arguments since they didn't work before.
>
> Implementation wise, it's something like this but in C:
>
> def error(*args, **kwargs):
>     if len(args) == 1:
>         message = str(args[0])
>     else:
>         message = str(args)
>
>     # fetch value from dictionary, or None if the key is missing
>     detail = kwargs.pop('detail', None)
>     hint = kwargs.pop('hint', None)
>
>     # use message, detail, hint etc. to raise an exception for
>     # error/fatal, and call ereport for the other levels
>
> Is it clear now? What do you think?
>

It doesn't look bad. Is there any possibility to emulate it with Python 2?
What do you think about a similar implementation for Python 2?

Regards

Pavel


Re: [HACKERS] WIP: Access method extendability

2016-02-17 Thread Michael Paquier
On Wed, Feb 17, 2016 at 11:56 PM, Teodor Sigaev  wrote:
> 11. I'd really like to see regression tests (TAP tests?) for replication
> with generic xlog.

The recovery test suite can help to accomplish that.
-- 
Michael




Re: [HACKERS] Fix handling of invalid sockets returned by PQsocket()

2016-02-17 Thread Michael Paquier
On Thu, Feb 18, 2016 at 1:58 AM, Alvaro Herrera
 wrote:
> Michael Paquier wrote:
>> Hi all,
>>
>> After looking at Alvaro's message mentioning the handling of
>> PQsocket() for invalid sockets, I just had a look by curiosity at
>> other calls of this routine, and found a couple of issues:
>> 1) In vacuumdb.c, init_slot() does not check for the return value of 
>> PQsocket():
>> slot->sock = PQsocket(conn);
>> 2) In isolationtester.c, try_complete_step() should do the same.
>> 3) In pg_recvlogical.c for StreamLogicalLog() I am spotting the same problem.
>> I guess those ones should be fixed as well, no?
>
> I patched pgbench to use PQerrorMessage rather than strerror(errno).  I
> think your patch should do the same.

OK, this looks like a good idea. I would suggest doing the same in
receivelog.c then.
-- 
Michael


pqsocket-error-fix-v2.patch
Description: binary/octet-stream



Re: [HACKERS] PostgreSQL Audit Extension

2016-02-17 Thread Bruce Momjian
On Wed, Feb 17, 2016 at 01:59:09PM +0530, Robert Haas wrote:
> On Wed, Feb 17, 2016 at 5:20 AM, Bruce Momjian  wrote:
> > On Fri, Feb  5, 2016 at 01:16:25PM -0500, Stephen Frost wrote:
> >> Looking at pgaudit and the other approaches to auditing which have been
> >> developed (eg: applications which sit in front of PG and essentially
> >> have to reimplement large bits of PG to then audit the commands sent
> >> before passing them to PG, or hacks which try to make sense out of log
> >> files full of SQL statements) makes it quite clear, in my view, that
> >> attempts to bolt auditing onto PG result in a poorer solution, from a
> >> technical perspective, than what this project is known for and capable
> >> of.  To make true progress towards that, however, we need to get past
> >> the thinking that auditing doesn't need to be in-core or that it should
> >> be a second-class citizen feature or that we don't need it in PG.
> >
> > Coming in late here, but the discussion around how to maintain the
> > auditing code seems very similar to how to handle the logical
> > replication of DDL commands.  First, have we looked into hooking
> > auditing into scanning logical replication contents, and second, how are
> > we handling the logical replication of DDL and could we use the same
> > approach for auditing?
> 
> Auditing needs to trace read-only events, which aren't reflected in
> logical replication in any way.  I think it's a good idea to try to
> drive auditing off of existing machinery instead of inventing
> something new - I suggested plan invalidation items upthread.  But I
> doubt that logical replication is the thing to attach it to.

I was suggesting we could track write events via logical replication and
then we only have to figure out auditing of read events, which would be
easier.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+ Roman grave inscription +




[HACKERS] pg_ctl promote wait

2016-02-17 Thread Peter Eisentraut
It would be nice if pg_ctl promote supported the -w (wait) option.

How could pg_ctl determine when the promotion has finished?

We could query pg_is_in_recovery(), but that would require database
access, which we got rid of in pg_ctl.
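
For reference, the check such a wait loop would poll, if we did use
database access, is simply:

  -- returns false once promotion has completed
  SELECT pg_is_in_recovery();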

We could check when recovery.conf is gone or recovery.done appears.
This could work, but I think some people are trying to get rid of these
files, so building a feature on that might be a problem?  (I think the
latest news on that was that the files might be different in this
hypothetical future but there would still be a file to prevent
re-recovery on restart.)

Other ideas?





Re: [HACKERS] Figures in docs

2016-02-17 Thread Tatsuo Ishii
> What we need is tooling around a new file format that is similar or
> analogous to our existing tooling, so that it is easy to edit, easy to
> review, easy to build, easy to test, and so on.  Otherwise, the image
> files will not get maintained properly.
> 
> As an example, if an image contains text (e.g., a flow chart or
> architecture diagram), then I expect that git grep will find it there,
> so that I know to update it when I rename or augment something.
> 
> The above is all solvable.  There are certainly image formats that fit
> those descriptions.
> 
> Personally, I'd want to know specifically what images people would want,
> so we could discuss or prototype something around that.

One of the diagrams I want to have is a query processing flowchart.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp




Re: [HACKERS] exposing pg_controldata and pg_config as functions

2016-02-17 Thread Peter Eisentraut
On 2/17/16 9:08 PM, Michael Paquier wrote:
> On Thu, Feb 18, 2016 at 11:02 AM, Peter Eisentraut  wrote:
>> On 2/17/16 5:20 PM, Josh berkus wrote:
>>> I have a use-case for this feature, part of it being containerized
>>> PostgreSQL. Right now, there is certain diagnostic information (like
>>> timeline) which is exposed ONLY in pg_controldata.
>>
>> I'm talking about the pg_config() function, not pg_controldata.
> 
> Andrew has mentioned a use case he had at the beginning of this thread:
> enhancing the regression tests related to libxml a bit.

While that idea was useful, I think we had concluded that there are
better ways to do this and that this way probably wouldn't even work
(Windows?).




Re: [HACKERS] Figures in docs

2016-02-17 Thread Peter Eisentraut
On 2/17/16 8:23 PM, Tatsuo Ishii wrote:
> Do we really need git history for each figure? It seems we are waiting
> for a solution which will never realize.

What we need is tooling around a new file format that is similar or
analogous to our existing tooling, so that it is easy to edit, easy to
review, easy to build, easy to test, and so on.  Otherwise, the image
files will not get maintained properly.

As an example, if an image contains text (e.g., a flow chart or
architecture diagram), then I expect that git grep will find it there,
so that I know to update it when I rename or augment something.

The above is all solvable.  There are certainly image formats that fit
those descriptions.

Personally, I'd want to know specifically what images people would want,
so we could discuss or prototype something around that.




Re: [HACKERS] exposing pg_controldata and pg_config as functions

2016-02-17 Thread Michael Paquier
On Thu, Feb 18, 2016 at 11:02 AM, Peter Eisentraut  wrote:
> On 2/17/16 5:20 PM, Josh berkus wrote:
>> I have a use-case for this feature, part of it being containerized
>> PostgreSQL. Right now, there is certain diagnostic information (like
>> timeline) which is exposed ONLY in pg_controldata.
>
> I'm talking about the pg_config() function, not pg_controldata.

Andrew has mentioned a use case he had at the beginning of this thread:
enhancing the regression tests related to libxml a bit.
-- 
Michael




Re: [HACKERS] exposing pg_controldata and pg_config as functions

2016-02-17 Thread Peter Eisentraut
On 2/17/16 5:20 PM, Josh berkus wrote:
> I have a use-case for this feature, part of it being containerized
> PostgreSQL. Right now, there is certain diagnostic information (like
> timeline) which is exposed ONLY in pg_controldata.

I'm talking about the pg_config() function, not pg_controldata.




Re: [HACKERS] custom function for converting human readable sizes to bytes

2016-02-17 Thread Vitaly Burovoy
On 2/17/16, Dean Rasheed  wrote:
> On 17 February 2016 at 00:39, Vitaly Burovoy 
> wrote:
>> Now parse_memory_unit returns -1024 as the divider for bytes; the
>> constant "bytes" has moved there.
>> Added a new memory_units_bytes_hint, which differs from the original
>> memory_units_int by the "bytes" size unit.
>> Allowed hintmsg to be NULL, and if so, skipped setting the dereferenced
>> variable to memory_units_bytes_hint.
>>
>
> I think that approach is getting more and more unwieldy, and it simply
> isn't worth the effort just to share a few values from the unit
> conversion table, especially given that the set of supported units
> differs between the two places.
>
>> On 2/16/16, Dean Rasheed  wrote:
>>> ISTM that it would be far less code, and much simpler and more
>>> readable to just parse the supported units directly in
>>> pg_size_bytes(), rather than trying to share code with guc.c, when the
>>> supported units are actually different and may well diverge further in
>>> the future.
>
> I've gone with this approach and it is indeed far less code, and much
> simpler and easier to read. This will also make it easier to
> maintain/extend in the future.
>
> I've made a few minor tweaks to the docs, and added a note to make it
> clear that the units in these functions work in powers of 2 not 10.
>
> I also took the opportunity to tidy up the number scanning code
> somewhat (I was tempted to rip it out entirely, since it feels awfully
> close to duplicating the numeric code, but it's probably worth it for
> the better error message).

Yes, the better error message was the only reason to leave that scan.

> Additionally, note that I replaced strcasecmp() with pg_strcasecmp(),
> since AIUI the former is not available on all supported platforms.

Thank you. I didn't know that function exists.

> Barring objections, and subject to some more testing, I intend to
> commit this version.
>
> Regards,
> Dean

Thank you for your work. Your version seems even better than the one I
refactored. But I have several questions (below).

> + bool    have_digits;
Agreed, "main_digits" and "error" together were redundant; "have_digits"
is enough.


> + else
> + have_digits = false;
Is it worth moving that to the declaration and removing the "else" block?
+   bool    have_digits = false;


Is it important to use:
> +  ObjectIdGetDatum(InvalidOid),
> +  Int32GetDatum(-1)));
instead of "0, -1"? Previously I left the immediate constants because I
found the same thing in jsonb.c (lines 363, 774)...


> +  if (pg_strcasecmp(strptr, "bytes") == 0)
> +  else if
> +  else if
> +  else if
It seems a little ugly to me. In that case (not using values from the GUC
code) I'd create a static array for the units and loop over it. Would
that be better?


> + errmsg("invalid size: \"%s\"", text_to_cstring(arg)),
Agreed. Using "text_to_cstring" again is a good idea for improving
readability.


> + NumericGetDatum(mul_num),
Two more spaces of indentation are needed.


Also I've found that allowing an exponent gives a weird result (we tried
to avoid "type numeric" in the error message):

SELECT pg_size_bytes('123e1000 kB');
ERROR:  invalid input syntax for type numeric: "123e1000"

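For context, the intended behavior with sane input, as I read the patch,
is along these lines (units being powers of 2):

SELECT pg_size_bytes('1 kB');   -- 1024
SELECT pg_size_bytes('1.5 GB'); -- 1610612736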

[1] http://www.postgresql.org/docs/9.5/static/error-style-guide.html#AEN110702
-- 
Best regards,
Vitaly Burovoy




Re: [HACKERS] Figures in docs

2016-02-17 Thread Tatsuo Ishii
>> What's wrong with LibreOffice?
> 
> Is there any reason to think it doesn't have the same disease mentioned
> in the previously-cited thread, namely that any change trashes pretty
> much the whole file?
> 
> That might be okay for things that we only change once every ten years
> or so, but otherwise it would be very git-history-unfriendly.
> 
> It's possible that we could solve that with some sort of SVG normalizer
> (think pgindent for SVG) that we're careful to use before committing.
> But I'm afraid that we'd still have issues with significantly different
> output from different versions of LibreOffice, for example.

Do we really need git history for each figure? It seems we are waiting
for a solution which will never materialize.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp




Re: [HACKERS] Figures in docs

2016-02-17 Thread Tom Lane
Tatsuo Ishii  writes:
>> Because no one has been able to propose a good format for storing and
>> editing pictures.

> What's wrong with LibreOffice?

Is there any reason to think it doesn't have the same disease mentioned
in the previously-cited thread, namely that any change trashes pretty
much the whole file?

That might be okay for things that we only change once every ten years
or so, but otherwise it would be very git-history-unfriendly.

It's possible that we could solve that with some sort of SVG normalizer
(think pgindent for SVG) that we're careful to use before committing.
But I'm afraid that we'd still have issues with significantly different
output from different versions of LibreOffice, for example.

regards, tom lane




Re: [HACKERS] Figures in docs

2016-02-17 Thread Tatsuo Ishii
> On 2/16/16 8:17 PM, Tatsuo Ishii wrote:
>> It seems there's no figures/diagrams in our docs. I vaguely recall that
>> we used to have a few diagrams in our docs. If so, was there any
>> technical reason to remove them?
> 
> Because no one has been able to propose a good format for storing and
> editing pictures.

What's wrong with LibreOffice?

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp




Re: [HACKERS] Figures in docs

2016-02-17 Thread Tatsuo Ishii
> Gavin Flower wrote:
>> On 18/02/16 10:38, Peter Eisentraut wrote:
>> >On 2/16/16 8:17 PM, Tatsuo Ishii wrote:
>> >>It seems there's no figures/diagrams in our docs. I vaguely recall that
>> >>we used to have a few diagrams in our docs. If so, was there any
>> >>technical reason to remove them?
>> >Because no one has been able to propose a good format for storing and
>> >editing pictures.
>>
>> SVG?
> 
> Did you read the quoted threads?  In particular read the well-researched
> thread started by Rafael Martinez some years ago.

The end of the thread is here:
http://www.postgresql.org/message-id/20120712181636.gc11...@momjian.us

It seems the discussion never reached a conclusion.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp




Re: [HACKERS] extend pgbench expressions with functions

2016-02-17 Thread Michael Paquier
On Wed, Feb 17, 2016 at 5:22 PM, Fabien COELHO  wrote:
>> \set aid 1 + 1
>> pgbench -f addition.sql -t 5000
>>
>> I have the following:
>> HEAD: 3.5~3.7M TPS
>> list method: 3.6~3.7M TPS
>> array method: 3.4~3.5M TPS
>> So all approaches have comparable performance.
>
> Yep, the execution trace is pretty similar in all cases, maybe with a
> little more work for the array method, although I'm surprised that the
> difference is discernible.

Nah, that's mostly noise I think. My conclusion is that all approaches
are rather equivalent for the operators.
-- 
Michael




Re: [HACKERS] exposing pg_controldata and pg_config as functions

2016-02-17 Thread Joe Conway
On 02/17/2016 03:34 PM, Josh berkus wrote:
> On 02/17/2016 03:02 PM, Tom Lane wrote:
>> Joe Conway  writes:
>>> On 02/17/2016 02:14 PM, Tom Lane wrote:
 I thought we'd agreed on requiring superuser access for this function.
 I concur that letting just anyone see the config data is inappropriate.
>>
>>> It does not let anyone see config data out of the box:
>>
>>> + CREATE VIEW pg_config AS
>>> + SELECT * FROM pg_config();
>>> +
>>> + REVOKE ALL on pg_config FROM PUBLIC;
>>> + REVOKE EXECUTE ON FUNCTION pg_config() FROM PUBLIC;
>>
>> Ah, that's fine.  I'd looked for a superuser() check and not seen one,
>> but letting the SQL permissions system handle it seems good enough.
> 
> What I like about this is that if I want to expose it to a
> non-superuser, I can just do a GRANT instead of needing to write a
> security definer view.

Which was my reason for doing it this way, although that GRANT will not
get preserved by pg_dump currently. Stephen Frost is working on a patch
to change/fix that though (see the "Additional role attributes &&
superuser review" thread), which I believe he intends to get done RSN
and into 9.6.

Joe



-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development





[HACKERS] Restructuring Paths to allow per-Path targetlist info

2016-02-17 Thread Tom Lane
Although it may not be obvious to the naked eye, I have been working away
on that long-overdue upper-planner pathification project.  One thing
that's necessary to make that happen is to keep track of the targetlists
of upper plan levels within the Path data structures, since once you get
past the scan/join phase you need to know explicitly what each level of
plan is actually going to compute.  I'd been trying to add such info only
to the new types of Path, but that wasn't proving to work very well
notationally, and in any case there's good reason to think that we should
track output tlists explicitly even while considering basic scan/join
paths.  (As an example, if we want the planner to take into account the
cost savings from fetching an expensive function value from a functional
index rather than recomputing it from scratch, we really need a way of
denoting which Paths can provide the function value at no extra cost.)

So, the attached patch just bites the bullet and adds explicit output
tlist information to struct Path.  I did set things up so that the cost
is only one pointer in each Path in the typical case where Paths emit
the set of Vars needed from their relation; in that case, they just
point to a default PathTarget struct embedded in the parent RelOptInfo.
A Path emitting something else will need its own PathTarget struct.
(Since this patch is just trying to change the notation around, there
aren't actually any cases yet of a Path with its own PathTarget.)

I did take the time to improve the cost accounting for PlaceHolderVars;
before, we sometimes overcharged substantially for those (by counting
their eval costs at multiple plan levels) and sometimes undercharged
for them (by not counting them at all).  It's not perfect yet, cf the
new comments in add_placeholders_to_joinrel(), but it's a lot better
than before.  This change results in a couple of minor regression test
changes because the planner changes join order now that it realizes that
sub-selects inside PlaceHolderVars aren't free ;-).

Barring objections, I'd like to push this soon.

regards, tom lane

diff --git a/contrib/file_fdw/file_fdw.c b/contrib/file_fdw/file_fdw.c
index f13316b..cf12710 100644
*** a/contrib/file_fdw/file_fdw.c
--- b/contrib/file_fdw/file_fdw.c
*** check_selective_binary_conversion(RelOpt
*** 806,812 
  	}
  
  	/* Collect all the attributes needed for joins or final output. */
! 	pull_varattnos((Node *) baserel->reltargetlist, baserel->relid,
     &attrs_used);
  
  	/* Add all the attributes used by restriction clauses. */
--- 806,812 
  	}
  
  	/* Collect all the attributes needed for joins or final output. */
! 	pull_varattnos((Node *) baserel->reltarget.exprs, baserel->relid,
     &attrs_used);
  
  	/* Add all the attributes used by restriction clauses. */
*** estimate_size(PlannerInfo *root, RelOptI
*** 938,944 
  		 */
  		int			tuple_width;
  
! 		tuple_width = MAXALIGN(baserel->width) +
  			MAXALIGN(SizeofHeapTupleHeader);
  		ntuples = clamp_row_est((double) stat_buf.st_size /
  (double) tuple_width);
--- 938,944 
  		 */
  		int			tuple_width;
  
! 		tuple_width = MAXALIGN(baserel->reltarget.width) +
  			MAXALIGN(SizeofHeapTupleHeader);
  		ntuples = clamp_row_est((double) stat_buf.st_size /
  (double) tuple_width);
diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c
index cb41886..ef8eab6 100644
*** a/contrib/postgres_fdw/deparse.c
--- b/contrib/postgres_fdw/deparse.c
*** build_tlist_to_deparse(RelOptInfo *forei
*** 728,737 
  	PgFdwRelationInfo *fpinfo = (PgFdwRelationInfo *) foreignrel->fdw_private;
  
  	/*
! 	 * We require columns specified in foreignrel->reltargetlist and those
  	 * required for evaluating the local conditions.
  	 */
! 	tlist = add_to_flat_tlist(tlist, foreignrel->reltargetlist);
  	tlist = add_to_flat_tlist(tlist,
  			  pull_var_clause((Node *) fpinfo->local_conds,
  			  PVC_REJECT_AGGREGATES,
--- 728,737 
  	PgFdwRelationInfo *fpinfo = (PgFdwRelationInfo *) foreignrel->fdw_private;
  
  	/*
! 	 * We require columns specified in foreignrel->reltarget.exprs and those
  	 * required for evaluating the local conditions.
  	 */
! 	tlist = add_to_flat_tlist(tlist, foreignrel->reltarget.exprs);
  	tlist = add_to_flat_tlist(tlist,
  			  pull_var_clause((Node *) fpinfo->local_conds,
  			  PVC_REJECT_AGGREGATES,
diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index ffe6388..465f43c 100644
*** a/contrib/postgres_fdw/postgres_fdw.c
--- b/contrib/postgres_fdw/postgres_fdw.c
*** postgresGetForeignRelSize(PlannerInfo *r
*** 479,485 
  	 * columns used in them.  Doesn't seem worth detecting that case though.)
  	 */
  	fpinfo->attrs_used = NULL;
! 	pull_varattnos((Node *) baserel->reltargetlist, baserel->relid,
     &fpinfo->attrs_used);
  	foreach(lc, fpinfo->local_conds)

Re: [HACKERS] pglogical - how to use pglogical.conflict_resolution

2016-02-17 Thread xujian
Sorry, please ignore it. It is a setting, not a function; I can set it with

set pglogical.conflict_resolution = 'error'

thanks
James

From: jame...@outlook.com
To: pgsql-hackers@postgresql.org
Subject: pglogical - how to use pglogical.conflict_resolution
Date: Wed, 17 Feb 2016 18:30:35 -0500

Hello, I set up logical replication with pglogical on pg95 and it works
fine. However, I want to set the conflict resolution according to the
documentation below:

=
The configuration of the conflicts resolver is done via the
`pglogical.conflict_resolution` setting. The supported values for the
`pglogical.conflict_resolution` are:

- `error` - the replication will stop on error if conflict is detected and
  manual action is needed for resolving
==

I got an error:

test3=# select pglogical.conflict_resolution('error');
ERROR:  function pglogical.conflict_resolution(unknown) does not exist
LINE 1: select pglogical.conflict_resolution('error');

Does anyone know how to use pglogical.conflict_resolution? I googled, but
didn't find anything of value.

Thanks.
James

Re: [HACKERS] exposing pg_controldata and pg_config as functions

2016-02-17 Thread Josh berkus

On 02/17/2016 03:02 PM, Tom Lane wrote:

Joe Conway  writes:

On 02/17/2016 02:14 PM, Tom Lane wrote:

I thought we'd agreed on requiring superuser access for this function.
I concur that letting just anyone see the config data is inappropriate.



It does not let anyone see config data out of the box:



+ CREATE VIEW pg_config AS
+ SELECT * FROM pg_config();
+
+ REVOKE ALL on pg_config FROM PUBLIC;
+ REVOKE EXECUTE ON FUNCTION pg_config() FROM PUBLIC;


Ah, that's fine.  I'd looked for a superuser() check and not seen one,
but letting the SQL permissions system handle it seems good enough.


What I like about this is that if I want to expose it to a 
non-superuser, I can just do a GRANT instead of needing to write a 
security definer view.
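
For example, something along these lines would suffice (role name
hypothetical):

-- let a non-superuser monitoring role read the view
GRANT SELECT ON pg_config TO monitor;
-- or allow calling the underlying function directly
GRANT EXECUTE ON FUNCTION pg_config() TO monitor;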



--
--
Josh Berkus
Red Hat OSAS
(any opinions are my own)




[HACKERS] pglogical - how to use pglogical.conflict_resolution

2016-02-17 Thread xujian
Hello, I set up logical replication with pglogical on pg95 and it works
fine. However, I want to set the conflict resolution according to the
documentation below:

=
The configuration of the conflicts resolver is done via the
`pglogical.conflict_resolution` setting. The supported values for the
`pglogical.conflict_resolution` are:

- `error` - the replication will stop on error if conflict is detected and
  manual action is needed for resolving
==

I got an error:

test3=# select pglogical.conflict_resolution('error');
ERROR:  function pglogical.conflict_resolution(unknown) does not exist
LINE 1: select pglogical.conflict_resolution('error');

Does anyone know how to use pglogical.conflict_resolution? I googled, but
didn't find anything of value.

Thanks.
James

Re: [HACKERS] exposing pg_controldata and pg_config as functions

2016-02-17 Thread Tom Lane
Joe Conway  writes:
> On 02/17/2016 02:14 PM, Tom Lane wrote:
>> I thought we'd agreed on requiring superuser access for this function.
>> I concur that letting just anyone see the config data is inappropriate.

> It does not let anyone see config data out of the box:

> + CREATE VIEW pg_config AS
> + SELECT * FROM pg_config();
> +
> + REVOKE ALL on pg_config FROM PUBLIC;
> + REVOKE EXECUTE ON FUNCTION pg_config() FROM PUBLIC;

Ah, that's fine.  I'd looked for a superuser() check and not seen one,
but letting the SQL permissions system handle it seems good enough.

regards, tom lane




Re: [HACKERS] exposing pg_controldata and pg_config as functions

2016-02-17 Thread Andrew Dunstan



On 02/17/2016 05:14 PM, Tom Lane wrote:

Peter Eisentraut  writes:

On 2/17/16 12:15 PM, Joe Conway wrote:

Ok, removed the documentation on the function pg_config() and pushed.

I still have my serious doubts about this, especially not even requiring
superuser access for this information.  Could someone explain why we
need this?

I thought we'd agreed on requiring superuser access for this function.
I concur that letting just anyone see the config data is inappropriate.





I'm in favor, and don't really want to rehearse the argument. But I 
think I can live quite happily with it being superuser only. It's pretty 
hard to argue that exposing it to a superuser creates much risk, ISTM.



cheers

andrew





Re: [HACKERS] exposing pg_controldata and pg_config as functions

2016-02-17 Thread Josh berkus

On 02/17/2016 01:31 PM, Peter Eisentraut wrote:

On 1/31/16 7:34 AM, Michael Paquier wrote:

I am marking this patch as returned with feedback for now, not all the
issues have been fixed yet, and there are still no docs (the
conclusion being that people would like to have this stuff, right?).
Feel free to move it to the next CF should a new version be written.


I think we still don't have a real use case for this feature, and a
couple of points against it.


I have a use-case for this feature, part of it being containerized
PostgreSQL. Right now, there is certain diagnostic information (like
the timeline) which is exposed ONLY in pg_controldata.  That leaves no
reasonable way to expose this information in an API.
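
With functions like those proposed, fetching that from SQL would become a
one-liner along these lines (a sketch only; the exact function and column
names depend on the patch):

-- hypothetical shape of the interface under discussion
SELECT timeline_id FROM pg_control_checkpoint();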


(and yes, we have a bigger issue with stuff which is only in pg_log, but 
one thing at a time)


--
--
Josh Berkus
Red Hat OSAS
(any opinions are my own)




Re: [HACKERS] a raft of parallelism-related bug fixes

2016-02-17 Thread Jim Nasby

On 2/8/16 4:39 PM, Peter Geoghegan wrote:

On Mon, Feb 8, 2016 at 2:35 PM, Andres Freund  wrote:

I think having a public git tree, that contains the current state, is
greatly helpful for that. Just announce that you're going to screw
wildly with history, and that you're not going to be terribly careful
about commit messages.  That means observers can just do a fetch and a
reset --hard to see the absolutely latest and greatest.  By all means
post a series to the list every now and then, but I think for minor
changes it's perfectly sane to say 'pull to see the fixups for the
issues you noticed'.


I would really like for there to be a way to do that more often. It
would be a significant time saver, because it removes problems with
minor bitrot.


Yeah, I think it's rather silly that we limit ourselves to only pushing 
patches through a mailing list. That's OK (maybe even better) for simple 
stuff, but once there's more than 1 patch it's a PITA.


There's an official github mirror of the code; ISTM it'd be good for 
major features to get posted to github forks in their own branches. I 
think that would also make it easy for buildfarm owners to run tests 
against trusted forks/branches.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [HACKERS] exposing pg_controldata and pg_config as functions

2016-02-17 Thread Joe Conway
On 02/17/2016 02:14 PM, Tom Lane wrote:
> Peter Eisentraut  writes:
>> On 2/17/16 12:15 PM, Joe Conway wrote:
>>> Ok, removed the documentation on the function pg_config() and pushed.
> 
>> I still have my serious doubts about this, especially not even requiring
>> superuser access for this information.  Could someone explain why we
>> need this?
> 
> I thought we'd agreed on requiring superuser access for this function.
> I concur that letting just anyone see the config data is inappropriate.

It does not let anyone see config data out of the box:

+ CREATE VIEW pg_config AS
+ SELECT * FROM pg_config();
+
+ REVOKE ALL on pg_config FROM PUBLIC;
+ REVOKE EXECUTE ON FUNCTION pg_config() FROM PUBLIC;
+

But it does not have an explicit superuser check. I can add that if
that's the consensus.

Joe

-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development





Re: [HACKERS] exposing pg_controldata and pg_config as functions

2016-02-17 Thread Tom Lane
Peter Eisentraut  writes:
> On 2/17/16 12:15 PM, Joe Conway wrote:
>> Ok, removed the documentation on the function pg_config() and pushed.

> I still have my serious doubts about this, especially not even requiring
> superuser access for this information.  Could someone explain why we
> need this?

I thought we'd agreed on requiring superuser access for this function.
I concur that letting just anyone see the config data is inappropriate.

regards, tom lane




Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-02-17 Thread Jim Nasby

On 2/17/16 9:03 AM, Dean Rasheed wrote:

I'm not totally opposed to specifying a column sort order in psql, and
perhaps there's a way to support both 'cols' and 'col_order' options
in psql, since there are different situations where one or the other
might be more useful.


Yeah. If there were some magic way to reference the underlying data with
your syntax, it probably wouldn't be that bad. AIUI we're normally just
dumping data into a Portal with no option to read back from it, but if
the query results were first put in a tuplestore, then I suspect it
wouldn't be that hard to query against it and produce another result set.



What I am opposed to is specifying the row order in psql, because IMO
that's something that should be done entirely in the SQL query.


+1
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [HACKERS] Figures in docs

2016-02-17 Thread Alvaro Herrera
Gavin Flower wrote:
> On 18/02/16 10:38, Peter Eisentraut wrote:
> >On 2/16/16 8:17 PM, Tatsuo Ishii wrote:
> >>It seems there's no figures/diagrams in our docs. I vaguely recall that
> >>we used to have a few diagrams in our docs. If so, was there any
> >>technical reason to remove them?
> >Because no one has been able to propose a good format for storing and
> >editing pictures.
>
> SVG?

Did you read the quoted threads?  In particular read the well-researched
thread started by Rafael Martinez some years ago.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Figures in docs

2016-02-17 Thread Gavin Flower

On 18/02/16 10:38, Peter Eisentraut wrote:

On 2/16/16 8:17 PM, Tatsuo Ishii wrote:

It seems there's no figures/diagrams in our docs. I vaguely recall that
we used to have a few diagrams in our docs. If so, was there any
technical reason to remove them?

Because no one has been able to propose a good format for storing and
editing pictures.




SVG?





Re: [HACKERS] exposing pg_controldata and pg_config as functions

2016-02-17 Thread Peter Eisentraut
On 2/17/16 12:15 PM, Joe Conway wrote:
> On 02/17/2016 02:32 AM, Michael Paquier wrote:
>> Actually, having second thoughts on the matter, why is it
>> necessary to document the function pg_config()? The functions wrapping
>> a system view just have the view documented; take for example
>> pg_show_all_file_settings, pg_show_all_settings,
>> pg_stat_get_wal_receiver, etc.
> 
> Ok, removed the documentation on the function pg_config() and pushed.

I still have my serious doubts about this, especially not even requiring
superuser access for this information.  Could someone explain why we
need this?





Re: [HACKERS] [WIP] speeding up GIN build with parallel workers

2016-02-17 Thread Constantin S. Pan
I was testing with number of workers starting at 0. The 0 case is (in
theory) equivalent to master branch. But I should certainly compare to
the master branch, you are right. Will do that shortly.

On Wed, 17 Feb 2016 12:26:05 -0800
Peter Geoghegan  wrote:

> On Wed, Feb 17, 2016 at 7:55 AM, Constantin S. Pan 
> wrote:
> > 4. Hit the 8x speedup limit. Made some analysis of the reasons (see
> > the attached plot or the data file).  
> 
> Did you actually compare this to the master branch? I wouldn't like to
> assume that the one worker case was equivalent. Obviously that's the
> really interesting baseline.
> 
> 




Re: [HACKERS] Figures in docs

2016-02-17 Thread Peter Eisentraut
On 2/16/16 8:17 PM, Tatsuo Ishii wrote:
> It seems there's no figures/diagrams in our docs. I vaguely recall that
> we used to have a few diagrams in our docs. If so, was there any
> technical reason to remove them?

Because no one has been able to propose a good format for storing and
editing pictures.





Re: [HACKERS] exposing pg_controldata and pg_config as functions

2016-02-17 Thread Peter Eisentraut
On 1/31/16 7:34 AM, Michael Paquier wrote:
> I am marking this patch as returned with feedback for now, not all the
> issues have been fixed yet, and there are still no docs (the
> conclusion being that people would like to have this stuff, right?).
> Feel free to move it to the next CF should a new version be written.

I think we still don't have a real use case for this feature, and a
couple of points against it.





Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-02-17 Thread Peter Eisentraut
On 2/9/16 11:21 AM, Daniel Verite wrote:
> Note that NULL values in the column that pivots are discarded
> by \crosstabview, because NULL as the name of a column does not
> make sense.

Why not?

All you're doing is printing it out, and psql is quite capable of
printing a null value.





Re: [HACKERS] pglogical - logical replication contrib module

2016-02-17 Thread Alvaro Herrera
Craig Ringer wrote:

> Improving this probably needs DDL deparse to be smarter. Rather than just
> emitting something that can be reconstructed into the SQL text of the DDL
> it needs to emit one or more steps that are semantically the same but allow
> us to skip the rewrite. Along the lines of:
> 
> * ALTER TABLE mytable ADD COLUMN somecolumn sometype;
> * ALTER TABLE mytable ALTER COLUMN somecolumn DEFAULT some_function();
> * 
> * ALTER TABLE mytable ALTER COLUMN somecolumn NOT NULL;

Compared to the effort involved in getting the current DDL-event capture
stuff in event triggers, and writing the extension that creates the JSON
representation that expands to SQL commands, this seems easy to do.
What currently happens is that we get a list of ALTER TABLE subcommands
and then produce a single ALTER TABLE that covers them all.  To fix this
problem we could specially mark those ALTER TABLE subcommands that
require a table rewrite; emit a list of the ones before the
rewrite as one command, then emit some sort of token that indicates the
table rewrite, then emit a list of the ones after the rewrite.  The
replay code can then "wait" for the rewrite to occur.

Since this is all in extension code, it's possible to tailor to the
needs you have.  (This is the point where I'm extremely happy we ended
up creating the hooks and the new pseudo-type separately from the
extension containing the JSON-generating bits, instead of having it all
be a single piece.)

One slight pain point in the above is the handling of ALTER COLUMN / SET
NOT NULL.  That one currently requires a table scan, which would be nice
to avoid, but I don't see any way to do that.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] [WIP] speeding up GIN build with parallel workers

2016-02-17 Thread Peter Geoghegan
On Wed, Feb 17, 2016 at 7:55 AM, Constantin S. Pan  wrote:
> 4. Hit the 8x speedup limit. Made some analysis of the reasons (see the
> attached plot or the data file).

Did you actually compare this to the master branch? I wouldn't like to
assume that the one worker case was equivalent. Obviously that's the
really interesting baseline.


-- 
Peter Geoghegan




Re: [HACKERS] [WIP] speeding up GIN build with parallel workers

2016-02-17 Thread Oleg Bartunov
On Wed, Feb 17, 2016 at 6:55 PM, Constantin S. Pan  wrote:

> On Sat, 16 Jan 2016 01:38:39 +0300
> "Constantin S. Pan"  wrote:
>
> > The task of building GIN can require lots of time and eats 100 % CPU,
> > but we could easily make it use more than a 100 %, especially since we
> > now have parallel workers in postgres.
> >
> > The process of building GIN looks like this:
> >
> > 1. Accumulate a batch of index records into an rbtree in maintenance
> > work memory.
> >
> > 2. Dump the batch to disk.
> >
> > 3. Repeat.
> >
> > I have a draft implementation which divides the whole process between
> > N parallel workers, see the patch attached. Instead of a full scan of
> > the relation, I give each worker a range of blocks to read.
> >
> > This speeds up the first step N times, but slows down the second one,
> > because when multiple workers dump item pointers for the same key,
> > each of them has to read and decode the results of the previous one.
> > That is a huge waste, but there is an idea on how to eliminate it.
> >
> > When it comes to dumping the next batch, a worker does not do it
> > independently. Instead, it (and every other worker) sends the
> > accumulated index records to the parent (backend) in ascending key
> > order. The backend, which receives the records from the workers
> > through shared memory, can merge them and dump each of them once,
> > without the need to reread the records N-1 times.
> >
> > In its current state the implementation is just a proof of concept
> > with all the configuration hardcoded, but it already works as is,
> > though it does not speed up the build process by more than 4 times
> > on my configuration (12 CPUs). There is also a problem with temporary
> > tables, for which the parallel mode does not work.
>
> Hey Hackers!
>
> I have made some progress on the proposal (see the attached patch):
>
> 0. Moved some repeated code to functions (e.g. ginDumpAccumulator,
> ginbuildCommon).
>
> 1. Implemented results merging on backend.
>
> 2. Disabled the usage of parallel mode when creating index on temporary
> tables. No point in using parallel mode for temporary tables anyway,
> right?
>
> 3. Added GUC parameter to control the number of workers for GIN
> building.
>
> 4. Hit the 8x speedup limit. Made some analysis of the reasons (see the
> attached plot or the data file).
>
> In order to analyze the performance issues, I have done the following:
>
> create table t (k int, v int[]);
>
> create or replace
> function randarray(width int, low int, high int)
> returns int[] as
> $$
> select array(select (random()*(high-low) + low)::int
> from generate_series(1,width))
> $$ language sql;
>
> insert into t select k, randarray(3000, 0, 10)
> from generate_series(1, 10) as k;
>
> create index t_v_idx on t using gin (v);
>
> This creates 10 arrays of 3000 random numbers each. The random
> numbers are in range [0, 10]. Then I measure how long the gin
> building steps take. There are two steps: scan and merge.
>
> The results show that the 'scan' step is sped up perfectly, but the
> 'merge' step takes longer as you increase the number of workers. The
> profiler shows that the bottleneck here is ginMergeItemPointers(), which
> I use to merge the results.
>
> Also, I did encounter the problem with workers deadlocking during
> heap_open, but that seems to have been resolved by Robert Haas in his
> commit regarding group locking.
>
> Please leave your feedback!
>

My feedback (on Mac OS X 10.11.3) is:

set gin_parallel_workers=2;
create index message_body_idx on messages using gin(body_tsvector);
LOG:  worker process: parallel worker for PID 5689 (PID 6906) was
terminated by signal 11: Segmentation fault
LOG:  terminating any other active server processes
WARNING:  terminating connection because of crash of another server process
DETAIL:  The postmaster has commanded this server process to roll back the
current transaction and exit, because another server process exited
abnormally and possibly corrupted shared memory.
HINT:  In a moment you should be able to reconnect to the database and
repeat your command.
WARNING:  terminating connection because of crash of another server process
DETAIL:  The postmaster has commanded this server process to roll back the
current transaction and exit, because another server process exited
abnormally and possibly corrupted shared memory.
HINT:  In a moment you should be able to reconnect to the database and
repeat your command.
WARNING:  terminating connection because of crash of another server process
DETAIL:  The postmaster has commanded this server process to roll back the
current transaction and exit, because another server process exited
abnormally and possibly corrupted shared memory.
HINT:  In a moment you should be able to reconnect to the database and
repeat your command.
server closed the connection unexpectedly
This probably me

Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-02-17 Thread Daniel Verite
Jim Nasby wrote:

> >   ORDER BY name
> > \crosstabview cols = (select to_char(d, 'Mon') from
> > generate_series('2000-01-01'::date, '2000-12-01', '1 month') d)
> 
> My concern with that is that often you don't know what the columns will 
> be, because you don't know what exact data the query will produce. So to 
> use this syntax you'd have to re-create a huge chunk of the original 
> query. :(

Also, if that additional query refers to tables, it should be executed
with the same data visibility as the main query. Doesn't that mean
that both queries should happen within the same repeatable
read transaction?

Another impractical aspect of this approach is that a
meta-command invocation in psql must fit on a single line, so
queries containing newlines are not acceptable as an argument.
This problem exists with "\copy (select...) to ..." already.

Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: Reusing abbreviated keys during second pass of ordered [set] aggregates

2016-02-17 Thread Peter Geoghegan
On Wed, Feb 17, 2016 at 2:12 AM, Robert Haas  wrote:
> Committed, except I left out one comment hunk that I wasn't convinced about.

Thank you.


-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] exposing pg_controldata and pg_config as functions

2016-02-17 Thread Joe Conway
On 02/17/2016 02:32 AM, Michael Paquier wrote:
> Actually, having second thoughts on the matter, why is it
> necessary to document the function pg_config()? The functions wrapping
> a system view just have the view documented; take for example
> pg_show_all_file_settings, pg_show_all_settings,
> pg_stat_get_wal_receiver, etc.

Ok, removed the documentation on the function pg_config() and pushed.
Included a catversion bump.

Joe

-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development





Re: [HACKERS] Fix handling of invalid sockets returned by PQsocket()

2016-02-17 Thread Alvaro Herrera
Michael Paquier wrote:
> Hi all,
> 
> After looking at Alvaro's message mentioning the handling of
> PQsocket() for invalid sockets, I just had a look by curiosity at
> other calls of this routine, and found a couple of issues:
> 1) In vacuumdb.c, init_slot() does not check the return value of
> PQsocket():
> slot->sock = PQsocket(conn);
> 2) In isolationtester.c, try_complete_step() should do the same.
> 3) In pg_recvlogical.c for StreamLogicalLog() I am spotting the same problem.
> I guess those ones should be fixed as well, no?

I patched pgbench to use PQerrorMessage rather than strerror(errno).  I
think your patch should do the same.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Code refactoring related to -fsanitize=use-after-scope

2016-02-17 Thread Martin Liška
On 02/15/2016 08:20 PM, Tom Lane wrote:
> I bet a nickel that this is triggered by the goto leading into those
> variables' scope ("goto process_inner_tuple" at line 2038 in HEAD).
> That probably bypasses the "unpoison" step.
> 
> However, doesn't this represent a bug in the sanitizer rather than
> anything we should change in Postgres?  There is no rule in C that
> you can't execute such a goto, especially not if there is no
> initialization of those variables.
> 
> If you can think of a reasonable refactoring that gets rid of the need
> for that goto, I'd be for that, because it's certainly unsightly.
> But I don't think it's wrong, and I don't think that the proposed patch
> is any improvement from a structured-programming standpoint.
> 
>   regards, tom lane

Hi Tom.

You are exactly right that, as the code does not perform an initialization,
it should work fine. As you mentioned, the unpoisoning step is skipped,
which exposes this false positive.

I'll try to think about the case and handle it. Applying my patch
does not make sense.

Martin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Figures in docs

2016-02-17 Thread Alexander Lakhin

Hello,

In fact we use "make postgres.xml" to get a single XML file, which we translate.
We convert it to .po using xml2po (with some modifications), and then
convert it back to xml (and then to sgml files with our custom script).
So we have not converted the sgml files to xml, and if that is more
appropriate than having a single large XML file, then we will propose a
procedure to perform such a conversion (which will result in a
corresponding mega patch).


Best regards,
Alexander Lakhin

17.02.2016 08:41, Tatsuo Ishii wrote:

On Wed, Feb 17, 2016 at 4:17 AM, Tatsuo Ishii  wrote:


It seems there are no figures/diagrams in our docs. I vaguely recall that
we used to have a few diagrams in our docs. If so, was there any
technical reason to remove them?


I don't know the reason, but it's a shame we are still in sgml!

We already do our translations (as others do) in xml using custom scripting.
xml gives us better integration with available tools and the ability to
insert graphics. Last time we asked in -docs about moving to xml,
Alexander demonstrated acceptable speed of the xml build, but there was no
reply from Peter, who is (?) responsible for our documentation
infrastructure. Probably we should just create a patch and submit it to a
commitfest.  You can check this thread:
http://www.postgresql.org/message-id/1428009501118.85...@postgrespro.ru

Well, there are some PostgreSQL doc translation projects running,
including the Japanese translation, which I am working on.

If we are going to change the manual format and/or the build system, I
need to confirm it works for Japanese as well. In theory, because
the Japanese translation project uses UTF-8, there should be no
problem as long as the whole build system works with UTF-8. But I want
to confirm that first...

BTW, are you going to propose a mega patch which changes all the sgml
files to xml files?

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Figures in docs

2016-02-17 Thread Alexander Lakhin
17.02.2016 09:17, Tatsuo Ishii wrote:
>> Hi.
>>
>> In the DocBook 4.2 sgml dtd, the figure tag is supported already.
>> That was implemented for multiple output formats.
> Ok, there are no technical problems with figures then.  The MySQL docs have
> some nice figures. I am jealous.
The "figure" tag is just a placeholder in the Docbook
(http://www.docbook.org/tdg/en/html/figure.html).
The question is what to place inside this tag: "graphic" (not inline),
"mediaobject/imageobject" (alternative object, supports inline contents
and SVG), or something else.
So you surely can insert some picture from an external file (.png,
.jpg), but if it can contain a title or some labels that should be
translated, that's not the best solution.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Add some isolation tests for deadlock detection and resolution.

2016-02-17 Thread Tom Lane
I just had a rather disturbing thought to the effect that this entire
design --- ie, parallel workers taking out locks for themselves --- is
fundamentally flawed.  As far as I can tell from README.parallel,
parallel workers are supposed to exit (and, presumably, release their
locks) before the leader's transaction commits.  Releasing locks before
commit is wrong.  Do I need to rehearse why?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Tsvector editing functions

2016-02-17 Thread Stas Kelvich
> 
> On 02 Feb 2016, at 20:10, Teodor Sigaev  wrote:
> 
> Some notices:
> 
> 1 tsin in the documentation doesn't look like a good name. Changed to vector,
> similar to other places.
> 
> 2 I did some editing regarding freeing memory/forgotten names etc.

Thanks.

> 
> 3 It seems to me that tsvector_unnest() could be seriously optimized for
> large tsvectors: with the current coding it detoasts/decompresses the
> tsvector value on each call. It would be much better to do it once in the
> multi_call_memory_ctx context at first-call init.

Done, moved detoasting to first SRF call.

> 4 Returning an empty array for positions/weights when they are absent
> seems debatable:
> =# select * from unnest('a:1 b'::tsvector);
>  lexeme | positions | weights
> --------+-----------+---------
>  a      | {1}       | {D}
>  b      | {}        | {}
> I think it's better to return NULL in this case.
> 

Okay, done.
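
With that change, the earlier example would presumably produce something
like this (a sketch of the expected output, not taken from the patch):

=# select * from unnest('a:1 b'::tsvector);
 lexeme | positions | weights
--------+-----------+---------
 a      | {1}       | {D}
 b      |           |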

> 5 
> array_to_tsvector/tsvector_setweight_by_filter/tsvector_delete_arr/tsvector_filter
> don't check or pay attention to NULL elements in input arrays
> 

Thanks! Fixed and added tests.

> -- 
> Teodor Sigaev   E-mail: teo...@sigaev.ru
>   WWW: http://www.sigaev.ru/
> 
> 
> -- 
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers



tsvector_ops-v4.diff
Description: Binary data


---
Stas Kelvich
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [WIP] speeding up GIN build with parallel workers

2016-02-17 Thread Constantin S. Pan
On Sat, 16 Jan 2016 01:38:39 +0300
"Constantin S. Pan"  wrote:

> The task of building a GIN index can require lots of time and eats 100% CPU,
> but we could easily make it use more than 100%, especially since we
> now have parallel workers in postgres.
> 
> The process of building GIN looks like this:
> 
> 1. Accumulate a batch of index records into an rbtree in maintenance
> work memory.
> 
> 2. Dump the batch to disk.
> 
> 3. Repeat.
> 
> I have a draft implementation which divides the whole process between
> N parallel workers, see the patch attached. Instead of a full scan of
> the relation, I give each worker a range of blocks to read.
> 
> This speeds up the first step N times, but slows down the second one,
> because when multiple workers dump item pointers for the same key,
> each of them has to read and decode the results of the previous one.
> That is a huge waste, but there is an idea on how to eliminate it.
> 
> When it comes to dumping the next batch, a worker does not do it
> independently. Instead, it (and every other worker) sends the
> accumulated index records to the parent (backend) in ascending key
> order. The backend, which receives the records from the workers
> through shared memory, can merge them and dump each of them once,
> without the need to reread the records N-1 times.
> 
> In its current state the implementation is just a proof of concept
> with all the configuration hardcoded, but it already works as is,
> though it does not speed up the build process by more than 4 times
> on my configuration (12 CPUs). There is also a problem with temporary
> tables, for which the parallel mode does not work.

Hey Hackers!

I have made some progress on the proposal (see the attached patch):

0. Moved some repeated code to functions (e.g. ginDumpAccumulator,
ginbuildCommon).

1. Implemented results merging on backend.

2. Disabled the usage of parallel mode when creating index on temporary
tables. No point in using parallel mode for temporary tables anyway,
right?

3. Added GUC parameter to control the number of workers for GIN
building.

4. Hit the 8x speedup limit. Made some analysis of the reasons (see the
attached plot or the data file).

In order to analyze the performance issues, I have done the following:

create table t (k int, v int[]);

create or replace
function randarray(width int, low int, high int)
returns int[] as
$$
select array(select (random()*(high-low) + low)::int
from generate_series(1,width))
$$ language sql;

insert into t select k, randarray(3000, 0, 10)
from generate_series(1, 10) as k;

create index t_v_idx on t using gin (v);

This creates 10 arrays of 3000 random numbers each. The random
numbers are in range [0, 10]. Then I measure how long the gin
building steps take. There are two steps: scan and merge.
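
For reference, the per-worker-count timing can be reproduced with
something like this (a sketch; gin_parallel_workers is the GUC the
patch adds):

set gin_parallel_workers = 8;   -- vary: 1, 2, 4, 8, ...
\timing on
drop index if exists t_v_idx;
create index t_v_idx on t using gin (v);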

The results show that the 'scan' step is sped up perfectly, but the
'merge' step takes longer as you increase the number of workers. The
profiler shows that the bottleneck here is ginMergeItemPointers(), which
I use to merge the results.

Also, I did encounter the problem with workers deadlocking during
heap_open, but that seems to have been resolved by Robert Haas in his
commit regarding group locking.

Please leave your feedback!

Regards,

Constantin S. Pan
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

pgin-performance.pdf
Description: Adobe PDF document


pgin-performance.dat
Description: Binary data
diff --git a/src/backend/access/gin/gininsert.c b/src/backend/access/gin/gininsert.c
index cd21e0e..a8c5ec5 100644
--- a/src/backend/access/gin/gininsert.c
+++ b/src/backend/access/gin/gininsert.c
@@ -16,14 +16,20 @@
 
 #include "access/gin_private.h"
 #include "access/xloginsert.h"
+#include "access/parallel.h"
+#include "access/xact.h"
 #include "catalog/index.h"
 #include "miscadmin.h"
 #include "storage/bufmgr.h"
 #include "storage/smgr.h"
 #include "storage/indexfsm.h"
+#include "storage/spin.h"
+#include "utils/datum.h"
 #include "utils/memutils.h"
 #include "utils/rel.h"
 
+/* GUC parameter */
+int gin_parallel_workers = 0;
 
 typedef struct
 {
@@ -265,6 +271,141 @@ ginHeapTupleBulkInsert(GinBuildState *buildstate, OffsetNumber attnum,
 	MemoryContextReset(buildstate->funcCtx);
 }
 
+#define KEY_TASK 42
+#define GIN_MAX_WORKERS 24
+#define GIN_BLOCKS_PER_WORKER 4
+#define GIN_RESULT_LEN 1024
+#define GIN_RESULT_KEYLEN 1024
+
+typedef struct {
+	bool ready;
+	bool finished;
+
+	Datum			key;
+	OffsetNumber	attnum;
+	GinNullCategory	category;
+
+	char keycoded[GIN_RESULT_KEYLEN];
+	int keylen;
+
+	ItemPointerData list[GIN_RESULT_LEN];
+	int			nlist;
+
+	Latch	blatch;
+	Latch	wlatch;
+} WorkerResult;
+
+/*
+ * This structure describes the GIN build task for the parallel workers. We use
+ * OIDs here because workers are separate processes and pointers may become
+ * meaningless for them. The "lock" is used to prot

Re: [HACKERS] proposal: PL/Pythonu - function ereport

2016-02-17 Thread Catalin Iacob
On Wed, Feb 17, 2016 at 3:32 PM, Pavel Stehule  wrote:
>> Python 3 has keyword only arguments. It occurs to me they're exactly
>> for "optional extra stuff" like detail, hint etc.
>> Python 2 doesn't have that notion but you can kind of fake it since
>> you get an args tuple and a kwargs dictionary.
>
>
> I would prefer the possibility to use both ways - the positional form is
> shorter, keywords can help with some parameters.
>
> But I cannot imagine your idea, can you show it in detail?

Sure, what I mean is:

plpy.error('msg') # as before produces message 'msg'
plpy.error(42) # as before produces message '42', including the
conversion of the int to str
plpy.error('msg', 'arg 2 is still part of msg') # as before, produces
message "('msg', 'arg 2 is still part of msg')"
# and so on for as many positional arguments, nothing changes
# I still think allowing more than one positional argument is
unfortunate but for compatibility we keep allowing more

# to pass detail you MUST use keyword args to disambiguate "I really
want detail" vs. "I have argument 2 which is part of the message
tuple for compatibility"
plpy.error('msg', 42, detail='a detail') # produces message '('msg',
42)' and detail 'a detail'
plpy.error('msg', detail=77) # produces message 'msg' and detail '77'
so detail is also converted to str just like message for consistency
# and so on for the others
plpy.error('msg', 42, detail='a detail', hint='a hint')
plpy.error('msg', 42, schema='sch')

Only keyword arguments are treated specially and we know no existing
code has keyword arguments since they didn't work before.

Implementation-wise, it's something like this, but in C:

def error(*args, **kwargs):
    if len(args) == 1:
        message = str(args[0])
    else:
        message = str(args)

    # fetch value from dictionary, or None if the key is missing
    detail = kwargs.pop('detail', None)
    hint = kwargs.pop('hint', None)

    # use message, detail, hint etc. to raise an exception for error and
    # fatal / call ereport for the other levels

Is it clear now? What do you think?


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Add some isolation tests for deadlock detection and resolution.

2016-02-17 Thread Tom Lane
Robert Haas  writes:
> On Tue, Feb 16, 2016 at 2:59 AM, Tom Lane  wrote:
>> Not to be neglected also is that (I believe) this gives the right answer,
>> whereas isolationtester's existing query is currently completely broken by
>> parallel queries, and it doesn't understand non-conflicting lock modes
>> either.  (It did, at least partially, before commit 38f8bdcac4982215;
>> I am not sure that taking out the mode checks was a good idea.  But
>> putting them back would make the query slower yet.)

> The reason I took that out is because it breaks the deadlock-soft
> test.  It's possible to have a situation where no granted lock
> conflicts with an awaited lock.  If that happens, the old query
> wrongly concluded that the waiting process was not in fact waiting.
> (Consider: A holds AccessShareLock, B awaits AccessExclusiveLock, C now
> requests AccessShareLock and *waits*.)

Ah, well, with this patch the deadlock-soft test still passes.

> As for the patch itself, I'm having trouble grokking what it's trying
> to do.  I think it might be worth having a comment defining precisely
> what we mean by "A blocks B".  I would define "A blocks B" in general
> as either A holds a lock which conflicts with one sought by B
> (hard-blocked) or A awaits a lock which conflicts with one sought by B
> and precedes it in the wait queue (soft-blocked).

Yes, that is exactly what I implemented ... and it's something you can't
find out from pg_locks.  I'm not sure how that view could be made to
expose wait-queue ordering.

> For parallel queries, there's a further relevant distinction when we
> say "A blocks B".  We might mean either that (1) process B cannot
> resume execution until the lock conflict is resolved or (2) the group
> leader for process B cannot complete the current parallel operation
> until the lock conflict is resolved.

The definition I used in this patch is "some member of A's lock group
blocks some member of B's lock group", because that corresponds directly
to whether A is preventing B's query from completing, which is what
isolationtester wants to know --- and, I would argue, it's generally what
any client would want to know.  99.9% of clients would just as soon not be
aware of parallel workers lurking underneath the pg_backend_pid() values
that they see for their sessions.
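
For illustration, assuming the draft function ends up exposed as
pg_blocking_pids(pid) (a sketch; the name could still change), a client
could then simply run:

-- set of PIDs blocking backend 12345, counting whole lock groups
SELECT pg_blocking_pids(12345);

-- isolationtester-style check: is 24680 among the PIDs blocking 12345?
SELECT 24680 = ANY (pg_blocking_pids(12345));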

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-02-17 Thread Dean Rasheed
On 17 February 2016 at 02:32, Jim Nasby  wrote:
> On 2/11/16 4:21 AM, Dean Rasheed wrote:
>>
>> Thinking about this some more though, perhaps *sorting* the columns is
>> the wrong way to be thinking about it. Perhaps a better approach would
>> be to allow the columns to be *listed* (optionally, using a separate
>> query). Something like the following (don't get too hung up on the
>> syntax):
>>
>> SELECT name,
>> to_char(date, 'Mon') AS month,
>> sum(amount) AS amount
>>   FROM invoices
>>   GROUP BY 1,2
>>   ORDER BY name
>> \crosstabview cols = (select to_char(d, 'Mon') from
>> generate_series('2000-01-01'::date, '2000-12-01', '1 month') d)
>
>
> My concern with that is that often you don't know what the columns will be,
> because you don't know what exact data the query will produce. So to use
> this syntax you'd have to re-create a huge chunk of the original query. :(
>

Yeah, that's a reasonable concern.

On the flip side, one of the advantages of the above syntax is that
you have absolute control over the columns, whereas with the
sort-based syntax you might find some columns missing (e.g., if there
were no invoices in August) and that could lead to confusion parsing
the results.

I'm not totally opposed to specifying a column sort order in psql, and
perhaps there's a way to support both 'cols' and 'col_order' options
in psql, since there are different situations where one or the other
might be more useful.

What I am opposed to is specifying the row order in psql, because IMO
that's something that should be done entirely in the SQL query.

Regards,
Dean


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-02-17 Thread Dean Rasheed
On 15 February 2016 at 14:08, Daniel Verite  wrote:
> Dean Rasheed wrote:
>
>> My biggest problem is with the sorting, for all the reasons discussed
>> above. There is absolutely no reason for \crosstabview to be
>> re-sorting rows -- they should just be left in the original query
>> result order
>
> Normal top-down display:
>
> select v,to_char(d,'Mon') as m, c  from v_data order by d;
>
>  v  |  m  |  c
> ----+-----+-----
>  v2 | Jan | bar
>  v1 | Apr | foo
>  v1 | Jul | baz
>  v0 | Jul | qux
>
> At this point, it seems to me that it's perfectly reasonable for our user
> to expect the possibility of sorting additionally by "v" , without
> changing the query and without changing the order of the horizontal
> header:
>
>  \crosstabview +v m c
>
>  v  | Jan | Apr | Jul
> ----+-----+-----+-----
>  v0 |     |     | qux
>  v1 |     | foo | baz
>  v2 | bar |     |
>

I don't find that example particularly compelling. If I want to sort
the rows coming out of a query, my first thought is always going to be
to add/adjust the query's ORDER BY clause, not use some weird +/- psql
syntax.

The crux of the problem here is that in a pivoted query resultset SQL
can be used to control the order of the rows or the columns, but not
both at the same time. IMO it is more natural to use SQL to control
the order of the rows. The columns are the result of the psql
pivoting, so it's reasonable to control them via psql options.
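
For example, with the v_data sample from earlier in the thread, the row
order would be pinned down in SQL and only the pivoting left to psql (a
sketch using the proposed syntax):

select v, to_char(d, 'Mon') as m, c
  from v_data
 order by v, d;     -- row order comes from the query itself
\crosstabview v m c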

A couple of other points to bear in mind:

The number of columns is always going to be quite limited (at most
1600, and usually far less than that), whereas the number of rows
could be arbitrarily large. So sorting the rows client-side the way
you propose could get very inefficient, whereas that's not such a
problem for the columns.

The column values are non-NULL, so they require a more limited set of
sort options, whereas the rows could be anything, and people will want
all the sort options to be available.

Regards,
Dean


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: Access method extendability

2016-02-17 Thread Teodor Sigaev

The next version of the patch is attached:
  * Documentation for the CREATE ACCESS METHOD command is added.
  * Refactoring and comments for the bloom contrib module.


Cool work, nice to see.

Some comments
1 create-am.7.patch: why doesn't lookup_index_am_handler_func() emit an
error in case of empty input? If it checks the signature of the function and
an empty handler is not allowed then, it seems, all checks about the handler
have to be moved into lookup_index_am_handler_func().


2 create-am.7.patch: The comment near get_am_name() incorrectly mentions the
get_am_oid function.


3 create-am.7.patch: get_am_name(Oid amOid, char amtype). The amtype argument
seems like overengineering.  get_am_name() is used only in error messages, and
an additional check in it will do nothing or even confuse the user.


4 create-am.7.patch: Inconsistency with the psql help. As far as I can see
from the code, it's forbidden to create an access method without a handler:

postgres=# \h create access method
Command: CREATE ACCESS METHOD
Description: define a new access method
Syntax:
CREATE ACCESS METHOD name
TYPE INDEX
[ HANDLER handler_function | NO HANDLER ]

postgres=# create access method yyy type index no handler;
ERROR:  syntax error at or near "no"
LINE 1: create access method yyy type index no handler;

5 create-am.7.patch: file src/bin/pg_dump/pg_dump.h. Extra comma near DO_POLICY:
*** 77,83 
DO_POST_DATA_BOUNDARY,
DO_EVENT_TRIGGER,
DO_REFRESH_MATVIEW,
!   DO_POLICY
  } DumpableObjectType;

  typedef struct _dumpableObject
--- 78,84 
DO_POST_DATA_BOUNDARY,
DO_EVENT_TRIGGER,
DO_REFRESH_MATVIEW,
!   DO_POLICY,
  } DumpableObjectType;

6 generic-xlog.7.patch: writeDelta(): it seems that, even under the
USE_ASSERT_CHECKING define, checking the diff by applying it is too expensive.
Maybe we should use a GENERIC_WAL_DEBUG macro here?


7 generic-xlog.7.patch: src/backend/access/transam/generic_xlog.c: It's
unclear to me what MATCH_THRESHOLD does or is intended for. Please add
comments here.


8 generic-xlog.7.patch: src/backend/access/transam/generic_xlog.c. Again, it's
unclear to me what the motivation for the size of PageData.data is.


9 generic-xlog.7.patch: GenericXLogRegister(): emitting an error if the
caller tries to register a buffer which is already registered doesn't seem
like a good idea. In practice (say, in SP-GiST) a buffer variable is used and
the page can easily be obtained from the buffer. I suppose
GenericXLogRegister() should not emit an error and should ignore the isnew
flag in case of a second registration of the same buffer.


10 bloom-contrib.7.patch:
blutils.c: In function 'initBloomState':
blutils.c:128:20: warning: variable 'opaque' set but not used 
[-Wunused-but-set-variable]

   BloomPageOpaque  opaque;

11 I'd really like to see regression tests (TAP-tests?) for replication with 
generic xlog.





--
Teodor Sigaev   E-mail: teo...@sigaev.ru
   WWW: http://www.sigaev.ru/


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Add some isolation tests for deadlock detection and resolution.

2016-02-17 Thread Robert Haas
On Tue, Feb 16, 2016 at 2:59 AM, Tom Lane  wrote:
> Andres Freund  writes:
>> I wonder if we shouldn't just expose a 'which pid is process X waiting
>> for' API, implemented serverside. That's generally really useful, and
>> looks like it's actually going to be less complicated than that
>> query... And it's surely going to be faster.
>
> Attached is a draft patch for a new function that reports the set of PIDs
> directly blocking a given PID (that is, holding or awaiting conflicting
> locks on the lockable object it's waiting for).
>
> I replaced isolationtester's pg_locks query with this, and found that
> it's about 9X faster in a normal build, and 3X faster with
> CLOBBER_CACHE_ALWAYS turned on.  That would give us some nice headroom
> for the isolation tests with CLOBBER_CACHE_ALWAYS animals.  (Note that
> in view of
> http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=markhor&dt=2016-02-14%2007%3A38%3A37
> we still need to do *something* about the speed of the new deadlock-hard
> test; this patch could avoid the need to dumb down or slow down that test
> even further.)
>
> Not to be neglected also is that (I believe) this gives the right answer,
> whereas isolationtester's existing query is currently completely broken by
> parallel queries, and it doesn't understand non-conflicting lock modes
> either.  (It did, at least partially, before commit 38f8bdcac4982215;
> I am not sure that taking out the mode checks was a good idea.  But
> putting them back would make the query slower yet.)

The reason I took that out is because it breaks the deadlock-soft
test.  It's possible to have a situation where no granted lock
conflicts with an awaited lock.  If that happens, the old query
wrongly concluded that the waiting process was not in fact waiting.
(Consider: A holds AccessShareLock, B awaits AccessExclusiveLock, C now
requests AccessShareLock and *waits*.)

As for the patch itself, I'm having trouble grokking what it's trying
to do.  I think it might be worth having a comment defining precisely
what we mean by "A blocks B".  I would define "A blocks B" in general
as either A holds a lock which conflicts with one sought by B
(hard-blocked) or A awaits a lock which conflicts with one sought by B
and precedes it in the wait queue (soft-blocked).  I have wondered
before if we shouldn't modify pg_locks to expose the wait-queue
ordering; without that, you can't reliably determine in general
whether A soft-blocks B, which means every view anyone has ever
written over pg_locks that purports to say who blocks who is
necessarily buggy.
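
For reference, the kind of pg_locks self-join this critique applies to
looks roughly like the following simplified sketch (it only handles
relation locks, and can only see hard blocks, not wait-queue ordering):

SELECT w.pid AS waiting_pid, h.pid AS holding_pid
  FROM pg_locks w
  JOIN pg_locks h
    ON h.locktype = w.locktype
   AND h.database IS NOT DISTINCT FROM w.database
   AND h.relation IS NOT DISTINCT FROM w.relation
   AND h.pid <> w.pid
 WHERE NOT w.granted
   AND h.granted;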

For parallel queries, there's a further relevant distinction when we
say "A blocks B".  We might mean either that (1) process B cannot
resume execution until the lock conflict is resolved or (2) the group
leader for process B cannot complete the current parallel operation
until the lock conflict is resolved.  If you're trying to figure out
why one particular member of a parallel group is stuck, you want to
answer question #1.  If you're trying to figure out what all the
things that need to get out of the way to finish the query, you want
to answer question #2.  I think this function is aiming to answer
question #2, not question #1, but I'm less clear on the reason behind
that definitional choice.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fuzzy substring searching with the pg_trgm extension

2016-02-17 Thread Artur Zakirov

On 12.02.2016 20:56, Teodor Sigaev wrote:

On Thu, Feb 11, 2016 at 9:56 AM, Teodor Sigaev  wrote:

1 - sml_limit to similarity_limit. sml_threshold is difficult to write,
I think; similarity_limit is simpler.


It seems to me that threshold is the right word by meaning. sml_threshold
is my choice.


Why abbreviate it like that?  Nobody's going to know that "sml" stands
for "similarity" without consulting the documentation, and that sucks.


Ok, I don't have any objections. I have worked a lot on various similarity
modules and sml has become usual for me. That's why I was asking.



Hi!

I attached a new version of the patch. It fixes the names of the GUC
variables and functions.


Now the patch introduces:
1 - functions:
- word_similarity()
2 - operators:
- %> (and <%)
- <->> (and <<->)
3 - GUC variables:
- pg_trgm.similarity_threshold
- pg_trgm.word_similarity_threshold
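
Usage would presumably look like this (a sketch based on the names above;
the table name test_trgm is hypothetical):

SET pg_trgm.word_similarity_threshold = 0.5;

-- similarity between the first string and the best-matching part of the second
SELECT word_similarity('word', 'two words');

-- filter using the new operator and the new threshold
SELECT t FROM test_trgm WHERE 'word' <% t;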

--
Artur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
*** a/contrib/pg_trgm/pg_trgm--1.2.sql
--- b/contrib/pg_trgm/pg_trgm--1.2.sql
***
*** 3,13 
--- 3,15 
  -- complain if script is sourced in psql, rather than via CREATE EXTENSION
  \echo Use "CREATE EXTENSION pg_trgm" to load this file. \quit
  
+ -- Deprecated function
  CREATE FUNCTION set_limit(float4)
  RETURNS float4
  AS 'MODULE_PATHNAME'
  LANGUAGE C STRICT VOLATILE;
  
+ -- Deprecated function
  CREATE FUNCTION show_limit()
  RETURNS float4
  AS 'MODULE_PATHNAME'
***
*** 26,32  LANGUAGE C STRICT IMMUTABLE;
  CREATE FUNCTION similarity_op(text,text)
  RETURNS bool
  AS 'MODULE_PATHNAME'
! LANGUAGE C STRICT STABLE;  -- stable because depends on trgm_limit
  
  CREATE OPERATOR % (
  LEFTARG = text,
--- 28,34 
  CREATE FUNCTION similarity_op(text,text)
  RETURNS bool
  AS 'MODULE_PATHNAME'
! LANGUAGE C STRICT STABLE;  -- stable because depends on pg_trgm.similarity_threshold
  
  CREATE OPERATOR % (
  LEFTARG = text,
*** a/contrib/pg_trgm/trgm.h
--- b/contrib/pg_trgm/trgm.h
***
*** 105,111  typedef char *BITVECP;
  
  typedef struct TrgmPackedGraph TrgmPackedGraph;
  
! extern float4 trgm_limit;
  
  extern uint32 trgm2int(trgm *ptr);
  extern void compact_trigram(trgm *tptr, char *str, int bytelen);
--- 105,111 
  
  typedef struct TrgmPackedGraph TrgmPackedGraph;
  
! extern double similarity_threshold;
  
  extern uint32 trgm2int(trgm *ptr);
  extern void compact_trigram(trgm *tptr, char *str, int bytelen);
*** a/contrib/pg_trgm/trgm_gin.c
--- b/contrib/pg_trgm/trgm_gin.c
***
*** 206,212  gin_trgm_consistent(PG_FUNCTION_ARGS)
  			 * similarity is just c / len1.
  			 * So, independly on DIVUNION the upper bound formula is the same.
  			 */
! 			res = (nkeys == 0) ? false : ((float4) ntrue) / ((float4) nkeys))) >= trgm_limit) ? true : false);
  			break;
  		case ILikeStrategyNumber:
  #ifndef IGNORECASE
--- 206,214 
  			 * similarity is just c / len1.
  			 * So, independly on DIVUNION the upper bound formula is the same.
  			 */
! 			res = (nkeys == 0) ? false :
! ((float4) ntrue) / ((float4) nkeys))) >= similarity_threshold)
! 	? true : false);
  			break;
  		case ILikeStrategyNumber:
  #ifndef IGNORECASE
***
*** 283,289  gin_trgm_triconsistent(PG_FUNCTION_ARGS)
  			/*
  			 * See comment in gin_trgm_consistent() about * upper bound formula
  			 */
! 			res = (nkeys == 0) ? GIN_FALSE : (float4) ntrue) / ((float4) nkeys)) >= trgm_limit) ? GIN_MAYBE : GIN_FALSE);
  			break;
  		case ILikeStrategyNumber:
  #ifndef IGNORECASE
--- 285,293 
  			/*
  			 * See comment in gin_trgm_consistent() about * upper bound formula
  			 */
! 			res = (nkeys == 0) ? GIN_FALSE :
! (float4) ntrue) / ((float4) nkeys)) >= similarity_threshold)
! 	? GIN_MAYBE : GIN_FALSE);
  			break;
  		case ILikeStrategyNumber:
  #ifndef IGNORECASE
*** a/contrib/pg_trgm/trgm_gist.c
--- b/contrib/pg_trgm/trgm_gist.c
***
*** 294,300  gtrgm_consistent(PG_FUNCTION_ARGS)
  float4		tmpsml = cnt_sml(key, qtrg);
  
  /* strange bug at freebsd 5.2.1 and gcc 3.3.3 */
! res = (*(int *) &tmpsml == *(int *) &trgm_limit || tmpsml > trgm_limit) ? true : false;
  			}
  			else if (ISALLTRUE(key))
  			{	/* non-leaf contains signature */
--- 294,301 
  float4		tmpsml = cnt_sml(key, qtrg);
  
  /* strange bug at freebsd 5.2.1 and gcc 3.3.3 */
! res = (*(int *) &tmpsml == *(int *) &similarity_threshold
! 		|| tmpsml > similarity_threshold) ? true : false;
  			}
  			else if (ISALLTRUE(key))
  			{	/* non-leaf contains signature */
***
*** 308,314  gtrgm_consistent(PG_FUNCTION_ARGS)
  if (len == 0)
  	res = false;
  else
! 	res = (float8) count) / ((float8) len))) >= trgm_limit) ? true : false;
  			}
  			break;
  		case ILikeStrategyNumber:
--- 309,316 
  if (len == 0)
  	res = false;
  else
! 	res = (float8) count) / ((fl

Re: [HACKERS] proposal: PL/Pythonu - function ereport

2016-02-17 Thread Pavel Stehule
Hi

2016-02-17 7:34 GMT+01:00 Catalin Iacob :

> After I unzip the patch it doesn't apply.
> patch says:
> patch:  Only garbage was found in the patch input.
>
> It's a combined diff, the git-diff manual says this about it:
> Chunk header format is modified to prevent people from accidentally
> feeding it to patch -p1. Combined diff format was created for review
> of merge commit changes, and was not meant for apply.
>
> Thanks for doing the test changes. It definitely shows the change is
> big. The tests at least were heavily relying on both the str
> conversion and on passing more than one argument. Sad face.


> You're going to hate me but seeing this I changed my mind about 5,
> requiring all those extra str calls is too much change in behaviour. I
> was going to propose passing everything through str (so message,
> detail, hint but also schema, table) but thinking about it some more,
> I think I have something better.
>

Ok, it is no problem - this work is more or less research, and often there
is no direct way :)


>
> Python 3 has keyword only arguments. It occurs to me they're exactly
> for "optional extra stuff" like detail, hint etc.
> Python 2 doesn't have that notion but you can kind of fake it since
> you get an args tuple and a kwargs dictionary.
>

I would prefer the possibility to use both ways - the positional form is
shorter, keywords can help with some parameters.

But I cannot imagine your idea, can you show it in detail?


>
> What about keeping the same behaviour for multiple positional
> arguments (single one -> make it str, multiple -> use str of the args
> tuple) and requiring users to pass detail, hint only as keyword
> arguments? Code wise it would only mean adding PyObject* kw to
> PLy_output and adding some code to extract detail, hint etc. from kw.
> This would also allow getting rid of the GUC since backward
> compatibility is fully preserved.
>
> Again, sorry for all the back and forth but it still feels like we
> haven't nailed the design to something satisfactory. All the tests you
> needed to change are a hint towards that.
>

It is not any problem. I am thankful for the cooperation.

Regards

Pavel
diff --git a/doc/src/sgml/plpython.sgml b/doc/src/sgml/plpython.sgml
new file mode 100644
index 015bbad..2bd9bcb
*** a/doc/src/sgml/plpython.sgml
--- b/doc/src/sgml/plpython.sgml
*** $$ LANGUAGE plpythonu;
*** 1341,1360 
Utility Functions

 The plpy module also provides the functions
!plpy.debug(msg),
!plpy.log(msg),
!plpy.info(msg),
!plpy.notice(msg),
!plpy.warning(msg),
!plpy.error(msg), and
!plpy.fatal(msg).elogin PL/Python
 plpy.error and
 plpy.fatal actually raise a Python exception
 which, if uncaught, propagates out to the calling query, causing
 the current transaction or subtransaction to be aborted.
 raise plpy.Error(msg) and
 raise plpy.Fatal(msg) are
!equivalent to calling
 plpy.error and
 plpy.fatal, respectively.
 The other functions only generate messages of different
--- 1341,1360 
Utility Functions

 The plpy module also provides the functions
!plpy.debug(exception_params),
!plpy.log(exception_params),
!plpy.info(exception_params),
!plpy.notice(exception_params),
!plpy.warning(exception_params),
!plpy.error(exception_params), and
!plpy.fatal(exception_params).elogin PL/Python
 plpy.error and
 plpy.fatal actually raise a Python exception
 which, if uncaught, propagates out to the calling query, causing
 the current transaction or subtransaction to be aborted.
 raise plpy.Error(msg) and
 raise plpy.Fatal(msg) are
!partially equivalent to calling
 plpy.error and
 plpy.fatal, respectively.
 The other functions only generate messages of different
*** $$ LANGUAGE plpythonu;
*** 1367,1372 
--- 1367,1397 

  

+ 
+The exception_params are
+[ message [, detail [, hint [, sqlstate [, schema [, table [, column [, datatype [, constraint ]]]]]]]]].
+These parameters can be entered as keyword parameters.
+The message, detail, hint
+are automatically cast to string; the others should be strings.
+ 
+ 
+ CREATE FUNCTION raise_custom_exception() RETURNS void AS $$
+ plpy.error("custom exception message", "some info about exception", "hint for users")
+ $$ LANGUAGE plpythonu;
+ 
+ postgres=# select raise_custom_exception();
+ ERROR:  XX000: plpy.Error: custom exception message
+ DETAIL:  some info about exception
+ HINT:  hint for users
+ CONTEXT:  Traceback (most recent call last):
+   PL/Python function "raise_custom_exception", line 2, in 
+ plpy.error("custom exception message", "some info about exception", "hint for users")
+ PL/Python function "raise_custom_exception"
+ LOCATION:  PLy_elog, plpy_elog.c:132
+ 
+   
+ 
+   
 Another set of utility functions are
 plpy.quote_literal(string),
 plpy.quote_nullable(string), and
diff --git a/src/pl/plpyt

Re: [HACKERS] Re: [COMMITTERS] pgsql: Introduce group locking to prevent parallel processes from deadl

2016-02-17 Thread Tom Lane
Robert Haas  writes:
> I guess it's worth pointing out that storing the groupLeader in the
> PROCLOCK, as we do currently, avoids this issue entirely.

Right, that's why I framed it as "can we salvage this simplification"
(referring to removal of that field).  It may be best to just fix the
existing bug and docs deficiencies before returning to that idea.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PROPOSAL] Improvements of Hunspell dictionaries support

2016-02-17 Thread Artur Zakirov

On 16.02.2016 18:14, Artur Zakirov wrote:

I attached a new version of the patch.



Sorry for the noise. I attached a new version of the patch. I saw mistakes in
DecodeFlag(). This patch fixes them.


--
Artur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
*** a/doc/src/sgml/textsearch.sgml
--- b/doc/src/sgml/textsearch.sgml
***
*** 2615,2632  SELECT plainto_tsquery('supernova star');
 
  
 
! To create an Ispell dictionary, use the built-in
! ispell template and specify several parameters:
 
! 
  
! CREATE TEXT SEARCH DICTIONARY english_ispell (
  TEMPLATE = ispell,
! DictFile = english,
! AffFile = english,
! StopWords = english
! );
  
  
 
  Here, DictFile, AffFile, and StopWords
--- 2615,2655 
 
  
 
! To create an Ispell dictionary, perform these steps:
 
!
! 
!  
!   download dictionary configuration files. OpenOffice
!   extension files have the .oxt extension. It is necessary
!   to extract .aff and .dic files, change
!   extensions to .affix and .dict. For some
!   dictionary files it is also necessary to convert characters to the UTF-8
!   encoding with commands (for example, for the Norwegian language dictionary):
  
! iconv -f ISO_8859-1 -t UTF-8 -o nn_no.affix nn_NO.aff
! iconv -f ISO_8859-1 -t UTF-8 -o nn_no.dict nn_NO.dic
! 
!  
! 
! 
!  
!   copy files to the $SHAREDIR/tsearch_data directory
!  
! 
! 
!  
!   load files into PostgreSQL with the following command:
! 
! CREATE TEXT SEARCH DICTIONARY english_hunspell (
  TEMPLATE = ispell,
! DictFile = en_us,
! AffFile = en_us,
! Stopwords = english);
  
+  
+ 
+
  
 
  Here, DictFile, AffFile, and StopWords
***
*** 2643,2648  CREATE TEXT SEARCH DICTIONARY english_ispell (
--- 2666,2721 
 
  
 
+ The .affix file of Ispell has the following
+ structure:
+ 
+ prefixes
+ flag *A:
+ .   >   RE  # As in enter > reenter
+ suffixes
+ flag T:
+ E   >   ST  # As in late > latest
+ [^AEIOU]Y   >   -Y,IEST # As in dirty > dirtiest
+ [AEIOU]Y>   EST # As in gray > grayest
+ [^EY]   >   EST # As in small > smallest
+ 
+
+
+ And the .dict file has the following structure:
+ 
+ lapse/ADGRS
+ lard/DGRS
+ large/PRTY
+ lark/MRS
+ 
+
+ 
+
+ Format of the .dict file is:
+ 
+ basic_form/affix_class_name
+ 
+
+ 
+
+ In the .affix file every affix flag is described in the
+ following format:
+ 
+ condition > [-stripping_letters,] adding_affix
+ 
+
+ 
+
+ Here, condition has a format similar to the format of regular expressions.
+ It can use groupings [...] and [^...].
+ For example, [AEIOU]Y means that the last letter of the word
+ is "y" and the penultimate letter is "a",
+ "e", "i", "o" or "u".
+ [^EY] means that the last letter is neither "e"
+ nor "y".
+
+ 
+
  Ispell dictionaries support splitting compound words;
  a useful feature.
  Notice that the affix file should specify a special flag using the
***
*** 2663,2668  SELECT ts_lexize('norwegian_ispell', 'sjokoladefabrikk');
--- 2736,2800 
  
 
  
+
+ MySpell is very similar to Hunspell.
+ The .affix file of Hunspell has the following
+ structure:
+ 
+ PFX A Y 1
+ PFX A   0 re .
+ SFX T N 4
+ SFX T   0 st e
+ SFX T   y iest   [^aeiou]y
+ SFX T   0 est[aeiou]y
+ SFX T   0 est[^ey]
+ 
+
+ 
+
+ The first line of an affix class is the header. Fields of an affix rules are
+ listed after the header:
+
+
+ 
+  
+   parameter name (PFX or SFX)
+  
+ 
+ 
+  
+   flag (name of the affix class)
+  
+ 
+ 
+  
+   stripping characters from beginning (at prefix) or end (at suffix) of the
+   word
+  
+ 
+ 
+  
+   adding affix
+  
+ 
+ 
+  
+   condition that has a format similar to the format of regular expressions.
+  
+ 
+
+ 
+
+ The .dict file looks like the .dict file of
+ Ispell:
+ 
+ larder/M
+ lardy/RT
+ large/RSPMYT
+ largehearted
+ 
+
+ 
 
  
   MySpell does not support compound words.
*** a/src/backend/tsearch/Makefile
--- b/src/backend/tsearch/Makefile
***
*** 13,20  include $(top_builddir)/src/Makefile.global
  
  DICTDIR=tsearch_data
  
! DICTFILES=synonym_sample.syn thesaurus_sample.ths hunspell_sample.affix \
! 	ispell_sample.affix ispell_sample.dict
  
  OBJS = ts_locale.o ts_parse.o wparser.o wparser_def.o dict.o \
  	dict_simple.o dict_synonym.o dict_thesaurus.o \
--- 13,23 
  
  DICTDIR=tsearch_data
  
! DICTFILES=dicts/synonym_sample.syn dicts/thesaurus_sample.ths \
! 	dicts/hunspell_sample.affix \
! 	dicts/ispell_sample.affix dicts

[HACKERS] Prepared Statement support for Parallel query

2016-02-17 Thread Amit Kapila
Commit d1b7c1ffe72e86932b5395f29e006c3f503bc53d added
support for passing bind parameters to parallel workers; however,
prepared statements that use bind parameters weren't enabled
for parallel query, as the 'Execute' message in the FE-BE protocol
can pass a row_count, which can make parallel plans unusable
(parallel plans are only possible when the query can run to completion).

Later, commit bfc78d7196eb28cd4e3d6c24f7e607bacecf1129
ensured that if the row_count is non-zero then we won't enter
parallel mode, which means that even if a parallel plan is selected
by the optimizer, we will run such a plan locally.

With the above support, it was just a matter of enabling parallel
mode for prepared statements, which is done in the attached patch
(prepared_stmt_parallel_query_v1.patch).

I have tested that parallel plans get generated both
via Prepare/Execute statements and via libpq prepared
statement execution.  Attached is a libpq program
(prepare_parallel_query.c) which I have used for testing prepared
statement support.  I have done the verification manually
(using auto_explain) to ensure that parallel plans get generated
and executed via this libpq program.  The program expects some
data to be generated beforehand; the details are in its file
header.
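
From psql, a quick way to see the effect is something like this (a
sketch; assumes a build with the patch applied and parallelism enabled
via the 9.6-era max_parallel_degree GUC):

SET max_parallel_degree = 4;
PREPARE fetch_aa AS SELECT c1 FROM t1 WHERE c1 < $1;
EXPLAIN EXECUTE fetch_aa(1000);   -- with the patch, a Gather plan can appear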

Thoughts?

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


prepared_stmt_parallel_query_v1.patch
Description: Binary data
/*
 * Test case for PreparedStmt
 * Need to first run this query on the database the connection is
 * made to:
 * CREATE TABLE t1 (c1 int, c2 char(1000));
 * INSERT INTO t1 values(generate_series(1,50), '');
 */
#include <stdio.h>
#include <libpq-fe.h>

#define DB_CONN_STR		"host=localhost dbname=postgres"
#define SQL_FETCH_AA	"select c1 from t1 where c1 < $1"

int
main(int argc, char *argv[])
{
	PGconn* conn;
	PGresult*   res;
	int			ntuples;
	const char* paramValues[1];

	if ((conn = PQconnectdb(DB_CONN_STR)) == NULL)
		fprintf(stderr, "Connect to '%s' failed", DB_CONN_STR);

	if (PQstatus(conn) != CONNECTION_OK)
		fprintf(stderr, "Connect to '%s' failed: %s",
			DB_CONN_STR, PQerrorMessage(conn));

	if ((res = PQprepare(conn, "sql_fetch_aa",
		 SQL_FETCH_AA, 1, NULL)) == NULL)
		fprintf(stderr, "Preparing statement '%s' failed",
			SQL_FETCH_AA);

	if (PQresultStatus(res) != PGRES_COMMAND_OK)
		fprintf(stderr, "Preparing statement '%s' failed: %s",
			SQL_FETCH_AA, PQerrorMessage(conn));

	paramValues[0] = "1000";

	res = PQexecPrepared(conn, "sql_fetch_aa", 1,
		 paramValues,
		 NULL, NULL, 0);
	if (PQresultStatus(res) != PGRES_TUPLES_OK)
	{
		fprintf(stderr, "fetch data failed: %s",
PQerrorMessage(conn));
		return 1;
	}

	ntuples = PQntuples(res);
	PQclear(res);

	PQfinish(conn);

	return 0;
}

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: function parse_ident

2016-02-17 Thread Teodor Sigaev

I missed some of my edits. Updated patch with those in place attached.


I'm sorry, but special characters still aren't escaped correctly in error
messages:

% select parse_ident(E'X\rXX');
XX""X
Time: 0,510 ms



--
Teodor Sigaev   E-mail: teo...@sigaev.ru
   WWW: http://www.sigaev.ru/


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pglogical - logical replication contrib module

2016-02-17 Thread Craig Ringer
On 17 February 2016 at 18:39, Konstantin Knizhnik  wrote:

> Ok, what about the following plan:
>
> 1. Support custom WAL records (as far as I know 2ndQuadrant has such
> patch).
> 2. Add one more function to logical decoding allowing to deal with custom
> records.
>
> So the idea is that we somehow record DDL in WAL (for example using an
> executor hook),
> then they are processed using logical decoding, calling a special logical
> decoding plugin function to handle these records.
> For example we can store DDL in WAL just as SQL statements and so easily
> replay them.
>
> In this case DDL will be replicated using the same mechanism and through
> the same channel as DML.
>
>

Sure, you can do that, but you don't need to.

Go read the relevant BDR code again, you've missed how it works.

When DDL is fired the registered event trigger captures the DDL and passes
it to DDL deparse to extract a normalized representation. It is inserted
into a queued ddl commands table in the BDR schema during the transaction
that performed the DDL.

Later, when that transaction is decoded by logical decoding we see an
insert into the queued ddl commands table and replicate that to the
client(s).

Clients see the inserts into the queued DDL commands table and special-case
them on replay. As well as executing the original insert they also execute
the DDL command that was inserted into the table. This happens at the same
point in the transaction as the original insert, which is when the DDL was
run. So it's all consistent.
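
A minimal sketch of that queue-table pattern (the layout here is
hypothetical, not BDR's actual schema):

CREATE TABLE replicated_ddl_queue (
    queued_at timestamptz NOT NULL DEFAULT now(),
    role      name        NOT NULL DEFAULT current_user,
    command   text        NOT NULL  -- normalized DDL captured by the event trigger
);
-- The event trigger inserts the deparsed DDL here inside the same
-- transaction; downstream, replay applies the insert and then executes
-- "command" at that exact point in the transaction.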

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] [REVIEW] In-core regression tests for replication, cascading, archiving, PITR, etc.

2016-02-17 Thread Michael Paquier
On Fri, Feb 5, 2016 at 4:17 AM, Michael Paquier wrote:
> Thanks for your enthusiasm. Now, to do an auto-critic of my patch:
> +   if ($params{allows_streaming})
> +   {
> +   print $conf "wal_level = hot_standby\n";
> +   print $conf "max_wal_senders = 5\n";
> +   print $conf "wal_keep_segments = 20\n";
> +   print $conf "max_wal_size = 128MB\n";
> +   print $conf "shared_buffers = 1MB\n";
> +   print $conf "wal_log_hints = on\n";
> +   print $conf "hot_standby = on\n";
> +   }
> This could use more thought, particularly for wal_log_hints, which is
> not used all the time. I think that we'd actually want to complete
> that with an optional hash of parameters/values that gets appended at
> the end of the configuration file, then pass wal_log_hints in the
> tests where it is needed. The default set of parameters is maybe fine
> if done this way; still, wal_keep_segments could be removed.

In the end I have refrained from doing that, and instead refactored
setup_cluster() in RewindTest.pm to use the new option allows_streaming,
simplifying the code a bit. The introduction of allows_streaming could
be done in a separate patch, though it did not seem worth the
complication when hacking on that. The split is simple, though.

> +# Tets for timeline switch
> +# Encure that a standby is able to follow a newly-promoted standby
> Two typos in two lines.

Fixed.

I also found an issue with the use of application_name causing test
001 to fail; that bug is squashed on the way. The generation of paths for
archive_command and restore_command was incorrect on Windows. Those
need to use two backslashes (to detect files correctly) and need to be
double-quoted (to avoid errors with the copy command should a space be
included in the path). I have also added a new subcommand in
vcregress.pl, called recoverycheck, with which one can run the recovery test
suite on Windows using MSVC.

Attached are rebased patches, split into 3 parts doing the following:
- 0001, fix default configuration of MSVC builds ignoring TAP tests
- 0002, add a promote routine in PostgresNode.pm. pg_rewind's tests
can make immediate use of that.
- 0003, the actual test suite.
This is registered in CF 2016-03 as well for further consideration.
-- 
Michael
From 0231866094b793ccd7c6941a19d5565c3d5811b6 Mon Sep 17 00:00:00 2001
From: Michael Paquier 
Date: Mon, 21 Dec 2015 16:28:34 +0900
Subject: [PATCH 1/3] Fix default configuration of MSVC builds ignoring TAP
 tests

MSVC build scripts use a flag to track if TAP tests are supported or not
but this was not configured correctly. By default, like the other build
types using ./configure, this is disabled.
---
 src/tools/msvc/Solution.pm   |  1 +
 src/tools/msvc/config_default.pl | 27 ++-
 2 files changed, 15 insertions(+), 13 deletions(-)

diff --git a/src/tools/msvc/Solution.pm b/src/tools/msvc/Solution.pm
index ac116b7..c5a43f9 100644
--- a/src/tools/msvc/Solution.pm
+++ b/src/tools/msvc/Solution.pm
@@ -643,6 +643,7 @@ sub GetFakeConfigure
 	$cfg .= ' --enable-integer-datetimes'
 	  if ($self->{options}->{integer_datetimes});
 	$cfg .= ' --enable-nls' if ($self->{options}->{nls});
+	$cfg .= ' --enable-tap-tests' if ($self->{options}->{tap_tests});
 	$cfg .= ' --with-ldap'  if ($self->{options}->{ldap});
 	$cfg .= ' --without-zlib' unless ($self->{options}->{zlib});
 	$cfg .= ' --with-extra-version' if ($self->{options}->{extraver});
diff --git a/src/tools/msvc/config_default.pl b/src/tools/msvc/config_default.pl
index b9f2ff4..e50be7e 100644
--- a/src/tools/msvc/config_default.pl
+++ b/src/tools/msvc/config_default.pl
@@ -3,7 +3,7 @@ use strict;
 use warnings;
 
 our $config = {
-	asserts => 0,# --enable-cassert
+	asserts  => 0,# --enable-cassert
 	  # integer_datetimes=>1,   # --enable-integer-datetimes - on is now default
 	  # float4byval=>1, # --disable-float4-byval, on by default
 
@@ -13,18 +13,19 @@ our $config = {
 	# blocksize => 8, # --with-blocksize, 8kB by default
 	# wal_blocksize => 8, # --with-wal-blocksize, 8kB by default
 	# wal_segsize => 16,  # --with-wal-segsize, 16MB by default
-	ldap => 1,# --with-ldap
-	extraver => undef,# --with-extra-version=
-	nls  => undef,# --enable-nls=
-	tcl  => undef,# --with-tls=
-	perl => undef,# --with-perl
-	python   => undef,# --with-python=
-	openssl  => undef,# --with-openssl=
-	uuid => undef,# --with-ossp-uuid
-	xml  => undef,# --with-libxml=
-	xslt => undef,# --with-libxslt=
-	iconv=> undef,# (not in configure, path to iconv)
-	zlib => undef # --with-zlib=
+	ldap  => 1,# --with-ldap
+	extraver  => undef,# --with-extra-version=
+	nls   => undef,# --enable-nls=
+	tap_tests => undef,# --enable-tap-tests
+	tcl   => undef,# --with-tls=
+	perl  => undef,# --with-perl
+	python=> undef,# --with-python=

Re: [HACKERS] commitfest application doesn't see new patch

2016-02-17 Thread Pavel Stehule
2016-02-17 9:24 GMT+01:00 Magnus Hagander :

> On Wed, Feb 17, 2016 at 6:08 AM, Pavel Stehule 
> wrote:
>
>> Hi
>>
>> In the entry https://commitfest.postgresql.org/9/525/ the commitfest
>> application shows the last patch as
>>
>> Latest attachment (incompat.py) at 2016-02-04 09:13:55 from Catalin Iacob
>> 
>>
>> although the last patch is from yesterday
>>
>> Attachment: plpython-enhanced-error-02.patch.gz
>>
>>
> That's likely because it's not a patch. It's a python script. The tool
> intentionally only looks for patches, to avoid triggering on things that
> people put in their email signatures for example.
>

It's possible I used the wrong format.

I am sorry for the noise.

Regards

Pavel


>
>
> --
>  Magnus Hagander
>  Me: http://www.hagander.net/
>  Work: http://www.redpill-linpro.com/
>


[HACKERS] Pushing down sorted joins

2016-02-17 Thread Ashutosh Bapat
Hi All,
Now that we have join pushdown support in postgres_fdw, we can leverage the
sort pushdown mechanism for base relations to work for pushed down joins as
well. PFA patch for the same.

The code to find the useful set of pathkeys and then generate paths for each
list of pathkeys has been moved into a function which is called for base
relations and join relations while creating the respective paths. The useful
pathkeys are the same as for base relations, i.e. root->query_pathkeys and
pathkeys useful for merge joins, as discussed in [1].

I measured the performance of pushing down sorts for merge joins with the
query SELECT lt1.val, ft1.val, ft2.val FROM lt1 join (ft1 join ft2 on
(ft1.val = ft2.val)) on (lt1.val = ft1.val), where ft1 and ft2 are foreign
tables whose join gets pushed down to the foreign server and lt1 is the
local table.

Without the patch the server prefers a local merge join between the foreign
tables, followed by a merge join with the local table, getting the data
sorted from the foreign server. With the patch, it pushes down the foreign
join and also gets the data sorted for the local merge join. The times
measured over 10 runs of the query, with and without the patch, are:

With patch
 avg_exe_time | std_dev_exe_time | min_exe_time | max_exe_time
--------------+------------------+--------------+--------------
   60310.0369 | 251.075471210925 |    59895.064 |    60746.496

Without patch
 avg_exe_time | std_dev_exe_time | min_exe_time | max_exe_time
--------------+------------------+--------------+--------------
   86396.6001 |  254.30988131848 |    85906.606 |    86742.311

With the patch the execution time of the query is reduced by about 30%.

The scripts to set up and run the query, and the outputs of running the
query with and without the patch, are attached.


[1]
http://www.postgresql.org/message-id/CAFjFpRfeKHiCmwJ72p4=zvuzrqsau9tbfyw7vwr-5ppvrcb...@mail.gmail.com
-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


pg_join_sort_pd.patch
Description: application/download
\set num_samples 10
\set query 'SELECT lt1.val, ft1.val, ft2.val FROM lt1 join (ft1 join ft2 on (ft1.val = ft2.val)) on (lt1.val = ft1.val)'
EXPLAIN (VERBOSE, ANALYSE) :query ;
SELECT avg_exe_time, std_dev_exe_time, min_exe_time, max_exe_time
		FROM query_execution_stats(:'query', :num_samples);
\set num_rows (1000*1000*10)
-- Create local tables (to be pointed by foreign tables)
DROP TABLE lt1 CASCADE;
CREATE TABLE lt1(val int, val2 int);
INSERT INTO lt1 SELECT i, i FROM (SELECT generate_series(1, :num_rows)) s(i);
CREATE INDEX i_lt1_val ON lt1(val);


DROP TABLE lt2 CASCADE;
CREATE TABLE lt2(val int, val2 int);
INSERT INTO lt2 SELECT i, i FROM (SELECT generate_series(1, :num_rows)) s(i);
CREATE INDEX i_lt2_val ON lt2(val);

DROP TABLE lt3 CASCADE;
CREATE TABLE lt3(val int, val2 int);
INSERT INTO lt3 SELECT i, i FROM (SELECT generate_series(1, :num_rows)) s(i);
CREATE INDEX i_lt3_val ON lt3(val);

DROP EXTENSION postgres_fdw CASCADE;
create extension postgres_fdw;
CREATE SERVER pg_server FOREIGN DATA WRAPPER postgres_fdw OPTIONS (dbname 'postgres');
CREATE USER MAPPING FOR CURRENT_USER SERVER pg_server;
CREATE FOREIGN TABLE ft1 (val int, val2 int) SERVER pg_server OPTIONS (table_name 'lt1',
use_remote_estimate 'true');
CREATE FOREIGN TABLE ft2 (val int, val2 int) SERVER pg_server OPTIONS (table_name 'lt2',
use_remote_estimate 'true');

DROP FUNCTION query_execution_stats(query text, num_samples int,
	OUT avg_exe_time float,
	OUT exec_time_dev float,
	OUT min_exe_time float,
	OUT max_exe_time float);

CREATE FUNCTION query_execution_stats(query text, num_samples int,
	OUT avg_exe_time float,
	OUT std_dev_exe_time float,
	OUT min_exe_time float,
	OUT max_exe_time float)
RETURNS record LANGUAGE plpgsql AS $$
DECLARE
	plan json;
BEGIN
	CREATE TEMPORARY TABLE query_exe_times(exe_time float); 

	-- Execute query a few times (5% of user specified runs) to warm the cache
	FOR i IN 1 .. num_samples/20 LOOP
		EXECUTE query;
	END LOOP;

	FOR i IN 1 .. num_samples LOOP
		EXECUTE 'EXPLAIN (analyze, format json) ' || query INTO plan;
		INSERT INTO query_exe_times VALUES ((plan->0->'Execution Time')::text::float);
		RAISE NOTICE 'completed % samples', i;
	END LOOP;

	SELECT avg(exe_time), stddev(exe_time), min(exe_time), max(exe_time)
		INTO avg_exe_time, std_dev_exe_time, min_exe_time, max_exe_time
		FROM query_exe_times;

	DROP TABLE query_exe_times;
END;
$$;

ANALYZE ft1;
ANALYZE ft2;
ANALYZE lt1;
ANALYZE lt2;
ANALYZE lt3;


sort_pd.out.without_patch
Description: Binary data


sort_pd.out.with_patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] tsearch_extras extension

2016-02-17 Thread Stas Kelvich

> On 17 Feb 2016, at 08:14, Oleg Bartunov  wrote:
> 
> 
> 
> On Wed, Feb 17, 2016 at 6:57 AM, Tim Abbott  wrote:
> Just following up here since I haven't gotten a reply -- I'd love to work 
> with someone from the Postgres community on a plan to make the tsearch_extras 
> functionality available as part of mainline postgres.
> 
> 
> -Tim Abbott
> 
> On Wed, Feb 3, 2016 at 9:41 PM, Tim Abbott  wrote:
> Hello,
> 
> I'm a maintainer of the Zulip open source group chat application.  Zulip 
> depends on a small (~200 lines) postgres extension called tsearch_extras 
> (https://github.com/zbenjamin/tsearch_extras) that returns the actual 
> (offset, length) pairs of all the matches for a postgres full text search 
> query.  This extension allows Zulip to do its own highlighting of the full 
> text search matches, using a more complicated method than what Postgres 
> supports natively.  
> 
> I think tsearch_extras is probably of value to others using postgres 
> full-text search (and I'd certainly love to get out of the business of 
> maintaining an out-of-tree postgres extension), so I'm wondering if this 
> feature (or a variant of it) would be of interest to upstream?  
> 
> Thanks!
> 
> -Tim Abbott
> 
> (See 
> http://www.postgresql.org/message-id/flat/52c7186d.8010...@strangersgate.com#52c7186d.8010...@strangersgate.com
>  for the discussion on postgres mailing lists that caused us to write this 
> module in the first place.)
> 
> Tim,
> 
> take a look at this patch (https://commitfest.postgresql.org/7/385/) and 
> contact the author.  It contains everything you need for your purposes.
> 
> btw, Stas, check on the status "Returned with feedback"!
> 
> 
> Regards,
> Oleg 
> 

Hi,

Yes, this extension has some functionality in common with the patch. In 
particular your tsvector_lexemes and my to_array do the same thing. I wasn't 
aware of your patch when I started to work on that, so I think we can ask the 
committer to mention you in the commit message (if it gets committed).

And how do you use ts_match_locs_array? To highlight search results? There is 
a function called ts_headline that can mark matches with custom start/stop 
strings.
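
For reference, a minimal example of ts_headline with custom delimiters
(stock PostgreSQL, no extension required):

-- StartSel/StopSel override the default <b>...</b> markers.
SELECT ts_headline('english',
                   'PostgreSQL full text search highlights matches',
                   to_tsquery('english', 'highlight'),
                   'StartSel=<<, StopSel=>>');
-- Result: PostgreSQL full text search <<highlights>> matches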

---
Stas Kelvich
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pglogical - logical replication contrib module

2016-02-17 Thread Konstantin Knizhnik

Ok, what about the following plan:

1. Support custom WAL records (as far as I know 2ndQuadrant has such a patch).
2. Add one more function to logical decoding allowing it to deal with 
custom records.


So the idea is that we somehow record DDL in WAL (for example using an 
executor hook); the records are then processed by logical decoding, which 
calls a special logical decoding plugin function to handle them.
For example, we could store DDL in WAL just as SQL statements and so easily 
replay them.


In this case DDL will be replicated using the same mechanism and through 
the same channel as DML.



On 17.02.2016 12:16, Craig Ringer wrote:
On 17 February 2016 at 16:24, Konstantin Knizhnik 
<k.knizh...@postgrespro.ru> wrote:


Thanks for your explanation. I have to agree with your arguments
that in the general case replication of DDL statements using logical
decoding seems to be problematic. But we are mostly considering
logical decoding in a quite limited context: replication between two
identical Postgres database nodes (multimaster).


Yep, much like BDR. Where all this infrastructure came from and is/was 
aimed at.


Do you think that in this case replication of DDL can be done
as a sequence of low-level operations on system catalog tables
including manipulation of locks?


No.

For one thing logical decoding doesn't see catalog tuple changes right 
now. Though I imagine that could be changed easily enough.


More importantly - oids. You add a column to a table:

ALTER TABLE mytable ADD COLUMN mycolumn some_type UNIQUE NOT NULL 
DEFAULT some_function()


This writes to catalogs including:

pg_attribute
pg_constraint
pg_index
pg_class (for the index relation)

... probably more. It also refers to pg_class (for the definition of 
mytable), pg_type (definition of some_type), pg_proc (definition of 
some_function), the b-tree operator class for some_type in pg_opclass, 
the b-tree indexam in pg_am, ... more.


Everything is linked by oids, and the oids are all node local. You 
can't just blindly re-use them. If "some_type" is hstore, the oid of 
hstore in pg_type might be different on the upstream and downstream. 
The only exception is the oids of built-in types and even then that's 
not guaranteed across major versions.


So if you blindly replicate catalog row changes you'll get a horrible 
mess. That's before considering a table's relfilenode, which is 
initially the same as its oid, but subject to change if truncated or 
rewritten.


To even begin to do this half-sanely you'd have to maintain a mapping 
of upstream object oids->names on the downstream, with invalidations 
replicated from the upstream. That's only the beginning. There's 
handling of extensions and lots more fun.


So in your example with the ALTER TABLE statement, can we correctly
replicate it to other nodes
as a request to take an exclusive lock plus some manipulations with
catalog tables and the data table itself?


Nope. No hope, not unless "some manipulations with catalog tables and 
data table itself" is a lot more comprehensive than I think you mean.


1. Add an option for whether to include operations on system catalog
tables in logical replication or not.


I would like to have this anyway.

2. Make it possible to replicate lock requests (can be useful not
only for DDLs)


I have no idea how you'd even begin to do that.

I looked at how DDL was implemented in BDR and did it in a similar way
in our multimaster.
But it is awful: we need to have two different channels for
propagating changes.


Yeah, it's not beautiful, but maybe you misunderstood something? The 
DDL is written to a table, and that table's changes are replayed along 
with everything else. It's consistent and keeps DDL changes as part of 
the xact that performed them. Maybe you misunderstood how it works in 
BDR and missed the indirection via a table?


Additionally, in multimaster we want to enforce cluster-wide ACID.
That certainly includes operations on metadata. It will be very
difficult to implement if replication of DML and DDL is done in
two different ways...


That's pretty much why BDR does it this way, warts and all. Though it 
doesn't offer cluster-wide ACID it does need atomic commit of xacts 
that may contain DML, DDL, or some mix of the two.


Let me ask one more question concerning logical replication: how
difficult will it be, from your point of view, to support two-phase
commit in logical replication? Are there any fundamental problems?


I haven't looked closely yet. Andres will know more.

I very, very badly want to be able to decode 2PC prepared xacts myself.

The main issue I'm aware of is locking - specifically the inability to 
impersonate another backend and treat locks held by that backend 
(which might be a fake backend for a pg_prepared_xacts entry) as held 
by ourselves for the purpose of being able to access relations, etc.


The work Robert is doing on group locking looks absolutely ideal for this,
but won't land before 9.7.

Re: [HACKERS] exposing pg_controldata and pg_config as functions

2016-02-17 Thread Michael Paquier
On Wed, Feb 17, 2016 at 10:13 AM, Michael Paquier
 wrote:
> On Tue, Feb 16, 2016 at 11:36 PM, Joe Conway  wrote:
>> Thanks!
>
> OK. I think I'm good now. Thanks for the quick update.

Actually, having second thoughts on the matter, why is it
necessary to document the function pg_config()? Functions wrapping
a system view just have the view documented; take for example
pg_show_all_file_settings, pg_show_all_settings,
pg_stat_get_wal_receiver, etc.
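
(To illustrate: with the patch the view would be the documented interface,
so users would write something like the following - a sketch assuming the
view definition posted in this thread:)

SELECT name, setting
  FROM pg_config
 WHERE name IN ('BINDIR', 'VERSION');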
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ALTER ROLE SET/RESET for multiple options

2016-02-17 Thread Masahiko Sawada
On Wed, Feb 17, 2016 at 7:14 PM, Robert Haas  wrote:
> On Wed, Feb 17, 2016 at 3:22 PM, Masahiko Sawada  
> wrote:
>> On Sat, Feb 13, 2016 at 2:45 PM, Robert Haas  wrote:
>>> On Fri, Feb 12, 2016 at 1:35 PM, Payal Singh  wrote:
The feature seems to work as described, but is it necessary to enclose 
multiple GUC settings in parentheses? This seems a deviation from the 
usual syntax of altering multiple settings separated with commas.
>>>
>>> Well, note that you can say:
>>>
>>> ALTER USER bob SET search_path = a, b, c;
>>>
>>> I'm not sure how the parentheses help exactly; it seems like there is
>>> an inherent ambiguity either way.
>>>
>>
>> I thought it would be useful for a user who wants to set several GUC
>> parameters for each user, especially the case of changing logging
>> parameters per user.
>> But it might not bring us fantastic usability.
>
> Yeah, it doesn't really seem like it's worth trying to figure out a
> syntax for this that can work.  It just doesn't buy us very much vs.
> issuing one ALTER command per setting.
>

Yeah, please mark this patch as 'rejected'.
If I can come up with another good idea, I will post it.

Regards,

--
Masahiko Sawada


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Figures in docs

2016-02-17 Thread Ioseph Kim
On 수, 2016-02-17 at 12:14 +0300, Alexander Lakhin wrote:
> 17.02.2016 09:17, Tatsuo Ishii wrote:
> >> Hi. 
> >>
> >> In the DocBook 4.2 SGML DTD, the figure tag is supported already;
> >> it was implemented for multiple output formats.
> > Ok, there are no technical problems with figures then.  MySQL docs have
> > some nice figures. I am jealous.
> The "figure" tag is just a placeholder in the Docbook
> (http://www.docbook.org/tdg/en/html/figure.html).
> The question is what to place inside this tag: "graphic" (not inline),
> "mediaobject/imageobject" (alternative object, supports inline contents
> and SVG), or something else.
> So you surely can insert some picture from an external file (.png,
> .jpg), but if it should contain a title or some labels that need to be
> translated, it's not the best solution.

I just want to say something about the figure tag:
an SGML document can include an external image file. 

In SGML:

<figure id="some-image">
 <title>Some Image</title>
 <graphic fileref="some_image.jpg">
</figure>

then the "make html" command generates HTML code with the image and the
caption:

Figure E-1. Some Image

So, 
I was asking how to maintain that some_image.jpg file.
I think that is quite difficult, because a new release of the document
might change these images too.
And if we use the SVG module for DocBook, the document sources will become
very dirty.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ALTER ROLE SET/RESET for multiple options

2016-02-17 Thread Robert Haas
On Wed, Feb 17, 2016 at 3:22 PM, Masahiko Sawada  wrote:
> On Sat, Feb 13, 2016 at 2:45 PM, Robert Haas  wrote:
>> On Fri, Feb 12, 2016 at 1:35 PM, Payal Singh  wrote:
>>> The feature seems to work as described, but is it necessary to enclose 
>>> multiple GUC settings in parentheses? This seems a deviation from the 
>>> usual syntax of altering multiple settings separated with commas.
>>
>> Well, note that you can say:
>>
>> ALTER USER bob SET search_path = a, b, c;
>>
>> I'm not sure how the parentheses help exactly; it seems like there is
>> an inherent ambiguity either way.
>>
>
> I thought it would be useful for a user who wants to set several GUC
> parameters for each user, especially the case of changing logging
> parameters per user.
> But it might not bring us fantastic usability.

Yeah, it doesn't really seem like it's worth trying to figure out a
syntax for this that can work.  It just doesn't buy us very much vs.
issuing one ALTER command per setting.
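
To spell the ambiguity out with a hypothetical example (the first statement
is invented syntax, not something PostgreSQL accepts):

-- Is '64MB' another element of search_path, or a new work_mem assignment?
ALTER USER bob SET search_path = a, b, work_mem = '64MB';

-- What works today: one ALTER command per setting.
ALTER USER bob SET search_path = a, b, c;
ALTER USER bob SET work_mem = '64MB';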

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: Reusing abbreviated keys during second pass of ordered [set] aggregates

2016-02-17 Thread Robert Haas
On Wed, Dec 23, 2015 at 12:19 PM, Peter Geoghegan  wrote:
> On Tue, Dec 22, 2015 at 2:59 PM, Robert Haas  wrote:
>> OK, so I've gone ahead and committed and back-patched that.  Can you
>> please rebase and repost the remainder as a 9.6 proposal?
>
> I attach a rebased patch for 9.6 only.

Committed, except I left out one comment hunk that I wasn't convinced about.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] custom function for converting human readable sizes to bytes

2016-02-17 Thread Dean Rasheed
On 17 February 2016 at 00:39, Vitaly Burovoy  wrote:
> On 2/16/16, Vitaly Burovoy  wrote:
>> On 2/16/16, Dean Rasheed  wrote:
>>> Fixing that in parse_memory_unit() would be messy because it assumes a
>>> base unit of kB, so it would require a negative multiplier, and
>>> pg_size_bytes() would have to be taught to divide by the magnitude of
>>> negative multipliers in the same way that guc.c does.
>
> Now parse_memory_unit returns -1024 for bytes as a divider; the constant
> "bytes" has moved there.
> I added a new memory_units_bytes_hint which differs from the original
> memory_units_int by the "bytes" size unit.
> hintmsg is allowed to be NULL, and if so, setting the dereferenced variable
> to memory_units_bytes_hint is skipped.
>

I think that approach is getting more and more unwieldy, and it simply
isn't worth the effort just to share a few values from the unit
conversion table, especially given that the set of supported units
differs between the two places.


>>> ISTM that it would be far less code, and much simpler and more
>>> readable to just parse the supported units directly in
>>> pg_size_bytes(), rather than trying to share code with guc.c, when the
>>> supported units are actually different and may well diverge further in
>>> the future.
>>

I've gone with this approach and it is indeed far less code, and much
simpler and easier to read. This will also make it easier to
maintain/extend in the future.

I've made a few minor tweaks to the docs, and added a note to make it
clear that the units in these functions work in powers of 2 not 10.

I also took the opportunity to tidy up the number scanning code
somewhat (I was tempted to rip it out entirely, since it feels awfully
close to duplicating the numeric code, but it's probably worth it for
the better error message).

Additionally, note that I replaced strcasecmp() with pg_strcasecmp(),
since AIUI the former is not available on all supported platforms.

Barring objections, and subject to some more testing, I intend to
commit this version.

Regards,
Dean
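
For reference, behaviour under the posted patch should look like this (a
sketch, assuming the binary units described above):

SELECT pg_size_bytes('1kB');     -- 1024
SELECT pg_size_bytes('1.5 GB');  -- 1610612736 (1.5 * 1024^3)
SELECT pg_size_bytes('512');     -- 512; bytes are assumed when no unit is given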
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
new file mode 100644
index f9eea76..60f117a
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -17767,6 +17767,9 @@ postgres=# SELECT * FROM pg_xlogfile_nam
 pg_relation_size


+pg_size_bytes
+   
+   
 pg_size_pretty


@@ -17838,6 +17841,15 @@ postgres=# SELECT * FROM pg_xlogfile_nam
   
   

+pg_size_bytes(text)
+
+   bigint
+   
+ Converts a size in human-readable format with size units into bytes
+   
+  
+  
+   
 pg_size_pretty(bigint)
 
text
@@ -17968,11 +17980,27 @@ postgres=# SELECT * FROM pg_xlogfile_nam
 

 pg_size_pretty can be used to format the result of one of
-the other functions in a human-readable way, using kB, MB, GB or TB as
-appropriate.
+the other functions in a human-readable way, using bytes, kB, MB, GB or TB
+as appropriate.

 

+pg_size_bytes can be used to get the size in bytes from a
+string in human-readable format. The input may have units of bytes, kB,
+MB, GB or TB, and is parsed case-insensitively. If no units are specified,
+bytes are assumed.
+   
+
+   
+
+ The units kB, MB, GB and TB used by the functions
+ pg_size_pretty and pg_size_bytes are defined
+ using powers of 2 rather than powers of 10, so 1kB is 1024 bytes, 1MB is
+ 1024^2 = 1048576 bytes, and so on.
+
+   
+
+   
 The functions above that operate on tables or indexes accept a
 regclass argument, which is simply the OID of the table or index
 in the pg_class system catalog.  You do not have to look up
diff --git a/src/backend/utils/adt/dbsize.c b/src/backend/utils/adt/dbsize.c
new file mode 100644
index 2084692..91260cd
--- a/src/backend/utils/adt/dbsize.c
+++ b/src/backend/utils/adt/dbsize.c
@@ -700,6 +700,145 @@ pg_size_pretty_numeric(PG_FUNCTION_ARGS)
 }
 
 /*
+ * Convert a human-readable size to a size in bytes
+ */
+Datum
+pg_size_bytes(PG_FUNCTION_ARGS)
+{
+	text	   *arg = PG_GETARG_TEXT_PP(0);
+	char	   *str,
+			   *strptr,
+			   *endptr;
+	bool		have_digits;
+	char		saved_char;
+	Numeric		num;
+	int64		result;
+
+	str = text_to_cstring(arg);
+
+	/* Skip leading whitespace */
+	strptr = str;
+	while (isspace((unsigned char) *strptr))
+		strptr++;
+
+	/* Check that we have a valid number and determine where it ends */
+	endptr = strptr;
+
+	/* Part (1): sign */
+	if (*endptr == '-' || *endptr == '+')
+		endptr++;
+
+	/* Part (2): main digit string */
+	if (isdigit((unsigned char) *endptr))
+	{
+		have_digits = true;
+		do
+			endptr++;
+		while (isdigit((unsigned char) *endptr));
+	}
+	else
+		have_digits = false;
+
+	/* Part (3): optional decimal point and fractional digits */
+	if (*endptr == '.')
+	{
+		endptr++;
+		if (isdigit((unsigned char) *endptr))
+		{
+			have_digits = true;
+			do
+endptr++;
+			while (isdigit((unsigned char) *endptr));
+		}
+	}

[HACKERS] Re: Should PostgresFDW ImportForeignSchema should import the remote table default expression?

2016-02-17 Thread Rushabh Lathia
Oh, I just found out that IMPORT FOREIGN SCHEMA does have
import_default and import_not_null options.

Got the answer, sorry for the noise.
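
For the archives, re-importing with that option enabled brings the default
across - a quick sketch using the same server names as in the quoted test
below (import_default is off by default):

DROP FOREIGN TABLE test;
IMPORT FOREIGN SCHEMA public LIMIT TO (test)
    FROM SERVER myserver INTO public
    OPTIONS (import_default 'true');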

On Wed, Feb 17, 2016 at 1:31 PM, Rushabh Lathia 
wrote:

> Here is the test:
>
> -- create database
> postgres=# create database foo;
> CREATE DATABASE
> postgres=# \c foo
> You are now connected to database "foo" as user "rushabh".
> -- Create remote table with default expression
> foo=# create table test ( a int , b int default 200 );
> CREATE TABLE
> foo=# \c postgres
> You are now connected to database "postgres" as user "rushabh".
> postgres=#
> postgres=# create extension postgres_fdw ;
> CREATE EXTENSION
> -- Create server and user mapping
> postgres=# create server myserver FOREIGN DATA WRAPPER postgres_fdw
> options (dbname 'foo', port '');
> CREATE SERVER
> postgres=# CREATE USER MAPPING FOR CURRENT_USER SERVER myserver;
> CREATE USER MAPPING
> -- Import foreign schema
> postgres=# import foreign schema public from server myserver into public;
> IMPORT FOREIGN SCHEMA
>
> -- Foreign table got imported
> postgres=# \d test
>Foreign table "public.test"
>  Column |  Type   | Modifiers |FDW Options
> --------+---------+-----------+-------------------
>  a  | integer |   | (column_name 'a')
>  b  | integer |   | (column_name 'b')
> Server: myserver
> FDW Options: (schema_name 'public', table_name 'test')
>
> -- Try to insert row and assume that it will add default value for 'b'
> column
> postgres=# insert into test (a) values ( 10 );
> INSERT 0 1
>
> -- But guess what, I was wrong ???
> postgres=# select * from test;
>  a  | b
> ----+---
>  10 |
> (1 row)
>
> Looking at the code of postgresImportForeignSchema it is clear that it's not
> importing the default expression from the foreign table. But the question is
> whether it should?
>
> inputs/thoughts ?
>
> Regards,
> Rushabh Lathia
> www.EnterpriseDB.com
>



-- 
Rushabh Lathia


Re: [HACKERS] ALTER ROLE SET/RESET for multiple options

2016-02-17 Thread Masahiko Sawada
On Sat, Feb 13, 2016 at 2:45 PM, Robert Haas  wrote:
> On Fri, Feb 12, 2016 at 1:35 PM, Payal Singh  wrote:
>> The feature seems to work as described, but is it necessary to enclose 
>> multiple GUC settings in parentheses? This seems a deviation from the 
>> usual syntax of altering multiple settings separated with commas.
>
> Well, note that you can say:
>
> ALTER USER bob SET search_path = a, b, c;
>
> I'm not sure how the parentheses help exactly; it seems like there is
> an inherit ambiguity either way.
>

I thought it would be useful for a user who wants to set several GUC
parameters for each user, especially the case of changing logging
parameters per user.
But it might not bring us fantastic usability.

Regards,

--
Masahiko Sawada


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pglogical - logical replication contrib module

2016-02-17 Thread Craig Ringer
On 17 February 2016 at 16:27, Andres Freund  wrote:

> On 2016-02-17 09:33:56 +0800, Craig Ringer wrote:
> > Some DDL operations don't translate well to a series of replicatable
> > actions. The case I hit the most is
> >
> >ALTER TABLE mytable ADD COLUMN somecolumn sometype NOT NULL DEFAULT
> > some_function();
> >
> > This is executed (simplified) by taking an ACCESS EXCLUSIVE lock,
> changing
> > the catalogs but not making the changes visible yet, rewriting the table,
> > and committing to make the rewritten table and the catalog changes
> visible.
> >
> > That won't work well with logical replication.
>
> FWIW, I think this is much less a fundamental issue and more an
> implementation one. Falling back to just re-replicating the table


Do you mean taking a new schema dump from the upstream? Or just the table
data?

We already receive the table data in a pg_temp_ virtual relation.
While it'd be nice to have a better way to map that to the relation being
rewritten without having to do string comparisons on table names all the
time, it works. If we do a low-level truncate on the table, *then* execute
the DDL on the empty table, and finally rewrite it based on that stream as
we receive it, that should work OK.


> Let's get the basics right, before reaching for the moon.
>

Yeah, it's got to be incremental. Though I do think we'll need to address
DDL affecting shared catalogs.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] pglogical - logical replication contrib module

2016-02-17 Thread Craig Ringer
On 17 February 2016 at 16:24, Konstantin Knizhnik  wrote:


> Thanks for your explanation. I have to agree with your arguments that in
> the general case replication of DDL statements using logical decoding seems
> to be problematic. But we are mostly considering logical decoding in a quite
> limited context: replication between two identical Postgres database nodes
> (multimaster).
>

Yep, much like BDR. Where all this infrastructure came from and is/was
aimed at.


> Do you think that in this case replication of DDL can be done as a
> sequence of low-level operations on system catalog tables,
> including manipulation of locks?
>

No.

For one thing logical decoding doesn't see catalog tuple changes right now.
Though I imagine that could be changed easily enough.

More importantly - oids. You add a column to a table:

ALTER TABLE mytable ADD COLUMN mycolumn some_type UNIQUE NOT NULL DEFAULT
some_function()

This writes to catalogs including:

pg_attribute
pg_constraint
pg_index
pg_class (for the index relation)

... probably more. It also refers to pg_class (for the definition of
mytable), pg_type (definition of some_type), pg_proc (definition of
some_function), the b-tree operator class for some_type in pg_opclass, the
b-tree indexam in pg_am, ... more.

Everything is linked by oids, and the oids are all node local. You can't
just blindly re-use them. If "some_type" is hstore, the oid of hstore in
pg_type might be different on the upstream and downstream. The only
exception is the oids of built-in types and even then that's not guaranteed
across major versions.
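
(A quick way to see the problem; the oid values below are invented for
illustration:)

-- Run on each node: the same extension type gets a node-local oid.
SELECT oid, typname FROM pg_type WHERE typname = 'hstore';
--  node A: 56012 | hstore
--  node B: 41237 | hstore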

So if you blindly replicate catalog row changes you'll get a horrible mess.
That's before considering a table's relfilenode, which is initially the
same as its oid, but subject to change if truncated or rewritten.

To even begin to do this half-sanely you'd have to maintain a mapping of
upstream object oids->names on the downstream, with invalidations
replicated from the upstream. That's only the beginning. There's handling
of extensions and lots more fun.


> So in your example with the ALTER TABLE statement, can we correctly replicate
> it to other nodes
> as a request to take an exclusive lock plus some manipulations with catalog
> tables and the data table itself?
>

Nope. No hope, not unless "some manipulations with catalog tables and data
table itself" is a lot more comprehensive than I think you mean.


> 1. Add an option for whether to include operations on system catalog tables
> in logical replication or not.
>

I would like to have this anyway.


> 2. Make it possible to replicate lock requests (can be useful not only for
> DDLs)
>

I have no idea how you'd even begin to do that.


> I looked at how DDL was implemented in BDR and did it in a similar way in our
> multimaster.
> But it is awful: we need to have two different channels for propagating
> changes.
>

Yeah, it's not beautiful, but maybe you misunderstood something? The DDL is
written to a table, and that table's changes are replayed along with
everything else. It's consistent and keeps DDL changes as part of the xact
that performed them. Maybe you misunderstood how it works in BDR and missed
the indirection via a table?


> Additionally, in multimaster we want to enforce cluster-wide ACID. That
> certainly includes operations on metadata. It will be very difficult to
> implement if replication of DML and DDL is done in two different ways...
>

That's pretty much why BDR does it this way, warts and all. Though it
doesn't offer cluster-wide ACID it does need atomic commit of xacts that
may contain DML, DDL, or some mix of the two.


> Let me ask one more question concerning logical replication: how difficult
> will it be, from your point of view, to support two-phase commit in logical
> replication? Are there any fundamental problems?
>

I haven't looked closely yet. Andres will know more.

I very, very badly want to be able to decode 2PC prepared xacts myself.

The main issue I'm aware of is locking - specifically the inability to
impersonate another backend and treat locks held by that backend (which
might be a fake backend for a pg_prepared_xacts entry) as held by ourselves
for the purpose of being able to access relations, etc.

The work Robert is doing on group locking looks absolutely ideal for this,
but won't land before 9.7.

(Closely related, I also want to be able to hook into commit and transform
a normal COMMIT into a PREPARE TRANSACTION, ..., COMMIT
PREPARED sequence with the application that issued the commit none the wiser.
This will allow pessimistic 2PC-based conflict handling for must-succeed
xacts like those that do DDL).

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] Identifying a message in emit_log_hook.

2016-02-17 Thread Simon Riggs
On 17 February 2016 at 08:34, Kyotaro HORIGUCHI <
horiguchi.kyot...@lab.ntt.co.jp> wrote:

>
> > I'm guessing this would require making the pre-translated error text
> > available to plugins as well as translated form.
>
> If I understand you correctly, edata->message_id in my patch is
> just what offers the pre-translated error text to plugins.


OK, now I understand the patch, I am happy to apply it.

-- 
Simon Riggshttp://www.2ndQuadrant.com/

PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: [HACKERS] Identifying a message in emit_log_hook.

2016-02-17 Thread Kyotaro HORIGUCHI
Hello, thank you for the opinions,

At Tue, 16 Feb 2016 10:57:33 +0100, Pavel Stehule  
wrote in 
> Hi
> 
> 2016-02-16 10:47 GMT+01:00 Kyotaro HORIGUCHI <
> horiguchi.kyot...@lab.ntt.co.jp>:
> 
> > Hello.
> >
> > I'm planning to change the error level of a specific error
> > message.  We can use emit_log_hook for this purpose but we have
> > no reliable means to identify what a message is.
> >
> > For messages without valid sqlerrcode, filename:lineno could be
> > used to identify but lineno is too unstable.  One possible and
> > most straightforward way to solve this is defining identifiers
> > covering all error messages but such identifiers are too hard to
> > manage.  ErrorData.message could also be used but NLS translation
> > and placeholders prevent it from being identified by simple means
> > like strcmp(3).
> >
> > As a solution for this problem, I'd like to porpose to have an
> > additional member in the struct ErrorData to hold a message id,
> > that is, the format string for errmsg(). This is not best but
> > useful enough.
> >
> > It is somewhat a crude way, but the attached small patch would
> > do.
> >
> > Is there any opinions or suggestions?
> >
> 
> It looks like a workaround. Fixing the missing sqlerrcode is much better.

Inventing a consistent system of sqlerrcodes covering the whole of the
code would be an intractable problem. Even allocating sequence
numbers or the like to every error message would be difficult to
maintain.

The "message id" referred to in this thread is the text finally passed
to gettext(3) as "msgid". Although the term comes from the context of
translation, it is usable whenever identifying an error by its
message text is enough.

From such an aspect, this could be called either a workaround or
a solution.


At Tue, 16 Feb 2016 10:21:30 +, Simon Riggs  wrote 
in 
> First, I support the concept of message ids and any work you do in moving
> this forwards.

Thank you for supporting this.

> If you were to require message ids, how would this work for extensions?
> Many things write to the log, so this solution would rely upon 100%
> compliance with your new approach. I think that is unlikely to happen.

I'm uncertain about the "message ids" you mention, but as written
above, it is exactly the text passed to elog() or ereport(), then
passed to gettext(3) as the parameter "msgid". Of course this
cannot be reliable for every usage, but it is usable enough for some
purposes.

> I suggest an approach that allows optionally accessing a message id by
> hashing the English message text. That would allow people to identify
> messages without relying on people labelling everything correctly, as well
> as writing filters that do not depend upon language.

The hash code could be used for screening if it can be calculated
at compile time, but we cannot detect synonyms because of the lack of
central storage, so the pre-translated error text should still
be available to extensions anyway.

> I'm guessing this would require making the pre-translated error text
> available to plugins as well as translated form.

If I understand you correctly, edata->message_id in my patch is
just what offers the pre-translated error text to plugins.


regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Introduce group locking to prevent parallel processes from deadl

2016-02-17 Thread Andres Freund
On 2016-02-16 23:33:47 -0500, Tom Lane wrote:
> Yes, exactly.  I'm not certain if there are any real platforms where
> a pointer-sized write wouldn't be atomic (it sure sounds inefficient
> for that to be true), but we have not assumed that to date and I'd
> just as soon not start here.

FWIW, there's no sizeof(void*) == 8 platform supported by PG where
aligned 8 byte writes aren't atomic. There are a number of supported
platforms with sizeof(void*) == 4 without atomic 8 byte writes though.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PostgreSQL Audit Extension

2016-02-17 Thread Robert Haas
On Wed, Feb 17, 2016 at 5:20 AM, Bruce Momjian  wrote:
> On Fri, Feb  5, 2016 at 01:16:25PM -0500, Stephen Frost wrote:
>> Looking at pgaudit and the other approaches to auditing which have been
>> developed (eg: applications which sit in front of PG and essentially
>> have to reimplement large bits of PG to then audit the commands sent
>> before passing them to PG, or hacks which try to make sense out of log
>> files full of SQL statements) make it quite clear, in my view, that
>> attempts to bolt-on auditing to PG result in a poorer solution, from a
>> technical perspective, than what this project is known for and capable
>> of.  To make true progress towards that, however, we need to get past
>> the thinking that auditing doesn't need to be in-core or that it should
>> be a second-class citizen feature or that we don't need it in PG.
>
> Coming in late here, but the discussion around how to maintain the
> auditing code seems very similar to how to handle the logical
> replication of DDL commands.  First, have we looked into hooking
> auditing into scanning logical replication contents, and second, how are
> we handling the logical replication of DDL and could we use the same
> approach for auditing?

Auditing needs to trace read-only events, which aren't reflected in
logical replication in any way.  I think it's a good idea to try to
drive auditing off of existing machinery instead of inventing
something new - I suggested plan invalidation items upthread.  But I
doubt that logical replication is the thing to attach it to.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pglogical - logical replication contrib module

2016-02-17 Thread Andres Freund
On 2016-02-17 09:33:56 +0800, Craig Ringer wrote:
> Some DDL operations don't translate well to a series of replicatable
> actions. The case I hit the most is
> 
>ALTER TABLE mytable ADD COLUMN somecolumn sometype NOT NULL DEFAULT
> some_function();
> 
> This is executed (simplified) by taking an ACCESS EXCLUSIVE lock, changing
> the catalogs but not making the changes visible yet, rewriting the table,
> and committing to make the rewritten table and the catalog changes visible.
> 
> That won't work well with logical replication.

FWIW, I think this is much less a fundamental issue and more an
implementation one. Falling back to just re-replicating the table, and
then optimizing a few common cases (only immutable DEFAULT/USING
involved) should be enough for a while.

Let's get the basics right, before reaching for the moon.

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Introduce group locking to prevent parallel processes from deadl

2016-02-17 Thread Robert Haas
On Wed, Feb 17, 2016 at 10:03 AM, Tom Lane  wrote:
> Robert Haas  writes:
>> Hmm, that's true.  I don't think it actually matters all that much,
>> because proclock->tag.myProc->lockGroupLeader == NULL has pretty much
>> the same effect as if proclock->tag.myProc->lockGroupLeader ==
>> proclock->tag.myProc.  But not completely.  One problem is that we
>> don't currently assume that 8-byte writes are atomic, so somebody
>> might see the group leader field half-set, which would be bad.
>
> Yes, exactly.  I'm not certain if there are any real platforms where
> a pointer-sized write wouldn't be atomic (it sure sounds inefficient
> for that to be true), but we have not assumed that to date and I'd
> just as soon not start here.

I guess it's worth pointing out that storing the groupLeader in the
PROCLOCK, as we do currently, avoids this issue entirely.  The
deadlock detector can safely follow lockGroupLeader pointers because
it holds all the lock manager partition locks, and proc.c/lock.c don't
need to do so if groupLeader is present in the PROCLOCK.  Nor does
that field ever need to be updated, because it's defined to always be
non-NULL: if a process becomes a group leader, the value stored in the
PROCLOCKs does not change.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pglogical - logical replication contrib module

2016-02-17 Thread Konstantin Knizhnik

Hi Craig,

Thanks for your explanation. I have to agree with your arguments that in 
the general case replication of DDL statements using logical decoding seems 
to be problematic. But we are mostly considering logical decoding in a 
quite limited context: replication between two identical Postgres 
database nodes (multimaster).


Do you think that in this case replication of DDL can be done as a 
sequence of low-level operations on system catalog tables,
including manipulation of locks? So in your example with the ALTER TABLE 
statement, can we correctly replicate it to other nodes
as a request to take an exclusive lock plus some manipulations with 
catalog tables and the data table itself?

If so, instead of full support of DDL in logical decoding we can only:

1. Add an option for whether to include operations on system catalog tables 
in logical replication or not.
2. Make it possible to replicate lock requests (can be useful not only 
for DDLs)


I looked at how DDL was implemented in BDR and did it in a similar way in 
our multimaster.
But it is awful: we need to have two different channels for propagating 
changes. Additionally, in multimaster we want to enforce cluster-wide 
ACID. That certainly includes operations on metadata. It will be very 
difficult to implement if replication of DML and DDL is done in two 
different ways...


Let me ask one more question concerning logical replication: how 
difficult will it be, from your point of view, to support two-phase commit 
in logical replication? Are there any fundamental problems?


Thanks in advance,
Konstantin




On 17.02.2016 04:33, Craig Ringer wrote:
On 17 February 2016 at 00:54, Oleg Bartunov wrote:



DDL support is what's missing for now.


TBH, based on experience with DDL replication and deparse in BDR, it's 
going to be missing for a while yet too, or at least not 
comprehensively present without caveats or exceptions.



Some DDL operations don't translate well to a series of replicatable 
actions. The case I hit the most is


   ALTER TABLE mytable ADD COLUMN somecolumn sometype NOT NULL DEFAULT 
some_function();


This is executed (simplified) by taking an ACCESS EXCLUSIVE lock, 
changing the catalogs but not making the changes visible yet, 
rewriting the table, and committing to make the rewritten table and 
the catalog changes visible.


That won't work well with logical replication. We currently capture 
DDL with event triggers and log it to a table for later logical 
decoding and replay - that's the "recognised" way. The trouble is 
that replaying that statement will result in an unnecessary full table 
rewrite on the downstream. Then we have to decode and send a stream of 
changes to a table called pg_temp_, truncate the copy 
of mytable on the downstream that we just rewrote and apply those rows 
instead.


Of course all that only works sensibly if you have exactly one 
upstream and the downstream copy of the table is treated as (or 
enforced as) read-only.


Improving this probably needs DDL deparse to be smarter. Rather than 
just emitting something that can be reconstructed into the SQL text of 
the DDL it needs to emit one or more steps that are semantically the 
same but allow us to skip the rewrite. Along the lines of:


* ALTER TABLE mytable ADD COLUMN somecolumn sometype;
* ALTER TABLE mytable ALTER COLUMN somecolumn SET DEFAULT some_function();
* (a backfill of the existing rows, e.g. UPDATE mytable SET somecolumn = DEFAULT)
* ALTER TABLE mytable ALTER COLUMN somecolumn SET NOT NULL;

Alternately the downstream would need a hook that lets it intercept 
and prevent table rewrites caused by ALTER TABLE and similar. So it 
can instead just do a truncate and wait for the new rows to come from 
the master.


Note that all this means the standby has to hold an ACCESS EXCLUSIVE 
lock on the table during all of replay. That shouldn't be necessary, 
all we really need is an EXCLUSIVE lock since concurrent SELECTs are 
fine. No idea how to do that.




Deparse is also just horribly complicated to get right. There are so 
many clauses and subclauses and variants of statements. Each of which 
must be perfect.




Not everything has a simple and obvious mapping on the downstream side 
either. TRUNCATE ... CASCADE is the obvious one. You do a cascade 
truncate on the master - do you want that to replicate as a cascaded 
truncate on the replica, or a truncate of only those tables that 
actually got truncated on the master? If the replica has additional 
tables with FKs pointing at the replicated tables, the TRUNCATE would 
truncate those too if you replicate it as CASCADE; if you don't, the 
truncate will fail instead.
is concerned, but we can't truncate just the tables truncated on the 
master, ignore the FK relationships, and leave dangling FK references 
either.



All this means that DDL replication is probably only going to make 
sense in scenarios where there's exactly one master and the replica 
obeys some rules like "don't create FKs pointing from non-replicated 
tables at replicated tables".

Re: [HACKERS] commitfest application doesn't see new patch

2016-02-17 Thread Magnus Hagander
On Wed, Feb 17, 2016 at 6:08 AM, Pavel Stehule 
wrote:

> Hi
>
> In the entry https://commitfest.postgresql.org/9/525/ the commitfest
> application shows the last patch as
>
> Latest attachment (incompat.py) at 2016-02-04 09:13:55 from Catalin Iacob
> 
>
> although the last patch is from yesterday
>
> Attachment: plpython-enhanced-error-02.patch.gz
>
>
That's likely because it's not a patch. It's a python script. The tool
intentionally only looks for patches, to avoid triggering on things that
people put in their email signatures for example.


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: [HACKERS] extend pgbench expressions with functions

2016-02-17 Thread Fabien COELHO


Hello Michaël,


\set aid 1 + 1
pgbench -f addition.sql -t 5000

I have the following:
HEAD: 3.5~3.7M TPS
list method: 3.6~3.7M TPS
array method: 3.4~3.5M TPS
So all approaches have a comparable performance.


Yep, the execution trace is pretty similar in all cases, maybe with a 
little more work for the array method, although I'm surprised that the 
difference is discernible.



Btw, patch 2 is returning a warning for me:
It is trying to compare a 32-bit integer with an int64 value; evalFunc
needed an int64.


Indeed. My gcc 4.8.4 with -Wall does not show the warning, too bad.

Attached is the fixed patch for the array method.

--
Fabien

diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index ade1b53..f39f341 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -786,7 +786,7 @@ pgbench  options  dbname
   
 
   
-   
+   
 
  \set varname expression
 
@@ -798,8 +798,10 @@ pgbench  options  dbname
   The expression may contain integer constants such as 5432,
   references to variables :variablename,
   and expressions composed of unary (-) or binary operators
-  (+, -, *, /, %)
-  with their usual associativity, and parentheses.
+  (+, -, *, /,
+  %) with their usual associativity,
+  function calls, and
+  parentheses.
  
 
  
@@ -994,6 +996,62 @@ END;
 
  
 
+ 
+  Built-In Functions
+
+   
+ The following functions are built into pgbench and
+ may be used in conjunction with
+ \set.
+   
+
+   
+   
+pgbench Functions
+
+ 
+  
+   Function
+   Return Type
+   Description
+   Example
+   Result
+  
+ 
+ 
+  
+   abs(a)
+   same as a
+   integer value
+   abs(-17)
+   17
+  
+  
+   debug(a)
+   same as a 
+   print to stderr the given argument
+   debug(5432)
+   5432
+  
+  
+   max(i [, ... ] )
+   integer
+   maximum value
+   max(5, 4, 3, 2)
+   5
+  
+  
+   min(i [, ... ] )
+   integer
+   minimum value
+   min(5, 4, 3, 2)
+   2
+  
+ 
+ 
+   
+ 
+
  
   Per-Transaction Logging
 
diff --git a/src/bin/pgbench/exprparse.y b/src/bin/pgbench/exprparse.y
index 06ee04b..cac4d5e 100644
--- a/src/bin/pgbench/exprparse.y
+++ b/src/bin/pgbench/exprparse.y
@@ -16,10 +16,13 @@
 
 PgBenchExpr *expr_parse_result;
 
+static PgBenchExprList *make_elist(PgBenchExpr *exp, PgBenchExprList *list);
 static PgBenchExpr *make_integer_constant(int64 ival);
 static PgBenchExpr *make_variable(char *varname);
-static PgBenchExpr *make_op(char operator, PgBenchExpr *lexpr,
+static PgBenchExpr *make_op(const char *operator, PgBenchExpr *lexpr,
 		PgBenchExpr *rexpr);
+static int find_func(const char *fname);
+static PgBenchExpr *make_func(const int fnumber, PgBenchExprList *args);
 
 %}
 
@@ -31,13 +34,15 @@ static PgBenchExpr *make_op(char operator, PgBenchExpr *lexpr,
 	int64		ival;
 	char	   *str;
 	PgBenchExpr *expr;
+	PgBenchExprList *elist;
 }
 
+%type  elist
 %type  expr
-%type  INTEGER
-%type  VARIABLE
+%type  INTEGER function
+%type  VARIABLE FUNCTION
 
-%token INTEGER VARIABLE
+%token INTEGER VARIABLE FUNCTION
 %token CHAR_ERROR /* never used, will raise a syntax error */
 
 /* Precedence: lowest to highest */
@@ -49,16 +54,25 @@ static PgBenchExpr *make_op(char operator, PgBenchExpr *lexpr,
 
 result: expr{ expr_parse_result = $1; }
 
+elist:  	{ $$ = NULL; }
+	| expr 	{ $$ = make_elist($1, NULL); }
+	| elist ',' expr		{ $$ = make_elist($3, $1); }
+	;
+
 expr: '(' expr ')'			{ $$ = $2; }
 	| '+' expr %prec UMINUS	{ $$ = $2; }
-	| '-' expr %prec UMINUS	{ $$ = make_op('-', make_integer_constant(0), $2); }
-	| expr '+' expr			{ $$ = make_op('+', $1, $3); }
-	| expr '-' expr			{ $$ = make_op('-', $1, $3); }
-	| expr '*' expr			{ $$ = make_op('*', $1, $3); }
-	| expr '/' expr			{ $$ = make_op('/', $1, $3); }
-	| expr '%' expr			{ $$ = make_op('%', $1, $3); }
+	| '-' expr %prec UMINUS	{ $$ = make_op("-", make_integer_constant(0), $2); }
+	| expr '+' expr			{ $$ = make_op("+", $1, $3); }
+	| expr '-' expr			{ $$ = make_op("-", $1, $3); }
+	| expr '*' expr			{ $$ = make_op("*", $1, $3); }
+	| expr '/' expr			{ $$ = make_op("/", $1, $3); }
+	| expr '%' expr			{ $$ = make_op("%", $1, $3); }
 	| INTEGER{ $$ = make_integer_constant($1); }
 	| VARIABLE { $$ = make_variable($1); }
+	| function '(' elist ')'{ $$ = make_func($1, $3); }
+	;
+
+function: FUNCTION			{ $$ = find_func($1); pg_free($1); }
 	;
 
 %%
@@ -84,14 +98,131 @@ make_variable(char *varname)
 }
 
 static PgBenchExpr *
-make_op(char operator, PgBenchExpr *lexpr, PgBenchExpr *rexpr)
+make_op(const char *operator, PgBenchExpr *lexpr, PgBenchExpr *rexpr)
+{
+	return make_func(find_func(operator),
+	 make_elist(rexpr, make_elist(lexpr, NULL)));
+}
+
+/*
+ * List of available functions:
+ * - fname: function name
+ * - nargs: number of arguments
+ * 

[HACKERS] Should PostgresFDW ImportForeignSchema should import the remote table default expression?

2016-02-17 Thread Rushabh Lathia
Here is the test:

-- create database
postgres=# create database foo;
CREATE DATABASE
postgres=# \c foo
You are now connected to database "foo" as user "rushabh".
-- Create remote table with default expression
foo=# create table test ( a int , b int default 200 );
CREATE TABLE
foo=# \c postgres
You are now connected to database "postgres" as user "rushabh".
postgres=#
postgres=# create extension postgres_fdw ;
CREATE EXTENSION
-- Create server and user mapping
postgres=# create server myserver FOREIGN DATA WRAPPER postgres_fdw options
(dbname 'foo', port '');
CREATE SERVER
postgres=# CREATE USER MAPPING FOR CURRENT_USER SERVER myserver;
CREATE USER MAPPING
-- Import foreign schema
postgres=# import foreign schema public from server myserver into public;
IMPORT FOREIGN SCHEMA

-- Foreign table got imported
postgres=# \d test
   Foreign table "public.test"
 Column |  Type   | Modifiers |FDW Options
--------+---------+-----------+-------------------
 a  | integer |   | (column_name 'a')
 b  | integer |   | (column_name 'b')
Server: myserver
FDW Options: (schema_name 'public', table_name 'test')

-- Try to insert row and assume that it will add default value for 'b'
column
postgres=# insert into test (a) values ( 10 );
INSERT 0 1

-- But guess what, I was wrong ???
postgres=# select * from test;
 a  | b
----+---
 10 |
(1 row)

Looking at the code of postgresImportForeignSchema it is clear that it's not
importing the default expression from the foreign table. But the question is
whether it should?

inputs/thoughts ?

Regards,
Rushabh Lathia
www.EnterpriseDB.com