Re: [HACKERS] proposal: schema variables

2018-09-19 Thread Pavel Stehule
Hi

new update:

I fixed pg_restore, and I cleaned up the code related to transaction processing.

It should be fully functional now.

Regards

Pavel


schema-variables-20180919-01.patch.gz
Description: application/gzip


Re: [HACKERS] Bug in to_timestamp().

2018-09-19 Thread Alexander Korotkov
On Tue, Sep 18, 2018 at 4:42 PM Alexander Korotkov
 wrote:
> But I found a related issue in cf9846724.  Before it was:
>
> # select to_timestamp('2018 01 01', '9MM9DD');
>       to_timestamp
> ------------------------
>  2018-01-01 00:00:00+03
> (1 row)
>
> But afterwards it becomes:
>
> # select to_timestamp('2018 01 01', '9MM9DD');
> ERROR:  invalid value "1 " for "MM"
> DETAIL:  Field requires 2 characters, but only 1 could be parsed.
> HINT:  If your source string is not fixed-width, try using the "FM" modifier.
>
> That happens because we've already skipped the space "for free", and then
> NODE_TYPE_CHAR eats a digit.  I've checked that Oracle doesn't allow
> random characters/digits to appear in the format string.
>
> select to_timestamp('2018 01 01', '9MM9DD') from dual
> ORA-01821: date format not recognized
>
> So, Oracle compatibility isn't an argument here.  Therefore I'm going to
> propose the following fix: let NODE_TYPE_CHAR eat characters only
> if we haven't skipped more input string characters than there were in the
> format string.  I'm sorry for the vague explanation.  I'll come up with a
> patch later, and it should be clear then.

Please find attached a patch fixing this issue.  It makes the handling
of format string text characters similar to the pre-cf984672 behavior.
See the examples in the regression tests and the explanation in the commit
message.  I'm going to commit this if there are no objections.
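A toy model of that rule, written for this discussion (plain Python, not the formatting.c implementation): skipping input whitespace "for free" advances the input pointer without advancing the format pointer, so a later literal character in the format lands on a digit; the fix refuses to let the literal consume anything once the input pointer has outrun the format pointer.

```python
def literal_consumes(input_pos, format_pos):
    """May a NODE_TYPE_CHAR (literal) format char consume an input char?

    Allowed only while we have not consumed more input characters than
    format characters -- i.e. no "free" skips have happened yet.
    """
    return input_pos <= format_pos

# Pointers in lockstep: the literal may eat one input character.
assert literal_consumes(input_pos=3, format_pos=3)
# After a free whitespace skip the input is one ahead: the literal
# must not eat (it would swallow a digit needed by the next field).
assert not literal_consumes(input_pos=4, format_pos=3)
```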

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


to_timestamp_fix.patch
Description: Binary data


Re: [HACKERS] Bug in to_timestamp().

2018-09-19 Thread amul sul
On Wed, Sep 19, 2018 at 2:57 PM Alexander Korotkov
 wrote:
>
> On Tue, Sep 18, 2018 at 4:42 PM Alexander Korotkov
>  wrote:
> > [...]
>
> Please find attached patch for fixing this issue.  It makes handling
> of format string text characters be similar to pre cf984672 behavior.
> See the examples in regression tests and explanation in the commit
> message.  I'm going to commit this if no objections.
>

With this patch, to_date and to_timestamp behave differently, see this:

edb=# SELECT to_date('18 12 2011', 'xDDxMMx');
      to_date
--------------------
 18-DEC-11 00:00:00
(1 row)

edb=# SELECT to_timestamp('18 12 2011', 'xDDxMMx');
       to_timestamp
---------------------------
 08-DEC-11 00:00:00 -05:00  <=== Incorrect output.
(1 row)

Regards,
Amul



Re: [HACKERS] Bug in to_timestamp().

2018-09-19 Thread amul sul
On Wed, Sep 19, 2018 at 3:51 PM amul sul  wrote:
>
> On Wed, Sep 19, 2018 at 2:57 PM Alexander Korotkov
[...]
>
> With this patch, to_date and to_timestamp behave differently, see this:
>
> edb=# SELECT to_date('18 12 2011', 'xDDxMMx');
>       to_date
> --------------------
>  18-DEC-11 00:00:00
> (1 row)
>
> edb=# SELECT to_timestamp('18 12 2011', 'xDDxMMx');
>        to_timestamp
> ---------------------------
>  08-DEC-11 00:00:00 -05:00  <=== Incorrect output.
> (1 row)
>
Sorry, that was wrong info -- along with this patch I had some trial
changes of my own applied.

Both to_date and to_timestamp behave the same with your patch -- but the
output is wrong. Is that what we are expecting?

postgres=# SELECT to_date('18 12 2011', 'xDDxMMx');
  to_date
------------
 2011-12-08
(1 row)

postgres=# SELECT to_timestamp('18 12 2011', 'xDDxMMx');
      to_timestamp
------------------------
 2011-12-08 00:00:00-05
(1 row)

Regards,
Amul



Regarding Mentoring for GCI-18

2018-09-19 Thread The Coding Planet
Hello,

I am Sunveer Singh, a Google Code-in 2017 Grand Prize Winner with OSGeo.

I would like to be a mentor for PostgreSQL. Last year I worked on almost 70
tasks in the GIS world.  As a student and Grand Prize winner I will be able
to help you during the GCI period. I am well acquainted with the
programming languages and technologies that PostgreSQL works with, since I
worked with OSGeo last year, so I have this background in GIS and SQL. And
this year I have fallen in love with PostgreSQL, so I would like to be a
mentor for it.

Thank You
Sunveer


Mentorship Initiative in Google Code-In

2018-09-19 Thread Neha Gupta
Hey all!

I'm Neha and I participated in Google Summer of Code this year with XWiki.
Since XWiki is not participating in Google Code-in this year, I would like
to volunteer as a mentor for Drupal. Is there a vacancy for that?

I've worked with CERN and a couple of startups in the past, and I've worked
with a broad set of technologies including various frontend frameworks,
DevOps with container technology, serverless frameworks, data science, and
machine learning.

If you have any further questions, let me know.

Best Regards,
Neha.

Re: [HACKERS] proposal: schema variables

2018-09-19 Thread Arthur Zakirov
Hello,

On Wed, Sep 19, 2018 at 10:30:31AM +0200, Pavel Stehule wrote:
> Hi
> 
> new update:
> 
> I fixed pg_restore, and I cleaned a code related to transaction processing
> 
> There should be a full functionality now.

I reviewed the patch a little bit. I have a few comments.

> pg_views Columns

I think there is a typo here. It should be "pg_variable".

> GRANT { READ | WRITE | ALL [ PRIVILEGES ] }

Shouldn't we use the GRANT { SELECT | LET | ... } syntax here for
consistency? Same for REVOKE. I'm not an experienced syntax developer,
though. But we use the SELECT and LET commands when working with variables,
so we should GRANT and REVOKE privileges for these commands.

> [ { ON COMMIT DROP | ON TRANSACTION END RESET } ]

I think we could join them and use the syntax { ON COMMIT DROP | RESET }
to keep the syntax simpler. If we create a variable with ON COMMIT
DROP, PostgreSQL will drop it regardless of whether the transaction was
committed or rolled back:

=# ...
=# begin;
=# create variable int1 int on commit drop;
=# rollback;
=# -- There is no variable int1

CREATE TABLE syntax has similar options [1]. ON COMMIT controls
the behaviour of temporary tables at the end of a transaction block,
whether it was committed or rolled back. But I'm not sure this is a good
precedent to follow.

> - ONCOMMIT_DROP   /* ON COMMIT DROP */
> + ONCOMMIT_DROP,  /* ON COMMIT DROP */
> } OnCommitAction;

There is an extra comma here after ONCOMMIT_DROP.

1 - https://www.postgresql.org/docs/current/static/sql-createtable.html

-- 
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company



Re: [HACKERS] proposal: schema variables

2018-09-19 Thread Pavel Stehule
Hi
On Wed, Sep 19, 2018 at 13:23, Arthur Zakirov
wrote:

> Hello,
>
> On Wed, Sep 19, 2018 at 10:30:31AM +0200, Pavel Stehule wrote:
> > [...]
>
> I reviewed the patch a little bit. I have a few comments.
>
> > pg_views Columns
>
> I think there is a typo here. It should be "pg_variable".
>

I'll fix it.

>
> > GRANT { READ | WRITE | ALL [ PRIVILEGES ] }
>
> Shouldn't we use the GRANT { SELECT | LET | ... } syntax here for
> consistency? Same for REVOKE. I'm not an experienced syntax developer,
> though. But we use the SELECT and LET commands when working with variables,
> so we should GRANT and REVOKE privileges for these commands.
>

I understand your proposal, and I don't have a strong opinion. Originally
I proposed {SELECT|UPDATE}, but some people preferred {READ|WRITE}. Now I
prefer Peter's proposal (what is implemented now), READ|WRITE, because it
is very illustrative. The mentioned difference is also good because the
variables are not tables (by default) and are not persistent, so different
rights suit me. I see "GRANT LET" as very unclear, because nobody knows
what the LET command does. Unfortunately we cannot use the standard "SET"
command, because it is used in Postgres for a different purpose.
READ|WRITE are totally clear, and for the user it is another signal that
variables are different from tables (a variable is not a one-row table).

I prefer the current state, but if the common opinion turns out to be
different, I have no problem changing it.


> > [ { ON COMMIT DROP | ON TRANSACTION END RESET } ]
>
> I think we could join them and use the syntax { ON COMMIT DROP | RESET }
> to keep the syntax simpler. If we create a variable with ON COMMIT
> DROP, PostgreSQL will drop it regardless of whether the transaction was
> committed or rolled back:
>

I thought about it too. I'll try to explain my idea. Originally I was
surprised that Postgres uses the "ON COMMIT" syntax while the documentation
uses the term "at transaction end". But it makes some sense. ON COMMIT DROP
is allowed only for temporary tables and ON COMMIT DELETE ROWS is allowed
for tables. With these clauses PostgreSQL is more aggressive in cleaning.
It doesn't need to account for rollback, because rollback does the cleaning
by itself. So the "ON COMMIT" syntax is fully correct: it is related only
to the commit event. It has no meaning on the rollback event (and doesn't
change the behavior on rollback).

The content of variables is not transactional (by default). It is not
destroyed by rollback. So I have to account for rollback too. The most
correct syntax would therefore be "ON COMMIT ON ROLLBACK RESET", which is a
little bit messy, so I used "ON TRANSACTION END". It should signal that
this event takes effect on rollback too and is valid for non-transactional
variables. This logic does not apply to transactional variables, where ON
COMMIT RESET makes sense. But that behavior is not the default, so I prefer
the more generic syntax.

Again, I have no problem changing it, but I think the current design is
logically correct.


> =# ...
> =# begin;
> =# create variable int1 int on commit drop;
> =# rollback;
> =# -- There is no variable int1
>
>
The PostgreSQL catalog (where the metadata is stored) is transactional, so
when I am working with metadata I use the ON COMMIT syntax, because the
behavior on ROLLBACK cannot be changed.

So I see two different cases: work with the catalog (which is
transactional) and work with the variable value, which (like variables in
other programming languages) is not transactional. "ON TRANSACTION END
RESET" means: do the reset at any transaction end.

I hope I explained it clearly; if not, please ask.
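To illustrate the distinction with the proposed (not yet committed) syntax -- a hypothetical sketch, so the exact commands may differ from the final patch:

```sql
-- The catalog entry is transactional: CREATE/DROP VARIABLE roll back.
-- The value is not transactional (by default): it survives a rollback.
CREATE VARIABLE counter int DEFAULT 0;
BEGIN;
LET counter = 10;
ROLLBACK;
SELECT counter;  -- still 10; the catalog, not the value, is transactional
```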

> CREATE TABLE syntax has similar options [1]. ON COMMIT controls
> the behaviour of temporary tables at the end of a transaction block,
> whether it was committed or rolled back. But I'm not sure this is a good
> precedent to follow.
>
> > - ONCOMMIT_DROP   /* ON COMMIT DROP */
> > + ONCOMMIT_DROP,  /* ON COMMIT DROP */
> > } OnCommitAction;
>
> There is the extra comma here after ONCOMMIT_DROP.
>

I'll fix it.

Regards

Pavel

>
> 1 - https://www.postgresql.org/docs/current/static/sql-createtable.html
>
> --
> Arthur Zakirov
> Postgres Professional: http://www.postgrespro.com
> Russian Postgres Company
>


Re: [HACKERS] proposal: schema variables

2018-09-19 Thread Arthur Zakirov
On Wed, Sep 19, 2018 at 02:08:04PM +0200, Pavel Stehule wrote:
> Unfortunately we cannot use the standard
> "SET" command, because it is used in Postgres for a different purpose.
> READ|WRITE are totally clear, and for the user it is another signal that
> variables are different from tables (a variable is not a one-row table).
> 
> I prefer the current state, but if the common opinion turns out to be
> different, I have no problem changing it.

I see. I grepped the thread before writing this but somehow missed the
discussion.

> The content of variables is not transactional (by default). It is not
> destroyed by rollback. So I have to account for rollback too. The most
> correct syntax would be "ON COMMIT ON ROLLBACK RESET", which is a little
> bit messy, so I used "ON TRANSACTION END". It should signal that this
> event takes effect on rollback too and is valid for non-transactional
> variables. This logic does not apply to transactional variables, where ON
> COMMIT RESET makes sense. But that behavior is not the default, so I
> prefer the more generic syntax.
> ...
> So I see two different cases: work with the catalog (which is
> transactional) and work with the variable value, which (like variables in
> other programming languages) is not transactional. "ON TRANSACTION END
> RESET" means: do the reset at any transaction end.
> 
> I hope I explained it clearly; if not, please ask.

I understood what you mean, thank you. I thought that the
{ ON COMMIT DROP | ON TRANSACTION END RESET } parameters were meant only
for transactional variables in the first place. But is there any sense
in using these parameters with non-transactional variables? That is, when
we create a non-transactional variable we don't want the variable to roll
back or reset its value at the end of a transaction.

-- 
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company



Re: doc - add missing documentation for "acldefault"

2018-09-19 Thread Joe Conway
On 08/03/2018 09:04 AM, Fabien COELHO wrote:
> Here is a version of the patch which briefly documents all aclitem-related
> functions, in a separate table.

I claimed this patch for review and commit. Comments:
---
* There is a comment in src/backend/utils/adt/acl.c noting that
  acldefault is "not documented for general use" which needs adjustment

* It makes no sense to me to document purely internal functions, e.g.
  aclitemin/out. If we are going to do that we should do it universally,
  which we currently do not, and IMHO it makes no sense anyway.

* I do believe aclitemeq() has utility outside internal purposes.

* The  section is incomplete

* Interestingly (since that is what started this thread apparently) I
  found myself questioning whether acldefault() is really worth
  documenting since the result will be misleading if the defaults have
  been altered via SQL.
---

Attached patch addresses those items and does a bit of reordering and
editorialization. If there are no objections I will commit it this way
in a day or two.

Thanks,

Joe

-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 4331beb..cea674e 100644
*** a/doc/src/sgml/func.sgml
--- b/doc/src/sgml/func.sgml
*** SELECT has_function_privilege('joeuser',
*** 16893,16898 
--- 16893,17032 
  be specified by name or by OID.
 
  
+
+  shows functions to
+ manage the aclitem type, the internal representation of access
+ privileges.
+ An aclitem entry describes the permissions of a grantee,
+ whether they are grantable or not, and which grantor granted them.
+ For instance, calvin=r*w/hobbes means that
+ role calvin has
+ the grantable privilege SELECT (r*)
+ and the non-grantable privilege UPDATE (w),
+ granted by role hobbes.
+ An empty grantee stands for PUBLIC.
+
+ 
+
+ aclitem Management Functions
+ 
+  
+   Name Return Type Description
+  
+  
+   
+acldefault(type,
+   ownerId)
+aclitem[]
+get the hardcoded default access privileges for an object belonging to ownerId
+   
+   
+aclinsert(aclitem[], aclitem)
+aclitem[]
+add element aclitem to aclitem[] array
+   
+   
+aclremove(aclitem[], aclitem)
+aclitem[]
+remove element aclitem from aclitem[] array
+   
+   
+aclitemeq(aclitem1, aclitem2)
+boolean
+test whether two aclitem elements are equal
+   
+   
+aclcontains(aclitem[], aclitem)
+boolean
+test whether element aclitem is contained within aclitem[] array
+   
+   
+aclexplode(aclitem[])
+setof record
+get aclitem array as tuples
+   
+   
+makeaclitem(grantee, grantor, privilege, grantable)
+aclitem
+build an aclitem from input
+   
+  
+ 
+
+ 
+
+ aclitem
+
+
+ acldefault
+
+
+ aclinsert
+
+
+ aclremove
+
+
+ aclitemeq
+
+
+ aclcontains
+
+
+ aclexplode
+
+
+ makeaclitem
+
+ 
+
+ acldefault returns the hardcoded default access privileges
+ for an object of type belonging to role ownerId.
+ Notice that these are used in the absence of any pg_default_acl
+ () entry. Default access privileges are described in
+  and can be overwritten with
+ . In other words, this function will return
+ results which may be misleading when the defaults have been overridden.
+ Type is a CHAR, use
+ 'c' for COLUMN,
+ 'r' for relation-like objects such as TABLE or VIEW,
+ 's' for SEQUENCE,
+ 'd' for DATABASE,
+ 'f' for FUNCTION or PROCEDURE,
+ 'l' for LANGUAGE,
+ 'L' for LARGE OBJECT,
+ 'n' for SCHEMA,
+ 't' for TABLESPACE,
+ 'F' for FOREIGN DATA WRAPPER,
+ 'S' for FOREIGN SERVER,
+ 'T' for TYPE or DOMAIN.
+
+ 
+
+ aclinsert and aclremove
+ allow insertion/removal of a privilege described by an
+ aclitem into/from an array of aclitem.
+
+ 
+
+ aclitemeq checks for equality of two
+ aclitem elements.
+
+ 
+
+ aclcontains checks if an aclitem
+ element is present in an array of aclitem.
+
+ 
+
+ aclexplode returns an aclitem array
+ as a set of rows. Output columns are grantor oid,
+ grantee oid (0 for PUBLIC),
+ granted privilege as text (SELECT, ...)
+ and whether the privilege is grantable as boolean.
+ makeaclitem performs the inverse operation.
+
+ 

  shows functions that
 determine whether a certain object is visible in the
diff --git a/src/backend/utils/adt/acl.c b/src/backend/utils/adt/acl.c
index a45e093..d5285e2 100644
*** a/src/backend/utils/adt/acl.c
--- b/src/b
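The aclitem text form documented in the patch above (e.g. calvin=r*w/hobbes) can be decomposed mechanically. Here is an illustrative Python sketch of that decomposition, written for this discussion (it is not code from the patch):

```python
import re

def parse_aclitem(item):
    """Split "grantee=privileges/grantor" into its parts.

    A '*' after a privilege letter marks it as grantable; an empty
    grantee means PUBLIC (both as described in the docs above).
    """
    grantee, rest = item.split("=", 1)
    letters, grantor = rest.split("/", 1)
    privileges = [
        # (privilege letter, grantable?) -- e.g. r = SELECT, w = UPDATE
        (m.group(1), m.group(2) == "*")
        for m in re.finditer(r"([a-zA-Z])(\*?)", letters)
    ]
    return {"grantee": grantee or "PUBLIC",
            "privileges": privileges,
            "grantor": grantor}

parse_aclitem("calvin=r*w/hobbes")
# → {'grantee': 'calvin', 'privileges': [('r', True), ('w', False)],
#    'grantor': 'hobbes'}
```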

Re: Code of Conduct plan

2018-09-19 Thread ERR ORR
I see the CoC as an infiltration of a PostgreSQL community which has worked
fine for at least 10 years.
The project owners have let their care slacken.
I request that the project owners EXPEL/EXCOMMUNICATE ALL those who are
advancing what can only be seen as an instrument for harassing members of a
to-date peaceful and cordial community.

THROW THESE LEFTIST BULLIES OUT‼️

Dimitri Maziuk  wrote on Mon, Sep 17, 2018, 19:21:

> On 09/17/2018 10:39 AM, Chris Travers wrote:
> > On Mon, Sep 17, 2018 at 5:28 PM Joshua D. Drake 
> > wrote:
> ...
> >> My feedback is that those two sentences provide an overarching authority
> >> that .Org does not have the right to enforce
> ...
> > Fascinating that this would, on its face, not apply to a harassment
> > campaign carried out over twitter, but it would apply to a few comments
> > made over drinks at a bar.
>
> There is a flip side: if you have written standards, you can be held
> liable for not enforcing them. Potentially including enforcement of
> twitbook AUP on the list subscribers who also have a slackogger account.
>
> --
> Dimitri Maziuk
> Programmer/sysadmin
> BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu
>
>


Re: Online verification of checksums

2018-09-19 Thread Michael Banck
Hi,

On Tuesday, 2018-09-18 at 13:52 -0400, David Steele wrote:
> On 9/18/18 11:45 AM, Stephen Frost wrote:
> > * Michael Banck (michael.ba...@credativ.de) wrote:
> > > I have added a retry for this as well now, without a pg_sleep() as well.
> > > This catches around 80% of the half-reads, but a few slip through. At
> > > that point we bail out with exit(1), and the user can try again, which I
> > > think is fine? 
> > 
> > No, this is perfectly normal behavior, as is having completely blank
> > pages, now that I think about it.  If we get a short read then I'd say
> > we simply check that we got an EOF and, in that case, we just move on.
> > 
> > > Alternatively, we could just skip to the next file then and don't make
> > > it count as a checksum failure.
> > 
> > No, I wouldn't count it as a checksum failure.  We could possibly count
> > it towards the skipped pages, though I'm even on the fence about that.
> 
> +1 for it not being a failure.  Personally I'd count it as a skipped
> page, since we know the page exists but it can't be verified.
> 
> The other option is to wait for the page to stabilize, which doesn't
> seem like it would take very long in most cases -- unless you are doing
> this test from another host with shared storage.  Then I would expect to
> see all kinds of interesting torn pages after the last checkpoint.

OK, I'm skipping the block now on first try, as this makes (i) sense and
(ii) simplifies the code (again).

Version 3 is attached.


Michael

-- 
Michael Banck
Project Lead / Senior Consultant
Tel.: +49 2166 9901-171
Fax:  +49 2166 9901-100
Email: michael.ba...@credativ.de

credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Management: Dr. Michael Meskes, Jörg Folz, Sascha Heuer

Our handling of personal data is subject to the
following provisions: https://www.credativ.de/datenschutz

diff --git a/src/bin/pg_verify_checksums/pg_verify_checksums.c b/src/bin/pg_verify_checksums/pg_verify_checksums.c
index 1bc020ab6c..9963264028 100644
--- a/src/bin/pg_verify_checksums/pg_verify_checksums.c
+++ b/src/bin/pg_verify_checksums/pg_verify_checksums.c
@@ -1,7 +1,7 @@
 /*
  * pg_verify_checksums
  *
- * Verifies page level checksums in an offline cluster
+ * Verifies page level checksums in a cluster
  *
  *	Copyright (c) 2010-2018, PostgreSQL Global Development Group
  *
@@ -25,7 +25,9 @@
 static int64 files = 0;
 static int64 blocks = 0;
 static int64 badblocks = 0;
+static int64 skippedblocks = 0;
 static ControlFileData *ControlFile;
+static XLogRecPtr checkpointLSN;
 
 static char *only_relfilenode = NULL;
 static bool verbose = false;
@@ -54,6 +56,7 @@ static const char *const skip[] = {
 	"pg_filenode.map",
 	"pg_internal.init",
 	"PG_VERSION",
+	"pgsql_tmp",
 	NULL,
 };
 
@@ -67,8 +70,14 @@ skipfile(const char *fn)
 		return true;
 
 	for (f = skip; *f; f++)
+	{
 		if (strcmp(*f, fn) == 0)
 			return true;
+		if (strcmp(*f, "pg_internal.init") == 0)
+			if (strncmp(*f, fn, strlen(*f)) == 0)
+				return true;
+	}
+
 	return false;
 }
 
@@ -79,10 +88,17 @@ scan_file(const char *fn, BlockNumber segmentno)
 	PageHeader	header = (PageHeader) buf.data;
 	int			f;
 	BlockNumber blockno;
+	bool		block_retry = false;
 
 	f = open(fn, O_RDONLY | PG_BINARY, 0);
 	if (f < 0)
 	{
+		if (errno == ENOENT)
+		{
+			/* File was removed in the meantime */
+			return;
+		}
+
 		fprintf(stderr, _("%s: could not open file \"%s\": %s\n"),
				progname, fn, strerror(errno));
 		exit(1);
@@ -99,24 +115,71 @@ scan_file(const char *fn, BlockNumber segmentno)
 			break;
 		if (r != BLCKSZ)
 		{
-			fprintf(stderr, _("%s: could not read block %u in file \"%s\": read %d of %d\n"),
-					progname, blockno, fn, r, BLCKSZ);
-			exit(1);
+			/* Skip partially read blocks */
+			skippedblocks++;
+			continue;
 		}
-		blocks++;
 
 		/* New pages have no checksum yet */
 		if (PageIsNew(header))
+		{
+			skippedblocks++;
 			continue;
+		}
+
+		blocks++;
 
 		csum = pg_checksum_page(buf.data, blockno + segmentno * RELSEG_SIZE);
 		if (csum != header->pd_checksum)
 		{
+			/*
+			 * Retry the block on the first failure.  It's
+			 * possible that we read the first 4K page of
+			 * the block just before postgres updated the
+			 * entire block so it ends up looking torn to
+			 * us.  We only need to retry once because the
+			 * LSN should be updated to something we can
+			 * ignore on the next pass.  If the error
+			 * happens again then it is a true validation
+			 * failure.
+			 */
+			if (block_retry == false)
+			{
+				/* Seek to the beginning of the failed block */
+				if (lseek(f, -BLCKSZ, SEEK_CUR) == -1)
+				{
+					fprintf(stderr, _("%s: could not lseek in file \"%s\": %m\n"),
+							progname, fn);
+					exit(1);
+				}
+
+				/* Set flag so we know a retry was attempted */
+				block_retry = true;
+
+				/* Reset loop to validate the block again */
+				blockno--;
+
+				continue;
+			}
+
+			/* The checksum verification failed

Re: hostorder and failover_timeout for libpq

2018-09-19 Thread Ildar Musin
Hello Surafel,

On Fri, Sep 14, 2018 at 2:03 PM Surafel Temesgen 
wrote:

> Hey,
> Here are a few comments.
> +  xreflabel="failover_timeout">
> Here's a typo: ="libpq-connect-falover-timeout"
> +   {"failover_timeout", NULL, NULL, NULL,
> +   "Failover Timeout", "", 10,
> Words are separated by hyphens in internalPQconninfoOption labels, as in
> the surrounding code.
> +If the value is random, the host to connect to
> +will be randomly picked from the list. It allows load balancing
> between
> +several cluster nodes.
> I can't think of a use case where randomly picking a node, rather than
> using the user-specified order, balances load across the cluster better.
> Can you explain the purpose of this feature more?

Probably load balancing is the wrong word for this. Think of it as a
connection routing mechanism. Let's say you have 10 servers and 100 clients
willing to establish read-only connections. Without this feature all
clients will go to the first specified host (unless they hit the
max_connections limit). And with random `hostorder` they would be split
between hosts more or less evenly.


> And in the code I can't see
> a mechanism for preventing picking one host multiple times
>
The original idea was to collect all the IP addresses that we get after
resolving the specified hostnames, put those addresses into a global array,
apply a random permutation to it, and then use a round-robin algorithm,
trying to connect to each of them until we succeed. Now I'm not sure that
this approach was the best. There are two concerns:

1. A host name can resolve to several IP addresses (which in turn can
point to either the same physical server with multiple network interfaces
or different servers). In the scheme described above each IP address would
be added to the global array. This may lead to a situation where one host
gets a higher chance of being picked because it has more addresses in the
global array than the other hosts.
2. A host may support both IPv4 and IPv6 connections, which again leads to
extra items in the global array and therefore also increases its chance of
being picked.

Another approach would be to leave `pg_conn->connhost` as it is now (i.e.
not create a global address array) and just apply a random permutation to
it if `hostorder=random` is specified, and probably also permute the
address list within each individual host.
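Modeled outside libpq for clarity, the second approach amounts to a per-connection shuffle of the host list. A sketch in Python (the names here are illustrative, not libpq API):

```python
import random

def connection_order(hosts, hostorder="sequential", rng=random):
    """Return the order in which this client should try the hosts.

    With hostorder=random each client shuffles its own copy of the
    list, so a large population of clients spreads its connection
    attempts across the servers roughly evenly.
    """
    order = list(hosts)  # never mutate the caller's host list
    if hostorder == "random":
        rng.shuffle(order)
    return order

# Each call may yield a different permutation of the same hosts:
connection_order(["db1", "db2", "db3"], hostorder="random")
```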

At this point I'd like to ask the community what in your opinion would be
the best course of action, and whether this feature should be implemented
within libpq at all. From my POV there are factors that really depend on
network architecture, and there is probably no single right solution.

Kind regards,
Ildar


Re: Regarding Mentoring for GCI-18

2018-09-19 Thread Stephen Frost
Greetings,

* The Coding Planet (sunveersing...@gmail.com) wrote:
> I am Sunveer Singh, Google Code-in 2017 Grand Prize Winner with OSGeo.
> [...]

That's fantastic and we are thrilled to have you!

Sarah and I are the admins for PostgreSQL's first GCI experience and
we'll add you to the list of mentors we have lined up.  We have an
initial wiki for tasks here:

https://wiki.postgresql.org/wiki/GCI_2018

Thanks!

Stephen


signature.asc
Description: PGP signature


Re: Code of Conduct

2018-09-19 Thread ERR ORR
I was never consulted.
I was only told that there was a CoC "to be". Not when, not how.
A CoC will inevitably lead to the project being taken over by leftists;
political and technical decisions will be made by others.
Most important from my PoV, the project's quality will decrease until it is
made unviable.
As others have said, this was rammed down our throats.
Before you people become unemployed, read "SJWs Always Lie". You'll know
what awaits you.
As for myself, I'll be on the lookout for another DB. One that's not
infiltrated by leftist nuts.

And Dave, you can tell the core team a big "FUCK YOU" for this.

James Keener  wrote on Tue, Sep 18, 2018, 13:48:

> > following a long consultation process
>
> It's not a consultation if any dissenting voice is simply ignored. Don't
> sugar-coat or politicize it like this -- it was rammed down everyone's
> throats. That is core's right, but don't act as if everyone's opinions and
> concerns were taken into consideration. There are a good number of folks
> who are concerned that this CoC is overreaching and is ripe for abuse.
> Those concerns were always simply, plainly, and purposely ignored.
>
> > Please take time to read and understand the CoC, which is intended to
> ensure that PostgreSQL remains an open and enjoyable project for anyone to
> join and participate in.
>
> I sincerely hope so, and that it doesn't become a tool to enforce social
> ideology like in other groups I've been part of. Especially since this is
> the main place to come to get help for PostgreSQL and not a social club.
>
> Jim
>
> On September 18, 2018 6:27:56 AM EDT, Dave Page 
> wrote:
>>
>> The PostgreSQL Core team are pleased to announce that following a long
>> consultation process, the project’s Code of Conduct (CoC) has now been
>> finalised and published at https://www.postgresql.org/about/policies/coc/
>> .
>>
>> Please take time to read and understand the CoC, which is intended to
>> ensure that PostgreSQL remains an open and enjoyable project for anyone to
>> join and participate in.
>>
>> A Code of Conduct Committee has been formed to handle any complaints.
>> This consists of the following volunteers:
>>
>> - Stacey Haysler (Chair)
>> - Lætitia Avrot
>> - Vik Fearing
>> - Jonathan Katz
>> - Ilya Kosmodemiansky
>>
>> We would like to extend our thanks and gratitude to Stacey Haysler for
>> her patience and expertise in helping develop the Code of Conduct, forming
>> the committee and guiding the work to completion.
>>
>> --
>> Dave Page
>> PostgreSQL Core Team
>> http://www.postgresql.org/
>> 
>>
>
> --
> Sent from my Android device with K-9 Mail. Please excuse my brevity.
>


Re: [HACKERS] proposal: schema variables

2018-09-19 Thread Pavel Stehule
On Wed, Sep 19, 2018 at 14:53, Arthur Zakirov
wrote:

> On Wed, Sep 19, 2018 at 02:08:04PM +0200, Pavel Stehule wrote:
> > [...]
>
> I see. I grepped the thread before writing this but somehow missed the
> discussion.
>
> > The content of variables is not transactional (by default). It is not
> > destroyed by rollback. So I have to calculate with rollback too. So the
> > most correct syntax should be "ON COMMIT ON ROLLBACK RESET" what is
> little
> > bit messy and I used "ON TRANSACTION END". It should be signal, so this
> > event is effective on rollback event and it is valid for not transaction
> > variable. This logic is not valid to transactional variables, where ON
> > COMMIT RESET makes sense. But this behaviour is not the default, so I prefer
> > the more generic syntax.
> > ...
> > So I see two different cases - work with catalog (what is transactional)
> > and work with variable value, what is (like other variables in
> programming
> > languages) not transactional. "ON TRANSACTION END RESET" means - does
> reset
> > on any transaction end.
> >
> > I hope so I explained it cleanly - if not, please, ask.
>
> I understood what you mean, thank you. I thought that
> { ON COMMIT DROP | ON TRANSACTION END RESET } parameters are used only
> for transactional variables in the first place. But is there any sense
> in using these parameters with non-transactional variables? That is, when
> we create a non-transactional variable we don't want the variable to
> roll back or reset its value after the end of a transaction.
>

ON COMMIT DROP is used only for temp variables (transactional or not). The
purpose is the same as for tables: sometimes you want an object with a
shorter lifetime than the session.

ON TRANSACTION END RESET makes sense mainly for non-transactional variables. I
see two use cases.

1. protecting sensitive data - a guaranteed reset and cleanup at transaction
end. You can be sure the variable is not initialized (has its default value)
unless you are inside a transaction.

2. automatic initialization - ON TRANSACTION END RESET ensures the variable
is in its initial state for any transaction.

Both cases make sense for transactional and non-transactional variables.

I think a transactional lifetime for the content makes sense. It is cheaper to
reset a variable than to drop it (which is what ON COMMIT DROP does).
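For concreteness, a hypothetical session sketching this behaviour. It assumes
the CREATE VARIABLE / LET syntax from the proposed patch; nothing here is
committed PostgreSQL, and the names are illustrative:

```sql
-- Non-transactional variable whose value is nevertheless cleared
-- at every transaction end (commit or rollback).
CREATE VARIABLE secret_token AS text ON TRANSACTION END RESET;

BEGIN;
LET secret_token = 'abc';
SELECT secret_token;   -- 'abc', visible for the rest of the transaction
COMMIT;                -- value is reset here; ROLLBACK would reset it too

SELECT secret_token;   -- back to the default (NULL)
```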

What do you think?

Pavel



> --
> Arthur Zakirov
> Postgres Professional: http://www.postgrespro.com
> Russian Postgres Company
>


Google Code-in 2018

2018-09-19 Thread Sarah Schnurr
Hello, all!

Google has a global, online contest that invites teenagers (ages 13-17) to
work with mentors from various pre-selected open source organizations. The
idea is that these students are able to work on "bite-sized tasks" to
become introduced to the technology and learn. The tasks listed by
other organizations tend to be small but meaningful, with students
actually contributing to the project.

PostgreSQL was successfully chosen as an open-source organization to
participate. This means we have until Tuesday, October 23 (16:00 UTC) to
prepare a minimum of 100 task instances, across at least 50 unique tasks
spanning all 5 categories (including many beginner tasks), for students to
choose from at the start of the contest.

We can use help in meeting this number of tasks. So please, if you have a
moment, take time to review the following information and add any task idea
you could think of that would be a useful or meaningful contribution to
PostgreSQL or associated extensions/projects.

Similarly to GSoC, please consider PostgreSQL to be an "Umbrella" project
in that anything which would be considered "PostgreSQL Family" per the News
& Events  policy is
likely to be acceptable as a PostgreSQL Google Code-in 2018 task.

In other words, if you're a contributor or developer on barman, pgBackRest,
the PostgreSQL website (pgweb), the PgEU/PgUS website code (pgeu-website),
pgAdmin4, PostgresXL, pgbouncer, Citus, pldebugger, the PG RPMs (pgrpms),
the JDBC driver, the ODBC driver, or any of the many other Postgres Family
projects, please feel free to add a task for
consideration!

Furthermore, if you're interested in joining as a Mentor for this year's
contest, please let us know and we will add you in.

*More information, from their FAQ:*

A task is a small project that is expected to take between 3-5 hours of
work to complete. Tasks are categorized with the following labels:

   - Code: Tasks related to writing or refactoring code
   - Documentation/Training: Tasks related to creating/editing documents
   and helping others learn more
   - Outreach/Research: Tasks related to community management,
   outreach/marketing, or studying problems and recommending solutions
   - Quality Assurance: Tasks related to testing and ensuring code is of
   high quality
   - Design: Tasks related to user experience research or user interface
   design and interaction

*The list of current ideas we have for tasks (currently at 25) is listed on
the wiki, here:*

https://wiki.postgresql.org/wiki/GCI_2018

*More example tasks can be found here:*

https://developers.google.com/open-source/gci/resources/example-tasks

When submitting ideas on the Wiki page, please be sure to include "Tags"
and "Categories" sections to optimize the process of submitting tasks on
the organization page.

Looking forward to an exciting and productive GCI 2018!

Thank you!
-- 
Sarah Schnurr


Re: doc - add missing documentation for "acldefault"

2018-09-19 Thread Tom Lane
Joe Conway  writes:
> * I do believe aclitemeq() has utility outside internal purposes.

Our normal policy is that we do not document functions that are meant to
be invoked through operators.  The \df output saying that is sufficient:

# \df+ aclitemeq

List of functions
   Schema   |   Name| Result data type | Argument data types | Type | 
Volatility | Parallel |  Owner   | Security | Access privileges | Language | 
Source code | Description  
+---+--+-+--++--+--+--+---+--+-+--
 pg_catalog | aclitemeq | boolean  | aclitem, aclitem| func | 
immutable  | safe | postgres | invoker  |   | internal | 
aclitem_eq  | implementation of = operator
(1 row)

I would strongly object to ignoring that policy in just one place.

Actually, it appears that most of these functions have associated
operators:

# select oid::regoperator, oprcode from pg_operator where oprright = 
'aclitem'::regtype;
  oid  |   oprcode   
---+-
 +(aclitem[],aclitem)  | aclinsert
 -(aclitem[],aclitem)  | aclremove
 @>(aclitem[],aclitem) | aclcontains
 =(aclitem,aclitem)| aclitemeq
 ~(aclitem[],aclitem)  | aclcontains
(5 rows)

So maybe what we really need is a table of operators not functions.
However, I don't object to documenting any function that has its
own pg_description string.

regards, tom lane



Re: Progress reporting for pg_verify_checksums

2018-09-19 Thread Fabien COELHO



Hallo Michael,


This optionally prints the progress of pg_verify_checksums via read
kilobytes to the terminal with the new command line argument -P.

If -P was forgotten and pg_verify_checksums operates on a large cluster,
the caller can send SIGUSR1 to pg_verify_checksums to turn progress
status reporting on during runtime.


Version 2 of the patch is attached. This is rebased to master as of
422952ee and changes the signal handling to be a toggle as suggested by
Alvaro.


Patch applies cleanly and compiles.

About tests: "make check" is okay, but ISTM that the command is not
started once, ever, in any test:-( It is unclear whether any test triggers
checksums anyway...


The time() granularity to the second makes the result awkward on small 
tests:


 8/1554552 kB (0%, 8 kB/s)
 1040856/1554552 kB (66%, 1040856 kB/s)
 1554552/1554552 kB (100%, 1554552 kB/s)

I'd suggest to reuse the "portability/instr_time.h" stuff which allows a 
much better precision.


The "kB" kilobyte symbol stands for 1000 bytes, but you are displaying
1024-byte kilos, whose symbol is either "KiB" or "KB". Given that this is
about storage, which is usually measured in powers of 1,000, I'd suggest
using thousands.


Another reservation I have is that on any hardware there are likely to be a
lot of kilobytes, so maybe megabytes (MB) should be used instead.


I'm wondering whether the bandwidth display should be local (from the last 
display) or global (from the start of the command), but for the last 
forced one. This is an open question.


Maybe it would be nice to show elapsed time and expected completion time 
based on the current speed.


I would be in favor of having this turned on by default, and silenced with
some option. I'm not sure other people would agree, though, so it is an
open question as well.


About the code:

 +   if (show_progress)
 +   show_progress = false;
 +   else
 +   show_progress = true;

Why not a simple:

/* invert current state */
show_progress = !show_progress;

I do not see much advantage in using intermediate string buffers for 
values:


 + snprintf(totalstr, sizeof(totalstr), INT64_FORMAT,
 +  total_size / 1024);
 + snprintf(currentstr, sizeof(currentstr), INT64_FORMAT,
 +  current_size / 1024);
 + snprintf(currspeedstr, sizeof(currspeedstr), INT64_FORMAT,
 +  (current_size / 1024) / (((time(NULL) - scan_started) == 0) ? 1 : 
(time(NULL) - scan_started)));
 + fprintf(stderr, "%s/%s kB (%d%%, %s kB/s)",
 +  currentstr, totalstr, total_percent, currspeedstr);

ISTM that just one simple fprintf would be fine:

  fprintf(stderr, INT64_FORMAT "/" INT64_FORMAT " ...",
  formulas for each slot...);

ISTM that this line overwriting trick deserves a comment:

 + if (isatty(fileno(stderr)))
 +   fprintf(stderr, "\r");
 + else
 +   fprintf(stderr, "\n");

And it runs a little amok with "-v".

 + memset(&act, '\0', sizeof(act));

pg uses 0 instead of '\0' everywhere else.

  +   /* Make sure we just report at least once a second */
  +   if ((now == last_progress_update) && !force) return;

The comment seems contradictory; I understand the code makes sure that it
is "at most" once a second. pgbench uses "-P XXX" to trigger a progress
report every XXX seconds. I'm unsure whether it would be useful to allow
the user to change the 1 second display interval.


I'm not sure why you removed one unrelated line:

  #include "storage/checksum.h"
  #include "storage/checksum_impl.h"

 -
  static int64 files = 0;
  static int64 blocks = 0;
  static int64 badblocks = 0;


There is a problem in the scan_file code: The added sizeonly parameter is 
not used. It should be removed.


--
Fabien.



Re: doc - add missing documentation for "acldefault"

2018-09-19 Thread Joe Conway
On 09/19/2018 10:54 AM, Tom Lane wrote:
> Joe Conway  writes:
>> * I do believe aclitemeq() has utility outside internal purposes.
> 
> Our normal policy is that we do not document functions that are meant to
> be invoked through operators.  The \df output saying that is sufficient:



> I would strongly object to ignoring that policy in just one place.

Ok, fair enough.

> Actually, it appears that most of these functions have associated
> operators:
> 
> # select oid::regoperator, oprcode from pg_operator where oprright = 
> 'aclitem'::regtype;
>   oid  |   oprcode   
> ---+-
>  +(aclitem[],aclitem)  | aclinsert
>  -(aclitem[],aclitem)  | aclremove
>  @>(aclitem[],aclitem) | aclcontains
>  =(aclitem,aclitem)| aclitemeq
>  ~(aclitem[],aclitem)  | aclcontains
> (5 rows)
> 
> So maybe what we really need is a table of operators not functions.

Good idea -- I will take a look at that.

> However, I don't object to documenting any function that has its
> own pg_description string.

Ok.

Joe

-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development



signature.asc
Description: OpenPGP digital signature


Re: doc - add missing documentation for "acldefault"

2018-09-19 Thread Fabien COELHO



Hello,


Our normal policy is that we do not document functions that are meant to
be invoked through operators.  The \df output saying that is sufficient:


The output of \df is one thing, but I was looking at pg online 
documentation and hoping to find things as well.


\dfS returns nearly 3000 results, it is not really practical unless you
already know what you are looking for, e.g. the name of the function...



I would strongly object to ignoring that policy in just one place.


Ok, fair enough.


Actually, it appears that most of these functions have associated
operators:

# select oid::regoperator, oprcode from pg_operator where oprright = 
'aclitem'::regtype;
  oid  |   oprcode
---+-
 +(aclitem[],aclitem)  | aclinsert
 -(aclitem[],aclitem)  | aclremove
 @>(aclitem[],aclitem) | aclcontains
 =(aclitem,aclitem)| aclitemeq
 ~(aclitem[],aclitem)  | aclcontains
(5 rows)

So maybe what we really need is a table of operators not functions.



Good idea -- I will take a look at that.


My initial complaint is that I did not know that there was an "acldefault"
function while I was looking for that kind of thing. ISTM that this one
should be kept explicitly in the doc.


I'm okay with documenting operators instead of the basic underlying
functions and skipping type management (in/out) functions, though.


--
Fabien.



Re: Code of Conduct

2018-09-19 Thread Fred Pratt
Keep pg open and free.   This smells of PC police.   This community can police 
itself

Sent from my mobile device. Please pardon my brevity and typos.   I am not 
responsible for changes made by this device’s autocorrect feature.

Fred Pratt
AmerisourceBergen
Manager – IT Infrastructure
Micro Technologies

8701 CenterPort Blvd
Amarillo, TX  79108

Work: 806.372.2369 (Ext. 8364)
Fax: 855.849.0680
Mobile: 806.679.1742

microtechnologies.com

On Sep 19, 2018, at 9:32 AM, ERR ORR 
mailto:rd0...@gmail.com>> wrote:

I was never consulted.
I was only told that there was a CoC "to be". Not when, not how.
A CoC will inevitably lead to the project being taken over by leftists;
political and technical decisions will be made by others.
Most important from my PoV, the project's quality will decrease until it's
made unviable.
As others have said, this was rammed down our throats.
Before you ppl become unemployed, read "SJWs always lie". You'll know what 
awaits you.
As for myself, I'll be on the lookout for another DB. One that's not 
infiltrated by leftist nuts.

And Dave, you can tell the core team a big "FUCK YOU" for this.

James Keener mailto:j...@jimkeener.com>> schrieb am Di., 
18. Sep. 2018, 13:48:
> following a long consultation process

It's not a consultation if any dissenting voice is simply ignored. Don't 
sugar-coat or politicize it like this -- it was rammed down everyone's throats. 
That is core's right, but don't act as everyone's opinions and concerns were 
taken into consideration. There are a good number of folks who are concerned 
that this CoC is overreaching and is ripe for abuse. Those concerns were always 
simply, plainly, and purposely ignored.

> Please take time to read and understand the CoC, which is intended to ensure 
> that PostgreSQL remains an open and enjoyable project for anyone to join and 
> participate in.

I sincerely hope so, and that it doesn't become a tool to enforce social 
ideology like in other groups I've been part of. Especially since this is the 
main place to come to get help for PostgreSQL and not a social club.

Jim

On September 18, 2018 6:27:56 AM EDT, Dave Page 
mailto:dp...@postgresql.org>> wrote:
The PostgreSQL Core team are pleased to announce that following a long 
consultation process, the project’s Code of Conduct (CoC) has now been 
finalised and published at https://www.postgresql.org/about/policies/coc/.

Please take time to read and understand the CoC, which is intended to ensure 
that PostgreSQL remains an open and enjoyable project for anyone to join and 
participate in.

A Code of Conduct Committee has been formed to handle any complaints. This 
consists of the following volunteers:

- Stacey Haysler (Chair)
- Lætitia Avrot
- Vik Fearing
- Jonathan Katz
- Ilya Kosmodemiansky

We would like to extend our thanks and gratitude to Stacey Haysler for her 
patience and expertise in helping develop the Code of Conduct, forming the 
committee and guiding the work to completion.

--
Dave Page
PostgreSQL Core Team
http://www.postgresql.org/


--
Sent from my Android device with K-9 Mail. Please excuse my brevity.



Re: pgsql: Allow concurrent-safe open() and fopen() in frontend code for Wi

2018-09-19 Thread Tom Lane
Michael Paquier  writes:
> I have spent a large portion of my morning trying to test all the
> solutions proposed, and a winner shows up.  ...

> - win32-open-laurenz.patch, which enforces to text mode only if binary
> mode is not defined, which maps strictly to what pre-11 is doing when
> calling the system _open or _fopen.  And surprisingly, this is proving
> to pass all the tests I ran: bincheck (including pgbench and pg_dump),
> upgradecheck, recoverycheck, check, etc.  initdb --pwfile is not
> complaining to me either.

I'm OK with this approach.  I wonder though what happens if you take
away the "#ifdef FRONTEND" and just enforce that one or the other mode
is selected always.  That would seem like a sensible solution rather
than a wart to me ...

regards, tom lane



Re: doc - add missing documentation for "acldefault"

2018-09-19 Thread John Naylor
On 9/19/18, Tom Lane  wrote:
> However, I don't object to documenting any function that has its
> own pg_description string.

Speaking of, that description string seems to have been neglected.
I've attached a remedy for that.

-John Naylor
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 860571440a..6b7088b9ce 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -2073,7 +2073,7 @@
 { oid => '1365', descr => 'make ACL item',
   proname => 'makeaclitem', prorettype => 'aclitem',
   proargtypes => 'oid oid text bool', prosrc => 'makeaclitem' },
-{ oid => '3943', descr => 'TODO',
+{ oid => '3943', descr => 'show default privileges, for use by the information schema',
   proname => 'acldefault', prorettype => '_aclitem', proargtypes => 'char oid',
   prosrc => 'acldefault_sql' },
 { oid => '1689',


Re: Code of Conduct

2018-09-19 Thread Francisco Olarte
On Wed, Sep 19, 2018 at 5:27 PM, Fred Pratt
 wrote:
> Keep pg open and free.   This smells of PC police.   This community can 
> police itself
No comment on this, just kept for context.

> Sent from my mobile device. Please pardon my brevity and typos.   I am not 
> responsible for changes made by this device’s autocorrect feature.

I will happily pardon brevity ( although I would not call a ten line
sig plus a huge bottom quote "breve", and AFAIK it means the same in
english as in spanish ) and/or typos, but the "I am not responsible"
feels nearly insulting. Did someone force you to use "this device" (
which you seem to perceive as inadequate for a nice answer ) to reply,
or did you choose to do it ? ( a real, not rhetorical, question, but do not
answer if you feel it's inadequate )


As an aside, is this kind of affirmation, and/or my response to it, a
violation of the current CoC ?

Francisco Olarte.



Re: Code of Conduct

2018-09-19 Thread Fred Pratt
Sorry, I emailed using my company account and thus the long sig.  In an effort
to avoid further insulting Mr Olarte, I will delete it this time.  See,
self-policing works!

Fred




Re: PostgreSQL 11 {Beta 4, RC1} Release: 2018-09-20

2018-09-19 Thread Jonathan S. Katz
On 9/19/18 2:41 AM, Noah Misch wrote:
> On Tue, Sep 18, 2018 at 12:15:45PM -0400, Jonathan S. Katz wrote:
>> On 9/13/18 12:51 PM, Jonathan S. Katz wrote:
>>> We are planning to have another release of PostgreSQL 11, either Beta 4
>>> or RC1, next week on Thursday, 2018-09-20. The version will be
>>> determined based on the state of the open items list[1] around the time
>>> of stamping.
>>
>> PostgreSQL 11 was stamped Beta 4[1] yesterday evening for release on
>> 2018-09-20.
>>
>> There is a copy of the Beta 4 release notes draft here[2]. I would
>> greatly appreciate any review and feedback on them, particularly around
>> technical accuracy and any omissions that should be included.
> 
> s/verifiy/verify/.  Looks good otherwise.

Fixed. Thanks for reviewing!

Jonathan



signature.asc
Description: OpenPGP digital signature


Re: Code of Conduct

2018-09-19 Thread Stephen Frost
Greetings,

* Francisco Olarte (fola...@peoplecall.com) wrote:
> I will happily pardon brevity ( although I would not call a ten line
> sig plus a huge bottom quote "breve", and AFAIK it means the same in
> english as in spanish ) and/or typos, but the "I am not responsible"
> feels nearly insulting. Did someone force you to use "this device" (
> which you seem to perceive as inadequate for a nice answer ) to reply,
> or did you choose to do it ? ( real, not rethoric question, but do not
> answer if you feel  its inadequate )

Let's please try to keep the off-topic discussion on these lists to a
minimum.

> As an aside, is this kind of afirmations and/or my response to it a
> violation of the current CoC ?

There's a way to find out the answer to that question, but it's
certainly not to send an email to this list asking about it.  Please
review the policy, and follow the process outlined there if you feel the
need to.

Thanks!

Stephen


signature.asc
Description: PGP signature


errmsg() ending with a period

2018-09-19 Thread Daniel Gustafsson
Happened to notice that two errmsg() strings end with a period, which IIUC is
against the Error Message Style Guide.  Fixed in the attached patch.

cheers ./daniel



errmsg_punctuation.patch
Description: Binary data


Re: Code of Conduct

2018-09-19 Thread Kevin Grittner
On Tue, Sep 18, 2018 at 5:28 AM Dave Page  wrote:
>
> The PostgreSQL Core team are pleased to announce that following a long 
> consultation process, the project’s Code of Conduct (CoC) has now been 
> finalised and published at https://www.postgresql.org/about/policies/coc/.
>
> Please take time to read and understand the CoC, which is intended to ensure 
> that PostgreSQL remains an open and enjoyable project for anyone to join and 
> participate in.
>
> A Code of Conduct Committee has been formed to handle any complaints. This 
> consists of the following volunteers:
>
> - Stacey Haysler (Chair)
> - Lætitia Avrot
> - Vik Fearing
> - Jonathan Katz
> - Ilya Kosmodemiansky
>
> We would like to extend our thanks and gratitude to Stacey Haysler for her 
> patience and expertise in helping develop the Code of Conduct, forming the 
> committee and guiding the work to completion.

My thanks to all who participated.

FWIW, my view is that a CoC shares one very important characteristic
with coding style guides: it's not as important what the details are
as that you have one and everyone pays attention to it.  I was in an
early PGCon meeting on the topic, and offered some opinions early in
the process, so many of you may remember that my view was to keep it
short and simple -- a wide net with broad mesh, and trust that with
competent application nothing would slip through.

My biggest concern about the current document is that it is hard to
make it from start to end, reading every word.  To check my
(admittedly subjective) impression, I put it through the free
"Readability Test Tool" at
https://www.webpagefx.com/tools/read-able/check.php (pasting the
document itself into the "TEST BY DIRECT INPUT" tab so that page
menus, footers, etc. were not included in the score), and got this:

"""
Test Results:
Your text has an average grade level of about 16. It should be easily
understood by 21 to 22 year olds.
"""

Now, on the whole that doesn't sound too bad, since the audience
should be mature and educated enough to deal with that, but it does
suggest that it might be a bit of a burden on some for whom English is
not their first language (unless we have translations?).

Further detail:

"""
Readability Indices

Flesch Kincaid Reading Ease 32.2
Flesch Kincaid Grade Level 15.2
Gunning Fog Score 18.3
SMOG Index 13.9
Coleman Liau Index 14.8
Automated Readability Index 16

Text Statistics

No. of sentences 65
No. of words 1681
No. of complex words 379
Percent of complex words 22.55%
Average words per sentence 25.86
Average syllables per word 1.75
"""

Note that the page mentions that the Flesch Kincaid Reading Ease score
is based on a 0-100 scale. A high score means the text is easier to
read. Low scores suggest the text is complicated to understand.  A
value between 60 and 80 should be easy for a 12 to 15 year old to
understand.  Our score was 32.2.

Perhaps in next year's review we could try to ease this a little.

In the meantime, I was very happy to see the so many new faces at
PostgresOpen SV 2018; maybe it's just a happy coincidence, but if this
effort had anything to do with drawing in more people, it was well
worth the effort!

Kevin Grittner

--
Kevin Grittner
VMware vCenter Server
https://www.vmware.com/



Re: fast default vs triggers

2018-09-19 Thread Andrew Dunstan




On 09/18/2018 03:36 PM, Andrew Dunstan wrote:



Tomas Vondra has pointed out to me that there's an issue with triggers 
not getting expanded tuples for columns with fast defaults. Here is an 
example that shows the issue:



   andrew=# create table blurfl (id int);
   CREATE TABLE
   andrew=# insert into blurfl select x from generate_series(1,5) x;
   INSERT 0 5
   andrew=# alter table blurfl add column x int default 100;
   ALTER TABLE
   andrew=# create or replace function showmej() returns trigger
   language plpgsql as $$ declare j json; begin j := to_json(old);
   raise notice 'old x: %', j->>'x'; return new; end; $$;
   CREATE FUNCTION
   andrew=# create trigger show_x before update on blurfl for each row
   execute procedure showmej();
   CREATE TRIGGER
   andrew=# update blurfl set id = id where id = 1;
   NOTICE:  old x: 
   UPDATE 1
   andrew=# update blurfl set id = id where id = 1;
   NOTICE:  old x: 100
   UPDATE 1
   andrew=#


The error is fixed with this patch:


   diff --git a/src/backend/commands/trigger.c b/src/backend/commands/trigger.c
   index 2436692..f34a72a 100644
   --- a/src/backend/commands/trigger.c
   +++ b/src/backend/commands/trigger.c
   @@ -3396,7 +3396,11 @@ ltrmark:;
            LockBuffer(buffer, BUFFER_LOCK_UNLOCK);
        }

   -    result = heap_copytuple(&tuple);
   +    if (HeapTupleHeaderGetNatts(tuple.t_data) < relation->rd_att->natts)
   +        result = heap_expand_tuple(&tuple, relation->rd_att);
   +    else
   +        result = heap_copytuple(&tuple);
   +
        ReleaseBuffer(buffer);

        return result;

I'm going to re-check the various places that this might have been 
missed. I guess it belongs on the open items list.








I haven't found anything further that is misbehaving. I paid particular 
attention to indexes and index-only scans.


I propose to commit this along with an appropriate regression test.

cheers

andrew

--
Andrew Dunstan  https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: Code of Conduct

2018-09-19 Thread Andrew Dunstan




On 09/19/2018 04:27 PM, Kevin Grittner wrote:

On Tue, Sep 18, 2018 at 5:28 AM Dave Page  wrote:

The PostgreSQL Core team are pleased to announce that following a long 
consultation process, the project’s Code of Conduct (CoC) has now been 
finalised and published at https://www.postgresql.org/about/policies/coc/.

Please take time to read and understand the CoC, which is intended to ensure 
that PostgreSQL remains an open and enjoyable project for anyone to join and 
participate in.

A Code of Conduct Committee has been formed to handle any complaints. This 
consists of the following volunteers:

- Stacey Haysler (Chair)
- Lætitia Avrot
- Vik Fearing
- Jonathan Katz
- Ilya Kosmodemiansky

We would like to extend our thanks and gratitude to Stacey Haysler for her 
patience and expertise in helping develop the Code of Conduct, forming the 
committee and guiding the work to completion.

My thanks to all who participated.



Indeed, many thanks.

[...]

In the meantime, I was very happy to see the so many new faces at
PostgresOpen SV 2018; maybe it's just a happy coincidence, but if this
effort had anything to do with drawing in more people, it was well
worth the effort!




Yeah. The crowd also seemed noticeably more diverse than I have usually 
seen at Postgres conferences. That's a small beginning, but it's a 
welcome development.


cheers

andrew


--
Andrew Dunstan  https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: errmsg() ending with a period

2018-09-19 Thread Tom Lane
Daniel Gustafsson  writes:
> Happened to notice that two errmsg() strings end with a period, which IIUC is
> against the Error Message Style Guide.  Fixed in the attached patch.

Yup.  Pushed, thanks.

regards, tom lane



Re: Progress reporting for pg_verify_checksums

2018-09-19 Thread Michael Banck
Hi,

thanks for the review!

On Wed, Sep 19, 2018 at 05:17:05PM +0200, Fabien COELHO wrote:
> >>This optionally prints the progress of pg_verify_checksums via read
> >>kilobytes to the terminal with the new command line argument -P.
> >>
> >>If -P was forgotten and pg_verify_checksums operates on a large cluster,
> >>the caller can send SIGUSR1 to pg_verify_checksums to turn progress
> >>status reporting on during runtime.
> >
> >Version 2 of the patch is attached. This is rebased to master as of
> >422952ee and changes the signal handling to be a toggle as suggested by
> >Alvaro.
> 
> Patch applies cleanly and compiles.
> 
> About tests: "make check" is okay, but ISTM that the command is not started
> once, ever, in any test:-( It is unclear whether any test triggers checksums
> anyway...

Yeah, this was discussed on another thread and some basic tap tests for
pg_verify_checksums were proposed (also by me), but this hasn't been
further addressed.

Personally, I think this would be a good thing to add to v11 even.

In any case, this particular feature might not be very easy to tap test,
but I am open to suggestions, of course.

> The time() granularity to the second makes the result awkward on small
> tests:
> 
>  8/1554552 kB (0%, 8 kB/s)
>  1040856/1554552 kB (66%, 1040856 kB/s)
>  1554552/1554552 kB (100%, 1554552 kB/s)
> 
> I'd suggest to reuse the "portability/instr_time.h" stuff which allows a
> much better precision.
> 
> The "kB" kilo-byte symbols stands for 1000 bytes, but you are displaying
> 1024-bytes kilos, which symbol is either "KiB" or "KB". Given that we are
> about storage which is usually measured in powers of 1,000, so I'd suggest
> to use thousands.
> 
> Another reserve I have is that on any hardware it is likely that there will
> be a lot of kilos, so maybe megas (MB) should be used instead.

The display is exactly the same as in pg_basebackup (except the
bandwidth is shown as well), so right now I think it is more useful to be
consistent here.  So if we change that, pg_basebackup should be changed
as well I think.

Maybe the code could be factored out to some common file in the future.
 
> I'm wondering whether the bandwidth display should be local (from the last
> display) or global (from the start of the command), but for the last forced
> one. This is an open question.


 
> Maybe it would be nice to show elapsed time and expected completion time
> based on the current speed.

Maybe; this could be added to the factored out common code I mentioned
above.
 
> I would be in favor or having this turned on by default, and silenced with
> some option. I'm not sure other people would agree, though, so it is an open
> question as well.

If this runs in a script, it would be pretty annoying, since everybody
would have to add --no-progress, so I am not convinced. Also, both
pg_basebackup and pgbench don't show progress by default.
 
> About the code:
> 
>  +   if (show_progress)
>  +   show_progress = false;
>  +   else
>  +   show_progress = true;
> 
> Why not a simple:
> 
>   /* invert current state */
>   show_progress = !show_progress;

Fair enough.
 
> I do not see much advantage in using intermediate string buffers for values:
> 
>  + snprintf(totalstr, sizeof(totalstr), INT64_FORMAT,
>  +  total_size / 1024);
>  + snprintf(currentstr, sizeof(currentstr), INT64_FORMAT,
>  +  current_size / 1024);
>  + snprintf(currspeedstr, sizeof(currspeedstr), INT64_FORMAT,
>  +  (current_size / 1024) / (((time(NULL) - scan_started) == 0) ? 1 : 
> (time(NULL) - scan_started)));
>  + fprintf(stderr, "%s/%s kB (%d%%, %s kB/s)",
>  +  currentstr, totalstr, total_percent, currspeedstr);
> 
> ISTM that just one simple fprintf would be fine:
> 
>   fprintf(stderr, INT64_FORMAT "/" INT64_FORMAT " ...",
>   formulas for each slot...);

That code is adopted from pg_basebackup.c and the comment there says:

| * Separate step to keep platform-dependent format code out of
| * translatable strings.  And we only test for INT64_FORMAT
| * availability in snprintf, not fprintf.

So that sounds legit to me.

> ISTM that this line overwriting trick deserves a comment:
> 
>  + if (isatty(fileno(stderr)))
>  +   fprintf(stderr, "\r");
>  + else
>  +   fprintf(stderr, "\n");

I agree a comment should be in there. But that code is also taken
verbatim from pg_basebackup.c (but this time, there's no comment there,
either).

How about this:

| * If we are reporting to a terminal, send a carriage return so that
| * we stay on the same line.  If not, send a newline. 
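Put together, the reporting step under discussion — intermediate snprintf buffers for the int64 values, plus the terminal check — might be sketched like this; names and buffer sizes are illustrative, and %PRId64 stands in for INT64_FORMAT:

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>

/* Sketch of a pg_basebackup-style progress line.  The int64 values are
 * rendered into intermediate buffers with snprintf, keeping the
 * platform-dependent format codes out of the translatable fprintf
 * string. */
void
report_progress(int64_t current_size, int64_t total_size,
                time_t scan_started, FILE *out)
{
    char        totalstr[32];
    char        currentstr[32];
    char        currspeedstr[32];
    time_t      elapsed = time(NULL) - scan_started;
    int         total_percent;

    total_percent = (total_size > 0) ?
        (int) (current_size * 100 / total_size) : 0;

    snprintf(totalstr, sizeof(totalstr), "%" PRId64, total_size / 1024);
    snprintf(currentstr, sizeof(currentstr), "%" PRId64,
             current_size / 1024);
    snprintf(currspeedstr, sizeof(currspeedstr), "%" PRId64,
             (current_size / 1024) / (elapsed == 0 ? 1 : elapsed));

    fprintf(out, "%s/%s kB (%d%%, %s kB/s)",
            currentstr, totalstr, total_percent, currspeedstr);

    /*
     * If we are reporting to a terminal, send a carriage return so that
     * we stay on the same line; otherwise send a newline.
     */
    fprintf(out, isatty(fileno(out)) ? "\r" : "\n");
}
```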

> And it runs a little amok with "-v".

Do you suggest we should make those mutually exclusive?  I agree that in
general, -P is not very useful if -v is on, but if you have a really big
table, it should still be, no?

>  + memset(&act, '\0', sizeof(act));
> 
> pg uses 0 instead of '\0' everywhere else.

Ok.
 
>   +   /* Make sure we just 

Re: PATCH: pgbench - option to build using ppoll() for larger connection counts

2018-09-19 Thread Tom Lane
"Rady, Doug"  writes:
> This patch enables building pgbench to use ppoll() instead of select()
> to allow for more than (FD_SETSIZE - 10) connections.  As implemented,
> when using ppoll(), the only connection limitation is system resources.

So ... why exactly is this patch insisting on ppoll() rather than just
plain poll()?  AFAICS, all you're doing with that is being able to
specify the timeout in microsec not millisec, which does not really
justify taking much of a hit in portability, to my mind.

> “… ppoll() is to poll() as pselect() is to select().  Since the
> existing code uses select(), why not convert to poll() rather than
> ppoll()?”

A moment's glance at our git history will remind you that we attempted
to start using pselect() last year, and the attempt crashed and burned:


Author: Tom Lane 
Branch: master Release: REL_10_BR [64925603c] 2017-04-24 18:29:03 -0400

Revert "Use pselect(2) not select(2), if available, to wait in postmaster's 
loop."

This reverts commit 81069a9efc5a374dd39874a161f456f0fb3afba4.

Buildfarm results suggest that some platforms have versions of pselect(2)
that are not merely non-atomic, but flat out non-functional.  Revert the
use-pselect patch to confirm this diagnosis (and exclude the no-SA_RESTART
patch as the source of trouble).  If it's so, we should probably look into
blacklisting specific platforms that have broken pselect.

Discussion: https://postgr.es/m/9696.1493072...@sss.pgh.pa.us


Now, it might be that ppoll doesn't suffer from as many portability
problems as pselect, but I don't see a good reason to assume that.
So I'd rather see if we can't convert this to use poll(), and thereby
ensure that it works basically everywhere.
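A minimal poll()-based wait along those lines might be sketched like this (illustrative only, not the actual pgbench code): poll() takes an array of struct pollfd, is not limited by FD_SETSIZE, and takes its timeout in milliseconds.

```c
#include <poll.h>
#include <stdlib.h>

/* Hypothetical sketch of waiting for input on an arbitrary number of
 * client sockets with poll() instead of select().  Returns poll()'s
 * result: >0 sockets ready, 0 timeout, -1 error. */
int
wait_on_sockets(const int *socks, int nsocks, int timeout_ms)
{
    struct pollfd *fds;
    int         rc;
    int         i;

    if (nsocks == 0)
        return poll(NULL, 0, timeout_ms);   /* pure timeout wait */

    fds = malloc(sizeof(struct pollfd) * nsocks);
    if (fds == NULL)
        return -1;
    for (i = 0; i < nsocks; i++)
    {
        fds[i].fd = socks[i];
        fds[i].events = POLLIN;
        fds[i].revents = 0;
    }
    rc = poll(fds, nsocks, timeout_ms);
    free(fds);
    return rc;
}
```

The millisecond (rather than microsecond) timeout is the main functional difference from ppoll(), which is the portability trade-off discussed above.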

regards, tom lane



Re: [HACKERS] SERIALIZABLE with parallel query

2018-09-19 Thread Kevin Grittner
After reviewing the thread and the current two patches, I agree with
Masahiko Sawada, with one adjustment to the coding that I would
prefer: break out the majority of the ReleasePredicateLocks function
into a static ReleasePredicateLocksMain (or similar) function and
eliminate the goto.
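The suggested shape — purely illustrative, with invented signatures and a stand-in body — would move the re-entered code into a static helper, so the second pass becomes an ordinary call rather than a goto:

```c
#include <stdbool.h>

static int  release_calls = 0;      /* stand-in for the real cleanup work */

/* Hypothetical sketch of the suggested refactoring: the code the goto
 * used to jump back into becomes a static helper function. */
static void
ReleasePredicateLocksMain(bool isCommit)
{
    (void) isCommit;
    release_calls++;                /* ... the real function's body ... */
}

void
ReleasePredicateLocks(bool isCommit, bool partially_released)
{
    ReleasePredicateLocksMain(isCommit && !partially_released);

    /* formerly reached via "goto" for a second clean-up pass */
    if (partially_released)
        ReleasePredicateLocksMain(false);
}
```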

The optimization in patch 0002 is important.  Previous benchmarks
showed that a fairly straightforward pgbench test scaled as well as
REPEATABLE READ when it was present, but that performance fell off by
up to 20% as the scale increased without it.

I will spend a few more days in testing and review, but figured I
should pass along "first impressions" now.

On Tue, Jul 10, 2018 at 8:58 PM Masahiko Sawada  wrote:
>
> On Mon, Jul 2, 2018 at 3:12 PM, Masahiko Sawada  wrote:
> > On Fri, Jun 29, 2018 at 7:28 PM, Thomas Munro
> >  wrote:
> >> On Thu, Jun 28, 2018 at 7:55 PM, Masahiko Sawada  
> >> wrote:
> >>> I'd like to test and review this patches but they seem to conflict
> >>> with current HEAD. Could you please rebase them?
> >>
> >> Hi Sawada-san,
> >>
> >> Thanks!  Rebased and attached.  The only changes are: the LWLock
> >> tranche is now shown as "serializable_xact" instead of "sxact" (hmm,
> >> LWLock tranches have lower_case_names_with_underscores, but individual
> >> LWLocks have CamelCaseName...), and ShareSerializableXact() no longer
> >> does Assert(!IsParallelWorker()).  These changes are based on the last
> >> feedback from Robert.
> >>
> >
> > Thank you! Will look at patches.
> >
>
> > I looked at these patches. The latest patch builds without any
> > errors or warnings and passes all regression tests. I don't see
> > critical bugs but there are random comments.
>
> +   /*
> +* If the leader in a parallel query earler stashed a 
> partially
> +* released SERIALIZABLEXACT for final clean-up at end
> of transaction
> +* (because workers might still have been accessing
> it), then it's
> +* time to restore it.
> +*/
>
> There is a typo.
> s/earler/earlier/
>
> 
> Should we add test to check if write-skew[1] anomaly doesn't happen
> even in parallel mode?
>
> 
> - * We aren't acquiring lightweight locks for the predicate lock or lock
> + * We acquire an LWLock in the case of parallel mode, because worker
> + * backends have access to the leader's SERIALIZABLEXACT.  Otherwise,
> + * we aren't acquiring lightweight locks for the predicate lock or lock
>
> > There are both "LWLock" and "lightweight lock". Maybe it's better to
> > unify the spelling.
>
> 
> @@ -2231,9 +2234,12 @@ PrepareTransaction(void)
> /*
>  * Mark serializable transaction as complete for predicate locking
>  * purposes.  This should be done as late as we can put it and
> still allow
> -* errors to be raised for failure patterns found at commit.
> +* errors to be raised for failure patterns found at commit.
> This is not
> +* appropriate for parallel workers however, because we aren't
> committing
> +* the leader's transaction and its serializable state will live on.
>  */
> -   PreCommit_CheckForSerializationFailure();
> +   if (!IsParallelWorker())
> +   PreCommit_CheckForSerializationFailure();
>
> > This code assumes that parallel workers can prepare a transaction. Is
> that expected behavior of parallel query? There is an assertion
> !IsInParallelMode() at the beginning of that function though.
>
> 
> +/*
> + * If the transaction is committing, but it has been partially released
> + * already, then treat this as a roll back.  It was marked as rolled 
> back.
> + */
> +if (isCommit && SxactIsPartiallyReleased(MySerializableXact))
> +isCommit = false;
> +
>
> Isn't it better to add an assertion to check if
> MySerializableXact->flags has SXACT_FLAG_ROLLED_BACK flag for safety?
>
> [1] https://en.wikipedia.org/wiki/Snapshot_isolation#Definition
>
> Regards,
>
> --
> Masahiko Sawada
> NIPPON TELEGRAPH AND TELEPHONE CORPORATION
> NTT Open Source Software Center



-- 
Kevin Grittner
VMware vCenter Server
https://www.vmware.com/



Re: [HACKERS] Bug in to_timestamp().

2018-09-19 Thread Alexander Korotkov
On Wed, Sep 19, 2018 at 1:38 PM amul sul  wrote:
> On Wed, Sep 19, 2018 at 3:51 PM amul sul  wrote:
> >
> > On Wed, Sep 19, 2018 at 2:57 PM Alexander Korotkov
> [...]
> >
> > With this patch, to_date and to_timestamp behaving differently, see this:
> >
> > edb=# SELECT to_date('18 12 2011', 'xDDxMMx');
> >   to_date
> > 
> >  18-DEC-11 00:00:00
> > (1 row)
> >
> > edb=# SELECT to_timestamp('18 12 2011', 'xDDxMMx');
> >to_timestamp
> > ---
> >  08-DEC-11 00:00:00 -05:00  <=== Incorrect output.
> > (1 row)
> >
> Sorry, this was wrong info -- with this patch, I had some of my own trial changes applied.
>
> Both to_date and to_timestamp behave the same with your patch -- the
> wrong output; is that what we are expecting?
>
> postgres =# SELECT to_date('18 12 2011', 'xDDxMMx');
>   to_date
> 
>  2011-12-08
> (1 row)
>
> postgres =# SELECT to_timestamp('18 12 2011', 'xDDxMMx');
>   to_timestamp
> 
>  2011-12-08 00:00:00-05
> (1 row)

It's hard to understand whether it was expected, because it wasn't
properly documented.  More importantly, it's the same behavior we had
before cf984672, and the purpose of cf984672 was not to change this.

But from the code comments, it's intentional.  If you put digits or
text characters into the format string, you can skip non-separator input
string characters.  For instance, you may do:

# SELECT to_date('y18y12y2011', 'xDDxMMx');
  to_date

 2011-12-18
(1 row)

It's even more interesting that letters and digits are handled in
a different manner.

# SELECT to_date('01801202011', 'xDDxMMx');
ERROR:  date/time field value out of range: "01801202011"
Time: 0,453 ms

# SELECT to_date('01801202011', '9DD9MM9');
  to_date

 2011-12-18
(1 row)

So, letters in the format string don't allow you to extract fields at
particular positions of a digit sequence, but digits in the format string
do.  That's rather undocumented, but from the code you can tell that it's
intentional.  Thus, I think it would be nice to improve the documentation
here.  But I still propose to commit the attached patch to revert the
unintentional behavior change in cf984672.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: Progress reporting for pg_verify_checksums

2018-09-19 Thread Thomas Munro
On Tue, Sep 4, 2018 at 2:21 AM Alvaro Herrera  wrote:
> On 2018-Sep-01, Fabien COELHO wrote:
> > > If -P was forgotten and pg_verify_checksums operates on a large cluster,
> > > the caller can send SIGUSR1 to pg_verify_checksums to turn progress
> > > status reporting on during runtime.
> >
> > Hmmm. I cannot say I like the signal feature much. Would it make sense for
> > the progress to be on by default, and to have a quiet option instead?
>
> Hmm, I recall this technique being used elsewhere, and it is sometimes
> useful.  Can't remember where though -- by manpages, it's not rsync nor
> pv ...

On BSD-derived systems including macOS, you can hit ^T to deliver
SIGINFO to the foreground process (much like ^C delivers SIGINT).
Many well behaved programs including rsync (at least on FreeBSD) then
show you some kind of progress or status report:

munro@asterix $ sleep 60
load: 0.11  cmd: sleep 62234 [nanslp] 1.22r 0.00u 0.00s 0% 2032k
sleep: about 58 second(s) left out of the original 60

I've contemplated various crackpot ideas about teaching psql to show
information about what my backend is doing when I hit ^T (perhaps via
a second connection, similar to the way it cancels queries?)...

There was an interesting thread[1] about dumping various things from
backends on (multiplexed) SIGUSR1.  Also if memory serves, JVMs also
dump internal stuff when you signal them, so you can confirm that they
really did eat all 64GB of your RAM.  But none of these things are
examples of enabling/disabling an on-going status reporting mode,
which is perhaps what you meant, they're more like one-off pokes to
dump information.

[1] 
https://www.postgresql.org/message-id/flat/20140623101501.GN16260%40awork2.anarazel.de

-- 
Thomas Munro
http://www.enterprisedb.com



Re: heap_sync seems rather oblivious to partitioned tables (wal_level=minimal)

2018-09-19 Thread Andres Freund
On 2018-09-19 12:06:47 +0900, Michael Paquier wrote:
> On Wed, Sep 19, 2018 at 01:14:10PM +1200, David Rowley wrote:
> > Wouldn't it be better to modify copy.c to just perform the heap_sync
> > on just the partitions it touches?
> 
> Yeah, my gut is telling me that this would be the best approach for now,
> still I am not sure that this is the best move in the long term.

ISTM heap_sync() would be the entirely wrong layer to handle
partitioning. For several reasons: 1) With pluggable storage, we want to
have multiple different table implementations, doing the syncing on the
heap_* for partitions would thus be wrong. 2) In just about all cases we
only want to sync a few partitions, there's not really a use-case for
doing syncs across all partitions imo.

> All the other callers of heap_sync don't care about partitioned
> tables, so we could add an assertion on RELKIND_PARTITIONED_TABLE.

Or rather, it should assert the expected relkinds?

Greetings,

Andres Freund



Re: [patch] Support LLVM 7

2018-09-19 Thread Andres Freund
Hi,

On 2018-09-16 09:48:34 +0200, Christoph Berg wrote:
> Re: To Andres Freund 2018-09-12 <20180912210734.gb5...@msg.df7cb.de>
> > I plan to switch postgresql-11.deb to LLVM 7 over the next days
> > because of the support for non-x86 architectures
> 
> I did an upload of postgresql-11 beta3 with llvm 7 enabled on the
> architectures where it is available (or supposed to become available),
> that is, on !alpha !hppa !hurd-i386 !ia64 !kfreebsd-amd64 !kfreebsd-i386 
> !m68k !sh4.

Cool.  No idea why kfreebsd is on that list, but that's not really a
Postgres-relevant concern...


> There are two failures:
> https://buildd.debian.org/status/logs.php?pkg=postgresql-11&ver=11~beta3-2
> 
> sparc64 fails with a lot of these in the log:
> 
> FATAL:  fatal llvm error: Invalid data was encountered while parsing the file

Hm, that sounds like a proper serious LLVM bug. If it can't read bitcode
files it wrote, something is seriously wrong.


> powerpc (the old 32-bit variant) has a lot of "server closed the
> connection unexpectedly" in the regression logs, and one SIGILL:
> 
> 2018-09-15 10:49:25.052 UTC [26458] LOG:  server process (PID 26527) was 
> terminated by signal 4: Illegal instruction
> 2018-09-15 10:49:25.052 UTC [26458] DETAIL:  Failed process was running: 
> SELECT '' AS tf_12, BOOLTBL1.*, BOOLTBL2.*
>  FROM BOOLTBL1, BOOLTBL2
>  WHERE BOOLTBL2.f1 <> BOOLTBL1.f1;
> 2018-09-15 10:49:25.052 UTC [26458] LOG:  terminating any other active server 
> processes

Hm. Is there any chance to get a backtrace for this one?  This could
also be a Postgres issue (e.g. generating code for the wrong
microarch), although I think that's less likely.


> Both smell more like LLVM bugs rather than PostgreSQL, so I guess we
> can dub that a success. I'll disable JIT on these architectures for
> the next upload.

Ok.

Greetings,

Andres Freund



Re: Query is over 2x slower with jit=on

2018-09-19 Thread Andres Freund
Hi,

On 2018-09-17 17:50:15 -0400, Tom Lane wrote:
> Amit Khandekar  writes:
> > On 11 September 2018 at 14:50, Amit Khandekar  
> > wrote:
> >> On 10 September 2018 at 21:39, Andres Freund  wrote:
> >>> On 2018-09-10 15:42:55 +0530, Amit Khandekar wrote:
>  I think we better show per-worker jit info also.
>
> Just to throw a contrarian opinion into this: I find the current EXPLAIN
> output for JIT to be insanely verbose already.

Hm, it'd have been nice to get that feedback a little bit earlier, I did
inquire...

Currently:

JIT:
  Functions: 2
  Generation Time: 0.680 ms
  Inlining: true
  Inlining Time: 7.591 ms
  Optimization: true
  Optimization Time: 20.522 ms
  Emission Time: 14.607 ms

How about making that:

JIT:
  Functions: 2
  Options: Inlining, Optimization
  Times (Total, Generation, Inlining, Optimization, Emission): 43.4 ms, 0.680 
ms, 7.591 ms, 20.522 ms, 14.607 ms

or something similar?

Greetings,

Andres Freund



SIGDANGER and oomd

2018-09-19 Thread Thomas Munro
Hello hackers,

On AIX, which I think might have been a pioneer in overcommitting
memory (?), they have the wonderfully named signal SIGDANGER.  It's
delivered to every process on the system as a final courtesy some time
before the OOM killer starts delivering SIGKILL.  The theory is that
you might be able to give back memory to avoid the carnage.

Most (all?) popular operating systems overcommit memory, and from time
to time someone proposes that they should have SIGDANGER too.  I've
often wondered if it makes sense or not as a concept: although you
might manage to avoid getting killed for a while, is your access
pattern sustainable?  On the other hand, maybe if you're less afraid
of the OOM you could use resources better.  I'm not sure, but perhaps
it's just too blunt an instrument.  Certainly it would be possible for
PostgreSQL to respond to SIGDANGER by dropping caches and perhaps even
old idle backends via a CHECK_FOR_INTERRUPTS() handler, and combined
with log monitoring so that you know you have a serious problem and
need to consider making changes so it doesn't happen again, I think
overall it's probably a good thing to support, because your system
might actually survive the storm.  This question was somewhat
hypothetical, because Linux doesn't have it, and I don't expect we
have too many AIX users.  Neither does FreeBSD, though I've been
tempted to propose it there.

Recently Facebook published oomd[1][2] for Linux.  I haven't tried it
(for one thing you currently need a patched kernel which is a barrier
to casual investigation) but basically it's a better OOM mousetrap
that runs in userspace.  It could be interesting for PostgreSQL
installations, because you can configure it to be smarter about what
to kill, and you can teach it to give you feedback and warnings about
pressure.  It may be that no modification to PostgreSQL is required at
all to make good use of this, or it may be that you'd want a new
signalling mechanism that we'd have to provide, amounting to a
home-made equivalent of SIGDANGER (or something richer).

I don't have a specific proposal, but thought this was newsworthy and
might give someone some ideas.

[1] https://github.com/facebookincubator/oomd
[2] 
https://code.fb.com/production-engineering/open-sourcing-oomd-a-new-approach-to-handling-ooms/

--
Thomas Munro
http://www.enterprisedb.com



Re: pgsql: Allow concurrent-safe open() and fopen() in frontend code for Wi

2018-09-19 Thread Michael Paquier
On Wed, Sep 19, 2018 at 11:55:01AM -0400, Tom Lane wrote:
> I'm OK with this approach.  I wonder though what happens if you take
> away the "#ifdef FRONTEND" and just enforce that one or the other mode
> is selected always.  That would seem like a sensible solution rather
> than a wart to me ...

Thanks, I have pushed the solution from Laurenz to maintain pure
compatibility.  The origin of the failures reported by pg_dump actually
comes from a mismatch with pg_hba.conf in the way users and/or databases
are parsed.  I am glad that we have tests for the full range of ASCII
characters in TAP, so that it is easy to detect regressions at early
stages.  We could try to manipulate more setmode calls like the one in
miscinit.c so that authentication gets the right call.
other side effects though...
--
Michael


signature.asc
Description: PGP signature


Re: heap_sync seems rather oblivious to partitioned tables (wal_level=minimal)

2018-09-19 Thread Michael Paquier
On Wed, Sep 19, 2018 at 02:48:58PM -0700, Andres Freund wrote:
> On 2018-09-19 12:06:47 +0900, Michael Paquier wrote:
>> Yeah, my gut is telling me that this would be the best approach for now,
>> still I am not sure that this is the best move in the long term.
> 
> ISTM heap_sync() would be the entirely wrong layer to handle
> partitioning. For several reasons: 1) With pluggable storage, we want to
> have multiple different table implementations, doing the syncing on the
> heap_* for partitions would thus be wrong. 2) In just about all cases we
> only want to sync a few partitions, there's not really a use-case for
> doing syncs across all partitions imo.

I haven't considered things from the angle of 1), which is a very good
point.  2) is also a good argument.

>> All the other callers of heap_sync don't care about partitioned
>> tables, so we could add an assertion on RELKIND_PARTITIONED_TABLE.
> 
> Or rather, it should assert the expected relkinds?

Yeah, I think that we are coming back to what heap_create assumes in
terms of which relkinds should have storage, so perhaps a global macro
able to do the work would be appropriate?
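Such a macro — hypothetical here, with the relkind letters matching pg_class.relkind — might look like this, so that callers such as heap_sync could Assert(RELKIND_HAS_STORAGE(relkind)):

```c
/* Hypothetical sketch of a central macro deciding which relkinds have
 * on-disk storage, mirroring what heap_create assumes.  The letters
 * are the pg_class.relkind values. */
#define RELKIND_RELATION          'r'
#define RELKIND_INDEX             'i'
#define RELKIND_SEQUENCE          'S'
#define RELKIND_TOASTVALUE        't'
#define RELKIND_VIEW              'v'
#define RELKIND_MATVIEW           'm'
#define RELKIND_FOREIGN_TABLE     'f'
#define RELKIND_PARTITIONED_TABLE 'p'

/* Views, foreign tables, and partitioned tables have no storage. */
#define RELKIND_HAS_STORAGE(relkind) \
    ((relkind) == RELKIND_RELATION || \
     (relkind) == RELKIND_INDEX || \
     (relkind) == RELKIND_SEQUENCE || \
     (relkind) == RELKIND_TOASTVALUE || \
     (relkind) == RELKIND_MATVIEW)
```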
--
Michael


signature.asc
Description: PGP signature


vary read_only in SPI calls? or poke at the on-entry snapshot?

2018-09-19 Thread Chapman Flack
SOME BACKGROUND:

The code for PL/Java functions resides in 'class images'. These are made
available, in advance, by loading them from 'jar files' into some tables
in the database.

When a PL/Java function is called, if the JVM hasn't loaded its class image
yet, the JVM calls PL/Java's loader, which does an SPI query to retrieve
the image. This SPI query, only needed on the first reference, is
effectively part of the activity of the PL/Java function that has been
called. Among other things, that means the read_only SPI parameter is
passed as true if the PL/Java function in question is declared IMMUTABLE.

(A purist reaction to this might be to say no PL/Java function ever deserves
to be declared IMMUTABLE then, as the first call can generate a database
query to fetch the function code. But it has been that way a long time,
and as a practical matter, I'm not sure how much less "immutable" that
is than whatever happens to load/map/retrieve the needed code in other PLs.
It still strikes me as reasonable to declare a function immutable as long as
that's a fair description of what the function itself does.)

A headache does result in one corner case though.

THE HEADACHE:

The operation of making class images available in advance, from a jar file,
is done by the install_jar() function. The function reads images from the
jar file and stores them in the database. It can then also execute SQL
commands contained in the file. These can be declarations of the functions
and types and any other actions appropriate when installing the jar. They
can include queries that use the types and functions that just got declared.

A query that uses a newly-declared type will illustrate the headache,
because it results in a nested call into PL/Java for the type I/O
function, which is (typically) declared immutable. If it's the first
use of the class, the JVM will need to call the loader. The loader does
an SPI query to retrieve the image. Because this takes place 'during'
a call to an 'immutable' function, read_only => true is passed to SPI.
One effect of read_only => true in SPI is to rely on a snapshot from
the very start of the operation---when the jar file just loaded wasn't
there yet! Therefore, the loader fails to find the class.

So, this headache only affects references to the classes in a jar from
the install commands contained in the same jar, and only when some of
those install commands refer to immutable functions, and only when such
commands occur before any call to a stable or volatile function that
happens to be defined in the same class.

It can be worked around by reordering commands in the jar or by adding
a no-op function declared stable or volatile and adding an early query
that calls it. But these are hacks, and I would like to address the root
issue so hacks aren't necessary.

PASS read_only => false FROM THE CLASS LOADER?

One obvious approach would be to just arrange for read_only always to
be false in SPI queries originating from the class loader, regardless
of the volatility declaration on the function on whose behalf the loader
is operating.

As to the convention of setting read_only based on the function's
volatility, the manual says "While authors of C functions are able to
violate this convention, it's unlikely to be a good idea to do so."

Maybe this is an example where an exception is reasonable. So, a
function declared immutable would generate read_only SPI queries for
all of its normal behavior, but its first call might first generate
a non-read_only query from the class loader, followed by read_only
ones doing the function's own business.
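In server-side terms, the loader's lookup would amount to something like the following sketch (the SPI_execute call is real; the surrounding function and query are hypothetical, and this fragment only compiles inside a PostgreSQL backend):

```
#include "executor/spi.h"

/* Hypothetical sketch: the class loader forces read_only = false for
 * its lookup query, regardless of the calling function's volatility.
 * SPI_execute's second argument is read_only; passing false makes the
 * query take a fresh snapshot that can see the just-installed jar. */
static int
load_class_image(const char *query)
{
    int     rc;

    SPI_connect();
    rc = SPI_execute(query, false, 0);  /* read_only = false, no row limit */
    SPI_finish();
    return rc;
}
```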

SOMEHOW ADVANCE THE ON-ENTRY SNAPSHOT DURING install_jar?

An alternative could be to make the install_jar function smarter,
since it is only during the operation of install_jar that there is
any issue here. The function does two things in sequence:

1. Load the class images from the jar file
2. Execute the SQL commands in the file

install_jar has not much control or insight into step 2; it is just
reading query text and passing it to SPI, and we'll see the problem
if one of those commands refers to a not-yet-loaded class from step 1.

But install_jar knows exactly what it's doing in step 1. It populates
the tables of class images, using ordinary, non-read_only SPI calls.
It could even CommandCounterIncrement the snapshot those calls are
using---only that's futile, because that snapshot gets popped at the
end of step 1, leaving the former ActiveSnapshot, where the counter
is unchanged.

Any SPI calls made in step 2 will either get a new snapshot (in the
read_only false case), or rely on the current ActiveSnapshot (in the
read_only true case), which is the one that's too old to succeed.

Would it be unprecedented / be unreasonable / break anything for the
install_jar function to simply force a CommandCounterIncrement
at the end of step 1 (after its temporary snapshot has been popped,
so the former/on-entry ActiveSnapshot gets the increment)?

This really strike

Re: [HACKERS] Bug in to_timestamp().

2018-09-19 Thread amul sul
On Thu, Sep 20, 2018, 3:22 AM Alexander Korotkov 
wrote:

> On Wed, Sep 19, 2018 at 1:38 PM amul sul  wrote:
> > On Wed, Sep 19, 2018 at 3:51 PM amul sul  wrote:
> > >
> > > On Wed, Sep 19, 2018 at 2:57 PM Alexander Korotkov
> > [...]
> > >
> > > With this patch, to_date and to_timestamp behaving differently, see
> this:
> > >
> > > edb=# SELECT to_date('18 12 2011', 'xDDxMMx');
> > >   to_date
> > > 
> > >  18-DEC-11 00:00:00
> > > (1 row)
> > >
> > > edb=# SELECT to_timestamp('18 12 2011', 'xDDxMMx');
> > >to_timestamp
> > > ---
> > >  08-DEC-11 00:00:00 -05:00  <=== Incorrect output.
> > > (1 row)
> > >
> > Sorry, this was wrong info -- with this patch, I had some of my own
> > trial changes applied.
> >
> > Both to_date and to_timestamp behave the same with your patch -- the
> > wrong output; is that what we are expecting?
> >
> > postgres =# SELECT to_date('18 12 2011', 'xDDxMMx');
> >   to_date
> > 
> >  2011-12-08
> > (1 row)
> >
> > postgres =# SELECT to_timestamp('18 12 2011', 'xDDxMMx');
> >   to_timestamp
> > 
> >  2011-12-08 00:00:00-05
> > (1 row)
>
> It's hard to understand whether it was expected, because it wasn't
> properly documented.  More importantly, it's the same behavior we had
> before cf984672, and the purpose of cf984672 was not to change this.
>
> But from the code comments, it's intentional.  If you put digits or
> text characters into the format string, you can skip non-separator input
> string characters.  For instance, you may do:
>
> # SELECT to_date('y18y12y2011', 'xDDxMMx');
>   to_date
> 
>  2011-12-18
> (1 row)
>
> It's even more interesting that letters and digits are handled in
> a different manner.
>
> # SELECT to_date('01801202011', 'xDDxMMx');
> ERROR:  date/time field value out of range: "01801202011"
> Time: 0,453 ms
>
> # SELECT to_date('01801202011', '9DD9MM9');
>   to_date
> 
>  2011-12-18
> (1 row)
>
> So, letters in the format string don't allow you to extract fields at
> particular positions of a digit sequence, but digits in the format
> string do.  That's rather undocumented, but from the code you can tell
> that it's intentional.  Thus, I think it would be nice to improve the
> documentation here.  But I still propose to commit the attached patch
> to revert the unintentional behavior change in cf984672.
>

Agreed, thanks for working on this.

Regards,
Amul

>


Re: Code of Conduct plan

2018-09-19 Thread Craig Ringer
On Fri, 14 Sep 2018 at 23:11, James Keener  wrote:

> And if you believe strongly that a given statement you may have made is
>> not objectionable...you should be willing to defend it in an adjudication
>> investigation.
>
>
> So because someone doesn't like what I say in a venue 100% separate from
> postgres,  I have to subject myself, and waste my time, defending actions
> in this (and potentially other groups who would also adopt overly broad
> CoC) group.
>

(Usual disclaimer, I speak for myself not my employer here):

My understanding is that that's really only a concern for "Big Stuff".

If we have a committer who loudly and proudly goes to neo-nazi rallies or
pickup artist / pro-rape meetups, then actually yes, I have a problem with
that. That impacts my ability to work in the community, impacts everyone's
ability to recruit people to work on Postgres, potentially makes people
reluctant to engage with the community, etc.

Thankfully we don't.

I'm not sure how to codify it more clearly, though, and to a large degree I
think it's a case of presuming good intent and good will amongst all
parties.

It's clear that if the CoC leans too far, there'll certainly be no shortage
of proud defenders of liberty and free speech coming out of the woodwork,
right? (But remember, freedom of speech doesn't mean freedom from
consequences, even in nations that codify the concept of freedom of speech
at all. You shouldn't face Government sanction for it, but your peers can
still ostracise you, you can still get fired, etc.)

One of the biggest drivers of plea-bargains for innocent people in the US
> justice system is the expense of having to defend yourself. I find that to
> be a travesty; why are we duplicating that at a smaller level?
>

Because the fact that it is at a smaller level makes it way less of a
concern. No expensive lawyers. More likely we waste a lot of hot air. Like
this mail, probably.

There are intangible but very real (IMO) costs to being a community that
welcomes an unhealthy and hostile communication style, harassment and
personal attacks in the guise of technical argument, bullying defended as
making sure you have the right stuff to survive in a "meritocracy", etc.
Thankfully we are generally not such a community. But try asking a few
women you know in the Postgres community - if you can find any! - how their
experience at conferences has been. Then ask if maybe there are still a few
things we could work on changing.

I've found it quite confronting dealing with some of the more heated
exchanges on hackers from some of our most prominent team members. I've
sent the occasional gentle note to ask someone to chill and pause before
replying, too. And I've deserved to receive one a couple of times, though I
never have, as I'm far from free from blame here.

People love to point to LKML as the way it "must" be done to succeed in
software. Yet slowly that community has also come to recognise that verbal
abuse under the cloak of technical discussion is harmful to quality
discussion and drives out good people, harming the community long term.
Sure, not everything has to be super-diplomatic, but there's no excuse for
verbal bullying and wilful use of verbal aggression either. As widely
publicised, even Linus has recently recognised aspects of this, despite
being the poster child of proponents of abusive leadership for decades.

We don't have a culture like that. So in practice, I don't imagine the CoC
will see much use. The real problematic stuff that happens in this
community happens in conference halls and occasionally by private mail,
usually in the face of a power imbalance that makes the recipient/victim
reluctant to speak out. I hope a formal CoC will give them some hope
they'll be heard if they do take the personal risk to speak up. I've seen
so much victim blaming in tech that I'm not convinced most people
experiencing problems will be willing to speak out anyway, but hopefully
they'll be more so with a private and receptive group to talk to.

Let me be clear here, I'm no fan of trial by rabid mob. That's part of why
something like the CoC and a backing body is important. Otherwise people
are often forced to silently endure, or go loudly public. The latter tends
to result in a big messy explosion that hurts the community, those saying
they're victim(s) and the alleged perpetrator(s), no matter what the facts
and outcomes. It also encourages people to jump on one comment and run way
too far with it, instead of looking at patterns and giving people chances
to fix their behaviour.

I don't want us to have this:
https://techcrunch.com/2013/03/21/a-dongle-joke-that-spiraled-way-out-of-control/
. Which is actually why I favour a CoC, one with a resolution process and
encouragement toward some common sense. Every player in that story was an
idiot, and while none deserved the abuse and harrassment that came their
way, it's a shame it wasn't handled by a complaint to a conference CoC group
instead.

I'd like

Re: Query is over 2x slower with jit=on

2018-09-19 Thread Tom Lane
Andres Freund  writes:
> On 2018-09-17 17:50:15 -0400, Tom Lane wrote:
>> Just to throw a contrarian opinion into this: I find the current EXPLAIN
>> output for JIT to be insanely verbose already.

> Hm, it'd have been nice to get that feedback a little bit earlier, I did
> inquire...

> Currently:

> JIT:
>   Functions: 2
>   Generation Time: 0.680 ms
>   Inlining: true
>   Inlining Time: 7.591 ms
>   Optimization: true
>   Optimization Time: 20.522 ms
>   Emission Time: 14.607 ms

Just to clarify, that seems perfectly fine for the "machine readable"
output formats.  I'd just like fewer lines in the "human readable"
output.

> How about making that:

> JIT:
>   Functions: 2
>   Options: Inlining, Optimization
>   Times (Total, Generation, Inlining, Optimization, Emission): 43.4 ms, 0.680 
> ms, 7.591 ms, 20.522 ms, 14.607 ms

> or something similar?

That's going in the right direction.  Personally I'd make the last line
more like

Times: generation 0.680 ms, inlining 7.591 ms, optimization 20.522 ms, 
emission 14.607 ms, total 43.4 ms

(total at the end seems more natural to me, YMMV).  Also, the "options"
format you suggest here seems a bit too biased towards binary on/off
options --- what happens when there's a three-way option?  So maybe that
line should be like

Options: inlining on, optimization on

though I'm less sure about that part.
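
Putting those two tweaks together (an "Options" line plus a single
"Times" line), the human-readable summary would look roughly like this,
reusing the figures quoted upthread:

    JIT:
      Functions: 2
      Options: inlining on, optimization on
      Times: generation 0.680 ms, inlining 7.591 ms, optimization 20.522 ms,
        emission 14.607 ms, total 43.4 ms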

regards, tom lane



Re: Query is over 2x slower with jit=on

2018-09-19 Thread Andres Freund
On 2018-09-19 23:26:52 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > On 2018-09-17 17:50:15 -0400, Tom Lane wrote:
> >> Just to throw a contrarian opinion into this: I find the current EXPLAIN
> >> output for JIT to be insanely verbose already.
> 
> > Hm, it'd have been nice to get that feedback a little bit earlier, I did
> > inquire...
> 
> > Currently:
> 
> > JIT:
> >   Functions: 2
> >   Generation Time: 0.680 ms
> >   Inlining: true
> >   Inlining Time: 7.591 ms
> >   Optimization: true
> >   Optimization Time: 20.522 ms
> >   Emission Time: 14.607 ms
> 
> Just to clarify, that seems perfectly fine for the "machine readable"
> output formats.  I'd just like fewer lines in the "human readable"
> output.

Yea, I do think that's a fair complaint.


> > How about making that:
> 
> > JIT:
> >   Functions: 2

FWIW, not that I want to do that now, but at some point it might make
sense to sub-divide this into things like number of "expressions",
"tuple deforming", "plans", ...  Just mentioning that if somebody wants
to comment on reformatting this as well, if we're tinkering anyway.


> >   Options: Inlining, Optimization
> >   Times (Total, Generation, Inlining, Optimization, Emission): 43.4 ms, 
> > 0.680 ms, 7.591 ms, 20.522 ms, 14.607 ms
> 
> > or something similar?
> 
> That's going in the right direction.  Personally I'd make the last line
> more like
> 
> Times: generation 0.680 ms, inlining 7.591 ms, optimization 20.522 ms, 
> emission 14.607 ms, total 43.4 ms

Yea, that's probably easier to read.


> (total at the end seems more natural to me, YMMV).

I kind of think doing it first is best, because that's usually the first
thing one wants to know.


> Also, the "options" format you suggest here seems a bit too biased
> towards binary on/off options --- what happens when there's a
> three-way option?  So maybe that line should be like
> 
> Options: inlining on, optimization on
> 
> though I'm less sure about that part.

I'm pretty certain you're right :).  There's already arguments around
making optimization more gradual (akin to O1,2,3).

Greetings,

Andres Freund



Re: pgsql: Allow concurrent-safe open() and fopen() in frontend code for Wi

2018-09-19 Thread Laurenz Albe
Michael Paquier wrote:
> Thanks, I have pushed the solution from Laurenz to maintain pure
> compatibility.

Thanks for all the work!

Yours,
Laurenz Albe




Re: Query is over 2x slower with jit=on

2018-09-19 Thread Pavel Stehule
On Thu, Sep 20, 2018 at 5:39 AM Andres Freund  wrote:

> On 2018-09-19 23:26:52 -0400, Tom Lane wrote:
> > Andres Freund  writes:
> > > On 2018-09-17 17:50:15 -0400, Tom Lane wrote:
> > >> Just to throw a contrarian opinion into this: I find the current
> EXPLAIN
> > >> output for JIT to be insanely verbose already.
> >
> > > Hm, it'd have been nice to get that feedback a little bit earlier, I
> did
> > > inquire...
> >
> > > Currently:
> >
> > > JIT:
> > >   Functions: 2
> > >   Generation Time: 0.680 ms
> > >   Inlining: true
> > >   Inlining Time: 7.591 ms
> > >   Optimization: true
> > >   Optimization Time: 20.522 ms
> > >   Emission Time: 14.607 ms
> >
> > Just to clarify, that seems perfectly fine for the "machine readable"
> > output formats.  I'd just like fewer lines in the "human readable"
> > output.
>
> Yea, I do think that's a fair complaint.
>
>
> > > How about making that:
> >
> > > JIT:
> > >   Functions: 2
>
> FWIW, not that I want to do that now, but at some point it might make
> sense to sub-divide this into things like number of "expressions",
> "tuple deforming", "plans", ...  Just mentioning that if somebody wants
> to comment on reformatting this as well, if we're tinkering anyway.
>
>
> > >   Options: Inlining, Optimization
> > >   Times (Total, Generation, Inlining, Optimization, Emission): 43.4
> ms, 0.680 ms, 7.591 ms, 20.522 ms, 14.607 ms
>

+1

Pavel


> >
> > > or something similar?
> >
> > That's going in the right direction.  Personally I'd make the last line
> > more like
> >
> > Times: generation 0.680 ms, inlining 7.591 ms, optimization 20.522
> ms, emission 14.607 ms, total 43.4 ms
>
> Yea, that's probably easier to read.
>
>
> > (total at the end seems more natural to me, YMMV).
>
> I kind of think doing it first is best, because that's usually the first
> thing one wants to know.
>
>
> > Also, the "options" format you suggest here seems a bit too biased
> > towards binary on/off options --- what happens when there's a
> > three-way option?  So maybe that line should be like
> >
> > Options: inlining on, optimization on
> >
> > though I'm less sure about that part.
>
> I'm pretty certain you're right :).  There's already arguments around
> making optimization more gradual (akin to O1,2,3).
>
> Greetings,
>
> Andres Freund
>
>


How to get active table within a transaction.

2018-09-19 Thread Hubert Zhang
Hi all,

I have a requirement to get the active table list in a child postmaster
process.

We are able to get the active table list after corresponding transaction
ends by using stat collector. Each postgres process will gather the table
change information locally, but only send the stat info to collector after
transaction end(become idle).

As an enhancement, we also want to get the active table while the
transaction inserting the table is in progress. Delay is acceptable.

Are there any existing ways in PG to support this?


-- 
Thanks

Hubert Zhang
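
For reference, the closest existing facility is the per-transaction
statistics view pg_stat_xact_user_tables: it exposes a backend's
not-yet-reported counters, but only to the backend running the
transaction itself, so it does not by itself solve the cross-process
case asked about above. A hedged sketch (table name "t" is hypothetical):

```sql
-- Inside a transaction that has modified a table, the same backend can
-- see its in-progress counters before they reach the stat collector:
BEGIN;
INSERT INTO t SELECT generate_series(1, 1000);
SELECT relname, n_tup_ins
FROM pg_stat_xact_user_tables
WHERE n_tup_ins > 0;   -- shows t's inserts before commit
COMMIT;
```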


Re: [WIP] [B-Tree] Retail IndexTuple deletion

2018-09-19 Thread Andrey Lepikhov
The next version of amtargetdelete() interface and its implementation 
for nbtree, devoted to the problem of retail indextuple deletion.
Uses 'v5-0001-Make-nbtree-indexes-have-unique-keys-in-tuples' patch [1] 
to delete all logical duplicates by one tree descent.


[1] 
https://www.postgresql.org/message-id/CAH2-WzkfK=jvhjkd17tldvsfb6psdutt5wyit8dg+-ufc+r...@mail.gmail.com


--
Andrey Lepikhov
Postgres Professional
https://postgrespro.com
The Russian Postgres Company
>From 99c9bc340a730a85328549165ae8023a0716e147 Mon Sep 17 00:00:00 2001
From: "Andrey V. Lepikhov" 
Date: Thu, 20 Sep 2018 07:56:18 +0500
Subject: [PATCH] Retail-IndexTuple-Deletion-Access-Method

---
 contrib/bloom/blutils.c  |   1 +
 src/backend/access/brin/brin.c   |   1 +
 src/backend/access/gin/ginutil.c |   1 +
 src/backend/access/gist/gist.c   |   1 +
 src/backend/access/hash/hash.c   |   1 +
 src/backend/access/index/indexam.c   |  15 +++
 src/backend/access/nbtree/nbtree.c   | 160 +++
 src/backend/access/spgist/spgutils.c |   1 +
 src/include/access/amapi.h   |   6 +
 src/include/access/genam.h   |  18 +++
 src/include/access/nbtree.h  |   4 +
 11 files changed, 209 insertions(+)

diff --git a/contrib/bloom/blutils.c b/contrib/bloom/blutils.c
index 6b2b9e3742..96f1d47c70 100644
--- a/contrib/bloom/blutils.c
+++ b/contrib/bloom/blutils.c
@@ -126,6 +126,7 @@ blhandler(PG_FUNCTION_ARGS)
 	amroutine->ambuild = blbuild;
 	amroutine->ambuildempty = blbuildempty;
 	amroutine->aminsert = blinsert;
+	amroutine->amtargetdelete = NULL;
 	amroutine->ambulkdelete = blbulkdelete;
 	amroutine->amvacuumcleanup = blvacuumcleanup;
 	amroutine->amcanreturn = NULL;
diff --git a/src/backend/access/brin/brin.c b/src/backend/access/brin/brin.c
index e95fbbcea7..a0e06bd868 100644
--- a/src/backend/access/brin/brin.c
+++ b/src/backend/access/brin/brin.c
@@ -103,6 +103,7 @@ brinhandler(PG_FUNCTION_ARGS)
 	amroutine->ambuild = brinbuild;
 	amroutine->ambuildempty = brinbuildempty;
 	amroutine->aminsert = brininsert;
+	amroutine->amtargetdelete = NULL;
 	amroutine->ambulkdelete = brinbulkdelete;
 	amroutine->amvacuumcleanup = brinvacuumcleanup;
 	amroutine->amcanreturn = NULL;
diff --git a/src/backend/access/gin/ginutil.c b/src/backend/access/gin/ginutil.c
index 0a32182dd7..acf14e7bec 100644
--- a/src/backend/access/gin/ginutil.c
+++ b/src/backend/access/gin/ginutil.c
@@ -58,6 +58,7 @@ ginhandler(PG_FUNCTION_ARGS)
 	amroutine->ambuild = ginbuild;
 	amroutine->ambuildempty = ginbuildempty;
 	amroutine->aminsert = gininsert;
+	amroutine->amtargetdelete = NULL;
 	amroutine->ambulkdelete = ginbulkdelete;
 	amroutine->amvacuumcleanup = ginvacuumcleanup;
 	amroutine->amcanreturn = NULL;
diff --git a/src/backend/access/gist/gist.c b/src/backend/access/gist/gist.c
index 8a42effdf7..d7a53d2ca9 100644
--- a/src/backend/access/gist/gist.c
+++ b/src/backend/access/gist/gist.c
@@ -80,6 +80,7 @@ gisthandler(PG_FUNCTION_ARGS)
 	amroutine->ambuild = gistbuild;
 	amroutine->ambuildempty = gistbuildempty;
 	amroutine->aminsert = gistinsert;
+	amroutine->amtargetdelete = NULL;
 	amroutine->ambulkdelete = gistbulkdelete;
 	amroutine->amvacuumcleanup = gistvacuumcleanup;
 	amroutine->amcanreturn = gistcanreturn;
diff --git a/src/backend/access/hash/hash.c b/src/backend/access/hash/hash.c
index 0002df30c0..5fb32d6eba 100644
--- a/src/backend/access/hash/hash.c
+++ b/src/backend/access/hash/hash.c
@@ -76,6 +76,7 @@ hashhandler(PG_FUNCTION_ARGS)
 	amroutine->ambuild = hashbuild;
 	amroutine->ambuildempty = hashbuildempty;
 	amroutine->aminsert = hashinsert;
+	amroutine->amtargetdelete = NULL;
 	amroutine->ambulkdelete = hashbulkdelete;
 	amroutine->amvacuumcleanup = hashvacuumcleanup;
 	amroutine->amcanreturn = NULL;
diff --git a/src/backend/access/index/indexam.c b/src/backend/access/index/indexam.c
index eade540ef5..20d9acddad 100644
--- a/src/backend/access/index/indexam.c
+++ b/src/backend/access/index/indexam.c
@@ -731,6 +731,21 @@ index_getbitmap(IndexScanDesc scan, TIDBitmap *bitmap)
 	return ntids;
 }
 
+IndexTargetDeleteResult *
+index_target_delete(IndexTargetDeleteInfo *info,
+	IndexTargetDeleteResult *stats,
+	Datum *values,
+	bool *isnull)
+{
+	Relation indexRelation = info->indexRelation;
+
+	RELATION_CHECKS;
+
+	CHECK_REL_PROCEDURE(amtargetdelete);
+
+	return indexRelation->rd_amroutine->amtargetdelete(info, stats, values, isnull);
+}
+
 /* 
  *		index_bulk_delete - do mass deletion of index entries
  *
diff --git a/src/backend/access/nbtree/nbtree.c b/src/backend/access/nbtree/nbtree.c
index e8725fbbe1..a5d8ad4aa2 100644
--- a/src/backend/access/nbtree/nbtree.c
+++ b/src/backend/access/nbtree/nbtree.c
@@ -127,6 +127,7 @@ bthandler(PG_FUNCTION_ARGS)
 	amroutine->ambuild = btbuild;
 	amroutine->ambuildempty = btbuildempty;
 	amroutine->aminsert = btinsert;
+	amroutine->amtargetdelete = bttargetdelete;
 	amroutine->ambulkdelete = btbulkdelete;
 	amroutine->amvacuumcleanup = btvacuumcleanup;

Re: infinite loop in parallel hash joins / DSA / get_best_segment

2018-09-19 Thread Thomas Munro
On Tue, Sep 18, 2018 at 11:35 AM Thomas Munro
 wrote:
> On Mon, Sep 17, 2018 at 9:12 PM Thomas Munro
>  wrote:
> > On Mon, Sep 17, 2018 at 10:42 AM Thomas Munro
> >  wrote:
> > > On Mon, Sep 17, 2018 at 10:38 AM Tomas Vondra
> > >  wrote:
> > > > While performing some benchmarks on REL_11_STABLE (at 55c2d9), I've
> > > > repeatedly hit an apparent infinite loop on TPC-H query 4. I don't know
> > > > what exactly are the triggering conditions, but the symptoms are these:
> >
> > With the attached draft patch, Tomas's benchmark script runs happily
> > for long periods.  A bit more study required with fresh eyes,
> > tomorrow.
>
> Here is a tidied up version that I propose to back-patch to 10 & 11.

Done.

I figured reviews would be unlikely to show up soon for a dsa.c patch,
but of course post-commit review is very welcome.  I thought it best
to have this fix in REL_11_STABLE being tested for as long as possible
before any potential RC1 tag, especially if Tomas plans to continue
his "big join" testing on that branch.

-- 
Thomas Munro
http://www.enterprisedb.com



Re: vary read_only in SPI calls? or poke at the on-entry snapshot?

2018-09-19 Thread Tom Lane
Chapman Flack  writes:
> Would it be unprecedented / be unreasonable / break anything for the
> install_jar function to simply force a CommandCounterIncrement
> at the end of step 1 (after its temporary snapshot has been popped,
> so the former/on-entry ActiveSnapshot gets the increment)?

The risk you take there is changing the behavior of calling function(s).

> DECISION TIME ...

> 1. fiddle the loader to always pass read_only => false to SPI calls,
>regardless of the volatility of the function it is loading for.
> 2. leave the loader alone, and adjust install_jar (an infrequent
>operation) to do something heretical with its on-entry snapshot.

I suspect #1 is less likely to have bad side-effects.  But I've not
done any careful analysis.

regards, tom lane



Re: Making all nbtree entries unique by having heap TIDs participate in comparisons

2018-09-19 Thread Andrey Lepikhov
I use the v5 version in the quick vacuum strategy and in the heap & index 
cleaner (see new patches at the corresponding thread a little bit later). It 
works fine and gives quick vacuum a 2-3% performance gain in comparison 
with version v3 on my 24-core test server.
Note that the interface of the _bt_moveright() and _bt_binsrch() functions, 
with their combination of scankey, scantid and nextkey parameters, is 
semantically overloaded.
Every time I read the code I have to spend time remembering what these 
functions do exactly.
Maybe the comments need rewriting. For example, the _bt_moveright() 
comments could include the phrase:
nextkey=false: Traverse to the next suitable index page if the current 
page does not contain the value (scan key; scan tid).


What do you think about submitting the patch to the next CF?

On 12.09.2018 at 23:11, Peter Geoghegan wrote:

Attached is v4. I have two goals in mind for this revision, goals that
are of great significance to the project as a whole:

* Making better choices around leaf page split points, in order to
maximize suffix truncation and thereby maximize fan-out. This is
important when there are mostly-distinct index tuples on each leaf
page (i.e. most of the time). Maximizing the effectiveness of suffix
truncation needs to be weighed against the existing/main
consideration: evenly distributing space among each half of a page
split. This is tricky.

* Not regressing the logic that lets us pack leaf pages full when
there are a great many logical duplicates. That is, I still want to
get the behavior I described on the '"Write amplification" is made
worse by "getting tired" while inserting into nbtree secondary
indexes' thread [1]. This is not something that happens as a
consequence of thinking about suffix truncation specifically, and
seems like a fairly distinct thing to me. It's actually a bit similar
to the rightmost 90/10 page split case.

v4 adds significant new logic to make us do better on the first goal,
without hurting the second goal. It's easy to regress one while
focussing on the other, so I've leaned on a custom test suite
throughout development. Previous versions mostly got the first goal
wrong, but got the second goal right. For the time being, I'm
focussing on index size, on the assumption that I'll be able to
demonstrate a nice improvement in throughput or latency later. I can
get the main TPC-C order_line pkey about 7% smaller after an initial
bulk load with the new logic added to get the first goal (note that
the benefits with a fresh CREATE INDEX are close to zero). The index
is significantly smaller, even though the internal page index tuples
can themselves never be any smaller due to alignment -- this is all
about not restricting what can go on each leaf page by too much. 7% is
not as dramatic as the "get tired" case, which saw something like a
50% decrease in bloat for one pathological case, but it's still
clearly well worth having. The order_line primary key is the largest
TPC-C index, and I'm merely doing a standard bulk load to get this 7%
shrinkage. The TPC-C order_line primary key happens to be kind of
adversarial or pathological to B-Tree space management in general, but
it's still fairly realistic.

For the first goal, page splits now weigh what I've called the
"distance" between tuples, with a view to getting the most
discriminating split point -- the leaf split point that maximizes the
effectiveness of suffix truncation, within a range of acceptable split
points (acceptable from the point of view of not implying a lopsided
page split). This is based on probing IndexTuple contents naively when
deciding on a split point, without regard for the underlying
opclass/types. We mostly just use char integer comparisons to probe,
on the assumption that that's a good enough proxy for using real
insertion scankey comparisons (only actual truncation goes to those
lengths, since that's a strict matter of correctness). This distance
business might be considered a bit iffy by some, so I want to get
early feedback. This new "distance" code clearly needs more work, but
I felt that I'd gone too long without posting a new version.

For the second goal, I've added a new macro that can be enabled for
debugging purposes. This has the implementation sort heap TIDs in ASC
order, rather than DESC order. This nicely demonstrates how my two
goals for v4 are fairly independent; uncommenting "#define
BTREE_ASC_HEAP_TID" will cause a huge regression with cases where many
duplicates are inserted, but won't regress things like the TPC-C
indexes. (Note that BTREE_ASC_HEAP_TID will break the regression
tests, though in a benign way that can safely be ignored.)

Open items:

* Do more traditional benchmarking.

* Add pg_upgrade support.

* Simplify _bt_findsplitloc() logic.

[1] 
https://postgr.es/m/cah2-wzmf0fvvhu+sszpgw4qe9t--j_dmxdx3it5jcdb8ff2...@mail.gmail.com



--
Andrey Lepikhov
Postgres Professional
https://postgrespro.com
The Russian Postgres Company



Re: Code of Conduct plan

2018-09-19 Thread Chris Travers
On Thu, Sep 20, 2018 at 5:11 AM Craig Ringer  wrote:

> On Fri, 14 Sep 2018 at 23:11, James Keener  wrote:
>
>> And if you believe strongly that a given statement you may have made is
>>> not objectionable...you should be willing to defend it in an adjudication
>>> investigation.
>>
>>
>> So because someone doesn't like what I say in a venue 100% separate from
>> postgres,  I have to subject myself, and waste my time, defending actions
>> in this (and potentially other groups who would also adopt overly broad
>> CoC) group.
>>
>
> (Usual disclaimer, I speak for myself not my employer here):
>
> My understanding is that that's really only a concern for "Big Stuff".
>
> If we have a committer who loudly and proudly goes to neo-nazi rallies or
> pickup artist / pro-rape meetups, then actually yes, I have a problem with
> that. That impacts my ability to work in the community, impacts everyone's
> ability to recruit people to work on Postgres, potentially makes people
> reluctant to engage with the community, etc.
>

There's a problem here, though. Generally in Europe, one would not be able
to fire a person or even discriminate against him for such activity.  So if
you kick someone out of the PostgreSQL community for doing such things in,
say, Germany, but their employer cannot fire them for the same, then you
have a real problem if improving PostgreSQL is the basis of their
employment.  EU antidiscrimination law includes political views and other
opinions, so that line is actually very hard to push in an international
project.  So I think you'd have a problem where such enforcement might
actually lead to legal action by the employer, or the individual kicked
out, or both.

If one of my reports were to come out in favor of the holocaust or Stalin's
purges, etc. I would not be allowed to use that as grounds to fire that
employee, even in Germany.  Now, if they communicated such aggressively at
work, I might.

This also highlights the problem of trying to enforce norms across global
projects.  My view simply is that we cannot.  There are probably some rare
cases even more extreme than this where enforcement globally might not be a
problem.

The goal of a code of conduct is to protect the community and this is
actually a hard problem which gets substantially harder as more cultures
and legal jurisdictions are included.  However there is also a topic of
global fairness.  Would we tolerate treating someone in, say, the US who
attended Neo-Nazi rallies worse than someone who attended right-wing
rallies in Europe?

So I think one has to go with the least common denominator in these areas and
this is also why this really isn't that much of a problem.  The CoC really
cannot be enforced in the way which a lot of people fear without serious
consequences for the community and so I trust it won't.


>
> Thankfully we don't.
>

Agreed on that.

>
> I'm not sure how to codify it more clearly, though, and to a large degree
> I think it's a case of presuming good intent and good will amongst all
> parties.
>

At the end, human judgment has to rule.


>
> It's clear that if the CoC leans too far, there'll certainly be no
> shortage of proud defenders of liberty and free speech coming out of the
> woodwork, right? (But remember, freedom of speech doesn't mean freedom from
> consequences, even in nations that codify the concept of freedom of speech
> at all. You shouldn't face Government sanction for it, but your peers can
> still ostracise you, you can still get fired, etc.)
>

One of the standard European values is freedom of political opinion and the
idea that there must be no economic consequences of merely having unpopular
political opinions.  However there may be time/manner/place restrictions on
expressing those.

For example, Mozilla Corporation could ask Brendan Eich to leave because
they are an American corporation and this is solely about the American
leadership.  Therefore they don't have to deal with European laws.  I don't
think the same applies to us and certainly if they were to fire a developer
in Germany for more abrasive political communications via Facebook
etc. they would have a lawsuit on their hands.

The freedom to a) hold political ideas without consequence, and b)
communicate them civilly without consequence is something that I find many
people in the US (and I assume Australia) find strange.


>
> One of the biggest drivers of plea-bargains for innocent people in the US
>> justice system is the expense of having to defend yourself. I find that to
>> be a travesty; why are we duplicating that at a smaller level?
>>
>
> Because the fact that it is at a smaller level makes it way less of a
> concern. No expensive lawyers. More likely we waste a lot of hot air. Like
> this mail, probably.
>
> There are intangible but very real (IMO) costs to being a community that
> welcomes an unhealthy and hostile communication style, harassment and
> personal attacks in the guise of technical argument, bullying

Re: PostgreSQL 11 {Beta 4, RC1} Release: 2018-09-20

2018-09-19 Thread Haroon
On Tue, 18 Sep 2018 at 21:15, Jonathan S. Katz  wrote:

> On 9/13/18 12:51 PM, Jonathan S. Katz wrote:
> > Hi,
> >
> > We are planning to have another release of PostgreSQL 11, either Beta 4
> > or RC1, next week on Thursday, 2018-09-20. The version will be
> > determined based on the state of the open items list[1] around the time
> > of stamping.
>
> PostgreSQL 11 was stamped Beta 4[1] yesterday evening for release on
> 2018-09-20.
>
> There is a copy of the Beta 4 release notes draft here[2]. I would
> greatly appreciate any review and feedback on them, particularly around
> technical accuracy and any omissions that should be included.
>
>
Under Upgrading to PostgreSQL 11 Beta 4 section:

s/you will to use a strategy/you will need to use a strategy/

Thanks!
>
> Jonathan
>
> [1]
>
> https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=3d65e406d1ea82060ad13a7bc41178ed22c599d1
> [2]
>
> https://git.postgresql.org/gitweb/?p=press.git;a=blob;f=releases/11/beta/11beta4.md
>
>
Regards,
Haroon
-- 
Haroon  http://www.2ndQuadrant.com/

PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: View to get all the extension control file details

2018-09-19 Thread Kyotaro HORIGUCHI
Hello.

At Mon, 17 Sep 2018 16:49:41 +1000, Haribabu Kommi  
wrote in 
> Hi Hackers,
> 
> Currently PostgreSQL provides following views to get the extension specific
> details
> 
> pg_available_extensions - Name, default_version, installed_version, comment
> 
> pg_available_extension_versions - Name, version, installed, superuser,
> relocatable, schema, requires, comment
> 
> But these misses the "directory", "module_pathname" and "encoding"
> extension specific informations and these are not available even with
> extension specific functions also. There are some extension that differs in
> extension name to library name. The pgpool_recovery extension library name
> is pgpool-recovery.so, '_' to '-'. While we are developing some tool on top
> of PostgreSQL, we found out this problem and it can be solved easily if the
> server expose the details that i have and got it from the extension control
> file.

Nowadays we tend to provide views for such files. However, while I'm
not a fan of checking extension packaging using such views, I
agree that it's good to have at least a function/view that shows
all available attributes of extensions. Are there any other items
not in the control files?

> Any opinion in adding a new view like "pg_available_extension_details" to
> display all extension control file columns? or Adding them to the existing
> view is good?

I felt it was a bit too noisy at first, but pg_settings does
something similar. So +1 from me to extending the existing
pg_available_extensions view, from the viewpoint of
consistency with other views with a similar objective.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Function to promote standby servers

2018-09-19 Thread Laurenz Albe
Providing SQL access for administrative tasks seems to be a
good thing, see ALTER SYSTEM and pg_reload_conf().

In that vein, I propose a function pg_promote() to promote
physical standby servers.

If there are no fundamental objections, I'll add it to the
next commitfest.

Yours,
Laurenz Albe
From dd18d6eb38168db4d3d8c99a74d06b39e719092e Mon Sep 17 00:00:00 2001
From: Laurenz Albe 
Date: Thu, 20 Sep 2018 07:52:28 +0200
Subject: [PATCH] Add pg_promote() to promote standby servers

---
 doc/src/sgml/func.sgml | 20 +++
 doc/src/sgml/high-availability.sgml|  2 +-
 doc/src/sgml/recovery-config.sgml  |  3 +-
 src/backend/access/transam/xlogfuncs.c | 48 ++
 src/include/catalog/pg_proc.dat|  4 +++
 5 files changed, 75 insertions(+), 2 deletions(-)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 4331bebc96..7beeaeacde 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -18596,6 +18596,9 @@ SELECT set_config('log_statement_stats', 'off', false);

 pg_terminate_backend

+   
+pg_promote
+   
 

 signal
@@ -18655,6 +18658,16 @@ SELECT set_config('log_statement_stats', 'off', false);
 however only superusers can terminate superuser backends.

   
+  
+   
+pg_promote()
+
+   boolean
+   Promote a physical standby server.  This function can only be
+called by superusers and will only have an effect when run on
+a standby server.
+   
+  
  
 

@@ -18692,6 +18705,13 @@ SELECT set_config('log_statement_stats', 'off', false);
 subprocess.

 
+   
+pg_promote() sends a signal to the server that causes it
+to leave standby mode.  Since sending signals is by nature asynchronous,
+successful execution of the function does not guarantee that the server has
+already been promoted.
+   
+
   
 
   
diff --git a/doc/src/sgml/high-availability.sgml b/doc/src/sgml/high-availability.sgml
index 8cb77f85ec..6014817d9e 100644
--- a/doc/src/sgml/high-availability.sgml
+++ b/doc/src/sgml/high-availability.sgml
@@ -1472,7 +1472,7 @@ synchronous_standby_names = 'ANY 2 (s1, s2, s3)'
 

 To trigger failover of a log-shipping standby server,
-run pg_ctl promote or create a trigger
+run pg_ctl promote, call pg_promote(), or create a trigger
 file with the file name and path specified by the trigger_file
 setting in recovery.conf. If you're planning to use
 pg_ctl promote to fail over, trigger_file is
diff --git a/doc/src/sgml/recovery-config.sgml b/doc/src/sgml/recovery-config.sgml
index 92825fdf19..d06cd0b08e 100644
--- a/doc/src/sgml/recovery-config.sgml
+++ b/doc/src/sgml/recovery-config.sgml
@@ -439,7 +439,8 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"'  # Windows
  
   Specifies a trigger file whose presence ends recovery in the
   standby.  Even if this value is not set, you can still promote
-  the standby using pg_ctl promote.
+  the standby using pg_ctl promote or calling
+  pg_promote().
   This setting has no effect if standby_mode is off.
  
 
diff --git a/src/backend/access/transam/xlogfuncs.c b/src/backend/access/transam/xlogfuncs.c
index 9731742978..9c55ef619b 100644
--- a/src/backend/access/transam/xlogfuncs.c
+++ b/src/backend/access/transam/xlogfuncs.c
@@ -35,6 +35,7 @@
 #include "storage/fd.h"
 #include "storage/ipc.h"
 
+#include 
 
 /*
  * Store label file and tablespace map during non-exclusive backups.
@@ -697,3 +698,50 @@ pg_backup_start_time(PG_FUNCTION_ARGS)
 
 	PG_RETURN_DATUM(xtime);
 }
+
+/*
+ * Promote a standby server.
+ *
+ * A result of "true" means that promotion has been initiated.
+ */
+Datum
+pg_promote(PG_FUNCTION_ARGS)
+{
+	FILE *promote_file;
+
+	if (!superuser())
+		ereport(ERROR,
+				(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
+				 errmsg("must be superuser to promote standby servers")));
+
+	if (!RecoveryInProgress() || !StandbyMode)
+		PG_RETURN_BOOL(false);
+
+	/* create the promote signal file */
+	promote_file = AllocateFile("promote", "w");
+	if (!promote_file)
+	{
+		ereport(WARNING,
+				(errmsg("could not create promote file: %m")));
+		PG_RETURN_BOOL(false);
+	}
+
+	if (FreeFile(promote_file))
+	{
+		/* probably unreachable, but it is better to be safe */
+		ereport(WARNING,
+				(errmsg("could not write promote file: %m")));
+		PG_RETURN_BOOL(false);
+	}
+
+	/* signal the postmaster */
+	if (kill(PostmasterPid, SIGUSR1) != 0)
+	{
+		ereport(WARNING,
+				(errmsg("could not send signal to postmaster: %m")));
+		(void) unlink("promote");
+		PG_RETURN_BOOL(false);
+	}
+
+	PG_RETURN_BOOL(true);
+}
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 860571440a..a26e5216da 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -5997,6 +5997,10 @@
   proname => 'pg_backup_start_time', provo

Re: Connection slots reserved for replication

2018-09-19 Thread Kyotaro HORIGUCHI
Hello.

I agree to the objective.

At Mon, 17 Sep 2018 14:25:56 +0200, Alexander Kukushkin  
wrote in 
> Hi,
> 
> 2018-09-14 12:23 GMT+02:00 Masahiko Sawada :
> 
> >> 2. If we know that this is neither superuser nor replication connection, we
> >> should check that there are at least (superuser_reserved_connections +
> >> NumWalSenders() - max_wal_senders) connection slots are available.
> >
> > You wanted to mean (superuser_reserved_connections + max_wal_senders -
> > NumWalSenders()) in the second point?
> 
> Sure, my bad. Did a mistake when writing an email, but in the attached
> file it looks good.
> 
> >
> > One argrable point of the second option could be that it breaks
> > backward compatibility of the parameter configurations. That is, the
> > existing systems need to re-configure the max_connections. So it might
> > be better to take the first option with
> > replication_reservd_connections = 0 by default.
> 
> Please find attached the new version of the patch, which introduces
> replication_reservd_connections GUC

With this patch, a non-superuser connection is rejected when fewer than
superuser_reserved_connections + (the remaining part of
replication_reserved_connections) slots are free. For walsenders this has
the same effect as simply sharing superuser_reserved_connections between
superusers and walsenders.

Instead, we can literally "reserve" connection slots for this specific
use by providing ProcGlobal->walsenderFreeProcs. Those slots are never
stolen for other usage. Once all of them are filled, additional
walsenders grab an available "normal" slot, so the behavior is the same
as now when walsender_reserved_connections = 0.

What do you think about this?
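The two schemes can be contrasted in a toy model. Function and parameter names below are mine, not the patch's, and this is only an admission-rule sketch, not the shared-memory bookkeeping itself:

```python
def accept_normal_backend_shared(free_slots, superuser_reserved,
                                 repl_reserved, active_walsenders):
    """Toy model of the posted patch: a normal backend is admitted only
    if enough free slots remain for superusers plus the not-yet-used
    part of the replication reservation."""
    still_reserved = superuser_reserved + max(repl_reserved - active_walsenders, 0)
    return free_slots > still_reserved

def accept_walsender_dedicated(walsender_free, normal_free, superuser_reserved):
    """Toy model of the alternative: walsenders draw from their own
    free list first, and fall back to a "normal" slot (respecting the
    superuser reservation) only once that list is exhausted."""
    if walsender_free > 0:
        return True, "reserved"
    if normal_free > superuser_reserved:
        return True, "normal"
    return False, "rejected"
```

In the first scheme a normal backend's admission depends on how many walsenders are currently connected; in the second, the reserved slots are simply invisible to normal backends.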

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Size and size_t in dsa API

2018-09-19 Thread Ideriha, Takeshi
Hi, 

I'm trying to use DSA API and confused a little bit about $subject.

Some type of return value or arguments are defined as size_t
but others are as Size.

Example: 
- dsa_area *dsa_create_in_place(void *place, size_t size,
int tranche_id, dsm_segment *segment)

- Size dsa_minimum_size(void)

I'm confused because Size and size_t are virtually equal, but
they are mixed in dsa.c. At first I thought there was some
convention for choosing between them depending on context.

But according to this thread, which seems not to have gone anywhere,
they are the same.
https://www.postgresql.org/message-id/CAEepm%3D1eA0vsgA7-2oigKzqg10YeXoPWiS-fCuQRDLwwmgMXag%40mail.gmail.com
 
Is anyone still working on this?
As a non-expert developer's opinion, I think the mixing of Size and size_t
makes the source code difficult to understand.

===
Takeshi Ideriha
Fujitsu Limited






Re: Postgres 11 release notes

2018-09-19 Thread Adrien Nayrat
On 8/25/18 11:24 PM, Peter Geoghegan wrote:
> On Sat, Aug 25, 2018 at 2:18 PM, Bruce Momjian  wrote:
>>> I think that's less "our" logic and more yours, that has become
>>> established because you've done most of the major release notes for a
>>> long time. I'm not trying to say that that's wrong or anything, just
>>
>> I don't do my work in a vacuum.  My behavior is based on what feedback I
>> have gotten from the community on what to include/exclude.
> 
> FWIW, I agree with what Andres said about planner changes and the release 
> notes.
> 

Hello,

Here is the patch that mentions 1f6d515a67. Note that I am not sure about my
explanation.


Thanks

-- 
Adrien

diff --combined doc/src/sgml/release-11.sgml
index 53b6fa339c,ae65431bbe..00
--- a/doc/src/sgml/release-11.sgml
+++ b/doc/src/sgml/release-11.sgml
@@@ -1056,18 -1056,6 +1056,18 @@@ same commits as abov
  

  
 +  
 +
 +
 +   
 +Push LIMIT through subqueries to underlying sort (if there is no filtering
 +predicate or SRF).
 +   
 +
 +  
 +
   
  
  


signature.asc
Description: OpenPGP digital signature


Re: [HACKERS] proposal: schema variables

2018-09-19 Thread Pavel Stehule
Hi

new update:

I fixed pg_restore, and I cleaned a code related to transaction processing

There should be a full functionality now.

Regards

Pavel


schema-variables-20180919-01.patch.gz
Description: application/gzip


Re: [HACKERS] Bug in to_timestamp().

2018-09-19 Thread Alexander Korotkov
On Tue, Sep 18, 2018 at 4:42 PM Alexander Korotkov
 wrote:
> But, I found related issue in cf9846724.  Before it was:
>
> # select to_timestamp('2018 01 01', '9MM9DD');
>   to_timestamp
> 
>  2018-01-01 00:00:00+03
> (1 row)
>
> But after it becomes so.
>
> # select to_timestamp('2018 01 01', '9MM9DD');
> ERROR:  invalid value "1 " for "MM"
> DETAIL:  Field requires 2 characters, but only 1 could be parsed.
> HINT:  If your source string is not fixed-width, try using the "FM" modifier.
>
> That happens because we've already skipped space "for free", and then
> NODE_TYPE_CHAR eats digit.  I've checked that Oracle doesn't allow
> random charaters/digits to appear in format string.
>
> select to_timestamp('2018 01 01', '9MM9DD') from dual
> ORA-01821: date format not recognized
>
> So, Oracle compatibility isn't argument here. Therefore I'm going to
> propose following fix for that: let NODE_TYPE_CHAR eat characters only
> if we didn't skip input string characters more than it was in format
> string.  I'm sorry for vague explanation.  I'll come up with patch
> later, and it should be clear then.

Please find attached a patch fixing this issue.  It makes the handling
of format string text characters similar to the pre-cf984672 behavior.
See the examples in regression tests and explanation in the commit
message.  I'm going to commit this if no objections.
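The proposed rule ("a literal format character may eat an input character only if we have not skipped more input characters than format characters") can be sketched with a toy parser. This is purely my illustration of the rule, not the actual formatting.c code, and it handles only YYYY/MM/DD fields:

```python
def to_timestamp_toy(value, fmt):
    """Toy sketch of the rule in the proposed fix -- NOT the real
    PostgreSQL parser.  Digit fields: YYYY, MM, DD (fixed width); any
    other format character is a literal separator.  Whitespace in the
    input is skipped for free before a digit field, but a literal
    separator may only swallow an input character while we have not
    already skipped more input characters than format separators."""
    fields = (("YYYY", 4), ("MM", 2), ("DD", 2))
    i = 0            # position in value
    extra_skips = 0  # input chars skipped beyond matched separators
    out = {}
    f = 0
    while f < len(fmt):
        for name, width in fields:
            if fmt.startswith(name, f):
                # skip input whitespace "for free" before a digit field
                while i < len(value) and value[i].isspace():
                    i += 1
                    extra_skips += 1
                out[name] = int(value[i:i + width])
                i += width
                f += len(name)
                break
        else:
            # literal separator in the format string
            if extra_skips > 0:
                extra_skips -= 1   # already consumed on the input side
            elif i < len(value):
                i += 1             # swallow one input character
            f += 1
    return out
```

With this accounting, a separator in the format no longer eats a digit after whitespace has already been skipped, which is the failure mode described upthread.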

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


to_timestamp_fix.patch
Description: Binary data


Re: [HACKERS] Bug in to_timestamp().

2018-09-19 Thread amul sul
On Wed, Sep 19, 2018 at 2:57 PM Alexander Korotkov
 wrote:
>
> On Tue, Sep 18, 2018 at 4:42 PM Alexander Korotkov
>  wrote:
> > But, I found related issue in cf9846724.  Before it was:
> >
> > # select to_timestamp('2018 01 01', '9MM9DD');
> >   to_timestamp
> > 
> >  2018-01-01 00:00:00+03
> > (1 row)
> >
> > But after it becomes so.
> >
> > # select to_timestamp('2018 01 01', '9MM9DD');
> > ERROR:  invalid value "1 " for "MM"
> > DETAIL:  Field requires 2 characters, but only 1 could be parsed.
> > HINT:  If your source string is not fixed-width, try using the "FM" 
> > modifier.
> >
> > That happens because we've already skipped space "for free", and then
> > NODE_TYPE_CHAR eats digit.  I've checked that Oracle doesn't allow
> > random charaters/digits to appear in format string.
> >
> > select to_timestamp('2018 01 01', '9MM9DD') from dual
> > ORA-01821: date format not recognized
> >
> > So, Oracle compatibility isn't argument here. Therefore I'm going to
> > propose following fix for that: let NODE_TYPE_CHAR eat characters only
> > if we didn't skip input string characters more than it was in format
> > string.  I'm sorry for vague explanation.  I'll come up with patch
> > later, and it should be clear then.
>
> Please find attached patch for fixing this issue.  It makes handling
> of format string text characters be similar to pre cf984672 behavior.
> See the examples in regression tests and explanation in the commit
> message.  I'm going to commit this if no objections.
>

With this patch, to_date and to_timestamp behaving differently, see this:

edb=# SELECT to_date('18 12 2011', 'xDDxMMx');
  to_date

 18-DEC-11 00:00:00
(1 row)

edb=# SELECT to_timestamp('18 12 2011', 'xDDxMMx');
   to_timestamp
---
 08-DEC-11 00:00:00 -05:00  <=== Incorrect output.
(1 row)

Regards,
Amul



Re: [HACKERS] Bug in to_timestamp().

2018-09-19 Thread amul sul
On Wed, Sep 19, 2018 at 3:51 PM amul sul  wrote:
>
> On Wed, Sep 19, 2018 at 2:57 PM Alexander Korotkov
[...]
>
> With this patch, to_date and to_timestamp behaving differently, see this:
>
> edb=# SELECT to_date('18 12 2011', 'xDDxMMx');
>   to_date
> 
>  18-DEC-11 00:00:00
> (1 row)
>
> edb=# SELECT to_timestamp('18 12 2011', 'xDDxMMx');
>to_timestamp
> ---
>  08-DEC-11 00:00:00 -05:00  <=== Incorrect output.
> (1 row)
>
Sorry, that was wrong info -- my build of the patch had some of my own
trial changes mixed in.

Both to_date and to_timestamp behave the same with your patch -- but the
output is wrong; is that what we are expecting?

postgres =# SELECT to_date('18 12 2011', 'xDDxMMx');
  to_date

 2011-12-08
(1 row)

postgres =# SELECT to_timestamp('18 12 2011', 'xDDxMMx');
  to_timestamp

 2011-12-08 00:00:00-05
(1 row)

Regards,
Amul



Regarding Mentoring for GCI-18

2018-09-19 Thread The Coding Planet
Hello,

I am Sunveer Singh, Google Code-in 2017 Grand Prize Winner with OSGeo.

I would like to be a mentor for PostgreSQL. Last year I worked on almost 70
GIS-related tasks. As a student and Grand Prize winner, I will be able to
help you during the GCI period. I am familiar with the programming languages
and technologies that PostgreSQL works with; having worked with OSGeo last
year, I have a good knowledge of GIS and SQL. This year I have fallen in
love with PostgreSQL, so I would like to be a mentor for it.

Thank You
Sunveer


Mentorship Initiative in Google Code-In

2018-09-19 Thread Neha Gupta
Hey all!

 I'm Neha and I participated in Google Summer of Code this year with XWiki. 
Since XWiki is not participating in Google Code-in this year, I would like to 
volunteer as a mentor for Drupal. Is there a vacancy for that?

I’ve worked with CERN and a couple of startups in the past, using a broad 
set of technologies including various frontend frameworks, DevOps with 
container technology, serverless frameworks, data science, and machine 
learning. 

If you have any further questions, let me know.

Best Regards,
Neha.

Re: [HACKERS] proposal: schema variables

2018-09-19 Thread Arthur Zakirov
Hello,

On Wed, Sep 19, 2018 at 10:30:31AM +0200, Pavel Stehule wrote:
> Hi
> 
> new update:
> 
> I fixed pg_restore, and I cleaned a code related to transaction processing
> 
> There should be a full functionality now.

I reviewed the patch a little bit. I have a few comments.

> pg_views Columns

I think there is a typo here. It should be "pg_variable".

> GRANT { READ | WRITE | ALL [ PRIVILEGES ] }

Shouldn't we use GRANT { SELECT | LET | ... } syntax here, for
consistency? Same for REVOKE. I'm not an experienced syntax developer,
though. But we use the SELECT and LET commands when working with variables,
so we should GRANT and REVOKE privileges for these commands.

> [ { ON COMMIT DROP | ON TRANSACTION END RESET } ]

I think we may join them and have the syntax { ON COMMIT DROP | RESET }
to get more simpler syntax. If we create a variable with ON COMMIT
DROP, PostgreSQL will drop it regardless of whether transaction was
committed or rollbacked:

=# ...
=# begin;
=# create variable int1 int on commit drop;
=# rollback;
=# -- There is no variable int1

CREATE TABLE syntax has similar options [1]. ON COMMIT controls
the behaviour of temporary tables at the end a transaction block,
whether it was committed or rollbacked. But I'm not sure is this a good
example of precedence.

> - ONCOMMIT_DROP   /* ON COMMIT DROP */
> + ONCOMMIT_DROP,  /* ON COMMIT DROP */
> } OnCommitAction;

There is the extra comma here after ONCOMMIT_DROP.

1 - https://www.postgresql.org/docs/current/static/sql-createtable.html

-- 
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company



Re: [HACKERS] proposal: schema variables

2018-09-19 Thread Pavel Stehule
Hi
st 19. 9. 2018 v 13:23 odesílatel Arthur Zakirov 
napsal:

> Hello,
>
> On Wed, Sep 19, 2018 at 10:30:31AM +0200, Pavel Stehule wrote:
> > Hi
> >
> > new update:
> >
> > I fixed pg_restore, and I cleaned a code related to transaction
> processing
> >
> > There should be a full functionality now.
>
> I reviewed a little bit the patch. I have a few comments.
>
> > pg_views Columns
>
> I think there is a typo here. It should be "pg_variable".
>

I'll fix it.

>
> > GRANT { READ | WRITE | ALL [ PRIVILEGES ] }
>
> Shouldn't we use here GRANT { SELECT | LET | ... } syntax for the
> constistency. Same for REVOKE. I'm not experienced syntax developer
> though. But we use SELECT and LET commands when working with variables.
> So we should GRANT and REVOKE priveleges for this commands.
>

I understand your proposal, and I don't have a strong opinion. Originally
I proposed {SELECT|UPDATE}, but some people preferred {READ|WRITE}. Now I
prefer Peter's proposal (what is implemented now) - READ|WRITE - because it
is very illustrative, and the mentioned difference is good because
variables are not tables (by default) and are not persistent, so different
rights are fine with me. I find "GRANT LET" much less clear, because
nobody knows what the LET command does. Unfortunately we cannot use the
standard "SET" command, because it is used in Postgres for a different
purpose. READ|WRITE are totally clear, and for the user they are another
signal that variables are different from tables (not one-row tables).

I prefer the current state, but if the common opinion turns out different,
I have no problem changing it.


> > [ { ON COMMIT DROP | ON TRANSACTION END RESET } ]
>
> I think we may join them and have the syntax { ON COMMIT DROP | RESET }
> to get more simpler syntax. If we create a variable with ON COMMIT
> DROP, PostgreSQL will drop it regardless of whether transaction was
> committed or rollbacked:
>

I thought about it too. I'll try to explain my idea. Originally I was
surprised that Postgres uses the "ON COMMIT" syntax while the
documentation uses the term "at transaction end", but it makes some sense.
ON COMMIT DROP is allowed only for temporary tables, and ON COMMIT DELETE
ROWS is allowed for tables. With these clauses PostgreSQL is more
aggressive in cleaning: it doesn't need to account for rollback, because
rollback does the cleaning by itself. So the "ON COMMIT" syntax is fully
correct - it is related only to the commit event. It has no meaning for a
rollback event (and doesn't change the behavior on rollback).

The content of variables is not transactional (by default). It is not
destroyed by rollback, so I have to account for rollback too. The most
precise syntax would be "ON COMMIT ON ROLLBACK RESET", which is a little
bit messy, so I used "ON TRANSACTION END". It signals that this clause
takes effect on a rollback event as well, and that it is valid for
non-transactional variables. This logic does not apply to transactional
variables, where ON COMMIT RESET makes sense. But that behavior is not the
default, and so I prefer the more generic syntax.

Again, I have no problem changing it, but I think the current design is
logically correct.


> =# ...
> =# begin;
> =# create variable int1 int on commit drop;
> =# rollback;
> =# -- There is no variable int1
>
>
The PostgreSQL catalog (where the metadata is stored) is transactional,
so when I am working with metadata I use the ON COMMIT syntax, because the
behavior on ROLLBACK cannot be changed.

So I see two different cases - work with the catalog (which is
transactional), and work with the variable's value, which (like other
variables in programming languages) is not transactional. "ON TRANSACTION
END RESET" means: do the reset at any transaction end.

I hope I explained it clearly - if not, please ask.
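To make the two clauses concrete, the catalog/value split can be modeled in a few lines. This is my toy illustration only, not the patch's implementation: the catalog entry is transactional while the value is not, so ON COMMIT DROP only needs to act on commit, whereas ON TRANSACTION END RESET acts on both commit and rollback.

```python
class Session:
    """Toy model of a session with a transactional catalog of variable
    definitions and a non-transactional store of variable values."""

    def __init__(self):
        self.catalog = {}    # name -> options; transactional
        self.values = {}     # name -> value; NOT transactional
        self._snapshot = None

    def begin(self):
        # only the catalog is snapshotted: values never roll back
        self._snapshot = dict(self.catalog)

    def create_variable(self, name, on_commit_drop=False,
                        on_transaction_end_reset=False):
        self.catalog[name] = {"drop": on_commit_drop,
                              "reset": on_transaction_end_reset}
        self.values[name] = None

    def let(self, name, value):
        self.values[name] = value

    def commit(self):
        self._end(committed=True)

    def rollback(self):
        self._end(committed=False)

    def _end(self, committed):
        if not committed:
            # catalog changes roll back; a value only disappears when
            # its catalog entry does
            self.catalog = self._snapshot
            self.values = {n: v for n, v in self.values.items()
                           if n in self.catalog}
        for name, opts in list(self.catalog.items()):
            if committed and opts["drop"]:
                # ON COMMIT DROP: on rollback the entry is gone anyway
                del self.catalog[name]
                del self.values[name]
            elif opts["reset"]:
                # ON TRANSACTION END RESET: fires on commit AND rollback
                self.values[name] = None
        self._snapshot = None
```

The model shows why "ON COMMIT" is sufficient for the drop case while the reset case needs a name that covers both transaction-end events.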

CREATE TABLE syntax has similar options [1]. ON COMMIT controls
> the behaviour of temporary tables at the end a transaction block,
> whether it was committed or rollbacked. But I'm not sure is this a good
> example of precedence.
>
> > - ONCOMMIT_DROP   /* ON COMMIT DROP */
> > + ONCOMMIT_DROP,  /* ON COMMIT DROP */
> > } OnCommitAction;
>
> There is the extra comma here after ONCOMMIT_DROP.
>

I'll fix it.

Regards

Pavel

>
> 1 - https://www.postgresql.org/docs/current/static/sql-createtable.html
>
> --
> Arthur Zakirov
> Postgres Professional: http://www.postgrespro.com
> Russian Postgres Company
>


Re: [HACKERS] proposal: schema variables

2018-09-19 Thread Arthur Zakirov
On Wed, Sep 19, 2018 at 02:08:04PM +0200, Pavel Stehule wrote:
> Unfortunately we cannot to use standard
> "SET" command, because it is used in Postgres for different purpose.
> READ|WRITE are totally clear, and for user it is another signal so
> variables are different than tables (so it is not one row table).
> 
> I prefer current state, but if common opinion will be different, I have not
> problem to change it.

I see. I grepped the thread before writing this but somehow missed the
discussion.

> The content of variables is not transactional (by default). It is not
> destroyed by rollback. So I have to calculate with rollback too. So the
> most correct syntax should be "ON COMMIT ON ROLLBACK RESET" what is little
> bit messy and I used "ON TRANSACTION END". It should be signal, so this
> event is effective on rollback event and it is valid for not transaction
> variable. This logic is not valid to transactional variables, where ON
> COMMIT RESET has sense. But this behave is not default and then I prefer
> more generic syntax.
> ...
> So I see two different cases - work with catalog (what is transactional)
> and work with variable value, what is (like other variables in programming
> languages) not transactional. "ON TRANSACTION END RESET" means - does reset
> on any transaction end.
> 
> I hope so I explained it cleanly - if not, please, ask.

I understand what you mean, thank you. I thought that the
{ ON COMMIT DROP | ON TRANSACTION END RESET } parameters were intended
only for transactional variables in the first place. But does it make
sense to use these parameters with non-transactional variables? That is,
when we create a non-transactional variable, we don't want the variable
to roll back or reset its value at the end of a transaction.

-- 
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company



Re: doc - add missing documentation for "acldefault"

2018-09-19 Thread Joe Conway
On 08/03/2018 09:04 AM, Fabien COELHO wrote:
> Here is a version of the patch which documents briefly all aclitem-related 
> functions, in a separate table.

I claimed this patch for review and commit. Comments:
---
* There is a comment in src/backend/utils/adt/acl.c noting that
  acldefault is "not documented for general use" which needs adjustment

* It makes no sense to me to document purely internal functions, e.g.
  aclitemin/out. If we are going to do that we should do it universally,
  which is not true currently, and IMHO makes no sense anyway.

* I do believe aclitemeq() has utility outside internal purposes.

* The  section is incomplete

* Interestingly (since that is what started this thread apparently) I
  found myself questioning whether acldefault() is really worth
  documenting since the result will be misleading if the defaults have
  been altered via SQL.
---

Attached patch addresses those items and does a bit of reordering and
editorialization. If there are no objections I will commit it this way
in a day or two.
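The aclitem text form described in the patch (e.g. calvin=r*w/hobbes) can be decoded with a small toy parser. This is illustrative only; the server's own aclitemin is the authoritative parser, and the helper name here is mine.

```python
import re

def parse_aclitem(item):
    """Toy parser for the aclitem text form: grantee '=' privilege
    letters (each optionally followed by '*' when grantable) '/'
    grantor.  An empty grantee stands for PUBLIC."""
    m = re.fullmatch(r'(.*)=([a-zA-Z*]*)/(.+)', item)
    if m is None:
        raise ValueError("bad aclitem: %r" % item)
    grantee, privs, grantor = m.groups()
    parsed = []
    i = 0
    while i < len(privs):
        grantable = i + 1 < len(privs) and privs[i + 1] == '*'
        parsed.append((privs[i], grantable))
        i += 2 if grantable else 1
    return {"grantee": grantee or "PUBLIC",
            "privileges": parsed,
            "grantor": grantor}
```

For the documented example this yields calvin holding grantable r (SELECT) and non-grantable w (UPDATE), granted by hobbes.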

Thanks,

Joe

-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 4331beb..cea674e 100644
*** a/doc/src/sgml/func.sgml
--- b/doc/src/sgml/func.sgml
*** SELECT has_function_privilege('joeuser',
*** 16893,16898 
--- 16893,17032 
  be specified by name or by OID.
 
  
+
+  shows functions to
+ manage the aclitem type, the internal representation of access
+ privileges.
+ An aclitem entry describes the permissions of a grantee,
+ whether they are grantable or not, and which grantor granted them.
+ For instance, calvin=r*w/hobbes indicates that
+ role calvin has
+ grantable privilege SELECT (r*)
+ and non-grantable privilege UPDATE (w)
+ granted by role hobbes.
+ An empty grantee stands for PUBLIC.
+
+ 
+
+ aclitem Management Functions
+ 
+  
+   Name Return Type Description
+  
+  
+   
+acldefault(type,
+   ownerId)
+aclitem[]
+get the hardcoded default access privileges for an object belonging to ownerId
+   
+   
+aclinsert(aclitem[], aclitem)
+aclitem[]
+add element aclitem to aclitem[] array
+   
+   
+aclremove(aclitem[], aclitem)
+aclitem[]
+remove element aclitem from aclitem[] array
+   
+   
+aclitemeq(aclitem1, aclitem2)
+boolean
+test whether two aclitem elements are equal
+   
+   
+aclcontains(aclitem[], aclitem)
+boolean
+test whether element aclitem is contained within aclitem[] array
+   
+   
+aclexplode(aclitem[])
+setof record
+get aclitem array as tuples
+   
+   
+makeaclitem(grantee, grantor, privilege, grantable)
+aclitem
+build an aclitem from input
+   
+  
+ 
+
+ 
+
+ aclitem
+
+
+ acldefault
+
+
+ aclinsert
+
+
+ aclremove
+
+
+ aclitemeq
+
+
+ aclcontains
+
+
+ aclexplode
+
+
+ makeaclitem
+
+ 
+
+ acldefault returns the hardcoded default access privileges
+ for an object of type belonging to role ownerId.
+ Notice that these are used in the absence of any pg_default_acl
+ () entry. Default access privileges are described in
+  and can be overwritten with
+ . In other words, this function will return
+ results which may be misleading when the defaults have been overridden.
+ Type is a CHAR, use
+ 'c' for COLUMN,
+ 'r' for relation-like objects such as TABLE or VIEW,
+ 's' for SEQUENCE,
+ 'd' for DATABASE,
+ 'f' for FUNCTION or PROCEDURE,
+ 'l' for LANGUAGE,
+ 'L' for LARGE OBJECT,
+ 'n' for SCHEMA,
+ 't' for TABLESPACE,
+ 'F' for FOREIGN DATA WRAPPER,
+ 'S' for FOREIGN SERVER,
+ 'T' for TYPE or DOMAIN.
+
+ 
+
+ aclinsert and aclremove
+ allow insertion/removal of a privilege described by an
+ aclitem into/from an array of aclitem.
+
+ 
+
+ aclitemeq checks for equality of two
+ aclitem elements.
+
+ 
+
+ aclcontains checks if an aclitem
+ element is present in an array of aclitem.
+
+ 
+
+ aclexplode returns an aclitem array
+ as a set of rows. Output columns are grantor oid,
+ grantee oid (0 for PUBLIC),
+ granted privilege as text (SELECT, ...)
+ and whether the privilege is grantable as boolean.
+ makeaclitem performs the inverse operation.
+
+ 

  shows functions that
 determine whether a certain object is visible in the
diff --git a/src/backend/utils/adt/acl.c b/src/backend/utils/adt/acl.c
index a45e093..d5285e2 100644
*** a/src/backend/utils/adt/acl.c
--- b/src/b

Re: Code of Conduct plan

2018-09-19 Thread ERR ORR
I see a CoC as an infiltration of the PostgreSQL community, which has worked
OK for at least 10 years.
The project owners have let their care slacken.
I request that the project owners EXPEL/EXCOMMUNICATE ALL those who are
advancing what can only be seen as an instrument for harassing members of a
to-date peaceful and cordial community.

THROW THESE LEFTIST BULLIES OUT‼️

Dimitri Maziuk  schrieb am Mo., 17. Sep. 2018, 19:21:

> On 09/17/2018 10:39 AM, Chris Travers wrote:
> > On Mon, Sep 17, 2018 at 5:28 PM Joshua D. Drake 
> > wrote:
> ...
> >> My feedback is that those two sentences provide an overarching authority
> >> that .Org does not have the right to enforce
> ...
> > Fascinating that this would, on its face, not apply to a harassment
> > campaign carried out over twitter, but it would apply to a few comments
> > made over drinks at a bar.
>
> There is a flip side: if you have written standards, you can be held
> liable for not enforcing them. Potentially including enforcement of
> twitbook AUP on the list subscribers who also have a slackogger account.
>
> --
> Dimitri Maziuk
> Programmer/sysadmin
> BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu
>
>


Re: Online verification of checksums

2018-09-19 Thread Michael Banck
Hi,

Am Dienstag, den 18.09.2018, 13:52 -0400 schrieb David Steele:
> On 9/18/18 11:45 AM, Stephen Frost wrote:
> > * Michael Banck (michael.ba...@credativ.de) wrote:
> > > I have added a retry for this as well now, without a pg_sleep() as well.
> > > This catches around 80% of the half-reads, but a few slip through. At
> > > that point we bail out with exit(1), and the user can try again, which I
> > > think is fine? 
> > 
> > No, this is perfectly normal behavior, as is having completely blank
> > pages, now that I think about it.  If we get a short read then I'd say
> > we simply check that we got an EOF and, in that case, we just move on.
> > 
> > > Alternatively, we could just skip to the next file then and don't make
> > > it count as a checksum failure.
> > 
> > No, I wouldn't count it as a checksum failure.  We could possibly count
> > it towards the skipped pages, though I'm even on the fence about that.
> 
> +1 for it not being a failure.  Personally I'd count it as a skipped
> page, since we know the page exists but it can't be verified.
> 
> The other option is to wait for the page to stabilize, which doesn't
> seem like it would take very long in most cases -- unless you are doing
> this test from another host with shared storage.  Then I would expect to
> see all kinds of interesting torn pages after the last checkpoint.

OK, I'm now skipping the block on the first try, as this (i) makes sense and
(ii) simplifies the code (again).

Version 3 is attached.
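The retry-once scheme in the patch can be sketched as follows. This is a toy re-implementation with a stand-in checksum, not the actual pg_verify_checksums code: a failing block is re-read once before being counted as bad, and a short read at end of file counts as skipped rather than as a failure.

```python
import os

BLCKSZ = 8192

def checksum_ok(buf):
    # toy checksum: the first two bytes hold the sum of the remaining
    # bytes modulo 2**16 (a stand-in for pg_checksum_page)
    return int.from_bytes(buf[:2], "little") == sum(buf[2:]) % 65536

def scan_file_toy(path):
    """Sketch of the retry scheme: a block whose checksum fails is
    re-read once, so a torn read during a concurrent write is
    tolerated; a partial block at end of file is counted as skipped."""
    bad = skipped = 0
    with open(path, "rb") as f:
        block_retry = False
        while True:
            buf = f.read(BLCKSZ)
            if not buf:
                break
            if len(buf) != BLCKSZ:
                skipped += 1                      # partially written block
                break
            if not checksum_ok(buf):
                if not block_retry:
                    block_retry = True
                    f.seek(-BLCKSZ, os.SEEK_CUR)  # re-read the same block
                    continue
                bad += 1                          # failed twice:真 corruption
            block_retry = False
    return bad, skipped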


Michael

-- 
Michael Banck
Projektleiter / Senior Berater
Tel.: +49 2166 9901-171
Fax:  +49 2166 9901-100
Email: michael.ba...@credativ.de

credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer

Our handling of personal data is subject to the following policy:
https://www.credativ.de/datenschutz

diff --git a/src/bin/pg_verify_checksums/pg_verify_checksums.c b/src/bin/pg_verify_checksums/pg_verify_checksums.c
index 1bc020ab6c..9963264028 100644
--- a/src/bin/pg_verify_checksums/pg_verify_checksums.c
+++ b/src/bin/pg_verify_checksums/pg_verify_checksums.c
@@ -1,7 +1,7 @@
 /*
  * pg_verify_checksums
  *
- * Verifies page level checksums in an offline cluster
+ * Verifies page level checksums in a cluster
  *
  *	Copyright (c) 2010-2018, PostgreSQL Global Development Group
  *
@@ -25,7 +25,9 @@
 static int64 files = 0;
 static int64 blocks = 0;
 static int64 badblocks = 0;
+static int64 skippedblocks = 0;
 static ControlFileData *ControlFile;
+static XLogRecPtr checkpointLSN;
 
 static char *only_relfilenode = NULL;
 static bool verbose = false;
@@ -54,6 +56,7 @@ static const char *const skip[] = {
 	"pg_filenode.map",
 	"pg_internal.init",
 	"PG_VERSION",
+	"pgsql_tmp",
 	NULL,
 };
 
@@ -67,8 +70,14 @@ skipfile(const char *fn)
 		return true;
 
 	for (f = skip; *f; f++)
+	{
 		if (strcmp(*f, fn) == 0)
 			return true;
+		if (strcmp(*f, "pg_internal.init") == 0)
+			if (strncmp(*f, fn, strlen(*f)) == 0)
+				return true;
+	}
+
 	return false;
 }
 
@@ -79,10 +88,17 @@ scan_file(const char *fn, BlockNumber segmentno)
 	PageHeader	header = (PageHeader) buf.data;
 	int			f;
 	BlockNumber blockno;
+	bool		block_retry = false;
 
 	f = open(fn, O_RDONLY | PG_BINARY, 0);
 	if (f < 0)
 	{
+		if (errno == ENOENT)
+		{
+			/* File was removed in the meantime */
+			return;
+		}
+
 		fprintf(stderr, _("%s: could not open file \"%s\": %s\n"),
 progname, fn, strerror(errno));
 		exit(1);
@@ -99,24 +115,71 @@ scan_file(const char *fn, BlockNumber segmentno)
 			break;
 		if (r != BLCKSZ)
 		{
-			fprintf(stderr, _("%s: could not read block %u in file \"%s\": read %d of %d\n"),
-					progname, blockno, fn, r, BLCKSZ);
-			exit(1);
+			/* Skip partially read blocks */
+			skippedblocks++;
+			continue;
 		}
-		blocks++;
 
 		/* New pages have no checksum yet */
 		if (PageIsNew(header))
+		{
+			skippedblocks++;
 			continue;
+		}
+
+		blocks++;
 
 		csum = pg_checksum_page(buf.data, blockno + segmentno * RELSEG_SIZE);
 		if (csum != header->pd_checksum)
 		{
+			/*
+			 * Retry the block on the first failure.  It's
+			 * possible that we read the first 4K page of
+			 * the block just before postgres updated the
+			 * entire block so it ends up looking torn to
+			 * us.  We only need to retry once because the
+			 * LSN should be updated to something we can
+			 * ignore on the next pass.  If the error
+			 * happens again then it is a true validation
+			 * failure.
+			 */
+			if (block_retry == false)
+			{
+				/* Seek to the beginning of the failed block */
+				if (lseek(f, -BLCKSZ, SEEK_CUR) == -1)
+				{
+					fprintf(stderr, _("%s: could not lseek in file \"%s\": %m\n"),
+							progname, fn);
+					exit(1);
+				}
+
+				/* Set flag so we know a retry was attempted */
+				block_retry = true;
+
+				/* Reset loop to validate the block again */
+				blockno--;
+
+				continue;
+			}
+
+			/* The checksum verification failed

Re: hostorder and failover_timeout for libpq

2018-09-19 Thread Ildar Musin
Hello Surafel,

On Fri, Sep 14, 2018 at 2:03 PM Surafel Temesgen 
wrote:

> Hey ,
> Here are a few comment.
> +  xreflabel="failover_timeout">
> Here's a typo: ="libpq-connect-falover-timeout"
> +   {"failover_timeout", NULL, NULL, NULL,
> +   "Failover Timeout", "", 10,
> Word is separated by hyphen in internalPQconninfoOption lable as a
> surrounding code
> +If the value is random, the host to connect to
> +will be randomly picked from the list. It allows load balacing
> between
> +several cluster nodes.
> I Can’t think of use case where randomly picking a node rather than in
> user specified order can load balance the cluster better. Can you
> explain the purpose of this feature more?

Probably load-balancing is a wrong word for this. Think of it as a
connection routing mechanism. Let's say you have 10 servers and 100 clients
willing to establish read-only connection. Without this feature all clients
will go to the first specified host (unless they hit max_connections
limit). And with random `hostorder` they would be splited between hosts
more or less evenly.


> And in the code I can’t see
> a mechanism for preventing picking one host multiple time
>
The original idea was to collect all ip addresses that we get after
resolving specified hostnames, put those addresses into a global array,
apply random permutations to it, and then use a round-robin algorithm trying
to connect to each of them until we succeed. Now I'm not sure that this
approach was the best. There are two concerns:

1. host name can be resolved to several ip addresses (which in turn can
point to either the same physical server with multiple network interfaces
or different servers). In the scheme described above, each ip address would
be added to the global array. This may lead to a situation where one host
gets a higher chance of being picked because it has more addresses in the
global array than other hosts.
2. a host may support both ipv4 and ipv6 connections, which again leads to
extra items in the global array and therefore also increases its chance of
being picked.

Another approach would be to leave `pg_conn->connhost` as it is now (i.e.
not to create global addresses array) and just apply random permutations to
it if `hostorder=random` is specified. And probably apply permutations to
addresses list within each individual host.
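The second approach can be sketched like this. Field names are illustrative, not libpq's actual structs; the point is that permuting host entries (and the addresses inside each host) avoids over-weighting hosts that resolve to many addresses:

```python
import random

def order_hosts(connhosts, hostorder="sequential"):
    """Sketch of the per-host permutation approach: keep the per-host
    structure intact, and with hostorder=random shuffle the host
    entries and the address list inside each host."""
    hosts = [{"name": h["name"], "addrs": list(h["addrs"])} for h in connhosts]
    if hostorder == "random":
        random.shuffle(hosts)
        for h in hosts:
            random.shuffle(h["addrs"])
    return hosts
```

A client would then try hosts (and addresses within a host) in the returned order until a connection succeeds.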

At this point I'd like to ask the community what, in your opinion, would be the
best course of action and whether this feature should be implemented within
libpq at all? Because from my POV there are factors that really depend on
network architecture and there is probably no single right solution.

Kind regards,
Ildar


Re: Regarding Mentoring for GCI-18

2018-09-19 Thread Stephen Frost
Greetings,

* The Coding Planet (sunveersing...@gmail.com) wrote:
> I am Sunveer Singh, Google Code-in 2017 Grand Prize Winner with OSGeo.
> 
> I am willing to be a mentor for PostgreSQL. Last year I worked on almost 70
> tasks based on the GIS world.  As a student and Grand Prize winner, I will
> be able to help you during the GCI period. I am very familiar with the
> programming languages and technologies that PostgreSQL works with, as I
> worked with OSGeo last year, so I have all this knowledge of GIS and SQL.
> And this year, I am in love with PostgreSQL, so I am willing to be a
> mentor for PostgreSQL.

That's fantastic and we are thrilled to have you!

Sarah and I are the admins for PostgreSQL's first GCI experience and
we'll add you to the list of mentors we have lined up.  We have an
initial wiki for tasks here:

https://wiki.postgresql.org/wiki/GCI_2018

Thanks!

Stephen


signature.asc
Description: PGP signature


Re: Code of Conduct

2018-09-19 Thread ERR ORR
I was never consulted.
I was only Told that there was a CoC "to be". Not when, not how.
A CoC will inevitably lead to the project being taken over by leftists; political
and technical decisions will be made by others.
Most important from my PoV, the project's quality will decrease until it's
made unviable.
As others have said, this was rammed down our throats.
Before you ppl become unemployed, read "SJWs always lie". You'll know what
awaits you.
As for myself, I'll be on the lookout for another DB. One that's not
infiltrated by leftist nuts.

And Dave, you can tell the core team a big "FUCK YOU" for this.

James Keener  wrote on Tue, Sep 18, 2018, 13:48:

> > following a long consultation process
>
> It's not a consultation if any dissenting voice is simply ignored. Don't
> sugar-coat or politicize it like this -- it was rammed down everyone's
> throats. That is core's right, but don't act as if everyone's opinions and
> concerns were taken into consideration. There are a good number of folks
> who are concerned that this CoC is overreaching and is ripe for abuse.
> Those concerns were always simply, plainly, and purposely ignored.
>
> > Please take time to read and understand the CoC, which is intended to
> ensure that PostgreSQL remains an open and enjoyable project for anyone to
> join and participate in.
>
> I sincerely hope so, and that it doesn't become a tool to enforce social
> ideology like in other groups I've been part of. Especially since this is
> the main place to come to get help for PostgreSQL and not a social club.
>
> Jim
>
> On September 18, 2018 6:27:56 AM EDT, Dave Page 
> wrote:
>>
>> The PostgreSQL Core team are pleased to announce that following a long
>> consultation process, the project’s Code of Conduct (CoC) has now been
>> finalised and published at https://www.postgresql.org/about/policies/coc/
>> .
>>
>> Please take time to read and understand the CoC, which is intended to
>> ensure that PostgreSQL remains an open and enjoyable project for anyone to
>> join and participate in.
>>
>> A Code of Conduct Committee has been formed to handle any complaints.
>> This consists of the following volunteers:
>>
>> - Stacey Haysler (Chair)
>> - Lætitia Avrot
>> - Vik Fearing
>> - Jonathan Katz
>> - Ilya Kosmodemiansky
>>
>> We would like to extend our thanks and gratitude to Stacey Haysler for
>> her patience and expertise in helping develop the Code of Conduct, forming
>> the committee and guiding the work to completion.
>>
>> --
>> Dave Page
>> PostgreSQL Core Team
>> http://www.postgresql.org/
>> 
>>
>
> --
> Sent from my Android device with K-9 Mail. Please excuse my brevity.
>


Re: [HACKERS] proposal: schema variables

2018-09-19 Thread Pavel Stehule
On Wed, Sep 19, 2018 at 14:53, Arthur Zakirov 
wrote:

> On Wed, Sep 19, 2018 at 02:08:04PM +0200, Pavel Stehule wrote:
> > Unfortunately we cannot use the standard
> > "SET" command, because it is used in Postgres for a different purpose.
> > READ|WRITE are totally clear, and for the user it is another signal that
> > variables are different from tables (i.e. not a one-row table).
> >
> > I prefer the current state, but if the common opinion is different, I
> > have no problem changing it.
>
> I see. I grepped the thread before writing this but somehow missed the
> discussion.
>
> > The content of variables is not transactional (by default). It is not
> > destroyed by rollback, so I have to account for rollback too. The most
> > correct syntax would therefore be "ON COMMIT ON ROLLBACK RESET", which
> > is a little bit messy, so I used "ON TRANSACTION END". It should signal
> > that this event takes effect on a rollback as well, and that it is
> > valid for non-transactional variables. This logic does not apply to
> > transactional variables, where ON COMMIT RESET makes sense. But that
> > behavior is not the default, and so I prefer the more generic syntax.
> > ...
> > So I see two different cases - working with the catalog (which is
> > transactional) and working with the variable value, which is (like
> > other variables in programming languages) not transactional. "ON
> > TRANSACTION END RESET" means - do the reset at any transaction end.
> >
> > I hope I explained it clearly - if not, please ask.
>
> I understood what you mean, thank you. I thought that the
> { ON COMMIT DROP | ON TRANSACTION END RESET } parameters were meant only
> for transactional variables in the first place. But does it make sense to
> use these parameters with non-transactional variables? That is, when we
> create a non-transactional variable we don't want the variable to roll
> back or reset its value at the end of a transaction.
>

ON COMMIT DROP is used only for temp variables (transactional or not). The
purpose is the same as for tables. Sometimes you want an object with a
shorter lifetime than the session.

ON TRANSACTION END RESET makes sense mainly for non-transactional
variables. I see two use cases.

1. Protecting sensitive data - a guaranteed reset and cleanup at
transaction end. You can then be sure that either the variable is
uninitialized (has its default value) or you are inside a transaction.

2. Automatic initialization - ON TRANSACTION END RESET ensures the
variable is in its initial state for every transaction.

Both cases make sense for transactional and non-transactional variables.

I think a transactional lifetime for the content makes sense. It is
cheaper to reset a variable than to drop it (which is what ON COMMIT DROP
does).
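To make these lifetime rules concrete, here is a small toy model of the behavior described above. It is an illustrative sketch only - the `Session`, `create`, and `let` names are invented, not the patch's API - and ON COMMIT DROP is simplified to dropping at commit.

```python
class Session:
    """Toy model of the proposed schema-variable lifetime rules.

    Variable content is non-transactional: an assignment survives
    ROLLBACK unless ON TRANSACTION END RESET was requested, in which
    case the value returns to its default at *any* transaction end.
    ON COMMIT DROP removes the (temporary) variable itself on commit.
    """

    def __init__(self):
        self.vars = {}  # name -> dict(default, value, reset, drop)

    def create(self, name, default=None, on_tx_end_reset=False,
               on_commit_drop=False):
        self.vars[name] = dict(default=default, value=default,
                               reset=on_tx_end_reset, drop=on_commit_drop)

    def let(self, name, value):
        self.vars[name]["value"] = value

    def get(self, name):
        return self.vars[name]["value"]

    def end_transaction(self, committed):
        for name in list(self.vars):
            v = self.vars[name]
            if v["drop"] and committed:
                del self.vars[name]        # ON COMMIT DROP
            elif v["reset"]:
                v["value"] = v["default"]  # ON TRANSACTION END RESET
```

Note how a plain variable keeps its value across a rollback, while a reset-on-end variable goes back to its default on both commit and rollback - the "sensitive data" use case above.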

What do you think?

Pavel



> --
> Arthur Zakirov
> Postgres Professional: http://www.postgrespro.com
> Russian Postgres Company
>


Google Code-in 2018

2018-09-19 Thread Sarah Schnurr
Hello, all!

Google has a global, online contest that invites teenagers (ages 13-17) to
work with mentors from various pre-selected open source organizations. The
idea is that these students are able to work on "bite-sized tasks" to
become introduced to the technology and learn. Many of the tasks listed by
other organizations seem to focus on small, meaningful tasks where the
students are actually contributing to the project.

PostgreSQL was successfully chosen as an open-source organization to
participate. This means we have until Tuesday, October 23 (16:00 UTC) to
make a minimum of 100 task instances, across at least 50 unique tasks in
all 5 categories (including many beginner tasks), available for students
to choose from at the start of the contest.

We could use help in meeting this number of tasks. So please, if you have a
moment, take time to review the following information and add any task idea
you can think of that would be a useful or meaningful contribution to
PostgreSQL or associated extensions/projects.

Similarly to GSoC, please consider PostgreSQL to be an "Umbrella" project
in that anything which would be considered "PostgreSQL Family" per the News
& Events  policy is
likely to be acceptable as a PostgreSQL Google Code-in 2018 task.

In other words, if you're a contributor or developer on barman, pgBackRest,
the PostgreSQL website (pgweb), the PgEU/PgUS website code (pgeu-website),
pgAdmin4, PostgresXL, pgbouncer, Citus, pldebugger, the PG RPMs (pgrpms),
the JDBC driver, the ODBC driver, or any of the many other Postgres Family
projects, please feel free to add a task for
consideration!

Furthermore, if you're interested in joining as a Mentor for this year's
contest, please let us know and we will add you in.

*More information, from their FAQ:*

A task is a small project that is expected to take between 3 and 5 hours of
work to complete. Tasks are categorized with the following labels:

   - Code: Tasks related to writing or refactoring code
   - Documentation/Training: Tasks related to creating/editing documents
   and helping others learn more
   - Outreach/Research: Tasks related to community management,
   outreach/marketing, or studying problems and recommending solutions
   - Quality Assurance: Tasks related to testing and ensuring code is of
   high quality
   - Design: Tasks related to user experience research or user interface
   design and interaction

*The list of current ideas we have for tasks (currently at 25) is listed on
the wiki, here:*

https://wiki.postgresql.org/wiki/GCI_2018

*More example tasks can be found here:*

https://developers.google.com/open-source/gci/resources/example-tasks

When submitting ideas on the Wiki page, please be sure to include "Tags"
and "Categories" sections to optimize the process of submitting tasks on
the organization page.

Looking forward to an exciting and productive GCI 2018!

Thank you!
-- 
Sarah Schnurr


Re: doc - add missing documentation for "acldefault"

2018-09-19 Thread Tom Lane
Joe Conway  writes:
> * I do believe aclitemeq() has utility outside internal purposes.

Our normal policy is that we do not document functions that are meant to
be invoked through operators.  The \df output saying that is sufficient:

# \df+ aclitemeq

List of functions
   Schema   |   Name| Result data type | Argument data types | Type | 
Volatility | Parallel |  Owner   | Security | Access privileges | Language | 
Source code | Description  
+---+--+-+--++--+--+--+---+--+-+--
 pg_catalog | aclitemeq | boolean  | aclitem, aclitem| func | 
immutable  | safe | postgres | invoker  |   | internal | 
aclitem_eq  | implementation of = operator
(1 row)

I would strongly object to ignoring that policy in just one place.

Actually, it appears that most of these functions have associated
operators:

# select oid::regoperator, oprcode from pg_operator where oprright = 
'aclitem'::regtype;
  oid  |   oprcode   
---+-
 +(aclitem[],aclitem)  | aclinsert
 -(aclitem[],aclitem)  | aclremove
 @>(aclitem[],aclitem) | aclcontains
 =(aclitem,aclitem)| aclitemeq
 ~(aclitem[],aclitem)  | aclcontains
(5 rows)

So maybe what we really need is a table of operators not functions.
However, I don't object to documenting any function that has its
own pg_description string.

regards, tom lane



Re: Progress reporting for pg_verify_checksums

2018-09-19 Thread Fabien COELHO



Hallo Michael,


This optionally prints the progress of pg_verify_checksums via read
kilobytes to the terminal with the new command line argument -P.

If -P was forgotten and pg_verify_checksums operates on a large cluster,
the caller can send SIGUSR1 to pg_verify_checksums to turn progress
status reporting on during runtime.


Version 2 of the patch is attached. This is rebased to master as of
422952ee and changes the signal handling to be a toggle as suggested by
Alvaro.


Patch applies cleanly and compiles.

About tests: "make check" is okay, but ISTM that the command is never 
started, even once, in any test:-( It is unclear whether any test triggers 
checksums anyway...


The time() granularity to the second makes the result awkward on small 
tests:


 8/1554552 kB (0%, 8 kB/s)
 1040856/1554552 kB (66%, 1040856 kB/s)
 1554552/1554552 kB (100%, 1554552 kB/s)

I'd suggest reusing the "portability/instr_time.h" stuff, which allows a 
much better precision.


The "kB" kilobyte symbol stands for 1000 bytes, but you are displaying 
1024-byte kilos, whose symbol is either "KiB" or "KB". Given that this is 
about storage, which is usually measured in powers of 1,000, I'd suggest 
using thousands.


Another reservation I have is that on any hardware there are likely to be 
a lot of kilos, so maybe megas (MB) should be used instead.


I'm wondering whether the bandwidth display should be local (since the last 
display) or global (since the start of the command), except for the final 
forced one. This is an open question.


Maybe it would be nice to show elapsed time and expected completion time 
based on the current speed.


I would be in favor of having this turned on by default, and silenced with 
some option. I'm not sure other people would agree, though, so it is an 
open question as well.


About the code:

 +   if (show_progress)
 +   show_progress = false;
 +   else
 +   show_progress = true;

Why not a simple:

/* invert current state */
show_progress = !show_progress;

I do not see much advantage in using intermediate string buffers for 
values:


 + snprintf(totalstr, sizeof(totalstr), INT64_FORMAT,
 +  total_size / 1024);
 + snprintf(currentstr, sizeof(currentstr), INT64_FORMAT,
 +  current_size / 1024);
 + snprintf(currspeedstr, sizeof(currspeedstr), INT64_FORMAT,
 +  (current_size / 1024) / (((time(NULL) - scan_started) == 0) ? 1 : 
(time(NULL) - scan_started)));
 + fprintf(stderr, "%s/%s kB (%d%%, %s kB/s)",
 +  currentstr, totalstr, total_percent, currspeedstr);

ISTM that just one simple fprintf would be fine:

  fprintf(stderr, INT64_FORMAT "/" INT64_FORMAT " ...",
  formulas for each slot...);

ISTM that this line overwriting trick deserves a comment:

 + if (isatty(fileno(stderr)))
 +   fprintf(stderr, "\r");
 + else
 +   fprintf(stderr, "\n");

And it runs a little amok with "-v".

 + memset(&act, '\0', sizeof(act));

pg uses 0 instead of '\0' everywhere else.

  +   /* Make sure we just report at least once a second */
  +   if ((now == last_progress_update) && !force) return;

The comment seems contradictory; I understand the code makes sure that it 
is "at most" once a second. pgbench uses "-P XXX" to trigger a progress 
report every XXX seconds. I'm unsure whether it would be useful to allow 
the user to change the 1-second display interval.
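As a sketch of how these review points could fit together - an update rate-limited to once per second, a single formatted write instead of intermediate buffers, global throughput measured from the start of the scan, and "\r" only when writing to a terminal - here is an illustrative Python version (not the tool's C code; names are invented):

```python
import sys
import time

def make_progress_reporter(total_kb, out=sys.stderr, now=time.monotonic):
    """Return report(current_kb, force=False): a rate-limited progress line.

    Throughput is global (from the start of the scan); updates are emitted
    at most once per second unless force=True.  A carriage return
    overwrites the line on a terminal; otherwise each update gets its own
    line.
    """
    start = now()
    last = [None]                        # time of the last emitted update

    def report(current_kb, force=False):
        t = now()
        if not force and last[0] is not None and t - last[0] < 1.0:
            return None                  # at most one update per second
        last[0] = t
        elapsed = max(t - start, 1e-9)   # avoid division by zero
        pct = 100 * current_kb // total_kb if total_kb else 100
        line = "%d/%d kB (%d%%, %d kB/s)" % (
            current_kb, total_kb, pct, int(current_kb / elapsed))
        end = "\r" if out.isatty() else "\n"
        out.write(line + end)
        return line

    return report
```

Using a monotonic, sub-second clock is the moral equivalent of the `instr_time.h` suggestion above: it keeps the kB/s figure meaningful even on scans that finish in well under a second.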


I'm not sure why you removed one unrelated line:

  #include "storage/checksum.h"
  #include "storage/checksum_impl.h"

 -
  static int64 files = 0;
  static int64 blocks = 0;
  static int64 badblocks = 0;


There is a problem in the scan_file code: The added sizeonly parameter is 
not used. It should be removed.


--
Fabien.



Re: doc - add missing documentation for "acldefault"

2018-09-19 Thread Joe Conway
On 09/19/2018 10:54 AM, Tom Lane wrote:
> Joe Conway  writes:
>> * I do believe aclitemeq() has utility outside internal purposes.
> 
> Our normal policy is that we do not document functions that are meant to
> be invoked through operators.  The \df output saying that is sufficient:



> I would strongly object to ignoring that policy in just one place.

Ok, fair enough.

> Actually, it appears that most of these functions have associated
> operators:
> 
> # select oid::regoperator, oprcode from pg_operator where oprright = 
> 'aclitem'::regtype;
>   oid  |   oprcode   
> ---+-
>  +(aclitem[],aclitem)  | aclinsert
>  -(aclitem[],aclitem)  | aclremove
>  @>(aclitem[],aclitem) | aclcontains
>  =(aclitem,aclitem)| aclitemeq
>  ~(aclitem[],aclitem)  | aclcontains
> (5 rows)
> 
> So maybe what we really need is a table of operators not functions.

Good idea -- I will take a look at that.

> However, I don't object to documenting any function that has its
> own pg_description string.

Ok.

Joe

-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development



signature.asc
Description: OpenPGP digital signature


Re: doc - add missing documentation for "acldefault"

2018-09-19 Thread Fabien COELHO



Hello,


Our normal policy is that we do not document functions that are meant to
be invoked through operators.  The \df output saying that is sufficient:


The output of \df is one thing, but I was looking at pg online 
documentation and hoping to find things as well.


\dfS returns nearly 3000 results; it is not really practical unless you 
already know what you are looking for, e.g. the name of the function...



I would strongly object to ignoring that policy in just one place.


Ok, fair enough.


Actually, it appears that most of these functions have associated
operators:

# select oid::regoperator, oprcode from pg_operator where oprright = 
'aclitem'::regtype;
  oid  |   oprcode
---+-
 +(aclitem[],aclitem)  | aclinsert
 -(aclitem[],aclitem)  | aclremove
 @>(aclitem[],aclitem) | aclcontains
 =(aclitem,aclitem)| aclitemeq
 ~(aclitem[],aclitem)  | aclcontains
(5 rows)

So maybe what we really need is a table of operators not functions.



Good idea -- I will take a look at that.


My initial complaint is that I did not know that there was an "acldefault" 
function while I was looking for that kind of thing. ISTM that this one 
should be kept explicitly in the doc.


I'm okay with documenting operators instead of the basic underlying 
functions, and skipping type management (in/out) functions, though.


--
Fabien.



Re: Code of Conduct

2018-09-19 Thread Fred Pratt
Keep pg open and free.  This smells of the PC police.  This community can police 
itself.

Sent from my mobile device. Please pardon my brevity and typos.   I am not 
responsible for changes made by this device’s autocorrect feature.

Fred Pratt
AmerisourceBergen
Manager – IT Infrastructure
Micro Technologies

8701 CenterPort Blvd
Amarillo, TX  79108

Work: 806.372.2369 (Ext. 8364)
Fax: 855.849.0680
Mobile: 806.679.1742

microtechnologies.com

On Sep 19, 2018, at 9:32 AM, ERR ORR <rd0...@gmail.com> wrote:

I was never consulted.
I was only Told that there was a CoC "to be". Not when, not how.
A CoC will inevitably lead to the project being taken over by leftists; political and 
technical decisions will be made by others.
Most important from my PoV, the project's quality will decrease until it's made 
unviable.
As others have said, this was rammed down our throats.
Before you ppl become unemployed, read "SJWs always lie". You'll know what 
awaits you.
As for myself, I'll be on the lookout for another DB. One that's not 
infiltrated by leftist nuts.

And Dave, you can tell the core team a big "FUCK YOU" for this.

James Keener <j...@jimkeener.com> wrote on Tue, Sep 18, 2018, 13:48:
> following a long consultation process

It's not a consultation if any dissenting voice is simply ignored. Don't 
sugar-coat or politicize it like this -- it was rammed down everyone's throats. 
That is core's right, but don't act as if everyone's opinions and concerns were 
taken into consideration. There are a good number of folks who are concerned 
that this CoC is overreaching and is ripe for abuse. Those concerns were always 
simply, plainly, and purposely ignored.

> Please take time to read and understand the CoC, which is intended to ensure 
> that PostgreSQL remains an open and enjoyable project for anyone to join and 
> participate in.

I sincerely hope so, and that it doesn't become a tool to enforce social 
ideology like in other groups I've been part of. Especially since this is the 
main place to come to get help for PostgreSQL and not a social club.

Jim

On September 18, 2018 6:27:56 AM EDT, Dave Page <dp...@postgresql.org> wrote:
The PostgreSQL Core team are pleased to announce that following a long 
consultation process, the project’s Code of Conduct (CoC) has now been 
finalised and published at https://www.postgresql.org/about/policies/coc/.

Please take time to read and understand the CoC, which is intended to ensure 
that PostgreSQL remains an open and enjoyable project for anyone to join and 
participate in.

A Code of Conduct Committee has been formed to handle any complaints. This 
consists of the following volunteers:

- Stacey Haysler (Chair)
- Lætitia Avrot
- Vik Fearing
- Jonathan Katz
- Ilya Kosmodemiansky

We would like to extend our thanks and gratitude to Stacey Haysler for her 
patience and expertise in helping develop the Code of Conduct, forming the 
committee and guiding the work to completion.

--
Dave Page
PostgreSQL Core Team
http://www.postgresql.org/


--
Sent from my Android device with K-9 Mail. Please excuse my brevity.



Re: pgsql: Allow concurrent-safe open() and fopen() in frontend code for Wi

2018-09-19 Thread Tom Lane
Michael Paquier  writes:
> I have spent a large portion of my morning trying to test all the
> solutions proposed, and a winner shows up.  ...

> - win32-open-laurenz.patch, which enforces to text mode only if binary
> mode is not defined, which maps strictly to what pre-11 is doing when
> calling the system _open or _fopen.  And surprisingly, this is proving
> to pass all the tests I ran: bincheck (including pgbench and pg_dump),
> upgradecheck, recoverycheck, check, etc.  initdb --pwfile is not
> complaining to me either.

I'm OK with this approach.  I wonder though what happens if you take
away the "#ifdef FRONTEND" and just enforce that one or the other mode
is selected always.  That would seem like a sensible solution rather
than a wart to me ...

regards, tom lane



Re: doc - add missing documentation for "acldefault"

2018-09-19 Thread John Naylor
On 9/19/18, Tom Lane  wrote:
> However, I don't object to documenting any function that has its
> own pg_description string.

Speaking of, that description string seems to have been neglected.
I've attached a remedy for that.

-John Naylor
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 860571440a..6b7088b9ce 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -2073,7 +2073,7 @@
 { oid => '1365', descr => 'make ACL item',
   proname => 'makeaclitem', prorettype => 'aclitem',
   proargtypes => 'oid oid text bool', prosrc => 'makeaclitem' },
-{ oid => '3943', descr => 'TODO',
+{ oid => '3943', descr => 'show default privileges, for use by the information schema',
   proname => 'acldefault', prorettype => '_aclitem', proargtypes => 'char oid',
   prosrc => 'acldefault_sql' },
 { oid => '1689',


Re: Code of Conduct

2018-09-19 Thread Francisco Olarte
On Wed, Sep 19, 2018 at 5:27 PM, Fred Pratt
 wrote:
> Keep pg open and free.   This smells of PC police.   This community can 
> police itself
No comment on this, just kept for context.

> Sent from my mobile device. Please pardon my brevity and typos.   I am not 
> responsible for changes made by this device’s autocorrect feature.

I will happily pardon brevity (although I would not call a ten-line
sig plus a huge bottom quote "breve", and AFAIK it means the same in
English as in Spanish) and/or typos, but the "I am not responsible"
part feels nearly insulting. Did someone force you to use "this device"
(which you seem to perceive as inadequate for a nice answer) to reply,
or did you choose to do it? (a real, not rhetorical, question, but do
not answer if you feel it's inadequate)


As an aside, is this kind of affirmation, and/or my response to it, a
violation of the current CoC?

Francisco Olarte.



Re: Code of Conduct

2018-09-19 Thread Fred Pratt
Sorry, I emailed using my company account and thus the long sig. In an effort 
to avoid further insulting Mr. Olarte, I will delete it this time.  See, 
self-policing works!

Fred




Re: PostgreSQL 11 {Beta 4, RC1} Release: 2018-09-20

2018-09-19 Thread Jonathan S. Katz
On 9/19/18 2:41 AM, Noah Misch wrote:
> On Tue, Sep 18, 2018 at 12:15:45PM -0400, Jonathan S. Katz wrote:
>> On 9/13/18 12:51 PM, Jonathan S. Katz wrote:
>>> We are planning to have another release of PostgreSQL 11, either Beta 4
>>> or RC1, next week on Thursday, 2018-09-20. The version will be
>>> determined based on the state of the open items list[1] around the time
>>> of stamping.
>>
>> PostgreSQL 11 was stamped Beta 4[1] yesterday evening for release on
>> 2018-09-20.
>>
>> There is a copy of the Beta 4 release notes draft here[2]. I would
>> greatly appreciate any review and feedback on them, particularly around
>> technical accuracy and any omissions that should be included.
> 
> s/verifiy/verify/.  Looks good otherwise.

Fixed. Thanks for reviewing!

Jonathan



signature.asc
Description: OpenPGP digital signature


Re: Code of Conduct

2018-09-19 Thread Stephen Frost
Greetings,

* Francisco Olarte (fola...@peoplecall.com) wrote:
> I will happily pardon brevity (although I would not call a ten-line
> sig plus a huge bottom quote "breve", and AFAIK it means the same in
> English as in Spanish) and/or typos, but the "I am not responsible"
> part feels nearly insulting. Did someone force you to use "this device"
> (which you seem to perceive as inadequate for a nice answer) to reply,
> or did you choose to do it? (a real, not rhetorical, question, but do
> not answer if you feel it's inadequate)

Let's please try to keep the off-topic discussion on these lists to a
minimum.

> As an aside, is this kind of affirmation, and/or my response to it, a
> violation of the current CoC?

There's a way to find out the answer to that question, but it's
certainly not to send an email to this list asking about it.  Please
review the policy, and follow the process outlined there if you feel the
need to.

Thanks!

Stephen


signature.asc
Description: PGP signature


errmsg() ending with a period

2018-09-19 Thread Daniel Gustafsson
Happened to notice that two errmsg() strings end with a period, which IIUC is
against the Error Message Style Guide.  Fixed in the attached patch.

cheers ./daniel



errmsg_punctuation.patch
Description: Binary data


Re: Code of Conduct

2018-09-19 Thread Kevin Grittner
On Tue, Sep 18, 2018 at 5:28 AM Dave Page  wrote:
>
> The PostgreSQL Core team are pleased to announce that following a long 
> consultation process, the project’s Code of Conduct (CoC) has now been 
> finalised and published at https://www.postgresql.org/about/policies/coc/.
>
> Please take time to read and understand the CoC, which is intended to ensure 
> that PostgreSQL remains an open and enjoyable project for anyone to join and 
> participate in.
>
> A Code of Conduct Committee has been formed to handle any complaints. This 
> consists of the following volunteers:
>
> - Stacey Haysler (Chair)
> - Lætitia Avrot
> - Vik Fearing
> - Jonathan Katz
> - Ilya Kosmodemiansky
>
> We would like to extend our thanks and gratitude to Stacey Haysler for her 
> patience and expertise in helping develop the Code of Conduct, forming the 
> committee and guiding the work to completion.

My thanks to all who participated.

FWIW, my view is that a CoC shares one very important characteristic
with coding style guides: it's not as important what the details are
as that you have one and everyone pays attention to it.  I was in an
early PGCon meeting on the topic, and offered some opinions early in
the process, so many of you may remember that my view was to keep it
short and simple -- a wide net with broad mesh, and trust that with
competent application nothing would slip through.

My biggest concern about the current document is that it is hard to
make it from start to end, reading every word.  To check my
(admittedly subjective) impression, I put it through the free
"Readability Test Tool" at
https://www.webpagefx.com/tools/read-able/check.php (pasting the
document itself into the "TEST BY DIRECT INPUT" tab so that page
menus, footers, etc. were not included in the score), and got this:

"""
Test Results:
Your text has an average grade level of about 16. It should be easily
understood by 21 to 22 year olds.
"""

Now, on the whole that doesn't sound too bad, since the audience
should be mature and educated enough to deal with that, but it does
suggest that it might be a bit of a burden on some for whom English is
not their first language (unless we have translations?).

Further detail:

"""
Readability Indices

Flesch Kincaid Reading Ease 32.2
Flesch Kincaid Grade Level 15.2
Gunning Fog Score 18.3
SMOG Index 13.9
Coleman Liau Index 14.8
Automated Readability Index 16

Text Statistics

No. of sentences 65
No. of words 1681
No. of complex words 379
Percent of complex words 22.55%
Average words per sentence 25.86
Average syllables per word 1.75
"""

Note that the page mentions that the Flesch Kincaid Reading Ease score
is based on a 0-100 scale. A high score means the text is easier to
read. Low scores suggest the text is complicated to understand.  A
value between 60 and 80 should be easy for a 12 to 15 year old to
understand.  Our score was 32.2.
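For reference, the quoted numbers follow directly from the standard published Flesch formulas; applied to the statistics above (1681 words, 65 sentences, 1.75 syllables per word) they reproduce the tool's 32.2 and 15.2 figures almost exactly:

```python
def flesch_reading_ease(words, sentences, syllables):
    """Flesch Reading Ease: 0-100 scale, higher = easier to read."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)

def flesch_kincaid_grade(words, sentences, syllables):
    """Flesch-Kincaid Grade Level: approximate US school grade required."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59

# statistics reported for the CoC text above
WORDS, SENTENCES, SYLLABLES = 1681, 65, 1.75 * 1681
```

The small residual difference in the Reading Ease score comes from the tool counting syllables per word exactly, while here we use the rounded 1.75 average it reported.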

Perhaps in next year's review we could try to ease this a little.

In the meantime, I was very happy to see so many new faces at
PostgresOpen SV 2018; maybe it's just a happy coincidence, but if this
effort had anything to do with drawing in more people, it was well
worth the effort!

Kevin Grittner

--
Kevin Grittner
VMware vCenter Server
https://www.vmware.com/



Re: fast default vs triggers

2018-09-19 Thread Andrew Dunstan




On 09/18/2018 03:36 PM, Andrew Dunstan wrote:



Tomas Vondra has pointed out to me that there's an issue with triggers 
not getting expanded tuples for columns with fast defaults. Here is an 
example that shows the issue:



   andrew=# create table blurfl (id int);
   CREATE TABLE
   andrew=# insert into blurfl select x from generate_series(1,5) x;
   INSERT 0 5
   andrew=# alter table blurfl add column x int default 100;
   ALTER TABLE
   andrew=# create or replace function showmej() returns trigger
   language plpgsql as $$ declare j json; begin j := to_json(old);
   raise notice 'old x: %', j->>'x'; return new; end; $$;
   CREATE FUNCTION
   andrew=# create trigger show_x before update on blurfl for each row
   execute procedure showmej();
   CREATE TRIGGER
   andrew=# update blurfl set id = id where id = 1;
   NOTICE:  old x: 
   UPDATE 1
   andrew=# update blurfl set id = id where id = 1;
   NOTICE:  old x: 100
   UPDATE 1
   andrew=#


The error is fixed with this patch:


   diff --git a/src/backend/commands/trigger.c b/src/backend/commands/trigger.c
   index 2436692..f34a72a 100644
   --- a/src/backend/commands/trigger.c
   +++ b/src/backend/commands/trigger.c
   @@ -3396,7 +3396,11 @@ ltrmark:;
                LockBuffer(buffer, BUFFER_LOCK_UNLOCK);
            }
   
   -        result = heap_copytuple(&tuple);
   +        if (HeapTupleHeaderGetNatts(tuple.t_data) < relation->rd_att->natts)
   +            result = heap_expand_tuple(&tuple, relation->rd_att);
   +        else
   +            result = heap_copytuple(&tuple);
   +
            ReleaseBuffer(buffer);
            return result;

I'm going to re-check the various places that this might have been 
missed. I guess it belongs on the open items list.








I haven't found anything further that is misbehaving. I paid particular 
attention to indexes and index-only scans.


I propose to commit this along with an appropriate regression test.
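To see why the unexpanded OLD tuple showed a NULL in the example above, here is a toy model - illustrative Python, not the actual heap code - of what heap_expand_tuple does: rows written before ALTER TABLE ... ADD COLUMN ... DEFAULT store fewer attributes than the current descriptor, and a reader must fill in the trailing ones from the stored "fast defaults".

```python
def expand_tuple(stored_values, descriptor):
    """Toy model of heap_expand_tuple.

    descriptor is a list of (column_name, fast_default) pairs; a row
    written before the columns were added carries fewer values than the
    descriptor has columns, and the missing trailing attributes take
    their fast defaults.
    """
    missing = len(descriptor) - len(stored_values)
    assert missing >= 0, "a row cannot have more attributes than the descriptor"
    if missing == 0:
        return list(stored_values)                         # heap_copytuple path
    return list(stored_values) + [d for _, d in descriptor[-missing:]]
```

In the trigger bug, the copy-without-expansion path was taken for a short tuple, so OLD.x came back as NULL instead of the fast default 100.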

cheers

andrew

--
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: Code of Conduct

2018-09-19 Thread Andrew Dunstan




On 09/19/2018 04:27 PM, Kevin Grittner wrote:

On Tue, Sep 18, 2018 at 5:28 AM Dave Page  wrote:

The PostgreSQL Core team are pleased to announce that following a long 
consultation process, the project’s Code of Conduct (CoC) has now been 
finalised and published at https://www.postgresql.org/about/policies/coc/.

Please take time to read and understand the CoC, which is intended to ensure 
that PostgreSQL remains an open and enjoyable project for anyone to join and 
participate in.

A Code of Conduct Committee has been formed to handle any complaints. This 
consists of the following volunteers:

- Stacey Haysler (Chair)
- Lætitia Avrot
- Vik Fearing
- Jonathan Katz
- Ilya Kosmodemiansky

We would like to extend our thanks and gratitude to Stacey Haysler for her 
patience and expertise in helping develop the Code of Conduct, forming the 
committee and guiding the work to completion.

My thanks to all who participated.



Indeed, many thanks.

[...]

In the meantime, I was very happy to see so many new faces at
PostgresOpen SV 2018; maybe it's just a happy coincidence, but if this
effort had anything to do with drawing in more people, it was well
worth the effort!




Yeah. The crowd also seemed noticeably more diverse than I have usually 
seen at Postgres conferences. That's a small beginning, but it's a 
welcome development.


cheers

andrew


--
Andrew Dunstan  https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: errmsg() ending with a period

2018-09-19 Thread Tom Lane
Daniel Gustafsson  writes:
> Happened to notice that two errmsg() strings end with a period, which IIUC is
> against the Error Message Style Guide.  Fixed in the attached patch.

Yup.  Pushed, thanks.

regards, tom lane



Re: Progress reporting for pg_verify_checksums

2018-09-19 Thread Michael Banck
Hi,

thanks for the review!

On Wed, Sep 19, 2018 at 05:17:05PM +0200, Fabien COELHO wrote:
> >>This optionally prints the progress of pg_verify_checksums via read
> >>kilobytes to the terminal with the new command line argument -P.
> >>
> >>If -P was forgotten and pg_verify_checksums operates on a large cluster,
> >>the caller can send SIGUSR1 to pg_verify_checksums to turn progress
> >>status reporting on during runtime.
> >
> >Version 2 of the patch is attached. This is rebased to master as of
> >422952ee and changes the signal handling to be a toggle as suggested by
> >Alvaro.
> 
> Patch applies cleanly and compiles.
> 
> About tests: "make check" is okay, but ITSM that the command is not started
> once, ever, in any test:-( It is unclear whether any test triggers checksums
> anyway...

Yeah, this was discussed on another thread and some basic tap tests for
pg_verify_checksums were proposed (also by me), but this hasn't been
further addressed.

Personally, I think this would be a good thing to add to v11 even.

In any case, this particular feature might not be very easy to tap test,
but I am open to suggestions, of course.

> The time() granularity to the second makes the result awkward on small
> tests:
> 
>  8/1554552 kB (0%, 8 kB/s)
>  1040856/1554552 kB (66%, 1040856 kB/s)
>  1554552/1554552 kB (100%, 1554552 kB/s)
> 
> I'd suggest to reuse the "portability/instr_time.h" stuff which allows a
> much better precision.
> 
> The "kB" kilo-byte symbols stands for 1000 bytes, but you are displaying
> 1024-bytes kilos, which symbol is either "KiB" or "KB". Given that we are
> about storage which is usually measured in powers of 1,000, so I'd suggest
> to use thousands.
> 
> Another reserve I have is that on any hardware it is likely that there will
> be a lot of kilos, so maybe megas (MB) should be used instead.

The display is exactly the same as in pg_basebackup (except the
bandwidth is shown as well), so right now I think it is more useful to be
consistent here.  So if we change that, pg_basebackup should be changed
as well I think.

Maybe the code could be factored out to some common file in the future.
 
> I'm wondering whether the bandwidth display should be local (from the last
> display) or global (from the start of the command), but for the last forced
> one. This is an open question.


 
> Maybe it would be nice to show elapsed time and expected completion time
> based on the current speed.

Maybe; this could be added to the factored out common code I mentioned
above.
 
> I would be in favor or having this turned on by default, and silenced with
> some option. I'm not sure other people would agree, though, so it is an open
> question as well.

If this runs in a script, it would be pretty annoying, so everybody
would have to add --no-progress so I am not convinced. Also, both
pg_basebackup and pgbench don't show progress by default.
 
> About the code:
> 
>  +   if (show_progress)
>  +   show_progress = false;
>  +   else
>  +   show_progress = true;
> 
> Why not a simple:
> 
>   /* invert current state */
>   show_progress = !show_progress;

Fair enough.
 
> I do not see much advantage in using intermediate string buffers for values:
> 
>  + snprintf(totalstr, sizeof(totalstr), INT64_FORMAT,
>  +  total_size / 1024);
>  + snprintf(currentstr, sizeof(currentstr), INT64_FORMAT,
>  +  current_size / 1024);
>  + snprintf(currspeedstr, sizeof(currspeedstr), INT64_FORMAT,
>  +  (current_size / 1024) / (((time(NULL) - scan_started) == 0) ? 1 : 
> (time(NULL) - scan_started)));
>  + fprintf(stderr, "%s/%s kB (%d%%, %s kB/s)",
>  +  currentstr, totalstr, total_percent, currspeedstr);
> 
> ISTM that just one simple fprintf would be fine:
> 
>   fprintf(stderr, INT64_FORMAT "/" INT64_FORMAT " ...",
>   formulas for each slot...);

That code is adapted from pg_basebackup.c and the comment there says:

| * Separate step to keep platform-dependent format code out of
| * translatable strings.  And we only test for INT64_FORMAT
| * availability in snprintf, not fprintf.

So that sounds legit to me.

> ISTM that this line overwriting trick deserves a comment:
> 
>  + if (isatty(fileno(stderr)))
>  +   fprintf(stderr, "\r");
>  + else
>  +   fprintf(stderr, "\n");

I agree a comment should be in there. But that code is also taken
verbatim from pg_basebackup.c (but this time, there's no comment there,
either).

How about this:

| * If we are reporting to a terminal, send a carriage return so that
| * we stay on the same line.  If not, send a newline. 

> And it runs a little amok with "-v".

Do you suggest we should make those mutually exclusive?  I agree that in
general, -P is not very useful if -v is on, but if you have a really big
table, it should still be, no?

>  + memset(&act, '\0', sizeof(act));
> 
> pg uses 0 instead of '\0' everywhere else.

Ok.
 
>   +   /* Make sure we just 

Re: PATCH: pgbench - option to build using ppoll() for larger connection counts

2018-09-19 Thread Tom Lane
"Rady, Doug"  writes:
> This patch enables building pgbench to use ppoll() instead of select()
> to allow for more than (FD_SETSIZE - 10) connections.  As implemented,
> when using ppoll(), the only connection limitation is system resources.

So ... why exactly is this patch insisting on ppoll() rather than just
plain poll()?  AFAICS, all you're doing with that is being able to
specify the timeout in microsec not millisec, which does not really
justify taking much of a hit in portability, to my mind.

> “… ppoll() is to poll() as pselect() is to select().  Since the
> existing code uses select(), why not convert to poll() rather than
> ppoll()?”

A moment's glance at our git history will remind you that we attempted
to start using pselect() last year, and the attempt crashed and burned:


Author: Tom Lane 
Branch: master Release: REL_10_BR [64925603c] 2017-04-24 18:29:03 -0400

Revert "Use pselect(2) not select(2), if available, to wait in postmaster's 
loop."

This reverts commit 81069a9efc5a374dd39874a161f456f0fb3afba4.

Buildfarm results suggest that some platforms have versions of pselect(2)
that are not merely non-atomic, but flat out non-functional.  Revert the
use-pselect patch to confirm this diagnosis (and exclude the no-SA_RESTART
patch as the source of trouble).  If it's so, we should probably look into
blacklisting specific platforms that have broken pselect.

Discussion: https://postgr.es/m/9696.1493072...@sss.pgh.pa.us


Now, it might be that ppoll doesn't suffer from as many portability
problems as pselect, but I don't see a good reason to assume that.
So I'd rather see if we can't convert this to use poll(), and thereby
ensure that it works basically everywhere.

regards, tom lane



Re: [HACKERS] SERIALIZABLE with parallel query

2018-09-19 Thread Kevin Grittner
After reviewing the thread and the current two patches, I agree with
Masahiko Sawada plus preferring one adjustment to the coding: I would
prefer to break out the majority of the ReleasePredicateLocks function
to a static ReleasePredicateLocksMain (or similar) function and
eliminating the goto.

The optimization in patch 0002 is important.  Previous benchmarks
showed a fairly straightforward pgbench test scaled as well as
REPEATABLE READ when it was present, but performance fell off up to
20% as the scale increased without it.

I will spend a few more days in testing and review, but figured I
should pass along "first impressions" now.

On Tue, Jul 10, 2018 at 8:58 PM Masahiko Sawada  wrote:
>
> On Mon, Jul 2, 2018 at 3:12 PM, Masahiko Sawada  wrote:
> > On Fri, Jun 29, 2018 at 7:28 PM, Thomas Munro
> >  wrote:
> >> On Thu, Jun 28, 2018 at 7:55 PM, Masahiko Sawada  
> >> wrote:
> >>> I'd like to test and review this patches but they seem to conflict
> >>> with current HEAD. Could you please rebase them?
> >>
> >> Hi Sawada-san,
> >>
> >> Thanks!  Rebased and attached.  The only changes are: the LWLock
> >> tranche is now shown as "serializable_xact" instead of "sxact" (hmm,
> >> LWLock tranches have lower_case_names_with_underscores, but individual
> >> LWLocks have CamelCaseName...), and ShareSerializableXact() no longer
> >> does Assert(!IsParallelWorker()).  These changes are based on the last
> >> feedback from Robert.
> >>
> >
> > Thank you! Will look at patches.
> >
>
> I looked at this patches. The latest patch can build without any
> errors and warnings and pass all regression tests. I don't see
> critical bugs but there are random comments.
>
> +   /*
> +* If the leader in a parallel query earler stashed a 
> partially
> +* released SERIALIZABLEXACT for final clean-up at end
> of transaction
> +* (because workers might still have been accessing
> it), then it's
> +* time to restore it.
> +*/
>
> There is a typo.
> s/earler/earlier/
>
> 
> Should we add test to check if write-skew[1] anomaly doesn't happen
> even in parallel mode?
>
> 
> - * We aren't acquiring lightweight locks for the predicate lock or lock
> + * We acquire an LWLock in the case of parallel mode, because worker
> + * backends have access to the leader's SERIALIZABLEXACT.  Otherwise,
> + * we aren't acquiring lightweight locks for the predicate lock or lock
>
> There are LWLock and lightweight lock. Maybe it's better to unify the 
> spelling.
>
> 
> @@ -2231,9 +2234,12 @@ PrepareTransaction(void)
> /*
>  * Mark serializable transaction as complete for predicate locking
>  * purposes.  This should be done as late as we can put it and
> still allow
> -* errors to be raised for failure patterns found at commit.
> +* errors to be raised for failure patterns found at commit.
> This is not
> +* appropriate for parallel workers however, because we aren't
> committing
> +* the leader's transaction and its serializable state will live on.
>  */
> -   PreCommit_CheckForSerializationFailure();
> +   if (!IsParallelWorker())
> +   PreCommit_CheckForSerializationFailure();
>
> This code assumes that parallel workers could prepare transaction. Is
> that expected behavior of parallel query? There is an assertion
> !IsInParallelMode() at the beginning of that function though.
>
> 
> +/*
> + * If the transaction is committing, but it has been partially released
> + * already, then treat this as a roll back.  It was marked as rolled 
> back.
> + */
> +if (isCommit && SxactIsPartiallyReleased(MySerializableXact))
> +isCommit = false;
> +
>
> Isn't it better to add an assertion to check if
> MySerializableXact->flags has SXACT_FLAG_ROLLED_BACK flag for safety?
>
> [1] https://en.wikipedia.org/wiki/Snapshot_isolation#Definition
>
> Regards,
>
> --
> Masahiko Sawada
> NIPPON TELEGRAPH AND TELEPHONE CORPORATION
> NTT Open Source Software Center



-- 
Kevin Grittner
VMware vCenter Server
https://www.vmware.com/



Re: [HACKERS] Bug in to_timestamp().

2018-09-19 Thread Alexander Korotkov
On Wed, Sep 19, 2018 at 1:38 PM amul sul  wrote:
> On Wed, Sep 19, 2018 at 3:51 PM amul sul  wrote:
> >
> > On Wed, Sep 19, 2018 at 2:57 PM Alexander Korotkov
> [...]
> >
> > With this patch, to_date and to_timestamp behaving differently, see this:
> >
> > edb=# SELECT to_date('18 12 2011', 'xDDxMMx');
> >   to_date
> > 
> >  18-DEC-11 00:00:00
> > (1 row)
> >
> > edb=# SELECT to_timestamp('18 12 2011', 'xDDxMMx');
> >to_timestamp
> > ---
> >  08-DEC-11 00:00:00 -05:00  <=== Incorrect output.
> > (1 row)
> >
> Sorry, this was wrong info -- with this patch, I had some mine trial changes.
>
> Both to_date and to_timestamp behaving same with your patch -- the
> wrong output, we are expecting that?
>
> postgres =# SELECT to_date('18 12 2011', 'xDDxMMx');
>   to_date
> 
>  2011-12-08
> (1 row)
>
> postgres =# SELECT to_timestamp('18 12 2011', 'xDDxMMx');
>   to_timestamp
> 
>  2011-12-08 00:00:00-05
> (1 row)

It's hard to say whether it was expected, because it wasn't
properly documented.  More importantly, it's the same behavior we had
before cf984672, and the purpose of cf984672 was not to change this.

But judging from the code comments, it's intentional.  If you put digits or
text characters into the format string, you can skip non-separator input
string characters.  For instance you may do:

# SELECT to_date('y18y12y2011', 'xDDxMMx');
  to_date

 2011-12-18
(1 row)

It's even more interesting that letters and digits are handled in a
different manner.

# SELECT to_date('01801202011', 'xDDxMMx');
ERROR:  date/time field value out of range: "01801202011"
Time: 0,453 ms

# SELECT to_date('01801202011', '9DD9MM9');
  to_date

 2011-12-18
(1 row)

So, letters in the format string don't allow you to extract fields at
particular positions of a digit sequence, but digits in the format string
do.  That's rather undocumented, but from the code you can tell that it's
intentional.  Thus, I think it would be nice to improve the documentation
here.  But I still propose to commit the patch I posted to undo the
unintentional behavior change in cf984672.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


