date:20151002

Re: [HACKERS] WIP: Rework access method interface

2015-10-02 Thread Amit Kapila

On Fri, Oct 2, 2015 at 8:14 PM, Alexander Korotkov <
a.korot...@postgrespro.ru> wrote:
>
>
> I agree about staying with one SQL-visible function.
>
> Other changes:
>  * Documentation reflects interface changes.
>  * IndexAmRoutine moved from CacheMemoryContext to indexcxt.
>  * Various minor formatting improvements.
>  * Error messages are corrected.
>

Few assorted comments:

1.
+  * Get IndexAmRoutine structure from access method oid.
+  */
+ IndexAmRoutine *
+ GetIndexAmRoutine(Oid
amoid)
+ {
+ IndexAmRoutine *result;
+ HeapTuple tuple;
+ regproc
amhandler;
+
+ tuple = SearchSysCache1(AMOID, ObjectIdGetDatum(amoid));
+ if (!HeapTupleIsValid
(tuple))
+ elog(ERROR, "cache lookup failed for access method %u",
+
amoid);
+ amhandler = ((Form_pg_am)GETSTRUCT(tuple))->amhandler;
+
+ if (!RegProcedureIsValid
(amhandler))
+ elog(ERROR, "invalid %u regproc", amhandler);

I have noticed that currently, the above kind of error is reported slightly
differently as in below code:
if (!RegProcedureIsValid(procOid)) \
elog(ERROR, "invalid %s regproc", CppAsString
(pname)); \

If you feel it is better to do the way as it is in current code, then you
can change accordingly.

2.

 Access methods that always return entries in the natural ordering
 of their data (such
as btree) should set
!pg_am.amcanorder to true.
 Currently, such
access methods must use btree-compatible strategy
 numbers for their equality and ordering operators.

--- 545,551 

 Access methods that always return entries in the natural
ordering
 of their data (such as btree) should set
!amcanorder to true.

Currently, such access methods must use btree-compatible strategy
 numbers for their equality and
ordering operators.

Isn't it better to use structure while referencing the field of it?

3.
!Handler function must be written in a compiled language such as C,
using
!the version-1 interface.

Here, it is not completely clear, what do you refer to as version-1
interface.

4.
xindex.sgml
Index Methods and Operator Classes
..
It is possible to add a
   new index method by defining the required interface routines and
   then creating a row in pg_am — but that is
   beyond the scope of this chapter (see ).

I think changing above to indicate something about handler function
could be useful.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: [HACKERS] pgbench stats per script & other stuff

2015-10-02 Thread Fabien COELHO



Here is a v10, which is a rebase because of the "--progress-timestamp" option 
addition.


Here is a v11, which is a rebase after some recent changes committed to 
pgbench.


--
Fabien.diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index 0ac40f1..e3a90e5 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -261,6 +261,25 @@ pgbench  options  dbname
 benchmarking arguments:
 
 
+ 
+  -b scriptname[@weight]
+  --builtin scriptname[@weight]
+  
+   
+Add the specified builtin script to the list of executed scripts.
+An optional integer weight after @ allows to adjust the
+probability of drawing the test.
+Available builtin scripts are: tpcb-like,
+simple-update and select-only.
+The provided scriptname needs only to be a prefix
+of the builtin name, hence simp would be enough to select
+simple-update.
+With special name list, show the list of builtin scripts
+and exit immediately.
+   
+  
+ 
+
 
  
   -c clients
@@ -307,14 +326,15 @@ pgbench  options  dbname
  
 
  
-  -f filename
-  --file=filename
+  -f filename[@weight]
+  --file=filename[@weight]
   

-Read transaction script from filename.
+Add a transaction script read from filename to
+the list of executed scripts.
+An optional integer weight after @ allows to adjust the
+probability of drawing the test.
 See below for details.
--N, -S, and -f
-are mutually exclusive.

   
  
@@ -404,10 +424,7 @@ pgbench  options  dbname
   --skip-some-updates
   

-Do not update pgbench_tellers and
-pgbench_branches.
-This will avoid update contention on these tables, but
-it makes the test case even less like TPC-B.
+Shorthand for -b simple-update@1.

   
  
@@ -512,9 +529,9 @@ pgbench  options  dbname
 Report the specified scale factor in pgbench's
 output.  With the built-in tests, this is not necessary; the
 correct scale factor will be detected by counting the number of
-rows in the pgbench_branches table.  However, when testing
-custom benchmarks (-f option), the scale factor
-will be reported as 1 unless this option is used.
+rows in the pgbench_branches table.
+However, when testing only custom benchmarks (-f option),
+the scale factor will be reported as 1 unless this option is used.

   
  
@@ -524,7 +541,7 @@ pgbench  options  dbname
   --select-only
   

-Perform select-only transactions instead of TPC-B-like test.
+Shorthand for -b select-only@1.

   
  
@@ -674,7 +691,20 @@ pgbench  options  dbname
   What is the Transaction Actually Performed in pgbench?
 
   
-   The default transaction script issues seven commands per transaction:
+   Pgbench executes test scripts chosen randomly from a specified list.
+   They include built-in scripts with -b and
+   user-provided custom scripts with -f.
+   Each script may be given a relative weight specified after a
+   @ so as to change its drawing probability.
+   The default weight is 1.
+ 
+
+  
+   The default builtin transaction script (also invoked with -b tpcb-like)
+   issues seven commands per transaction over randomly chosen aid,
+   tid, bid and balance.
+   The scenario is inspired by the TPC-B benchmark, but is not actually TPC-B,
+   hence the name.
   
 
   
@@ -688,9 +718,15 @@ pgbench  options  dbname
   
 
   
-   If you specify -N, steps 4 and 5 aren't included in the
-   transaction.  If you specify -S, only the SELECT is
-   issued.
+   If you select the simple-update builtin (also -N),
+   steps 4 and 5 aren't included in the transaction.
+   This will avoid update contention on these tables, but
+   it makes the test case even less like TPC-B.
+  
+
+  
+   If you select the select-only builtin (also -S),
+   only the SELECT is issued.
   
  
 
@@ -702,10 +738,7 @@ pgbench  options  dbname
benchmark scenarios by replacing the default transaction script
(described above) with a transaction script read from a file
(-f option).  In this case a transaction
-   counts as one execution of a script file.  You can even specify
-   multiple scripts (multiple -f options), in which
-   case a random one of the scripts is chosen each time a client session
-   starts a new transaction.
+   counts as one execution of a script file.
   
 
   
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index f2d435b..0e56ed7 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -164,12 +164,11 @@ bool		use_log;			/* log transaction latencies to a file */
 bool		use_quiet;			/* quiet logging onto stderr */
 int			agg_interval;		/* log aggregates instead of individual

Re: [HACKERS] Idea for improving buildfarm robustness

2015-10-02 Thread Michael Paquier

On Sat, Oct 3, 2015 at 1:39 PM, Tom Lane  wrote:

> I wrote:
> > Here's a rewritten patch that looks at postmaster.pid instead of
> > pg_control.  It should be effectively the same as the prior patch in
> terms
> > of response to directory-removal cases, and it should also catch many
> > overwrite cases.
>
> BTW, my thought at the moment is to wait till after next week's releases
> to push this in.  I think it's probably solid, but it doesn't seem like
> it's worth taking the risk of pushing shortly before a wrap date.
>

That seems a wiser approach to me. Down to which version are you planning a
backpatch? As this is aimed for the buildfarm stability with TAP stuff, 9.4?
-- 
Michael

Re: [HACKERS] Proposing COPY .. WITH PERMISSIVE

2015-10-02 Thread dinesh kumar

On Wed, Sep 2, 2015 at 1:19 PM, Stefan Kaltenbrunner <
ste...@kaltenbrunner.cc> wrote:

> On 09/02/2015 10:10 PM, dinesh kumar wrote:
> > On Tue, Sep 1, 2015 at 10:58 PM, Stefan Kaltenbrunner
> > mailto:ste...@kaltenbrunner.cc>> wrote:
> >
> > On 07/25/2015 03:38 AM, dinesh kumar wrote:
> > >
> > >
> > > On Fri, Jul 24, 2015 at 10:22 AM, Robert Haas <
> robertmh...@gmail.com 
> > > >>
> wrote:
> > >
> > > On Thu, Jul 23, 2015 at 8:15 PM, dinesh kumar
> > > mailto:dineshkuma...@gmail.com>
> > >>
> > wrote:
> > > > On Thu, Jul 23, 2015 at 9:21 AM, Robert Haas
> > > mailto:robertmh...@gmail.com>
> > >>
> wrote:
> > > >>
> > > >> On Thu, Jul 23, 2015 at 12:19 PM, dinesh kumar
> > > mailto:dineshkuma...@gmail.com>
> > >>
> > > >> wrote:
> > > >> > Sorry for my  unclear description about the proposal.
> > > >> >
> > > >> > "WITH PERMISSIVE" is equal to our existing behavior. That
> > is, chmod=644
> > > >> > on
> > > >> > the created files.
> > > >> >
> > > >> > If User don't specify "PERMISSIVE" as an option, then the
> > chmod=600 on
> > > >> > created files. In this way, we can restrict the other
> > users from reading
> > > >> > these files.
> > > >>
> > > >> There might be some benefit in allowing the user to choose
> the
> > > >> permissions, but (1) I doubt we want to change the default
> > behavior
> > > >> and (2) providing only two options doesn't seem flexible
> > enough.
> > > >>
> > > >
> > > > Thanks for your inputs Robert.
> > > >
> > > > 1) IMO, we will keep the exiting behavior as it is.
> > > >
> > > > 2) As the actual proposal talks about the permissions of
> > group/others. So,
> > > > we can add few options as below to the WITH clause
> > > >
> > > > COPY
> > > > ..
> > > > ..
> > > > WITH
> > > > [
> > > > NO
> > > > (READ,WRITE)
> > > > PERMISSION TO
> > > > (GROUP,OTHERS)
> > > > ]
> > >
> > > If we're going to do anything here, it should use COPY's
> > > extensible-options syntax, I think.
> > >
> > >
> > > Thanks Robert. Let me send a patch for this.
> >
> >
> > how are you going to handle windows or unix ACLs here?
> > Its permission model is quite different and more powerful than
> (non-acl
> > based) unix in general, handling this in a flexible way might soon
> get
> > very complicated and complex for limited gain...
> >
> >
> > Hi Stefan,
> >
> > I had the same questions too. But, I believe, our initdb works in these
> > cases, after creating the data cluster. Isn't ?
>
> maybe - but having a fixed "default"  is very different from baking a
> classic unix permission concept of user/group/world^others into actual
> DDL or into a COPY option. The proposed syntax might make some sense to
> a admin used to a unix style system but it is likely utterly
> incomprehensible to somebody who is used to (windows style) ACLs.
>
> I dont have a good answer on what to do else atm but I dont think we
> should embedded traditional/historical unix permission models in our
> grammer unless really really needed..
>

Yes, I agree with you. We shouldn't embed the unix style access model into
COPY.

COPY's default behaviour umask 644 on output files, which is giving read
access to other users.
I see, there is a good reason behind it. Also, it would be good to have a
control on the READ ACCESS of a file, where we can secure our dump files,
from the non instance ownership users.

Adding a trivial patch to control this read ACL.

--Sample Test Case

--Default behaviour
postgres=# COPY test_table TO '/tmp/readacs.txt';
COPY 1000

$ ls -l /tmp/readacs.txt
-rw-r--r--  /tmp/test1.txt

--With applied patch
postgres=# COPY test_table TO '/tmp/noreadacs.txt' NO READ ACCESS;
COPY 1000

$ ls -l /tmp/noreadacs.txt
-rw--- /tmp/noreadacs.txt

We can also use "PROGRAM 'cat > Output.csv' " to achieve this "NO READ
ACCESS", since the program is always running as a instance owner.

Let me know your inputs and thoughts.

Stefan
>


-- 

Regards,
Dinesh
manojadinesh.blogspot.com
diff --git a/doc/src/sgml/ref/copy.sgml b/doc/src/sgml/ref/copy.sgml
index 2850b47..eb53ddf 100644
--- a/doc/src/sgml/ref/copy.sgml
+++ b/doc/src/sgml/ref/copy.sgml
@@ -44,6 +44,7 @@ COPY { table_name [ ( column_name 
[, ...] )
 FORCE_NULL ( column_name [, 
...] )
 ENCODING 'encoding_name'
+NO READ ACCESS
 
  
 
@@ -356,7 +357,17 @@ COPY {

Re: [HACKERS] Idea for improving buildfarm robustness

2015-10-02 Thread Tom Lane

I wrote:
> Here's a rewritten patch that looks at postmaster.pid instead of
> pg_control.  It should be effectively the same as the prior patch in terms
> of response to directory-removal cases, and it should also catch many
> overwrite cases.

BTW, my thought at the moment is to wait till after next week's releases
to push this in.  I think it's probably solid, but it doesn't seem like
it's worth taking the risk of pushing shortly before a wrap date.

If anyone wants to argue for including it in the releases, speak up ...

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Parallel Seq Scan

2015-10-02 Thread Amit Kapila

On Thu, Oct 1, 2015 at 7:41 PM, Robert Haas  wrote:
>
>
> Does heap_parallelscan_nextpage really need a pscan_finished output
> parameter, or can it just return InvalidBlockNumber to indicate end of
> scan?  I think the latter can be done and would be cleaner.
>

I think we can do that way as well, however we have to keep the check
of page == InvalidBlockNumber at all the callers to indicate finish
of scan which made me write the function as it exists in patch.  However,
I don't mind changing it the way you have suggested if you feel that would
be cleaner.

> There doesn't seem to be anything that ensures that everybody who's
> scanning the relation agrees on whether we're doing a synchronized
> scan.
>

I think that is implicitly ensured.  We perform sync scan's based
on GUC synchronize_seqscans and few other conditions in initscan
which I think will ensure that all workers will perform sync scans
on relation.  Do you see any problem with exiting logic which would
break the guarantee of sync scans on a relation among parallel
workers?

>  I think that heap_parallelscan_initialize() should taken an
> additional Boolean argument, allow_sync.  It should decide whether to
> actually perform a syncscan using the logic from initscan(), and then
> it should store a phs_syncscan flag into the ParallelHeapScanDesc.
> heap_beginscan_internal should set rs_syncscan based on phs_syncscan,
> regardless of anything else.
>

I think this will ensure that any future change in this area won't break the
guarantee for sync scans for parallel workers, is that the reason you
prefer to implement this function in the way suggested by you?

> I think heap_parallel_rescan() is an unworkable API.  When rescanning
> a synchronized scan, the existing logic keeps the original start-block
> so that the scan's results are reproducible,

It seems from the code comments in initscan, the reason for keeping
previous startblock is to allow rewinding the cursor which doesn't hold for
parallel scan.  We might or might not want to support such cases with
parallel query, but even if we want to there is a way we can do with
current logic (as mentioned in one of my replies below).

> but no longer reports the
> scan position since we're presumably out of step with other backends.

Is it true for all form of rescans or are you talking about some
case (like SampleScan) in particular?  As per my reading of code
(and verified by debugging that code path), it doesn't seem to be true
for rescan in case of seqscan.

> This isn't going to work at all with this API.  I don't think you can
> swap out the ParallelHeapScanDesc for another one once you've
> installed it;
>

Sure, but if we really need some such parameters like startblock position,
then we can preserve those in ScanDesc.

>
> This is obviously going to present some design complications for the
> as-yet-uncommitted code to push down PARAM_EXEC parameters, because if
> the new value takes more bytes to store than the old value, there
> won't be room to update the existing DSM in place.
>

PARAM_EXEC parameters will be used to support initPlan in parallel query (as
it is done in the initial patch), so it seems to me this is the main
downside of
this approach.  I think rather than trying to come up with various possible
solutions for this problem, lets prohibit sync scans with parallel query if
you
see some problem with the suggestions made by me above.  Do you see any
main use case getting hit due to non support of sync scans with
parallel seq. scan?

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: [HACKERS] Less than ideal error reporting in pg_stat_statements

2015-10-02 Thread Peter Geoghegan

On Fri, Oct 2, 2015 at 4:27 PM, Tom Lane  wrote:
> ... do you want to produce an updated patch?  I'm not planning to look at
> it until tomorrow anyway.

I'll post a new patch by about midday tomorrow west coast time.
Hopefully that works for you.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Less than ideal error reporting in pg_stat_statements

2015-10-02 Thread Tom Lane

... do you want to produce an updated patch?  I'm not planning to look at
it until tomorrow anyway.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Less than ideal error reporting in pg_stat_statements

2015-10-02 Thread Peter Geoghegan

On Fri, Oct 2, 2015 at 4:11 PM, Peter Geoghegan  wrote:
> Actually, isn't that another bug? The fact that we don't do the same
> from within gc_qtexts() in normal cases, even with an exclusive lock
> held? We do this:

Ah, no. We check pgss->gc_count in any case, so it should be fine.
That will also make it safe to do the unlink() as outlined already,
because a new qtext_load_file() call from
pg_stat_statements_internal() (due to gc_count bump) will allocate the
file again by name.

-- 
Peter Geoghegan

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Less than ideal error reporting in pg_stat_statements

2015-10-02 Thread Peter Geoghegan

On Fri, Oct 2, 2015 at 3:57 PM, Peter Geoghegan  wrote:
> The spinlock acquisition above is actually necessary despite the
> n_writers trick, because that's only used by qtext_store().

Actually, isn't that another bug? The fact that we don't do the same
from within gc_qtexts() in normal cases, even with an exclusive lock
held? We do this:

/* Reset the shared extent pointer */
pgss->extent = extent;

I saw one really weird case on a customer database, with an enormous
although totally repetitive query text and one entry total (I
mentioned this in passing up-thread). Although I'd be willing to
believe it was just a very odd use of the database, since apparently
they were doing some kind of stress-test, perhaps it could be better
explained by a bug like this.

To recap, for other people: pg_stat_statements_internal() may do this
without any shared lock held:

/* No point in loading file now if there are active writers */
if (n_writers == 0)
qbuffer = qtext_load_file(&qbuffer_size);

-- 
Peter Geoghegan

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Less than ideal error reporting in pg_stat_statements

2015-10-02 Thread Peter Geoghegan

On Fri, Oct 2, 2015 at 3:57 PM, Peter Geoghegan  wrote:
> (void) AllocateFile(PGSS_TEXT_FILE, PG_BINARY_W);

Naturally, a FreeFile() call will also be required for any
successfully allocated file.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Less than ideal error reporting in pg_stat_statements

2015-10-02 Thread Peter Geoghegan

On Fri, Oct 2, 2015 at 2:42 PM, Tom Lane  wrote:
> Peter Geoghegan  writes:
>> I think that SIZE_MAX should be replaced by MaxAllocHugeSize before
>> the patch is committed. That should be perfectly portable.
>
> Hmm ... only back to 9.4, but I guess that's far enough.

I just realized that the existing gc_fail handler label within
gc_qtexts() lacks something like this, too:

unlink(PGSS_TEXT_FILE);
(void) AllocateFile(PGSS_TEXT_FILE, PG_BINARY_W);
SpinLockAcquire(&s->mutex);
pgss->extent = 0;
SpinLockRelease(&s->mutex);

I think it should do this anyway, but it makes particular sense in
light of the proposed changes. All existing failure cases within
gc_qtexts() seem like a good reason to give up forever.

The spinlock acquisition above is actually necessary despite the
n_writers trick, because that's only used by qtext_store().

-- 
Peter Geoghegan

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: Request for dogfood volunteers (was [HACKERS] No Issue Tracker - Say it Ain't So!)

2015-10-02 Thread Joshua D. Drake


On 10/02/2015 01:26 PM, Alvaro Herrera wrote:


However, the contact surface between these two options wasn't really
well polished.  Formatting would be lost very frequently: I could write
a nice email, and the customer would get a nice email, but if you looked
 at it in the web, it was very ugly.


Newer versions are much better at this from what I can tell.


 If you used the web form to reply,
the resulting email looked pretty stupid in some cases.  I eventually
learned to use the right {{{ }}} markers in my email replies so that
code would look right in the web.  But if you made a single mistake, you
were fscked and there was no way at all to fix it.


I don't know anything about the brackets but I do know one thing that is 
unfortunate about redmine is that it assumes plain text unless you tell 
it something different. What does that mean?


If you want to use a preformatted (text based) entry:


psql -U postgres


Will do exactly the same thing in the web interface. Of course in the 
web interface it gets formatted so it looks great. In email, you get 
HTML code.


I find is that as long as you are working in just text, everything is 
kosher. If you try to do any formatting (so it looks nice on the web), 
you run into problems.




If you look at the debbugs interface, it is pretty clear that all that
it does is keep track of emails -- which, let it be said, is the soul of
this community's communication, so it seems to me that that is what we
need.  Metadata changes are kept visually separate from actual
commentary, which is convenient; and you can always get the mbox
involving that bug, or look at minute details of it using the web
interface if you need that sort of thing.


It is true (I believe roundup doesn't something similar to debbugs) that 
Redmine considers email a business class citizen, not coach but 
certainly not first class.


My main issue with debbugs is that it appears very limited and is yet 
another piece of infrastructure that only provides that infrastructure 
and thus will continue to cause more heartache than is needed. That 
isn't to say it is a bad piece of software but that it is very specific 
in what it does.


We seem to need more than that. As another person (I don't recall who) 
mentioned, Redmine gives us an infrastructure to many things including 
proper mapping between issues and GIT. Perhaps we don't use that now but 
do we want to be in a position a year from now where we want it but now 
we have pitched our tent with a more limited piece of software?


I do appreciate the feedback on it and I am in no way suggesting that 
Redmine is perfect only that it "might" be the solution.


Sincerely,

JD





--
Command Prompt, Inc. - http://www.commandprompt.com/  503-667-4564
PostgreSQL Centered full stack support, consulting and development.
Announcing "I'm offended" is basically telling the world you can't
control your own emotions, so everyone else should do it for you.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Less than ideal error reporting in pg_stat_statements

2015-10-02 Thread Tom Lane

Peter Geoghegan  writes:
> I think that SIZE_MAX should be replaced by MaxAllocHugeSize before
> the patch is committed. That should be perfectly portable.

Hmm ... only back to 9.4, but I guess that's far enough.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Less than ideal error reporting in pg_stat_statements

2015-10-02 Thread Peter Geoghegan

On Fri, Oct 2, 2015 at 2:29 PM, Tom Lane  wrote:
> I'm not too impressed with this bit:
>
> /* Allocate buffer; beware that off_t might be wider than size_t */
> -   if (stat.st_size <= MaxAllocSize)
> +   if (stat.st_size <= SIZE_MAX)
> buf = (char *) malloc(stat.st_size);
>
> because there are no, zero, not one uses of SIZE_MAX in our code today,
> and I do not see such a symbol required by the POSIX v2 spec either.

See my remarks just now. This can easily be fixed.


-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Less than ideal error reporting in pg_stat_statements

2015-10-02 Thread Tom Lane

Peter Geoghegan  writes:
> It would be nice to get this committed before the next point releases
> are tagged, since I've now heard a handful of complaints like this.

I'm not too impressed with this bit:

/* Allocate buffer; beware that off_t might be wider than size_t */
-   if (stat.st_size <= MaxAllocSize)
+   if (stat.st_size <= SIZE_MAX)
buf = (char *) malloc(stat.st_size);

because there are no, zero, not one uses of SIZE_MAX in our code today,
and I do not see such a symbol required by the POSIX v2 spec either.
Perhaps this will work, but you're asking us to introduce a brand new
portability hazard just hours before a wrap deadline.  That is not
happening.

Other than that, this seems roughly sane, though I've not read it in
detail or tested it.  Does anyone have an objection to trying to squeeze
in something along this line?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Less than ideal error reporting in pg_stat_statements

2015-10-02 Thread Peter Geoghegan

On Fri, Oct 2, 2015 at 2:04 PM, Peter Geoghegan  wrote:
> It would be nice to get this committed before the next point releases
> are tagged, since I've now heard a handful of complaints like this.

I grep'd for SIZE_MAX to make sure that that was something that is
available on all supported platforms, since it's C99. What I
originally thought was code now turns out to actually be a code-like
comment within aset.c.

I think that SIZE_MAX should be replaced by MaxAllocHugeSize before
the patch is committed. That should be perfectly portable.

-- 
Peter Geoghegan

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_stat_statements query jumbling question

2015-10-02 Thread Peter Geoghegan

On Wed, Sep 2, 2015 at 7:41 PM, Satoshi Nagayasu  wrote:
> I know this still needs to be discussed, but I would like to submit
> a patch for further discussion at the next CF, 2015-11.

I think I already expressed this in my explanation of the current
behavior, but to be clear: -1 from me to this proposal.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Less than ideal error reporting in pg_stat_statements

2015-10-02 Thread Peter Geoghegan

On Tue, Sep 22, 2015 at 6:01 PM, Peter Geoghegan  wrote:
> On Tue, Sep 22, 2015 at 5:01 PM, Peter Geoghegan  wrote:
>> My guess is that this very large query involved a very large number of
>> constants, possibly contained inside an " IN ( )". Slight variants of
>> the same query, that a human would probably consider to be equivalent
>> have caused artificial pressure on garbage collection.
>
> I could write a patch to do compaction in-place.

In the end, I decided on a simpler approach to fixing this general
sort of problem with the attached patch. See commit message for
details.

I went this way not because compaction in-place was necessarily a bad
idea, but because I think that a minimal approach will work just as
well in real world cases.

It would be nice to get this committed before the next point releases
are tagged, since I've now heard a handful of complaints like this.

-- 
Peter Geoghegan
From 5afb0be51bfe0a7013452a6d591abb75f796a898 Mon Sep 17 00:00:00 2001
From: Peter Geoghegan 
Date: Tue, 29 Sep 2015 15:32:51 -0700
Subject: [PATCH] Fix pg_stat_statements garbage collection bugs

Previously, pg_stat_statements garbage collection could encounter
problems when many large, technically distinct queries were executed
without completing successfully.  Garbage collection could enter a state
where it would never complete successfully, with the external query text
file left to grow indefinitely.  For example, this could happen when a
data integration process fails repeatedly before finally succeeding
(perhaps this process could recur intermittently over a fairly long
period).  Problematic queries might have referenced what are technically
distinct tables each time, as when a CREATE TABLE is performed as part
of the data integration operation, giving each execution a unique
queryid.

This example scenario should not pose any problem;  garbage collection
should still occur fairly frequently.  However, a spuriously high mean
query length was previously possible due to failed queries' sticky
entries artificially inflating mean query length during hash table
eviction.  The mechanism by which garbage collection is throttled to
prevent thrashing could therefore put it off for far too long.  By the
time a garbage collection actually occurs, the file may have become so
big that it may not be possible for memory to be allocated for all query
texts.  Worse still, once the MaxAllocSize threshold was crossed, it
became impossible for the system to recover even with sufficient memory
to allocate a buffer for all query texts.

pg_stat_statements now no longer weighs sticky entries when calculating
mean query text, which should prevent garbage collection from getting an
inaccurate idea of user-visible mean query text.  As a precaution,
pg_stat_statements may now avoid storing very large query texts after
parse analysis but ahead of query execution (entries for which no
statistics exist yet) based on certain criteria.  Garbage collection no
longer tolerates OOMs when allocating a buffer to store all existing
query texts.  MaxAllocSize no longer caps the buffer returned by
malloc() that is used to store query texts in memory during garbage
collection.

Backpatch to 9.4, where external query text storage was introduced.
---
 contrib/pg_stat_statements/pg_stat_statements.c | 53 +
 1 file changed, 46 insertions(+), 7 deletions(-)

diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
index 59b8a2e..97584b6 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -103,6 +103,7 @@ static const uint32 PGSS_PG_MAJOR_VERSION = PG_VERSION_NUM / 100;
 #define USAGE_INIT(1.0)	/* including initial planning */
 #define ASSUMED_MEDIAN_INIT		(10.0)	/* initial assumed median usage */
 #define ASSUMED_LENGTH_INIT		1024	/* initial assumed mean query length */
+#define LENGTH_OUTLIER_FACTOR	5		/* mean query length multiplier */
 #define USAGE_DECREASE_FACTOR	(0.99)	/* decreased every entry_dealloc */
 #define STICKY_DECREASE_FACTOR	(0.50)	/* factor for sticky entries */
 #define USAGE_DEALLOC_PERCENT	5		/* free this % of entries at once */
@@ -,6 +1112,10 @@ pgss_hash_string(const char *str)
  * If jstate is not NULL then we're trying to create an entry for which
  * we have no statistics as yet; we just want to record the normalized
  * query string.  total_time, rows, bufusage are ignored in this case.
+ * A query text may not actually end up being stored in this case
+ * (storing oversized texts is sometimes avoided), and in any event is
+ * not guaranteed to still be around if and when actual results need to
+ * be stored in a subsequent pgss_store() call.
  */
 static void
 pgss_store(const char *query, uint32 queryId,
@@ -1159,7 +1164,28 @@ pgss_store(const char *query, uint32 queryId,
 		 */
 		if (jstate)
 		{
+			Size		mean_query_len = Max(ASSUMED_LENGTH_INIT,
+			 pgss->mea

Re: [HACKERS] Confusing remark about UPSERT in fdwhandler.sgml

2015-10-02 Thread Robert Haas

On Fri, Oct 2, 2015 at 4:04 AM, Peter Geoghegan  wrote:
> On Fri, Oct 2, 2015 at 1:00 AM, Etsuro Fujita
>  wrote:
>> ISTM that the sentence "as remote constraints are not locally known" is
>> somewhat confusing, because check constrains on remote tables can be
>> defined locally in 9.5.  How about "unique constraints or exclusion
>> constraints on remote tables are not locally known"?  Attached is a
>> patch for that.
>
> Makes sense to me.

Me, too.  Committed.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: Request for dogfood volunteers (was [HACKERS] No Issue Tracker - Say it Ain't So!)

2015-10-02 Thread Robert Haas

On Fri, Oct 2, 2015 at 4:26 PM, Alvaro Herrera  wrote:
> Joshua D. Drake wrote:
>> Put succinctly, I am willing to put resources into testing Redmine for our
>> needs. I will need help to do so because I am not a committer/hacker. Andres
>> thinks that isn't worth while. I think he is wrong. If he doesn't want to
>> help, he doesn't have to, thus the call for volunteers.
>
> Nobody asked, but here's my opinion on Redmine.  I worked pretty heavily
> with it during my time at Command Prompt.  I have to say that with the
> customizations that CMD had at the time, it wasn't that bad -- I was
> pretty happy that I could interact with it via email, and most of the
> time it wouldn't do anything too stupid.  I could also interact with it
> using the web, and it worked pretty well there.  Most other Redmine
> installations I've used don't have the email interface at all.
>
> However, the contact surface between these two options wasn't really
> well polished.  Formatting would be lost very frequently: I could write
> a nice email, and the customer would get a nice email, but if you looked
> at it in the web, it was very ugly.  If you used the web form to reply,
> the resulting email looked pretty stupid in some cases.  I eventually
> learned to use the right {{{ }}} markers in my email replies so that
> code would look right in the web.  But if you made a single mistake, you
> were fscked and there was no way at all to fix it.
>
> As far as I remember, the main reason for this pain was that it didn't
> try to consider an email as an email: instead, what it did was grab the
> text and cram it into the comment box.  Same thing in the other
> direction, where the text from the comment would be crammed as an email
> in output.  All that was needed was for it to store emails in the
> rfc/2822 format in the database, and then render them as emails in the
> web form, instead of trying to do the conversion in the process.
>
>
> If you look at the debbugs interface, it is pretty clear that all that
> it does is keep track of emails -- which, let it be said, is the soul of
> this community's communication, so it seems to me that that is what we
> need.  Metadata changes are kept visually separate from actual
> commentary, which is convenient; and you can always get the mbox
> involving that bug, or look at minute details of it using the web
> interface if you need that sort of thing.

Thanks for this very thoughtful email.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] row_security GUC, BYPASSRLS

2015-10-02 Thread Robert Haas

On Thu, Oct 1, 2015 at 11:10 PM, Noah Misch  wrote:
> On Mon, Sep 28, 2015 at 05:13:56PM -0400, Stephen Frost wrote:
>> * Noah Misch (n...@leadboat.com) wrote:
>> > In schema reviews, I will raise a red flag for use of this feature; the 
>> > best
>> > designs will instead use additional roles.  I forecast that PostgreSQL 
>> > would
>> > fare better with no owner-constrained-by-RLS capability.  Even so, others 
>> > want
>> > it, and FORCE ROW SECURITY would deliver it with an acceptable risk 
>> > profile.
>>
>> I've attached a patch to implement it.  It's not fully polished but it's
>> sufficient for comment, I believe.  Additional comments, documentation
>> and regression tests are to be added, if we have agreement on the
>> grammer and implementation approach.
>
> This patch has FORCE ROW LEVEL SECURITY take precedence over row_security=off,
> which thwarts pg_dump use of row_security=off to ensure dump completeness.

Yeah, I think that's NOT ok.

> Should this be a table-level flag, or should it be a policy-level flag?  A
> policy-level flag is more powerful.  If nobody really anticipates using that
> power, this table-level flag works for me.

Either works for me.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Potential GIN vacuum bug

2015-10-02 Thread Robert Haas

On Thu, Oct 1, 2015 at 4:44 PM, Jeff Janes  wrote:
> Is the locking around indexes summarized someplace?

Not to my knowledge.  :-(

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] CustomScan support on readfuncs.c

2015-10-02 Thread Robert Haas

On Tue, Sep 29, 2015 at 6:19 PM, Kouhei Kaigai  wrote:
>> On Mon, Sep 28, 2015 at 8:31 PM, Kouhei Kaigai  wrote:
>> >> Instead of doing this:
>> >>
>> >> +/* Dump library and symbol name instead of raw pointer */
>> >>  appendStringInfoString(str, " :methods ");
>> >> -_outToken(str, node->methods->CustomName);
>> >> +_outToken(str, node->methods->methods_library_name);
>> >> +appendStringInfoChar(str, ' ');
>> >> +_outToken(str, node->methods->methods_symbol_name);
>> >>
>> >> Suppose we just make library_name and symbol_name fields in the node
>> >> itself, that are dumped and loaded like any others.
>> >>
>> >> Would that be better?
>> >>
>> > I have no preference here.
>> >
>> > Even if we dump library_name/symbol_name as if field in CustomScan,
>> > not CustomScanMethods, in a similar way, we cannot use WRITE_STRING_FIELD
>> > here, because its 'fldname' assumes these members are direct field of
>> > CustomScan.
>> >
>> >   /* Write a character-string (possibly NULL) field */
>> >   #define WRITE_STRING_FIELD(fldname) \
>> >   (appendStringInfo(str, " :" CppAsString(fldname) " "), \
>> >_outToken(str, node->fldname))
>>
>> Well that's exactly what I was suggesting: making them a direct field
>> of CustomScan.
>>
> Let me confirm. Are you suggesting to have library_name/symbol_name
> in CustomScan, not CustomScanMethods?
> I prefer these fields are in CustomScanMethods because we don't need
> to setup them again once PG_init set up these symbols. CustomScan has
> to be created every time when it is chosen by planner.

True.  But that doesn't cost much.  I'm not sure it's better, so if
you don't like it, don't worry about it.

>> > One other question I have. Do we have a portable way to lookup
>> > a pair of library and symbol by address?
>> > Glibc has dladdr() functions that returns these information,
>> > however, manpage warned it is not defined by POSIX.
>> > If we would be able to have any portable way, it may make the
>> > interface simpler.
>>
>> Yes: load_external_function.
>>
> It looks up an address by a pair of library and symbol name
> What I'm looking for is a portable function that looks up a pair
> of library and symbol name by an address, like dladdr() of glibc.
> I don't know whether other *nix or windows have these infrastructure.

No, that doesn't exist, and the chances of us trying add that
infrastructure for this feature are nil.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: Request for dogfood volunteers (was [HACKERS] No Issue Tracker - Say it Ain't So!)

2015-10-02 Thread Alvaro Herrera

Joshua D. Drake wrote:

> Put succinctly, I am willing to put resources into testing Redmine for our
> needs. I will need help to do so because I am not a committer/hacker. Andres
> thinks that isn't worth while. I think he is wrong. If he doesn't want to
> help, he doesn't have to, thus the call for volunteers.

Nobody asked, but here's my opinion on Redmine.  I worked pretty heavily
with it during my time at Command Prompt.  I have to say that with the
customizations that CMD had at the time, it wasn't that bad -- I was
pretty happy that I could interact with it via email, and most of the
time it wouldn't do anything too stupid.  I could also interact with it
using the web, and it worked pretty well there.  Most other Redmine
installations I've used don't have the email interface at all.

However, the contact surface between these two options wasn't really
well polished.  Formatting would be lost very frequently: I could write
a nice email, and the customer would get a nice email, but if you looked
at it in the web, it was very ugly.  If you used the web form to reply,
the resulting email looked pretty stupid in some cases.  I eventually
learned to use the right {{{ }}} markers in my email replies so that
code would look right in the web.  But if you made a single mistake, you
were fscked and there was no way at all to fix it.

As far as I remember, the main reason for this pain was that it didn't
try to consider an email as an email: instead, what it did was grab the
text and cram it into the comment box.  Same thing in the other
direction, where the text from the comment would be crammed as an email
in output.  All that was needed was for it to store emails in the
rfc/2822 format in the database, and then render them as emails in the
web form, instead of trying to do the conversion in the process.

If you look at the debbugs interface, it is pretty clear that all that
it does is keep track of emails -- which, let it be said, is the soul of
this community's communication, so it seems to me that that is what we
need.  Metadata changes are kept visually separate from actual
commentary, which is convenient; and you can always get the mbox
involving that bug, or look at minute details of it using the web
interface if you need that sort of thing.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] On-demand running query plans using auto_explain and signals

2015-10-02 Thread Robert Haas

On Tue, Sep 29, 2015 at 1:52 PM, Shulgin, Oleksandr
 wrote:
> This is not a change of the direction, but rather of the approach.  Hitting
> a process with a signal and hoping it will produce a meaningful response in
> all circumstances without disrupting its current task was way too naive.

I think that's exactly right.  It's not safe to do much anything from
a signal handler, and while ProcessInterrupts() is a very
substantially safer, the set of things that can be safely done there
is still awfully restricted.  You have to cope with the fact that the
function you just interrupted may be doing anything at all, and if you
change anything before returning, you may knock over the apple cart.
Just as bad, the state you inherit may not be very sane: we call
ProcessInterrupts() from LOTS of places and there is absolutely no
guarantee that every one of those places has the data structures that
you need to run EXPLAIN ANALYZE in a sane state.

I haven't looked at the new patch specifically so I don't have an
opinion on that at this time.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker.

2015-10-02 Thread Robert Haas

On Fri, Oct 2, 2015 at 3:14 AM, Amit Langote
 wrote:
> On 2015/10/02 15:38, Fujii Masao wrote:
>>
>> +uint32 progress_param[N_PROGRESS_PARAM];
>>
>> Why did you use an array to store the progress information of VACUUM?
>> I think that it's better to use separate specific variables for them for
>> better code readability, for example, variables scanned_pages,
>> heap_total_pages, etc.
>>
>> +doubleprogress_param_float[N_PROGRESS_PARAM];
>>
>> Currently only progress_param_float[0] is used. So there is no need to
>> use an array here.
>
> I think this kind of design may have come from the ideas expressed here
> (especially the last paragraph):
>
> http://www.postgresql.org/message-id/CA+TgmoYnWtNJRmVWAJ+wGLOB_x8vNOTrZnEDio=gapi5hk7...@mail.gmail.com

Right.  This design is obviously silly if we only care about exposing
VACUUM progress.  But if we want to be able to expose progress from
many utility commands, and slightly different kinds of information for
each one, then I think it could be quite useful.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Parallel Seq Scan

2015-10-02 Thread Robert Haas

On Fri, Oct 2, 2015 at 4:27 AM, Kouhei Kaigai  wrote:
>> On Thu, Oct 1, 2015 at 2:35 AM, Kouhei Kaigai  wrote:
>> > Gather node was oversight by readfunc.c, even though it shall not be
>> > serialized actually.
>> > Also, it used incompatible WRITE_xxx_FIELD() macro on outfuncs.c.
>> >
>> > The attached patch fixes both of incomsistence.
>>
>> Thanks.  You missed READ_DONE() but fortunately my compiler noticed
>> that oversight.  Committed with that fix.
>>
> I could find one other strangenes, at explain.c.
>
> case T_Gather:
> {
> Gather *gather = (Gather *) plan;
>
> show_scan_qual(plan->qual, "Filter", planstate, ancestors, 
> es);
> if (plan->qual)
> show_instrumentation_count("Rows Removed by Filter", 1,
>planstate, es);
> ExplainPropertyInteger("Number of Workers",
>gather->num_workers, es);
> if (gather->single_copy)
> ExplainPropertyText("Single Copy",
> gather->single_copy ? "true" : 
> "false",
> es);
> }
> break;
>
> What is the intention of the last if-check?
> The single_copy is checked in the argument of ExplainPropertyText().

Oops, that was dumb.  single_copy only makes sense if num_workers ==
1, so I intended the if-test to be based on num_workers, not
single_copy.  Not sure if we should just make that change now or if
there's a better way to do display it.

I'm sort of tempted to try to come up with a shorthand that only uses
one line in text mode - e.g. set pname to something like "Gather 3" if
there are 3 workers, "Gather 1" if there is worker, or "Gather Single"
if there is one worker and we're in single_copy mode.  In non-text
mode, of course, the properties should be displayed separately, for
machine parse-ability.

But maybe I'm getting ahead of myself and we should just change it to
if (gather->num_workers == 1) for now.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [DOCS] max_worker_processes on the standby

2015-10-02 Thread Robert Haas

On Fri, Oct 2, 2015 at 2:59 PM, Alvaro Herrera  wrote:
> Robert Haas wrote:
>> The standby can have the feature enabled even though the master has it
>> disabled?  That seems like it can only lead to heartache.
>
> Can you elaborate?

Sort of.  Our rule up until now has always been that the standby is an
exact copy of the master.  I suspect deviating from that behavior will
introduce bugs.  I suspect having the standby make data changes that
aren't WAL-logged will introduce bugs; not to be unkind, but that
certainly seems like a lesson to take from what happened with
multixacts.

I haven't looked at this code well enough to guess specifically what
will go wrong.  But consider people turning the feature on and off
repeatedly on the master, and separately on the standby, combined with
crashes on the standby that restart replay from earlier points
(possibly with settings that have changed in the meantime).  Are we
really sure that we're never going to end up with the wrong files, or
inconsistent ones, on the standby?  I have a really hard time
believing that's going to work out.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: Request for dogfood volunteers (was [HACKERS] No Issue Tracker - Say it Ain't So!)

2015-10-02 Thread Joshua D. Drake


On 10/02/2015 12:52 PM, Robert Haas wrote:


OK.  My reading of the thread is that the hackers who expressed an
opinion mostly preferred debbugs and the people less involved in the
project wanted something more like GitHub/GitLab.  Some people also
argued for and against Bugzilla and JIRA.  I didn't really see anybody
agitating for Redmine, so I'm not sure how far you're going to get
with that.


Right, thus call for Volunteers. If I can get a few to help me, I am 
willing to put my resources behind it. If I can't then I won't. I guess 
what I am really saying here is, "If I am going to bitch about the lack 
of a tracker, I better damn well be willing to put resources into fixing 
the problem".


So I am putting my money where my mouth is.



But I'm not really arguing against it.  We use it here at
EnterpriseDB, and it's very helpful, though it is undeniably a
[expletive]-ton of work to maintain.


Yeah, and that is any issue tracker (the ton of work to maintain).

Sincerely,

JD


--
Command Prompt, Inc. - http://www.commandprompt.com/  503-667-4564
PostgreSQL Centered full stack support, consulting and development.
Announcing "I'm offended" is basically telling the world you can't
control your own emotions, so everyone else should do it for you.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: Request for dogfood volunteers (was [HACKERS] No Issue Tracker - Say it Ain't So!)

2015-10-02 Thread Robert Haas

On Fri, Oct 2, 2015 at 2:43 PM, Joshua D. Drake  wrote:
> I am sorry if I wasn't clear.
>
> Put succinctly, I am willing to put resources into testing Redmine for our
> needs. I will need help to do so because I am not a committer/hacker. Andres
> thinks that isn't worth while. I think he is wrong. If he doesn't want to
> help, he doesn't have to, thus the call for volunteers.
>
> I am not expecting that redmine will be chosen over debbugs. I am expecting
> that we make an educated decision about which one suits our needs. If we
> only review debbugs, we are making a decision in a vacuum.

OK.  My reading of the thread is that the hackers who expressed an
opinion mostly preferred debbugs and the people less involved in the
project wanted something more like GitHub/GitLab.  Some people also
argued for and against Bugzilla and JIRA.  I didn't really see anybody
agitating for Redmine, so I'm not sure how far you're going to get
with that.

But I'm not really arguing against it.  We use it here at
EnterpriseDB, and it's very helpful, though it is undeniably a
[expletive]-ton of work to maintain.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [DOCS] max_worker_processes on the standby

2015-10-02 Thread Alvaro Herrera

Robert Haas wrote:

> The standby can have the feature enabled even though the master has it
> disabled?  That seems like it can only lead to heartache.

Can you elaborate?

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: Request for dogfood volunteers (was [HACKERS] No Issue Tracker - Say it Ain't So!)

2015-10-02 Thread Joshua D. Drake


On 10/02/2015 11:36 AM, Robert Haas wrote:


I don't know what this has to do with anything Andres said.



I am sorry if I wasn't clear.

Put succinctly, I am willing to put resources into testing Redmine for 
our needs. I will need help to do so because I am not a 
committer/hacker. Andres thinks that isn't worth while. I think he is 
wrong. If he doesn't want to help, he doesn't have to, thus the call for 
volunteers.


I am not expecting that redmine will be chosen over debbugs. I am 
expecting that we make an educated decision about which one suits our 
needs. If we only review debbugs, we are making a decision in a vacuum.


JD



--
Command Prompt, Inc. - http://www.commandprompt.com/  503-667-4564
PostgreSQL Centered full stack support, consulting and development.
Announcing "I'm offended" is basically telling the world you can't
control your own emotions, so everyone else should do it for you.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [DOCS] max_worker_processes on the standby

2015-10-02 Thread Robert Haas

On Fri, Oct 2, 2015 at 10:58 AM, Alvaro Herrera
 wrote:
> Fujii Masao wrote:
>
>> What happens if pg_xact_commit_timestamp() is called in standby after
>> track_commit_timestamp is disabled in master, DeactivateCommitTs() is
>> called and all commit_ts files are removed in standby? I tried that case
>> and got the following assertion failure.
>
> Ah.  So the standby needs to keep the module activated if it's enabled
> locally, even when it receives a message that the master turned it off.
> Here's a patch.

The standby can have the feature enabled even though the master has it
disabled?  That seems like it can only lead to heartache.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Freeze avoidance of very large table.

2015-10-02 Thread Robert Haas

On Fri, Oct 2, 2015 at 11:23 AM, Alvaro Herrera
 wrote:
>> + /* all-frozen information is also cleared at the same time */
>>   PageClearAllVisible(page);
>> + PageClearAllFrozen(page);
>
> I wonder if it makes sense to have a macro to clear both in unison,
> which seems a very common pattern.

I think PageClearAllVisible should clear both, and there should be no
other macro.  There is no event that causes a page to cease being
all-visible that does not also cause it to cease being all-frozen.
You might think that deleting or locking a tuple would fall into that
category - but nope, XMAX needs to be cleared or the tuple pruned, or
there will be problems after wraparound.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: Request for dogfood volunteers (was [HACKERS] No Issue Tracker - Say it Ain't So!)

2015-10-02 Thread Robert Haas

On Fri, Oct 2, 2015 at 12:50 PM, Joshua D. Drake  wrote:
> I once drove a manual all the time. I swore by the manual. I loved the
> control, the feeling (ridiculously) that I was a race car driver on the
> Urban track.
>
> Then I got an automatic and realized how damn nice it was to not be in
> control or have to worry about whether or not I should shift at 3k RPMS or
> 4k RPMS. (Then I drove a manual on the wrong side of the road in Ireland and
> was even happier that I prefer automatics)
>
> We can't just point at something and say, "Oh that will work" because we
> haven't tried to see how it is going to work with our requirements.
>
> I wouldn't expect any customer to use postgresql because it has an Elephant
> as a mascot. I expect them to use it because it is the best choice for their
> requirements.

I don't know what this has to do with anything Andres said.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] No Issue Tracker - Say it Ain't So!

2015-10-02 Thread Jeff Janes

On Thu, Oct 1, 2015 at 7:48 AM, Magnus Hagander  wrote:

> On Thu, Oct 1, 2015 at 4:35 PM, Robert Haas  wrote:
>
>>
>> - Bug numbers are sometimes preserved in commit messages, but they
>> never make it into release notes.  This actually seems like something
>> we could improve pretty easily and without a lot of extra work (and
>> also without a bug tracker).  If every committer makes a practice of
>> putting the bug number into the commit message, and the people who
>> write the release notes then transcribe the information there, I bet
>> that would be pretty useful to a whole lotta people.
>>
>
> That would require people to actually use the bug form to submit the
> initial thread as well of course - which most developers don't do
> themselves today. But there is in itself nothing that prevents them from
> doing that, of course - other than a Small Amount Of Extra Work.
>

It is not always clear to me where I am supposed to report bugs.  I
generally use -hackers if it is in code which is committed but not yet been
released, or if I've tracked it down to source code or a backtrace or
something like that, or if it is theoretical concern that I am not sure is
actually a bug.

Of course if I error on the side of sending it to -hackers when it should
be -bugs, someone could always forward it there, or tell me to do so.

It is also a bit awkward to send a patch on the bugs form, so if we want to
people to use the bugs form even when they are also submitting a patch to
fix the bug, we should explicitly state what it is we want them to.  Two
separate submissions, one to -bugs, one to -hackers?  An email to -bugs
(rather than using the form, which doesn't allow attachments) with an
attachment?

Cheers,

Jeff

Re: [HACKERS] Freeze avoidance of very large table.

2015-10-02 Thread Masahiko Sawada

On Sat, Oct 3, 2015 at 12:23 AM, Alvaro Herrera
 wrote:
> Masahiko Sawada wrote:
>

Thank you for taking time to review this feature.
Attached the latest version patch (v15).


>> @@ -2972,10 +2981,15 @@ l1:
>>*/
>>   PageSetPrunable(page, xid);
>>
>> + /* clear PD_ALL_VISIBLE and PD_ALL_FORZEN flags */
>
> Typo "FORZEN".

Fixed.

>
>>   if (PageIsAllVisible(page))
>>   {
>>   all_visible_cleared = true;
>> +
>> + /* all-frozen information is also cleared at the same time */
>>   PageClearAllVisible(page);
>> + PageClearAllFrozen(page);
>
> I wonder if it makes sense to have a macro to clear both in unison,
> which seems a very common pattern.
>

Fixed.

>
>> diff --git a/src/backend/access/heap/visibilitymap.c 
>> b/src/backend/access/heap/visibilitymap.c
>> index 7c38772..a284b85 100644
>> --- a/src/backend/access/heap/visibilitymap.c
>> +++ b/src/backend/access/heap/visibilitymap.c
>> @@ -21,33 +21,45 @@
>>   *
>>   * NOTES
>>   *
>> - * The visibility map is a bitmap with one bit per heap page. A set bit 
>> means
>> - * that all tuples on the page are known visible to all transactions, and
>> - * therefore the page doesn't need to be vacuumed. The map is conservative 
>> in
>> - * the sense that we make sure that whenever a bit is set, we know the
>> - * condition is true, but if a bit is not set, it might or might not be 
>> true.
>> + * The visibility map is a bitmap with two bits (all-visible and all-frozen)
>> + * per heap page. A set all-visible bit means that all tuples on the page 
>> are
>> + * known visible to all transactions, and therefore the page doesn't need to
>> + * be vacuumed. A set all-frozen bit means that all tuples on the page are
>> + * completely frozen, and therefore the page doesn't need to be vacuumed 
>> even
>> + * if whole table scanning vacuum is required (e.g. anti-wraparound vacuum).
>> + * A all-frozen bit must be set only when the page is already all-visible.
>> + * That is, all-frozen bit is always set with all-visible bit.
>
> "A all-frozen" -> "The all-frozen" (but "A set all-xyz" is correct).

Fixed.

>
>>   * When we *set* a visibility map during VACUUM, we must write WAL.  This 
>> may
>>   * seem counterintuitive, since the bit is basically a hint: if it is clear,
>> - * it may still be the case that every tuple on the page is visible to all
>> - * transactions; we just don't know that for certain.  The difficulty is 
>> that
>> - * there are two bits which are typically set together: the PD_ALL_VISIBLE 
>> bit
>> - * on the page itself, and the visibility map bit.  If a crash occurs after 
>> the
>> - * visibility map page makes it to disk and before the updated heap page 
>> makes
>> - * it to disk, redo must set the bit on the heap page.  Otherwise, the next
>> - * insert, update, or delete on the heap page will fail to realize that the
>> - * visibility map bit must be cleared, possibly causing index-only scans to
>> - * return wrong answers.
>> + * it may still be the case that every tuple on the page is visible or 
>> frozen
>> + * to all transactions; we just don't know that for certain.  The 
>> difficulty is
>> + * that there are two bits which are typically set together: the 
>> PD_ALL_VISIBLE
>> + * or PD_ALL_FROZEN bit on the page itself, and the visibility map bit.  If 
>> a
>> + * crash occurs after the visibility map page makes it to disk and before 
>> the
>> + * updated heap page makes it to disk, redo must set the bit on the heap 
>> page.
>> + * Otherwise, the next insert, update, or delete on the heap page will fail 
>> to
>> + * realize that the visibility map bit must be cleared, possibly causing 
>> index-only
>> + * scans to return wrong answers.
>
> In the "The difficulty ..." para, I would add the word "corresponding" before
> "visibility".  Otherwise, it is not clear what the plural means exactly.

Fixed.

>>   * VACUUM will normally skip pages for which the visibility map bit is set;
>>   * such pages can't contain any dead tuples and therefore don't need 
>> vacuuming.
>> - * The visibility map is not used for anti-wraparound vacuums, because
>> + * The visibility map is not used for anti-wraparound vacuums before 9.5, 
>> because
>>   * an anti-wraparound vacuum needs to freeze tuples and observe the latest 
>> xid
>>   * present in the table, even on pages that don't have any dead tuples.
>> + * 9.6 or later, the visibility map has a additional bit which indicates 
>> all tuple
>> + * on single page has been completely forzen, so the visibility map is also 
>> used for
>> + * anti-wraparound vacuums.
>
> This should not mention database versions.  Just explain how the code
> behaves today, not how it behaved in the past.  Those who want to
> understand how it behaved in 9.5 can read the 9.5 code.  (Again typo
> "forzen".)

Changed these comment.
Sorry for the same typo frequently..

>> @@ -1115,6 +1187,14 @@ lazy_scan_heap(Relation onerel, LVRelStats 
>> *vac

Re: Request for dogfood volunteers (was [HACKERS] No Issue Tracker - Say it Ain't So!)

2015-10-02 Thread Joshua D. Drake


On 10/02/2015 09:41 AM, Andres Freund wrote:

Hi,

On 2015-10-02 09:28:02 -0700, Joshua D. Drake wrote:

I am proposing to setup a redmine instance on a VM. I am happy to do a lot
of the legwork and should be able to get most of it done before EU. This is
what I think I need from my fellows:


-1. This thread has already cost disproportionally much time and
motiviation. Let's not wase more in doing two evaluations at the same
time, when we might already be happy with one of them. We came pretty
close to agreeing on debbugs in a couple past discussions so that seems
the least unlikely choice to actually get adopted.


I once drove a manual all the time. I swore by the manual. I loved the 
control, the feeling (ridiculously) that I was a race car driver on the 
Urban track.


Then I got an automatic and realized how damn nice it was to not be in 
control or have to worry about whether or not I should shift at 3k RPMS 
or 4k RPMS. (Then I drove a manual on the wrong side of the road in 
Ireland and was even happier that I prefer automatics)


We can't just point at something and say, "Oh that will work" because we 
haven't tried to see how it is going to work with our requirements.


I wouldn't expect any customer to use postgresql because it has an 
Elephant as a mascot. I expect them to use it because it is the best 
choice for their requirements.


JD

--
Command Prompt, Inc. - http://www.commandprompt.com/  503-667-4564
PostgreSQL Centered full stack support, consulting and development.
Announcing "I'm offended" is basically telling the world you can't
control your own emotions, so everyone else should do it for you.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: Request for dogfood volunteers (was [HACKERS] No Issue Tracker - Say it Ain't So!)

2015-10-02 Thread Andres Freund

Hi,

On 2015-10-02 09:28:02 -0700, Joshua D. Drake wrote:
> I am proposing to setup a redmine instance on a VM. I am happy to do a lot
> of the legwork and should be able to get most of it done before EU. This is
> what I think I need from my fellows:

-1. This thread has already cost disproportionally much time and
motiviation. Let's not wase more in doing two evaluations at the same
time, when we might already be happy with one of them. We came pretty
close to agreeing on debbugs in a couple past discussions so that seems
the least unlikely choice to actually get adopted.

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: Request for dogfood volunteers (was [HACKERS] No Issue Tracker - Say it Ain't So!)

2015-10-02 Thread Joshua D. Drake


On 10/02/2015 09:34 AM, Dave Page wrote:


Thoughts? Volunteers?


I swore to myself that I'd stay out of this bikeshed, but... we already have a 
Redmine instance.


I know but I didn't want to dogfood in an installation that was 
production. I am not going to put up vanilla redmine, I actually planned 
on configuring in a way that could be used by -hackers which could 
(possibly?) cause issues with the install .Org has.


JD







--
Command Prompt, Inc. - http://www.commandprompt.com/  503-667-4564
PostgreSQL Centered full stack support, consulting and development.
Announcing "I'm offended" is basically telling the world you can't
control your own emotions, so everyone else should do it for you.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: Request for dogfood volunteers (was [HACKERS] No Issue Tracker - Say it Ain't So!)

2015-10-02 Thread Dave Page



> On 2 Oct 2015, at 17:28, Joshua D. Drake  wrote:
> 
> Hello,
> 
> I believe it is pretty obvious we are moving in the direction of having a 
> tracker at this point. The problem is exactly which direction. Stephen has 
> offered some resources for his ideas and now I am offering some resources for 
> mine.
> 
> I am proposing to setup a redmine instance on a VM. I am happy to do a lot of 
> the legwork and should be able to get most of it done before EU. This is what 
> I think I need from my fellows:
> 
> 1. At least two committers
> 2. At least three hackers (preferably that are different from the committers 
> but not required)
> 3. At least two users
> 
> I think we have reached a point where everyone has ideas of what they do and 
> don't want but none of it matters if we don't have a proof of concept to 
> determine what we do and don't know or want.
> 
> Thoughts? Volunteers?

I swore to myself that I'd stay out of this bikeshed, but... we already have a 
Redmine instance.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Request for dogfood volunteers (was [HACKERS] No Issue Tracker - Say it Ain't So!)

2015-10-02 Thread Joshua D. Drake


Hello,

I believe it is pretty obvious we are moving in the direction of having 
a tracker at this point. The problem is exactly which direction. Stephen 
has offered some resources for his ideas and now I am offering some 
resources for mine.


I am proposing to setup a redmine instance on a VM. I am happy to do a 
lot of the legwork and should be able to get most of it done before EU. 
This is what I think I need from my fellows:


1. At least two committers
2. At least three hackers (preferably that are different from the 
committers but not required)

3. At least two users

I think we have reached a point where everyone has ideas of what they do 
and don't want but none of it matters if we don't have a proof of 
concept to determine what we do and don't know or want.


Thoughts? Volunteers?

Sincerely,

JD
--
Command Prompt, Inc. - http://www.commandprompt.com/  503-667-4564
PostgreSQL Centered full stack support, consulting and development.
Announcing "I'm offended" is basically telling the world you can't
control your own emotions, so everyone else should do it for you.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Freeze avoidance of very large table.

2015-10-02 Thread Alvaro Herrera

Masahiko Sawada wrote:

> @@ -2972,10 +2981,15 @@ l1:
>*/
>   PageSetPrunable(page, xid);
>  
> + /* clear PD_ALL_VISIBLE and PD_ALL_FORZEN flags */

Typo "FORZEN".

>   if (PageIsAllVisible(page))
>   {
>   all_visible_cleared = true;
> +
> + /* all-frozen information is also cleared at the same time */
>   PageClearAllVisible(page);
> + PageClearAllFrozen(page);

I wonder if it makes sense to have a macro to clear both in unison,
which seems a very common pattern.

  
> diff --git a/src/backend/access/heap/visibilitymap.c 
> b/src/backend/access/heap/visibilitymap.c
> index 7c38772..a284b85 100644
> --- a/src/backend/access/heap/visibilitymap.c
> +++ b/src/backend/access/heap/visibilitymap.c
> @@ -21,33 +21,45 @@
>   *
>   * NOTES
>   *
> - * The visibility map is a bitmap with one bit per heap page. A set bit means
> - * that all tuples on the page are known visible to all transactions, and
> - * therefore the page doesn't need to be vacuumed. The map is conservative in
> - * the sense that we make sure that whenever a bit is set, we know the
> - * condition is true, but if a bit is not set, it might or might not be true.
> + * The visibility map is a bitmap with two bits (all-visible and all-frozen)
> + * per heap page. A set all-visible bit means that all tuples on the page are
> + * known visible to all transactions, and therefore the page doesn't need to
> + * be vacuumed. A set all-frozen bit means that all tuples on the page are
> + * completely frozen, and therefore the page doesn't need to be vacuumed even
> + * if whole table scanning vacuum is required (e.g. anti-wraparound vacuum).
> + * A all-frozen bit must be set only when the page is already all-visible.
> + * That is, all-frozen bit is always set with all-visible bit.

"A all-frozen" -> "The all-frozen" (but "A set all-xyz" is correct).


>   * When we *set* a visibility map during VACUUM, we must write WAL.  This may
>   * seem counterintuitive, since the bit is basically a hint: if it is clear,
> - * it may still be the case that every tuple on the page is visible to all
> - * transactions; we just don't know that for certain.  The difficulty is that
> - * there are two bits which are typically set together: the PD_ALL_VISIBLE 
> bit
> - * on the page itself, and the visibility map bit.  If a crash occurs after 
> the
> - * visibility map page makes it to disk and before the updated heap page 
> makes
> - * it to disk, redo must set the bit on the heap page.  Otherwise, the next
> - * insert, update, or delete on the heap page will fail to realize that the
> - * visibility map bit must be cleared, possibly causing index-only scans to
> - * return wrong answers.
> + * it may still be the case that every tuple on the page is visible or frozen
> + * to all transactions; we just don't know that for certain.  The difficulty 
> is
> + * that there are two bits which are typically set together: the 
> PD_ALL_VISIBLE
> + * or PD_ALL_FROZEN bit on the page itself, and the visibility map bit.  If a
> + * crash occurs after the visibility map page makes it to disk and before the
> + * updated heap page makes it to disk, redo must set the bit on the heap 
> page.
> + * Otherwise, the next insert, update, or delete on the heap page will fail 
> to
> + * realize that the visibility map bit must be cleared, possibly causing 
> index-only
> + * scans to return wrong answers.

In the "The difficulty ..." para, I would add the word "corresponding" before
"visibility".  Otherwise, it is not clear what the plural means exactly.

>   * VACUUM will normally skip pages for which the visibility map bit is set;
>   * such pages can't contain any dead tuples and therefore don't need 
> vacuuming.
> - * The visibility map is not used for anti-wraparound vacuums, because
> + * The visibility map is not used for anti-wraparound vacuums before 9.5, 
> because
>   * an anti-wraparound vacuum needs to freeze tuples and observe the latest 
> xid
>   * present in the table, even on pages that don't have any dead tuples.
> + * 9.6 or later, the visibility map has a additional bit which indicates all 
> tuple
> + * on single page has been completely forzen, so the visibility map is also 
> used for
> + * anti-wraparound vacuums.

This should not mention database versions.  Just explain how the code
behaves today, not how it behaved in the past.  Those who want to
understand how it behaved in 9.5 can read the 9.5 code.  (Again typo
"forzen".)

> @@ -1115,6 +1187,14 @@ lazy_scan_heap(Relation onerel, LVRelStats 
> *vacrelstats,
>   tups_vacuumed, 
> vacuumed_pages)));
>  
>   /*
> +  * This information would be effective for how much effect all-frozen 
> bit
> +  * of VM had for freezing tuples.
> +  */
> + ereport(elevel,
> + (errmsg("Skipped %d frozen pages acoording to 
> visibility map",
> +

Re: [HACKERS] What is the extent of FDW join pushdown support in 9.5?

2015-10-02 Thread Bruce Momjian

On Wed, Sep 16, 2015 at 04:32:51PM -0400, Peter Eisentraut wrote:
> On 9/15/15 10:02 PM, Kouhei Kaigai wrote:
> >> The 9.5 release notes contain this promising but cryptic item:
> >>
> >> - Allow foreign data wrappers and custom scans to implement join
> >> pushdown (KaiGai Kohei)
> >>
> >> As phrased, this seems to mean, "it can be done, but we haven't done
> >> it".  But there is no link to any documentation that explains how to do
> >> this.  The commit that appears to have added this feature touches
> >> postgres_fdw, but neither the documentation nor the tests show anything
> >> about this feature, and I haven't been able to reproduce anything that
> >> would seem to indicate anything like this is at work.
> >>
> >> So what is the actual extent of this feature?
> >>
> > It says these enhancement on interface allows extensions to implement
> > join operation in their own way (including remote join in case of FDW),
> > however, enhancement of postgres_fdw is not yet upstreamed.
> 
> Thank you for this clarification.  I suppose this item is correct as
> written.

Yes, it was cryptically written on purpose.  ;-)

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+ Roman grave inscription +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [DOCS] max_worker_processes on the standby

2015-10-02 Thread Alvaro Herrera

Fujii Masao wrote:

> What happens if pg_xact_commit_timestamp() is called in standby after
> track_commit_timestamp is disabled in master, DeactivateCommitTs() is
> called and all commit_ts files are removed in standby? I tried that case
> and got the following assertion failure.

Ah.  So the standby needs to keep the module activated if it's enabled
locally, even when it receives a message that the master turned it off.
Here's a patch.

Thanks for your continued testing!

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
commit 7441f88b746b7522f1714ed9fec95c3c4fe1dacb[m
Author: Alvaro Herrera 
AuthorDate: Fri Oct 2 11:39:44 2015 -0300
CommitDate: Fri Oct 2 11:39:44 2015 -0300

Don't disable commit_ts in standby if enabled locally

Bug noticed by Fujii Masao

diff --git a/src/backend/access/transam/commit_ts.c b/src/backend/access/transam/commit_ts.c
index 79ca04a..24b8291 100644
--- a/src/backend/access/transam/commit_ts.c
+++ b/src/backend/access/transam/commit_ts.c
@@ -583,14 +583,15 @@ CommitTsParameterChange(bool newvalue, bool oldvalue)
 	 * pg_control.  If the old value was already set, we already did this, so
 	 * don't do anything.
 	 *
-	 * If the module is disabled in the master, disable it here too.
+	 * If the module is disabled in the master, disable it here too, unless
+	 * the module is enabled locally.
 	 */
 	if (newvalue)
 	{
 		if (!track_commit_timestamp && !oldvalue)
 			ActivateCommitTs();
 	}
-	else if (oldvalue)
+	else if (!track_commit_timestamp && oldvalue)
 		DeactivateCommitTs(false);
 }
 

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] Connection string parameter 'replication' in documentation

2015-10-02 Thread Takashi Ohnishi

Hi,

I found that it can be set like 'replication = true' in connection
parameter as documentation says in  "50.3. Streaming Replication Protocol",
but this parameter is not denoted in "31.1.2. Parameter Key Words".

How about adding notation about this paramter in 31.1.2.?
Attached is a patch for that.

Regards,

=
Takashi Ohnishi


connect-parameter-replication.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2015-10-02 Thread Amir Rohan

On 10/02/2015 03:33 PM, Michael Paquier wrote:

> Any server instances created during the tests should never use a
> user-defined port for portability. Hence using those ports as keys
> just made sense. We could have for example custom names, that have
> port values assigned to them, but that's actually an overkill and
> complicates the whole facility.
> 

Something like:

global nPortsAssigned = 0;
AssignPort() -> return is_ok(nPortsAssigned++)

was what I used.

>> 2) Behaviour (paths in particular) is hardwired rather then overridable
>> defaults.
> 
> This is the case of all the TAP tests. We could always use the same
> base directory for all the nodes and then embed a sub-directory whose
> name is decided using the port number. But I am not really sure if
> that's a win.
> 

I understand, but it eliminates the kind of scenarios this convenience
package lets you express... conveniently.

>> This is exactly what I needed to test, problems:
>> 3) Can't stop server without clearing its testing data (the maps holding
>> paths and things). But that data might be specifically
>> needed, in particular the backup shouldn't disappear when the
>> server melts down or we have a very low-grade DBA on our hands.
> 
> OK, you have a point here. You may indeed want routines for to enable
> and disable a node completely decoupled from start and stop, with
> something like enable_node and disable_node that basically registers
> or unregisters it from the list of active nodes. I have updated the
> patch this way.
> 

Excellent.

>> 4) Port assignment relies on liveness checks on running servers.
>> If a server is shut down and a new instantiated, the port will get
>> reused, data will get trashed, and various confusing things can happen.
> 
> Right. The safest way to do that is to check in get_free_port if a
> port number is used by a registered node, and continue to loop in if
> that's the case. So done.
> 

That eliminates the "sweet and gentle" variant of the scenario, but it's
susceptible to the "ABA problem":
https://en.wikipedia.org/wiki/ABA_problem
https://youtu.be/CmxkPChOcvw?t=786

Granted, you have to try fairly hard to shoot yourself in the leg,
but since the solution is so simple, why not? If we never reuse ports
within a single test, this goes away.

>> 5) Servers are shutdown with -m 'immediate', which can lead to races
>> in the script when archiving is turned on. That may be good for some
>> tests, but there's no control over it.
> 
> I hesitated with fast here actually. So changed this way. We would
> want as wall a teardown command to stop the node with immediate and
> unregister the node from the active list.
> 

In particular, I was shutting down an archiving node and the archiving
was truncated. I *think* smart doesn't do this. But again, it's really
that the test writer can't easily override, not that the default is wrong.

>> Other issues:
>> 6. Directory structure, used one directory per thing but more logical
>> to place all things related to an instance under a single directory,
>> and name them according to role (57333_backup, and so on).
> 
> Er, well. The first version of the patch did so, and then I switched
> to an approach closer to what the existing TAP facility is doing. But
> well let's simplify things a bit.
> 

I know, I know, but:
1) an instance is a "thing" in your script, so having its associated
paraphernalia in one place makes more sense (maybe only to me).
2) That's often how folks (well, how I) arrange things in deployment,
at least with archive/backup as symlinks to the nas.

Alternatively, naming the dirs with a prefix (srv_foo_HASH,
backup_foo_backupname_HASH, etc') would work as well.

>> 7. enable_restoring() uses "cp -i" 'archive_command', not a good fit
>> for an automated test.
> 
> This seems like a good default to me, and actually that's portable on
> Windows easily. One could always append a custom archive_command in a
> test when for example testing conflicting archiving when archive_mode
> = always.
> 

Ok, I wasn't sure about this, but specifically activating a switch that
asks for input from the user during a test? hmm.

 >> 8. No canned way to output a pprinted overview of the running system
>> (paths, ports, for manual checking).
> 
> Hm. Why not... Are you suggesting something like print_current_conf
> that goes through all the registered nodes and outputs that? How would
> you use it?
>

For one thin, I could open a few terminals and `$(print_env_for_server
5437), so psql just worked.

I wish PEG had that as well.


>> 10. If a test passes/fails or dies due to a bug, everything is cleaned.
>> Great for testing, bad for postmortem.
> 
> That's something directly related to TestLib.pm where
> File:Temp:tempdir creates a temporary path with CLEANUP = 1. We had
> discussions regarding that actually...
> 

>> 11. a canned "server is responding to queries" helper would be convenient.
> 
> Do you mean a wrapper on pg_isready? Do you have use cases in mind f

[HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2015-10-02 Thread Michael Paquier

On Sat, Sep 26, 2015 at 10:25 AM, Amir Rohan wrote:
> On 09/25/2015 01:47 PM, Michael Paquier wrote:
>> offer a way to set up a cluster ready for hacking in a single
>> command, and being able to switch among them easily. I am not sure we
>> would really find a cross-platform script generic enough to satisfy
>> all the needs of folks here and integrate it directly in the tree :)
>>
> Yes, perl/bash seems to be the standard and both have their shorcomings,
> in either convenience or familiarity...

Note as well that bash does not run on Windows, so this should be in
perl for the cross-platform requirements.

> Coming back to the recovery testing package...

Thanks for providing input on this patch!

> I was investigating some behaviour having to do with recovery
> and tried your new library to write a repro case. This uncovers some
> implicit assumptions in the package that can make things diffficult
> when violated. I had to rewrite nearly every function I used.

OK. Noted. Could you be more explicit here with for example an example
of script or a patch?

> Major pain points:
> 1) Lib stores metadata (ports,paths, etc') using ports as keys.
> -> Assumes ports aren't reused.

What would be the point of reusing the same port number in a test for
different nodes? Ports are assigned automatically depending on their
availability looking at if a server is listening to it so they are
never reused as long as the node is running so that's a non-issue IMO.
Any server instances created during the tests should never use a
user-defined port for portability. Hence using those ports as keys
just made sense. We could have for example custom names, that have
port values assigned to them, but that's actually an overkill and
complicates the whole facility.

> -> Because assumes server keep running until the tear down.

Check. Locking down the port number is the problem here.

> And
> 2) Behaviour (paths in particular) is hardwired rather then overridable
> defaults.

This is the case of all the TAP tests. We could always use the same
base directory for all the nodes and then embed a sub-directory whose
name is decided using the port number. But I am not really sure if
that's a win.

> This is exactly what I needed to test, problems:
> 3) Can't stop server without clearing its testing data (the maps holding
> paths and things). But that data might be specifically
> needed, in particular the backup shouldn't disappear when the
> server melts down or we have a very low-grade DBA on our hands.

OK, you have a point here. You may indeed want routines for to enable
and disable a node completely decoupled from start and stop, with
something like enable_node and disable_node that basically registers
or unregisters it from the list of active nodes. I have updated the
patch this way.

> 4) Port assignment relies on liveness checks on running servers.
> If a server is shut down and a new instantiated, the port will get
> reused, data will get trashed, and various confusing things can happen.

Right. The safest way to do that is to check in get_free_port if a
port number is used by a registered node, and continue to loop in if
that's the case. So done.

> 5) Servers are shutdown with -m 'immediate', which can lead to races
> in the script when archiving is turned on. That may be good for some
> tests, but there's no control over it.

I hesitated with fast here actually. So changed this way. We would
want as wall a teardown command to stop the node with immediate and
unregister the node from the active list.

> Other issues:
> 6. Directory structure, used one directory per thing but more logical
> to place all things related to an instance under a single directory,
> and name them according to role (57333_backup, and so on).

Er, well. The first version of the patch did so, and then I switched
to an approach closer to what the existing TAP facility is doing. But
well let's simplify things a bit.

> 7. enable_restoring() uses "cp -i" 'archive_command', not a good fit
> for an automated test.

This seems like a good default to me, and actually that's portable on
Windows easily. One could always append a custom archive_command in a
test when for example testing conflicting archiving when archive_mode
= always.

> Aside from running the tests, the convenience of writing them
> needs to be considered. My perl is very weak, it's been at least
> a decade, but It was difficult to make progress because everything
> is geared toward a batch "pass/fail" run . Stdout is redirected,
> and the log files can't be 'tail --retry -f' in another terminal,
> because they're clobbered at every run.

This relies on the existing TAP infrastructure and this has been
considered as the most portable way of doing by including Windows.
This patch is not aiming at changing that, and will use as much as
possible the existing infrastructure.

> 8. No canned way to output a pprinted overview of the running system
> (paths, ports, for manual checking).

Hm. Why not... A

Re: [HACKERS] pageinspect patch, for showing tuple data

2015-10-02 Thread Nikolay Shaplov

В письме от 2 октября 2015 13:10:22 пользователь Michael Paquier написал:
 
> >> + 
> >> + General idea about output columns: lp_*
> >> attributes
> >> + are about where tuple is located inside the page;
> >> + t_xmin, t_xmax,
> >> + t_field3, t_ctid are
> 
> about
> 
> >> + visibility of this tuplue inside or outside of the transaction;
> >> + t_infomask2, t_infomask
> 
> has
> 
> >> some
> >> + information about properties of attributes stored in tuple data.
> >> + t_hoff sais where tuple data begins and
> >> + t_bits sais which attributes are NULL and
> 
> which
> 
> >> are
> >> + not. Please notice that t_bits here is not an actual value that is
> >> stored
> >> + in tuple data, but it's text representation  with '0' and '1'
> >> charactrs.
> >> + 
> >> I would remove that as well. htup_details.h contains all this
> 
> information.
> 
> > Made it even more shorter. Just links and comments about representation of
> > t_bits.
> 
> I would completely remove this part.

Michael my hand would not do it. I've been working as a lecturer for six 
years. If I want to pass an information in a comfortable way to reader, there 
should go some binding phrase. It may be very vague, but it should outline  
(may be in a very brief way, but still outline) an information that would be 
found if he follows the links. 

If we just give links "knowledge flow" will be uncomfortable for person who 
reads it.


> -test=# SELECT * FROM heap_page_items(get_raw_page('pg_class', 0));
> +
> +test=# select * from heap_page_items(get_raw_page('pg_range', 0));
> This example in the docs is far too long in character length... We should
> get that reduced.

May be should do 

\x and limit 1 like this:

test=# select * from heap_page_items(get_raw_page('pg_range', 0)) limit 1;
-[ RECORD 1 ]---
lp  | 1
lp_off  | 8144
lp_flags| 1
lp_len  | 48
t_xmin  | 1
t_xmax  | 0
t_field3| 0
t_ctid  | (0,1)
t_infomask2 | 6
t_infomask  | 2304
t_hoff  | 24
t_bits  | 
t_oid   | 
t_data  | \x400f1700ba074a0f520f




> +  Please notice that t_bits in tuple header
> structure
> +  is a binary bitmap, but heap_page_items returns
> +  t_bits as human readable text representation of
> +  original t_bits bitmap.
> This had better remove the mention to "please notice". Still as this is
> already described in htup_details.h there is no real point to describe it
> again here: that's a bitmap of NULLs.

heap_page_items returns text(!) as t_bits. This is unexpected. This is not 
described in page layout documentation. We should tell about it here 
explicitly. 


-- 
Nikolay Shaplov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Freeze avoidance of very large table.

2015-10-02 Thread Masahiko Sawada

On Fri, Oct 2, 2015 at 7:30 AM, Josh Berkus  wrote:
> On 10/01/2015 07:43 AM, Robert Haas wrote:
>> On Thu, Oct 1, 2015 at 9:44 AM, Fujii Masao  wrote:
>>> I wonder how much it's worth renaming only the file extension while
>>> there are many places where "visibility map" and "vm" are used,
>>> for example, log messages, function names, variables, etc.
>>
>> I'd be inclined to keep calling it the visibility map (vm) even if it
>> also contains freeze information.
>>
>
> -1 to rename.  Visibility Map is a perfectly good name.
>

Thank you for taking time to review this patch.

Attached latest v14 patch.
v14 patch is changed so that I don't change file name of visibilitymap
to "vfm", and contains some bug fix.

> +#include "access/visibilitymap.h"
> visibilitymap.h doesn't need to be included in cluster.c.

Fixed.

> -  errmsg("table row type and query-specified row type do not match"),
> + errmsg("table row type and query-specified row type
> do not match"),
> This change doesn't seem to be necessary.

Fixed.

> +#define Anum_pg_class_relallfrozen12
> Why is pg_class.relallfrozen necessary? ISTM that there is no user of it now.

The relallfrozen would be useful for user to estimate time to vacuum
freeze or anti-wrapping vacuum before being done them actually.
(Also this value is used on regression test.)
But this information is not used on planning like relallvisible, so it
would be good to move this information to another system view like
pg_stat_*_tables.

> lazy_scan_heap() calls PageClearAllVisible() when the page containing
> dead tuples is marked as all-visible. Shouldn't PageClearAllFrozen() be
> called at the same time?

Fixed.

> -"vm",/* VISIBILITYMAP_FORKNUM */
> +"vfm",/* VISIBILITYMAP_FORKNUM */
> I wonder how much it's worth renaming only the file extension while
> there are many places where "visibility map" and "vm" are used,
> for example, log messages, function names, variables, etc.
>
> I'd be inclined to keep calling it the visibility map (vm) even if it
> also contains freeze information.
>
> -1 to rename.  Visibility Map is a perfectly good name.

Yeah, I agree with this.
The latest v14 patch is changed so.

Regards,

--
Masahiko Sawada
diff --git a/contrib/pgstattuple/pgstatapprox.c b/contrib/pgstattuple/pgstatapprox.c
index 22c5f7a..b1b6a06 100644
--- a/contrib/pgstattuple/pgstatapprox.c
+++ b/contrib/pgstattuple/pgstatapprox.c
@@ -87,7 +87,7 @@ statapprox_heap(Relation rel, output_type *stat)
 		 * If the page has only visible tuples, then we can find out the free
 		 * space from the FSM and move on.
 		 */
-		if (visibilitymap_test(rel, blkno, &vmbuffer))
+		if (visibilitymap_test(rel, blkno, &vmbuffer, VISIBILITYMAP_ALL_VISIBLE))
 		{
 			freespace = GetRecordedFreeSpace(rel, blkno);
 			stat->tuple_len += BLCKSZ - freespace;
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index bcf9871..fc33772 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -2176,8 +2176,9 @@ heap_insert(Relation relation, HeapTuple tup, CommandId cid,
 	CheckForSerializableConflictIn(relation, NULL, InvalidBuffer);
 
 	/*
-	 * Find buffer to insert this tuple into.  If the page is all visible,
-	 * this will also pin the requisite visibility map page.
+	 * Find buffer to insert this tuple into.  If the page is all visible
+	 * or all frozen, this will also pin the requisite visibility map and
+	 * frozen map page.
 	 */
 	buffer = RelationGetBufferForTuple(relation, heaptup->t_len,
 	   InvalidBuffer, options, bistate,
@@ -2192,7 +2193,11 @@ heap_insert(Relation relation, HeapTuple tup, CommandId cid,
 	if (PageIsAllVisible(BufferGetPage(buffer)))
 	{
 		all_visible_cleared = true;
+
+		/* all-frozen information is also cleared at the same time */
 		PageClearAllVisible(BufferGetPage(buffer));
+		PageClearAllFrozen(BufferGetPage(buffer));
+
 		visibilitymap_clear(relation,
 			ItemPointerGetBlockNumber(&(heaptup->t_self)),
 			vmbuffer);
@@ -2493,7 +2498,11 @@ heap_multi_insert(Relation relation, HeapTuple *tuples, int ntuples,
 		if (PageIsAllVisible(page))
 		{
 			all_visible_cleared = true;
+
+			/* all-frozen information is also cleared at the same time */
 			PageClearAllVisible(page);
+			PageClearAllFrozen(page);
+
 			visibilitymap_clear(relation,
 BufferGetBlockNumber(buffer),
 vmbuffer);
@@ -2776,9 +2785,9 @@ heap_delete(Relation relation, ItemPointer tid,
 
 	/*
 	 * If we didn't pin the visibility map page and the page has become all
-	 * visible while we were busy locking the buffer, we'll have to unlock and
-	 * re-lock, to avoid holding the buffer lock across an I/O.  That's a bit
-	 * unfortunate, but hopefully shouldn't happen often.
+	 * visible or all frozen while we were busy locking the buffer, we'll
+	 * have to unlock and re-lock, to avoid holding the buffer lock across an
+	 * I/O.  That's a bit

Re: [HACKERS] No Issue Tracker - Say it Ain't So!

2015-10-02 Thread Oleg Bartunov

On Tue, Sep 29, 2015 at 5:55 PM, Steve Crawford <
scrawf...@pinpointresearch.com> wrote:

> On Tue, Sep 29, 2015 at 7:16 AM, David Fetter  wrote:
>
>> ...What we're not fine with is depending on a proprietary system, no
>> matter what type of license, as infrastructure...
>>
>>
> Exactly. Which is why I was warning about latching onto features only
> available in the closed enterprise version.
>
> Cheers,
> Steve
>
>

If we have consensus of what we want, why not just hire some company to
develop it for us ? I'm sure we could find such a company in Russia and
even would sponsor postgres community and pay for the development.  There
are other postgres companies, which may join us.  Or better,  pay through
pg foundation.

Re: [HACKERS] pageinspect patch, for showing tuple data

2015-10-02 Thread Nikolay Shaplov

В письме от 2 октября 2015 13:10:22 Вы написали:

> > There also was a bug in original pageinspect, that did not get t_bits
> 
> right
> 
> > when there was OID in the tuple.  I've fixed it too.
> 
> Aha. Good catch! By looking at HeapTupleHeaderGetOid if the tuple has an
> OID it will be stored at the end of HeapTupleHeader and t_hoff includes it,
> so bits_len is definitely larger of 4 bytes should an OID be present.
>if (tuphdr->t_infomask & HEAP_HASNULL)
>{
> -   bits_len = tuphdr->t_hoff -
> -   offsetof(HeapTupleHeaderData, t_bits);
> +  int bits_len =
> +  ((tuphdr->t_infomask2 & HEAP_NATTS_MASK)/8 +
> 1)*8;
> 
> values[11] = CStringGetTextDatum(
> -   bits_to_text(tuphdr->t_bits,
> bits_len * 8));
> +  bits_to_text(tuphdr->t_bits,
> bits_len));
> }
> And here is what you are referring to in your patch. I think that we should
> instead check for HEAP_HASOID and reduce bits_len by the size of Oid should
> one be present. As this is a bug that applies to all the existing versions
> of Postgres it would be good to extract it as a separate patch and then
> apply your own patch on top of it instead of including in your feature.
> Attached is a patch, this should be applied down to 9.0 I guess. Could you
> rebase your patch on top of it?
No we should not do it, because after t_bits there always goes padding bytes. 
So OID or the top of tuple data is properly aligned. So we should not use 
t_hoff for determinating t_bit's size at all.

Here is an example. I create a table with 10 columns and OID. (ten is greater 
then 8, so there should be two bytes of t_bits data)

create table test3 (a1 int, a2 int, a3 int, a4 int,a5 int,a6 int, a7 int,a8 
int, a9 int, a10 int) with oids;
insert into test3 VALUES
(1,2,3,4,5,6,7,8,null,10);

With your patch we get 

test=# select lp, t_bits, t_data from heap_page_items(get_raw_page('test3', 0));
 lp |  t_bits  |
   t_data   
+--+
  1 | 0100 | 
\x010002000300040005000600070008000a00
(1 row)


here we get 40 bits of t_bits.

With my way to calculate t_bits length we get

test=# select lp, t_bits, t_data from heap_page_items(get_raw_page('test3', 0));
 lp |  t_bits  |   t_data   

+--+
  1 | 0100 | 
\x010002000300040005000600070008000a00
(1 row)

16 bits as expected.

So I would keep my version of bits_len calculation


-- 
Nikolay Shaplov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Parallel Seq Scan

2015-10-02 Thread Kouhei Kaigai

> On Thu, Oct 1, 2015 at 2:35 AM, Kouhei Kaigai  wrote:
> > Gather node was oversight by readfunc.c, even though it shall not be
> > serialized actually.
> > Also, it used incompatible WRITE_xxx_FIELD() macro on outfuncs.c.
> >
> > The attached patch fixes both of incomsistence.
> 
> Thanks.  You missed READ_DONE() but fortunately my compiler noticed
> that oversight.  Committed with that fix.
>
I could find one other strangenes, at explain.c.

case T_Gather:
{
Gather *gather = (Gather *) plan;

show_scan_qual(plan->qual, "Filter", planstate, ancestors, es);
if (plan->qual)
show_instrumentation_count("Rows Removed by Filter", 1,
   planstate, es);
ExplainPropertyInteger("Number of Workers",
   gather->num_workers, es);
if (gather->single_copy)
ExplainPropertyText("Single Copy",
gather->single_copy ? "true" : "false",
es);
}
break;

What is the intention of the last if-check?
The single_copy is checked in the argument of ExplainPropertyText().

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei 

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Foreign join pushdown vs EvalPlanQual

2015-10-02 Thread Kyotaro HORIGUCHI

Hello, thank you for explanation. I understood the background.

On the current planner implement, row marks are tightly bound to
initial RTEs. This is quite natural for the purpose of row marks.

During join search, a joinrel should be comptible between local
joins and remote joins, of course target list also should be
so. So it is quite difficult to add wholerow resjunk for joinrels
before whole join tree is completed even if we allow row marks
that are not bound to base RTEs.

The result of make_rel_from_joinlist contains only winner paths
so we might be able to transform target list for this joinrel so
that it has join wholerows (and doesn't have unnecessary RTE
wholerows), but I don't see any clean way to do that.

As the result, all that LockRow can collect for EPQ are tuples
for base relations. No room to pass joined whole row so far.


At Fri, 2 Oct 2015 05:04:44 +, Kouhei Kaigai  wrote 
in <9a28c8860f777e439aa12e8aea7694f80114d...@bpxm15gp.gisp.nec.co.jp>
> > > I never say FDW should refetch tuples on the recheck routine.
> > > All I suggest is, projection to generate a joined tuple and
> > > recheck according to the qualifier pushed down are role of
> > > FDW driver, because it knows the best strategy to do the job.
> > 
> > I have no objection that rechecking is FDW's job.
> > 
> > I think you are thinking that all ROW_MARK_COPY base rows are
> > held in ss_ScanTupleSlot so simply calling recheckMtd on the slot
> > gives enough data to the function. (EPQState would also be needed
> > to retrieve, though..) Right?
> >
> Not ss_ScanTupleSlot. It is initialized according to fdw_scan_tlist
> in case of scanrelid==0, regardless of base foreign relation's
> definition.

Sorry, EvalPlanQualFetchRowMarks retrieves wholerows from
epqstate->origslot.

> My expectation is, FDW callback construct tts_values/tts_isnull
> of ss_ScanTupleSlot according to the preloaded tuples in EPQ slots
> and underlying projection. Only FDW driver knows the best way to
> construct this result tuple.

Currently only FDW itself knows how the joined relaiton are made
precisely.

> You can pull out EState reference from PlanState portion of the
> ForeignScanState, so nothing needs to be changed.

Exactly.

> > > > > Apart from FDW requirement, custom-scan/join needs recheckMtd is
> > > > > called when scanrelid==0 to avoid assertion fail. I hope FDW has
> > > > > symmetric structure, however, not a mandatory requirement for me.
...
> > Hmm. What I said by "works as expected" is that the function
> > stores the tuple for the "foreign join" scan node. If it doesn't,
> > you're right.
> >
> Which slot of the EPQ slot will save the joined tuple?

Yes, that is the second significant problem. As described above,
furtermore, the way to inject joined wholrow var into the target
list for the pushed-down join seems more difficult to find

> scanrelid is zero, and we have no identifier of join planstate.
> 
> > > Who can provide a projection to generate joined tuple?
> > > It is a job of individual plan-state-node to be walked on during
> > > EvalPlanQualNext().
> > 
> > EvalPlanQualNext simply does recheck tuples stored in epqTuples,
> > which are designed to be provided by EvalPlanQualFetchRowMarks.
> > 
> > I think that that premise shouldn't be broken for convenience...
> > 
> Do I see something different or understand incorrectly?
> EvalPlanQualNext() walks down entire subtree of the Lock node.
> (epqstate->planstate is entire subplan of Lock node.)
> 
>   TupleTableSlot *
>   EvalPlanQualNext(EPQState *epqstate)
>   {
>   MemoryContext oldcontext;
>   TupleTableSlot *slot;
>   
>   oldcontext = MemoryContextSwitchTo(epqstate->estate->es_query_cxt);
>   slot = ExecProcNode(epqstate->planstate);
>   MemoryContextSwitchTo(oldcontext);
>   
>   return slot;
>   }
> 
> If and when relations joins are kept in the sub-plan, ExecProcNode()
> processes the projection by join, doesn't it?

Yes, but it is needed to prepare alternative plan to do such
projection.

> Why projection by join is not a part of EvalPlanQualNext()?
> It is the core of its job. Unless projection by join, upper node cannot
> recheck the tuple come from child nodes.

What I'm uneasy on is the foreign join introduced the difference
in behavior between ordinary fetching and epq fetching. It is
quite natural having joined whole row but is seems hard to get.

Another reason is that ExecScanFetch with scanrelid == 0 on EPQ
is FDW/CS specific feature and looks to be a kind of hack. (Even
if it would be one of many)

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Confusing remark about UPSERT in fdwhandler.sgml

2015-10-02 Thread Peter Geoghegan

On Fri, Oct 2, 2015 at 1:00 AM, Etsuro Fujita
 wrote:
> ISTM that the sentence "as remote constraints are not locally known" is
> somewhat confusing, because check constrains on remote tables can be
> defined locally in 9.5.  How about "unique constraints or exclusion
> constraints on remote tables are not locally known"?  Attached is a
> patch for that.

Makes sense to me.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] Confusing remark about UPSERT in fdwhandler.sgml

2015-10-02 Thread Etsuro Fujita

The following is a remark about UPSERT in fdwhandler.sgml.

 INSERT with an ON CONFLICT clause does not
 support specifying the conflict target, as remote constraints are not
 locally known. This in turn implies that ON CONFLICT DO
 UPDATE is not supported, since the specification is mandatory there.

ISTM that the sentence "as remote constraints are not locally known" is
somewhat confusing, because check constrains on remote tables can be
defined locally in 9.5.  How about "unique constraints or exclusion
constraints on remote tables are not locally known"?  Attached is a
patch for that.

Best regards,
Etsuro Fujita
*** a/doc/src/sgml/fdwhandler.sgml
--- b/doc/src/sgml/fdwhandler.sgml
***
*** 1196,1204  GetForeignServerByName(const char *name, bool missing_ok);
  
  
   INSERT with an ON CONFLICT clause does not
!  support specifying the conflict target, as remote constraints are not
!  locally known. This in turn implies that ON CONFLICT DO
!  UPDATE is not supported, since the specification is mandatory there.
  
  
 
--- 1196,1205 
  
  
   INSERT with an ON CONFLICT clause does not
!  support specifying the conflict target, as unique constraints or
!  exclusion constraints on remote tables are not locally known. This
!  in turn implies that ON CONFLICT DO UPDATE is not supported,
!  since the specification is mandatory there.
  
  
 

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker.

2015-10-02 Thread Amit Langote

On 2015/10/02 15:38, Fujii Masao wrote:
> 
> +uint32 progress_param[N_PROGRESS_PARAM];
> 
> Why did you use an array to store the progress information of VACUUM?
> I think that it's better to use separate specific variables for them for
> better code readability, for example, variables scanned_pages,
> heap_total_pages, etc.
> 
> +doubleprogress_param_float[N_PROGRESS_PARAM];
> 
> Currently only progress_param_float[0] is used. So there is no need to
> use an array here.

I think this kind of design may have come from the ideas expressed here
(especially the last paragraph):

http://www.postgresql.org/message-id/CA+TgmoYnWtNJRmVWAJ+wGLOB_x8vNOTrZnEDio=gapi5hk7...@mail.gmail.com

Thanks,
Amit



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

58 matches

Mail list logo