[HACKERS] [PATCH] parallel & isolated makefile for plpython

2016-09-30 Thread Pavel Raiskup
Hi, we observed issues with parallel make during the RPM build of plpython;
the attached patch 0002 should help.  Feel free to reject 0001, but a
comment like that would have saved me some time as a "newcomer" to that
Makefile.

Pavel
>From b8722c8fb1e3d5f752d75e4d0740d04793577185 Mon Sep 17 00:00:00 2001
From: Pavel Raiskup 
Date: Fri, 30 Sep 2016 14:26:24 +0200
Subject: [PATCH 1/2] Document that "Empty Recipe" is a standard thing in GNU
 make

---
 src/backend/parser/Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/backend/parser/Makefile b/src/backend/parser/Makefile
new file mode 100644
index fdd8485..92acc00
*** a/src/backend/parser/Makefile
--- b/src/backend/parser/Makefile
*** endif
*** 31,37 
  # shorthand for two otherwise separate rules.  To be safe for parallel
  # make, we must chain the dependencies like this.  The semicolon is
  # important, otherwise make will choose the built-in rule for
! # gram.y=>gram.c.
  
  gram.h: gram.c ;
  
--- 31,37 
  # shorthand for two otherwise separate rules.  To be safe for parallel
  # make, we must chain the dependencies like this.  The semicolon is
  # important, otherwise make will choose the built-in rule for
! # gram.y=>gram.c.  Run `info make -n "Empty Recipes"` for more info.
  
  gram.h: gram.c ;
  
-- 
2.7.4

>From dbd3a57775e3b177d1a9c974d89862ab8a24b9f1 Mon Sep 17 00:00:00 2001
From: Pavel Raiskup 
Date: Sat, 1 Oct 2016 07:53:29 +0200
Subject: [PATCH 2/2] Make makefile for plpython really self-standing

The submake-generated-headers target can't be used as an 'all'
prerequisite at the same level as 'all-lib', because
'submake-generated-headers' is actually a prerequisite of
'all-lib' in this makefile.  So use submake-generated-headers as an
$(OBJS) prerequisite, as several object files depend on it.

Move the 'all' target before the include statement, according to the
documentation in Makefile.shlib.

This is a follow-up to 548af97fcec5543603c20b61fec60f8147a05b29.
---
 src/pl/plpython/Makefile | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/src/pl/plpython/Makefile b/src/pl/plpython/Makefile
new file mode 100644
index 647b4b1..3a8b2c6
*** a/src/pl/plpython/Makefile
--- b/src/pl/plpython/Makefile
*** REGRESS = \
*** 93,102 
  
  REGRESS_PLPYTHON3_MANGLE := $(REGRESS)
  
! include $(top_srcdir)/src/Makefile.shlib
  
! all: submake-generated-headers all-lib
  
  
  install: all install-lib install-data
  
--- 93,103 
  
  REGRESS_PLPYTHON3_MANGLE := $(REGRESS)
  
! $(OBJS): | submake-generated-headers
  
! all: all-lib
  
+ include $(top_srcdir)/src/Makefile.shlib
  
  install: all install-lib install-data
  
-- 
2.7.4


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] COPY as a set returning function

2016-09-30 Thread Tom Lane
Craig Ringer  writes:
> On 1 Oct. 2016 05:20, "Tom Lane"  wrote:
>> I think the last of those suggestions has come up before.  It has the
>> large advantage that you don't have to remember a different syntax for
>> copy-as-a-function.

> That sounds fantastic. It'd help this copy variant retain feature parity
> with normal copy. And it'd bring us closer to being able to FETCH in
> non-queries.

On second thought, though, this couldn't exactly duplicate the existing
COPY syntax, because COPY relies heavily on the rowtype of the named
target table to tell it what it's copying.  You'd need some new syntax
to provide the list of column names and types, which puts a bit of
a hole in the "syntax we already know" argument.  A SRF-returning-record
would have a leg up on that, because we do have existing syntax for
defining the concrete rowtype that any particular call returns.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hash Indexes

2016-09-30 Thread Amit Kapila
On 30-Sep-2016 10:26 PM, "Peter Geoghegan"  wrote:
>
> On Fri, Sep 30, 2016 at 4:46 PM, Robert Haas  wrote:
> > I would just be very disappointed if, after the amount of work that
> > Amit and others have put into this project, the code gets rejected
> > because somebody thinks a different project would have been more worth
> > doing.
>
> I wouldn't presume to tell anyone else how to spend their time, and am
> not concerned about this patch making the hash index code any less
> useful from the user's perspective. If this is how we remove the wart
> of hash indexes not being WAL-logged, that's fine by me. I'm trying to
> be helpful.
>

If that is fine, then I think we should do that.  I want to bring it to
your notice that we have already seen and reported that, with the proposed
set of patches, hash indexes are a good bit faster than btree, so that adds
additional value to making them WAL-logged.

> > As Tom said upthread: "But to kick the hash AM as such to the
> > curb is to say
> > 'sorry, there will never be O(1) index lookups in Postgres'."  I
> > think that's correct and a sufficiently-good reason to pursue this
> > work, regardless of the merits (or lack of merits) of hash-over-btree.
>
> I don't think that "O(1) index lookups" is a useful guarantee with a
> very expensive constant factor.

The constant factor doesn't play much of a role when the data has no
duplicates, or only a few.

> Amit seemed to agree with this, since
> he spoke of the importance of both theoretical performance benefits
> and practically realizable performance benefits.
>

No, I don't agree with that; rather, I think hash indexes are theoretically
faster than btree, and we have seen that practically as well for quite a few
cases (for read workloads, when used with unique data and also in nested
loops).

With Regards,
Amit Kapila


[HACKERS] contrib/pg_visibility craps out in assert-enabled builds

2016-09-30 Thread Tom Lane
So I tried using pg_visibility's pg_check_visible() as part of
testing the business with pg_upgrade generating faulty visibility
maps on bigendian servers, and it instantly generated an assert
failure here:

#2  0x0041de78 in ExceptionalCondition (conditionName=<optimized out>, 
    errorType=<optimized out>, fileName=<optimized out>, 
    lineNumber=<optimized out>) at assert.c:54
#3  0x0045c410 in HeapTupleSatisfiesVacuum (htup=0x0, OldestXmin=9170, 
    buffer=2958) at tqual.c:1169
#4  0x00a41c3c in tuple_all_visible (tup=0xbfffd8e4, OldestXmin=9170, 
    buffer=<optimized out>) at pg_visibility.c:719
#5  0x00a420a8 in collect_corrupt_items (relid=46802, 
    all_visible=<optimized out>, all_frozen=<optimized out>) at pg_visibility.c:630
#6  0x00a4262c in pg_check_visible (fcinfo=0x104b704) at pg_visibility.c:328

The problem seems to be that HeapTupleSatisfiesVacuum asserts

Assert(ItemPointerIsValid(&htup->t_self));

while collect_corrupt_items hasn't bothered to set up the t_self
field of the HeapTupleData it's passing in.  This would imply that
you never tested this code in an assert-enabled build, which I find
surprising.  Am I missing something?

(I'm not really sure *why* HeapTupleSatisfiesVacuum contains this
Assert, because it doesn't do anything with t_self, but nonetheless
the Assert has been there for several years.  Seems to have been
inserted by you, in fact.)
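
For illustration, the minimal fix would presumably be for
collect_corrupt_items() to fill in t_self before making the call, along
these lines (a sketch only; blkno, offnum, page and itemid stand in for
whatever the loop variables actually are):

	HeapTupleData tuple;

	/* initialize t_self so HeapTupleSatisfiesVacuum's Assert is satisfied */
	ItemPointerSet(&tuple.t_self, blkno, offnum);
	tuple.t_data = (HeapTupleHeader) PageGetItem(page, itemid);
	tuple.t_len = ItemIdGetLength(itemid);

	(void) HeapTupleSatisfiesVacuum(&tuple, OldestXmin, buffer);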

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Set log_line_prefix and application name in test drivers

2016-09-30 Thread Peter Eisentraut
On 9/29/16 10:06 AM, Tom Lane wrote:
> Personally I'm also on board with using this for regression testing:
> 
>   log_line_prefix = '%t [%p] %q%a '

Committed that way, but with %m instead of %t, as discussed earlier.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] COPY as a set returning function

2016-09-30 Thread Craig Ringer
On 1 Oct. 2016 05:20, "Tom Lane"  wrote:
>
> Corey Huinker  writes:
> > Attached is a _very_ rough patch implementing a proof-of-concept
> > function copy_srf();
> > ...
> > As for that future direction, we could either have:
> > - a robust function named something like copy_srf(), with parameters for
> > all of the relevant options found in the COPY command
> > - a function that accepts an options string and parse that
> > - we could alter the grammar to make COPY RETURNING col1, col3, col5
> > FROM 'filename' a legit CTE.
>
> I think the last of those suggestions has come up before.  It has the
> large advantage that you don't have to remember a different syntax for
> copy-as-a-function.  Once you had the framework for that, other
> rows-returning utility commands such as EXPLAIN might plug in as well,
> whenever somebody got enough of an itch for it.

That sounds fantastic. It'd help this copy variant retain feature parity
with normal copy. And it'd bring us closer to being able to FETCH in
non-queries.

> regards, tom lane
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Showing parallel status in \df+

2016-09-30 Thread Tom Lane
Jim Nasby  writes:
> On 9/28/16 2:59 PM, Alvaro Herrera wrote:
>> My vote (which was not counted by Stephen) was to remove it from \df+
>> altogether.  I stand by that.  People who are used to seeing the output
>> in \df+ will wonder "where the heck did it go" and eventually figure it
>> out, at which point it's no longer a problem.  We're not breaking
>> anyone's scripts, that's for sure.
>> 
>> If we're not removing it, I +0 support the option of moving it to
>> footers.  I'm -1 on doing nothing.

> I agree with everything Alvaro just said.

Well, alternatively, can we get a consensus for doing that?  People
did speak against removing PL source code from \df+ altogether, but
maybe they're willing to reconsider if the alternative is doing nothing.

Personally I'm on the edge of washing my hands of the whole thing...

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Macro customizable hashtable / bitmapscan & aggregation perf

2016-09-30 Thread Andres Freund
Hi,

On 2016-07-26 17:43:33 -0700, Andres Freund wrote:
> In the attached patch I've attached simplehash.h, which can be
> customized by a bunch of macros, before being inlined.  There's also a
> patch using this for tidbitmap.c and nodeAgg/nodeSubplan/... via
> execGrouping.c.

Attached is a significantly improved version of this series.  The main
changes are:

- The hash table now uses robin-hood style hash collision handling. See the
  commit message of 0002 (or simplehash.h) for an introduction. That
  significantly reduces the impact of "clustering" due to linear
  addressing.
- Significant comment and correctness fixes, both in simplehash.h
  itself and in 0003/0004.
- a lot of other random performance improvements for the hashing code.


Unfortunately I'm running out of battery right now, so I don't want to
re-run the benchmarks posted upthread (loading takes a while). But the
last time I ran them, all the results after the patches were better than
before.


This patch series currently consists of four patches:
- 0001 - add unlikely/likely() macros. The use here is not entirely
  mandatory, but they do provide a quite consistent improvement. There
  are other threads where they proved to be beneficial, so I see little
  reason not to introduce them (a usage sketch follows this list).
- 0002 - the customizable hashtable itself. It now contains documentation.
- 0003 - convert tidbitmap.c to use the new hashmap. This includes a fix
  for the issue mentioned in [1], to improve performance in heavily lossy
  scenarios (otherwise we could regress in some extreme cases)
- 0004 - convert execGrouping.c, e.g. used by hash aggregates
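
As a usage illustration for 0001 (a sketch, not code from the patch):

	/*
	 * Mark a rarely-taken error branch as unlikely, so the compiler lays
	 * out the common case as the straight-line path; likely() is the
	 * converse, for branches that are almost always taken.
	 */
	if (unlikely(buf == InvalidBuffer))
		elog(ERROR, "no buffer available");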


While not quite perfect yet, I do think this is at a state where input
is needed.  I don't want to go a lot deeper into this rabbit hole
before we have some agreement that this is a sensible course of action.


I think there are several possible additional users of simplehash.h,
e.g. catcache.c - which has an open-coded, not particularly fast hash
table and frequently shows up in profiles - but I think the above two
conversions are plenty to start with.
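
To give a feel for the intended usage pattern, here is a minimal sketch
using the macro names from the attached simplehash.h (the element type,
key type and hash function below are made up for the example):

	/* illustrative only: generate a static inline uint32 -> int hash table */
	typedef struct IntHashEntry
	{
		uint32		key;
		int			value;
		char		status;		/* entry state, managed by simplehash.h */
	} IntHashEntry;

	#define SH_PREFIX inthash
	#define SH_ELEMENT_TYPE IntHashEntry
	#define SH_KEY_TYPE uint32
	#define SH_KEY key
	#define SH_HASH_KEY(tb, k) DatumGetUInt32(hash_uint32(k))
	#define SH_EQUAL(tb, a, b) ((a) == (b))
	#define SH_SCOPE static inline
	#define SH_DECLARE
	#define SH_DEFINE
	#include "lib/simplehash.h"

which then expands to inthash_create(), inthash_insert(),
inthash_lookup(), inthash_delete(), inthash_iterate() and friends.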


Comments?


Greetings,

Andres Freund

[1] 
http://archives.postgresql.org/message-id/20160923205843.zcs533sqdzlh4cpo%40alap3.anarazel.de
>From 6e64ea6fa79c265ebadf2260b7cc1ad05befd801 Mon Sep 17 00:00:00 2001
From: Andres Freund 
Date: Sun, 3 Jul 2016 15:05:18 -0700
Subject: [PATCH 1/4] Add likely/unlikely() branch hint macros.

These are useful for hot code paths. Because it's easy to guess wrongly
about likelihood, and because such likelihoods change over time, they
should be used sparingly.
---
 src/include/c.h | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/src/include/c.h b/src/include/c.h
index 4ab3f80..3a77107 100644
--- a/src/include/c.h
+++ b/src/include/c.h
@@ -939,6 +939,22 @@ typedef NameData *Name;
 #endif
 
 
+/*
+ * Hints to the compiler about the likelihood of a branch. Both likely() and
+ * unlikely() return the boolean value of the contained expression.
+ *
+ * These should only be used sparingly, in very hot code paths. It's very easy
+ * to mis-estimate likelihoods.
+ */
+#if __GNUC__ >= 3
+#define likely(x)	__builtin_expect((x) != 0, 1)
+#define unlikely(x)	__builtin_expect((x) != 0, 0)
+#else
+#define likely(x)	((x) != 0)
+#define unlikely(x)	((x) != 0)
+#endif
+
+
 /* ----------------------------------------------------------------
  *				Section 8:	random stuff
  * ----------------------------------------------------------------
-- 
2.9.3

>From 2e9425196a203fde28b022e8d742c44d0d6a5e70 Mon Sep 17 00:00:00 2001
From: Andres Freund 
Date: Tue, 19 Jul 2016 14:10:28 -0700
Subject: [PATCH 2/4] Add a macro customizable hashtable.

dynahash.c hash tables aren't quite fast enough for some
use-cases. There are several reasons for lacking performance:
- the use of chaining for collision handling makes them cache
  inefficient, which is especially an issue when the tables get bigger.
- as the element sizes for dynahash are only determined at runtime,
  offset computations are somewhat expensive
- hash and element comparisons are indirect function calls, causing
  unnecessary pipeline stalls
- its two-level structure has some benefits (somewhat natural
  partitioning), but increases the number of indirections

To fix several of these, the hash tables have to be adjusted to the
individual use-case at compile time. C unfortunately doesn't provide a
good way to do compile-time code generation (as e.g. C++'s templates,
for all their weaknesses, do).  Thus the somewhat ugly approach taken
here is to allow for code generation using a macro-templatized header
file, which generates functions and types based on a prefix and other
parameters.

Later patches use this infrastructure to use such hash tables for
tidbitmap.c (bitmap scans) and execGrouping.c (hash aggregation,
...). In queries where these use up a large fraction of the time, this
has been measured to lead to performance improvements of over 100%.

There are other cases where this could

Re: [HACKERS] PL/Python adding support for multi-dimensional arrays

2016-09-30 Thread Jim Nasby

On 9/29/16 1:51 PM, Heikki Linnakangas wrote:

> Jim, I was confused, but you agreed with me. Were you also confused, or
> am I missing something?


I was confused by inputs:

CREATE FUNCTION repr(i foo[]) RETURNS text LANGUAGE plpythonu AS 
$$return repr(i)$$;

select repr(array[row(1,2)::foo, row(3,4)::foo]);
        repr        
--------------------
 ['(1,2)', '(3,4)']
(1 row)

(in ipython...)

In [1]: i=['(1,2)', '(3,4)']

In [2]: type(i)
Out[2]: list

In [3]: type(i[0])
Out[3]: str

I wonder if your examples work only


> Now, back to multi-dimensional arrays. I can see that the Sequence
> representation is problematic, with arrays, because if you have a python
> list of lists, like [[1, 2]], it's not immediately clear if that's a
> one-dimensional array of tuples, or a two-dimensional array of integers.
> Then again, we do have the type definitions available. So is it really
> ambiguous?


[[1,2]] is a list of lists...
In [4]: b=[[1,2]]

In [5]: type(b)
Out[5]: list

In [6]: type(b[0])
Out[6]: list

If you want a list of tuples...
In [7]: c=[(1,2)]

In [8]: type(c)
Out[8]: list

In [9]: type(c[0])
Out[9]: tuple
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Show hash / bitmap sizes in EXPLAIN ANALYZE?

2016-09-30 Thread Andres Freund
Hi,

At the moment in-memory sort and hash nodes show their memory usage in
explain:
│   ->  Sort  (cost=59.83..62.33 rows=1000 width=4) (actual time=0.512..0.632 
rows=1000 loops=1)│
│ Sort Key: a.a 
│
│ Sort Method: quicksort  Memory: 71kB  
│
│ ->  Function Scan on generate_series a  (cost=0.00..10.00 rows=1000 
width=4) (actual time=0.165..0.305 rows=1000 loops=1) │
and
│   ->  Hash  (cost=10.00..10.00 rows=1000 width=4) (actual time=0.581..0.581 
rows=1000 loops=1)│
│ Buckets: 1024  Batches: 1  Memory Usage: 44kB 
│

I think we should show something similar for bitmap scans, and for
some execGrouping.c users (at least hash aggregates, subplans and setop
seem good candidates too).

For both categories it's useful to see how close to work_mem a scan
ended up being (to understand how high to set it, and how much the data
can grow till work_mem is exceeded), and for execGrouping.c users it's
also very interesting to see the actual memory usage, because the limit
is only a very soft one.

Does anybody see a reason not to add that?

Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Showing parallel status in \df+

2016-09-30 Thread Jim Nasby

On 9/28/16 2:59 PM, Alvaro Herrera wrote:

>> I am sorry, I disagree. The proposed form is hard to read. It is not
>> possible to simply copy/paste it.
>
> Why do you care?  You can use \sf if you want to copy&paste the
> function code.
>
>> I cannot imagine any use case for the proposed format.
>
> My vote (which was not counted by Stephen) was to remove it from \df+
> altogether.  I stand by that.  People who are used to seeing the output
> in \df+ will wonder "where the heck did it go" and eventually figure it
> out, at which point it's no longer a problem.  We're not breaking
> anyone's scripts, that's for sure.
>
> If we're not removing it, I +0 support the option of moving it to
> footers.  I'm -1 on doing nothing.


I agree with everything Alvaro just said.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] sloppy handling of pointers

2016-09-30 Thread Jim Nasby

On 9/27/16 9:45 PM, Mark Dilger wrote:

> Fix attached.


General comment on all these fixes you're submitting: you'll want to 
submit those to the commitfest app to make sure they get looked at.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] COPY as a set returning function

2016-09-30 Thread Tom Lane
Corey Huinker  writes:
> Attached is a _very_ rough patch implementing a proof-of-concept function
> copy_srf();
> ...
> As for that future direction, we could either have:
> - a robust function named something like copy_srf(), with parameters for
> all of the relevant options found in the COPY command
> - a function that accepts an options string and parse that
> - we could alter the grammar to make COPY RETURNING col1, col3, col5 FROM
> 'filename' a legit CTE.

I think the last of those suggestions has come up before.  It has the
large advantage that you don't have to remember a different syntax for
copy-as-a-function.  Once you had the framework for that, other
rows-returning utility commands such as EXPLAIN might plug in as well,
whenever somebody got enough of an itch for it.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] COPY as a set returning function

2016-09-30 Thread Corey Huinker
Attached is a _very_ rough patch implementing a proof-of-concept function
copy_srf();

It allows you to do things like this:

# select a,c,e from copy_srf('echo 12,345,67,89,2016-01-01',true) as t(a
integer, b text, c text, d text, e date);
 a  | c  | e
++
 12 | 67 | 2016-01-01
(1 row)



Uses for this include:
- avoiding the pattern of creating a temp table just to select all the rows
back out and then discard the table (avoidable disk activity, avoidable oid
churn)
- avoiding the pattern of creating a file_fdw table in pg_temp just to drop
it after one select (avoidable oid churn)
- filtering file/program input by the columns that are relevant to the
user's needs.

This experiment arose from my realization that file_fdw just plugs into the
externally visible portions of copy.c to do all of it's work. So why
couldn't we do the same for a set returning function? Of course it wasn't
as simple as that. The existing Begin/NextCopyFrom functions require the
->rel to be a valid Oid...which we won't have in this context, so I had to
bypass that code and use CopyFromRawFields() directly...which isn't
externally visible, hence this being a patch to core rather than an
extension.

Currently the function only accepts two parameters, "filename" and
"is_program". Header is always false and csv mode is always true. Obviously
if we go forward on this, we'll want to add that functionality back in, but
I'm holding off for now to keep the example simple and wait for consensus
on future direction.

As for that future direction, we could either have:
- a robust function named something like copy_srf(), with parameters for
all of the relevant options found in the COPY command
- a function that accepts an options string and parse that
- we could alter the grammar to make COPY RETURNING col1, col3, col5 FROM
'filename' a legit CTE.

Regardless of the path forward, I'm going to need help in getting there,
hence this email. Thank you for your consideration.


copy_as_a_srf.diff
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ICU integration

2016-09-30 Thread Thomas Munro
On Sat, Oct 1, 2016 at 3:30 AM, Peter Eisentraut wrote:
> On 9/23/16 2:27 AM, Thomas Munro wrote:
>> The warning persists even after I reindex all affected tables (and
>> start a new backend), because you're only recording the collation at
>> pg_collation level and then only setting it at initdb time, so that
>> HINT isn't much use for clearing the warning.  I think you'd need to
>> record a snapshot of the version used for each collation that was used
>> on *each index*, and update it whenever you CREATE INDEX or REINDEX
>> etc.  There can of course be more than one collation and thus version
>> involved, if you have columns with different collations, so I guess
>> you'd need a column to hold an array of version snapshots whose order
>> corresponds to pg_index.indcollation's.
>
> Hmm, yeah, that will need more work.  To be completely correct, I think,
> we'd also need to record the version in each expression node, so that
> check constraints of the form CHECK (x > 'abc') can be handled.

Hmm.  That is quite a rabbit hole.  In theory you need to recheck such
a constraint, but it's not at all clear when you should recheck and
what you should do about it if it fails.  Similar for the future
PARTITION feature.

-- 
Thomas Munro
http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PoC: Make it possible to disallow WHERE-less UPDATE and DELETE

2016-09-30 Thread Thomas Munro
On Sat, Oct 1, 2016 at 8:32 AM, Julien Rouhaud wrote:
> On 30/09/2016 21:12, David Fetter wrote:
>> On Fri, Sep 30, 2016 at 06:37:17PM +0200, Julien Rouhaud wrote:
>>> On 30/09/2016 05:23, Thomas Munro wrote:

>>>> It would be really nice to be able to set this to 'Ready for
>>>> Committer' in this CF.  Do you want to post a v6 patch or are you
>>>> happy for me to ask a committer to look at v5 + these three
>>>> corrections?
>>>
>>> I just looked at the patch, and noticed that only plain DELETE and
>>> UPDATE commands are handled.  Is it intended that writable CTEs without
>>> WHERE clauses are not detected by this extension?  I personally think
>>> that wCTE should be handled (everyone can forget a WHERE clause), but if
>>> not it should at least be documented.
>>
>> You are correct in that it should work for every unqualified UPDATE or
>> DELETE, not just some.  Would you be so kind as to send along the
>> test cases you used so I can add them to the patch?
>>
>
> Given your test case, these queries should fail if the related
> require_where GUCs are set (obviously the last one should fail with
> either of the GUCs set):
>
> WITH d AS (DELETE FROM test_require_where) SELECT 1;
>
> WITH u AS (UPDATE test_require_where SET t = t) SELECT 1;
>
> WITH d AS (DELETE FROM test_require_where), u AS (UPDATE
> test_require_where SET t = t) SELECT 1;

Right.  These cases work because they show up as CMD_DELETE/CMD_UPDATE:

postgres=# set require_where.delete = on;
SET
postgres=# with answer as (select 42) delete from foo;
ERROR:  DELETE requires a WHERE clause when require_where.delete is set to on
HINT:  To delete all rows, use "WHERE true" or similar.
postgres=# prepare x as delete from foo;
ERROR:  DELETE requires a WHERE clause when require_where.delete is set to on
HINT:  To delete all rows, use "WHERE true" or similar.

But not this case which shows up as a CMD_SELECT:

postgres=# set require_where.delete = on;
SET
postgres=# with q as (delete from foo) select 42;
┌──┐
│ ?column? │
├──┤
│   42 │
└──┘
(1 row)

I guess you need something involving query_tree_walker or some other
kind of recursive traversal if you want to find DELETE/UPDATE lurking
in there.
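
For instance, a rough, untested sketch that just recurses into the CTE
list directly (rather than using the general walker machinery; the
function name is hypothetical):

	static bool
	statement_has_unqualified_write(Query *query)
	{
		ListCell   *lc;

		/* an unqualified UPDATE/DELETE has a NULL quals tree */
		if ((query->commandType == CMD_UPDATE ||
			 query->commandType == CMD_DELETE) &&
			query->jointree != NULL &&
			query->jointree->quals == NULL)
			return true;

		/* recurse into any writable CTEs hanging off this query level */
		foreach(lc, query->cteList)
		{
			CommonTableExpr *cte = (CommonTableExpr *) lfirst(lc);

			if (statement_has_unqualified_write((Query *) cte->ctequery))
				return true;
		}

		return false;
	}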

One option would be to document it as working for top-level DELETE or
UPDATE, and leave the recursive version as an improvement for a later
patch.  That's the most interesting kind to catch because that's what
people are most likely to type directly into a command line.

-- 
Thomas Munro
http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PoC: Make it possible to disallow WHERE-less UPDATE and DELETE

2016-09-30 Thread Peter Eisentraut
On 9/29/16 11:23 PM, Thomas Munro wrote:
> The regression test fails.  The expected error messages show the old
> wording, so I think you forgot to add a file.  Also, should
> contrib/require_where/Makefile define REGRESS = require_where?  That
> would allow 'make check' from inside that directory, which is
> convenient and matches other extensions.  Please find attached
> fix-regression-test.patch which also applies on top of
> training_wheels_005.patch and fixes those things for me, and also adds
> a couple of extra test cases.

I don't think we need to have a separate data file to load a few test
rows.  A plain INSERT statement will do.

> It would be really nice to be able to set this to 'Ready for
> Committer' in this CF.  Do you want to post a v6 patch or are you
> happy for me to ask a committer to look at v5 + these three
> corrections?

As a committer, I would prefer a single patch to be posted.

Before it gets there, I would still like to see the documentation expanded.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PoC: Make it possible to disallow WHERE-less UPDATE and DELETE

2016-09-30 Thread Julien Rouhaud
On 30/09/2016 21:12, David Fetter wrote:
> On Fri, Sep 30, 2016 at 06:37:17PM +0200, Julien Rouhaud wrote:
>> On 30/09/2016 05:23, Thomas Munro wrote:
>>>
>>> It would be really nice to be able to set this to 'Ready for
>>> Committer' in this CF.  Do you want to post a v6 patch or are you
>>> happy for me to ask a committer to look at v5 + these three
>>> corrections?
>>
>> I just looked at the patch, and noticed that only plain DELETE and
>> UPDATE commands are handled.  Is it intended that writable CTEs without
>> WHERE clauses are not detected by this extension?  I personally think
>> that wCTE should be handled (everyone can forget a WHERE clause), but if
>> not it should at least be documented.
> 
> You are correct in that it should work for every unqualified UPDATE or
> DELETE, not just some.  Would you be so kind as to send along the
> test cases you used so I can add them to the patch?
> 

Given your test case, these queries should fail if the related
require_where GUCs are set (obviously the last one should fail with
either of the GUCs set):

WITH d AS (DELETE FROM test_require_where) SELECT 1;

WITH u AS (UPDATE test_require_where SET t = t) SELECT 1;

WITH d AS (DELETE FROM test_require_where), u AS (UPDATE
test_require_where SET t = t) SELECT 1;

-- 
Julien Rouhaud
http://dalibo.com - http://dalibo.org


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GENERAL] C++ port of Postgres

2016-09-30 Thread Peter Eisentraut
On 9/28/16 10:48 PM, Thomas Munro wrote:
> I wonder if the following bit of gin.h should be more nuanced: maybe
> it's OK to convert between bool and GinTernaryValue, but it's
> definitely not OK to cast between pointer types?  Or maybe we should
> have a function/macro to convert between the types explicitly and not
> encourage people to consider them convertible.

The more I look into this, the more it looks like a web of lies.
:-)  The GIN consistent and triconsistent functions randomly change
around between bool and GinTernaryValue, share some of the same data
structures, and there is little guarantee that they each get the right
kind of value all the time.  This could perhaps use some deeper cleaning
at some point.  I've left that alone for now.

>> 0009-Rename-some-typedefs-to-avoid-name-conflicts.patch
> 
> I don't know if it's a relevant precedent or not, but I noticed that
> fdwapi.h, amapi.h and tsmapi.h used the convention that function
> pointer types are named XXX_function, and then the members of a struct
> behaving as a kind of vtable are named XXX.

Good idea, and that also overlaps with some other stuff I have wanted to
tidy up in pg_dump, so I might get back to that later.

>> 0011-Add-missing-fields-in-struct-initializations.patch
> 
> I don't understand why this is necessary, unless you're explicitly
> choosing to enable a warning like missing-field-initializers for C++
> but not for C.

I can't reproduce this anymore, so never mind.

> As for the commitfest entry: this thread discusses two different
> people's efforts to compile PostgreSQL as C++.  Joy Arulraj's github
> branch derives in some way from Peter Eisentraut's work, but I have
> provided feedback on Peter's patches, because (1) they were posted
> here in patch format and (2) there is a commitfest entry listing
> Peter as the author.  I think several of these patches are
> committable, and many obviously are not.  Heikki already set the CF
> item to 'Ready for Committer' based on an inspection of a few of the
> patches, but invited others to continue looking, so I did.

Yeah, I have committed a few of the patches and I'll close the CF
entry now.  Thanks for your research.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GENERAL] C++ port of Postgres

2016-09-30 Thread Peter Eisentraut
On 9/6/16 2:58 PM, Heikki Linnakangas wrote:
> 0001-0003 look clear to me as well. 0006 - 0009 also seem OK. The rest 
> really only make sense if we decided to make the switch to C++.

I have committed 0001, 0002, 0003, 0006, as well as 0012.  Thomas Munro
had some interesting comments on 0007-0009 that are worth considering
further.

The rest of the patches will be kept around for future amusement.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PoC: Make it possible to disallow WHERE-less UPDATE and DELETE

2016-09-30 Thread David Fetter
On Fri, Sep 30, 2016 at 06:37:17PM +0200, Julien Rouhaud wrote:
> On 30/09/2016 05:23, Thomas Munro wrote:
> > 
> > It would be really nice to be able to set this to 'Ready for
> > Committer' in this CF.  Do you want to post a v6 patch or are you
> > happy for me to ask a committer to look at v5 + these three
> > corrections?
> 
> I just looked at the patch, and noticed that only plain DELETE and
> UPDATE commands are handled.  Is it intended that writable CTEs without
> WHERE clauses are not detected by this extension?  I personally think
> that wCTE should be handled (everyone can forget a WHERE clause), but if
> not it should at least be documented.

You are correct in that it should work for every unqualified UPDATE or
DELETE, not just some.  Would you be so kind as to send along the
test cases you used so I can add them to the patch?

Best,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david(dot)fetter(at)gmail(dot)com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pgbench more operators & functions

2016-09-30 Thread Stephen Frost
Fabien, Jeevan,

* Jeevan Ladhe (jeevan.la...@enterprisedb.com) wrote:
> On Sat, Jul 9, 2016 at 12:14 PM, Fabien COELHO  wrote:
> >> Here is a simple patch which adds a bunch of operators (bitwise: & | ^ ~,
> >> comparisons: =/== <>/!= < <= > >=, logical: and/&& or/|| xor/^^ not/!) and
> >> functions (exp ln if) to pgbench. I've tried to be compatible with
> >> pg's SQL where appropriate.
> >>
> >> Also attached is a simple test script.
> >
> > Here is a slightly updated version.
> 
> I did the review of your patch and here are my views on your patch.

Thanks for the review, I agree with most of your comments and the patch
looks pretty good, in general, but I had a few specific questions.

Apologies if I missed where these were discussed already, I've just
been reading the thread linked from the CF app, so if there is prior
discussion that I should read, please provide a link or Message-ID and
I'll go read the thread(s).

> Purpose of the patch:
> =
> 
> This patch introduces an extensive list of new operators and functions
> that can be used in expressions in pgbench transaction scripts (given
> with the -f option).
> 
> Here is the list of operators and functions introduced by this patch:
> 
> New operators:
> --
> bitwise: <<, >>, &, |, ^/#, ~
> comparisons: =/==, <>/!=, <, <=, >, >=
> logical: and/&&, or/||, xor/^^, not, !

I'm not sure that we want to introduce the operators '&&' and '||' as
logical 'and' and 'or' when those have specific meanings in PG which are
different (array overlaps and concatenation, respectively).  I can
certainly see how it could be useful to be able to perform string
concatenation in a \set command in pgbench, for example..

Also, are we sure that we want to have both '=' and '==' for comparison?

In general, my gut feeling is that we should be trying to make what's
available in PG available in pgbench, though I can see an argument for
making what is available in C available in pgbench, but this seems to be
more of a mix of the two and I think we'd be better off deciding which
we want and sticking to it.

> New functions:
> --
> exp, ln, if

I don't see any particular issue with the exp() and ln() functions.  I
certainly understand how the if() function could be useful, but I'm not
entirely sure that the if(expression, true-result, false-result) is the
best approach.  In particular, I recall cases with other languages where
that gets to be quite ugly when there are multiple levels of if/else
using that approach.

> Documentation:
> =
> I have a small suggestion here: earlier we had only a few operators, so
> it was OK to have the operators embedded in the description of the
> '\set' command itself, but now that we have many more operators, it
> would be a good idea to have a table of operators similar to that of
> functions. We need not have several columns here like description,
> example etc.; a short table just categorizing the operators would be
> sufficient.

The table should really have a row per operator, which is what we do in
pretty much all of the core documentation.  In particular, it seems like
we should try to follow something like:

https://www.postgresql.org/docs/current/static/functions-math.html

Alternatively, if we decide that we really want to try and keep with how
PG works with these operators, we could simply make these operators work
like those and then refer to that page in the core docs.

The question which was brought up about having a line-continuation
(eg: '\') character would be useful, which really makes me wonder if we
shouldn't try to come up with a way to create functions in a pgbench
script that can later be used in a \set command.  With such an approach,
we could have proper control structures for conditionals (if/else),
loops (while/for), etc, and not complicate the single-statement set of
operators with those constructs.

Lastly, we should really add some regression tests to pgbench.  We
already have the start of a TAP test script which it looks like it
wouldn't be too hard to add on to.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] PoC: Make it possible to disallow WHERE-less UPDATE and DELETE

2016-09-30 Thread Alvaro Herrera
Thomas Munro wrote:
> On Thu, Sep 29, 2016 at 6:19 PM, David Fetter  wrote:

> > Please find attached the next revision.
> 
> I'm not sold on ERRCODE_SYNTAX_ERROR.  There's nothing wrong with the
> syntax, since parsing succeeded.  It would be cool if we could use
> ERRCODE_E_R_E_PROHIBITED_SQL_STATEMENT_ATTEMPTED, though I'm not sure
> what error class 38 really means.  Does require_where's hook function
> count as an 'external routine' and on that basis is it allowed to
> raise this error?  Thoughts, anyone?

I don't think it's appropriate to use class 38.  "Part 1: Framework"
saith
  An external routine is an SQL-invoked routine that references some
  compilation unit of a specified programming language that is outside
  the SQL-environment. The mechanism and time of binding of such a
  reference is implementation-defined.
It seems to me that what matters here is that what the user is doing is
an UPDATE, and the restriction is about its SQL-written WHERE clause,
not whether the require_where module is written in SQL or not (which seems
irrelevant to me).  So this is not "external" IMO.

But there's class 2F "SQL routine exception" which has 003 for
"prohibited SQL-statement attempted" ... oh, and we even have that in
errcodes.txt as ERRCODE_S_R_E_PROHIBITED_SQL_STATEMENT_ATTEMPTED.  Seems
apropos.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] COPY command with RLS bug

2016-09-30 Thread Stephen Frost
Adam, Michael,

* Adam Brightwell (adam.brightw...@crunchydata.com) wrote:
> > Looking for and improving test coverage for RLS is a good suggestion,
> > but let's not link the fate of the issue reported here with this
> > requirement. I have spent some time looking at this patch and this
> > looks in rather good shape to me (you even remembered to use the
> > prefix regress_* for the role name that you are adding!). So I have
> > marked this bug fix as ready for committer.
> 
> Excellent, thanks for the review and feedback. I appreciate it.

Attached is an updated patch which fixes a few things and adds a few
regression tests.  In particular, 'location' should be set to -1 as this
is an internally-generated query and there's no location inside the
query string passed in by the user that would make sense (and we
shouldn't ever end up in a situation where this query is failing in a
way that it would make sense to report a location to the user, either).
Also fixed a comment or two.

Comments and testing welcome, of course, though it's looking pretty good
to me at this point and I'll likely commit it in another day or two
unless issues are found.

Thanks!

Stephen
From ddf1247c550e7b30c9b8dc51cda438b7425bc7fa Mon Sep 17 00:00:00 2001
From: Stephen Frost 
Date: Fri, 30 Sep 2016 13:31:22 -0400
Subject: [PATCH] Fix RLS with COPY (col1, col2) FROM tab

Attempting to COPY a subset of columns from a table with RLS enabled
would fail due to an invalid query being constructed (using a single
ColumnRef with the list of fields to extract in 'fields', but that's for
the different levels of indirection for a single column, not for
specifying multiple columns).

Correct by building a ColumnRef and then a ResTarget for each column
being requested and then adding those to the targetList for the select
query.  Include regression tests to hopefully catch if this is broken
again in the future.

Patch-By: Adam Brightwell
Reviewed-By: Michael Paquier
---
 src/backend/commands/copy.c         | 63 +++++++++++++++++++++++++++++++++++++++++++++++++++------------
 src/test/regress/expected/copy2.out | 78 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 src/test/regress/sql/copy2.sql      | 63 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 192 insertions(+), 12 deletions(-)

diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index 432b0ca..e665f06 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -871,6 +871,7 @@ DoCopy(ParseState *pstate, const CopyStmt *stmt, uint64 *processed)
 			ColumnRef  *cr;
 			ResTarget  *target;
 			RangeVar   *from;
+			List	   *targetList = NIL;
 
 			if (is_from)
 ereport(ERROR,
@@ -878,21 +879,59 @@ DoCopy(ParseState *pstate, const CopyStmt *stmt, uint64 *processed)
    errmsg("COPY FROM not supported with row-level security"),
 		 errhint("Use INSERT statements instead.")));
 
-			/* Build target list */
-			cr = makeNode(ColumnRef);
-
+			/*
+			 * Build target list
+			 *
+			 * If no columns are specified in the attribute list of the COPY
+			 * command, then the target list is 'all' columns. Therefore, '*'
+			 * should be used as the target list for the resulting SELECT
+			 * statement.
+			 *
+			 * In the case that columns are specified in the attribute list,
+			 * create a ColumnRef and ResTarget for each column and add them to
+			 * the target list for the resulting SELECT statement.
+			 */
 			if (!stmt->attlist)
+			{
+cr = makeNode(ColumnRef);
 cr->fields = list_make1(makeNode(A_Star));
-			else
-cr->fields = stmt->attlist;
+cr->location = -1;
+
+target = makeNode(ResTarget);
+target->name = NULL;
+target->indirection = NIL;
+target->val = (Node *) cr;
+target->location = -1;
 
-			cr->location = 1;
+targetList = list_make1(target);
+			}
+			else
+			{
+ListCell   *lc;
 
-			target = makeNode(ResTarget);
-			target->name = NULL;
-			target->indirection = NIL;
-			target->val = (Node *) cr;
-			target->location = 1;
+foreach(lc, stmt->attlist)
+{
+	/*
+	 * Build the ColumnRef for each column.  The ColumnRef
+	 * 'fields' property is a String 'Value' node (see
+	 * nodes/value.h) that correspond to the column name
+	 * respectively.
+	 */
+	cr = makeNode(ColumnRef);
+	cr->fields = list_make1(lfirst(lc));
+	cr->location = -1;
+
+	/* Build the ResTarget and add the ColumnRef to it. */
+	target = makeNode(ResTarget);
+	target->name = NULL;
+	target->indirection = NIL;
+	target->val = (Node *) cr;
+	target->location = -1;
+
+	/* Add each column to the SELECT statement's target list */
+	targetList = lappend(targetList, target);
+}
+			}
 
 			/*
 			 * Build RangeVar for from clause, fully qualified based on the
@@ -903,7 +942,7 @@ DoCopy(ParseState *pstate, const CopyStmt *stmt, uint64 *processed)
 
 			/* Build query */
 			select = makeNode(SelectStmt);
-			select->targetList = list_make1(target);
+			select->targetList = targetList;
 			select->fro

Re: [HACKERS] Hash Indexes

2016-09-30 Thread Andres Freund
On 2016-09-30 17:39:04 +0100, Peter Geoghegan wrote:
> On Fri, Sep 30, 2016 at 4:46 PM, Robert Haas  wrote:
> > I would just be very disappointed if, after the amount of work that
> > Amit and others have put into this project, the code gets rejected
> > because somebody thinks a different project would have been more worth
> > doing.
> 
> I wouldn't presume to tell anyone else how to spend their time, and am
> not concerned about this making the hash index code any less useful
> from the user's perspective.

Me neither.

I'm concerned that this is a heck of a lot of work, and I don't think
we've reached the end of it by a good bit. I think it would have been,
and probably still is, a more efficient use of time to go for the
hash-via-btree method, and rip out the current hash indexes.  But that's
just me.

I find it more than a bit odd to be accused of trying to waste others'
time by saying this, and to be told that this is too late because time
has already been invested. Especially the latter has never been a
standard in the community, and while it is excruciatingly painful when
one is the person(s) having invested the time, it probably shouldn't be.


> > The fact that we have hash indexes already and cannot remove them
> > because too much other code depends on hash opclasses is also, in my
> > opinion, a sufficiently good reason to pursue improving them.
> 
> I think that Andres was suggesting that hash index opclasses would be
> usable with hash-over-btree, so you might still not end up with the
> wart of having hash opclasses without hash indexes (an idea that has
> been proposed and rejected at least once before now). Andres?

Yes, that was what I was pretty much thinking. I was kind of guessing
that this might be easiest implemented as a separate AM ("hash2" ;))
that's just a layer on top of nbtree.

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hash Indexes

2016-09-30 Thread Tom Lane
Peter Geoghegan  writes:
> On Fri, Sep 30, 2016 at 4:46 PM, Robert Haas  wrote:
>> The fact that we have hash indexes already and cannot remove them
>> because too much other code depends on hash opclasses is also, in my
>> opinion, a sufficiently good reason to pursue improving them.

> I think that Andres was suggesting that hash index opclasses would be
> usable with hash-over-btree, so you might still not end up with the
> wart of having hash opclasses without hash indexes (an idea that has
> been proposed and rejected at least once before now). Andres?

That's an interesting point.  If we were to flat-out replace the hash AM
with a hash-over-btree AM, the existing hash opclasses would just migrate
to that unchanged.  But if someone wanted to add hash-over-btree alongside
the hash AM, it would be necessary to clone all those opclass entries,
or else find a way for the two AMs to share pg_opclass etc entries.
Either one of those is kind of annoying.  (Although if we did do the work
of implementing the latter, it might come in handy in future; you could
certainly imagine that there will be cases like a next-generation GIST AM
wanting to reuse the opclasses of existing GIST, say.)

But having said that, I remain opposed to removing the hash AM.
If someone wants to implement hash-over-btree, that's cool with me,
but I'd much rather put it in beside plain hash and let them duke
it out in the field.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hash Indexes

2016-09-30 Thread Peter Geoghegan
On Fri, Sep 30, 2016 at 4:46 PM, Robert Haas  wrote:
> I would just be very disappointed if, after the amount of work that
> Amit and others have put into this project, the code gets rejected
> because somebody thinks a different project would have been more worth
> doing.

I wouldn't presume to tell anyone else how to spend their time, and am
not concerned about this patch making the hash index code any less
useful from the user's perspective. If this is how we remove the wart
of hash indexes not being WAL-logged, that's fine by me. I'm trying to
be helpful.

> As Tom said upthread: "But to kick the hash AM as such to the
> curb is to say
> 'sorry, there will never be O(1) index lookups in Postgres'."  I
> think that's correct and a sufficiently-good reason to pursue this
> work, regardless of the merits (or lack of merits) of hash-over-btree.

I don't think that "O(1) index lookups" is a useful guarantee with a
very expensive constant factor. Amit seemed to agree with this, since
he spoke of the importance of both theoretical performance benefits
and practically realizable performance benefits.

> The fact that we have hash indexes already and cannot remove them
> because too much other code depends on hash opclasses is also, in my
> opinion, a sufficiently good reason to pursue improving them.

I think that Andres was suggesting that hash index opclasses would be
usable with hash-over-btree, so you might still not end up with the
wart of having hash opclasses without hash indexes (an idea that has
been proposed and rejected at least once before).

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] sequences and pg_upgrade

2016-09-30 Thread Anastasia Lubennikova

23.09.2016 21:06, Peter Eisentraut:

> Here is an updated patch set.  Compared to the initial set, I have
> changed pg_dump's sorting priorities so that sequence data is always
> after table data.  This would otherwise have introduced a problem
> because sortDataAndIndexObjectsBySize() only considers consecutive
> DO_TABLE_DATA entries.  Also, I have removed the separate
> --sequence-data switch from pg_dump and made it implicit in
> --binary-upgrade.  (So the previous patches 0002 and 0003 have been
> combined, because it's no longer a separate feature.)



The patches are good, no complaints.
But again, I have the same question.
I was confused about why we always dump sequence data,
because I'd overlooked the --sequence-data key. I'd rather keep this
option, because the implicit behaviour is quite non-intuitive...

 /* dump sequence data even in schema-only mode */

--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: [GENERAL] pg_upgrade from 9.5 to 9.6 fails with "invalid argument"

2016-09-30 Thread Thomas Kellerer

Masahiko Sawada schrieb am 30.09.2016 um 12:54:

>> I did this on two different computers, one with Windows 10, the other
>> with Windows 7.
>> (only test-databases, so no real issue anyway)
>>
>> In both cases running a "vacuum full" for the table in question fixed
>> the problem and pg_upgrade finished without problems.
>
> Because vacuum full removes the _vm file, pg_upgrade completed the job
> successfully.
> If you still have the _vm file
> ("d:/Daten/db/pgdata95/base/16410/85358_vm") that led to the error, is
> it possible for you to check whether there is a '\r\n' [0d 0a] sequence
> in that _vm file, or to share that _vm file with us?



Yes, I saved one of the clusters.

The file can be downloaded from here: http://www.kellerer.eu/85358_vm







--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GENERAL] pg_upgrade from 9.5 to 9.6 fails with "invalid argument"

2016-09-30 Thread Andrew Dunstan



On 09/30/2016 12:24 PM, Tom Lane wrote:

> Andrew Dunstan  writes:
>> On 09/30/2016 10:25 AM, Tom Lane wrote:
>>> Oh!  How would I enable or use that?
>
>> 1. Pull the latest version of the module from git.
>> 2. enable it in your buildfarm config file
>> 3. do "run_branches.pl --run-all --verbose --test" and watch the output
>
> Seems to be some additional prep work needed somewhere ...
>
> No upgrade_install_root at
> /home/bfarm/bf-scripts/build-farm-4.17/PGBuild/Modules/TestUpgradeXversion.pm
> line 51.
> No upgrade_install_root at
> /home/bfarm/bf-scripts/build-farm-4.17/PGBuild/Modules/TestUpgradeXversion.pm
> line 51.
> No upgrade_install_root at
> /home/bfarm/bf-scripts/build-farm-4.17/PGBuild/Modules/TestUpgradeXversion.pm
> line 51.





Oh sorry, you also need an entry for that in the config file. Mine has:

   upgrade_install_root => '/home/bf/bfr/root/upgrade',


cheers

andrew


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] On conflict update & hint bits

2016-09-30 Thread Konstantin Knizhnik



On 30.09.2016 19:37, Peter Geoghegan wrote:

> On Fri, Sep 30, 2016 at 5:33 PM, Konstantin Knizhnik wrote:
>> Later we try to check tuple visibility:
>>
>>     ExecCheckHeapTupleVisible(estate, &tuple, buffer);
>>
>> and inside HeapTupleSatisfiesMVCC we try to set a hint bit.
>
> So, you're using repeatable read or serializable isolation level?


Repeatable read.


--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hash Indexes

2016-09-30 Thread Peter Geoghegan
On Fri, Sep 30, 2016 at 4:46 PM, Robert Haas  wrote:
> I would just be very disappointed if, after the amount of work that
> Amit and others have put into this project, the code gets rejected
> because somebody thinks a different project would have been more worth
> doing.

I wouldn't presume to tell anyone else how to spend their time, and am
not concerned about this making the hash index code any less useful
from the user's perspective. If this is how we remove the wart of hash
indexes not being WAL-logged, that's fine by me. I am trying to be
helpful.

> As Tom said upthread: "But to kick the hash AM as such to the
> curb is to say
> 'sorry, there will never be O(1) index lookups in Postgres'."  I
> think that's correct and a sufficiently-good reason to pursue this
> work, regardless of the merits (or lack of merits) of hash-over-btree.

I don't think that "O(1) index lookups" is a useful guarantee with a
very expensive constant factor. Amit said: "I think any discussion on
whether we should consider not to improve current hash indexes is only
meaningful if someone has code which can prove both theoretically
and practically that it is better than hash indexes for all usages",
so I think that he shares this view.

> The fact that we have hash indexes already and cannot remove them
> because too much other code depends on hash opclasses is also, in my
> opinion, a sufficiently good reason to pursue improving them.

I think that Andres was suggesting that hash index opclasses would be
usable with hash-over-btree, so you might still not end up with the
wart of having hash opclasses without hash indexes (an idea that has
been proposed and rejected at least once before now). Andres?

To be clear: I haven't expressed any opinion on this patch.

-- 
Peter Geoghegan




Re: [HACKERS] [GENERAL] pg_upgrade from 9.5 to 9.6 fails with "invalid argument"

2016-09-30 Thread Tom Lane
Masahiko Sawada  writes:

> +#ifndef WIN32
>   if ((src_fd = open(fromfile, O_RDONLY, 0)) < 0)
> +#else
> + if ((src_fd = open(fromfile, O_RDONLY | O_BINARY)) < 0)
> +#endif

This is easier with PG_BINARY.  Also, both open() calls need this.

I'm rather inclined to also stick PG_BINARY into the open() calls in
copy_file() as well.  That function isn't actively broken since it's
not used under Windows, but I think it's highly likely that the
current bug arose from copying-and-pasting that code, so leaving it
in a state that's unsafe for Windows is just asking for trouble.

Attached is the patch I'm currently working with to fix all three
of the portability issues here (PG_BINARY, endianness, and alignment;
there are some cosmetic changes too).

> + pg_log(PG_WARNING,
> +"could not read expected bytes: read = %u, 
> expected = %u\n",
> +BLCKSZ, bytesRead);

I think this is probably better done as part of a wholesale revision
of the error reporting in this file.  What I have in mind is to
redefine copyFile, linkFile, and rewriteVisibilityMap as all being
responsible for reporting their own errors and then exiting (so
they can just return void).  We'd need to pass in the schema name
and table name so that they can include that context, so that the
reports don't get any less complete than they were before, but that
does not seem like a big downside.
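
That is, the signatures might end up looking something like this (a
sketch only; names to taste):

    void copyFile(const char *src, const char *dst,
                  const char *schemaName, const char *relName);
    void linkFile(const char *src, const char *dst,
                  const char *schemaName, const char *relName);
    void rewriteVisibilityMap(const char *fromfile, const char *tofile,
                              const char *schemaName, const char *relName);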

A variant would be to have them print the error messages but then
return a bool success flag, leaving the caller to decide whether
it's fatal or not.  But that would be more complicated and I see
no real reason to think pg_upgrade would ever need it.

regards, tom lane

diff --git a/src/bin/pg_upgrade/file.c b/src/bin/pg_upgrade/file.c
index b33f0b4..9ae92a9 100644
*** a/src/bin/pg_upgrade/file.c
--- b/src/bin/pg_upgrade/file.c
***
*** 18,25 
  #include 
  #include 
  
- #define BITS_PER_HEAPBLOCK_OLD 1
- 
  
  #ifndef WIN32
  static int	copy_file(const char *fromfile, const char *tofile);
--- 18,23 
*** copy_file(const char *srcfile, const cha
*** 84,93 
  		return -1;
  	}
  
! 	if ((src_fd = open(srcfile, O_RDONLY, 0)) < 0)
  		return -1;
  
! 	if ((dest_fd = open(dstfile, O_RDWR | O_CREAT | O_EXCL, S_IRUSR | S_IWUSR)) < 0)
  	{
  		save_errno = errno;
  
--- 82,92 
  		return -1;
  	}
  
! 	if ((src_fd = open(srcfile, O_RDONLY | PG_BINARY, 0)) < 0)
  		return -1;
  
! 	if ((dest_fd = open(dstfile, O_RDWR | O_CREAT | O_EXCL | PG_BINARY,
! 		S_IRUSR | S_IWUSR)) < 0)
  	{
  		save_errno = errno;
  
*** copy_file(const char *srcfile, const cha
*** 153,183 
   * version, we could refuse to copy visibility maps from the old cluster
   * to the new cluster; the next VACUUM would recreate them, but at the
   * price of scanning the entire table.  So, instead, we rewrite the old
!  * visibility maps in the new format.  That way, the all-visible bit
!  * remains set for the pages for which it was set previously.  The
!  * all-frozen bit is never set by this conversion; we leave that to
!  * VACUUM.
   */
  const char *
  rewriteVisibilityMap(const char *fromfile, const char *tofile)
  {
! 	int			src_fd = 0;
! 	int			dst_fd = 0;
! 	char		buffer[BLCKSZ];
! 	ssize_t		bytesRead;
  	ssize_t		totalBytesRead = 0;
  	ssize_t		src_filesize;
  	int			rewriteVmBytesPerPage;
  	BlockNumber new_blkno = 0;
  	struct stat statbuf;
  
! 	/* Compute we need how many old page bytes to rewrite a new page */
  	rewriteVmBytesPerPage = (BLCKSZ - SizeOfPageHeaderData) / 2;
  
  	if ((fromfile == NULL) || (tofile == NULL))
  		return "Invalid old file or new file";
  
! 	if ((src_fd = open(fromfile, O_RDONLY, 0)) < 0)
  		return getErrorText();
  
  	if (fstat(src_fd, &statbuf) != 0)
--- 152,181 
   * version, we could refuse to copy visibility maps from the old cluster
   * to the new cluster; the next VACUUM would recreate them, but at the
   * price of scanning the entire table.  So, instead, we rewrite the old
!  * visibility maps in the new format.  That way, the all-visible bits
!  * remain set for the pages for which they were set previously.  The
!  * all-frozen bits are never set by this conversion; we leave that to VACUUM.
   */
  const char *
  rewriteVisibilityMap(const char *fromfile, const char *tofile)
  {
! 	int			src_fd;
! 	int			dst_fd;
! 	char	   *buffer;
! 	char	   *new_vmbuf;
  	ssize_t		totalBytesRead = 0;
  	ssize_t		src_filesize;
  	int			rewriteVmBytesPerPage;
  	BlockNumber new_blkno = 0;
  	struct stat statbuf;
  
! 	/* Compute number of old-format bytes per new page */
  	rewriteVmBytesPerPage = (BLCKSZ - SizeOfPageHeaderData) / 2;
  
  	if ((fromfile == NULL) || (tofile == NULL))
  		return "Invalid old file or new file";
  
! 	if ((src_fd = open(fromfile, O_RDONLY | PG_BINARY, 0)) < 0)
  		return getErrorText();
  
  	if (fstat(src_fd, &statbuf) != 0)
*** rewriteVisibilityMap(c
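
For reference, a rough sketch -- not the committed code, and
rewrite_vm_byte is a hypothetical helper -- of an endian-safe conversion
consistent with the comments above: process the old map one byte at a
time, so the result is identical on big- and little-endian machines.
Each old all-visible bit becomes the low bit of a 2-bit pair in the new
format, and the all-frozen bit of each pair is left clear for VACUUM:

    #include <stdint.h>

    /* Rewrite one old-format VM byte (8 pages x 1 bit) into two
     * new-format bytes (4 pages x 2 bits each).  Page i's all-visible
     * bit lands at bit position (i % 4) * 2 of new byte i / 4. */
    static void
    rewrite_vm_byte(uint8_t oldbyte, uint8_t *newbytes)
    {
        int     i;

        newbytes[0] = newbytes[1] = 0;
        for (i = 0; i < 8; i++)
        {
            if (oldbyte & (1 << i))
                newbytes[i / 4] |= 1 << ((i % 4) * 2);
        }
    }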

Re: [HACKERS] On conflict update & hint bits

2016-09-30 Thread Peter Geoghegan
On Fri, Sep 30, 2016 at 5:33 PM, Konstantin Knizhnik
 wrote:
> Later we try to check tuple visibility:
>
> ExecCheckHeapTupleVisible(estate, &tuple, buffer);
>
> and inside HeapTupleSatisfiesMVCC try to set hint bit.

So, you're using repeatable read or serializable isolation level?

-- 
Peter Geoghegan




Re: [HACKERS] PoC: Make it possible to disallow WHERE-less UPDATE and DELETE

2016-09-30 Thread Julien Rouhaud
On 30/09/2016 05:23, Thomas Munro wrote:
> 
> It would be really nice to be able to set this to 'Ready for
> Committer' in this CF.  Do you want to post a v6 patch or are you
> happy for me to ask a committer to look at v5 + these three
> corrections?

I just looked at the patch, and noticed that only plain DELETE and
UPDATE commands are handled.  Is it intended that writable CTE without
WHERE clauses are not detected by this extension?  I personally think
that wCTE should be handled (everyone can forget a WHERE clause), but if
not it should at least be documented.

-- 
Julien Rouhaud
http://dalibo.com - http://dalibo.org




Re: [HACKERS] Question / requests.

2016-09-30 Thread Vitaly Burovoy
On 9/30/16, Francisco Olarte  wrote:
> Hello everyone.

Hello, Francisco!

> Also, although I feel confident in my coding, I have zero knowledge of
> developing for postgres,

It is easy enough and all important steps are documented in the wiki.
Also some interesting things can be found in presentations from
hackers about how to hack PostgreSQL.

> and I am not confident in my git or testing abilities.
> I can develop it, as it is just modifying a single libpq
> client program, and only in the easy part of the string lists, maybe
> emitting a new error (as this can introduce a new failure mode of
> 'no tables to vacuum'). I can test it locally and produce a patch for
> that file, but I'm not confident about integrating it, making git patches
> or going further, so I would like to know whether doing that would be
> enough; then I can give the code to someone to review or integrate
> it.

Do your best and send a patch. No one is good enough at understanding
the whole code base at once. There are lots of people who know different
parts of the code and who have the ability to test patches in different
ways.

You can be sure you will get feedback; your code will not be merged into
the code base without deep review and independent testing.

Just be ready to improve your patch according to the feedback, and be
aware that it usually takes several rounds of sending and review before
a patch is committed.

Also, you can follow the discussion of one of the simple patches in a
commitfest to become familiar with the process.

-- 
Best regards,
Vitaly Burovoy




[HACKERS] On conflict update & hint bits

2016-09-30 Thread Konstantin Knizhnik

Hi,

I am faced with a rarely reproduced problem in our multimaster (and
never in vanilla Postgres).
We are using our own customized transaction manager, so the problem may
well be in our multimaster.
But the stack trace looks suspicious, and this is why I want to consult
people familiar with this code on whether it is a bug in
ExecOnConflictUpdate or not.


Briefly: ExecOnConflictUpdate tries to set a hint bit without holding a
lock on the buffer, and so gets an assertion failure in
MarkBufferDirtyHint.


Now the stack trace:

#0  0x7fe3b940acc9 in __GI_raise (sig=sig@entry=6) at 
../nptl/sysdeps/unix/sysv/linux/raise.c:56

#1  0x7fe3b940e0d8 in __GI_abort () at abort.c:89
#2  0x0097b996 in ExceptionalCondition (conditionName=0xb4d970 
"!(LWLockHeldByMe(((LWLock*) (&(bufHdr)->content_lock", 
errorType=0xb4d2e9 "FailedAssertion",

fileName=0xb4d2e0 "bufmgr.c", lineNumber=3380) at assert.c:54
#3  0x007e365b in MarkBufferDirtyHint (buffer=946, buffer_std=1 
'\001') at bufmgr.c:3380
#4  0x009c3660 in SetHintBits (tuple=0x7fe396a9d858, buffer=946, 
infomask=256, xid=1398) at tqual.c:136
#5  0x009c5194 in HeapTupleSatisfiesMVCC (htup=0x7ffc00169030, 
snapshot=0x2e79778, buffer=946) at tqual.c:1065
#6  0x006ace83 in ExecCheckHeapTupleVisible (estate=0x2e81ae8, 
tuple=0x7ffc00169030, buffer=946) at nodeModifyTable.c:197
#7  0x006ae343 in ExecOnConflictUpdate (mtstate=0x2e81d50, 
resultRelInfo=0x2e81c38, conflictTid=0x7ffc001690c0, planSlot=0x2e82428, 
excludedSlot=0x2e82428, estate=0x2e81ae8,

canSetTag=1 '\001', returning=0x7ffc001690c8) at nodeModifyTable.c:1173
#8  0x006ad256 in ExecInsert (mtstate=0x2e81d50, slot=0x2e82428, 
planSlot=0x2e82428, arbiterIndexes=0x2e7eeb0, 
onconflict=ONCONFLICT_UPDATE, estate=0x2e81ae8, canSetTag=1 '\001')

at nodeModifyTable.c:395
#9  0x006aebe3 in ExecModifyTable (node=0x2e81d50) at 
nodeModifyTable.c:1496


In ExecOnConflictUpdate, the buffer is pinned but not locked:

/*
 * Lock tuple for update.  Don't follow updates when tuple cannot be
 * locked without doing so.  A row locking conflict here means our
 * previous conclusion that the tuple is conclusively committed is not
 * true anymore.
 */
tuple.t_self = *conflictTid;
test = heap_lock_tuple(relation, &tuple, estate->es_output_cid,
   lockmode, LockWaitBlock, false, &buffer,
   &hufd);

heap_lock_tuple pins the buffer but does not lock it:
 **buffer: set to buffer holding tuple (pinned but not locked at exit)

Later we try to check tuple visibility:

ExecCheckHeapTupleVisible(estate, &tuple, buffer);

and inside HeapTupleSatisfiesMVCC try to set hint bit.

MarkBufferDirtyHint assumes that the buffer is locked:

 * 2. The caller might have only share-lock instead of exclusive-lock
 *    on the buffer's content lock.

and we get assertion failure in

/* here, either share or exclusive lock is OK */
Assert(LWLockHeldByMe(BufferDescriptorGetContentLock(bufHdr)));

So the question is whether it is correct that ExecOnConflictUpdate tries
to access and update the tuple without holding a lock on the buffer.
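
One possible shape of a fix, as a minimal sketch (not a committed
patch): take a share lock on the buffer around the visibility check,
since heap_lock_tuple() returns with the buffer only pinned:

    LockBuffer(buffer, BUFFER_LOCK_SHARE);
    ExecCheckHeapTupleVisible(estate, &tuple, buffer);
    LockBuffer(buffer, BUFFER_LOCK_UNLOCK);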


Thanks in advance,

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: [HACKERS] [GENERAL] pg_upgrade from 9.5 to 9.6 fails with "invalid argument"

2016-09-30 Thread Masahiko Sawada
On Fri, Sep 30, 2016 at 2:40 PM, Masahiko Sawada  wrote:
> On Fri, Sep 30, 2016 at 1:26 PM, Kyotaro HORIGUCHI
>  wrote:
>> Hello,
>>
>> At Fri, 30 Sep 2016 13:11:21 +0900, Masahiko Sawada  
>> wrote in 
>>> On Fri, Sep 30, 2016 at 7:08 AM, Tom Lane  wrote:
>>> > Alvaro Herrera  writes:
>>> >> Tom Lane wrote:
>>> >>> I wouldn't even put a lot of faith in the errno being meaningful,
>>> >>> considering that it does close() calls before capturing the errno.
>>> >
>>> >> So we do close() in a bunch of places while closing shop, which calls
>>> >> _close() on Windows; this function sets errno.
>>> >
>>> > But only on failure, no?  The close()s usually shouldn't fail, and
>>> > therefore shouldn't change errno, it's just that you can't trust that
>>> > 100%.
>>> >
>>> > I think likely what's happening is that we're seeing a leftover value from
>>> > some previous syscall that set GetLastError's result (and, presumably,
>>> > wasn't fatal so far as pg_upgrade was concerned).
>>> >
>>>
>>> It means that the read() syscall could not read BLCKSZ bytes,
>>> probably because the _vm file is corrupted?
>>>
>>> > if ((src_fd = open(fromfile, O_RDONLY, 0)) < 0)
>>>
>>> Could it happen that the read() syscall stopped reading where it
>>> hit bits representing '\n' characters (0x5c6e) because we don't open
>>> the file with the O_BINARY flag?
>>
>> Windows behaves stupidly there. fread() and even read() convert
>> '\r\n' into '\n' in text mode, so every [0d 0a] sequence in
>> the bytes read shortens the data length by 1 byte.
>>
>
> Ah, [0d 0a] is right.
> After I manually added [0d 0a] into the _vm file, I reproduced this bug on
> Windows.
> I will post the patch.
>

The attached patch fixes this issue; it doesn't include the improvement
to the error messaging.
Please find the file attached.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


add_obinary_flag.patch
Description: Binary data
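
For anyone who wants to see the text-mode behavior in isolation, here is
a minimal standalone demonstration (the file name is just an example
taken from the earlier report):

    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>     /* on Windows, read/close come from <io.h> */

    #ifndef O_BINARY
    #define O_BINARY 0      /* Unix has no text/binary distinction */
    #endif

    int
    main(void)
    {
        char        buf[8192];
        ssize_t     n;

        /* Without O_BINARY, a Windows text-mode read() silently
         * collapses each 0d 0a pair into 0a, so n comes back smaller
         * than the on-disk size -- the "could not read expected bytes"
         * symptom. */
        int         fd = open("85358_vm", O_RDONLY | O_BINARY);

        if (fd < 0)
            return 1;
        n = read(fd, buf, sizeof(buf));
        printf("read %zd bytes\n", n);
        close(fd);
        return 0;
    }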



Re: [HACKERS] [GENERAL] pg_upgrade from 9.5 to 9.6 fails with "invalid argument"

2016-09-30 Thread Tom Lane
Andrew Dunstan  writes:
> On 09/30/2016 10:25 AM, Tom Lane wrote:
>> Oh!  How would I enable or use that?

> 1. Pull the latest version of the module from git.
> 2. enable it in your buildfarm config file
> 3. do "run_branches.pl --run-all --verbose --test" and watch the output

Seems to be some additional prep work needed somewhere ...

No upgrade_install_root at 
/home/bfarm/bf-scripts/build-farm-4.17/PGBuild/Modules/TestUpgradeXversion.pm 
line 51.
No upgrade_install_root at 
/home/bfarm/bf-scripts/build-farm-4.17/PGBuild/Modules/TestUpgradeXversion.pm 
line 51.
No upgrade_install_root at 
/home/bfarm/bf-scripts/build-farm-4.17/PGBuild/Modules/TestUpgradeXversion.pm 
line 51.

regards, tom lane




Re: [HACKERS] Rename max_parallel_degree?

2016-09-30 Thread Julien Rouhaud
On 23/09/2016 21:10, Robert Haas wrote:
> On Fri, Sep 23, 2016 at 8:54 AM, Peter Eisentraut
>  wrote:
>> On 9/20/16 4:07 PM, Robert Haas wrote:
>>> No, I'm assuming that the classes would be built-in.  A string tag
>>> seems like over-engineering to me, particularly because the postmaster
>>> needs to switch on the tag, and we need to be very careful about the
>>> degree to which the postmaster trusts the contents of shared memory.
>>
>> I'm hoping that we can come up with something that extensions can
>> participate in, without the core having to know ahead of time what those
>> extensions are or how they would be categorized.
>>
>> My vision is something like
>>
>> max_processes = 512  # requires restart
>>
>> process_allowances = 'connection:300 superuser:10 autovacuum:10
>> parallel:30 replication:10 someextension:20 someotherextension:20'
>> # does not require restart
> 
> I don't think it's going to be very practical to allow extensions to
> participate in the mechanism because there have to be a finite number
> of slots that is known at the time we create the main shared memory
> segment.
> 
> Also, it's really important that we don't add lots more surface area
> for the postmaster to crash and burn.
> 

It seems that there's no objection on Robert's initial proposal, so I'll
try to implement it.

I've already fixed every other issue mentioned upthread, but I'm facing
a problem for this one.  Assuming that the bgworker classes are supposed
to be mutually exclusive, I don't see a simple and clean way to add such
a check in SanityCheckBackgroundWorker().  Am I missing something
obvious, or can someone give me some advice for this?
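
For reference, the naive shape of such a check might be something like
this (a sketch only; BGWORKER_CLASS_MASK and the per-class flag bits are
assumed defines that don't exist yet), reducing "mutually exclusive" to
"exactly one class bit set":

    int     classbits = worker->bgw_flags & BGWORKER_CLASS_MASK;

    if (classbits == 0 || (classbits & (classbits - 1)) != 0)
    {
        ereport(elevel,
                (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
                 errmsg("background worker \"%s\": exactly one worker class must be set",
                        worker->bgw_name)));
        return false;
    }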

-- 
Julien Rouhaud
http://dalibo.com - http://dalibo.org




Re: [HACKERS] Issue with bgworker, SPI and pgstat_report_stat

2016-09-30 Thread Julien Rouhaud
On 28/09/2016 18:46, Robert Haas wrote:
> 
> Everybody seems happy with this fix for a first step, so I've
> committed it and back-patched it to 9.3.
> 

Thanks!

-- 
Julien Rouhaud
http://dalibo.com - http://dalibo.org




Re: [HACKERS] Hash Indexes

2016-09-30 Thread Robert Haas
On Fri, Sep 30, 2016 at 7:47 AM, Peter Geoghegan  wrote:
> My only firm position is that it wouldn't be very hard to investigate
> hash-over-btree to Andres' satisfaction, say, so, why not? I'm
> surprised that this has caused consternation -- ISTM that Andres'
> suggestion is *perfectly* reasonable. It doesn't appear to be an
> objection to anything in particular.

I would just be very disappointed if, after the amount of work that
Amit and others have put into this project, the code gets rejected
because somebody thinks a different project would have been more worth
doing.  As Tom said upthread: "But to kick the hash AM as such to the
curb is to say
'sorry, there will never be O(1) index lookups in Postgres'."  I
think that's correct and a sufficiently-good reason to pursue this
work, regardless of the merits (or lack of merits) of hash-over-btree.
The fact that we have hash indexes already and cannot remove them
because too much other code depends on hash opclasses is also, in my
opinion, a sufficiently good reason to pursue improving them.  I don't
think the project needs the additional justification of outperforming
a hash-over-btree in order to exist, even if such a comparison could
be done fairly, which I suspect is harder than you're crediting.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




[HACKERS] Question / requests.

2016-09-30 Thread Francisco Olarte
Hello everyone. I've been using the bugs/general mailing lists for a
while, but never been on hackers, so please take that into account.

After some messages due to vacuumdb auto-deadlocking itself on the
system tables when doing a parallel vacuum of a full database, I
suggested adding some flags to make vacuumdb process schemas. I was
asked whether I could write a patch for that and I am thinking of doing
it.

Having begun to read the developer FAQ, I have searched the TODO list
for similar things, and I'm asking here to find out whether someone is
already working on something similar, to avoid duplicating efforts.

What I'm planning to do is add a couple of include-schema /
exclude-schema options to vacuumdb, so you can first do a parallel
vacuum excluding pg_catalog and then a serial one including
pg_catalog to finally tidy the db. Or you can move rarely updated
tables to their own schema and avoid vacuuming them. After that I may
try a couple of shortcuts for the system (in case a future
pg_extra_catalog appears). I was planning on reusing the code that gets
all the tables from the catalog, used in parallel vacuums when no tables
are specified, so the modifications are mainly string juggling, as I
feel the extra time / flexibility gained by doing a finely tuned query
does not justify the extra bug surface added.
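
For concreteness, a hypothetical sketch of the option plumbing (the
option names and letters are not settled) would just add two entries to
vacuumdb's long-options table:

    {"schema", required_argument, NULL, 'n'},
    {"exclude-schema", required_argument, NULL, 'N'},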

I would like to know if someone is doing something intersecting with this.

Also, although I feel confident in my coding, I have zero knowledge of
developing for postgres, and I am not confident in my git or testing
abilities. I can develop it, as it is just modifying a single libpq
client program, and only in the easy part of the string lists, maybe
emitting a new error (as this can introduce a new failure mode of
'no tables to vacuum'). I can test it locally and produce a patch for
that file, but I'm not confident about integrating it, making git patches
or going further, so I would like to know whether doing that would be
enough; then I can give the code to someone to review or integrate
it.

Waiting for orientation.

 Francisco Olarte.




Re: [HACKERS] [GENERAL] pg_upgrade from 9.5 to 9.6 fails with "invalid argument"

2016-09-30 Thread Andrew Dunstan



On 09/30/2016 10:25 AM, Tom Lane wrote:

Andrew Dunstan  writes:

On 09/29/2016 05:48 PM, Tom Lane wrote:

We might've caught these things earlier if the buildfarm testing
included cross-version upgrades, but of course that requires a
lot of test infrastructure that's not there ...

I have done quite a bit of work on this - see


Oh!  How would I enable or use that?



1. Pull the latest version of the module from git.
2. enable it in your buildfarm config file
3. do "run_branches.pl --run-all --verbose --test" and watch the output


cheers

andrew






Re: [HACKERS] ICU integration

2016-09-30 Thread Peter Eisentraut
On 9/23/16 2:27 AM, Thomas Munro wrote:
> This patch adds pkg-config support to our configure script, in order
> to produce the build options for ICU.  That's cool, and I'm a fan of
> pkg-config, but it's an extra dependency that I just wanted to
> highlight.  For example MacOSX appears to ship with ICU but has no
> pkg-config or ICU .pc files out of the box (erm, I can't even find the
> headers, so maybe that copy of ICU is useless to everyone except
> Apple).

The Homebrew package icu4c contains this note:

OS X provides libicucore.dylib (but nothing else).

That probably explains what you are seeing.

> There is something missing from the configure script however: the
> output of pkg-config is not making it into CFLAGS or LDFLAGS, so
> building fails on FreeBSD and MacOSX where for example
>  doesn't live in the default search path.

It sets ICU_CFLAGS and ICU_LIBS, but it seems I didn't put ICU_CFLAGS in
the backend makefiles.  So that should be easy to fix.

> You call the built-in strcoll/strxfrm collation provider "posix", and
> although POSIX does define those functions, it only does so because it
> inherits them from ISO C90.

POSIX defines strcoll_l() and such.  But I agree SYSTEM might be better.

> I notice that you use encoding -1, meaning "all".

The use of encoding -1 is existing behavior.

> I haven't fully
> grokked what that really means but it appears that we won't be able to
> create any new such collations after initdb using CREATE COLLATION, if
> for example you update your ICU installation and now have a new
> collation available.  When I tried dropping some and recreating them
> they finished up with collencoding = 6.  Should the initdb-created
> rows use 6 too?

Good observation.  That will need some fine-tuning.

> The warning persists even after I reindex all affected tables (and
> start a new backend), because you're only recording the collation at
> pg_collation level and then only setting it at initdb time, so that
> HINT isn't much use for clearing the warning.  I think you'd need to
> record a snapshot of the version used for each collation that was used
> on *each index*, and update it whenever you CREATE INDEX or REINDEX
> etc.  There can of course be more than one collation and thus version
> involved, if you have columns with different collations, so I guess
> you'd need an column to hold an array of version snapshots whose order
> corresponds to pg_index.indcollation's.

Hmm, yeah, that will need more work.  To be completely correct, I think,
we'd also need to record the version in each expression node, so that
check constraints of the form CHECK (x > 'abc') can be handled.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] [GENERAL] pg_upgrade from 9.5 to 9.6 fails with "invalid argument"

2016-09-30 Thread Tom Lane
Andrew Dunstan  writes:
> On 09/29/2016 05:48 PM, Tom Lane wrote:
>> We might've caught these things earlier if the buildfarm testing
>> included cross-version upgrades, but of course that requires a
>> lot of test infrastructure that's not there ...

> I have done quite a bit of work on this - see 
> 

Oh!  How would I enable or use that?

> This actually runs on crake, and it has found problems in the past which 
> we've fixed. However, it requires a good deal of disk space (4GB or 
> so), and the results are not stable, which is why I haven't enabled it.

Fair enough, but right now I'd like to do a one-shot test using a
big-endian machine to see whether it catches the apparent endianness
problem in rewriteVisibilityMap.  I have a feeling that that is something
that won't easily be caught by automated checks, because it requires a
case where the table pages are in a variety of visibility states, and even
if corruption has happened, only index-only-scan queries would notice.

regards, tom lane




Re: [HACKERS] pg_basebackup stream xlog to tar

2016-09-30 Thread Magnus Hagander
On Thu, Sep 29, 2016 at 12:44 PM, Magnus Hagander 
wrote:

> On Mon, Sep 5, 2016 at 10:01 AM, Michael Paquier <
> michael.paqu...@gmail.com> wrote:
>
>> On Sat, Sep 3, 2016 at 10:35 PM, Magnus Hagander 
>> wrote:
>> > Ugh. That would be nice to have, but I think that's outside the scope of
>> > this patch.
>>
>> A test for this patch that could have value would be to use
>> pg_basebackup -X stream -Ft, then untar pg_xlog.tar and look at the
>> size of the segments. If you have an idea to untar something without
>> the in-core perl support because we need to have the TAP stuff able to
>> run on at least 5.8.8, I'd welcome an idea. Honestly I still have
>> none, and that's why the recovery tests do not use tarballs in their
>> tests when using pg_basebackup. In short let's not add something more
>> for this patch.
>>
>> > PFA is an updated version of this patch that:
>> > * documents a magic value passed to zlib (which is in their
>> documentation as
>> > being a magic value, but has no define)
>> > * fixes the padding of tar files
>> > * adds a most basic test that the -X stream -Ft does produce a tarfile
>>
>> So I had a more serious look at this patch, and it basically makes
>> more generic the operations done for the plain mode by adding a set of
>> routines that can be used by both tar and plain mode to work on the
>> WAL files streamed. Elegant.
>>
>> +   
>> +The transaction log files will be written to
>> + the base.tar file.
>> +   
>> Nit: number of spaces here.
>>
>
> Fixed.
>
>
>
>> -mark_file_as_archived(const char *basedir, const char *fname)
>> +mark_file_as_archived(StreamCtl *stream, const char *fname)
>> Just passing WalMethod as argument would be enough, but... My patch
>> adding the fsync calls to pg_basebackup could just make use of
>> StreamCtl, so let's keep it as you suggest.
>>
>
> Yeah, I think it's cleaner to pass the whole structure around really. If
> not now, we'd need it eventually. That makes all callers more consistent.
>
>
>
>>  static bool
>>  existsTimeLineHistoryFile(StreamCtl *stream)
>>  {
>> [...]
>> +   return stream->walmethod->existsfile(histfname);
>>  }
>> existsfile returns always false for the tar method. This does not
>> matter much because pg_basebackup exists immediately in case of a
>> failure, but I think that this deserves a comment in ReceiveXlogStream
>> where existsTimeLineHistoryFile is called.
>>
>
> OK, added. As you say, the behaviour is expected, but it makes sense to
> mention it clearly there.
>
>
>
>> I find the use of existsfile() in open_walfile() rather confusing
>> because this relies on the fact  that existsfile() returns always
>> false for the tar mode. We could add an additional field in WalMethod
>> to store the method type and use that more, but that may make the code
>> more confusing than what you propose. What do you think?
>>
>
> Yeah, I'm not sure that helps. The point is that the abstraction is
> supposed to take care of that. But if it's confusing, then clearly a
> comment is warranted there, so I've added that. Do you think that makes it
> clear enough?
>
>
>
>>
>> +   int (*unlink) (const char *pathname);
>> The unlink method is used nowhere. This could just be removed.
>>
>
> That's clearly a missed cleanup. Removed, thanks.
>
>
>
>> -static void
>> +void
>>  print_tar_number(char *s, int len, uint64 val)
>> This could be an independent patch. Or not.
>>
>
> Could be, but we don't really have any other uses for it.
>
>
>
>> I think that I found another bug regarding the contents of the
>> segments. I did pg_basebackup -F t -X stream, then untared pg_xlog.tar
>> which contained segment 1/0/2, then:
>> $ pg_xlogdump 00010002
>> pg_xlogdump: FATAL:  could not find a valid record after 0/200
>> I'd expect this segment to have records, up to a XLOG_SWITCH record.
>>
>
> Ugh. That's definitely broken yes. It seeked back and overwrote the tar
> header with the data, instead of starting where the file part was supposed
> to be. It worked fine on compressed files, and it's when implementing that
> that it broke.
>
> So what's our basic rule for these perl tests - are we allowed to use
> pg_xlogdump from within a pg_basebackup test? If so that could actually be
> a useful test - do the backup, extract the xlog and verify that it contains
> the SWITCH record.
>
>
> I also noticed that using -Z5 created a .tar.gz and -z created a .tar
> (which was compressed).  Because compresslevel is set to -1 with -z,
> meaning default.
>
>
> Again, apologies for getting late back into the game here.
>
>
And here's yet another version, now rebased on top of the fsync and nosync
changes that got applied.

In particular, this conflicted with pretty much every single change from
the fsync patch, so I'm definitely looking for another round of review
before this can be committed.

I ended up moving much of the fsync stuff into walmethods.c, since they
were dependent on whether we used tar or not.

Re: [HACKERS] wal_segment size vs max_wal_size

2016-09-30 Thread Michael Paquier
On Fri, Sep 30, 2016 at 11:05 PM, Peter Eisentraut
 wrote:
> On 9/26/16 8:38 PM, Michael Paquier wrote:
>> On Mon, Sep 26, 2016 at 9:30 PM, Kuntal Ghosh
>>  wrote:
>>> On Mon, Sep 26, 2016 at 5:04 PM, Amit Kapila  
>>> wrote:

 IIRC, there is already a patch to update the minRecoveryPoint
 correctly, can you check if that solves the problem for you?

 [1] - 
 https://www.postgresql.org/message-id/20160609.215558.118976703.horiguchi.kyotaro%40lab.ntt.co.jp

>>> +1. I've tested after applying the patch. This clearly solves the problem.
>>
>> Even if many things have been discussed on this thread,
>> Horiguchi-san's first patch is still the best approach found after
>> several lookups and attempts when messing with the recovery code.
>
> What is the status of that patch then?  The above thread seems to have
> stopped.

The conclusion is to use the original patch proposed by Horiguchi-san,
and with a test case I have added you get that:
https://www.postgresql.org/message-id/CAB7nPqTv5gmKQcNDoFGTGqoqXz2xLz4RRw247oqOJzZTVy6-7Q%40mail.gmail.com
-- 
Michael




Re: [HACKERS] [GENERAL] pg_upgrade from 9.5 to 9.6 fails with "invalid argument"

2016-09-30 Thread Andrew Dunstan



On 09/29/2016 05:48 PM, Tom Lane wrote:

We might've caught these things earlier if the buildfarm testing
included cross-version upgrades, but of course that requires a
lot of test infrastructure that's not there ...





I have done quite a bit of work on this - see 



This actually runs on crake, and it has found problems in the past which 
we've fixed. However, it requires a good deal of disk space (4GB or 
so), and the results are not stable, which is why I haven't enabled it.


What I am thinking of doing is making the diff tests fuzzy - if, say, we 
don't get a diff of more than 2000 lines we won't consider it a failure.


Among this module's advantages is that it tests upgrading quite a bit 
more than the standard pg_upgrade check - all the buildfarm's regression 
databases, not just the one created by "make check". For example, 
currently it is failing to upgrade from 9.6 to HEAD with this error:


   Could not load library "$libdir/hstore_plpython2"
   ERROR:  could not load library
   "/home/bf/bfr/root/upgrade/HEAD/inst/lib/postgresql/hstore_plpython2.so":
   /home/bf/bfr/root/upgrade/HEAD/inst/lib/postgresql/hstore_plpython2.so:
   undefined symbol: _Py_NoneStruct

I need to look into that.

cheers

andrew





Re: [HACKERS] wal_segment size vs max_wal_size

2016-09-30 Thread Peter Eisentraut
On 9/26/16 8:38 PM, Michael Paquier wrote:
> On Mon, Sep 26, 2016 at 9:30 PM, Kuntal Ghosh
>  wrote:
>> On Mon, Sep 26, 2016 at 5:04 PM, Amit Kapila  wrote:
>>>
>>> IIRC, there is already a patch to update the minRecoveryPoint
>>> correctly, can you check if that solves the problem for you?
>>>
>>> [1] - 
>>> https://www.postgresql.org/message-id/20160609.215558.118976703.horiguchi.kyotaro%40lab.ntt.co.jp
>>>
>> +1. I've tested after applying the patch. This clearly solves the problem.
> 
> Even if many things have been discussed on this thread,
> Horiguchi-san's first patch is still the best approach found after
> several lookups and attempts when messing with the recovery code.

What is the status of that patch then?  The above thread seems to have
stopped.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] pg_dump / copy bugs with "big lines" ?

2016-09-30 Thread Daniel Verite
Tomas Vondra wrote:

> A few minor comments regarding the patch:
> 
> 1) CopyStartSend seems pretty pointless - It only has one function call 
> in it, and is called on exactly one place (and all other places simply 
> call allowLongStringInfo directly). I'd get rid of this function and 
> replace the call in CopyOneRowTo(() with allowLongStringInfo().
> 
> 2) allowlong seems awkward, allowLong or allow_long would be better
> 
> 3) Why does resetStringInfo reset the allowLong flag? Are there any 
> cases when we want/need to forget the flag value? I don't think so, so 
> let's simply not reset it and get rid of the allowLongStringInfo() 
> calls. Maybe it'd be better to invent a new makeLongStringInfo() method 
> instead, which would make it clear that the flag value is permanent.
> 
> 4) HandleParallelMessage needs a tweak, as it uses msg->len in a log 
> message, but with '%d' and not '%ld' (as needed after changing the type 
> to Size).
>
> 5) The comment at allowLongStringInfo talks about allocLongStringInfo 
> (i.e. wrong function name).

Here's an updated patch. Compared to the previous version:

- removed CopyStartSend (per comment #1 in review)

- renamed flag to allow_long (comment #2)

- resetStringInfo no longer resets the flag (comment #3).

- allowLongStringInfo() is removed (comment #3 and #5),
makeLongStringInfo() and initLongStringInfo() are provided
instead, as alternatives to makeStringInfo() and initStringInfo().

- StringInfoData.len is back to int type, 2GB max.
(addresses comment #4 incidentally).
This is safer because many routines peeking
into StringInfoData use int for offsets into the buffer,
for instance most of the stuff in backend/libpq/pqformat.c.
Although these routines are not supposed to be called on long
buffers, this assertion was not enforced in the code, and
with a 64-bit length effectively over 2GB, they would misbehave
pretty badly.
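
As a usage sketch (hedged -- this just mirrors what the copy.c hunks
below do; data and datalen stand in for the caller's payload), callers
that may accumulate more than 1GB opt in explicitly:

    #include "lib/stringinfo.h"

    StringInfoData buf;

    initLongStringInfo(&buf);   /* opts in to huge allocations */
    /* may now grow past the normal 1GB palloc limit, up to 2GB */
    appendBinaryStringInfo(&buf, data, datalen);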


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
diff --git a/src/backend/access/common/heaptuple.c 
b/src/backend/access/common/heaptuple.c
index 6d0f3f3..ed9cabd 100644
--- a/src/backend/access/common/heaptuple.c
+++ b/src/backend/access/common/heaptuple.c
@@ -741,7 +741,9 @@ heap_form_tuple(TupleDesc tupleDescriptor,
 * Allocate and zero the space needed.  Note that the tuple body and
 * HeapTupleData management structure are allocated in one chunk.
 */
-   tuple = (HeapTuple) palloc0(HEAPTUPLESIZE + len);
+   tuple = MemoryContextAllocExtended(CurrentMemoryContext,
+  
HEAPTUPLESIZE + len,
+  
MCXT_ALLOC_HUGE | MCXT_ALLOC_ZERO);
tuple->t_data = td = (HeapTupleHeader) ((char *) tuple + HEAPTUPLESIZE);
 
/*
diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index 432b0ca..7419362 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -397,7 +397,7 @@ ReceiveCopyBegin(CopyState cstate)
pq_sendint(&buf, format, 2);/* per-column 
formats */
pq_endmessage(&buf);
cstate->copy_dest = COPY_NEW_FE;
-   cstate->fe_msgbuf = makeStringInfo();
+   cstate->fe_msgbuf = makeLongStringInfo();
}
else if (PG_PROTOCOL_MAJOR(FrontendProtocol) >= 2)
{
@@ -1892,7 +1892,7 @@ CopyTo(CopyState cstate)
cstate->null_print_client = cstate->null_print; /* default */
 
/* We use fe_msgbuf as a per-row buffer regardless of copy_dest */
-   cstate->fe_msgbuf = makeStringInfo();
+   cstate->fe_msgbuf = makeLongStringInfo();
 
/* Get info about the columns we need to process. */
cstate->out_functions = (FmgrInfo *) palloc(num_phys_attrs * 
sizeof(FmgrInfo));
@@ -2707,8 +2707,8 @@ BeginCopyFrom(ParseState *pstate,
cstate->cur_attval = NULL;
 
/* Set up variables to avoid per-attribute overhead. */
-   initStringInfo(&cstate->attribute_buf);
-   initStringInfo(&cstate->line_buf);
+   initLongStringInfo(&cstate->attribute_buf);
+   initLongStringInfo(&cstate->line_buf);
cstate->line_buf_converted = false;
cstate->raw_buf = (char *) palloc(RAW_BUF_SIZE + 1);
cstate->raw_buf_index = cstate->raw_buf_len = 0;
diff --git a/src/backend/lib/stringinfo.c b/src/backend/lib/stringinfo.c
index 7382e08..0125047 100644
--- a/src/backend/lib/stringinfo.c
+++ b/src/backend/lib/stringinfo.c
@@ -37,6 +37,24 @@ makeStringInfo(void)
 }
 
 /*
+ * makeLongStringInfo
+ *
+ * Same as makeStringInfo for larger strings.
+ */
+StringInfo
+makeLongStringInfo(void)
+{
+   StringInfo  res;
+
+   res = (StringInfo) palloc(sizeof(StringInfoData));
+
+   initLongStringInfo(res);
+
+   return res;
+}
+
+
+/*
  * initStringInfo
  *
  * Initialize a StringInfoData struct (with previously undefine

Re: [HACKERS] Tuplesort merge pre-reading

2016-09-30 Thread Peter Geoghegan
On Thu, Sep 29, 2016 at 4:10 PM, Heikki Linnakangas  wrote:
> Bah, I fumbled the initSlabAllocator() function, attached is a fixed
> version.

This looks much better. It's definitely getting close. Thanks for
being considerate of my more marginal concerns. More feedback:

* Should say "fixed number of...":

> +* we start merging.  Merging only needs to keep a small, fixed number 
> tuples

* Minor concern about these new macros:

> +#define IS_SLAB_SLOT(state, tuple) \
> +   ((char *) tuple >= state->slabMemoryBegin && \
> +(char *) tuple < state->slabMemoryEnd)
> +
> +/*
> + * Return the given tuple to the slab memory free list, or free it
> + * if it was palloc'd.
> + */
> +#define RELEASE_SLAB_SLOT(state, tuple) \
> +   do { \
> +   SlabSlot *buf = (SlabSlot *) tuple; \
> +   \
> +   if (IS_SLAB_SLOT(state, tuple)) \
> +   { \
> +   buf->nextfree = state->slabFreeHead; \
> +   state->slabFreeHead = buf; \
> +   } else \
> +   pfree(tuple); \
> +   } while(0)

I suggest duplicating the paranoia seen elsewhere around what "state"
macro argument could expand to. You know, by surrounding "state" with
parenthesis each time it is used. This is what we see with existing,
similar macros.
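
For instance (a sketch that parenthesizes both arguments):

    #define IS_SLAB_SLOT(state, tuple) \
        ((char *) (tuple) >= (state)->slabMemoryBegin && \
         (char *) (tuple) < (state)->slabMemoryEnd)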

* Should cast to int64 here (for the benefit of win64):

> +   elog(LOG, "using " INT64_FORMAT " KB of memory for read buffers among 
> %d input tapes",
> +(long) (availBlocks * BLCKSZ) / 1024, numInputTapes);
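
That is, something like (an untested sketch):

    elog(LOG, "using " INT64_FORMAT " KB of memory for read buffers among %d input tapes",
         (int64) (availBlocks * BLCKSZ) / 1024, numInputTapes);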

* FWIW, I still don't love this bit:

> +* numTapes and numInputTapes reflect the actual number of tapes we will
> +* use.  Note that the output tape's tape number is maxTapes - 1, so the
> +* tape numbers of the used tapes are not consecutive, so you cannot
> +* just loop from 0 to numTapes to visit all used tapes!
> +*/
> +   if (state->Level == 1)
> +   {
> +   numInputTapes = state->currentRun;
> +   numTapes = numInputTapes + 1;
> +   FREEMEM(state, (state->maxTapes - numTapes) * TAPE_BUFFER_OVERHEAD);
> +   }

But I can see how the verbosity of almost-duplicating the activeTapes
stuff seems unappealing. That said, I think that you should point out
in comments that you're calculating the number of
maybe-active-in-some-merge tapes. They're maybe-active in that they
have some number of real tapes. Not going to insist on that, but
something to think about.

* Shouldn't this use state->tapeRange?:

> +   else
> +   {
> +   numInputTapes = state->maxTapes - 1;
> +   numTapes = state->maxTapes;
> +   }

* Doesn't it also set numTapes without it being used? Maybe that
variable can be declared within the "if (state->Level == 1)" block.

* Minor issues with initSlabAllocator():

You call the new function initSlabAllocator() as follows:

> +   if (state->tuples)
> +   initSlabAllocator(state, numInputTapes + 1);
> +   else
> +   initSlabAllocator(state, 0);

Isn't the number of slots (the second argument to initSlabAllocator())
actually just numInputTapes when we're "state->tuples"? And so,
shouldn't the "+ 1" bit happen within initSlabAllocator() itself? It
can just inspect "state->tuples" itself. In short, maybe push a bit
more into initSlabAllocator(). Making the arguments match those passed
to initTapeBuffers() a bit later would be nice, perhaps.

* This could be simpler, I think:

> +   /* Release the read buffers on all the other tapes, by rewinding them. */
> +   for (tapenum = 0; tapenum < state->maxTapes; tapenum++)
> +   {
> +   if (tapenum == state->result_tape)
> +   continue;
> +   LogicalTapeRewind(state->tapeset, tapenum, true);
> +   }

Can't you just use state->tapeRange, and remove the "continue"? I
recommend referring to "now-exhausted input tapes" here, too.

* I'm not completely prepared to give up on using
MemoryContextAllocHuge() within logtape.c just yet. Maybe you're right
that it couldn't possibly matter that we impose a MaxAllocSize cap
within logtape.c (per tape), but I have slight reservations that I
need to address. Maybe a better way of putting it would be that I have
some reservations about possible regressions at the very high end,
with very large workMem. Any thoughts on that?

-- 
Peter Geoghegan




Re: [HACKERS] [PATCH] Refactoring: rename md5Salt to pwsalt

2016-09-30 Thread Michael Paquier
On Fri, Sep 30, 2016 at 9:40 PM, Tom Lane  wrote:
> Aleksander Alekseev  writes:
>> Suggested patch (first of many, I hope) renames `md5Salt` to more
>> general `pwsalt`.
>> Does it sound reasonable?
>
> I'm dubious.  The main problem with supposing that port->md5Salt
> can serve other purposes is its fixed size.  I think you're likely
> going to have to change that representation at some point (eg
> make it a separately-palloc'd field).  My inclination would be to
> do the field renaming at the same time you change the representation,
> since that provides a convenient way to ensure you've caught every
> place that has to change.

SCRAM is going to use more than 4 bytes here. RFC 5802 does not
directly give a length; the last set of patches has been using 10 bytes,
but in the end we are very likely to use more than that, and certainly
not 4.
-- 
Michael




Re: [HACKERS] [PATCH] Refactoring: rename md5Salt to pwsalt

2016-09-30 Thread Tom Lane
Aleksander Alekseev  writes:
> Suggested patch (first of many, I hope) renames `md5Salt` to more
> general `pwsalt`.
> Does it sound reasonable?

I'm dubious.  The main problem with supposing that port->md5Salt
can serve other purposes is its fixed size.  I think you're likely
going to have to change that representation at some point (eg
make it a separately-palloc'd field).  My inclination would be to
do the field renaming at the same time you change the representation,
since that provides a convenient way to ensure you've caught every
place that has to change.
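
For illustration only, the eventual representation might look something
like this (field names here are placeholders, nothing settled):

    char       *pwsalt;         /* separately palloc'd, method-dependent salt */
    int         pwsalt_len;     /* 4 for MD5; longer for SCRAM */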

regards, tom lane




[HACKERS] [PATCH] Refactoring: rename md5Salt to pwsalt

2016-09-30 Thread Aleksander Alekseev
Hello.

Since there are plans/efforts to introduce additional authentication
methods in the near future, I suggest refactoring the code so it
wouldn't mention md5 where possible. `md5Salt`, for instance, could be
not only an "md5 salt" but also a "sha2 salt", etc., depending on which
authentication method was chosen.

Suggested patch (first of many, I hope) renames `md5Salt` to more
general `pwsalt`.

Does it sound reasonable?

-- 
Best regards,
Aleksander Alekseev
diff --git a/src/backend/libpq/auth.c b/src/backend/libpq/auth.c
index 0ba8530..25bb4c2 100644
--- a/src/backend/libpq/auth.c
+++ b/src/backend/libpq/auth.c
@@ -536,7 +536,7 @@ ClientAuthentication(Port *port)
 		(errcode(ERRCODE_INVALID_AUTHORIZATION_SPECIFICATION),
 		 errmsg("MD5 authentication is not supported when \"db_user_namespace\" is enabled")));
 			/* include the salt to use for computing the response */
-			sendAuthRequest(port, AUTH_REQ_MD5, port->md5Salt, 4);
+			sendAuthRequest(port, AUTH_REQ_MD5, port->pwsalt, 4);
 			status = recv_and_check_password_packet(port, &logdetail);
 			break;
 
diff --git a/src/backend/libpq/crypt.c b/src/backend/libpq/crypt.c
index d84a180..98f3315 100644
--- a/src/backend/libpq/crypt.c
+++ b/src/backend/libpq/crypt.c
@@ -96,8 +96,8 @@ md5_crypt_verify(const Port *port, const char *role, char *client_pass,
 			{
 /* stored password already encrypted, only do salt */
 if (!pg_md5_encrypt(shadow_pass + strlen("md5"),
-	port->md5Salt,
-	sizeof(port->md5Salt), crypt_pwd))
+	port->pwsalt,
+	sizeof(port->pwsalt), crypt_pwd))
 {
 	pfree(crypt_pwd);
 	return STATUS_ERROR;
@@ -118,8 +118,8 @@ md5_crypt_verify(const Port *port, const char *role, char *client_pass,
 	return STATUS_ERROR;
 }
 if (!pg_md5_encrypt(crypt_pwd2 + strlen("md5"),
-	port->md5Salt,
-	sizeof(port->md5Salt),
+	port->pwsalt,
+	sizeof(port->pwsalt),
 	crypt_pwd))
 {
 	pfree(crypt_pwd);
diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index 0c0a609..b7ab8dd 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -2350,7 +2350,7 @@ ConnCreate(int serverFd)
 	 * after.  Else the postmaster's random sequence won't get advanced, and
 	 * all backends would end up using the same salt...
 	 */
-	RandomSalt(port->md5Salt, sizeof(port->md5Salt));
+	RandomSalt(port->pwsalt, sizeof(port->pwsalt));
 
 	/*
 	 * Allocate GSSAPI specific state struct
diff --git a/src/include/libpq/libpq-be.h b/src/include/libpq/libpq-be.h
index b91eca5..6b7935c 100644
--- a/src/include/libpq/libpq-be.h
+++ b/src/include/libpq/libpq-be.h
@@ -144,7 +144,7 @@ typedef struct Port
 	 * Information that needs to be held during the authentication cycle.
 	 */
 	HbaLine*hba;
-	char		md5Salt[4];		/* Password salt */
+	char		pwsalt[4];		/* Password salt */
 
 	/*
 	 * Information that really has no business at all being in struct Port,
diff --git a/src/interfaces/libpq/fe-auth.c b/src/interfaces/libpq/fe-auth.c
index 404bc93..9123d5b 100644
--- a/src/interfaces/libpq/fe-auth.c
+++ b/src/interfaces/libpq/fe-auth.c
@@ -522,8 +522,8 @@ pg_password_sendauth(PGconn *conn, const char *password, AuthRequest areq)
 	free(crypt_pwd);
 	return STATUS_ERROR;
 }
-if (!pg_md5_encrypt(crypt_pwd2 + strlen("md5"), conn->md5Salt,
-	sizeof(conn->md5Salt), crypt_pwd))
+if (!pg_md5_encrypt(crypt_pwd2 + strlen("md5"), conn->pwsalt,
+	sizeof(conn->pwsalt), crypt_pwd))
 {
 	free(crypt_pwd);
 	return STATUS_ERROR;
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index f3a9e5a..7529fd5 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -2441,8 +2441,8 @@ keep_going:		/* We will come back to here until there is
 /* Get the password salt if there is one. */
 if (areq == AUTH_REQ_MD5)
 {
-	if (pqGetnchar(conn->md5Salt,
-   sizeof(conn->md5Salt), conn))
+	if (pqGetnchar(conn->pwsalt,
+   sizeof(conn->pwsalt), conn))
 	{
 		/* We'll come back when there are more data */
 		return PGRES_POLLING_READING;
diff --git a/src/interfaces/libpq/libpq-int.h b/src/interfaces/libpq/libpq-int.h
index be6c370..1e18688 100644
--- a/src/interfaces/libpq/libpq-int.h
+++ b/src/interfaces/libpq/libpq-int.h
@@ -389,7 +389,7 @@ struct pg_conn
 	/* Miscellaneous stuff */
 	int			be_pid;			/* PID of backend --- needed for cancels */
 	int			be_key;			/* key of backend --- needed for cancels */
-	char		md5Salt[4];		/* password salt received from backend */
+	char		pwsalt[4];		/* password salt received from backend */
 	pgParameterStatus *pstatus; /* ParameterStatus data */
 	int			client_encoding;	/* encoding id */
 	bool		std_strings;	/* standard_conforming_strings */




Re: [HACKERS] Learning to hack Postgres - Keeping track of ctids

2016-09-30 Thread Craig Ringer
On 30 Sep. 2016 19:36, "Emrul"  wrote:
>
> Thanks Craig, I will look at that code.
>
> The alternative I'm considering is whether I can define my own index type
on
> the column storing my array of primary key ids.  The index would, for each
> primary key id just provide a reference to the ctids of the related
records.
>
> In theory, I think this might be less work.  However, I have yet to find a
> simple/bare index implementation that I can build upon.

I don't think indexes are fully pluggable yet. And I'd be surprised if it
were possible within GiST/GIN. Check out the 9.6 / v10 pg_am changes,
which I think may be relevant.

Or it could be an unrelated dead end because I know little about indexing.

>
>
>


Re: [HACKERS] Aggregate Push Down - Performing aggregation on foreign server

2016-09-30 Thread Jeevan Chalke
On Mon, Sep 26, 2016 at 6:15 PM, Ashutosh Bapat <
ashutosh.ba...@enterprisedb.com> wrote:

> This patch will need some changes to conversion_error_callback(). That
> function reports an error in case there was an error converting the
> result obtained from the foreign server into an internal datum e.g.
> when the string returned by the foreign server is not acceptable by
> local input function for the expected datatype. In such cases, the
> input function will throw error and conversion_error_callback() will
> provide appropriate context for that error. postgres_fdw.sql has tests
> to test proper context
> We need to fix the error context to provide meaningful information or
> at least not crash. This has been discussed briefly in [1].
>

Oops. I had that in mind when working on this. Somehow skipped checking
for conversion error context. I have fixed that in v3 patch.
Removed assert and for non Var expressions, printing generic context.
This context is almost in-line with the discussion you referred here.

Thanks

-- 
Jeevan B Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company


Re: [HACKERS] Hash Indexes

2016-09-30 Thread Peter Geoghegan
On Fri, Sep 30, 2016 at 9:07 AM, Amit Kapila  wrote:
> Considering that I may have missed the real intention of their suggestions, I think
> such a serious objection to any work should be discussed on-list.  To answer
> the actual objection, I have already mentioned upthread that we can deduce
> from the current tests done by Jesper and Mithun that there are some cases
> where hash index will be better than hash-over-btree (tests done over
> integer columns).  I think any discussion on whether we should consider not
> to improve current hash indexes is only meaningful if some one has a  code
> which can prove both theoretically and practically that it is better than
> hash indexes for all usages.

I cannot speak for Andres, but you judged my intent here correctly. I
have no firm position on any of this just yet; I haven't even read the
patch. I just think that it is worth doing some simple analysis of a
hash-over-btree implementation, with simple prototyping and a simple
test-case. I would consider that a due-diligence thing, because,
honestly, it seems obvious to me that it should be at least checked
out informally.

I wasn't aware that there was already some analysis of this. Robert
did just acknowledge that it is *possible* that "the hash-over-btree
idea completely trounces hash indexes", so the general tone of this
thread suggested to me that there was little or no analysis of
hash-over-btree. I'm willing to believe that I'm wrong to be
dismissive of the hash AM in general, and I'm even willing to be
flexible on crediting the hash AM with being less optimized overall
(assuming we can see a way past that).

My only firm position is that it wouldn't be very hard to investigate
hash-over-btree to Andres' satisfaction, say, so, why not? I'm
surprised that this has caused consternation -- ISTM that Andres'
suggestion is *perfectly* reasonable. It doesn't appear to be an
objection to anything in particular.

-- 
Peter Geoghegan




Re: [HACKERS] Learning to hack Postgres - Keeping track of ctids

2016-09-30 Thread Emrul
Thanks Craig, I will look at that code.

The alternative I'm considering is whether I can define my own index type on
the column storing my array of primary key ids.  The index would, for each
primary key id just provide a reference to the ctids of the related records.

In theory, I think this might be less work.  However, I have yet to find a
simple/bare index implementation that I can build upon.







Re: [HACKERS] multivariate statistics (v19)

2016-09-30 Thread Heikki Linnakangas
This patch set is in pretty good shape; the only problem is that it's so
big that no-one seems to have the time or courage to do the final
touches and commit it. If we just focus on the functional dependencies
part for now, I think we might get somewhere. I peeked at the MCV and
histogram patches too, and I think they make total sense as well, and
are a natural extension of the functional dependencies patch. So if we
just focus on that for now, I don't think we will paint ourselves into a
corner.


(more below)

On 09/14/2016 01:01 AM, Tomas Vondra wrote:

On 09/12/2016 04:08 PM, Dean Rasheed wrote:

Regarding the statistics themselves, I read the description of soft
functional dependencies, and I'm somewhat skeptical about that
algorithm. I don't like the arbitrary thresholds or the sudden jump
from independence to dependence and clause reduction. As others have
said, I think this should account for a continuous spectrum of
dependence from fully independent to fully dependent, and combine
clause selectivities in a way based on the degree of dependence. For
example, if you computed an estimate for the fraction 'f' of the
table's rows for which a -> b, then it might be reasonable to combine
the selectivities using

  P(a,b) = P(a) * (f + (1-f) * P(b))
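
As a quick sanity check of that formula (a standalone sketch, not part of
the patch; the helper name is made up), the limiting cases come out
right: f = 0 reduces to the current independence assumption P(a) * P(b),
and f = 1 gives P(a):

    #include <stdio.h>

    /* combine two clause selectivities given an estimated degree of
     * functional dependence f in [0,1], per the formula above */
    static double
    combine_selectivities(double sel_a, double sel_b, double f)
    {
        return sel_a * (f + (1.0 - f) * sel_b);
    }

    int
    main(void)
    {
        /* example: P(a) = 0.1, P(b) = 0.5 */
        printf("f=0.0: %.3f\n", combine_selectivities(0.1, 0.5, 0.0));  /* 0.050 */
        printf("f=0.5: %.3f\n", combine_selectivities(0.1, 0.5, 0.5));  /* 0.075 */
        printf("f=1.0: %.3f\n", combine_selectivities(0.1, 0.5, 1.0));  /* 0.100 */
        return 0;
    }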



Yeah, I agree that the thresholds resulting in sudden changes between
"dependent" and "not dependent" are annoying. The question is whether it
makes sense to fix that, though - the functional dependencies were meant
as the simplest form of statistics, allowing us to get the rest of the
infrastructure in.

I'm OK with replacing the true/false dependencies with a degree of
dependency between 0 and 1, but I'm a bit afraid it'll result in
complaints that the first patch got too large / complicated.


+1 for using a floating degree between 0 and 1, rather than a boolean.


It also contradicts the idea of using functional dependencies as a
low-overhead type of statistics, filtering the list of clauses that need
to be estimated using more expensive types of statistics (MCV lists,
histograms, ...). Switching to a degree of dependency would prevent
removal of "unnecessary" clauses.


That sounds OK to me, although I'm not deeply familiar with this patch yet.


Of course, having just a single number that tells you the columns are
correlated, tells you nothing about whether the clauses on those
columns are consistent with that correlation. For example, in the
following table

CREATE TABLE t(a int, b int);
INSERT INTO t SELECT x/10, ((x/10)*789)%100 FROM generate_series(0,999) g(x);

'b' is functionally dependent on 'a' (and vice versa), but if you
query the rows with a<50 and with b<50, those clauses behave
essentially independently, because they're not consistent with the
functional dependence between 'a' and 'b', so the best way to combine
their selectivities is just to multiply them, as we currently do.

So whilst it may be interesting to determine that 'b' is functionally
dependent on 'a', it's not obvious whether that fact by itself should
be used in the selectivity estimates. Perhaps it should, on the
grounds that it's best to attempt to use all the available
information, but only if there are no more detailed statistics
available. In any case, knowing that there is a correlation can be
used as an indicator that it may be worthwhile to build more detailed
multivariate statistics like a MCV list or a histogram on those
columns.
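
(To put numbers on the example above: P(a<50) = 0.5 and P(b<50) = 0.5,
and of the 50 values of 'a' below 50, roughly half map to a 'b' below
50, so the true joint selectivity is about 0.25, almost exactly
P(a<50) * P(b<50), even though the functional dependence between the
columns is perfect.)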


Right. IIRC this is actually described in the README as "incompatible
conditions". While implementing it, I concluded that this is OK and it's
up to the developer to decide whether the queries are compatible with
the "assumption of compatibility". But maybe this reasoning is bogus
and makes (the current implementation of) functional dependencies
unusable in practice.


I think that's OK. It seems like a good assumption that the conditions 
are "compatible" with the functional dependency. For two reasons:


1) A query with compatible clauses is much more likely to occur in real
life. Why would you run a query with incompatible ZIP and city clauses?


2) If the conditions were in fact incompatible, the query is likely to 
return 0 rows, and will bail out very quickly, even if the estimates are 
way off and you choose a non-optimal plan. There are exceptions, of 
course: an index scan might be able to conclude that there are no rows 
much quicker than a seqscan, but as a general rule of thumb, a query 
that returns 0 rows isn't very sensitive to the chosen plan.


And of course, as long as we're not collecting these statistics 
automatically, if it doesn't work for your application, just don't 
collect them.



I fear that using "statistics" as the name of the new object might get a 
bit awkward. "statistics" is a plural, but we use it as the name of a 
single object, like "pants" or "scissors". Not sure I have any better 
ideas though. "estimator"? "statistics collection"? Or perhaps it should 
be singular, "statistic". I not

Re: [HACKERS] [GENERAL] pg_upgrade from 9.5 to 9.6 fails with "invalid argument"

2016-09-30 Thread Masahiko Sawada
On Fri, Sep 30, 2016 at 6:40 PM, Thomas Kellerer  wrote:
> Tom Lane schrieb am 29.09.2016 um 23:10:
>> Thomas Kellerer  writes:
>>> for some reason pg_upgrade failed on Windows 10 for me, with an error 
>>> message that one specific _vm file couldn't be copied.
>>
>> Hmm ... a _vm file would go through rewriteVisibilityMap(), which is new
>> code for 9.6 and hasn't really gotten that much testing.  Its error
>> reporting is shamefully bad --- you can't tell which step failed, and
>> I wouldn't even put a lot of faith in the errno being meaningful,
>> considering that it does close() calls before capturing the errno.
>>
>> But what gets my attention in this connection is that it doesn't
>> seem to be taking the trouble to open the files in binary mode.
>> Could that lead to the reported failure?  Not sure, but it seems
>> like at the least it could result in corrupted VM files.
>
> I did this on two different computers, one with Windows 10 the other with 
> Windows 7.
> (only test-databases, so no real issue anyway)
>
> In both cases running a "vacuum full" for the table in question fixed the 
> problem and pg_upgrade finished without problems.

Because VACUUM FULL removes the _vm file, pg_upgrade completed its job
successfully. If you still have the _vm file
("d:/Daten/db/pgdata95/base/16410/85358_vm") that led to the error, is it
possible for you to check whether there is a '\r\n' [0d 0a] sequence in
that _vm file, or to share that _vm file with us?
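
For context, the text/binary distinction matters only on Windows: a file
opened without PG_BINARY goes through the C runtime's text mode, which
translates between \n and \r\n and so corrupts binary page data. A
minimal sketch of the safe pattern, assuming the missing binary flag is
indeed the problem:

    #include <fcntl.h>

    /* PG_BINARY is PostgreSQL's portability macro: O_BINARY on Windows,
     * 0 elsewhere.  Omitting it is harmless on Unix, but on Windows it
     * lets the C runtime mangle binary files such as _vm forks. */
    static int
    open_vm_file(const char *path)
    {
        return open(path, O_RDONLY | PG_BINARY, 0);
    }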

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center




Re: [HACKERS] Small race in pg_xlogdump --follow

2016-09-30 Thread Magnus Hagander
On Mon, Sep 26, 2016 at 3:59 PM, Tom Lane  wrote:

> Magnus Hagander  writes:
> > Oh, right, at the very last loop. I've never seen it need more than 1
> loop
> > so I didn't manage to hit that codepath :) But yeah, saving errno and
> > restoring it on the other side of pg_usleep is probably a good idea.
>
> Right.  On a larger scale --- I'm too uncaffeinated to figure out whether
> this is the code path taken for the initially user-supplied file name.
> But it would be unfriendly if, when you fat-finger the command line
> arguments, it took 5 seconds for pg_xlogdump to tell you so.  Maybe
> the looping behavior should only happen on non-first file access.
>

It's a different codepath, so the retry does not happen in that case. It
even produces a different error message ("could not open file" vs "could
not find file").

I'll go commit this, including a previously missing source code comment
about why the loop is there.
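
For the archives, the retry pattern under discussion looks roughly like
this (an illustrative sketch, not the committed code; ten tries at
500 ms matches the ~5 seconds mentioned upthread):

    const char *fname = "000000010000000000000001";  /* next segment */
    int         fd = -1;
    int         save_errno = 0;
    int         tries;

    for (tries = 0; tries < 10; tries++)
    {
        fd = open(fname, O_RDONLY | PG_BINARY, 0);
        if (fd >= 0)
            break;
        save_errno = errno;
        pg_usleep(500 * 1000L);     /* 500 ms between attempts */
    }
    if (fd < 0)
        errno = save_errno;         /* report the open() failure, not
                                     * anything pg_usleep() changed */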

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: [HACKERS] Fix checkpoint skip logic on idle systems by tracking LSN progress

2016-09-30 Thread Michael Paquier
On Fri, Sep 30, 2016 at 2:51 PM, Michael Paquier
 wrote:
> On Fri, Sep 30, 2016 at 2:05 PM, Kyotaro HORIGUCHI
>  wrote:
>> At Fri, 30 Sep 2016 14:00:15 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI 
>>  wrote in 
>> <20160930.140015.150178454.horiguchi.kyot...@lab.ntt.co.jp>
>>> I see no problem in setting progressAt in XLogInsertRecord.
>>> But I doubt GetProgressRecPtr is harmful, especially when
>>
>> But I suspect that GetProgressRecPtr could be harmful.
>
> Well, you can maximize its effects by setting NUM_XLOGINSERT_LOCKS ==
> nproc and reducing checkpoint_timeout. That's what I did, but...

Note: I am moving this patch to next CF.
-- 
Michael




Re: [HACKERS] Patch: Implement failover on libpq connect level.

2016-09-30 Thread Victor Wagner
On Thu, 29 Sep 2016 23:45:52 +0530
Mithun Cy  wrote:

> This patch does not apply on the latest code. It fails as follows

It's strange. I have no problems applying it to commit
fd321a1dfd64d30

OK, some trailing whitespace and mixing of tabs and spaces, which git
apply doesn't like, really were present in the patch.

I've attached here a version with these issues resolved.

 
> I am slightly confused. As per this, target_server_type=master means the
> user is expecting failover. What if the user passes target_server_type=any
> and requests a sequential connection; isn't this also a case for
> failover?  I think by default it should be "any" for any number of
> hosts, and the parameters "sequential" and "random" will decide whether
> we want just failover or failover with load balancing.

I don't agree with this. The first versions of the patch refused to
connect to a read-only server even if it was the only one, because I
think a read-write connection is what the user typically expects.

When a user tries to connect to a cluster (specifying many hosts in the
connect string), it can be assumed by default that he wants the master.

But backward compatibility is the more important thing, so I now assume
that if the user tries to connect to just one node and this node is
read-only, the user knows what he is doing.
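
For illustration, client code using the proposed syntax would look
roughly like this (a sketch against the patched libpq; the hostorder
and target_server_type parameters are the ones documented below):

    #include <stdio.h>
    #include <libpq-fe.h>

    int
    main(void)
    {
        /* try node1, then node2:5433, then node3, stopping at the
         * first reachable read-write (master) server */
        PGconn *conn = PQconnectdb(
            "postgresql://node1,node2:5433,node3/mydb"
            "?target_server_type=master&hostorder=sequential");

        if (PQstatus(conn) != CONNECTION_OK)
        {
            fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
            PQfinish(conn);
            return 1;
        }
        PQfinish(conn);
        return 0;
    }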

 

diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index 4e34f00..f0df877 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -792,7 +792,7 @@ host=localhost port=5432 dbname=mydb connect_timeout=10

The general form for a connection URI is:
 
-postgresql://[user[:password]@][netloc][:port][/dbname][?param1=value1&...]
+postgresql://[user[:password]@][netloc][:port][,netloc[:port]...][/dbname][?param1=value1&...]
 


@@ -809,6 +809,7 @@ postgresql://localhost/mydb
 postgresql://user@localhost
 postgresql://user:secret@localhost
 postgresql://other@localhost/otherdb?connect_timeout=10&application_name=myapp
+postgresql://node1,node2:5433,node3:4432,node4/mydb?hostorder=random&target_server_type=any
 
 Components of the hierarchical part of the URI can also
 be given as parameters.  For example:
@@ -831,7 +832,9 @@ postgresql:///mydb?host=localhost&port=5433

 For improved compatibility with JDBC connection URIs,
 instances of parameter ssl=true are translated into
-sslmode=require.
+sslmode=require and
+loadBalanceHosts=true  into
+hostorder=random.



@@ -841,6 +844,10 @@ postgresql:///mydb?host=localhost&port=5433
 postgresql://[2001:db8::1234]/database
 

+   
+   There can be several host specifications, optionally accompanied
+   by a port, separated by commas.
+   


 The host component is interpreted as described for the parameter PostgreSQL was built). On machines without
 Unix-domain sockets, the default is to connect to localhost.

+   
+   There can be more than one host parameter in
+   the connect string. In this case these hosts are considered
+   alternate entries for the same database, and if connecting to the
+   first one fails, the library tries the second one, and so on. This
+   can be used for a high-availability cluster or for load balancing. See
+parameter.
+   
+   
+   A network host name can be accompanied by a port number, separated by
+   a colon. If so, this port number is used only when connecting to
+   this host. If there is no port number, the port specified in the
+parameter would be used.
+   
   
  

@@ -942,8 +963,44 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname


   
-
+
+  
+  hostorder
+  
+  
+  Specifies how to choose host from list of alternate hosts,
+  specified in the  parameter.
+  
+  
+  If the value of this argument is sequential (the
+  default), the library connects to the hosts in the order they are
+  specified, and tries to connect to the second one only if connection
+  to the first fails.
+  
+  
+  If the value is random, the host to connect to is randomly
+  picked from the list. This allows balancing the load between several
+  cluster nodes. However, PostgreSQL currently doesn't support
+  multimaster clusters, so without the use of third-party products
+  only read-only connections can take advantage of the
+  load balancing. See 
+  
+  
+  
+  
+  target_server_type
+  
+  
+  If this parameter is master, upon successful connection
+  the library checks whether the host is in recovery state and, if
+  it is, tries the next host in the connect string. If this parameter
+  is any, connections to standby nodes are considered
+  successful.
+  
+  
+  
   
+
port


@@ -985,7 +1042,6 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
   
   
  
-
  
   connect_timeout
   
@@ -996,7 +1052,27 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
   
   
  
-
+ 
+ failover_timeout
+ 

Re: [HACKERS] Hash Indexes

2016-09-30 Thread Amit Kapila
On 30-Sep-2016 6:24 AM, "Robert Haas"  wrote:
>
> On Thu, Sep 29, 2016 at 8:29 PM, Andres Freund  wrote:
> > On September 29, 2016 5:28:00 PM PDT, Robert Haas 
wrote:
> >>On Thu, Sep 29, 2016 at 8:16 PM, Andres Freund 
> >>wrote:
>  Well, I, for one, find it frustrating.  It seems pretty unhelpful to
>  bring this up only after the code has already been written.
> >>>
> >>> I brought this up in person at pgcon too.
> >>
> >>To whom?  In what context?
> >
> > Amit, over dinner.
>
> OK, well, I can't really comment on that, then, except to say that if
> you waited three months to follow up on the mailing list, you probably
> can't blame Amit if he thought that it was more of a casual suggestion
> than a serious objection.  Maybe it was?  I don't know.
>

Both of them have talked about hash indexes with me offline. Peter
mentioned that it would be better to improve btree rather than hash
indexes. IIRC, Andres asked me mainly about what use cases I have in mind
for hash indexes, and then we did have some further discussion on the same
thing, where he was not convinced that there is any big use case for hash
indexes, even though there may be some cases. In that discussion, as he is
saying and I don't doubt him, he would have told me the alternative, but it
was not apparent to me that he was expecting some sort of comparison.

What I got from both discussions was a friendly gesture that it might
be a better use of my time if I worked on some other problem.  I really
respect suggestions from both of them, but it was nowhere clear to me that
either of them was expecting any comparison with the other approach.

Considering that I have missed the real intention of their suggestions, I
think such a serious objection to any work should be discussed on-list.  To
answer the actual objection, I have already mentioned upthread that we can
deduce from the current tests done by Jesper and Mithun that there are some
cases where a hash index will be better than hash-over-btree (tests done
over integer columns).  I think any discussion of whether we should consider
not improving the current hash indexes is only meaningful if someone has
code that can prove, both theoretically and practically, that it is better
than hash indexes for all usages.

Note: excuse the formatting of this email, as I am traveling and using my
phone.

With Regards,
Amit Kapila.