Re: [HACKERS] [WIP] showing index maintenance on EXPLAIN

2014-05-08 Thread Jaime Casanova
On Wed, May 7, 2014 at 10:52 PM, Amit Kapila amit.kapil...@gmail.com wrote:
 On Thu, May 8, 2014 at 5:30 AM, Jaime Casanova ja...@2ndquadrant.com wrote:
 Hi,

 This patch implements $subject only when ANALYZE and VERBOSE are on.
 I made it that way because for years nobody seemed interested in this
 info (at least no one implemented it), so I decided that maybe it is
 too much information for most people (actually, btree indexes are
 normally very fast).


 Why capture this only for index inserts/updates and not for reads; is it
 because reads will always be fast, or because of implementation complexity?


EXPLAIN ANALYZE already shows that on any SELECT that uses an index in
some way. Or are you thinking of something else?

 Why not similar timings for heap?


Well, "actual time" shows us the total time of the operation, so
deducting the time spent on triggers, indexes, and planning seems like
a way to get the heap modification time.

Yes, maybe we still need some additional data. For example, I might
want to know how much time we spent extending a relation.

 Why can't we print it when only ANALYZE is used with EXPLAIN?  The
 execution time is printed with the ANALYZE option alone.


I'm not sure the info is useful for everyone, though I'm not opposed to
showing it all the time.

 Could you please tell in what kinds of scenarios you expect it
 to be useful?
 One I could think of is that if there are multiple indexes on a table and
 the user wants to find out if any particular index is consuming more time.


Exactly my use case. Consider this plan (we spent 78% of the time
updating the index uniq_idx_on_text):

                           QUERY PLAN
------------------------------------------------------------------
 Insert on public.t1 (actual time=0.540..0.540 rows=0 loops=1)
   ->  Result (actual time=0.046..0.049 rows=1 loops=1)
         Output: some long data here
 Index uniq_idx_on_text on t1: time=0.421 rows=1
 Index t1_pkey on t1: time=0.027 rows=1
 Total runtime: 0.643 ms
(6 rows)
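
With the patch applied, output like the above comes from a plain INSERT
explained with both options, along these lines (the column name and value
are only illustrative):

    EXPLAIN (ANALYZE, VERBOSE)
    INSERT INTO t1 (some_text_col) VALUES ('some long data here');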

So I want to answer questions like: how much is an index hurting write
performance? Once I know that, I can look for alternative solutions.
In that vein, it was interesting to see how fastupdate affects
performance in a GIN index using gin_trgm_ops (5 times slower with
fastupdate=off):

(fastupdate=on) Index idx_ggin on t1: time=0.418 rows=1
(fastupdate=off) Index idx_ggin on t1: time=2.205 rows=1
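
Roughly how such a comparison can be set up (idx_ggin and t1 as above; the
column name and inserted value here are made up):

    CREATE EXTENSION pg_trgm;
    CREATE INDEX idx_ggin ON t1 USING gin (some_text_col gin_trgm_ops)
        WITH (fastupdate = on);
    EXPLAIN (ANALYZE, VERBOSE)
    INSERT INTO t1 (some_text_col) VALUES ('some long data here');
    -- disable the pending list and repeat the timed INSERT
    ALTER INDEX idx_ggin SET (fastupdate = off);
    EXPLAIN (ANALYZE, VERBOSE)
    INSERT INTO t1 (some_text_col) VALUES ('some long data here');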

This is no different from showing trigger time info, which we already do.

-- 
Jaime Casanova www.2ndQuadrant.com
Professional PostgreSQL: 24x7 support and training
Phone: +593 4 5107566 Cell: +593 987171157




Re: [HACKERS] [COMMITTERS] pgsql: Clean up jsonb code.

2014-05-08 Thread Heikki Linnakangas

On 05/08/2014 02:25 AM, Peter Geoghegan wrote:

 findJsonbValueFromSuperHeader()'s lowbound argument
 previously served to establish a low bound when searching
 for multiple keys (so the second and subsequent
 user-supplied keys could skip much of the object).


Got that.


In the case of
jsonb_exists_any(), say, if you only have a reasonable expectation
that about 1 key exists, and that happens to be the last key that the
user passed to the text[] argument (to the existence/? operator), then
n - 1 calls to what is now findJsonbValueFromContainer() (which now
does not accept a lowbound) are wasted.


Check.


That's elem_count - 1
top-level binary searches of the entire jsonb. Or elem_count such
calls rather than 1 call (plus 1 sort of the supplied array) in the
common case where jsonb_exists_any() will return false.


Ok, but I don't see any big difference in that regard. It still called 
findJsonbValueFromContainer() elem_count times, before this commit. Each 
call was somewhat cheaper, because the lower bound of the binary search 
was initialized to where the previous search ended, but you still had to 
perform the search.
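
For context, jsonb_exists_any() is what backs the jsonb ?| text[] operator,
so elem_count here is the number of keys in the user-supplied array. A
made-up example:

    SELECT '{"a": 1, "xyz": 2}'::jsonb ?| ARRAY['foo', 'xyz'];
    -- true: at least one of the keys exists at the top level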



Granted, that might not be that bad right now, given that it's only
ever (say) elem_count or elem_count - 1 wasted binary searches through
the *top* level, but that might not always be true.


If we are ever to perform a deep search, I think we'll want to do much
more to optimize that than just keep track of the lower bound. Like,
start an iterator over the tree and check for all of the keys in one go.



And even today,
sorting a presumably much smaller user-passed lookup array once has to
be cheaper than searching through the entire jsonb perhaps elem_count
times per call.


Well, the quick testing I did suggested otherwise. It's not a big 
difference, but sorting has all kinds of overhead, like pallocing the 
array to sort, copying the elements around etc. For a small array, the 
startup cost of sorting trumps the savings in the binary searches. 
Possibly the way the sorting was done was not optimal, and you could 
reduce the copying and allocations involved in that, but it's hardly 
worth the trouble.


- Heikki




Re: [HACKERS] popen and pclose redefinitions causing many warning in Windows build

2014-05-08 Thread Heikki Linnakangas

On 05/08/2014 08:01 AM, Michael Paquier wrote:

Hi all,

Since commit a692ee5, code compilation on windows is full of warnings
caused by the re-definitions of popen and pclose:
In file included from ../../../src/include/c.h:1028:0,
  from ../../../src/include/postgres.h:47,
  from analyze.c:25:
../../../src/include/port.h:312:0: warning: popen redefined [enabled
by default]


Hmm. Does the MinGW version of popen() and system() do the quoting for 
you? If we just #ifdef the defines, then we will not use the wrappers on 
MinGW, which would be wrong if the quoting is needed there. If it's not 
needed, then we shouldn't be compiling the wrapper functions in the 
first place.


- Heikki




Re: [HACKERS] Ignore files in src/interfaces/libpq generated by windows builds

2014-05-08 Thread Heikki Linnakangas

On 05/08/2014 08:48 AM, Michael Paquier wrote:

Hi all,

 While doing some builds with mingw and 9.4, I noticed that a couple of
 files generated by the build are not ignored in the source tree. Those two
 files are system.c and win32setlocale.c in src/interfaces/libpq.
 Please find attached a patch fixing that.
 Note that this is recent and is partially caused by commit a692ee5.


Applied, thanks. win32setlocale.c dates back to 9.1, so back-patched that.

- Heikki




Re: [HACKERS] [v9.5] Custom Plan API

2014-05-08 Thread Simon Riggs
On 8 May 2014 04:33, Kouhei Kaigai kai...@ak.jp.nec.com wrote:

 From your description, my understanding is that you would like to stream
 data from 2 standard tables to the GPU, then perform a join on the GPU 
 itself.

 I have been told that is not likely to be useful because of the data transfer
 overheads.

 Here are two solutions. One is what I'm currently working on: when the number
 of rows in the left and right tables is not well balanced, we can keep a hash
 table in the GPU DRAM, then transfer the data stream chunk by chunk from
 the other side. Kernel execution and data transfer can run asynchronously,
 so this allows us to hide the data transfer cost as long as we have enough
 chunks, like processor pipelining.

Makes sense to me, thanks for explaining.

The hardware-enhanced hash join sounds like a great idea.

My understanding is we would need

* a custom cost-model
* a custom execution node

The main question seems to be whether doing that would be allowable,
because it's certainly doable.

I'm still looking for a way to avoid adding planning time for all
queries though.

 The other solution is an integrated GPU that kills the necessity of data
 transfer, like Intel's Haswell, AMD's Kaveri or Nvidia's Tegra K1; all the
 major vendors are moving in the same direction.

Sounds useful, but very non-specific, as yet.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] [v9.5] Custom Plan API

2014-05-08 Thread Simon Riggs
On 8 May 2014 03:36, Stephen Frost sfr...@snowman.net wrote:

 Given that I count 4-5 beneficial use cases for this index-like
 lookaside, it seems worth investing time in.

 I'm all for making use of MatViews and GPUs, but there's more than one
 way to get there and look-asides feels like pushing the decision,
 unnecessarily, on to the user.

I'm not sure I understand where most of your comments come from, so
it's clear we're not talking about the same things yet.


We have multiple use cases where an alternate data structure could be
used to speed up queries.

My goal is to use the alternate data structure(s)

1) if the data structure contains matching data for the current query
2) only when the user has explicitly stated it would be correct to do
so, and they wish it
3) transparently to the application, rather than forcing them to recode
4) after fully considering cost-based optimization, which we can only
do if it is transparent

all of which is how mat views work in other DBMSs. My additional requirement is:

5) allow this to work with data structures outside the normal
heap/index/block structures, since we have multiple already working
examples of such things and many users wish to leverage those in their
applications

which I now understand is different from the main thrust of Kaigai's
proposal, so I will restate this later on another thread.


The requirement is similar to the idea of running

CREATE MATERIALIZED VIEW foo
   BUILD DEFERRED
   REFRESH COMPLETE
   ON DEMAND
   ENABLE QUERY REWRITE
   ON PREBUILT TABLE

but expands on that to encompass any external data structure.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] [bug fix] pg_ctl always uses the same event source

2014-05-08 Thread MauMau

From: Amit Kapila amit.kapil...@gmail.com
Only one minor suggestion:
+Name of the event source for pg_ctl to use for event log.  The
+default is PostgreSQL.


From this description, it is not clear when the event log will be used
in pg_ctl.  For example, if a user uses the -e option with register, then he
might expect any failure during the register command to be logged into the
event log, but that is not true.  So I think we should improve the docs
to reflect the usage of -e.

On Linux, I am getting below build failure for pg_ctl

pg_ctl.c: In function ‘main’:
pg_ctl.c:2173: error: ‘event_source’ undeclared (first use in this 
function)

pg_ctl.c:2173: error: (Each undeclared identifier is reported only once
pg_ctl.c:2173: error: for each function it appears in.)
make[3]: *** [pg_ctl.o] Error 1
make[2]: *** [all-pg_ctl-recurse] Error 2
make[1]: *** [all-bin-recurse] Error 2
make: *** [all-src-recurse] Error 2


Thank you for reviewing and testing.  I changed the doc, but I'd appreciate 
it if you could improve my poor English and update the CommitFest if it can 
be better.


I rebased the patch to HEAD and removed the compilation error on Linux.  I
made event_source a variable on all platforms, like register_servicename,
although it is not necessary on non-Windows platforms.


Regards
MauMau


pg_ctl_eventsrc_v9.patch
Description: Binary data



Re: [HACKERS] PGDLLEXPORTing all GUCs?

2014-05-08 Thread Robert Haas
On Thu, May 8, 2014 at 12:19 AM, Craig Ringer cr...@2ndquadrant.com wrote:
 In terms of ugliness, would you be happier using a pg_extern instead of
 extern, where we:

 #ifdef WIN32
 #define pg_extern extern PGDLLIMPORT
 #else
 #define pg_extern extern
 #endif

 ?

I personally would not be happier with that.  extern can be applied to
function prototypes, not just variables, and even to function
definitions.  When I see PGDLLIMPORT, I think, oh look, this is some
kind of magic pixie dust that Windows requires us to sprinkle on our
variables.  When I see pg_extern, I think, oh, this is the PostgreSQL
version of extern, and it's not, really.

But I can live with it if it makes other people sufficiently happier.
One way or another, I really do feel very strongly that we should push
forward with broader PGDLLIMPORT-ification.  My experience mirrors
Craig's: this is a very useful thing for extension developers.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] pg_shmem_allocations view

2014-05-08 Thread Robert Haas
On Wed, May 7, 2014 at 5:54 PM, Andres Freund and...@2ndquadrant.com wrote:
 On 2014-05-07 17:48:15 -0400, Robert Haas wrote:
 On Tue, May 6, 2014 at 6:09 PM, Andres Freund and...@2ndquadrant.com wrote:
  I guess I'd vote for
  ditching the allocated column completely and outputting the memory
  allocated without ShmemIndex using some fixed tag (like ShmemIndex
  or Bootstrap or Overhead or something).
 
  My way feels slightly cleaner, but I'd be ok with that as well. There's
  no possible conflict with an actual segment... In your variant the
  unallocated/slop memory would continue to have a NULL key?

 Yeah, that seems all right.

 Hm. Not sure what you're ACKing here ;).

The idea of giving the unallocated memory a NULL key.
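
So with the proposed view, a query along these lines (the key column name
follows the discussion above; everything else is up to the patch) would show
that memory explicitly:

    SELECT * FROM pg_shmem_allocations WHERE key IS NULL;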

 One way to avoid conflict with an actual segment would be to add an
 after-the-fact entry into ShmemIndex representing the amount of memory
 that was used to bootstrap it.

 There's lots of allocations from shmem that cannot be associated with
 any index entry, though. Not just ShmemIndex's own entry. Most
 prominently, most of the memory used for SharedBufHash isn't actually
 associated with the Shared Buffer Lookup Table entry - IMO a
 dynahash.c deficiency.

Hmm, I don't know what to do about that.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] PGDLLEXPORTing all GUCs?

2014-05-08 Thread Andres Freund
On 2014-05-08 07:56:46 -0400, Robert Haas wrote:
 On Thu, May 8, 2014 at 12:19 AM, Craig Ringer cr...@2ndquadrant.com wrote:
  In terms of ugliness, would you be happier using a pg_extern instead of
  extern, where we:
 
  #ifdef WIN32
  #define pg_extern extern PGDLLIMPORT
  #else
  #define pg_extern extern
  #endif
 
  ?
 
 I personally would not be happier with that.  extern can be applied to
 function prototypes, not just variables, and even to function
 definitions.  When I see PGDLLIMPORT, I think, oh look, this is some
 kind of magic pixie dust that Windows requires us to sprinkle on our
 variables.  When I see pg_extern, I think, oh, this is the PostgreSQL
 version of extern, and it's not, really.

Well, it's actually helpful in some sense for functions too (one less
trampoline on Windows). And given how postgres uses externs, I think it
matches well with just PGDLLIMPORTing everything.

 But I can live with it if it makes other people sufficiently happier.
 One way or another, I really do feel very strongly that we should push
 forward with broader PGDLLIMPORT-ification.  My experience mirrors
 Craig's: this is a very useful thing for extension developers.

Yeah. I have been yelled at by Jenkins far too many times. Exporting all
variables seems like the only way to significantly reduce the need for
!windows developers to care about Windows.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] [v9.5] Custom Plan API

2014-05-08 Thread Robert Haas
On Wed, May 7, 2014 at 4:01 AM, Simon Riggs si...@2ndquadrant.com wrote:
 Agreed. My proposal is that if the planner allows the lookaside to an
 FDW then we pass the query for full execution on the FDW. That means
 that the scan, aggregate and join could take place via the FDW. i.e.
 Custom Plan == lookaside + FDW

 Or put another way, if we add Lookaside then we can just plug in the
 pgstrom FDW directly and we're done. And everybody else's FDW will
 work as well, so Citus etc. will not need to recode.

As Stephen notes downthread, Tom has already expressed opposition to
this idea on other threads, and I tend to agree with him, at least to
some degree.  I think the drive to use foreign data wrappers for
PGStrom, CitusDB, and other things that aren't really foreign data
wrappers as originally conceived is a result of the fact that we've
got only one interface in this area that looks remotely like something
pluggable; and so everyone's trying to fit things into the constraints
of that interface whether it's actually a good fit or not.
Unfortunately, I think what CitusDB really wants is pluggable storage,
and what PGStrom really wants is custom paths, and I don't think
either of those things is the same as what FDWs provide.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] [v9.5] Custom Plan API

2014-05-08 Thread Stephen Frost
* Simon Riggs (si...@2ndquadrant.com) wrote:
 On 8 May 2014 03:36, Stephen Frost sfr...@snowman.net wrote:
  I'm all for making use of MatViews and GPUs, but there's more than one
  way to get there and look-asides feels like pushing the decision,
  unnecessarily, on to the user.
 
 I'm not sure I understand where most of your comments come from, so
 its clear we're not talking about the same things yet.
 
 We have multiple use cases where an alternate data structure could be
 used to speed up queries.

I don't view on-GPU memory as being an alternate *permanent* data store.
Perhaps that's the disconnect that we have here, as it was my
understanding that we're talking about using GPUs to make queries run
faster where the data comes from regular tables.

 My goal is to use the alternate data structure(s)

Pluggable storage is certainly interesting, but I view that as
independent of the CustomPlan-related work.

 which I now understand is different from the main thrust of Kaigai's
 proposal, so I will restate this later on another thread.

Sounds good.

Thanks,

Stephen




Re: [HACKERS] [v9.5] Custom Plan API

2014-05-08 Thread Stephen Frost
* Robert Haas (robertmh...@gmail.com) wrote:
 As Stephen notes downthread, Tom has already expressed opposition to
 this idea on other threads, and I tend to agree with him, at least to
 some degree.  I think the drive to use foreign data wrappers for
 PGStrom, CitusDB, and other things that aren't really foreign data
 wrappers as originally conceived is a result of the fact that we've
 got only one interface in this area that looks remotely like something
 pluggable; and so everyone's trying to fit things into the constraints
 of that interface whether it's actually a good fit or not.

Agreed.

 Unfortunately, I think what CitusDB really wants is pluggable storage,
 and what PGStrom really wants is custom paths, and I don't think
 either of those things is the same as what FDWs provide.

I'm not entirely sure that PGStrom even really wants custom paths.  I
believe the goal there is to be able to use GPUs to do work for us, and
custom paths/pluggable plan/execution are seen as the way to do that
without depending on libraries which are under the GPL, LGPL or other
licenses which we'd object to depending on from core.

Personally, I'd love to just see CUDA or whatever support in core as a
configure option and be able to detect at start-up when the right
libraries and hardware are available and enable the join types which
could make use of that gear.

I don't like that we're doing all of this because of licenses or
whatever, and I would still hope to figure out a way to address those
issues, but I haven't had time to go research it myself, and evidently
KaiGai and others see the issues there as insurmountable, so they're
trying to work around it by creating a pluggable interface where an
extension could provide those join types.

Thanks,

Stephen




Re: [HACKERS] [v9.5] Custom Plan API

2014-05-08 Thread Simon Riggs
On 8 May 2014 13:48, Stephen Frost sfr...@snowman.net wrote:

 We have multiple use cases where an alternate data structure could be
 used to speed up queries.

 I don't view on-GPU memory as being an alternate *permanent* data store.

As I've said, others have expressed an interest in placing specific
data on specific external resources that we would like to use to speed
up queries. That might be termed a cache of various kinds, or it
might simply be an allocation of that resource to a specific
purpose.

If we forget GPUs, that leaves multiple use cases that do fit the description.

 Perhaps that's the disconnect that we have here, as it was my
 understanding that we're talking about using GPUs to make queries run
 faster where the data comes from regular tables.

I'm trying to consider a group of use cases, so we get a generic API
that is useful to many people, not just to one use case. I had
understood the argument to be that there must be multiple potential
users of an API before we allow it.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] postgresql.auto.conf read from wrong directory

2014-05-08 Thread Fujii Masao
On Wed, May 7, 2014 at 4:57 PM, Amit Kapila amit.kapil...@gmail.com wrote:
 On Tue, May 6, 2014 at 7:04 PM, Christoph Berg c...@df7cb.de wrote:
 Hi,

 if you split configuration and data by setting data_directory,
 postgresql.auto.conf is written to the data directory
 (/var/lib/postgresql/9.4/main in Debian), but the server tries to read it
 from the etc directory (/etc/postgresql/9.4/main).

 One place to fix it would be in ProcessConfigFile in
 src/backend/utils/misc/guc-file.l:162 by always setting
 CallingFileName = NULL in src/backend/utils/misc/guc-file.l:162, but
 that breaks later in AbsoluteConfigLocation() when data_directory is
 NULL. (As the comment in ProcessConfigFile says.)

 This problem occurs because we don't have the value of data_directory
 set in postgresql.conf by the time we want to parse the .auto.conf file
 during server start.  The value of data_directory is only available after
 processing of the config files.  To fix this, we need to store the value of
 data_directory during the parse of postgresql.conf so that we can use it
 until data_directory is actually set.  The attached patch fixes the problem.
 Could you please confirm whether it fixes the problem in your
 env./scenario?

Maybe this is nitpicking, but what happens when postgresql.auto.conf also
includes a setting of data_directory? This is possible because we can
set data_directory via ALTER SYSTEM now. Should we just ignore such a
problematic setting in postgresql.auto.conf with a warning message?
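
For example (path made up), running

    ALTER SYSTEM SET data_directory = '/mnt/pgdata';

would put a data_directory line into postgresql.auto.conf, which itself
lives in the data directory the server is already running from.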

Regards,

-- 
Fujii Masao




Re: [HACKERS] [v9.5] Custom Plan API

2014-05-08 Thread Simon Riggs
On 8 May 2014 04:33, Kouhei Kaigai kai...@ak.jp.nec.com wrote:

 In the case where it replaces join relations with a ForeignScan, it will be
 almost the same as the expected ForeignScan with join push-down. Unlike a
 usual table scan, it does not have an actual relation definition in the
 catalog, and its result tuple slot is determined on the fly.
 One difference from a remote join is that this ForeignScan node may have
 local sub-plans, if the FDW driver (e.g. GPU execution) has a capability for
 the join only, but not for the relation scan portion.
 So, unlike its naming, I want ForeignScan to be able to have sub-plans if the
 FDW driver supports that capability.

From here, it looks exactly like pushing a join into an FDW. If we had
that, we wouldn't need Custom Scan at all.

I may be mistaken and there is a critical difference. Local sub-plans
don't sound like a big difference.


Have we considered having an Optimizer and Executor plugin that does
this without touching core at all?

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] pg_shmem_allocations view

2014-05-08 Thread Andres Freund
On 2014-05-08 07:58:34 -0400, Robert Haas wrote:
 On Wed, May 7, 2014 at 5:54 PM, Andres Freund and...@2ndquadrant.com wrote:
  Hm. Not sure what you're ACKing here ;).
 
 The idea of giving the unallocated memory a NULL key.

OK. A new version of the patches implementing that is attached, including
a couple of small fixups and docs. The latter aren't extensive, but that
doesn't seem to be warranted anyway.

  There's lots of allocations from shmem that cannot be associated with
  any index entry, though. Not just ShmemIndex's own entry. Most
  prominently, most of the memory used for SharedBufHash isn't actually
  associated with the Shared Buffer Lookup Table entry - IMO a
  dynahash.c deficiency.

 Hmm, I don't know what to do about that.

Well, we have to live with it for now :)

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
From c219c03a173fef962c1caba9f016d5d87448fd8f Mon Sep 17 00:00:00 2001
From: Andres Freund and...@anarazel.de
Date: Tue, 6 May 2014 19:42:36 +0200
Subject: [PATCH 1/4] Associate names to created dynamic shared memory
 segments.

At some later point we want to add a view show all allocated dynamic
shared memory segments so admins can understand resource usage. To
avoid breaking the API in 9.5 add the necessary name now.
---
 contrib/test_shm_mq/setup.c   |  2 +-
 src/backend/storage/ipc/dsm.c | 60 ++-
 src/include/storage/dsm.h |  2 +-
 3 files changed, 39 insertions(+), 25 deletions(-)

diff --git a/contrib/test_shm_mq/setup.c b/contrib/test_shm_mq/setup.c
index 572cf88..897c47b 100644
--- a/contrib/test_shm_mq/setup.c
+++ b/contrib/test_shm_mq/setup.c
@@ -125,7 +125,7 @@ setup_dynamic_shared_memory(int64 queue_size, int nworkers,
 	segsize = shm_toc_estimate(&e);
 
 	/* Create the shared memory segment and establish a table of contents. */
-	seg = dsm_create(shm_toc_estimate(&e));
+	seg = dsm_create("test_shm_mq", shm_toc_estimate(&e));
 	toc = shm_toc_create(PG_TEST_SHM_MQ_MAGIC, dsm_segment_address(seg),
 		 segsize);
 
diff --git a/src/backend/storage/ipc/dsm.c b/src/backend/storage/ipc/dsm.c
index a5c0084..c8fdf6e 100644
--- a/src/backend/storage/ipc/dsm.c
+++ b/src/backend/storage/ipc/dsm.c
@@ -80,8 +80,10 @@ struct dsm_segment
 /* Shared-memory state for a dynamic shared memory segment. */
 typedef struct dsm_control_item
 {
-	dsm_handle	handle;
+	dsm_handle	handle;			/* segment identifier */
 	uint32		refcnt;			/* 2+ = active, 1 = moribund, 0 = gone */
+	Size		size;			/* current size */
+	char		name[SHMEM_INDEX_KEYSIZE]; /* informational name */
 } dsm_control_item;
 
 /* Layout of the dynamic shared memory control segment. */
@@ -454,14 +456,16 @@ dsm_set_control_handle(dsm_handle h)
  * Create a new dynamic shared memory segment.
  */
 dsm_segment *
-dsm_create(Size size)
+dsm_create(const char *name, Size size)
 {
 	dsm_segment *seg = dsm_create_descriptor();
-	uint32		i;
-	uint32		nitems;
+	dsm_control_item *item;
+	uint32		slot;
 
 	/* Unsafe in postmaster (and pointless in a stand-alone backend). */
 	Assert(IsUnderPostmaster);
+	Assert(name != NULL && strlen(name) > 0 &&
+		   strlen(name) <= SHMEM_INDEX_KEYSIZE - 1);
 
 	if (!dsm_init_done)
 		dsm_backend_startup();
@@ -479,33 +483,39 @@ dsm_create(Size size)
 	/* Lock the control segment so we can register the new segment. */
 	LWLockAcquire(DynamicSharedMemoryControlLock, LW_EXCLUSIVE);
 
-	/* Search the control segment for an unused slot. */
-	nitems = dsm_control->nitems;
-	for (i = 0; i < nitems; ++i)
+	/*
+	 * Search the control segment for an unused slot that's previously been
+	 * used. If we don't find one initialize a new one if there's still space.
+	 */
+	for (slot = 0; slot < dsm_control->nitems; ++slot)
 	{
-		if (dsm_control->item[i].refcnt == 0)
-		{
-			dsm_control->item[i].handle = seg->handle;
-			/* refcnt of 1 triggers destruction, so start at 2 */
-			dsm_control->item[i].refcnt = 2;
-			seg->control_slot = i;
-			LWLockRelease(DynamicSharedMemoryControlLock);
-			return seg;
-		}
+		if (dsm_control->item[slot].refcnt == 0)
+			break;
 	}
 
-	/* Verify that we can support an additional mapping. */
-	if (nitems >= dsm_control->maxitems)
+	/* Verify that we can support the mapping. */
+	if (slot >= dsm_control->maxitems)
 		ereport(ERROR,
 				(errcode(ERRCODE_INSUFFICIENT_RESOURCES),
 				 errmsg("too many dynamic shared memory segments")));
 
-	/* Enter the handle into a new array slot. */
-	dsm_control->item[nitems].handle = seg->handle;
+	item = &dsm_control->item[slot];
+	item->handle = seg->handle;
 	/* refcnt of 1 triggers destruction, so start at 2 */
-	dsm_control->item[nitems].refcnt = 2;
-	seg->control_slot = nitems;
-	dsm_control->nitems++;
+	item->refcnt = 2;
+	item->size = size;
+	strncpy(item->name, name, SHMEM_INDEX_KEYSIZE);
+	item->name[SHMEM_INDEX_KEYSIZE - 1] = 0;
+
+	seg->control_slot = slot;
+
+	/*
+	 * Increase number of initialized slots if we didn't reuse a

Re: [HACKERS] [v9.5] Custom Plan API

2014-05-08 Thread Stephen Frost
* Simon Riggs (si...@2ndquadrant.com) wrote:
 On 8 May 2014 13:48, Stephen Frost sfr...@snowman.net wrote:
  I don't view on-GPU memory as being an alternate *permanent* data store.
 
 As I've said, others have expressed an interest in placing specific
 data on specific external resources that we would like to use to speed
 up queries. That might be termed a cache of various kinds or it
 might be simply be an allocation of that resource to a specific
 purpose.

I don't think some generalized structure that addresses the goals of
FDWs, CustomPaths, MatViews and query caching is going to be workable
and I'm definitely against having to specify at a per-relation level
when I want certain join types to be considered.

  Perhaps that's the disconnect that we have here, as it was my
  understanding that we're talking about using GPUs to make queries run
  faster where the data comes from regular tables.
 
 I'm trying to consider a group of use cases, so we get a generic API
 that is useful to many people, not just to one use case. I had
 understood the argument to be there must be multiple potential users
 of an API before we allow it.

The API you've outlined requires users to specify on a per-relation
basis what join types are valid.  As for CustomPlans, there's
certainly potential for many use-cases there beyond just GPUs.  What I'm
unsure about is if any others would actually need to be implemented
externally as the GPU-related work seems to need or if we would just
implement those other join types in core.

Thanks,

Stephen




Re: [HACKERS] [v9.5] Custom Plan API

2014-05-08 Thread Kouhei Kaigai
 On Wed, May 7, 2014 at 4:01 AM, Simon Riggs si...@2ndquadrant.com wrote:
  Agreed. My proposal is that if the planner allows the lookaside to an
  FDW then we pass the query for full execution on the FDW. That means
  that the scan, aggregate and join could take place via the FDW. i.e.
  Custom Plan == lookaside + FDW
 
  Or put another way, if we add Lookaside then we can just plug in the
  pgstrom FDW directly and we're done. And everybody else's FDW will
  work as well, so Citus etcc will not need to recode.
 
 As Stephen notes downthread, Tom has already expressed opposition to this
 idea on other threads, and I tend to agree with him, at least to some degree.
 I think the drive to use foreign data wrappers for PGStrom, CitusDB, and
 other things that aren't really foreign data wrappers as originally
 conceived is a result of the fact that we've got only one interface in this
 area that looks remotely like something pluggable; and so everyone's trying
 to fit things into the constraints of that interface whether it's actually
 a good fit or not.
 Unfortunately, I think what CitusDB really wants is pluggable storage, and
 what PGStrom really wants is custom paths, and I don't think either of those
 things is the same as what FDWs provide.
 
Yes, what PGStrom really needs is custom paths; that allows an extension to
replace a part of the built-in nodes according to the extension's characteristics.
The discussion upthread clarified that the FDW interface needs to be enhanced to
support the functionality that PGStrom wants to provide; however, some of that
also needs a redefinition of FDWs, indeed.

Umm... I'm now missing the direction towards my goal.
What approach is the best way to glue PostgreSQL and PGStrom?

Thanks,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei kai...@ak.jp.nec.com




Re: [HACKERS] [v9.5] Custom Plan API

2014-05-08 Thread Stephen Frost
* Simon Riggs (si...@2ndquadrant.com) wrote:
 From here, it looks exactly like pushing a join into an FDW. If we had
 that, we wouldn't need Custom Scan at all.
 
 I may be mistaken and there is a critical difference. Local sub-plans
 doesn't sound like a big difference.

Erm.  I'm not sure that you're really thinking through what you're
suggesting.

Allow me to re-state your suggestion here:

An FDW is loaded which provides hook for join push-down (whatever those
end up being).

A query is run which joins *local* table A to *local* table B.  Standard
heaps, standard indexes, all local to this PG instance.

The FDW which supports join push-down is then passed this join for
planning, with local sub-plans for the local tables.

 Have we considered having an Optimizer and Executor plugin that does
 this without touching core at all?

Uh, isn't that what we're talking about?  The issue is that there's a
bunch of internal functions that such a plugin would need to either have
access to or re-implement, but we'd rather not expose those internal
functions to the whole world because they're, uh, internal helper
routines, essentially, which could disappear in another release.

The point is that there isn't a good API for this today and what's being
proposed isn't a good API, it's just bolted-on to the existing system by
exposing what are rightfully internal routines.

Thanks,

Stephen




Re: [HACKERS] [v9.5] Custom Plan API

2014-05-08 Thread Simon Riggs
On 8 May 2014 14:32, Stephen Frost sfr...@snowman.net wrote:

 The API you've outlined requires users to specify on a per-relation
 basis what join types are valid.

No, it doesn't. I've not said or implied that at any point.

If you keep telling me what I mean, rather than asking, we won't get anywhere.

I think that's as far as we'll get on email.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] default opclass for jsonb (was Re: Call for GIST/GIN/SP-GIST opclass documentation)

2014-05-08 Thread Bruce Momjian
On Tue, May  6, 2014 at 06:20:53PM -0400, Tom Lane wrote:
 David E. Wheeler da...@justatheory.com writes:
  On May 6, 2014, at 2:20 PM, Bruce Momjian br...@momjian.us wrote:
  Well, then, we only have a few days to come up with a name.
 
  What are the options?
 
 We have no proposals as yet.
 
 I've been looking at the source code to try to understand the difference
 between the two opclasses (and BTW I concur with the opinions expressed
 recently about the poor state of the internal documentation for jsonb).
 If I've got it straight:
 
 jsonb_ops indexes keys and values separately, so for instance {"xyz": 2}
 would give rise to GIN entries that are effectively the strings "Kxyz"
 and "V2".  If you're looking for tuples containing {"xyz": 2} then you
 would be looking for the AND of those independent index entries, which
 fortunately GIN is pretty good at computing.  But you could also look
 for just keys or just values.

 jsonb_hash_ops creates an index entry only for values, but what it
 stores is a hash of both the value and the key it's stored under.
 So in this example you'd get a hash combining "xyz" and "2".  This
 means the only type of query you can perform is like "find JSON tuples
 containing {"xyz": 2}".

Good summary, thanks.  This is the information I was hoping we had in
our docs.  How does hstore deal with these issues?
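
For the docs, a short usage sketch might help; assuming a hypothetical
table docs(j jsonb):

    CREATE INDEX docs_j_idx  ON docs USING gin (j);                -- default jsonb_ops
    CREATE INDEX docs_j_hidx ON docs USING gin (j jsonb_hash_ops);

    -- both opclasses can answer containment queries:
    SELECT * FROM docs WHERE j @> '{"xyz": 2}';
    -- only jsonb_ops can also answer key-existence queries from the index:
    SELECT * FROM docs WHERE j ? 'xyz';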

 Because jsonb_ops stores the *whole* value, you can do lossless index
 searches (no recheck needed on the heap tuple), but you also run the
 risk of long strings failing to fit into an index entry.  Since
 jsonb_hash_ops reduces everything to a hash, there's no possibility of
 index failure, but all queries are lossy and require recheck.
 
 TBH, at this point I'm sort of agreeing with the thought expressed
 upthread that maybe neither of these should be the default as-is.
 They seem like rather arbitrary combinations of choices.  In particular
 I wonder why there's not an option to store keys and values separately,
 but as hashes not as the original strings, so that indexability of
 everything could be guaranteed.  Or a variant of that might be to hash
 only strings that are too large to fit in an index entry, and force
 recheck only when searching for a string that needed hashing.
 
 I wonder whether the most effective use of time at this point
 wouldn't be to fix jsonb_ops to do that, rather than arguing about
 what to rename it to.  If it didn't have the failure-for-long-strings
 problem I doubt anybody would be unhappy about making it the default.

Can we hash just the values, not the keys, in jsonb_ops, and hash the
combo in jsonb_hash_ops?  That would give us key-only lookups without a
recheck.

How do we index long strings now?  Is it the combination of GIN and long
strings that is the problem?

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +




Re: [HACKERS] [v9.5] Custom Plan API

2014-05-08 Thread Stephen Frost
Simon,

Perhaps you've changed your proposal wrt LOOKASIDEs and I've missed it
somewhere in the thread, but this is what I was referring to with my
concerns regarding per-relation definition of LOOKASIDEs:

* Simon Riggs (si...@2ndquadrant.com) wrote:
 Roughly, I'm thinking of this...
 
 CREATE LOOKASIDE ON foo
TO foo_mat_view;
 
 and also this...
 
 CREATE LOOKASIDE ON foo
TO foo_as_a_foreign_table   /* e.g. PGStrom */

where I took 'foo' to mean 'a relation'.

Your downthread comments on 'CREATE MATERIALIZED VIEW' are in the same
vein, though there I agree that we need it per-relation as there are
other trade-offs to consider (storage costs of the matview, cost to
maintain the matview, etc, similar to indexes).

The PGStrom proposal, aiui, is to add a new join type which supports
using a GPU to answer a query where all the data is in regular PG
tables.  I'd like that to just work when a GPU is available (perhaps
modulo having to install some extension), for any join which is costed
to be cheaper/faster when done that way.

Thanks,

Stephen




Re: [HACKERS] New pg_lsn type doesn't have hash/btree opclasses

2014-05-08 Thread Andres Freund
Hi,

On 2014-05-06 23:55:04 -0300, Fabrízio de Royes Mello wrote:
 If helps, I added some regression tests to the lastest patch.

I think we should apply this patch now. It's much more sensible with the
opclasses present and we don't win anything by waiting for 9.5.
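
For context, the btree/hash opclasses are what let pg_lsn values be sorted,
grouped and indexed in the usual way; a made-up example that relies on them:

    CREATE TABLE lsn_log (l pg_lsn);
    CREATE INDEX ON lsn_log (l);                    -- btree
    SELECT l, count(*) FROM lsn_log GROUP BY l ORDER BY l;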

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] [v9.5] Custom Plan API

2014-05-08 Thread Simon Riggs
On 8 May 2014 14:40, Stephen Frost sfr...@snowman.net wrote:

 Allow me to re-state your suggestion here:

 An FDW is loaded which provides hook for join push-down (whatever those
 end up being).

 A query is run which joins *local* table A to *local* table B.  Standard
 heaps, standard indexes, all local to this PG instance.

 The FDW which supports join push-down is then passed this join for
 planning, with local sub-plans for the local tables.

Yes that is correct; thank you for confirming your understanding with me.

That also supports a custom join of a local to a non-local table, or a
custom join of two non-local tables.

If we can use interfaces that already exist with efficiency, why
invent a new one?


 Have we considered having an Optimizer and Executor plugin that does
 this without touching core at all?

 Uh, isn't that what we're talking about?

No. I meant writing this as an extension rather than a patch on core.

 The issue is that there's a
 bunch of internal functions that such a plugin would need to either have
 access to or re-implement, but we'd rather not expose those internal
 functions to the whole world because they're, uh, internal helper
 routines, essentially, which could disappear in another release.

 The point is that there isn't a good API for this today and what's being
 proposed isn't a good API, it's just bolted-on to the existing system by
 exposing what are rightfully internal routines.

I think the main point is that people don't want to ask for our
permission before they do what they want to do.

We either help people use Postgres, or they go elsewhere.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] [v9.5] Custom Plan API

2014-05-08 Thread Simon Riggs
On 8 May 2014 14:49, Stephen Frost sfr...@snowman.net wrote:

 Your downthread comments on 'CREATE MATERIALIZED VIEW' are in the same
 vein, though there I agree that we need it per-relation as there are
 other trade-offs to consider (storage costs of the matview, cost to
 maintain the matview, etc, similar to indexes).

 The PGStrom proposal, aiui, is to add a new join type which supports
 using a GPU to answer a query where all the data is in regular PG
 tables.  I'd like that to just work when a GPU is available (perhaps
 modulo having to install some extension), for any join which is costed
 to be cheaper/faster when done that way.

All correct and agreed. As I explained earlier, let's cover the join
requirement here and we can discuss lookasides to data structures at
PGCon.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] default opclass for jsonb (was Re: Call for GIST/GIN/SP-GIST opclass documentation)

2014-05-08 Thread Tom Lane
Bruce Momjian br...@momjian.us writes:
 On Tue, May  6, 2014 at 06:20:53PM -0400, Tom Lane wrote:
 I wonder why there's not an option to store keys and values separately,
 but as hashes not as the original strings, so that indexability of
 everything could be guaranteed.  Or a variant of that might be to hash
 only strings that are too large to fit in an index entry, and force
 recheck only when searching for a string that needed hashing.
 
 I wonder whether the most effective use of time at this point
 wouldn't be to fix jsonb_ops to do that, rather than arguing about
 what to rename it to.  If it didn't have the failure-for-long-strings
 problem I doubt anybody would be unhappy about making it the default.

 Can we hash just the values, not the keys, in jsonb_ops, and hash the
 combo in jsonb_hash_ops.  That would give us key-only lookups without a
 recheck.

No, because there's nothing in JSON limiting the length of keys, any more
than values.

I think the idea of hashing only keys/values that are too long is a
reasonable compromise.  I've not finished coding it (because I keep
getting distracted by other problems in the code :-() but it does not
look to be very difficult.  I'm envisioning the cutoff as being something
like 128 bytes; in practice that would mean that few if any keys get
hashed, I think.

regards, tom lane




Re: [HACKERS] [v9.5] Custom Plan API

2014-05-08 Thread Stephen Frost
* Simon Riggs (si...@2ndquadrant.com) wrote:
 On 8 May 2014 14:40, Stephen Frost sfr...@snowman.net wrote:
  Allow me to re-state your suggestion here:
 
  An FDW is loaded which provides hook for join push-down (whatever those
  end up being).
 
  A query is run which joins *local* table A to *local* table B.  Standard
  heaps, standard indexes, all local to this PG instance.
 
  The FDW which supports join push-down is then passed this join for
  planning, with local sub-plans for the local tables.
 
 Yes that is correct; thank you for confirming your understanding with me.

I guess for my part, that doesn't look like an FDW any more.

 That also supports custom join of local to non-local table, or custom
 join of two non-local tables.

Well, we already support these, technically, but the FDW
doesn't actually implement the join, it's done in core.

 If we can use interfaces that already exist with efficiency, why
 invent a new one?

Perhaps once we have a proposal for FDW join push-down this will make
sense, but I'm not seeing it right now.

  Have we considered having an Optimizer and Executor plugin that does
  this without touching core at all?
 
  Uh, isn't that what we're talking about?
 
 No. I meant writing this as an extension rather than a patch on core.

KaiGai's patches have been some changes to core and then an extension
which uses those changes.  The changes to core include exposing internal
functions for extensions to use, which will undoubtedly end up being a
sore spot and fragile.

Thanks,

Stephen




Re: [HACKERS] default opclass for jsonb (was Re: Call for GIST/GIN/SP-GIST opclass documentation)

2014-05-08 Thread Bruce Momjian
On Thu, May  8, 2014 at 10:16:54AM -0400, Tom Lane wrote:
  Can we hash just the values, not the keys, in jsonb_ops, and hash the
  combo in jsonb_hash_ops.  That would give us key-only lookups without a
  recheck.
 
 No, because there's nothing in JSON limiting the length of keys, any more
 than values.
 
 I think the idea of hashing only keys/values that are too long is a
 reasonable compromise.  I've not finished coding it (because I keep
 getting distracted by other problems in the code :-() but it does not
 look to be very difficult.  I'm envisioning the cutoff as being something
 like 128 bytes; in practice that would mean that few if any keys get
 hashed, I think.

Yes, that would be nice.  Ideally we would not be doing this so close to
beta, but it is what it is, and if we need to break binary compatibility
after beta1, at least we have pg_upgrade.

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +




Re: [HACKERS] default opclass for jsonb (was Re: Call for GIST/GIN/SP-GIST opclass documentation)

2014-05-08 Thread Andres Freund
On 2014-05-08 10:34:04 -0400, Bruce Momjian wrote:
 On Thu, May  8, 2014 at 10:16:54AM -0400, Tom Lane wrote:
   Can we hash just the values, not the keys, in jsonb_ops, and hash the
   combo in jsonb_hash_ops.  That would give us key-only lookups without a
   recheck.
  
  No, because there's nothing in JSON limiting the length of keys, any more
  than values.
  
  I think the idea of hashing only keys/values that are too long is a
  reasonable compromise.  I've not finished coding it (because I keep
  getting distracted by other problems in the code :-() but it does not
  look to be very difficult.  I'm envisioning the cutoff as being something
  like 128 bytes; in practice that would mean that few if any keys get
  hashed, I think.
 
 Yes, that would be nice.  Ideally we would not be doing this so close to
 beta, but it is what it is, and if we need to break binary compatibility
 after beta1, at least we have pg_upgrade.

If we break binary compatibility, pg_upgrade won't be able to help, since
the data files won't be, err, binary compatible.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] default opclass for jsonb (was Re: Call for GIST/GIN/SP-GIST opclass documentation)

2014-05-08 Thread Bruce Momjian
On Thu, May  8, 2014 at 04:37:05PM +0200, Andres Freund wrote:
 On 2014-05-08 10:34:04 -0400, Bruce Momjian wrote:
  On Thu, May  8, 2014 at 10:16:54AM -0400, Tom Lane wrote:
Can we hash just the values, not the keys, in jsonb_ops, and hash the
combo in jsonb_hash_ops.  That would give us key-only lookups without a
recheck.
   
   No, because there's nothing in JSON limiting the length of keys, any more
   than values.
   
   I think the idea of hashing only keys/values that are too long is a
   reasonable compromise.  I've not finished coding it (because I keep
   getting distracted by other problems in the code :-() but it does not
   look to be very difficult.  I'm envisioning the cutoff as being something
   like 128 bytes; in practice that would mean that few if any keys get
   hashed, I think.
  
  Yes, that would be nice.  Ideally we would not be doing this so close to
  beta, but it is what it is, and if we need to break binary compatibility
  after beta1, at least we have pg_upgrade.
 
 If we break binary compatibility pg_upgrade won't be able to help. Since
 the data files wont be, err, binary compatibile.

Oops, yeah.  pg_upgrade only helps with system table changes.  We would
have to require users to dump/reload any changed tables or recreate any
indexes.

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +




Re: [HACKERS] postgresql.auto.conf read from wrong directory

2014-05-08 Thread Amit Kapila
On Thu, May 8, 2014 at 6:51 PM, Fujii Masao masao.fu...@gmail.com wrote:
 On Wed, May 7, 2014 at 4:57 PM, Amit Kapila amit.kapil...@gmail.com wrote:
 This problem occurs because we don't have the value of data_directory
 set in postgresql.conf by the time we want to parse the .auto.conf file
 during server start.  The value of data_directory is only available after
 processing of the config files.  To fix this, we need to store the value of
 data_directory during the parse of postgresql.conf so that we can use it
 until data_directory is actually set.  The attached patch fixes the problem.
 Could you please confirm whether it fixes the problem in your
 env./scenario?

 Maybe this is nitpicking, but what happens when postgresql.auto.conf also
 includes the setting of data_directory? This is possible because we can
 set data_directory via ALTER SYSTEM now.

   Yes, that will also be an issue similar to the above.

 Should we just ignore such a
 problematic setting in postgresql.auto.conf with a warning message?

Another way could be that we detect the same during server start
(while parsing postgresql.auto.conf) and then allow for a reparse of the
auto conf file, but I think that will be a bit more complicated.  So let's
try to solve it in the way suggested by you.  If there is no objection from
anyone else, then I will update the patch.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] postgresql.auto.conf read from wrong directory

2014-05-08 Thread Stephen Frost
* Amit Kapila (amit.kapil...@gmail.com) wrote:
 On Thu, May 8, 2014 at 6:51 PM, Fujii Masao masao.fu...@gmail.com wrote:
  Should we just ignore such
  problematic setting in postgresql.auto.conf with warning message?
 
 Another way could be that we detect the same during server start
 (while parsing postgresql.auto.conf) and then allow for reparse of
 auto conf file, but I think that will be bit more complicated.  So lets
 try to solve it in a way suggested by you.  If there is no objection by
 anyone else then I will update the patch.

Pretty sure this is more-or-less exactly what I suggested quite a while
back during the massive ALTER SYSTEM SET thread...  There are certain
options which we shouldn't be allowing in .auto.conf.  I doubt this is
going to be the only one...

Thanks,

Stephen




Re: [HACKERS] postgresql.auto.conf read from wrong directory

2014-05-08 Thread Andres Freund
On 2014-05-08 22:21:43 +0900, Fujii Masao wrote:
 On Wed, May 7, 2014 at 4:57 PM, Amit Kapila amit.kapil...@gmail.com wrote:
  On Tue, May 6, 2014 at 7:04 PM, Christoph Berg c...@df7cb.de wrote:
  Hi,
 
  if you split configuration and data by setting data_directory,
  postgresql.auto.conf is written to the data directory
  (/var/lib/postgresql/9.4/main in Debian), but the server tries to read it
  from the etc directory (/etc/postgresql/9.4/main).
 
  One place to fix it would be in ProcessConfigFile in
  src/backend/utils/misc/guc-file.l:162 by always setting
  CallingFileName = NULL in src/backend/utils/misc/guc-file.l:162, but
  that breaks later in AbsoluteConfigLocation() when data_directory is
  NULL. (As the comment in ProcessConfigFile says.)
 
  This problem occurs because we don't have the value of data_directory
  set in postgresql.conf by the time we want to parse the .auto.conf file
  during server start.  The value of data_directory is only available after
  processing of the config files.  To fix this, we need to store the value of
  data_directory during the parse of postgresql.conf so that we can use it
  until data_directory is actually set.  The attached patch fixes the problem.
  Could you please confirm whether it fixes the problem in your
  env./scenario?
 
 Maybe this is nitpicking, but what happens when postgresql.auto.conf also
 includes the setting of data_directory? This is possible because we can
 set data_directory via ALTER SYSTEM now. Should we just ignore such a
 problematic setting in postgresql.auto.conf with a warning message?

I think that's a case of "Doctor, it hurts when I do this." Doctor: "Don't
do that then."

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] PQputCopyEnd doesn't adhere to its API contract

2014-05-08 Thread Robert Haas
According to the documentation for PQputCopyEnd:

 The result is 1 if the termination data was sent, zero if it was not sent 
 because the attempt would block (this case is only possible if the connection 
 is in
 nonblocking mode), or -1 if an error occurred. (Use PQerrorMessage to 
 retrieve details if the return value is -1. If the value is zero, wait for 
 write-ready and try again.)

However, pqPutCopyEnd contains no return statement that can ever
possibly return 0.  I think the problem is approximately here:

/* Try to flush data */
if (pqFlush(conn) < 0)
return -1;

pqFlush() returns 0 if no data is waiting to be sent, or otherwise the
return value of pqSendSome().  pqSendSome() returns -1 if an error
occurs, 0 if all data is sent, or 1 if some data was sent but the
socket is non-blocking and the caller must try again later.  It seems
to me that when pqSendSome() returns 1, pqPutCopyEnd ought to return 0
in order to meet its API contract - and then the client, presumably,
should repeatedly wait for the socket to become write-ready and then
try PQflush() until PQflush() returns zero.

Thoughts?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] popen and pclose redefinitions causing many warning in Windows build

2014-05-08 Thread Tom Lane
Heikki Linnakangas hlinnakan...@vmware.com writes:
 On 05/08/2014 08:01 AM, Michael Paquier wrote:
 Since commit a692ee5, code compilation on windows is full of warnings
 caused by the re-definitions of popen and pclose:

 Hmm. Does the MinGW version of popen() and system() do the quoting for 
 you? If we just #ifdef the defines, then we will not use the wrappers on 
 MinGW, which would be wrong if the quoting is needed there. If it's not 
 needed, then we shouldn't be compiling the wrapper functions in the 
 first place.

Another problem, if we do need the wrappers on mingw, is that the
#undef commands in system.c will presumably result in the wrong
things happening in the wrapper functions, since the platform needs
us to use their macros there.

The simplest workaround I can think of is to change the stanza in
port.h to be like

   #ifndef DONT_DEFINE_SYSTEM_POPEN
   #undef system
   #define system(a) pgwin32_system(a)
   #undef popen
   #define popen(a,b) pgwin32_popen(a,b)
   #undef pclose
   #define pclose(a) _pclose(a)
   #endif

and then have system.c do #define DONT_DEFINE_SYSTEM_POPEN before
including postgres.h.  This is pretty grotty, but on the other hand
it'd remove the need for the #undef's in system.c.
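
For concreteness, the corresponding change in system.c would be something
like this (a sketch; the exact header that file includes today may differ):

    /* at the very top of system.c, before any PostgreSQL headers */
    #define DONT_DEFINE_SYSTEM_POPEN
    #include "postgres.h"

    /*
     * port.h is now pulled in without redefining system()/popen()/pclose(),
     * so the pgwin32_system()/pgwin32_popen() wrappers below can call the
     * native CRT versions directly, and the #undef lines can go away.
     */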

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Compilation errors with mingw build caused by undefined optreset

2014-05-08 Thread Tom Lane
Michael Paquier michael.paqu...@gmail.com writes:
 Since commit 60ff2fd introducing the centralized getopt-related
 things in a global header file, build on Windows with mingw is failing

Hm, buildfarm member narwhal doesn't seem to be having any such problem.
(It's got its own issues, but not this.)

Do you think you could put up a buildfarm critter using whatever version
of mingw you're using?  Without buildfarm feedback, things *will* break
regularly; Windows is just too weird to expect otherwise.

 because of some declarations of HAVE_INT_OPTRESET causing optreset to
 become undefined:
 This failure is new with 9.4, and attached is a patch fixing it...

I'm a bit suspicious of this patch because of the comment in pg_getopt.h
saying that cygwin doesn't want those variables to be declared.
We can try it, but the lack of up-to-date cygwin members in the buildfarm
means we won't be real sure whether it breaks cygwin.

Of course, I guess the response to any complaints can be "please put
up a buildfarm member" ...

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] popen and pclose redefinitions causing many warning in Windows build

2014-05-08 Thread Andrew Dunstan


On 05/08/2014 11:19 AM, Tom Lane wrote:

Heikki Linnakangas hlinnakan...@vmware.com writes:

On 05/08/2014 08:01 AM, Michael Paquier wrote:

Since commit a692ee5, code compilation on windows is full of warnings
caused by the re-definitions of popen and pclose:

Hmm. Does the MinGW version of popen() and system() do the quoting for
you? If we just #ifdef the defines, then we will not use the wrappers on
MinGW, which would be wrong if the quoting is needed there. If it's not
needed, then we shouldn't be compiling the wrapper functions in the
first place.

Another problem, if we do need the wrappers on mingw, is that the
#undef commands in system.c will presumably result in the wrong
things happening in the wrapper functions, since the platform needs
us to use their macros there.




I'm pretty sure we need this on Mingw - this SYSTEMQUOTE stuff dates 
back well before 8.3, IIRC, which is when we first got full MSVC support.


cheers

andrew


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [WIP] showing index maintenance on EXPLAIN

2014-05-08 Thread Robert Haas
On Thu, May 8, 2014 at 2:31 AM, Jaime Casanova ja...@2ndquadrant.com wrote:
 On Wed, May 7, 2014 at 10:52 PM, Amit Kapila amit.kapil...@gmail.com wrote:
 On Thu, May 8, 2014 at 5:30 AM, Jaime Casanova ja...@2ndquadrant.com wrote:
 Hi,

 This patch implements $subject only when ANALYZE and VERBOSE are on.
 I made it that way because for years nobody seemed interested in this
  info (at least no one did it) so i decided that maybe it is too much
 information for most people (actually btree indexes are normally very
 fast).


  Why capture this only for Index Insert/Update and not for Read; is it
  because Read will always be fast, or is it implementation complexity?


 EXPLAIN ANALYZE already shows that on any SELECT that uses an index in
 some way. Or are you thinking on something else?

 Why not similar timings for heap?


 well actual time shows us total time of the operation so just
 deducting the time spent on triggers, indexes and planning seems like
 a way to get heap modification time.

 yes, maybe we still need some additional data. for example, i could
 want to know how much time we spent extending a relation.

  Why can't we print this when only Analyze is used with Explain?  The
  execution time is printed with just the Analyze option.


 i'm not sure the info is useful for everyone, i'm not opposed to show
 it all the time though

  Could you please tell in what kinds of scenarios you expect it
  to be useful?
 One I could think is that if there are multiple indexes on a table and user
 wants to find out if any particular index is consuming more time.


 exactly my use case. consider this plan (we spent 78% of the time
 updating the index uniq_idx_on_text):

QUERY PLAN
 
  Insert on public.t1 (actual time=0.540..0.540 rows=0 loops=1)
-  Result (actual time=0.046..0.049 rows=1 loops=1)
  Output: some long data here
  Index uniq_idx_on_text on t1: time=0.421 rows=1
  Index t1_pkey on t1: time=0.027 rows=1
  Total runtime: 0.643 ms
 (6 rows)

I would have expected the information about index maintenance times to
be associated with the Insert node, not the plan overall.  IIUC, you
could have more than one such node if, for example, there are
writeable CTEs involved.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.5] Custom Plan API

2014-05-08 Thread Simon Riggs
On 8 May 2014 15:25, Stephen Frost sfr...@snowman.net wrote:
 * Simon Riggs (si...@2ndquadrant.com) wrote:
 On 8 May 2014 14:40, Stephen Frost sfr...@snowman.net wrote:
  Allow me to re-state your suggestion here:
 
  An FDW is loaded which provides hook for join push-down (whatever those
  end up being).
 
  A query is run which joins *local* table A to *local* table B.  Standard
  heaps, standard indexes, all local to this PG instance.
 
  The FDW which supports join push-down is then passed this join for
  planning, with local sub-plans for the local tables.

 Yes that is correct; thank you for confirming your understanding with me.

 I guess for my part, that doesn't look like an FDW any more.

If it works, it works. If it doesn't, we can act otherwise.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Compilation errors with mingw build caused by undefined optreset

2014-05-08 Thread Andrew Dunstan


On 05/08/2014 11:30 AM, Tom Lane wrote:

Michael Paquier michael.paqu...@gmail.com writes:

Since commit 60ff2fd introducing the centralized getopt-related
things in a global header file, build on Windows with mingw is failing

Hm, buildfarm member narwhal doesn't seem to be having any such problem.
(It's got its own issues, but not this.)



jacana and frogmouth are also Mingw animals, and are not having issues.


cheers

andrew


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.5] Custom Plan API

2014-05-08 Thread Simon Riggs
On 7 May 2014 02:05, Kouhei Kaigai kai...@ak.jp.nec.com wrote:

 (1) DDL support and system catalog

 Simon suggested that DDL command should be supported to track custom-
 plan providers being installed, and to avoid nonsense hook calls
 if it is an obvious case that custom-plan provider can help. It also
 makes sense to give a chance to load extensions once installed.
 (In the previous design, I assumed modules are loaded by LOAD command
 or *_preload_libraries parameters).

I've tried hard to bend my mind to this and it's beginning to sink in.

We've already got pg_am for indexes, and soon to have pg_seqam for sequences.

It would seem normal and natural to have

* pg_joinam catalog table for join methods with a join method API
Which would include some way of defining which operators/datatypes we
consider this for, so if PostGIS people come up with some fancy GIS
join thing, we don't invoke it every time even when it's inapplicable.
I would prefer it if PostgreSQL also had some way to control when the
joinam was called, possibly with some kind of table_size_threshold on
the AM tuple, which could be set to >=0 to control when this was even
considered.

* pg_scanam catalog table for scan methods with a scan method API
Again, a list of operators that can be used with it, like indexes and
operator classes

By analogy to existing mechanisms, we would want

* A USERSET mechanism to allow users to turn it off for testing or
otherwise, at user, database level

We would also want

* A startup call that allows us to confirm it is available and working
correctly, possibly with some self-test for hardware, performance
confirmation/derivation of planning parameters

* Some kind of trace mode that would allow people to confirm the
outcome of calls

* Some interface to the stats system so we could track the frequency
of usage of each join/scan type. This would be done within Postgres,
tracking the calls by name, rather than trusting the plugin to do it
for us


 I tried to implement the following syntax:

   CREATE CUSTOM PLAN name FOR (scan|join|any) HANDLER func_name;

Not sure if we need that yet

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PQputCopyEnd doesn't adhere to its API contract

2014-05-08 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 According to the documentation for PQputCopyEnd:
 The result is 1 if the termination data was sent, zero if it was not sent 
 because the attempt would block (this case is only possible if the 
 connection is in
 nonblocking mode), or -1 if an error occurred. (Use PQerrorMessage to 
 retrieve details if the return value is -1. If the value is zero, wait for 
 write-ready and try again.)

 However, pqPutCopyEnd contains no return statement that can ever
 possibly return 0.

IIRC, for some of the libpq functions, the API spec was intended to allow
for future async behavior that wasn't actually implemented.  So the mere
fact that it can't return zero today is not a problem.  If it might block
instead, then we have a problem, but that's not what you're saying.

Also, what the API spec says is that clients should retry PQputCopyEnd
on a zero return.  We do *not* want them to do that here; once we've
changed asyncStatus, a retry would report an error.  So the API spec
in this case is intended to require a retry loop that the current
implementation doesn't actually need, but that conceivably could be
needed after some future refactoring.  In particular, we might want
to fix the code so that it returns zero if it fails to queue the
COPY END message at all.

 pqFlush() returns 0 if no data is waiting to be sent, or otherwise the
 return value of pqSendSome().  pqSendSome() returns -1 if an error
 occurs, 0 if all data is sent, or 1 if some data was sent but the
 socket is non-blocking and the caller must try again later.  It seems
 to me that when pqSendSome() returns 1, pqPutCopyEnd ought to return 0
 in order to meet its API contract - and then the client, presumably,
 should repeatedly wait for the socket to become write-ready and then
 try PQflush() until PQflush() returns non-zero.

That would be a change in the API spec, not just the implementation,
and it's not clear to me that it would actually be helpful to any
clients.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] popen and pclose redefinitions causing many warning in Windows build

2014-05-08 Thread Tom Lane
Andrew Dunstan and...@dunslane.net writes:
 I'm pretty sure we need this on Mingw - this SYSTEMQUOTE stuff dates 
 back well before 8.3, IIRC, which is when we first got full MSVC support.

I tried googling for some info on this, and got a number of hits
suggesting that mingw didn't emulate popen at all till pretty recently.
For instance this:
https://lists.fedoraproject.org/pipermail/fedora-mingw/2009-September/002087.html
Jones is an ex-coworker of mine, and I'm pretty sure that if he said
it wasn't there then it wasn't there.

So I'm confused about how this worked at all in mingw back-when.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] postgresql.auto.conf read from wrong directory

2014-05-08 Thread Christoph Berg
Re: Andres Freund 2014-05-08 20140508145901.gb1...@awork2.anarazel.de
  Maybe this is nitpicking, but what happens when postgresql.auto.conf also
  includes the setting of data_directory? This is possible because we can
  set data_directory via ALTER SYSTEM now. Should we just ignore such
  problematic setting in postgresql.auto.conf with warning message?
 
 I think that's a case of "Doctor, it hurts when I do this."  Doctor: "don't
 do that then."

I'd opt to forbid setting data_directory at ALTER SYSTEM time. For the
other options, I agree with Andres that you should get to keep all
parts if you manage to break it.

Fortunately, ALTER SYSTEM already refuses to change config_file :)
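
For what it's worth, an explicit guard would be tiny; a sketch (the placement
inside guc.c's AlterSystemSetConfigFile() and the variable name are assumptions,
and the existing config_file refusal may well come from GUC flags rather than a
check like this):

    /* hypothetical check while handling ALTER SYSTEM SET name = value */
    if (strcmp(name, "data_directory") == 0)
        ereport(ERROR,
                (errcode(ERRCODE_CANT_CHANGE_RUNTIME_PARAM),
                 errmsg("parameter \"%s\" cannot be changed by ALTER SYSTEM",
                        name)));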

Christoph
-- 
c...@df7cb.de | http://www.df7cb.de/


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Sending out a request for more buildfarm animals?

2014-05-08 Thread Tomas Vondra
On 6.5.2014 23:01, Tomas Vondra wrote:
 On 6.5.2014 22:24, Tom Lane wrote:
 Tomas Vondra t...@fuzzy.cz writes:
 I recall there was a call for more animals with CLOBBER_CACHE_ALWAYS
 some time ago, so I went and enabled that on all three animals. Let's
 see how long that will take.

 I see there are more 'clobber' options in the code: CLOBBER_FREED_MEMORY
 and CLOBBER_CACHE_RECURSIVELY. Would that be a good idea to enable these
 as well?

 The time requirements will be much higher (especially for the
 RECURSIVELY option), but running that once a week shouldn't be a big
 deal - the machine is pretty much dedicated to the buildfarm.

 I've never had the patience to run the regression tests to completion
 with CLOBBER_CACHE_RECURSIVELY at all, let alone do it on a regular 
 basis. (I wonder if there's some easy way to run it for just a few 
 regression tests...)
 
 Now, that's a challenge ;-)
 

 I think testing CLOBBER_FREED_MEMORY would be sensible though.
 
 OK, I've enabled this for now.

Hmm, with CLOBBER_CACHE_ALWAYS + CLOBBER_FREED_MEMORY the tests take
~20h on a single branch/animal. With a single locale (e.g. C) it would
take ~4h, but we're testing a bunch of additional czech/slovak locales.

The tests are running in sequence (magpie -> treepie -> fulmar) so with all
6 branches, this would take ~14 days to complete. I don't mind the
machine is running tests 100% of the time, that's why it's in buildfarm,
but I'd rather see the failures soon after the commit (and two weeks is
well over the soon edge, IMHO).

So I'm thinking about how to improve this. I'd like to keep the options
for all the branches (e.g. not just HEAD, as a few other animals do).
But I'm thinking about running the tests in parallel, somehow - the
machine has 4 cores, and most of the time only one of them is used. I
don't expect a perfect ~3x speedup, but getting ~2x would be nice.

Any recommendations on how to do that? I see there's 'base_port' in the
config - is it enough to tweak this, or do I need to run the animals
separately using e.g. lxc?

regards
Tomas


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PQputCopyEnd doesn't adhere to its API contract

2014-05-08 Thread Robert Haas
On Thu, May 8, 2014 at 12:09 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 According to the documentation for PQputCopyEnd:
 The result is 1 if the termination data was sent, zero if it was not sent 
 because the attempt would block (this case is only possible if the 
 connection is in
 nonblocking mode), or -1 if an error occurred. (Use PQerrorMessage to 
 retrieve details if the return value is -1. If the value is zero, wait for 
 write-ready and try again.)

 However, pqPutCopyEnd contains no return statement that can ever
 possibly return 0.

 IIRC, for some of the libpq functions, the API spec was intended to allow
 for future async behavior that wasn't actually implemented.  So the mere
 fact that it can't return zero today is not a problem.  If it might block
 instead, then we have a problem, but that's not what you're saying.

 Also, what the API spec says is that clients should retry PQputCopyEnd
 on a zero return.  We do *not* want them to do that here; once we've
 changed asyncStatus, a retry would report an error.  So the API spec
 in this case is intended to require a retry loop that the current
 implementation doesn't actually need, but that conceivably could be
 needed after some future refactoring.  In particular, we might want
 to fix the code so that it returns zero if it fails to queue the
 COPY END message at all.

Yeah, good point.  I think what confused me is the fact that the
documentation says that when PQputCopyEnd() returns 1, that means that
the termination data was sent.  So my application sat there and
waited for a server response that never showed up, because in fact the
termination data had *not* been sent.

What I'm now thinking I need to do is something like this:

1. If PQputCopyEnd returns -1, error.
2. while ((rc = PQflush(conn)) != 0) { if (rc < 0) { error; } else {
wait for socket to become read-ready or write-ready; } }
3. while (PQisBusy(conn)) { wait for the socket to become read-ready; }
4. PQgetResult()

Does that sound right?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] A couple logical decoding fixes/patches

2014-05-08 Thread Andres Freund
Hi,

Patch 01: Fix a couple of embarrassing typos. Most of them harmless, but
one isn't and can lead to crashing or decoding wrong data.

Patch 02: Don't crash with an Assert() failure if wal_level=logical but
max_replication_slots=0.

Patch 03: Add valgrind suppression for writing out padding bytes. That's
better than zeroing the data from the get-go because uninitialized
accesses are still detected.

Details are in the commit messages of the individual patches.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services
From 9987ff3fead5cee0b9c7da94c8d8a665015be7c2 Mon Sep 17 00:00:00 2001
From: Andres Freund and...@anarazel.de
Date: Thu, 8 May 2014 17:00:06 +0200
Subject: [PATCH 1/3] Fix a couple of brown paper bag typos in the logical
 decoding code.

Most of these were fairly harmless but the typos in
ReorderBufferSerializeChange() could lead to crashes or wrong data
being decoded.

The ReorderBufferSerializeChange() bug was encountered by Christian
Kruse; the rest were found during a review for similar silliness.
---
 src/backend/replication/logical/reorderbuffer.c | 13 +
 src/backend/replication/logical/snapbuild.c | 11 +--
 2 files changed, 10 insertions(+), 14 deletions(-)

diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
index 7f2bbca..f96e3e1 100644
--- a/src/backend/replication/logical/reorderbuffer.c
+++ b/src/backend/replication/logical/reorderbuffer.c
@@ -2064,15 +2064,15 @@ ReorderBufferSerializeChange(ReorderBuffer *rb, ReorderBufferTXN *txn,
 				if (snap->xcnt)
 				{
 					memcpy(data, snap->xip,
-						   sizeof(TransactionId) + snap->xcnt);
-					data += sizeof(TransactionId) + snap->xcnt;
+						   sizeof(TransactionId) * snap->xcnt);
+					data += sizeof(TransactionId) * snap->xcnt;
 }
 
 				if (snap->subxcnt)
 				{
 					memcpy(data, snap->subxip,
-						   sizeof(TransactionId) + snap->subxcnt);
-					data += sizeof(TransactionId) + snap->subxcnt;
+						   sizeof(TransactionId) * snap->subxcnt);
+					data += sizeof(TransactionId) * snap->subxcnt;
 }
 break;
 			}
@@ -2168,15 +2168,12 @@ ReorderBufferRestoreChanges(ReorderBuffer *rb, ReorderBufferTXN *txn,
 
 		}
 
-		ReorderBufferSerializeReserve(rb, sizeof(ReorderBufferDiskChange));
-
-
 		/*
 		 * Read the statically sized part of a change which has information
 		 * about the total size. If we couldn't read a record, we're at the
 		 * end of this file.
 		 */
-
+		ReorderBufferSerializeReserve(rb, sizeof(ReorderBufferDiskChange));
 		readBytes = read(*fd, rb->outbuf, sizeof(ReorderBufferDiskChange));
 
 		/* eof */
diff --git a/src/backend/replication/logical/snapbuild.c b/src/backend/replication/logical/snapbuild.c
index cb45f90..f00fd7d 100644
--- a/src/backend/replication/logical/snapbuild.c
+++ b/src/backend/replication/logical/snapbuild.c
@@ -307,8 +307,7 @@ AllocateSnapshotBuilder(ReorderBuffer *reorder,
 	builder->committed.xip =
 		palloc0(builder->committed.xcnt_space * sizeof(TransactionId));
 	builder->committed.includes_all_transactions = true;
-	builder->committed.xip =
-		palloc0(builder->committed.xcnt_space * sizeof(TransactionId));
+
 	builder->initial_xmin_horizon = xmin_horizon;
 	builder->transactions_after = start_lsn;
 
@@ -1691,7 +1690,7 @@ SnapBuildRestore(SnapBuild *builder, XLogRecPtr lsn)
 
 	/* restore running xacts information */
 	sz = sizeof(TransactionId) * ondisk.builder.running.xcnt_space;
-	ondisk.builder.running.xip = MemoryContextAlloc(builder->context, sz);
+	ondisk.builder.running.xip = MemoryContextAllocZero(builder->context, sz);
 	readBytes = read(fd, ondisk.builder.running.xip, sz);
 	if (readBytes != sz)
 	{
@@ -1705,7 +1704,7 @@ SnapBuildRestore(SnapBuild *builder, XLogRecPtr lsn)
 
 	/* restore committed xacts information */
 	sz = sizeof(TransactionId) * ondisk.builder.committed.xcnt;
-	ondisk.builder.committed.xip = MemoryContextAlloc(builder->context, sz);
+	ondisk.builder.committed.xip = MemoryContextAllocZero(builder->context, sz);
 	readBytes = read(fd, ondisk.builder.committed.xip, sz);
 	if (readBytes != sz)
 	{
@@ -1763,10 +1762,10 @@ SnapBuildRestore(SnapBuild *builder, XLogRecPtr lsn)
 	}
 	ondisk.builder.committed.xip = NULL;
 
-	builder->running.xcnt = ondisk.builder.committed.xcnt;
+	builder->running.xcnt = ondisk.builder.running.xcnt;
 	if (builder->running.xip)
 		pfree(builder->running.xip);
-	builder->running.xcnt_space = ondisk.builder.committed.xcnt_space;
+	builder->running.xcnt_space = ondisk.builder.running.xcnt_space;
 	builder->running.xip = ondisk.builder.running.xip;
 
 	/* our snapshot is not interesting anymore, build a new one */
-- 
1.8.5.rc2.dirty

From 863589478ce185cba02e8ad5a4ff2a8b79588cb0 Mon Sep 17 00:00:00 2001
From: Andres Freund and...@anarazel.de
Date: Thu, 8 May 2014 18:02:20 +0200
Subject: [PATCH 2/3] Remove overeager assertion in wal_level=logical specific
 code.

The assertion 

[HACKERS] [PATCH] Fix harmless access to uninitialized memory in ri_triggers.c.

2014-05-08 Thread andres
From: Andres Freund and...@anarazel.de

When cache invalidations arrive while ri_LoadConstraintInfo() is busy
filling a new cache entry, InvalidateConstraintCacheCallBack()
compares the - not yet initialized - oidHashValue field with the
to-be-invalidated hash value. To fix check whether the entry is
already marked as invalid.
---
 src/backend/utils/adt/ri_triggers.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/backend/utils/adt/ri_triggers.c b/src/backend/utils/adt/ri_triggers.c
index d30847b..e4d7b2c 100644
--- a/src/backend/utils/adt/ri_triggers.c
+++ b/src/backend/utils/adt/ri_triggers.c
@@ -2934,7 +2934,8 @@ InvalidateConstraintCacheCallBack(Datum arg, int cacheid, uint32 hashvalue)
 	hash_seq_init(&status, ri_constraint_cache);
 	while ((hentry = (RI_ConstraintInfo *) hash_seq_search(&status)) != NULL)
 	{
-		if (hashvalue == 0 || hentry->oidHashValue == hashvalue)
+		if (hentry->valid &&
+			(hashvalue == 0 || hentry->oidHashValue == hashvalue))
 			hentry->valid = false;
 	}
 }
-- 
1.8.5.rc2.dirty



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Recursive ReceiveSharedInvalidMessages not safe

2014-05-08 Thread Andres Freund
On 2014-05-05 15:41:22 -0400, Tom Lane wrote:
  After far, far too much confused head scratching, code reading, random
  elog()s et al I found out that this is just because of a deficiency in
  valgrind's undefinedness tracking. [...]
  Unfortunately this cannot precisely be caught by valgrind's
  suppressions. Thus I'd like to add memset(SharedInvalidationMessage msg,
  0) in AddCatcacheInvalidationMessage() et al. to suppress these
  warnings. Imo we can just add them unconditionally, but if somebody else
  prefers we can add #ifdef USE_VALGRIND around them.
 
 I'd be okay with USE_VALGRIND.  I'm not particularly hot on adding a
 memset for everybody just to make valgrind less confused.  Especially
 since that's really going to hide any problems, not fix them.

The attached patch does VALGRIND_MAKE_MEM_DEFINED() on the relevant
structs. That's already defined to empty if USE_VALGRIND isn't defined.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services
From 4da7b0caa2058c8b07ee4a7ab529b9ca0c292bcf Mon Sep 17 00:00:00 2001
From: Andres Freund and...@anarazel.de
Date: Wed, 7 May 2014 22:30:05 +0200
Subject: [PATCH] Silence a spurious valgrind warning in inval.c.

For valgrind's sake explicitly define all bytes of
SharedInvalidationMessage as defined before sending them. Otherwise
valgrind sometimes treats bytes as undefined that aren't because
valgrind doesn't understand that the ringbuffer in sinvaladt.c is
accessed by multiple processes. Valgrind remembers that it stored an
undefined byte into parts of a slot in the ringbuffer and warns when
it uses that byte again - not realizing it has been stored by another
process.
---
 src/backend/utils/cache/inval.c | 24 
 1 file changed, 24 insertions(+)

diff --git a/src/backend/utils/cache/inval.c b/src/backend/utils/cache/inval.c
index 5971469..dd46e18 100644
--- a/src/backend/utils/cache/inval.c
+++ b/src/backend/utils/cache/inval.c
@@ -103,6 +103,7 @@
 #include "storage/smgr.h"
 #include "utils/catcache.h"
 #include "utils/inval.h"
+#include "utils/memdebug.h"
 #include "utils/memutils.h"
 #include "utils/rel.h"
 #include "utils/relmapper.h"
@@ -332,6 +333,14 @@ AddCatcacheInvalidationMessage(InvalidationListHeader *hdr,
 	msg.cc.id = (int8) id;
 	msg.cc.dbId = dbId;
 	msg.cc.hashValue = hashValue;
+	/*
+	 * Define padding bytes to be defined, otherwise the sinvaladt.c
+	 * ringbuffer, which is accessed by multiple processes, will cause false
+	 * positives because valgrind remembers the undefined bytes from this
+	 * process's store, not realizing that another process has written since.
+	 */
+	VALGRIND_MAKE_MEM_DEFINED(&msg, sizeof(msg));
+
 	AddInvalidationMessage(&hdr->cclist, &msg);
 }
 
@@ -347,6 +356,9 @@ AddCatalogInvalidationMessage(InvalidationListHeader *hdr,
 	msg.cat.id = SHAREDINVALCATALOG_ID;
 	msg.cat.dbId = dbId;
 	msg.cat.catId = catId;
+	/* check AddCatcacheInvalidationMessage() for an explanation */
+	VALGRIND_MAKE_MEM_DEFINED(&msg, sizeof(msg));
+
 	AddInvalidationMessage(&hdr->cclist, &msg);
 }
 
@@ -370,6 +382,9 @@ AddRelcacheInvalidationMessage(InvalidationListHeader *hdr,
 	msg.rc.id = SHAREDINVALRELCACHE_ID;
 	msg.rc.dbId = dbId;
 	msg.rc.relId = relId;
+	/* check AddCatcacheInvalidationMessage() for an explanation */
+	VALGRIND_MAKE_MEM_DEFINED(&msg, sizeof(msg));
+
 	AddInvalidationMessage(&hdr->rclist, &msg);
 }
 
@@ -393,6 +408,9 @@ AddSnapshotInvalidationMessage(InvalidationListHeader *hdr,
 	msg.sn.id = SHAREDINVALSNAPSHOT_ID;
 	msg.sn.dbId = dbId;
 	msg.sn.relId = relId;
+	/* check AddCatcacheInvalidationMessage() for an explanation */
+	VALGRIND_MAKE_MEM_DEFINED(&msg, sizeof(msg));
+
 	AddInvalidationMessage(&hdr->rclist, &msg);
 }
 
@@ -1242,6 +1260,9 @@ CacheInvalidateSmgr(RelFileNodeBackend rnode)
 	msg.sm.backend_hi = rnode.backend >> 16;
 	msg.sm.backend_lo = rnode.backend & 0xffff;
 	msg.sm.rnode = rnode.node;
+	/* check AddCatcacheInvalidationMessage() for an explanation */
+	VALGRIND_MAKE_MEM_DEFINED(&msg, sizeof(msg));
+
 	SendSharedInvalidMessages(&msg, 1);
 }
 
@@ -1267,6 +1288,9 @@ CacheInvalidateRelmap(Oid databaseId)
 
 	msg.rm.id = SHAREDINVALRELMAP_ID;
 	msg.rm.dbId = databaseId;
+	/* check AddCatcacheInvalidationMessage() for an explanation */
+	VALGRIND_MAKE_MEM_DEFINED(&msg, sizeof(msg));
+
 	SendSharedInvalidMessages(&msg, 1);
 }
 
-- 
1.8.5.rc2.dirty


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Recursive ReceiveSharedInvalidMessages not safe

2014-05-08 Thread Andres Freund
On 2014-05-05 18:50:59 -0400, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
  On 2014-05-05 15:41:22 -0400, Tom Lane wrote:
  Looks all right to me.  Yeah, the right shift might have undefined
  high-order bits, but we don't care because we're storing the result
  into an int16.
 
  Doesn't at the very least
  rnode.backend = (msg->sm.backend_hi << 16) | (int)
  msg->sm.backend_lo;
  have to be
  rnode.backend = ((int) msg->sm.backend_hi << 16) | (int)
  msg->sm.backend_lo;
 
 A promotion to int (or wider) is implicit in any arithmetic operation,
 last I checked the C standard.  The (int) on the other side isn't
 necessary either.

Done in the attached patch.

 We might actually be better off casting both sides to unsigned int,
 just to enforce that the left shifting is done with unsigned semantics.
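
As a standalone illustration of that point, a round trip of the backend id
through the two 16-bit message fields might look like this (the typedef and
helper names are only for the example):

    #include <stdint.h>

    typedef int BackendId;      /* stand-in for the real typedef */

    /* split a backend id into the two 16-bit fields carried in the message */
    static void
    pack_backend(BackendId backend, uint16_t *hi, uint16_t *lo)
    {
        *hi = (uint16_t) ((uint32_t) backend >> 16);
        *lo = (uint16_t) ((uint32_t) backend & 0xffff);
    }

    /* reassemble it; the casts force the shift and OR to happen in uint32 */
    static BackendId
    unpack_backend(uint16_t hi, uint16_t lo)
    {
        return (BackendId) (((uint32_t) hi << 16) | (uint32_t) lo);
    }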
 
  c) The ProcessMessageList() calls access msg->rc.id to test for the type
  of the existing message. That's not nice.
 
  Huh?
 
  The checks should access msg->id, not msg->rc.id...
 
 Meh.  If they're not the same thing, we've got big trouble anyway.
 But if you want to change it as a cosmetic thing, no objection.

Changed as well.

On 2014-05-05 15:41:22 -0400, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
  a) SICleanupQueue() sometimes releases and reacquires the write lock
 held on the outside. That's pretty damn fragile, not to mention
 ugly. Even slight reformulations of the code in SIInsertDataEntries()
 can break this... Can we please extend the comment in the latter that
 to mention the lock dropping explicitly?
 
 Want to write a better comment?

Edited the comment and, in a perhaps debatable fashion, moved some
variable declarations + volatile into the retry loop for some added
coziness. If the compiler inlines the cleanup function it could very
well decide to fetch some of the variables only once.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services
From a24973508710c965035fca45e933823ce49a5a4f Mon Sep 17 00:00:00 2001
From: Andres Freund and...@anarazel.de
Date: Thu, 8 May 2014 16:13:44 +0200
Subject: [PATCH] Minor cleanups and comment improvements in the cache
 invalidation code.

---
 src/backend/storage/ipc/sinvaladt.c | 11 ---
 src/backend/utils/cache/inval.c |  9 +
 2 files changed, 13 insertions(+), 7 deletions(-)

diff --git a/src/backend/storage/ipc/sinvaladt.c b/src/backend/storage/ipc/sinvaladt.c
index 0328660..3966ccc 100644
--- a/src/backend/storage/ipc/sinvaladt.c
+++ b/src/backend/storage/ipc/sinvaladt.c
@@ -453,7 +453,6 @@ SIInsertDataEntries(const SharedInvalidationMessage *data, int n)
 	while (n > 0)
 	{
 		int			nthistime = Min(n, WRITE_QUANTUM);
-		int			numMsgs;
 		int			max;
 		int			i;
 
@@ -466,11 +465,17 @@ SIInsertDataEntries(const SharedInvalidationMessage *data, int n)
 		 * queue and reset anyone who is preventing space from being freed.
 		 * Otherwise, clean the queue only when it's exceeded the next
 		 * fullness threshold.  We have to loop and recheck the buffer state
-		 * after any call of SICleanupQueue.
+		 * after any call of SICleanupQueue because the cleanup sometimes
+		 * requires releasing the write lock acquired above.
 		 */
 		for (;;)
 		{
-			numMsgs = segP->maxMsgNum - segP->minMsgNum;
+			int numMsgs;
+			/* use volatile pointer to prevent code rearrangement */
+			volatile SISeg *vsegP = segP;
+
+			numMsgs = vsegP->maxMsgNum - vsegP->minMsgNum;
+
 			if (numMsgs + nthistime > MAXNUMMESSAGES ||
 				numMsgs >= segP->nextThreshold)
 SICleanupQueue(true, nthistime);
diff --git a/src/backend/utils/cache/inval.c b/src/backend/utils/cache/inval.c
index dd46e18..4c3197b 100644
--- a/src/backend/utils/cache/inval.c
+++ b/src/backend/utils/cache/inval.c
@@ -374,7 +374,7 @@ AddRelcacheInvalidationMessage(InvalidationListHeader *hdr,
 	/* Don't add a duplicate item */
 	/* We assume dbId need not be checked because it will never change */
 	ProcessMessageList(hdr->rclist,
-					   if (msg->rc.id == SHAREDINVALRELCACHE_ID &&
+					   if (msg->id == SHAREDINVALRELCACHE_ID &&
 						   msg->rc.relId == relId)
 	   return);
 
@@ -400,7 +400,7 @@ AddSnapshotInvalidationMessage(InvalidationListHeader *hdr,
 	/* Don't add a duplicate item */
 	/* We assume dbId need not be checked because it will never change */
 	ProcessMessageList(hdr->rclist,
-					   if (msg->sn.id == SHAREDINVALSNAPSHOT_ID &&
+					   if (msg->id == SHAREDINVALSNAPSHOT_ID &&
 						   msg->sn.relId == relId)
 	   return);
 
@@ -580,7 +580,8 @@ LocalExecuteInvalidationMessage(SharedInvalidationMessage *msg)
 		RelFileNodeBackend rnode;
 
 		rnode.node = msg->sm.rnode;
-		rnode.backend = (msg->sm.backend_hi << 16) | (int) msg->sm.backend_lo;
+		rnode.backend = (BackendId) (((uint32) msg->sm.backend_hi << 16)
+									 | (uint32) msg->sm.backend_lo);
 		smgrclosenode(rnode);
 	}
 	else if (msg->id == SHAREDINVALRELMAP_ID)
@@ -1257,7 

Re: [HACKERS] PQputCopyEnd doesn't adhere to its API contract

2014-05-08 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 What I'm now thinking I need to do is something like this:

 1. If PQputCopyEnd returns -1, error.
  2. while ((rc = PQflush(conn)) != 0) { if (rc < 0) { error; } else {
 wait for socket to become read-ready or write-ready; } }
 3. while (PQisBusy(conn)) { wait for the socket to become read-ready; }
 4. PQgetResult()

 Does that sound right?

Yeah, more or less --- I think you need a PQconsumeInput there somewhere.

There is a PQflush call in PQconsumeInput that is intended to keep clients
from having to do that for themselves; but I'm not sure that it helps,
since clients probably only call PQconsumeInput when the socket is
read-ready --- and it wouldn't be in this situation.
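
Putting those pieces together, a minimal client-side sketch of the whole
sequence (assuming a blocking select() wait on the socket; error handling
abbreviated):

    #include <sys/select.h>
    #include <libpq-fe.h>

    /* call after PQputCopyEnd(conn, NULL) has returned 1 */
    static PGresult *
    finish_copy(PGconn *conn)
    {
        int     sock = PQsocket(conn);
        int     rc;

        /* push out any COPY-end data still queued in libpq's buffer */
        while ((rc = PQflush(conn)) != 0)
        {
            fd_set  rfds,
                    wfds;

            if (rc < 0)
                return NULL;        /* error; see PQerrorMessage(conn) */

            FD_ZERO(&rfds);
            FD_ZERO(&wfds);
            FD_SET(sock, &rfds);
            FD_SET(sock, &wfds);
            if (select(sock + 1, &rfds, &wfds, NULL, NULL) < 0)
                return NULL;
            if (FD_ISSET(sock, &rfds) && !PQconsumeInput(conn))
                return NULL;
        }

        /* now wait for the server's response to arrive in full */
        while (PQisBusy(conn))
        {
            fd_set  rfds;

            FD_ZERO(&rfds);
            FD_SET(sock, &rfds);
            if (select(sock + 1, &rfds, NULL, NULL, NULL) < 0)
                return NULL;
            if (!PQconsumeInput(conn))
                return NULL;
        }

        return PQgetResult(conn);   /* expect PGRES_COMMAND_OK */
    }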

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] popen and pclose redefinitions causing many warning in Windows build

2014-05-08 Thread Andrew Dunstan


On 05/08/2014 12:14 PM, Tom Lane wrote:

Andrew Dunstan and...@dunslane.net writes:

I'm pretty sure we need this on Mingw - this SYSTEMQUOTE stuff dates
back well before 8.3, IIRC, which is when we first got full MSVC support.

I tried googling for some info on this, and got a number of hits
suggesting that mingw didn't emulate popen at all till pretty recently.
For instance this:
https://lists.fedoraproject.org/pipermail/fedora-mingw/2009-September/002087.html
Jones is an ex-coworker of mine, and I'm pretty sure that if he said
it wasn't there then it wasn't there.

So I'm confused about how this worked at all in mingw back-when.



Mingw didn't, possibly, but we didn't rely on that. The MS C runtime 
library (msvcrt.dll) has had _popen() forever, and we had this for 
Windows in PG 8.0 and have it still in all the back branches AFAICT:


   #define popen(a,b) _popen(a,b)
   #define pclose(a) _pclose(a)


cheers

andrew




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PQputCopyEnd doesn't adhere to its API contract

2014-05-08 Thread Robert Haas
On Thu, May 8, 2014 at 12:53 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 What I'm now thinking I need to do is something like this:

 1. If PQputCopyEnd returns -1, error.
  2. while ((rc = PQflush(conn)) != 0) { if (rc < 0) { error; } else {
 wait for socket to become read-ready or write-ready; } }
 3. while (PQisBusy(conn)) { wait for the socket to become read-ready; }
 4. PQgetResult()

 Does that sound right?

 Yeah, more or less --- I think you need a PQconsumeInput there somewhere.

 There is a PQflush call in PQconsumeInput that is intended to keep clients
 from having to do that for themselves; but I'm not sure that it helps,
 since clients probably only call PQconsumeInput when the socket is
 read-ready --- and it wouldn't be in this situation.

OK.  It still seems to me that there's a doc fix needed here, if
nothing else.  The documentation for PQputCopyEnd() clearly implies
that if you get a return value of 1, the message is sent, and that's
just not true.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.4 release notes

2014-05-08 Thread MauMau

From: Bruce Momjian br...@momjian.us

I have completed the initial version of the 9.4 release notes.  You can
view them here:

http://www.postgresql.org/docs/devel/static/release-9-4.html

Feedback expected and welcomed.  I expect to be modifying this until we
release 9.4 final.  I have marked items where I need help with question
marks.


Could you add the following item, "client-only installation on Windows", if 
it's appropriate for the release notes?  It will be useful for those like 
EnterpriseDB who develop products derived from PostgreSQL.


https://commitfest.postgresql.org/action/patch_view?id=1326

Regards
MauMau



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PQputCopyEnd doesn't adhere to its API contract

2014-05-08 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 OK.  It still seems to me that there's a doc fix needed here, if
 nothing else.  The documentation for PQputCopyEnd() clearly implies
 that if you get a return value of 1, the message is sent, and that's
 just not true.

That's fair.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [WIP] showing index maintenance on EXPLAIN

2014-05-08 Thread Jaime Casanova
On Thu, May 8, 2014 at 10:41 AM, Robert Haas robertmh...@gmail.com wrote:
 On Thu, May 8, 2014 at 2:31 AM, Jaime Casanova ja...@2ndquadrant.com wrote:
 On Wed, May 7, 2014 at 10:52 PM, Amit Kapila amit.kapil...@gmail.com wrote:
 On Thu, May 8, 2014 at 5:30 AM, Jaime Casanova ja...@2ndquadrant.com 
 wrote:
 Hi,

 This patch implements $subject only when ANALYZE and VERBOSE are on.
 I made it that way because for years nobody seemed interested in this
  info (at least no one did it) so i decided that maybe it is too much
 information for most people (actually btree indexes are normally very
 fast).


 I would have expected the information about index maintenance times to
 be associated with the Insert node, not the plan overall.  IIUC, you
 could have more than one such node if, for example, there are
 writeable CTEs involved.


i followed how trigger info is shown in explain. I can try to change
it, if that is what most people prefer.

   QUERY PLAN
-
 Insert on pgbench_accounts (actual time=0.249..0.249 rows=0 loops=1)
   CTE results
 -  Insert on pgbench_accounts pgbench_accounts_1 (actual
time=0.152..0.159 rows=1 loops=1)
   -  Result (actual time=0.003..0.004 rows=1 loops=1)
   -  CTE Scan on results (actual time=0.165..0.174 rows=1 loops=1)
 Trigger trg1 on pgbench_accounts: time=0.033 calls=1
 Trigger trg1 on pgbench_accounts: time=0.059 calls=1
 Total runtime: 0.377 ms
(8 rows)


-- 
Jaime Casanova www.2ndQuadrant.com
Professional PostgreSQL: Soporte 24x7 y capacitación
Phone: +593 4 5107566 Cell: +593 987171157


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.4 release notes

2014-05-08 Thread Peter Geoghegan
On Thu, May 8, 2014 at 10:50 AM, MauMau maumau...@gmail.com wrote:
 Could you add the following item, "client-only installation on Windows", if
 it's appropriate for the release notes?


I think that it is appropriate.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Sending out a request for more buildfarm animals?

2014-05-08 Thread Alvaro Herrera
Tomas Vondra wrote:

 Hmm, with CLOBBER_CACHE_ALWAYS + CLOBBER_FREED_MEMORY the tests take
 ~20h on a single branch/animal. With a single locale (e.g. C) it would
 take ~4h, but we're testing a bunch of additional czech/slovak locales.
 
 The tests are running in sequence (magpie -> treepie -> fulmar) so with all
 6 branches, this would take ~14 days to complete. I don't mind the
 machine is running tests 100% of the time, that's why it's in buildfarm,
 but I'd rather see the failures soon after the commit (and two weeks is
 well over the soon edge, IMHO).
 
 So I'm thinking about how to improve this. I'd like to keep the options
 for all the branches (e.g. not just HEAD, as a few other animals do).

I think doing the CLOBBER_CACHE_ALWAYS + CLOBBER_FREED_MEMORY runs in a
separate set of animals that does a single locale is enough.  It's
pretty dubious that there is any point in doing the CLOBBER stuff in
multiple locales.

Perhaps run these in HEAD only once a day, and do the other branches
once a week, or something like that.

If you add a seventh animal that does the recursive clobber thingy,
running say once a month, that could be useful too.  Since you have
spare CPUs, it shouldn't matter if it takes a whole week to run.

The advantage of doing this in separate animals is that you can run them
in parallel with your regular non-CLOBBER animals (which would run more
frequently.)

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PQputCopyEnd doesn't adhere to its API contract

2014-05-08 Thread Robert Haas
On Thu, May 8, 2014 at 1:51 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 OK.  It still seems to me that there's a doc fix needed here, if
 nothing else.  The documentation for PQputCopyEnd() clearly implies
 that if you get a return value of 1, the message is sent, and that's
 just not true.

 That's fair.

Something like the attached?

I don't really know what to say about 0, since it's never actually
returned, so I left that out.  But I'm up for suggestions.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index e10ccec..d685894 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -5025,12 +5025,12 @@ int PQputCopyEnd(PGconn *conn,
   /para
 
   para
-   The result is 1 if the termination data was sent, zero if it was
-   not sent because the attempt would block (this case is only possible
-   if the connection is in nonblocking mode), or -1 if an error
-   occurred.  (Use functionPQerrorMessage/function to retrieve
-   details if the return value is -1.  If the value is zero, wait for
-   write-ready and try again.)
+   The result is 1 if the termination data was sent, or if the socket
+   is in non-blocking mode and the data was queued but could not be sent
+   because the attempt would block.  (In non-blocking mode, to be certain
+   that the data has been sent, you should call functionPQflush/
+   until it returns zero.)  The result is -1 if an error has occurred.
+   (Use functionPQerrorMessage/function to retrieve details.)
   /para
 
   para

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.5] Custom Plan API

2014-05-08 Thread Stephen Frost
* Simon Riggs (si...@2ndquadrant.com) wrote:
 It would seem normal and natural to have
 
 * pg_joinam catalog table for join methods with a join method API
 Which would include some way of defining which operators/datatypes we
 consider this for, so if PostGIS people come up with some fancy GIS
 join thing, we don't invoke it every time even when its inapplicable.
 I would prefer it if PostgreSQL also had some way to control when the
 joinam was called, possibly with some kind of table_size_threshold on
 the AM tuple, which could be set to =0 to control when this was even
 considered.

It seems useful to think about how we would redefine our existing join
methods using such a structure.  While thinking about that, it seems
like we would worry more about what the operators provide rather than
the specific operators themselves (a la hashing / HashJoin), and I'm not
sure we really care about the data types directly, just about the
operations which we can do on them...

I can see a case for sticking data types into this if we feel that we
have to constrain the path possibilities for some reason, but I'd rather
try and deal with any issues around "it doesn't make sense to do X
because we'll know it'll be really expensive" through the cost model
instead of with a table that defines what's allowed or not allowed.
There may be cases where we get the costing wrong and it's valuable
to be able to tweak cost values on a per-connection basis or for
individual queries.

I don't mean to imply that a 'pg_joinam' table is a bad idea, just that
I'd think of it being defined in terms of what capabilities it requires
of operators and a way for costing to be calculated for it, plus the
actual functions which it provides to implement the join itself (to
include some way to get output suitable for explain, etc..).
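
To make that concrete, a purely hypothetical sketch of the handler such a
pg_joinam row could point at, expressed as a callback struct (none of these
names exist today; the capability flags and signatures are stand-ins):

    #include <stdbool.h>

    /* opaque stand-ins so the sketch is self-contained; real planner types differ */
    typedef struct PlannerInfo PlannerInfo;
    typedef struct RelOptInfo RelOptInfo;
    typedef struct Path Path;
    typedef struct Plan Plan;
    typedef struct ExplainState ExplainState;
    typedef double Cost;

    /* hypothetical join-method handler a pg_joinam row could reference */
    typedef struct JoinMethodRoutine
    {
        const char *name;               /* e.g. "gis_spatial_join" */

        /* capabilities required of the join clauses (stand-in flags) */
        bool        needs_hashable_ops;
        bool        needs_mergeable_ops;

        /* produce startup/total cost estimates for a candidate join */
        void        (*estimate_cost) (PlannerInfo *root,
                                      RelOptInfo *joinrel,
                                      Path *outer_path,
                                      Path *inner_path,
                                      Cost *startup_cost,
                                      Cost *total_cost);

        /* turn the chosen path into an executable plan node */
        Plan       *(*create_plan) (PlannerInfo *root, Path *best_path);

        /* EXPLAIN support, so the trace/stats ideas have somewhere to hook in */
        void        (*explain_node) (ExplainState *es, Plan *plan);
    } JoinMethodRoutine;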

 * pg_scanam catalog table for scan methods with a scan method API
 Again, a list of operators that can be used with it, like indexes and
 operator classes

Ditto for this, but there are lots of other things this makes me wonder
about because it's essentially trying to define a pluggable storage
layer, which is great, but also requires some way to deal with all of the
things we use our storage system for: caching / shared buffers,
locking, visibility, WAL, unique identifier / ctid (for use in indexes,
etc)...

 By analogy to existing mechanisms, we would want
 
 * A USERSET mechanism to allow users to turn it off for testing or
 otherwise, at user, database level

If we re-implement our existing components through this (eat our own
dogfood as it were), I'm not sure that we'd be able to have a way to
turn it on/off..  I realize we wouldn't have to, but then it seems like
we'd have two very different code paths and likely a different level of
support / capability afforded to external storage systems and then I
wonder if we're not back to just FDWs again..

 We would also want
 
 * A startup call that allows us to confirm it is available and working
 correctly, possibly with some self-test for hardware, performance
 confirmation/derivation of planning parameters

Yeah, we'd need this for anything that supports a GPU, regardless of how
we implement it, I'd think.

 * Some kind of trace mode that would allow people to confirm the
 outcome of calls

Seems like this would be useful independently of the rest..

 * Some interface to the stats system so we could track the frequency
 of usage of each join/scan type. This would be done within Postgres,
 tracking the calls by name, rather than trusting the plugin to do it
 for us

This is definitely something I want for core already...

Thanks,

Stephen




Re: [HACKERS] [v9.5] Custom Plan API

2014-05-08 Thread Robert Haas
On Thu, May 8, 2014 at 3:10 PM, Stephen Frost sfr...@snowman.net wrote:
 * Simon Riggs (si...@2ndquadrant.com) wrote:
 It would seem normal and natural to have

 * pg_joinam catalog table for join methods with a join method API
 Which would include some way of defining which operators/datatypes we
 consider this for, so if PostGIS people come up with some fancy GIS
 join thing, we don't invoke it every time even when its inapplicable.
 I would prefer it if PostgreSQL also had some way to control when the
 joinam was called, possibly with some kind of table_size_threshold on
 the AM tuple, which could be set to =0 to control when this was even
 considered.

 It seems useful to think about how we would redefine our existing join
 methods using such a structure.  While thinking about that, it seems
 like we would worry more about what the operators provide rather than
 the specific operators themselves (ala hashing / HashJoin) and I'm not
 sure we really care about the data types directly- just about the
 operations which we can do on them..

I'm pretty skeptical about this whole line of inquiry.  We've only got
three kinds of joins, and each one of them has quite a bit of bespoke
logic, and all of this code is pretty performance-sensitive on large
join nests.  If there's a way to make this work for KaiGai's use case
at all, I suspect something really lightweight like a hook, which
should have negligible impact on other workloads, is a better fit than
something involving system catalog access.  But I might be wrong.

I also think that there are really two separate problems here: getting
the executor to call a custom scan node when it shows up in the plan
tree; and figuring out how to get it into the plan tree in the first
place.  I'm not sure we've properly separated those problems, and I'm
not sure into which category the issues that sunk KaiGai's 9.4 patch
fell.  Most of this discussion seems like it's about the latter
problem, but we need to solve both.  For my money, we'd be better off
getting some kind of basic custom scan node functionality committed
first, even if the cases where you can actually inject them into real
plans are highly restricted.  Then, we could later work on adding more
ways to inject them in more places.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Sending out a request for more buildfarm animals?

2014-05-08 Thread Andrew Dunstan




On 05/08/2014 12:21 PM, Tomas Vondra wrote:

On 6.5.2014 23:01, Tomas Vondra wrote:

On 6.5.2014 22:24, Tom Lane wrote:

Tomas Vondra t...@fuzzy.cz writes:

I recall there was a call for more animals with CLOBBER_CACHE_ALWAYS
some time ago, so I went and enabled that on all three animals. Let's
see how long that will take.
I see there are more 'clobber' options in the code: CLOBBER_FREED_MEMORY
and CLOBBER_CACHE_RECURSIVELY. Would that be a good idea to enable these
as well?
The time requirements will be much higher (especially for the
RECURSIVELY option), but running that once a week shouldn't be a big
deal - the machine is pretty much dedicated to the buildfarm.

I've never had the patience to run the regression tests to completion
with CLOBBER_CACHE_RECURSIVELY at all, let alone do it on a regular
basis. (I wonder if there's some easy way to run it for just a few
regression tests...)

Now, that's a challenge ;-)


I think testing CLOBBER_FREED_MEMORY would be sensible though.

OK, I've enabled this for now.

Hmm, with CLOBBER_CACHE_ALWAYS + CLOBBER_FREED_MEMORY the tests take
~20h on a single branch/animal. With a single locale (e.g. C) it would
take ~4h, but we're testing a bunch of additional czech/slovak locales.

The tests are running in sequence (magpie -> treepie -> fulmar) so with all
6 branches, this would take ~14 days to complete. I don't mind the
machine is running tests 100% of the time, that's why it's in buildfarm,
but I'd rather see the failures soon after the commit (and two weeks is
well over the soon edge, IMHO).

So I'm thinking about how to improve this. I'd like to keep the options
for all the branches (e.g. not just HEAD, as a few other animals do).
But I'm thinking about running the tests in parallel, somehow - the
machine has 4 cores, and most of the time only one of them is used. I
don't expect a perfect ~3x speedup, but getting ~2x would be nice.

Any recommendations on how to do that? I see there's 'base_port' in the
config - is it enough to tweak this, or do I need to run the animals
separately using e.g. lxc?



Here is what I do on my FreeBSD VM. I have 2 animals, nightjar and 
friarbird. They have the same buildroot. friarbird is set up to build 
with CLOBBER_CACHE_ALWAYS, building just HEAD and just testing C locale; 
nightjar builds all branches we are interested in and tests locale 
cs_CZ.utf8 in addition to C.


Other than those differences they are pretty similar.

Here is the crontab that drives them:

   27 5-22 * * * cd bf && ./run_branches.pl --run-all --verbose
   --config=nightjarx.conf > bf.out 2>&1
   20 0 * * * cd bf && ./run_branches.pl --run-all --verbose
   --config=friarbird.conf --skip-steps=install-check > bf.out 2>&1


The buildfarm code has enough locking smarts to make sure we don't get 
any build collisions doing this.


If you have an animal to do a special type of build (e.g. CLOBBER_foo) 
then it's probably a good idea to set a note for that animal - see the 
buildfarm program setnotes.pl. friarbird has the note set "Uses 
-DCLOBBER_CACHE_ALWAYS".



If you want to do this in parallel, then you will need different 
buildroots and different base ports for each animal. I would not run the 
same animal on different branches concurrently, that is quite likely to 
end up in port collisions.



HTH.

cheers

andrew




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.5] Custom Plan API

2014-05-08 Thread Simon Riggs
On 8 May 2014 20:40, Robert Haas robertmh...@gmail.com wrote:

 For my money, we'd be better off
 getting some kind of basic custom scan node functionality committed
 first, even if the cases where you can actually inject them into real
 plans are highly restricted.  Then, we could later work on adding more
 ways to inject them in more places.

We're past the prototyping stage and into productionising what we know
works, AFAIK. If that point is not clear, then we need to discuss that
first.

At the moment the Custom join hook is called every time we attempt to
cost a join, with no restriction.

I would like to highly restrict this, so that we only consider a
CustomJoin node when we have previously said one might be usable and
the user has requested this (e.g. enable_foojoin = on)

We only consider merge joins if the join uses operators with oprcanmerge=true.
We only consider hash joins if the join uses operators with oprcanhash=true

So it seems reasonable to have a way to define/declare what is
possible and what is not. But my take is that adding a new column to
pg_operator for every CustomJoin node is probably out of the question,
hence my suggestion to list the operators we know it can work with.
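
For reference, the existing capability flags mentioned above are visible at
the SQL level. A minimal illustrative query against the standard catalogs
(output will vary by installation):

-- Built-in "=" operators and whether the planner may merge-join or
-- hash-join on them, according to the existing pg_operator flags.
SELECT oprname,
       oprleft::regtype  AS lefttype,
       oprright::regtype AS righttype,
       oprcanmerge,
       oprcanhash
FROM pg_operator
WHERE oprname = '='
LIMIT 10;

A comparable per-provider flag for every CustomJoin node would mean growing
the catalog, which is the cost being weighed here.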

Given that everything else in Postgres is agnostic and configurable,
I'm looking to do the same here.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Sending out a request for more buildfarm animals?

2014-05-08 Thread Alvaro Herrera
Andrew Dunstan wrote:

 Here is what I do on my FreeBSD VM. I have 2 animals, nightjar and
 friarbird. They have the same buildroot. friarbird is set up to
 build with CLOBBER_CACHE_ALWAYS, building just HEAD and just testing
 C locale; nightjar builds all branches we are interested in and
 tests locale cs_CZ.utf8 in addition to C.

So nightjar would build frequently almost all the time, but as soon as
friarbird is doing a CLOBBER run nightjar would just stop running?  That
seems a bit odd, given that the CLOBBER runs take a lot longer than
non-CLOBBER ones.  (I guess it makes sense if you don't want to devote
double CPU time now and then to running the CLOBBER animal, but other
than that there doesn't seem to be a point to setting it up like that.)

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.5] Custom Plan API

2014-05-08 Thread Simon Riggs
On 8 May 2014 20:10, Stephen Frost sfr...@snowman.net wrote:

 * A USERSET mechanism to allow users to turn it off for testing or
 otherwise, at user, database level

 If we re-implement our existing components through this (eat our own
 dogfood as it were), I'm not sure that we'd be able to have a way to
 turn it on/off..  I realize we wouldn't have to, but then it seems like
 we'd have two very different code paths and likely a different level of
 support / capability afforded to external storage systems and then I
 wonder if we're not back to just FDWs again..

We have SET enable_hashjoin = on | off

I would like a way to do the equivalent of SET enable_mycustomjoin =
off so that when it starts behaving weirdly in production, I can turn
it off so we can prove that is not the cause, or keep it turned off if
it's a problem. I don't want to have to call a hook and let the hook
decide whether it can be turned off or not.

Postgres should be in control of the plugin, not give control to the
plugin every time and hope it gives us control back.
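
As a concrete sketch of the behaviour being asked for (the table names here
are illustrative, and enable_mycustomjoin is hypothetical):

-- The existing switches already work this way:
SET enable_hashjoin = off;   -- planner now strongly discourages hash joins
EXPLAIN SELECT * FROM t1 JOIN t2 USING (id);
RESET enable_hashjoin;

-- The equivalent knob being requested for a custom join provider:
-- SET enable_mycustomjoin = off;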

(I'm trying to take the FDW isn't the right way line of thinking to
its logical conclusions, so we can decide).

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.5] Custom Plan API

2014-05-08 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 I'm pretty skeptical about this whole line of inquiry.  We've only got
 three kinds of joins, and each one of them has quite a bit of bespoke
 logic, and all of this code is pretty performance-sensitive on large
 join nests.  If there's a way to make this work for KaiGai's use case
 at all, I suspect something really lightweight like a hook, which
 should have negligible impact on other workloads, is a better fit than
 something involving system catalog access.  But I might be wrong.

We do a great deal of catalog consultation already during planning,
so I think a few more wouldn't be a problem, especially if the planner
is smart enough to touch the catalogs just once (per query?) and cache
the results.  However, your point about lots of bespoke logic is dead
on, and I'm afraid it's damn near a fatal objection.  As just one example,
if we did not have merge joins then an awful lot of what the planner does
with path keys simply wouldn't exist, or at least would look a lot
different than it does.  Without that infrastructure, I can't imagine
that a plugin approach would be able to plan mergejoins anywhere near as
effectively.  Maybe there's a way around this issue, but it sure won't
just be a pg_am-like API.

 I also think that there are really two separate problems here: getting
 the executor to call a custom scan node when it shows up in the plan
 tree; and figuring out how to get it into the plan tree in the first
 place.  I'm not sure we've properly separated those problems, and I'm
 not sure into which category the issues that sunk KaiGai's 9.4 patch
 fell.

I thought that the executor side of his patch wasn't in bad shape.  The
real problems were in the planner, and indeed largely in the backend
part of the planner where there's a lot of hard-wired logic for fixing up
low-level details of the constructed plan tree.  It seems like in
principle it might be possible to make that logic cleanly extensible,
but it'll likely take a major rewrite.  The patch tried to skate by with
just exposing a bunch of internal functions, which I don't think is a
maintainable approach, either for the core or for the extensions using it.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Sending out a request for more buildfarm animals?

2014-05-08 Thread Andrew Dunstan


On 05/08/2014 04:09 PM, Alvaro Herrera wrote:

Andrew Dunstan wrote:


Here is what I do on my FreeBSD VM. I have 2 animals, nightjar and
friarbird. They have the same buildroot. friarbird is set up to
build with CLOBBER_CACHE_ALWAYS, building just HEAD and just testing
C locale; nightjar builds all branches we are interested in and
tests locale cs_CZ.utf8 in addition to C.

So nightjar would build frequently almost all the time, but as soon as
friarbird is doing a CLOBBER run nightjar would just stop running?  That
seems a bit odd, given that the CLOBBER runs take a lot longer than
non-CLOBBER ones.  (I guess it makes sense if you don't want to devote
double CPU time now and then to running the CLOBBER animal, but other
than that there doesn't seem to be a point to setting it up like that.)




Why? This was actually discussed when I set this up and Tom opined that 
a once a day run with CLOBBER_CACHE_ALWAYS was plenty. It takes about 4 
/12 hours. The rest of the time nightjar runs. friarbird runs a bit 
after midnight US East Coast time, which is generally a slowish time for 
commits, so not running nightjar at that time seems perfectly reasonable.


I really don't get what your objection to the setup is. And no, I don't 
want them to run concurrently, I'd rather spread out the cycles.


cheers

andrew


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.5] Custom Plan API

2014-05-08 Thread Tom Lane
Simon Riggs si...@2ndquadrant.com writes:
 On 8 May 2014 20:40, Robert Haas robertmh...@gmail.com wrote:
 For my money, we'd be better off
 getting some kind of basic custom scan node functionality committed
 first, even if the cases where you can actually inject them into real
 plans are highly restricted.  Then, we could later work on adding more
 ways to inject them in more places.

 We're past the prototyping stage and into productionising what we know
 works, AFAIK. If that point is not clear, then we need to discuss that
 first.

OK, I'll bite: what here do we know works?  Not a damn thing AFAICS;
it's all speculation that certain hooks might be useful, and speculation
that's not supported by a lot of evidence.  If you think this isn't
prototyping, I wonder what you think *is* prototyping.

It seems likely to me that our existing development process is not
terribly well suited to developing a good solution in this area.
We need to be able to try some things and throw away what doesn't
work; but the project's mindset is not conducive to throwing features
away once they've appeared in a shipped release.  And the other side
of the coin is that trying these things is not inexpensive: you have
to write some pretty serious code before you have much of a feel for
whether a planner hook API is actually any good.  So by the time
you've built something of the complexity of, say, contrib/postgres_fdw,
you don't really want to throw that away in the next major release.
And that's at the bottom end of the scale of the amount of work that'd
be needed to do anything with the sorts of interfaces we're discussing.

So I'm not real sure how we move forward.  Maybe something to brainstorm
about in Ottawa.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.5] Custom Plan API

2014-05-08 Thread Tom Lane
Simon Riggs si...@2ndquadrant.com writes:
 We only consider merge joins if the join uses operators with oprcanmerge=true.
 We only consider hash joins if the join uses operators with oprcanhash=true

 So it seems reasonable to have a way to define/declare what is
 possible and what is not. But my take is that adding a new column to
 pg_operator for every CustomJoin node is probably out of the question,
 hence my suggestion to list the operators we know it can work with.

For what that's worth, I'm not sure that either the oprcanmerge or
oprcanhash columns really pull their weight.  We could dispense with both
at the cost of doing some wasted lookups in pg_amop.  (Perhaps we should
replace them with a single oprisequality column, which would amount to
a hint that it's worth looking for hash or merge properties, or for other
equality-ish properties in future.)

So I think something comparable to an operator class is indeed a better
approach than adding more columns to pg_operator.  Other than the
connection to pg_am, you could pretty nearly just use the operator class
infrastructure as-is for a lot of operator-property things like this.
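
To make that concrete: the information is already discoverable from pg_amop
today. A rough illustrative query for hash-joinable equality operators (the
kind of lookup being described, not a proposal for any particular syntax):

-- Operators registered as the equality member (strategy 1) of some hash
-- operator family; these are the operators a hash join can use.
SELECT ao.amopopr::regoperator AS operator
FROM pg_amop ao
JOIN pg_opfamily opf ON opf.oid = ao.amopfamily
JOIN pg_am am ON am.oid = opf.opfmethod
WHERE am.amname = 'hash'
  AND ao.amopstrategy = 1
LIMIT 10;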

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Sending out a request for more buildfarm animals?

2014-05-08 Thread Andrew Dunstan


On 05/08/2014 04:54 PM, Andrew Dunstan wrote:




Why? This was actually discussed when I set this up and Tom opined 
that a once a day run with CLOBBER_CACHE_ALWAYS was plenty. It takes 
about 4 /12 hours. The rest of the time nightjar runs. friarbird runs 
a bit after midnight US East Coast time, which is generally a slowish 
time for commits, so not running nightjar at that time seems perfectly 
reasonable.





er, that's 4 1/2 hours.

cheers

andrew



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PQputCopyEnd doesn't adhere to its API contract

2014-05-08 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Thu, May 8, 2014 at 1:51 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 OK.  It still seems to me that there's a doc fix needed here, if
 nothing else.  The documentation for PQputCopyEnd() clearly implies
 that if you get a return value of 1, the message is sent, and that's
 just not true.

 That's fair.

 Something like the attached?

 I don't really know what to say about 0, since it's never actually
 returned, so I left that out.  But I'm up for suggestions.

I think you should leave the existing verbiage about zero alone.
If we take it out, we're painting ourselves into a corner about
ever fixing the buffer-is-too-full failure case.

Perhaps the text should be like this:

The result is 1 if the termination message was sent; or in nonblocking
mode, this may only indicate that the termination message was successfully
queued.  (In nonblocking mode, to be certain that the data has been sent,
you should next wait for write-ready and call <function>PQflush</>,
repeating until it returns zero.)  Zero indicates that the function could
not queue the termination message because of full buffers; this will only
happen in nonblocking mode.  (In this case, wait for write-ready and try
the PQputCopyEnd call again.)  If a hard error occurs, -1 is returned; you
can use <function>PQerrorMessage</function> to retrieve details.

If it bothers you to document behavior that's not actually there yet,
maybe you should go fix the bug ;-).  I had thought this would require
a lot of refactoring, but it might work to just check if there's enough
buffer space for the message and return zero immediately if not.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.5] Custom Plan API

2014-05-08 Thread Simon Riggs
On 8 May 2014 21:55, Tom Lane t...@sss.pgh.pa.us wrote:

 So I'm not real sure how we move forward.  Maybe something to brainstorm
 about in Ottawa.

I'm just about to go on away for a week, so that's probably the best
place to leave (me out of) the discussion until Ottawa.

I've requested some evidence this hardware route is worthwhile from my
contacts, so we'll see what we get. Presumably Kaigai has something to
share already also.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Sending out a request for more buildfarm animals?

2014-05-08 Thread Alvaro Herrera
Andrew Dunstan wrote:

 I really don't get what your objection to the setup is. And no, I
 don't want them to run concurrently, I'd rather spread out the
 cycles.

I wasn't objecting, merely an observation.  Note that Tomas mentioned
he's okay with running 4 builds at once.  My main point here, really, is
that having a larger number of animals shouldn't be an impediment for a
more complex permutation of configurations, if he's okay with doing
that.  I assume you wouldn't object to my approving four extra animals
running on the same machine, if Tomas wants to go for that.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.5] Custom Plan API

2014-05-08 Thread Peter Geoghegan
On Thu, May 8, 2014 at 6:34 AM, Kouhei Kaigai kai...@ak.jp.nec.com wrote:
 Umm... I'm now missing the direction towards my goal.
 What approach is the best way to glue PostgreSQL and PGStrom?

I haven't really paid any attention to PGStrom. Perhaps it's just that
I missed it, but I would find it useful if you could direct me towards
a benchmark or something like that, that demonstrates a representative
scenario in which the facilities that PGStrom offers are compelling
compared to traditional strategies already implemented in Postgres and
other systems.

If I wanted to make joins faster, personally, I would look at
opportunities to optimize our existing hash joins to take better
advantage of modern CPU characteristics. A lot of the research
suggests that it may be useful to implement techniques that take
better advantage of available memory bandwidth through techniques like
prefetching and partitioning, perhaps even (counter-intuitively) at
the expense of compute bandwidth. It's possible that it just needs to
be explained to me, but, with respect, intuitively I have a hard time
imagining that offloading joins to the GPU will help much in the
general case. Every paper on joins from the last decade talks a lot
about memory bandwidth and memory latency. Are you concerned with some
specific case that I may have missed? In what scenario might a
cost-based optimizer reasonably prefer a custom join node implemented
by PgStrom, over any of the existing join node types? It's entirely
possible that I simply missed relevant discussions here.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Sending out a request for more buildfarm animals?

2014-05-08 Thread Andrew Dunstan


On 05/08/2014 05:21 PM, Alvaro Herrera wrote:

Andrew Dunstan wrote:


I really don't get what your objection to the setup is. And no, I
don't want them to run concurrently, I'd rather spread out the
cycles.

I wasn't objecting, merely an observation.  Note that Tomas mentioned
he's okay with running 4 builds at once.  My main point here, really, is
that having a larger number of animals shouldn't be an impediment for a
more complex permutation of configurations, if he's okay with doing
that.  I assume you wouldn't object to my approving four extra animals
running on the same machine, if Tomas wants to go for that.




No, that's fine.

cheers

andrew


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] default opclass for jsonb (was Re: Call for GIST/GIN/SP-GIST opclass documentation)

2014-05-08 Thread Tom Lane
I wrote:
 I think the idea of hashing only keys/values that are too long is a
 reasonable compromise.  I've not finished coding it (because I keep
 getting distracted by other problems in the code :-() but it does not
 look to be very difficult.  I'm envisioning the cutoff as being something
 like 128 bytes; in practice that would mean that few if any keys get
 hashed, I think.

Attached is a draft patch for this.  In addition to the hash logic per se,
I made these changes:

* Replaced the K/V prefix bytes with a code that distinguishes the types
of JSON values.  While this is not of any huge significance for the
current index search operators, it's basically free to store the info,
so I think we should do it for possible future use.

* Fixed the problem with exists returning rows it shouldn't.  I
concluded that the best fix is just to force recheck for exists, which
allows considerable simplification in the consistent functions.

* Tried to improve the comments in jsonb_gin.c.

Barring objections I'll commit this tomorrow, and also try to improve the
user-facing documentation about the jsonb opclasses.
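
(For readers following along, a minimal sketch of the opclass in question;
the table and index names are made up:)

CREATE TABLE docs (doc jsonb);
CREATE INDEX docs_doc_idx ON docs USING gin (doc);  -- default jsonb_ops opclass
-- Queries like these can use the index; with the patch, keys and values
-- longer than the cutoff (~128 bytes) are hashed rather than stored
-- verbatim in the index entries.
SELECT count(*) FROM docs WHERE doc @> '{"tag": "news"}';
SELECT count(*) FROM docs WHERE doc ? 'tag';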

regards, tom lane

diff --git a/src/backend/utils/adt/jsonb_gin.c b/src/backend/utils/adt/jsonb_gin.c
index 592036a..2c4ade2 100644
*** a/src/backend/utils/adt/jsonb_gin.c
--- b/src/backend/utils/adt/jsonb_gin.c
***
*** 14,19 
--- 14,20 
  #include "postgres.h"
  
  #include "access/gin.h"
+ #include "access/hash.h"
  #include "access/skey.h"
  #include "catalog/pg_collation.h"
  #include "catalog/pg_type.h"
*** typedef struct PathHashStack
*** 26,39 
  	struct PathHashStack *parent;
  } PathHashStack;
  
! static text *make_text_key(const char *str, int len, char flag);
! static text *make_scalar_key(const JsonbValue *scalarVal, char flag);
  
  /*
   *
   * jsonb_ops GIN opclass support functions
   *
   */
  Datum
  gin_compare_jsonb(PG_FUNCTION_ARGS)
  {
--- 27,41 
  	struct PathHashStack *parent;
  } PathHashStack;
  
! static Datum make_text_key(char flag, const char *str, int len);
! static Datum make_scalar_key(const JsonbValue *scalarVal, bool is_key);
  
  /*
   *
   * jsonb_ops GIN opclass support functions
   *
   */
+ 
  Datum
  gin_compare_jsonb(PG_FUNCTION_ARGS)
  {
*** gin_extract_jsonb(PG_FUNCTION_ARGS)
*** 65,144 
  {
  	Jsonb	   *jb = (Jsonb *) PG_GETARG_JSONB(0);
  	int32	   *nentries = (int32 *) PG_GETARG_POINTER(1);
- 	Datum	   *entries = NULL;
  	int			total = 2 * JB_ROOT_COUNT(jb);
- 	int			i = 0,
- r;
  	JsonbIterator *it;
  	JsonbValue	v;
  
  	if (total == 0)
  	{
  		*nentries = 0;
  		PG_RETURN_POINTER(NULL);
  	}
  
  	entries = (Datum *) palloc(sizeof(Datum) * total);
  
  	it = JsonbIteratorInit(&jb->root);
  
  	while ((r = JsonbIteratorNext(&it, &v, false)) != WJB_DONE)
  	{
  		if (i >= total)
  		{
  			total *= 2;
  			entries = (Datum *) repalloc(entries, sizeof(Datum) * total);
  		}
  
- 		/*
- 		 * Serialize keys and elements equivalently,  but only when elements
- 		 * are Jsonb strings.  Otherwise, serialize elements as values.  Array
- 		 * elements are indexed as keys, for the benefit of
- 		 * JsonbExistsStrategyNumber.  Our definition of existence does not
- 		 * allow for checking the existence of a non-jbvString element (just
- 		 * like the definition of the underlying operator), because the
- 		 * operator takes a text rhs argument (which is taken as a proxy for
- 		 * an equivalent Jsonb string).
- 		 *
- 		 * The way existence is represented does not preclude an alternative
- 		 * existence operator, that takes as its rhs value an arbitrarily
- 		 * internally-typed Jsonb.  The only reason that isn't the case here
- 		 * is that the existence operator is only really intended to determine
- 		 * if an object has a certain key (object pair keys are of course
- 		 * invariably strings), which is extended to jsonb arrays.  You could
- 		 * think of the default Jsonb definition of existence as being
- 		 * equivalent to a definition where all types of scalar array elements
- 		 * are keys that we can check the existence of, while just forbidding
- 		 * non-string notation.  This inflexibility prevents the user from
- 		 * having to qualify that the rhs string is a raw scalar string (that
- 		 * is, naturally no internal string quoting in required for the text
- 		 * argument), and allows us to not set the reset flag for
- 		 * JsonbExistsStrategyNumber, since we know that keys are strings for
- 		 * both objects and arrays, and don't have to further account for type
- 		 * mismatch.  Not having to set the reset flag makes it less than
- 		 * tempting to tighten up the definition of existence to preclude
- 		 * array elements entirely, which would arguably be a simpler
- 		 * alternative. In any case the infrastructure used to implement the
- 		 * existence operator could trivially support this hypothetical,
- 		 * slightly distinct definition of existence.
- 		 */
  		switch (r)
  		{
  			

Re: [HACKERS] PQputCopyEnd doesn't adhere to its API contract

2014-05-08 Thread David G Johnston
Tom Lane-2 wrote
 Robert Haas <robertmhaas@> writes:
 On Thu, May 8, 2014 at 1:51 PM, Tom Lane <tgl@.pa> wrote:
 Robert Haas <robertmhaas@> writes:
 OK.  It still seems to me that there's a doc fix needed here, if
 nothing else.  The documentation for PQputCopyEnd() clearly implies
 that if you get a return value of 1, the message is sent, and that's
 just not true.
 
 That's fair.
 
 Something like the attached?
 
 I don't really know what to say about 0, since it's never actually
 returned, so I left that out.  But I'm up for suggestions.
 
 I think you should leave the existing verbiage about zero alone.
 If we take it out, we're painting ourselves into a corner about
 ever fixing the buffer-is-too-full failure case.
 
 Perhaps the text should be like this:
 
 The result is 1 if the termination message was sent; or in nonblocking
 mode, this may only indicate that the termination message was successfully
 queued.  (In nonblocking mode, to be certain that the data has been sent,
 you should next wait for write-ready and call <function>PQflush</>,
 repeating until it returns zero.)  Zero indicates that the function could
 not queue the termination message because of full buffers; this will only
 happen in nonblocking mode.  (In this case, wait for write-ready and try
 the PQputCopyEnd call again.)  If a hard error occurs, -1 is returned; you
 can use <function>PQerrorMessage</function> to retrieve details.
 
 If it bothers you to document behavior that's not actually there yet,
 maybe you should go fix the bug ;-).  I had thought this would require
 a lot of refactoring, but it might work to just check if there's enough
 buffer space for the message and return zero immediately if not.

Had trouble comprehending Tom's wording so decided to word-smith a little:

The return value can be one of [-1, 0 , 1] whose meaning depends on whether
you are in blocking or non-blocking mode.  
In blocking mode a 1 means success and -1 means permanent failure; it is not
possible to receive a 0 in blocking mode - all calls will either succeed (1)
or fail (-1).
In non-blocking mode a 1 means that the termination message was placed in
the queue while a -1 means an immediate and permanent failure. A return
value of 0 means that the queue was full and that you need to try again at
the next wait-ready.
[?]While in non-blocking mode once you have received a return value of 1 you
will need to continually poll, at each wait-ready, <function>PQflush</>
until it returns 0 (queue empty) or -1 (failure).
The error associated with the -1 return can be retrieved using
<function>PQerrorMessage</>

[?]Isn't the whole PQflush comment simply repeating what is written
elsewhere?  The requirement is standard when working in non-blocking mode.

It does seem like the buffer full case would be fairly direct to
resolve...and I'm not sure what the explanation would be, in the doc above,
as to why 0 is not returned by the current implementation because (the
queue can never be full, or, even in non-blocking mode the system will block
to add to the queue in this function)

David J.





--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/PQputCopyEnd-doesn-t-adhere-to-its-API-contract-tp5803240p5803297.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Sending out a request for more buildfarm animals?

2014-05-08 Thread Tomas Vondra
On 8.5.2014 23:48, Andrew Dunstan wrote:
 
 On 05/08/2014 05:21 PM, Alvaro Herrera wrote:
 Andrew Dunstan wrote:

 I really don't get what your objection to the setup is. And no, I
 don't want them to run concurrently, I'd rather spread out the
 cycles.
 I wasn't objecting, merely an observation. Note that Tomas
 mentioned he's okay with running 4 builds at once. My main point
 here, really, is that having a larger number of animals shouldn't
 be an impediment for a more complex permutation of configurations,
 if he's okay with doing that. I assume you wouldn't object to my
 approving four extra animals running on the same machine, if Tomas
 wants to go for that.

So, if I get this right, the proposal is to have 7 animals:


1) all branches/locales, frequent builds (every few hours)
  magpie  - gcc
  fulmar  - icc
  treepie - clang

2) single branch/locale, CLOBBER, built once a week
  magpie2 - gcc
  fulmar2 - icc
  treepie2 - clang

3) single branch/locale, recursive CLOBBER, built once a month


I don't particularly mind the number of animals, although I was shooting
for lower number.

The only question is - should we use 3 animals for the recursive CLOBBER
too? I mean, one for each compiler?

regards
Tomas


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Issue with GRANT/COMMENT ON FUNCTION with default

2014-05-08 Thread Jim Nasby
On 5/5/14, 3:22 PM, Alvaro Herrera wrote: 
 Jim Nasby wrote: 
 Prior to default parameters on functions, GRANT and COMMENT accepted full 
 parameter syntax. IE: 
 
 GRANT EXECUTE ON test(t text) TO public 
 
 as opposed to regprocedure, which only accepts the data types ( test(text), 
 not test(t text) ). 
 
 They do not accept DEFAULT though: 
 
 GRANT EXECUTE ON FUNCTION test(t text DEFAULT '') to public; 
 ERROR:  syntax error at or near "DEFAULT" 
 LINE 1: GRANT EXECUTE ON FUNCTION test(t text DEFAULT '') to public; 
 
 Presumably this is just an oversight? 
 
 I have to say that accepting DEFAULT there seems pretty odd to me.  What 
 if you specify the wrong default?  Do you get a no such function 
 error?  That would be pretty unhelpful.  But then accepting it ignoring 
 the fact that the default is wrong would be rather strange. 

We already have that exact problem with the name of the argument. 

decibel@decina.cashnetusa=# CREATE FUNCTION test(t text default '') RETURNS 
text LANGUAGE sql AS 'SELECT $1'; 
CREATE FUNCTION 
decibel@decina.cashnetusa=# GRANT EXECUTE ON FUNCTION test(baz text) to public; 
GRANT 
decibel@decina.cashnetusa=# 

 Related to that, is it intentional that the regprocedure cast 
 disallows *any* decorators to the function, other than type? If 
 regprocedure at least accepted the full function parameter definition 
 you could use it to get a definitive reference to a function. 
 
 Does pg_identify_object() give you what you want? 

No, because I'd need an OID, and I have no way of reliably getting that because 
the regprocedure cast won't even accept additional information beyond data type.
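
To make the limitation concrete, a small sketch using the test() function
from above:

SELECT 'test(text)'::regprocedure;    -- works: data types only
SELECT 'test(t text)'::regprocedure;  -- fails: parameter names (and defaults)
                                      -- are not accepted by regprocedure
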
-- 
Jim C. Nasby, Data Architect   j...@nasby.net
512.569.9461 (cell) http://jim.nasby.net



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Compilation errors with mingw build caused by undefined optreset

2014-05-08 Thread Michael Paquier
On Fri, May 9, 2014 at 12:55 AM, Andrew Dunstan and...@dunslane.net wrote:
 On 05/08/2014 11:30 AM, Tom Lane wrote:
 Michael Paquier michael.paqu...@gmail.com writes:

 Since commit 60ff2fd introducing the centralized getopt-related
 things in a global header file, build on Windows with mingw is failing

 Hm, buildfarm member narwhal doesn't seem to be having any such problem.
 (It's got its own issues, but not this.)
 jacana and frogmouth are also Mingw animals, and are not having issues.
Seems like the version I am using, which is the one actually specified
on the wiki here (
http://www.postgresql.org/docs/devel/static/installation-platform-notes.html#INSTALLATION-NOTES-MINGW)
is able to reproduce that...
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Compilation errors with mingw build caused by undefined optreset

2014-05-08 Thread Michael Paquier
On Fri, May 9, 2014 at 12:30 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Michael Paquier michael.paqu...@gmail.com writes:
 Since commit 60ff2fd introducing the centralized getopt-related
 things in a global header file, build on Windows with mingw is failing

 Hm, buildfarm member narwhal doesn't seem to be having any such problem.
 (It's got its own issues, but not this.)

 Do you think you could put up a buildfarm critter using whatever version
 of mingw you're using?  Without buildfarm feedback, things *will* break
 regularly; Windows is just too weird to expect otherwise.
Yeah, after sleeping on this I had the same thought. I'll try to get
one working properly.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: popen and pclose redefinitions causing many warning in Windows build

2014-05-08 Thread Noah Misch
On Thu, May 08, 2014 at 12:14:44PM -0400, Tom Lane wrote:
 Andrew Dunstan and...@dunslane.net writes:
  I'm pretty sure we need this on Mingw - this SYSTEMQUOTE stuff dates 
  back well before 8.3, IIRC, which is when we first got full MSVC support.
 
 I tried googling for some info on this, and got a number of hits
 suggesting that mingw didn't emulate popen at all till pretty recently.
 For instance this:
 https://lists.fedoraproject.org/pipermail/fedora-mingw/2009-September/002087.html
 Jones is an ex-coworker of mine, and I'm pretty sure that if he said
 it wasn't there then it wasn't there.

I doubt MinGW has overridden popen() at runtime; that would be contrary to its
design criteria.  The headers, however, are MinGW territory.  MinGW declares
both _popen() and popen() as functions.  MinGW-w64, a project more distinct
from MinGW than it sounds, uses #define popen _popen:

MinGW: 
http://sourceforge.net/p/mingw/mingw-org-wsl/ci/master/tree/include/stdio.h#l467
MinGW-w64: 
http://sourceforge.net/p/mingw-w64/code/HEAD/tree/trunk/mingw-w64-headers/crt/stdio.h#l496

Building with any recent MinGW-w64, 32-bit or 64-bit, gets the reported
warnings; building with MinGW proper does not.

-- 
Noah Misch
EnterpriseDB http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.4 release notes

2014-05-08 Thread Michael Paquier
On Fri, May 9, 2014 at 2:50 AM, MauMau maumau...@gmail.com wrote:
 From: Bruce Momjian br...@momjian.us

 I have completed the initial version of the 9.4 release notes.  You can
 view them here:

 http://www.postgresql.org/docs/devel/static/release-9-4.html

 Feedback expected and welcomed.  I expect to be modifying this until we
 release 9.4 final.  I have marked items where I need help with question
 marks.


 Could you add the following item, client-only installation on Windows, if
 it's appropriate for release note?  It will be useful for those like
 EnterpriseDB who develop products derived from PostgreSQL.

 https://commitfest.postgresql.org/action/patch_view?id=1326
+1. Thanks for pointing that out as well!
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.5] Custom Plan API

2014-05-08 Thread Robert Haas
On Thu, May 8, 2014 at 4:43 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 I thought that the executor side of his patch wasn't in bad shape.  The
 real problems were in the planner, and indeed largely in the backend
 part of the planner where there's a lot of hard-wired logic for fixing up
 low-level details of the constructed plan tree.  It seems like in
 principle it might be possible to make that logic cleanly extensible,
 but it'll likely take a major rewrite.  The patch tried to skate by with
 just exposing a bunch of internal functions, which I don't think is a
 maintainable approach, either for the core or for the extensions using it.

Well, I consider that somewhat good news, because I think it would be
rather nice if we could get by with solving one problem at a time, and
if the executor part is close to being well-solved, excellent.

My ignorance is probably showing here, but I guess I don't understand
why it's so hard to deal with the planner side of things.  My
perhaps-naive impression is that a Seq Scan node, or even an Index
Scan node, is not all that complicated.  If we just want to inject
some more things that behave a lot like those into various baserels, I
guess I don't understand why that's especially hard.

Now I do understand that part of what KaiGai wants to do here is
inject custom scan paths as additional paths for *joinrels*.  And I
can see why that would be somewhat more complicated.  But I also don't
see why that's got to be part of the initial commit.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.5] Custom Plan API

2014-05-08 Thread Kouhei Kaigai
  I also think that there are really two separate problems here: getting
  the executor to call a custom scan node when it shows up in the plan
  tree; and figuring out how to get it into the plan tree in the first
  place.  I'm not sure we've properly separated those problems, and I'm
  not sure into which category the issues that sunk KaiGai's 9.4 patch
  fell.
 
 I thought that the executor side of his patch wasn't in bad shape.  The
 real problems were in the planner, and indeed largely in the backend
 part of the planner where there's a lot of hard-wired logic for fixing up
 low-level details of the constructed plan tree.  It seems like in principle
 it might be possible to make that logic cleanly extensible, but it'll likely
 take a major rewrite.  The patch tried to skate by with just exposing a
 bunch of internal functions, which I don't think is a maintainable approach,
 either for the core or for the extensions using it.
 
(I'm now trying to catch up on the discussion from last night...)

I initially intended to allow extensions to add their custom-path based
on their own arbitrary decision, because the core backend cannot have any
expectations about the behavior of a custom-plan.
However, of course, a custom-path that replaces built-in paths shall
have compatible behavior in spite of its different implementation.

So, I'm inclined towards a design where the custom-plan provider informs
the core backend what it can do, and the planner gives extensions more
practical information with which to construct the custom path node.

Let me investigate how to handle join replacement by custom-path in the
planner stage.

Thanks,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei kai...@ak.jp.nec.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.5] Custom Plan API

2014-05-08 Thread Kouhei Kaigai
 On Thu, May 8, 2014 at 4:43 PM, Tom Lane t...@sss.pgh.pa.us wrote:
  I thought that the executor side of his patch wasn't in bad shape.
  The real problems were in the planner, and indeed largely in the backend
  part of the planner where there's a lot of hard-wired logic for fixing
  up low-level details of the constructed plan tree.  It seems like in
  principle it might be possible to make that logic cleanly extensible,
  but it'll likely take a major rewrite.  The patch tried to skate by
  with just exposing a bunch of internal functions, which I don't think
  is a maintainable approach, either for the core or for the extensions
 using it.
 
 Well, I consider that somewhat good news, because I think it would be rather
 nice if we could get by with solving one problem at a time, and if the 
 executor
 part is close to being well-solved, excellent.
 
 My ignorance is probably showing here, but I guess I don't understand why
 it's so hard to deal with the planner side of things.  My perhaps-naive
 impression is that a Seq Scan node, or even an Index Scan node, is not all
 that complicated.  If we just want to inject some more things that behave
 a lot like those into various baserels, I guess I don't understand why that's
 especially hard.
 
 Now I do understand that part of what KaiGai wants to do here is inject
 custom scan paths as additional paths for *joinrels*.  And I can see why
 that would be somewhat more complicated.  But I also don't see why that's
 got to be part of the initial commit.
 
I'd also like to take this approach. Even though we eventually need a more
graceful approach for join replacement by custom-path, it makes sense to get
a minimum functionality set in first.

Then, we can focus on how to design the planner integration for joining.

Thanks,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei kai...@ak.jp.nec.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.5] Custom Plan API

2014-05-08 Thread Kouhei Kaigai
 On Thu, May 8, 2014 at 6:34 AM, Kouhei Kaigai kai...@ak.jp.nec.com wrote:
  Umm... I'm now missing the direction towards my goal.
  What approach is the best way to glue PostgreSQL and PGStrom?
 
 I haven't really paid any attention to PGStrom. Perhaps it's just that I
 missed it, but I would find it useful if you could direct me towards a
 benchmark or something like that, that demonstrates a representative
 scenario in which the facilities that PGStrom offers are compelling compared
 to traditional strategies already implemented in Postgres and other
 systems.
 
The implementation of Hash-Join on the GPU side is still under development.

The only use-case available right now is an alternative scan path that
replaces a full table scan when a table contains a massive number of records
and the qualifiers are complicated enough.

The EXPLAIN output below is a sequential scan on a table that contains
80M records (all of them in memory; no disk accesses during execution).
An Nvidia GT640 has an advantage over a single-threaded Core i5 4570S, at
least.


postgres=# explain (analyze) select count(*) from t1 where sqrt((x-20.0)^2 + (y-20.0)^2) < 10;
                                                                  QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=10003175757.67..10003175757.68 rows=1 width=0) (actual time=46648.635..46648.635 rows=1 loops=1)
   ->  Seq Scan on t1  (cost=100.00..10003109091.00 rows=2667 width=0) (actual time=0.047..46351.567 rows=2513814 loops=1)
         Filter: (sqrt((((x - 20::double precision) ^ 2::double precision) + ((y - 20::double precision) ^ 2::double precision))) < 10::double precision)
         Rows Removed by Filter: 77486186
 Planning time: 0.066 ms
 Total runtime: 46648.668 ms
(6 rows)
postgres=# set pg_strom.enabled = on;
SET
postgres=# explain (analyze) select count(*) from t1 where sqrt((x-20.0)^2 + (y-20.0)^2) < 10;
                                                                  QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=1274424.33..1274424.34 rows=1 width=0) (actual time=1784.729..1784.729 rows=1 loops=1)
   ->  Custom (GpuScan) on t1  (cost=1.00..1207757.67 rows=2667 width=0) (actual time=179.748..1567.018 rows=2513699 loops=1)
         Host References:
         Device References: x, y
         Device Filter: (sqrt((((x - 20::double precision) ^ 2::double precision) + ((y - 20::double precision) ^ 2::double precision))) < 10::double precision)
         Total time to load: 0.231 ms
         Avg time in send-mq: 0.027 ms
         Max time to build kernel: 1.064 ms
         Avg time of DMA send: 3.050 ms
         Total time of DMA send: 933.318 ms
         Avg time of kernel exec: 5.117 ms
         Total time of kernel exec: 1565.799 ms
         Avg time of DMA recv: 0.086 ms
         Total time of DMA recv: 26.289 ms
         Avg time in recv-mq: 0.011 ms
 Planning time: 0.094 ms
 Total runtime: 1784.793 ms
(17 rows)


 If I wanted to make joins faster, personally, I would look at opportunities
 to optimize our existing hash joins to take better advantage of modern CPU
 characteristics. A lot of the research suggests that it may be useful to
 implement techniques that take better advantage of available memory
 bandwidth through techniques like prefetching and partitioning, perhaps
 even (counter-intuitively) at the expense of compute bandwidth. It's
 possible that it just needs to be explained to me, but, with respect,
 intuitively I have a hard time imagining that offloading joins to the GPU
 will help much in the general case. Every paper on joins from the last decade
 talks a lot about memory bandwidth and memory latency. Are you concerned
 with some specific case that I may have missed? In what scenario might a
 cost-based optimizer reasonably prefer a custom join node implemented by
 PgStrom, over any of the existing join node types? It's entirely possible
 that I simply missed relevant discussions here.
 
If our purpose were to consume 100% of the GPU device's capacity, memory
bandwidth would be troublesome. But I'm not interested in GPU benchmarking.
What I want to do is accelerate complicated query processing beyond what the
existing RDBMS can do, in a way that is cheap and transparent to existing
applications.

Thanks,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei kai...@ak.jp.nec.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.5] Custom Plan API

2014-05-08 Thread Stephen Frost
* Simon Riggs (si...@2ndquadrant.com) wrote:
 On 8 May 2014 20:40, Robert Haas robertmh...@gmail.com wrote:
  For my money, we'd be better off
  getting some kind of basic custom scan node functionality committed
  first, even if the cases where you can actually inject them into real
  plans are highly restricted.  Then, we could later work on adding more
  ways to inject them in more places.
 
 We're past the prototyping stage and into productionising what we know
 works, AFAIK. If that point is not clear, then we need to discuss that
 first.
 
 At the moment the Custom join hook is called every time we attempt to
 cost a join, with no restriction.
 
 I would like to highly restrict this, so that we only consider a
 CustomJoin node when we have previously said one might be usable and
 the user has requested this (e.g. enable_foojoin = on)

This is part of what I disagree with- I'd rather not require users to
know and understand when they want to do a HashJoin vs. a MergeJoin vs.
a CustomJoinTypeX.

 We only consider merge joins if the join uses operators with oprcanmerge=true.
 We only consider hash joins if the join uses operators with oprcanhash=true

I wouldn't consider those generally user-facing options, and the
enable_X counterparts are intended for debugging and not to be used in
production environments.  To the point you make in the other thread- I'm
fine w/ having similar cost-based enable_X options, but I think we're
doing our users a disservice if we require that they populate or update
a table.  Perhaps an extension needs to do that on installation, but
that would need to enable everything to avoid the user having to mess
around with the table.

 So it seems reasonable to have a way to define/declare what is
 possible and what is not. But my take is that adding a new column to
 pg_operator for every CustomJoin node is probably out of the question,
 hence my suggestion to list the operators we know it can work with.

It does seem like there should be some work done in this area, as Tom
mentioned, to avoid needing to have more columns to track how equality
can be done.  I do wonder just how we'd deal with this when it comes to
GPUs as, presumably, the code to implement the equality for various
types would have to be written in CUDA-or-whatever.

 Given that everything else in Postgres is agnostic and configurable,
 I'm looking to do the same here.

It's certainly a neat idea, but I do have concerns (which appear to be
shared by others) about just how practical it'll be and how much rework
it'd take and the question about if it'd really be used in the end..

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] [v9.5] Custom Plan API

2014-05-08 Thread Kouhei Kaigai
  So it seems reasonable to have a way to define/declare what is
  possible and what is not. But my take is that adding a new column to
  pg_operator for every CustomJoin node is probably out of the question,
  hence my suggestion to list the operators we know it can work with.
 
 It does seem like there should be some work done in this area, as Tom 
 mentioned,
 to avoid needing to have more columns to track how equality can be done.
 I do wonder just how we'd deal with this when it comes to GPUs as, presumably,
 the code to implement the equality for various types would have to be written
 in CUDA-or-whatever.
 
A GPU has workloads it likes and dislikes. It is a reasonable idea to list
the operators (or something else) that have an advantage when run on a
custom-path. For example, numeric calculation on fixed-length variables has
a great advantage on a GPU, but locale-aware text matching is not a workload
suitable for GPUs.
That may be a good hint for the planner when picking up candidate paths to
be considered.

Thanks,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei kai...@ak.jp.nec.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.5] Custom Plan API

2014-05-08 Thread Stephen Frost
* Peter Geoghegan (p...@heroku.com) wrote:
 On Thu, May 8, 2014 at 6:34 AM, Kouhei Kaigai kai...@ak.jp.nec.com wrote:
  Umm... I'm now missing the direction towards my goal.
  What approach is the best way to glue PostgreSQL and PGStrom?
 
 I haven't really paid any attention to PGStrom. Perhaps it's just that
 I missed it, but I would find it useful if you could direct me towards
 a benchmark or something like that, that demonstrates a representative
 scenario in which the facilities that PGStrom offers are compelling
 compared to traditional strategies already implemented in Postgres and
 other systems.

I agree that some concrete evidence would be really nice.  I
more-or-less took KaiGai's word on it, but having actual benchmarks
would certainly be better.

 If I wanted to make joins faster, personally, I would look at
 opportunities to optimize our existing hash joins to take better
 advantage of modern CPU characteristics.

Yeah, I'm pretty confident we're leaving a fair bit on the table right
there based on my previous investigation into this area.  There were
easily cases which showed a 3x improvement, as I recall (the trade-off
being increased memory usage for a larger, sparser hash table).  Sadly,
there were also cases which ended up being worse and it seemed to be
very sensitive to the size of the hash table which ends up being built
and the size of the on-CPU cache.

 A lot of the research
 suggests that it may be useful to implement techniques that take
 better advantage of available memory bandwidth through techniques like
 prefetching and partitioning, perhaps even (counter-intuitively) at
 the expense of compute bandwidth.

While I agree with this, one of the big things about GPUs is that they
operate in a highly parallel fashion and across a different CPU/Memory
architecture than what we're used to (for starters, everything is much
closer).  In a traditional memory system, there's a lot of back and
forth to memory, but a single memory dump over to the GPU's memory where
everything is processed in a highly parallel way and then shipped back
wholesale to main memory is at least conceivably faster.

Of course, things will change when we are able to parallelize joins
across multiple CPUs ourselves..  In a way, the PGStrom approach gets to
cheat us today, since it can parallelize the work where core can't and
that ends up not being an entirely fair comparison.

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] [v9.5] Custom Plan API

2014-05-08 Thread Stephen Frost
* Robert Haas (robertmh...@gmail.com) wrote:
 Well, I consider that somewhat good news, because I think it would be
 rather nice if we could get by with solving one problem at a time, and
 if the executor part is close to being well-solved, excellent.

Sadly, I'm afraid the news really isn't all that good in the end..

 My ignorance is probably showing here, but I guess I don't understand
 why it's so hard to deal with the planner side of things.  My
 perhaps-naive impression is that a Seq Scan node, or even an Index
 Scan node, is not all that complicated.  If we just want to inject
 some more things that behave a lot like those into various baserels, I
 guess I don't understand why that's especially hard.

That's not what is being asked for here though...

 Now I do understand that part of what KaiGai wants to do here is
 inject custom scan paths as additional paths for *joinrels*.  And I
 can see why that would be somewhat more complicated.  But I also don't
 see why that's got to be part of the initial commit.

I'd say it's more than part of what the goal is here- it's more or
less what everything boils down to.  Oh, plus being able to replace
aggregates with a GPU-based operation instead, but that's no trivially
done thing either really (if it is, let's get it done for FDWs
already...).

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] [v9.5] Custom Plan API

2014-05-08 Thread Stephen Frost
* Kouhei Kaigai (kai...@ak.jp.nec.com) wrote:
 I initially intended to allow extensions to add their custom-path based
 on their arbitrary decision, because the core backend cannot have
 expectation towards the behavior of custom-plan.
 However, of course, the custom-path that replaces built-in paths shall
 have compatible behavior in spite of different implementation.

I didn't ask this before but it's been on my mind for a while- how will
this work for custom data types, ala the 'geometry' type from PostGIS?
There's user-provided code that we have to execute to check equality for
those, but they're not giving us CUDA code to run to perform that
equality...

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] [v9.5] Custom Plan API

2014-05-08 Thread Kouhei Kaigai
 * Kouhei Kaigai (kai...@ak.jp.nec.com) wrote:
  I initially intended to allow extensions to add their custom-path
  based on their arbitrary decision, because the core backend cannot
  have expectation towards the behavior of custom-plan.
  However, of course, the custom-path that replaces built-in paths shall
  have compatible behavior in spite of different implementation.
 
 I didn't ask this before but it's been on my mind for a while- how will
 this work for custom data types, ala the 'geometry' type from PostGIS?
 There's user-provided code that we have to execute to check equality for
 those, but they're not giving us CUDA code to run to perform that equality...
 
If the custom-plan provider supports user-defined data types such as PostGIS,
it will be able to pick up these data types too, in addition to the built-in
ones. It fully depends on the coverage of the extension.
If a data type is not supported, it is simply not show-time for the GPU.

In my case, if PG-Strom has compatible code for those types, runnable on
OpenCL, it will say "yes, I can handle this data type".

Thanks,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei kai...@ak.jp.nec.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.5] Custom Plan API

2014-05-08 Thread Stephen Frost
* Kouhei Kaigai (kai...@ak.jp.nec.com) wrote:
 GPUs have workloads they like and dislike. It is a reasonable idea to list
 the operators (or something else) that benefit from running on a custom path.
 For example, numeric calculation on fixed-length variables has a great
 advantage on a GPU, but locale-aware text matching is not a workload
 suited to GPUs.

Right- but this points out exactly what I was trying to bring up.

Locale-aware text matching requires running libc-provided code, which
isn't going to happen on the GPU (unless we re-implement it...).
Aren't we going to have the same problem with the 'numeric' type?  Our
existing functions won't be usable on the GPU and we'd have to
re-implement them and then make darn sure that they produce the same
results...

We'll also have to worry about any cases where we have both a libc
function and a CUDA function and convince ourselves that there's no
difference between the two.  I'm not sure exactly how we'd build this
kind of knowledge into the system through a catalog (I tend to doubt
that'd work, in fact), and trying to make it work from an extension in a
way that also works with *other* extensions strikes me as highly
unlikely.  Perhaps the extension could provide the core types and the
other extensions could provide their own bits to hook into the right
places, but that sure seems fragile.
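
One way an extension could carry that knowledge itself, rather than
through a system catalog, is a private lookup table pairing built-in
function OIDs with re-implemented device routines. A sketch only; the
dev_* names and the table contents are hypothetical, while the F_* macros
come from fmgroids.h.

#include "postgres.h"
#include "utils/fmgroids.h"

typedef struct DeviceFuncEntry
{
    Oid         host_fn_oid;     /* pg_proc OID of the host implementation */
    const char *device_fn_name;  /* name of the re-implemented device routine */
} DeviceFuncEntry;

static const DeviceFuncEntry device_func_table[] =
{
    {F_INT4PL,    "dev_int4pl"},     /* int4 + int4 */
    {F_INT4MUL,   "dev_int4mul"},    /* int4 * int4 */
    {F_FLOAT8PL,  "dev_float8pl"},   /* float8 + float8 */
    {F_FLOAT8MUL, "dev_float8mul"},  /* float8 * float8 */
};

static const char *
lookup_device_func(Oid fn_oid)
{
    for (int i = 0; i < lengthof(device_func_table); i++)
    {
        if (device_func_table[i].host_fn_oid == fn_oid)
            return device_func_table[i].device_fn_name;
    }
    return NULL;    /* no device equivalent: don't offload this expression */
}

Anything that misses the table simply never gets offloaded, which
sidesteps the "same results" question by refusing to guess.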

Thanks,

Stephen




Re: [HACKERS] [v9.5] Custom Plan API

2014-05-08 Thread Stephen Frost
* Kouhei Kaigai (kai...@ak.jp.nec.com) wrote:
  I didn't ask this before but it's been on my mind for a while- how will
  this work for custom data types, ala the 'geometry' type from PostGIS?
  There's user-provided code that we have to execute to check equality for
  those, but they're not giving us CUDA code to run to perform that 
  equality...
  
 If the custom-plan provider supports user-defined data types such as
 PostGIS's, it will be able to pick up those data types as well, in addition
 to the built-in ones. It fully depends on the coverage of the extension.
 If a data type is not supported, then it is simply not show time for the GPU.

So the extension will need to be aware of all custom data types and then
installed *after* all other extensions are installed?  That doesn't
strike me as workable...

Thanks,

Stephen




Re: [HACKERS] [v9.5] Custom Plan API

2014-05-08 Thread Kouhei Kaigai
 * Kouhei Kaigai (kai...@ak.jp.nec.com) wrote:
   I didn't ask this before but it's been on my mind for a while- how
   will this work for custom data types, ala the 'geometry' type from
 PostGIS?
   There's user-provided code that we have to execute to check equality
   for those, but they're not giving us CUDA code to run to perform that
 equality...
  
  If the custom-plan provider supports user-defined data types such as
  PostGIS's, it will be able to pick up those data types as well, in addition
  to the built-in ones. It fully depends on the coverage of the extension.
  If a data type is not supported, then it is simply not show time for the GPU.
 
 So the extension will need to be aware of all custom data types and then
 installed *after* all other extensions are installed?  That doesn't strike
 me as workable...
 
I'm not certain why you think an extension would need to support all
data types.
Even if it works only for a particular set of data types, it makes sense
as long as it covers the data types users are actually using.

Thanks,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei kai...@ak.jp.nec.com



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.5] Custom Plan API

2014-05-08 Thread Peter Geoghegan
On Thu, May 8, 2014 at 7:13 PM, Stephen Frost sfr...@snowman.net wrote:
 Of course, things will change when we are able to parallelize joins
 across multiple CPUs ourselves..  In a way, the PGStrom approach gets to
 cheat us today, since it can parallelize the work where core can't and
 that ends up not being an entirely fair comparison.

I was thinking of SIMD, along similar lines. We might be able to cheat
our way out of having to solve some of the difficult problems of
parallelism that way. For example, if you can build a SIMD-friendly
bitonic mergesort, and combine that with poor man's normalized keys,
that could make merge joins on text faster. That's pure speculation,
but it seems like an interesting possibility.
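
To make the normalized-key half of that concrete, here is a standalone
sketch (plain C, not PostgreSQL code): pack the leading bytes of the
strxfrm() transform into an integer so that most comparisons become plain
integer compares, falling back to strcoll() only when the prefixes tie.

#include <locale.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Pack the first 8 bytes of the strxfrm() transform into a big-endian
 * uint64, so that comparing two keys as integers agrees with strcoll()
 * whenever the keys differ.  Equal keys only mean the prefixes tied.
 */
static uint64_t
normalized_key(const char *s)
{
    size_t    need = strxfrm(NULL, s, 0) + 1;
    char      stackbuf[128];
    char     *buf = (need <= sizeof(stackbuf)) ? stackbuf : malloc(need);
    uint64_t  key = 0;

    strxfrm(buf, s, need);
    for (int i = 0; i < 8 && buf[i] != '\0'; i++)
        key |= (uint64_t) (unsigned char) buf[i] << (56 - 8 * i);
    if (buf != stackbuf)
        free(buf);
    return key;
}

/* comparator: cheap integer compare first, full strcoll() only on a tie */
static int
cmp_text(const char *a, const char *b)
{
    uint64_t ka = normalized_key(a);
    uint64_t kb = normalized_key(b);

    if (ka != kb)
        return (ka < kb) ? -1 : 1;
    return strcoll(a, b);
}

int
main(void)
{
    setlocale(LC_COLLATE, "");
    printf("cmp(\"apple\", \"Banana\") = %d\n", cmp_text("apple", "Banana"));
    return 0;
}

In a real sort or merge join you would of course compute the key once per
datum rather than once per comparison; the SIMD-friendly bitonic network
is a separate piece of the idea.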

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.5] Custom Plan API

2014-05-08 Thread Stephen Frost
* Kouhei Kaigai (kai...@ak.jp.nec.com) wrote:
  So the extension will need to be aware of all custom data types and then
  installed *after* all other extensions are installed?  That doesn't strike
  me as workable...
  
 I'm not certain why you think an extension would need to support all
 data types.

Mostly because we have a very nice extension system which quite a few
different extensions make use of, and it'd be pretty darn unfortunate if
none of them could take advantage of GPUs because we decided that the
right way to support GPUs was through an extension.

This is an argument which might be familiar to some, as it was part of
the reason that json and jsonb were added to core, imv...

 Even if it works only for a particular set of data types, it makes sense
 as long as it covers the data types users are actually using.

I know quite a few users of PostGIS, ip4r, and hstore...

Thanks,

Stephen




Re: [HACKERS] [v9.5] Custom Plan API

2014-05-08 Thread Robert Haas
On Thu, May 8, 2014 at 10:16 PM, Stephen Frost sfr...@snowman.net wrote:
 * Robert Haas (robertmh...@gmail.com) wrote:
 Well, I consider that somewhat good news, because I think it would be
 rather nice if we could get by with solving one problem at a time, and
 if the executor part is close to being well-solved, excellent.

 Sadly, I'm afraid the news really isn't all that good in the end..

 My ignorance is probably showing here, but I guess I don't understand
 why it's so hard to deal with the planner side of things.  My
 perhaps-naive impression is that a Seq Scan node, or even an Index
 Scan node, is not all that complicated.  If we just want to inject
 some more things that behave a lot like those into various baserels, I
 guess I don't understand why that's especially hard.

 That's not what is being asked for here though...

I am not sure what your point is here.  Here's mine: if we can strip
this down to the executor support plus the most minimal planner
support possible, we might be able to get *something* committed.  Then
we can extend it in subsequent commits.
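
For what "most minimal planner support possible" could mean in practice,
here is a rough sketch of a provider hook that offers one extra path for a
baserel and lets add_path() cost-compete it against the built-in paths.
The hook name, the CustomPath fields, and the methods table are
assumptions for illustration, not anything taken from the posted patch.

/*
 * Sketch only: set_rel_pathlist_hook, CustomPath, and CustomPathMethods
 * are assumed shapes of the API under discussion.
 */
#include "postgres.h"
#include "fmgr.h"
#include "optimizer/pathnode.h"
#include "optimizer/paths.h"

PG_MODULE_MAGIC;

/* executor-side callbacks, assumed to be defined elsewhere in the extension */
extern const CustomPathMethods my_scan_path_methods;

static set_rel_pathlist_hook_type prev_set_rel_pathlist_hook = NULL;

static void
my_set_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
                    Index rti, RangeTblEntry *rte)
{
    CustomPath *cpath;

    if (prev_set_rel_pathlist_hook)
        prev_set_rel_pathlist_hook(root, rel, rti, rte);

    /* only offer the path for plain relations we know how to scan */
    if (rte->rtekind != RTE_RELATION)
        return;

    cpath = makeNode(CustomPath);
    cpath->path.pathtype = T_CustomScan;
    cpath->path.parent = rel;
    cpath->path.rows = rel->rows;
    cpath->path.startup_cost = 10.0;                   /* placeholder costing */
    cpath->path.total_cost = 10.0 + rel->rows * 0.01;  /* placeholder costing */
    cpath->methods = &my_scan_path_methods;

    /* compete on cost; add_path() keeps only the winners */
    add_path(rel, &cpath->path);
}

void
_PG_init(void)
{
    prev_set_rel_pathlist_hook = set_rel_pathlist_hook;
    set_rel_pathlist_hook = my_set_rel_pathlist;
}

The join-substitution and aggregate cases could then be layered on in
later commits without changing this basic shape.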

You seem to be saying there's no value in getting anything committed
unless it handles the scan-substituting-for-join case.  I don't agree.
 Incremental commits are good, whether they get you all the way to
where you want to be or not.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

