Re: [HACKERS] Free indexed_tlist memory explicitly within set_plan_refs()

2015-07-12 Thread Andres Freund
On 2015-05-28 14:37:43 -0700, Peter Geoghegan wrote:
 To fix, allow ParseState to reflect that an individual statement can be
 both p_is_insert and p_is_update at the same time.

   /* Process DO UPDATE */
   if (onConflictClause->action == ONCONFLICT_UPDATE)
   {
  + /* p_is_update must be set here, after INSERT targetlist processing */
  + pstate->p_is_update = true;
  +

It's not particularly pretty that you document in the commit message
that both is_insert and is_update can be set at the same time, and then
it has constraints like the above.

But that's more crummy API's fault than yours.

I'm right now not really coming up with a better idea how to fix
this. So I guess I'll apply something close to this tomorrow.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Free indexed_tlist memory explicitly within set_plan_refs()

2015-07-12 Thread Peter Geoghegan
On Sun, Jul 12, 2015 at 1:45 PM, Andres Freund and...@anarazel.de wrote:
 But that's more crummy API's fault than yours.

As you probably noticed, the only reason the p_is_update and
p_is_insert fields exist is for transformAssignedExpr() -- in fact, in
the master branch, nothing checks the value of p_is_update (although I
suppose a hook into a third-party module could see that module test
ParseState.p_is_update, so that isn't quite true).

 I'm right now not really coming up with a better idea how to fix
 this. So I guess I'll apply something close to this tomorrow.

Sounds good.

-- 
Peter Geoghegan




Re: [HACKERS] PostgreSQL 9.5 Alpha 1 build fail with perl 5.22

2015-07-12 Thread Noah Misch
On Sun, Jul 12, 2015 at 05:02:51PM -0400, Andrew Dunstan wrote:
 Marco Atzeri marco.atz...@gmail.com writes:
 From what I see, the hstore_plperl link has a double problem:
 it requires a link to hstore
 and also a link to perl.
 Attached is a patch solving this and a similar issue with python.
 +ifeq ($(PORTNAME), cygwin)
 +# This means we need an in-tree build on Windows, not a pgxs build
 +SHLIB_LINK += -L../hstore -lhstore -L$(perl_archlibexp)/CORE -lperl
 +endif

That's the right general strategy, agreed.

 Unless there is further argument I propose to commit something very like
 Marco's suggestion for hstore_plperl, hstore_plpython and ltree_plpython

Would you post the final patch for review?

Thanks,
nm




Re: [HACKERS] PostgreSQL 9.5 Alpha 1 build fail with perl 5.22

2015-07-12 Thread Tom Lane
Andrew Dunstan and...@dunslane.net writes:
 On 07/04/2015 11:02 AM, Tom Lane wrote:
 It's not apparent to me how that works at all.

 BTW, the .a files being linked to above are not like Unix .a static
 archives - they are import library files, which I think are only
 used at link time, not run time. The path to the DLLs isn't being hardcoded.

Ah, I see.  So then what Marco is suggesting is copying a coding pattern
that does work.

 Unless there is further argument I propose to commit something very like 
 Marco's suggestion for hstore_plperl, hstore_plpython and ltree_plpython

No objection here.

regards, tom lane




Re: [HACKERS] creating extension including dependencies

2015-07-12 Thread David Fetter
On Tue, Jul 07, 2015 at 10:14:49AM -0700, David E. Wheeler wrote:
 On Jul 7, 2015, at 6:41 AM, Andres Freund and...@anarazel.de wrote:
 
  At the minimum I'd like to see that CREATE EXTENSION foo; would
  install extension 'bar' if foo depended on 'bar' and
  CASCADE is specified. Right now we always error out saying that
  the dependency on 'bar' is not fulfilled - not particularly
  helpful.
 
 +1
 
 If `yum install foo` also installs bar, and `pgxn install foo`
 downloads, builds, and installs bar, it makes sense to me that
 `CREATE EXTENSION foo` would install bar if it was available, and
 complain if it wasn’t.

This is the baseline sane behavior.  Getting the full dependency
tree, although it would be very handy, would require more
infrastructure than we have now.

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate




Re: [HACKERS] Fixes for a couple of resource leaks

2015-07-12 Thread Tom Lane
Michael Paquier michael.paqu...@gmail.com writes:
 Coverity is pointing out a couple of resource leaks:
 ...
 Attached is a patch to address all those things. Backpatches would be good
 to have as well.

I'll take this up, unless some other committer is already on it.

regards, tom lane




Re: [HACKERS] PostgreSQL 9.5 Alpha 1 build fail with perl 5.22

2015-07-12 Thread Andrew Dunstan


On 07/04/2015 01:09 PM, Andrew Dunstan wrote:


On 07/04/2015 11:02 AM, Tom Lane wrote:

Marco Atzeri marco.atz...@gmail.com writes:

From what I see, the hstore_plperl link has a double problem:
it requires a link to hstore
and also a link to perl.
Attached is a patch solving this and a similar issue with python.
+ifeq ($(PORTNAME), cygwin)
+# This means we need an in-tree build on Windows, not a pgxs build
+SHLIB_LINK += -L../hstore -lhstore -L$(perl_archlibexp)/CORE -lperl
+endif
[ and likewise for the other contrib transform modules ]

I wondered how come we had not seen this problem in the buildfarm,
but the answer appears to be that our only working Cygwin critter
(brolga) doesn't build any of the optional PLs, so it skips these
modules altogether.  Seems like we need to improve that situation.

Also, I noted that the regular win32 path in these makefiles
says, eg,

ifeq ($(PORTNAME), win32)
# these settings are the same as for plperl
override CPPFLAGS += -DPLPERL_HAVE_UID_GID -Wno-comment
# This means we need an in-tree build on Windows, not a pgxs build
SHLIB_LINK += ../hstore/libhstore.a $(wildcard ../../src/pl/plperl/libperl*.a)

endif

It's not apparent to me how that works at all.  It seems to specify
hard-linking a copy of hstore as well as a copy of plperl into the
shlib for hstore_plperl.  Then at runtime, there will *also* be the
hstore and plperl shlibs in memory.  At best that means substantial
memory bloat, but it seems likely to me that it would fail altogether,
particularly for plperl, which has got a substantial amount of
semantically-important static storage.  Two copies of that storage will
not end well.

Windows finds the DLL in its path. I just tested this by removing the 
intree pl DLLs after installing and then running contrib installcheck, 
and it worked fine. Whether or not it actually loads the DLL twice 
when it can find the intree DLL I don't know for sure, maybe someone 
with more Windows-fu than me can inform our ignorance.




BTW, the .a files being linked to above are not like Unix .a static
archives - they are import library files, which I think are only
used at link time, not run time. The path to the DLLs isn't being hardcoded.


Unless there is further argument I propose to commit something very like 
Marco's suggestion for hstore_plperl, hstore_plpython and ltree_plpython


cheers

andrew




Re: [HACKERS] One question about security label command

2015-07-12 Thread Adam Brightwell
Stephen,

 Stephen, would you have the time to review this patch, and commit if
 appropriate, please? And if you could set up the buildfarm animal to run
 this, even better.

I gave this a quick review/test against master (0a0fe2f).  Everything
builds and installs as would be expected.

All of the context changes make sense to me.  However, I am still
seeing some errors in the regression tests.  The errors look like they
are solely based around log messages and not 'functionality'.  I have
attached the 'regression.diffs' for reference.

-Adam

-- 
Adam Brightwell - adam.brightw...@crunchydatasolutions.com
Database Engineer - www.crunchydatasolutions.com


regression.diffs
Description: Binary data



Re: [HACKERS] Fixes for a couple of resource leaks

2015-07-12 Thread Michael Paquier
On Mon, Jul 13, 2015 at 4:06 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Michael Paquier michael.paqu...@gmail.com writes:
 Coverity is pointing out a couple of resource leaks:
 ...
 Attached is a patch to address all those things. Backpatches would be good
 to have as well.

 I'll take this up, unless some other committer is already on it.

Thanks for taking those up! Next time I'll also mention down to which
version each fix should be backpatched, to ease the committer's work.
It was obviously bad of me not to mention it.
-- 
Michael




Re: [HACKERS] optimizing vacuum truncation scans

2015-07-12 Thread Haribabu Kommi
On Thu, Jul 9, 2015 at 5:36 PM, Haribabu Kommi kommi.harib...@gmail.com wrote:

 I will do some performance tests and send you the results.

Here are the performance results tested on my machine.


                Head      vm patch   vm+prefetch patch
First vacuum    120 sec   1 sec      1 sec
Second vacuum   180 sec   180 sec    30 sec

I modified the code so that the first vacuum command skips the vacuum
truncation. That is how I collected the timings for the second vacuum.

I combined your vm and prefetch patches into a single vm+prefetch patch
without a GUC. I kept the prefetch distance at 32 for the performance
test; I chose that value based on the current buffer access strategy,
which is presently 32 for vacuum, instead of making it a user option.
Here I attached the modified patch with both vm+prefetch logic.

I will run some tests on a machine with an SSD and let you know the
results. Based on the impact of prefetch on SSD machines, we can then
decide whether we need a GUC or not.

Regards,
Hari Babu
Fujitsu Australia


vac_trunc_trust_vm_and_prefetch.patch
Description: Binary data



Re: [HACKERS] anole: assorted stability problems

2015-07-12 Thread Robert Haas
On Tue, Jul 7, 2015 at 7:25 AM, Andres Freund and...@anarazel.de wrote:
 On 2015-06-30 11:35:56 +0200, Andres Freund wrote:
 On 2015-06-29 22:58:05 -0400, Tom Lane wrote:
  So personally, I would be inclined to put back the volatile qualifier,
  independently of any fooling around with _Asm_double_magic_xyzzy
  calls.

 I'm not sure. I think the reliance on an explicit memory barrier is a
 lot more robust and easy to understand than some barely documented odd
 behaviour around volatile. On the other hand the old way worked for a
 long while.

 I'm inclined to just do both on platforms as odd as IA64. But I'd like
 to let anole run with the current set a bit longer - if it doesn't work
 we have more problems than just S_UNLOCK(). It seems EDB has increased
 the run rate for now, so it shouldn't take too long:
 http://buildfarm.postgresql.org/cgi-bin/show_history.pl?nm=anole&br=HEAD

 So, it's starting to look good. Not exactly allowing for a lot of
 confidence yet, but still:
 http://buildfarm.postgresql.org/cgi-bin/show_history.pl?nm=anole&br=HEAD

 I'm inclined to simply revise the comments now, and *not* reintroduce
 the volatile. The assumptions documented in:

 /*
  * Intel Itanium, gcc or Intel's compiler.
  *
  * Itanium has weak memory ordering, but we rely on the compiler to enforce
  * strict ordering of accesses to volatile data.  In particular, while the
  * xchg instruction implicitly acts as a memory barrier with 'acquire'
  * semantics, we do not have an explicit memory fence instruction in the
  * S_UNLOCK macro.  We use a regular assignment to clear the spinlock, and
  * trust that the compiler marks the generated store instruction with the
  * .rel opcode.
  *
  * Testing shows that assumption to hold on gcc, although I could not find
  * any explicit statement on that in the gcc manual.  In Intel's compiler,
  * the -m[no-]serialize-volatile option controls that, and testing shows that
  * it is enabled by default.
  */

 don't sound exactly bulletproof to me. I also personally find explicit
 barriers easier to understand in the light of all the other spinlock
 implementations.

 Comments?

I'm fine with that plan.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] TABLESAMPLE patch is really in pretty sad shape

2015-07-12 Thread Tom Lane
I wrote:
 The two contrib modules this patch added are nowhere near fit for public
 consumption.  They cannot clean up after themselves when dropped:
 ...
 Raw inserts into system catalogs just
 aren't a sane thing to do in extensions.

I had some thoughts about how we might fix that, without going to the
rather tedious lengths of creating a complete set of DDL infrastructure
for CREATE/DROP TABLESAMPLE METHOD.

First off, the extension API designed for the tablesample patch is
evidently modeled on the index AM API, which was not a particularly good
precedent --- it's not a coincidence that index AMs can't be added or
dropped on-the-fly.  Modeling a server internal API as a set of
SQL-visible functions is problematic when the call signatures of those
functions can't be accurately described by SQL datatypes, and it's rather
pointless and inefficient when none of the functions in question is meant
to be SQL-callable.  It's even less attractive if you don't think you've
got a completely stable API spec, because adding, dropping, or changing
signature of any one of the API functions then involves a pile of
easy-to-mess-up catalog changes on top of the actually useful work.
Not to mention then having to think about backwards compatibility of
your CREATE command's arguments.

We have a far better model to follow already, namely the foreign data
wrapper API.  In that, there's a single SQL-visible function that returns
a dummy datatype indicating that it's an FDW handler, and when called,
it hands back a struct containing pointers to all the other functions
that the particular wrapper needs to supply (and, if necessary, the struct
could have non-function-pointer fields containing other info the core
system might need to know about the handler).  We could similarly invent a
pseudotype tablesample_handler and represent each tablesample method by
a single SQL function returning tablesample_handler.  Any future changes
in the API for tablesample handlers would then appear as changes in the C
definition of the struct returned by the handler, which requires no
SQL-visible thrashing, hence creates no headaches for pg_upgrade, and it
is pretty easy to design it so that failure to update an external module
that implements the API results in C compiler errors and/or sane fallback
behavior.

Once we've done that, I think we don't even need a special catalog
representing tablesample methods.  Given FROM foo TABLESAMPLE
bernoulli(...), the parser could just look for a function bernoulli()
returning tablesample_handler, and it's done.  The querytree would have an
ordinary function dependency on that function, which would be sufficient
to handle DROP dependency behaviors properly.  (On reflection, maybe
better if it's bernoulli(internal) returns tablesample_handler,
so as to guarantee that such a function doesn't create a conflict with
any user-defined function of the same name.)

Thoughts?

regards, tom lane

PS: now that I've written this rant, I wonder why we don't redesign the
index AM API along the same lines.  It probably doesn't matter much at
the moment, but if we ever get serious about supporting index AM
extensions, I think we ought to consider doing that.




Re: [HACKERS] A little RLS oversight?

2015-07-12 Thread Michael Paquier
On Sun, Jul 12, 2015 at 5:59 PM, Yaroslav wrote:
 I can still see all statistics for 'test' in pg_stats under unprivileged
 user.

Indeed, this looks like an oversight of RLS. Even if a policy is
defined to prevent a user from seeing the rows of other users, it is
still possible to get some information through this view.
I am adding an open item regarding that for 9.5.

 I'd prefer statistics on RLS-enabled tables to be simply hidden completely
 for unprivileged users.

This looks like something simple enough to do.
@Stephen: perhaps you have some thoughts on the matter? Currently
pg_stats breaks its promise to only show information about rows the
current user can read.
-- 
Michael




Re: [HACKERS] [PATCH] Generalized JSON output functions

2015-07-12 Thread Pavel Stehule
2015-07-12 10:29 GMT+02:00 Shulgin, Oleksandr oleksandr.shul...@zalando.de:

 On Jul 11, 2015 8:41 PM, Pavel Stehule pavel.steh...@gmail.com wrote:
 
  There is a simple rule - be strict on output and tolerant on input. If I
  understand the sense of this patch, the target is one uniform format for
  JSON documents - so there is no room for any variability.

 So, would you prefer explain json format on a single line - no indentation
 or whitespace whatsoever?

Yes - if you need a pretty format, there is the function json_pretty - any
more styles would be wrong on the Postgres side.


 So far it was only about whitespace, but it can be useful for tweaking
 other aspects of output, as I've mentioned before.

Postgres is a database, not a presentation server - it has to do all
database operations as quickly as possible - and formatting belongs on the
client side.

 I can imagine the ability for 3rd-party code to override certain aspects
 of the output would be really useful for extensions or background workers,
 decoding plugins, etc.

We are talking about output - I can imagine only two possibilities
- plain JSON, and pretty-formatted JSON (but with only one style).



  I am thinking that general JSON functions make sense, but I partially
  disagree with your implementation.

 Then what would you do differently, exactly?

Simpler code.



 --
 Alex



Re: [HACKERS] Re: Removing SSL renegotiation (Was: Should we back-patch SSL renegotiation fixes?)

2015-07-12 Thread Michael Paquier
On Sat, Jul 11, 2015 at 9:28 PM, Andres Freund and...@anarazel.de wrote:

 On 2015-07-11 21:09:05 +0900, Michael Paquier wrote:
  Something like the patches attached

 Thanks for that!

  could be considered, one is for master
  and REL9_5_STABLE to remove ssl_renegotiation_limit, the second one for
  ~REL9_4_STABLE to change the default to 0.

  diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
  index c669f75..16c0ce5 100644
  --- a/doc/src/sgml/config.sgml
  +++ b/doc/src/sgml/config.sgml
  @@ -1040,7 +1040,7 @@ include_dir 'conf.d'
       cryptanalysis when large amounts of traffic can be examined, but it
       also carries a large performance penalty. The sum of sent and received
       traffic is used to check the limit. If this parameter is set to 0,
  -    renegotiation is disabled. The default is <literal>512MB</literal>.
  +    renegotiation is disabled. The default is <literal>0</literal>.

 I think we should put in a warning or at least a note about the dangers of
 enabling it (connection breakage, exposure to several open OpenSSL bugs).


This sounds like a good idea to me. Here is an attempt:
+   <warning>
+    <para>
+     Enabling <varname>ssl_renegotiation_limit</varname> can cause various
+     problems endangering the stability of a <productname>PostgreSQL</productname>
+     instance, like connections suddenly breaking, and exposes the
+     server to bugs related to the internal implementation of renegotiation
+     in the SSL libraries used.
+    </para>
+   </warning>
Attached is v2 for ~9.4.
Regards,
-- 
Michael


20150712_ssl_renegotiation_remove-94_v2.patch
Description: binary/octet-stream



[HACKERS] Fixes for a couple of resource leaks

2015-07-12 Thread Michael Paquier
Hi all,

Coverity is pointing out a couple of resource leaks:
- In DropReplicationSlot@streamutil.c, query is leaked.
- In getTransforms@pg_dump.c, the allocated result of get_language_name is
not free'd. Other code paths calling this routine do the job.
- In libpqGetCurrentXlogInsertLocation@libpq_fetch.c (pg_rewind), the value
val returned by run_simple_query should be free'd. Other code paths do so.
- In libpqProcessFileList@libpq_fetch.c, a call to PQclear is missing for a
query result.
- In vacuum_one_database@vacuumdb.c, a call to PQclear is missing.
Attached is a patch to address all those things. Backpatches would be good
to have as well.
Regards,
-- 
Michael


20150712_memory_leaks.patch
Description: binary/octet-stream



[HACKERS] Could be improved point of UPSERT

2015-07-12 Thread Yourfriend
Hi, Hackers,

The UPSERT feature was the one I was most interested in for 9.5; I really
need it.

I have tested different usages of this feature, like single-record input,
multi-record input, and also upserting more than 10,000 records; all look
great, thanks for your work.

When I checked my records from these tests, I found one result that might
be improved. The issue is: although no new record is added to the table
when a conflict happens, the sequence for the table has still been touched,
so when a new record actually is added later, the sequence skips the
numbers consumed earlier, and we get an unreasonable result (in my
opinion). The scenario is as follows:

1, System:  PostgreSQL 9.5 Alpha + Win7 X64
2, Enter pgadmin:
3, create table foobar (
sysid serial,
theprovince varchar(20),
themonth  varchar(7),
therevenue integer default 0,
primary key (sysid),
unique (theprovince,themonth)
)

4, insert into foobar values
  ('BeiJing','2015-01',1234),
  ('ShangHai','2015-01',1927)
5, select * from foobar ;
sysid   theprovince   themonth   therevenue
    1   BeiJing       2015-01    1234
    2   ShangHai      2015-01    1927

6, insert into foobar values
  ('BeiJing','2015-01',1999),
  ('ShangHai','2015-01',1988)
 on conflict (theprovince,themonth) do update set
therevenue=excluded.therevenue;

7, select * from foobar ;
sysid   theprovince   themonth   therevenue
    1   BeiJing       2015-01    1999
    2   ShangHai      2015-01    1988
8, insert into foobar values
  ('TianJing','2015-01',1888)

9, select * from foobar ;
sysid   theprovince   themonth   therevenue
    1   BeiJing       2015-01    1999
    2   ShangHai      2015-01    1988
    5   TianJing      2015-01    1888
---
Not reasonable result or issue:
   The third record, TianJing, gets a sysid of 5, because the sequence was
accessed twice by step 6.

Suggestion: when a conflict is found for UPSERT, don't access the sequence,
so users get a reasonable list of IDs.


Regards,

Daojing Zhou.


Re: [HACKERS] drop/truncate table sucks for large values of shared buffers

2015-07-12 Thread Amit Kapila
On Thu, Jul 2, 2015 at 7:33 PM, Tom Lane t...@sss.pgh.pa.us wrote:

 Simon Riggs si...@2ndquadrant.com writes:
  On 2 July 2015 at 14:08, Heikki Linnakangas hlinn...@iki.fi wrote:
  I'm marking this as returned with feedback in the commitfest.

  There are no unresolved issues with the approach, nor is it true it is
  slower. If you think there are some, you should say what they are, not
act
  high handedly to reject a patch without reason.

 Have you read the thread?

I have read the thread again with a focus on the various problems
(objections) raised for this patch, and here is a summary of my
reading.

1. lseek can lie about EOF, which could lead to a serious problem
during checkpoint if DROP TABLE fails to remove the shared buffers
belonging to the table being dropped.

This problem can be solved by maintaining a list of dropped relations
and cleaning such buffers during the checkpoint's buffer-pool scan.
Something similar is already done to avoid comparable problems during
fsync.

2. Patch doesn't cater to DROP TABLESPACE and DROP DATABASE
operations.

It would be nice to have a simpler solution for these operations as
well, but even if we come up with something generic enough to avoid
problems for them too, I think there is no reason why it couldn't
easily be adopted for the DROP TABLE operation, as the changes
proposed by this patch are very much localized to one function (which
we would have to change anyway, even without this patch, if we come
up with a generic mechanism).  The other changes required to avoid
problem 1 (the lseek problem) would still be needed even with a
generic approach.  As mentioned by Andrew, another thing to note is
that these operations are much less used than DROP/TRUNCATE TABLE, so
I think optimizing them is of somewhat lower priority.


3. We can't close and reopen a file (to avoid lseek lying about EOF,
or otherwise), as that might lead to a failure if there is a flush
operation on the file in parallel.

I haven't checked this, but I think we can find some way to check
whether the vm and fsm files exist before checking the number of
blocks in those files.


Apart from the above, Heikki mentioned an overflow in the
total-number-of-blocks calculation, which I think is a relatively
simple problem to fix.


  There were plenty of objections, as well as
 a design for a better solution.

I think by better solution you mean the radix-based approach or
something similar. First, I don't see any clear design for it, and
second, even if we had a patch for it ready tomorrow, it would not be
very difficult to replace the proposed solution even after it is
committed, as the changes are very much localized to one function.

  In addition, we should think about
 more holistic solutions such as Andres' nearby proposal (which I doubt
 will work as-is, but maybe somebody will think of how to fix it).
 Committing a band-aid will just make it harder to pursue real fixes.


Right, but OTOH waiting a long time for something much more generic
doesn't sound wise either, especially when we can later replace this
with the generic solution without much difficulty.

Having said that, I am not wedded to working on this idea, so if you
and/or others have no inclination for work in this direction, I will
stop and we can wait until we have a clear idea for some generic
approach.


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] [PATCH] Generalized JSON output functions

2015-07-12 Thread Merlin Moncure
On Sun, Jul 12, 2015 at 4:35 AM, Pavel Stehule pavel.steh...@gmail.com wrote:


 2015-07-12 10:29 GMT+02:00 Shulgin, Oleksandr
 oleksandr.shul...@zalando.de:

 On Jul 11, 2015 8:41 PM, Pavel Stehule pavel.steh...@gmail.com wrote:
 
  There is a simple rule - be strict on output and tolerant on input. If I
  understand the sense of this patch, the target is one uniform format for
  JSON documents - so there is no room for any variability.

 So, would you prefer explain json format on a single line - no indentation
 or whitespace whatsoever?

 Yes - if you need a pretty format, there is the function json_pretty - any
 more styles would be wrong on the Postgres side.


 So far it was only about whitespace, but it can be useful for tweaking
 other aspects of output, as I've mentioned before.

 Postgres is a database, not a presentation server - it has to do all
 database operations as quickly as possible - and formatting belongs on
 the client side.

 I can imagine the ability for 3rd-party code to override certain aspects
 of the output would be really useful for extensions or background workers,
 decoding plugins, etc.

 We are talking about output - I can imagine only two possibilities
 - plain JSON, and pretty-formatted JSON (but with only one style).

This makes sense.  Postgres core really only needs to support the
minimum styles necessary for core requirements.  This means raw
unformatted JSON for data produced for clients, and appropriate
formatting for EXPLAIN.  Fancier stuff like a generic formatter is
fine, but those features *belong in an extension*.

merlin




Re: [HACKERS] TABLESAMPLE doesn't actually satisfy the SQL spec, does it?

2015-07-12 Thread Robert Haas
On Sun, Jul 12, 2015 at 12:02 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 As best I can tell (evidence below), the SQL standard requires that if a
 single query reads a table with a TABLESAMPLE clause multiple times (say,
 because it's on the inside of a nestloop), then the exact same set of
 sampled rows are returned each time.

Hmm, I tend to agree that it would be good if it behaved that way.
Otherwise, it seems like the behavior could be quite surprising.
Generally, we don't want the set of tuples that can be seen by a query
to change during the query; that's one of the things that snapshot
isolation does for us, as compared with, say, a literal interpretation
of READ COMMITTED, which would behave as SnapshotNow used to do.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] [PATCH] Generalized JSON output functions

2015-07-12 Thread Pavel Stehule
2015-07-12 20:11 GMT+02:00 Shulgin, Oleksandr oleksandr.shul...@zalando.de:

   We are talking about output - I can imagine only two possibilities
   - plain JSON, and pretty-formatted JSON (but with only one style).
 
  This makes sense.  Postgres core really only needs to support the
  minimum styles necessary for core requirements.  This means raw
  unformatted json for data productions to client and an appropriate
  formatting for explain.  Fancier stuff like a generic formatted is
  fine but those features *belong in an extension*.

 The thing is - it's not only about whitespace, otherwise I would probably
 not bother with the generic interface. For my original problem, there is
 simply no way to do this correctly in an extension without copying over all
 of the logic from json.c, which is what I have to do right now and would
 rather not.

I am sorry - we are talking about JSON, not about arbitrarily styled
documents. And I disagree that it cannot be implemented as an extension -
the backport of JSON support is an extension.

Regards

Pavel



 Alex



[HACKERS] TRANSFORM modules vs. AIX

2015-07-12 Thread Noah Misch
(was Re: [COMMITTERS] pgsql: Add transforms feature)

On Tue, May 05, 2015 at 10:24:03PM -0400, Peter Eisentraut wrote:
 Any dlopenable module will have undefined symbols, namely those that it
 uses to call back into the executable that loads the module.  The only
 way it can know to not complain about those symbols is if we tell it
 about that executable at build time.  Our current shared library build
 setup makes reference to the postgres executable only on three
 platforms: Windows, OS X, AIX.  All other platforms necessarily accept
 all undefined symbols at build time.  We already know that it can also
 be made to work on Windows and OS X.  I expect that we might need some
 tweaking on AIX, but I'm quite sure it can be made to work there also,

We can address AIX almost exactly like we address Windows, attached.
Relocating an AIX installation will then require adding $(pkglibdir) to
LD_LIBRARY_PATH.  Windows, too, must need the reference from ltree_plpython to
plpython; I'll add it separately.  (It is for PLyUnicode_FromStringAndSize(),
which we omit under Python 2.  The buildfarm's Windows+GCC+Python animal,
frogmouth, uses Python 2.)


contrib/hstore_plperl test coverage revealed a PL/Perl bug found in all
supported branches.  Any import of a native-code module crashes:

  create language plperlu;
  CREATE FUNCTION test1none() RETURNS int LANGUAGE plperlu
AS $$ use IO; return 0; $$;

man perlaix gives the essential clue:

   Note that starting from Perl 5.7.2 (and consequently 5.8.0) and AIX
   4.3 or newer Perl uses the AIX native dynamic loading interface in
   the so called runtime linking mode instead of the emulated interface
   that was used in Perl releases 5.6.1 and earlier or, for AIX releases
   4.2 and earlier.  This change does break backward compatibility with
   compiled modules from earlier perl releases.  The change was made to
   make Perl more compliant with other applications like Apache/mod_perl
   which are using the AIX native interface. This change also enables
   the use of C++ code with static constructors and destructors in perl
   extensions, which was not possible using the emulated interface.

PostgreSQL uses AIX default linking, not AIX runtime linking.  Perl module
shared objects (e.g. IO.so) contain undefined symbols and rely on the runtime
linker to resolve them.  In an executable free from the runtime linker, those
symbols just remain NULL.  I am inclined to back-patch the following:

--- a/src/backend/Makefile
+++ b/src/backend/Makefile
@@ -90,3 +90,3 @@ ifeq ($(PORTNAME), aix)
 postgres: $(POSTGRES_IMP)
-   $(CC) $(CFLAGS) $(LDFLAGS) $(LDFLAGS_EX) $(call expand_subsys,$(OBJS)) -Wl,-bE:$(top_builddir)/src/backend/$(POSTGRES_IMP) $(LIBS) -o $@
+   $(CC) $(CFLAGS) $(LDFLAGS) $(LDFLAGS_EX) $(call expand_subsys,$(OBJS)) -Wl,-bE:$(top_builddir)/src/backend/$(POSTGRES_IMP) $(LIBS) -Wl,-brtllib -o $@

That allows the runtime linker to process modules dlopen()'ed within the
postgres executable.  To my knowledge, it changes nothing else.


At some point, we might switch from AIX default linking to runtime linking.
(Concretely, that entails building modules with -Wl,-G and the postgres
executable with -Wl,-brtl.)  Runtime linking treats undefined symbols and
duplicate symbols much more like other modern systems treat them.  I don't
personally wish to rock this boat to that degree.  ld -lfoo ignores foo.so,
but ld -brtl -lfoo prefers foo.so over foo.a.  Therefore, such a change
wouldn't be back-patch material.  Details on AIX shared library linking:

http://www-01.ibm.com/support/knowledgecenter/ssw_aix_71/com.ibm.aix.genprogc/shared_object_runtime_linking.htm
http://www-01.ibm.com/support/knowledgecenter/ssw_aix_71/com.ibm.aix.cmds3/ld.htm
diff --git a/contrib/hstore_plperl/Makefile b/contrib/hstore_plperl/Makefile
index 19a8ab4..46ce424 100644
--- a/contrib/hstore_plperl/Makefile
+++ b/contrib/hstore_plperl/Makefile
@@ -23,12 +23,17 @@ include $(top_builddir)/src/Makefile.global
 include $(top_srcdir)/contrib/contrib-global.mk
 endif
 
+# In configurations that forbid undefined symbols in libraries, link with each
+# dependency.  This does preclude pgxs builds.
 ifeq ($(PORTNAME), win32)
 # these settings are the same as for plperl
 override CPPFLAGS += -DPLPERL_HAVE_UID_GID -Wno-comment
-# This means we need an in-tree build on Windows, not a pgxs build
 SHLIB_LINK += ../hstore/libhstore.a $(wildcard ../../src/pl/plperl/libperl*.a)
 endif
+ifeq ($(PORTNAME), aix)
+rpathdir = $(pkglibdir):$(perl_archlibexp)/CORE
+SHLIB_LINK += ../hstore/libhstore.exp $(perl_embed_ldflags)
+endif
 
 # As with plperl we need to make sure that the CORE directory is included
 # last, probably because it sometimes contains some header files with names
diff --git a/contrib/hstore_plpython/Makefile b/contrib/hstore_plpython/Makefile
index 6ee434b..9b53d06 100644
--- a/contrib/hstore_plpython/Makefile
+++ b/contrib/hstore_plpython/Makefile
@@ 

Re: [HACKERS] [PATCH] Generalized JSON output functions

2015-07-12 Thread Shulgin, Oleksandr
  we are talking about output - I can imagine that there are only two
possibilities
  - plain json, and pretty-formatted json (but with only one style).

 This makes sense.  Postgres core really only needs to support the
 minimum styles necessary for core requirements.  This means raw
 unformatted json for data productions to client and an appropriate
 formatting for explain.  Fancier stuff like a generic formatted is
 fine but those features *belong in an extension*.

The thing is - it's not only about whitespace, otherwise I would probably
not bother with the generic interface. For my original problem, there is
simply no way to do this correctly in an extension w/o copying over all of
the logic from json.c, which I have to do right now, but would rather not.

Alex


Re: [HACKERS] TABLESAMPLE doesn't actually satisfy the SQL spec, does it?

2015-07-12 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Sun, Jul 12, 2015 at 12:02 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 As best I can tell (evidence below), the SQL standard requires that if a
 single query reads a table with a TABLESAMPLE clause multiple times (say,
 because it's on the inside of a nestloop), then the exact same set of
 sampled rows are returned each time.

 Hmm, I tend to agree that it would be good if it behaved that way.
 Otherwise, it seems like the behavior could be quite surprising.

Yeah.  As a concrete example, consider

select * from t1, t2 tablesample ... where t1.x = t2.x

and suppose that there are multiple occurrences of x = 10 in both tables.
As things stand, if the join is done as a nestloop then a particular t2
row with x = 10 might appear in the output joined with some of the t1 rows
with x = 10 but not with others.  On the other hand, the results of a hash
join would not be inconsistent in that way, since t2 would be read only
once.

regards, tom lane




[HACKERS] TABLESAMPLE doesn't actually satisfy the SQL spec, does it?

2015-07-12 Thread Tom Lane
As best I can tell (evidence below), the SQL standard requires that if a
single query reads a table with a TABLESAMPLE clause multiple times (say,
because it's on the inside of a nestloop), then the exact same set of
sampled rows are returned each time.  The BERNOULLI code, at least, fails
to achieve this.  Suppose that a concurrent session adds some tuples to
a page in between scans.  When that page is visited the next time, the
sampler RNG will be advanced more times than it was previously, because
it's called a number of times that depends on PageGetMaxOffsetNumber().
So the RNG state will be different at the end of the page, and then
the set of tuples selected from subsequent pages will be completely
different from what it was before.  This will happen even if none of the
added rows are committed yet, let alone visible to the query's snapshot;
which makes it a failure of serializability even if you don't believe
the argument that it violates the requirements for TABLESAMPLE.

Now, as for the language-lawyering aspect of it, I read in SQL2011
7.6 table reference general rule 5a:

  iv) Case:

1) If sample method specifies BERNOULLI, then the result of TF is a
  table containing approximately (N*S/100) rows of RT. The probability
  of a row of RT being included in result of TF is S/100. Further,
  whether a given row of RT is included in result of TF is independent
  of whether other rows of RT are included in result of TF.

2) Otherwise, result of TF is a table containing approximately
  (N*S/100) rows of RT. The probability of a row of RT being included in
  result of TF is S/100.

  v) If TF contains outer references, then a table with identical rows is
generated every time TF is evaluated with a given set of values for
outer references.

This seems to me to clearly require that a fixed set of samples (ie, a
table, whether real or virtual) is selected during any query.  Note that
this requirement is the same whether or not REPEATABLE is used.

Also, I found a description of IBM DB2's implementation of this feature,
http://www.almaden.ibm.com/cs/people/peterh/idugjbig.pdf
and in that they say:

  Semantically, sampling of a table occurs before any other query
  processing, such as applying predicates or performing joins. One can
  envision the original tables referenced in a query being initially
  replaced by temporary “reduced” tables containing sampled rows, and then
  normal query processing commencing on the reduced tables. (For
  performance reasons, actual processing may not occur in exactly this
  manner.) It follows, for example, that repeated accesses of a sampled
  table within the same query, such as in a nested-loop join or a
  correlated subquery, will see the same sample of that table within a
  single execution of that query.

They do mention that alterations to a table may result in *successive*
queries giving different results, even when using REPEATABLE.  But I do
not think it's okay for this to happen within a single query.  That would
mean that vagaries of plan choice, eg nestloop vs some other join method,
would produce inconsistent results.

A possible way around this problem is to redefine the sampling rule so
that it is not history-dependent but depends only on the tuple TIDs.
For instance, one could hash the TID of a candidate tuple, xor that with
a hash of the seed being used for the current query, and then select the
tuple if (hash/MAXINT) < P.

regards, tom lane

