Re: [HACKERS] Free indexed_tlist memory explicitly within set_plan_refs()
On 2015-05-28 14:37:43 -0700, Peter Geoghegan wrote:
> To fix, allow ParseState to reflect that an individual statement can be both p_is_insert and p_is_update at the same time.
>
>  	/* Process DO UPDATE */
>  	if (onConflictClause->action == ONCONFLICT_UPDATE)
>  	{
> +		/* p_is_update must be set here, after INSERT targetlist processing */
> +		pstate->p_is_update = true;
> +

It's not particularly pretty that you document in the commit message that both is_insert and is_update can be set at the same time, and then it has constraints like the above. But that's more the crummy API's fault than yours. I'm right now not really coming up with a better idea of how to fix this, so I guess I'll apply something close to this tomorrow.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Free indexed_tlist memory explicitly within set_plan_refs()
On Sun, Jul 12, 2015 at 1:45 PM, Andres Freund <and...@anarazel.de> wrote:
> But that's more crummy API's fault than yours.

As you probably noticed, the only reason the p_is_update and p_is_insert fields exist is for transformAssignedExpr() -- in fact, in the master branch, nothing checks the value of p_is_update (although I suppose a hook into a third-party module could see that module test ParseState.p_is_update, so that isn't quite true).

> I'm right now not really coming up with a better idea how to fix this. So I guess I'll apply something close to this tomorrow.

Sounds good.

-- 
Peter Geoghegan
Re: [HACKERS] PostgreSQL 9.5 Alpha 1 build fail with perl 5.22
On Sun, Jul 12, 2015 at 05:02:51PM -0400, Andrew Dunstan wrote:
> Marco Atzeri <marco.atz...@gmail.com> writes:
> > for what I see the hstore_plperl link has a double problem. It requires a link to hstore as it also requires a link to perl. Attached patch for solving this and a similar issue with python.
> >
> > +ifeq ($(PORTNAME), cygwin)
> > +# This means we need an in-tree build on Windows, not a pgxs build
> > +SHLIB_LINK += -L../hstore -lhstore -L$(perl_archlibexp)/CORE -lperl
> > +endif
>
> That's the right general strategy, agreed.
>
> Unless there is further argument I propose to commit something very like Marco's suggestion for hstore_plperl, hstore_plpython and ltree_plpython

Would you post the final patch for review?

Thanks,
nm
Re: [HACKERS] PostgreSQL 9.5 Alpha 1 build fail with perl 5.22
Andrew Dunstan <and...@dunslane.net> writes:
> On 07/04/2015 11:02 AM, Tom Lane wrote:
> > It's not apparent to me how that works at all.
>
> BTW, the .a files being linked to above are not like Unix .a static archives - they are import library files, which I think are only used at link time, not run time. The path to the DLLs isn't being hardcoded.

Ah, I see. So then what Marco is suggesting is copying a coding pattern that does work.

> Unless there is further argument I propose to commit something very like Marco's suggestion for hstore_plperl, hstore_plpython and ltree_plpython

No objection here.

regards, tom lane
Re: [HACKERS] creating extension including dependencies
On Tue, Jul 07, 2015 at 10:14:49AM -0700, David E. Wheeler wrote:
> On Jul 7, 2015, at 6:41 AM, Andres Freund <and...@anarazel.de> wrote:
> > At the minimum I'd like to see that CREATE EXTENSION foo; would install extension 'bar' if foo depended on 'bar' and CASCADE is specified. Right now we always error out saying that the dependency on 'bar' is not fulfilled - not particularly helpful.
>
> +1
>
> If `yum install foo` also installs bar, and `pgxn install foo` downloads, builds, and installs bar, it makes sense to me that `CREATE EXTENSION foo` would install bar if it was available, and complain if it wasn't.

This is the baseline sane behavior. Getting the full dependency tree, although it would be very handy, would require more infrastructure than we have now.

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
Re: [HACKERS] Fixes for a couple of resource leaks
Michael Paquier <michael.paqu...@gmail.com> writes:
> Coverity is pointing out a couple of resource leaks: ...
> Attached is a patch to address all those things. Backpatches would be good to have as well.

I'll take this up, unless some other committer is already on it.

regards, tom lane
Re: [HACKERS] PostgreSQL 9.5 Alpha 1 build fail with perl 5.22
On 07/04/2015 01:09 PM, Andrew Dunstan wrote:
> On 07/04/2015 11:02 AM, Tom Lane wrote:
> > Marco Atzeri <marco.atz...@gmail.com> writes:
> > > for what I see the hstore_plperl link has a double problem. It requires a link to hstore as it also requires a link to perl. Attached patch for solving this and a similar issue with python.
> > >
> > > +ifeq ($(PORTNAME), cygwin)
> > > +# This means we need an in-tree build on Windows, not a pgxs build
> > > +SHLIB_LINK += -L../hstore -lhstore -L$(perl_archlibexp)/CORE -lperl
> > > +endif
> >
> > [ and likewise for the other contrib transform modules ]
> >
> > I wondered how come we had not seen this problem in the buildfarm, but the answer appears to be that our only working Cygwin critter (brolga) doesn't build any of the optional PLs, so it skips these modules altogether. Seems like we need to improve that situation.
> >
> > Also, I noted that the regular win32 path in these makefiles says, eg,
> >
> > ifeq ($(PORTNAME), win32)
> > # these settings are the same as for plperl
> > override CPPFLAGS += -DPLPERL_HAVE_UID_GID -Wno-comment
> > # This means we need an in-tree build on Windows, not a pgxs build
> > SHLIB_LINK += ../hstore/libhstore.a $(wildcard ../../src/pl/plperl/libperl*.a)
> > endif
> >
> > It's not apparent to me how that works at all. It seems to specify hard-linking a copy of hstore as well as a copy of plperl into the shlib for hstore_plperl. Then at runtime, there will *also* be the hstore and plperl shlibs in memory. At best that means substantial memory bloat, but it seems likely to me that it would fail altogether, particularly for plperl, which has got a substantial amount of semantically-important static storage. Two copies of that storage will not end well.

Windows finds the DLL in its path. I just tested this by removing the in-tree PL DLLs after installing and then running contrib installcheck, and it worked fine. Whether or not it actually loads the DLL twice when it can find the in-tree DLL I don't know for sure; maybe someone with more Windows-fu than me can inform our ignorance.

BTW, the .a files being linked to above are not like Unix .a static archives - they are import library files, which I think are only used at link time, not run time. The path to the DLLs isn't being hardcoded.

Unless there is further argument I propose to commit something very like Marco's suggestion for hstore_plperl, hstore_plpython and ltree_plpython

cheers

andrew
Re: [HACKERS] One question about security label command
> Stephen, would you have the time to review this patch, and commit if appropriate, please? And if you could set up the buildfarm animal to run this, even better.

I gave this a quick review/test against master (0a0fe2f). Everything builds and installs as would be expected. All of the context changes make sense to me. However, I am still seeing some errors in the regression tests. The errors look like they are solely based around log messages and not 'functionality'. I have attached the 'regression.diffs' for reference.

-Adam

-- 
Adam Brightwell - adam.brightw...@crunchydatasolutions.com
Database Engineer - www.crunchydatasolutions.com

Attachment: regression.diffs
Re: [HACKERS] Fixes for a couple of resource leaks
On Mon, Jul 13, 2015 at 4:06 AM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> Michael Paquier <michael.paqu...@gmail.com> writes:
> > Coverity is pointing out a couple of resource leaks: ...
> > Attached is a patch to address all those things. Backpatches would be good to have as well.
>
> I'll take this up, unless some other committer is already on it.

Thanks for taking those up! Next time I'll also mention down to which version each fix should be backpatched, to facilitate the committer's work. That was obviously bad of me to not mention it.

-- 
Michael
Re: [HACKERS] optimizing vacuum truncation scans
On Thu, Jul 9, 2015 at 5:36 PM, Haribabu Kommi <kommi.harib...@gmail.com> wrote:
> I will do some performance tests and send you the results.

Here are the performance results tested on my machine.

                 Head      vm patch   vm+prefetch patch
First vacuum     120 sec   1 sec      1 sec
Second vacuum    180 sec   180 sec    30 sec

I did some modifications in the code to skip the vacuum truncation in the first vacuum command; this way I collected the time taken by the second vacuum. I just combined your vm and prefetch patches into a single vm+prefetch patch, without a GUC. I kept the prefetch distance at 32 and did the performance test. I chose that prefetch distance based on the current buffer access strategy, which is 32 for vacuum presently, instead of making it a user option. Attached is the modified patch with both the vm and prefetch logic.

I will do some tests on a machine with SSD and let you know the results. Based on those results, we can decide whether we need a GUC or not, depending on the impact of prefetch on SSD machines.

Regards,
Hari Babu
Fujitsu Australia

Attachment: vac_trunc_trust_vm_and_prefetch.patch
Re: [HACKERS] anole: assorted stability problems
On Tue, Jul 7, 2015 at 7:25 AM, Andres Freund <and...@anarazel.de> wrote:
> On 2015-06-30 11:35:56 +0200, Andres Freund wrote:
> > On 2015-06-29 22:58:05 -0400, Tom Lane wrote:
> > > So personally, I would be inclined to put back the volatile qualifier, independently of any fooling around with _Asm_double_magic_xyzzy calls.
> >
> > I'm not sure. I think the reliance on an explicit memory barrier is a lot more robust and easy to understand than some barely documented odd behaviour around volatile. On the other hand the old way worked for a long while. I'm inclined to just do both on platforms as odd as IA64. But I'd like to let anole run with the current set a bit longer - if it doesn't work we have more problems than just S_UNLOCK(). It seems EDB has increased the run rate for now, so it shouldn't take too long: http://buildfarm.postgresql.org/cgi-bin/show_history.pl?nm=anole&br=HEAD
>
> So, it's starting to look good. Not exactly allowing for a lot of confidence yet, but still: http://buildfarm.postgresql.org/cgi-bin/show_history.pl?nm=anole&br=HEAD
>
> I'm inclined to simply revise the comments now, and *not* reintroduce the volatile. The assumptions documented in:
>
> /*
>  * Intel Itanium, gcc or Intel's compiler.
>  *
>  * Itanium has weak memory ordering, but we rely on the compiler to enforce
>  * strict ordering of accesses to volatile data.  In particular, while the
>  * xchg instruction implicitly acts as a memory barrier with 'acquire'
>  * semantics, we do not have an explicit memory fence instruction in the
>  * S_UNLOCK macro.  We use a regular assignment to clear the spinlock, and
>  * trust that the compiler marks the generated store instruction with the
>  * .rel opcode.
>  *
>  * Testing shows that assumption to hold on gcc, although I could not find
>  * any explicit statement on that in the gcc manual.  In Intel's compiler,
>  * the -m[no-]serialize-volatile option controls that, and testing shows that
>  * it is enabled by default.
>  */
>
> don't sound exactly bullet proof to me. I also personally find explicit barriers easier to understand in the light of all the other spinlock implementations. Comments?

I'm fine with that plan.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] TABLESAMPLE patch is really in pretty sad shape
I wrote:
> The two contrib modules this patch added are nowhere near fit for public consumption. They cannot clean up after themselves when dropped: ... Raw inserts into system catalogs just aren't a sane thing to do in extensions.

I had some thoughts about how we might fix that, without going to the rather tedious lengths of creating a complete set of DDL infrastructure for CREATE/DROP TABLESAMPLE METHOD.

First off, the extension API designed for the tablesample patch is evidently modeled on the index AM API, which was not a particularly good precedent --- it's not a coincidence that index AMs can't be added or dropped on-the-fly. Modeling a server internal API as a set of SQL-visible functions is problematic when the call signatures of those functions can't be accurately described by SQL datatypes, and it's rather pointless and inefficient when none of the functions in question is meant to be SQL-callable. It's even less attractive if you don't think you've got a completely stable API spec, because adding, dropping, or changing the signature of any one of the API functions then involves a pile of easy-to-mess-up catalog changes on top of the actually useful work. Not to mention then having to think about backwards compatibility of your CREATE command's arguments.

We have a far better model to follow already, namely the foreign data wrapper API. In that, there's a single SQL-visible function that returns a dummy datatype indicating that it's an FDW handler, and when called, it hands back a struct containing pointers to all the other functions that the particular wrapper needs to supply (and, if necessary, the struct could have non-function-pointer fields containing other info the core system might need to know about the handler). We could similarly invent a pseudotype tablesample_handler and represent each tablesample method by a single SQL function returning tablesample_handler.

Any future changes in the API for tablesample handlers would then appear as changes in the C definition of the struct returned by the handler, which requires no SQL-visible thrashing, hence creates no headaches for pg_upgrade, and it is pretty easy to design it so that failure to update an external module that implements the API results in C compiler errors and/or sane fallback behavior.

Once we've done that, I think we don't even need a special catalog representing tablesample methods. Given FROM foo TABLESAMPLE bernoulli(...), the parser could just look for a function bernoulli() returning tablesample_handler, and it's done. The querytree would have an ordinary function dependency on that function, which would be sufficient to handle DROP dependency behaviors properly. (On reflection, maybe better if it's bernoulli(internal) returns tablesample_handler, so as to guarantee that such a function doesn't create a conflict with any user-defined function of the same name.)

Thoughts?

regards, tom lane

PS: now that I've written this rant, I wonder why we don't redesign the index AM API along the same lines. It probably doesn't matter much at the moment, but if we ever get serious about supporting index AM extensions, I think we ought to consider doing that.
Re: [HACKERS] A little RLS oversight?
On Sun, Jul 12, 2015 at 5:59 PM, Yaroslav wrote:
> I can still see all statistics for 'test' in pg_stats under an unprivileged user.

Indeed, this looks like an oversight of RLS. Even if a policy is defined to prevent a user from seeing the rows of other users, it is still possible to get some information through this view. I am adding an open item regarding that for 9.5.

> I'd prefer statistics on RLS-enabled tables to be simply hidden completely for unprivileged users.

This looks like something simple enough to do. @Stephen: perhaps you have some thoughts on the matter? Currently pg_stats breaks its promise to only show information about the rows the current user can read.

-- 
Michael
Re: [HACKERS] [PATCH] Generalized JSON output functions
2015-07-12 10:29 GMT+02:00 Shulgin, Oleksandr <oleksandr.shul...@zalando.de>:
> On Jul 11, 2015 8:41 PM, Pavel Stehule <pavel.steh...@gmail.com> wrote:
> > There is a simple rule - be strict on output and tolerant on input. If I understand the sense of this patch - the target is one single format of JSON documents - so there is no room for any variability.
>
> So, would you prefer explain json format on a single line - no indentation or whitespace whatsoever?

yes - if you need a pretty format, there is the function json_pretty - any more styles would be wrong on the Postgres side.

> This far it was only about whitespace, but it can be useful for tweaking other aspects of output, as I've mentioned before.

Postgres is a database, not a presentation server - it has to do all database operations as quickly as possible - and formatting is part of the client side.

> I can imagine the ability for 3rd-party code to override certain aspects of the output would be really useful for extensions or background workers, decoding plugins, etc.

we are talking about output - I can imagine only two possibilities - plain JSON, and pretty-formatted JSON (but with only one style).

> > I am thinking generic json functions make sense, but I partially disagree with your implementation.
>
> Then what would you do differently exactly?

simple code.
Re: [HACKERS] Re: Removing SSL renegotiation (Was: Should we back-patch SSL renegotiation fixes?)
On Sat, Jul 11, 2015 at 9:28 PM, Andres Freund <and...@anarazel.de> wrote:
> On 2015-07-11 21:09:05 +0900, Michael Paquier wrote:
> > Something like the patches attached
>
> Thanks for that!
>
> > could be considered, one is for master and REL9_5_STABLE to remove ssl_renegotiation_limit, the second one for ~REL9_4_STABLE to change the default to 0.
> >
> > diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
> > index c669f75..16c0ce5 100644
> > --- a/doc/src/sgml/config.sgml
> > +++ b/doc/src/sgml/config.sgml
> > @@ -1040,7 +1040,7 @@ include_dir 'conf.d'
> >          cryptanalysis when large amounts of traffic can be examined, but it
> >          also carries a large performance penalty. The sum of sent and received
> >          traffic is used to check the limit. If this parameter is set to 0,
> > -        renegotiation is disabled. The default is <literal>512MB</>.
> > +        renegotiation is disabled. The default is <literal>0</>.
>
> I think we should put in a warning or at least note about the dangers of enabling it (connection breaks, exposure to several open openssl bugs).

This sounds like a good idea to me. Here is an idea:

+   <warning>
+    <para>
+     Enabling <varname>ssl_renegotiation_limit</> can cause various
+     problems endangering the stability of a <productname>PostgreSQL</>
+     instance, like connections breaking suddenly, and exposes the
+     server to bugs related to the internal implementation of renegotiation
+     done in the SSL libraries used.
+    </para>
+   </warning>

Attached is v2 for ~9.4.

Regards,
-- 
Michael

Attachment: 20150712_ssl_renegotiation_remove-94_v2.patch
[HACKERS] Fixes for a couple of resource leaks
Hi all,

Coverity is pointing out a couple of resource leaks:
- In DropReplicationSlot@streamutil.c, query is leaked.
- In getTransforms@pg_dump.c, the alloced result of get_language_name is not free'd. Other code paths calling this routine do the job.
- In libpqGetCurrentXlogInsertLocation@libpq_fetch.c (pg_rewind), the value val returned by run_simple_query should be free'd. Other code paths do so.
- In libpqProcessFileList@libpq_fetch.c, a call to PQclear is missing for a query result.
- In vacuum_one_database@vacuumdb.c, a call to PQclear is missing.

Attached is a patch to address all those things. Backpatches would be good to have as well.

Regards,
-- 
Michael

Attachment: 20150712_memory_leaks.patch
[HACKERS] Could be improved point of UPSERT
Hi, Hackers,

The UPSERT feature was the one I was most interested in for 9.5, and I really need it. I have tested different usages of this feature, such as single-record input, multi-record input, and also upserting more than 10,000 records; all look great, thanks for your work.

When I checked the records from these tests, I found one result that might be improved. The issue is: although no new record is added to the table when a conflict happens, the sequence for this table has been touched, so when a new record is actually added after that, the sequence skips the numbers consumed at conflict time, and we get an unreasonable result (in my opinion). The scenario is as follows:

1. System: PostgreSQL 9.5 Alpha + Win7 x64
2. Enter pgadmin
3. create table foobar (
     sysid serial,
     theprovince varchar(20),
     themonth varchar(7),
     therevenue integer default 0,
     primary key (sysid),
     unique (theprovince, themonth)
   )
4. insert into foobar values ('BeiJing','2015-01',1234), ('ShangHai','2015-01',1927)
5. select * from foobar;
   sysid | theprovince | themonth | therevenue
       1 | BeiJing     | 2015-01  |       1234
       2 | ShangHai    | 2015-01  |       1927
6. insert into foobar values ('BeiJing','2015-01',1999), ('ShangHai','2015-01',1988) on conflict (theprovince, themonth) do update set therevenue = excluded.therevenue;
7. select * from foobar;
   sysid | theprovince | themonth | therevenue
       1 | BeiJing     | 2015-01  |       1999
       2 | ShangHai    | 2015-01  |       1988
8. insert into foobar values ('TianJing','2015-01',1888)
9. select * from foobar;
   sysid | theprovince | themonth | therevenue
       1 | BeiJing     | 2015-01  |       1999
       2 | ShangHai    | 2015-01  |       1988
       5 | TianJing    | 2015-01  |       1888

The unreasonable result, or issue: the third record, TianJing, gets a sysid of 5, because the sequence was accessed twice by step 6.

Suggestion: when a conflict is found during UPSERT, don't access the sequence, so users can have a reasonable list of IDs.

Regards,
Daojing Zhou.
Re: [HACKERS] drop/truncate table sucks for large values of shared buffers
On Thu, Jul 2, 2015 at 7:33 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> Simon Riggs <si...@2ndquadrant.com> writes:
> > On 2 July 2015 at 14:08, Heikki Linnakangas <hlinn...@iki.fi> wrote:
> > > I'm marking this as returned with feedback in the commitfest.
> >
> > There are no unresolved issues with the approach, nor is it true it is slower. If you think there are some, you should say what they are, not act high handedly to reject a patch without reason.
>
> Have you read the thread?

I have read the thread again with a focus on the various problems (objections) raised for this patch, and here is a summarization of my reading:

1. lseek can lie about EOF, which could lead to a serious problem during checkpoint if drop table misses removing a shared buffer belonging to the table-to-be-dropped. This problem can be solved by maintaining a list of dropped relations and then having checkpoint clean such buffers during its buffer-pool scan. Something similar is already used to avoid similar problems during fsync.

2. The patch doesn't cater to DROP TABLESPACE and DROP DATABASE operations. It would have been better if there were a simpler solution for these operations as well, but even if we have something generic enough to avoid problems for these operations, I think there is no reason why it can't be easily adopted for the Drop Table operation, as the changes proposed by this patch are very much localized to one function (which we would have to change anyway, even without this patch, if we come up with a generic mechanism); the other changes required to avoid problem 1 (the lseek problem) would still be required even when we have a patch for the generic approach ready. As mentioned by Andrew, another thing to note is that these operations are much less used compared to Drop/Truncate Table, so I think optimizing them is of somewhat lower priority.

3. We can't close-and-open a file (to avoid lseek lying about EOF, or otherwise) as that might lead to a failure if there is a flush operation for the file in parallel. I haven't checked this, but I think we can find some way to check whether the vm and fsm files exist before checking the number of blocks for those files.

Apart from the above, Heikki mentioned an overflow in the total-number-of-blocks calculation, which I think is a relatively simpler problem to fix.

> There were plenty of objections, as well as a design for a better solution.

I think by better solution here you mean the radix-based approach or something similar; first, I don't see any clear design for it, and second, even if tomorrow we have a patch for it ready, it's not very difficult to change the proposed solution even after it is committed, as the changes are very much localized to one function.

> In addition, we should think about more holistic solutions such as Andres' nearby proposal (which I doubt will work as-is, but maybe somebody will think of how to fix it). Committing a band-aid will just make it harder to pursue real fixes.

Right, but OTOH waiting a long time for something much more generic doesn't sound wise either, especially when we can replace this solution with the generic one without much difficulty. Having said the above, I am not wedded to working on this idea, so if you and/or others have no inclination for work in this direction, then I will stop and let's wait for the day when we have a clear idea for some generic way.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] [PATCH] Generalized JSON output functions
On Sun, Jul 12, 2015 at 4:35 AM, Pavel Stehule <pavel.steh...@gmail.com> wrote:
> 2015-07-12 10:29 GMT+02:00 Shulgin, Oleksandr <oleksandr.shul...@zalando.de>:
> > On Jul 11, 2015 8:41 PM, Pavel Stehule <pavel.steh...@gmail.com> wrote:
> > > There is a simple rule - be strict on output and tolerant on input. If I understand the sense of this patch - the target is one single format of JSON documents - so there is no room for any variability.
> >
> > So, would you prefer explain json format on a single line - no indentation or whitespace whatsoever?
>
> yes - if you need a pretty format, there is the function json_pretty - any more styles would be wrong on the Postgres side.
>
> > This far it was only about whitespace, but it can be useful for tweaking other aspects of output, as I've mentioned before.
>
> Postgres is a database, not a presentation server - it has to do all database operations as quickly as possible - and formatting is part of the client side.
>
> > I can imagine the ability for 3rd-party code to override certain aspects of the output would be really useful for extensions or background workers, decoding plugins, etc.
>
> we are talking about output - I can imagine only two possibilities - plain JSON, and pretty-formatted JSON (but with only one style).

This makes sense. Postgres core really only needs to support the minimum styles necessary for core requirements. This means raw unformatted json for data productions to the client and an appropriate formatting for explain. Fancier stuff like a generic formatter is fine, but those features *belong in an extension*.

merlin
Re: [HACKERS] TABLESAMPLE doesn't actually satisfy the SQL spec, does it?
On Sun, Jul 12, 2015 at 12:02 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> As best I can tell (evidence below), the SQL standard requires that if a single query reads a table with a TABLESAMPLE clause multiple times (say, because it's on the inside of a nestloop), then the exact same set of sampled rows are returned each time.

Hmm, I tend to agree that it would be good if it behaved that way. Otherwise, it seems like the behavior could be quite surprising. Generally, we don't want the set of tuples that can be seen by a query to change during the query; that's one of the things that snapshot isolation does for us, as compared with, say, a literal interpretation of READ COMMITTED, which would behave as SnapshotNow used to do.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] [PATCH] Generalized JSON output functions
2015-07-12 20:11 GMT+02:00 Shulgin, Oleksandr <oleksandr.shul...@zalando.de>:
> > > we are talking about output - I can imagine only two possibilities - plain JSON, and pretty-formatted JSON (but with only one style).
> >
> > This makes sense. Postgres core really only needs to support the minimum styles necessary for core requirements. This means raw unformatted json for data productions to the client and an appropriate formatting for explain. Fancier stuff like a generic formatter is fine, but those features *belong in an extension*.
>
> The thing is - it's not only about whitespace, otherwise I would probably not bother with the generic interface. For my original problem, there is simply no way to do this correctly in an extension w/o copying over all of the logic from json.c, which I have to do right now, but would rather not.

I am sorry - we are talking about JSON, not about any styled document. I disagree that it cannot be implemented as an extension - the backport of JSON support is an extension.

Regards

Pavel
[HACKERS] TRANSFORM modules vs. AIX
(was Re: [COMMITTERS] pgsql: Add transforms feature) On Tue, May 05, 2015 at 10:24:03PM -0400, Peter Eisentraut wrote: Any dlopenable module will have undefined symbols, namely those that it uses to call back into the executable that loads the module. The only way it can know to not complain about those symbols is if we tell it about that executable at build time. Our current shared library build setup makes reference to the postgres executable only on three platforms: Windows, OS X, AIX. All other platforms necessarily accept all undefined symbols at build time. We already know that it can also be made to work on Windows and OS X. I expect that we might need some tweaking on AIX, but I'm quite sure it can be made to work there also, We can address AIX almost exactly like we address Windows, attached. Relocating an AIX installation will then require adding $(pkglibdir) to LD_LIBRARY_PATH. Windows, too, must need the reference from ltree_plpython to plpython; I'll add it separately. (It is for PLyUnicode_FromStringAndSize(), which we omit under Python 2. The buildfarm's Windows+GCC+Python animal, frogmouth, uses Python 2.) contrib/hstore_plperl test coverage revealed a PL/Perl bug found in all supported branches. Any import of a native-code module crashes: create language plperlu; CREATE FUNCTION test1none() RETURNS int LANGUAGE plperlu AS $$ use IO; return 0; $$; man perlaix gives the essential clue: Note that starting from Perl 5.7.2 (and consequently 5.8.0) and AIX 4.3 or newer Perl uses the AIX native dynamic loading interface in the so called runtime linking mode instead of the emulated interface that was used in Perl releases 5.6.1 and earlier or, for AIX releases 4.2 and earlier. This change does break backward compatibility with compiled modules from earlier perl releases. The change was made to make Perl more compliant with other applications like Apache/mod_perl which are using the AIX native interface. 
This change also enables the use of C++ code with static constructors and destructors in perl extensions, which was not possible using the emulated interface. PostgreSQL uses AIX default linking, not AIX runtime linking. Perl module shared objects (e.g. IO.so) contain undefined symbols and rely on the runtime linker to resolve them. In an executable built without runtime-linker support, those symbols simply remain NULL. I am inclined to back-patch the following: --- a/src/backend/Makefile +++ b/src/backend/Makefile @@ -90,3 +90,3 @@ ifeq ($(PORTNAME), aix) postgres: $(POSTGRES_IMP) - $(CC) $(CFLAGS) $(LDFLAGS) $(LDFLAGS_EX) $(call expand_subsys,$(OBJS)) -Wl,-bE:$(top_builddir)/src/backend/$(POSTGRES_IMP) $(LIBS) -o $@ + $(CC) $(CFLAGS) $(LDFLAGS) $(LDFLAGS_EX) $(call expand_subsys,$(OBJS)) -Wl,-bE:$(top_builddir)/src/backend/$(POSTGRES_IMP) $(LIBS) -Wl,-brtllib -o $@ That allows the runtime linker to process modules dlopen()'ed within the postgres executable. To my knowledge, it changes nothing else. At some point, we might switch from AIX default linking to runtime linking. (Concretely, that entails building modules with -Wl,-G and the postgres executable with -Wl,-brtl.) Runtime linking treats undefined symbols and duplicate symbols much more like other modern systems treat them. I don't personally wish to rock this boat to that degree. ld -lfoo ignores foo.so, but ld -brtl -lfoo prefers foo.so over foo.a. Therefore, such a change wouldn't be back-patch material.
Details on AIX shared library linking: http://www-01.ibm.com/support/knowledgecenter/ssw_aix_71/com.ibm.aix.genprogc/shared_object_runtime_linking.htm http://www-01.ibm.com/support/knowledgecenter/ssw_aix_71/com.ibm.aix.cmds3/ld.htm diff --git a/contrib/hstore_plperl/Makefile b/contrib/hstore_plperl/Makefile index 19a8ab4..46ce424 100644 --- a/contrib/hstore_plperl/Makefile +++ b/contrib/hstore_plperl/Makefile @@ -23,12 +23,17 @@ include $(top_builddir)/src/Makefile.global include $(top_srcdir)/contrib/contrib-global.mk endif +# In configurations that forbid undefined symbols in libraries, link with each +# dependency. This does preclude pgxs builds. ifeq ($(PORTNAME), win32) # these settings are the same as for plperl override CPPFLAGS += -DPLPERL_HAVE_UID_GID -Wno-comment -# This means we need an in-tree build on Windows, not a pgxs build SHLIB_LINK += ../hstore/libhstore.a $(wildcard ../../src/pl/plperl/libperl*.a) endif +ifeq ($(PORTNAME), aix) +rpathdir = $(pkglibdir):$(perl_archlibexp)/CORE +SHLIB_LINK += ../hstore/libhstore.exp $(perl_embed_ldflags) +endif # As with plperl we need to make sure that the CORE directory is included # last, probably because it sometimes contains some header files with names diff --git a/contrib/hstore_plpython/Makefile b/contrib/hstore_plpython/Makefile index 6ee434b..9b53d06 100644 --- a/contrib/hstore_plpython/Makefile +++ b/contrib/hstore_plpython/Makefile @@
Re: [HACKERS] [PATCH] Generalized JSON output functions
we're talking about output - I can imagine there being only two possibilities: plain join, and pretty-formatted join (but with only one style). This makes sense. Postgres core really only needs to support the minimum styles necessary for core requirements. This means raw unformatted JSON for data productions to the client and an appropriate formatting for EXPLAIN. Fancier stuff like a generic formatter is fine, but those features *belong in an extension*. The thing is - it's not only about whitespace, otherwise I would probably not bother with the generic interface. For my original problem, there is simply no way to do this correctly in an extension without copying over all of the logic from json.c, which I have to do right now and would rather not. Alex
Re: [HACKERS] TABLESAMPLE doesn't actually satisfy the SQL spec, does it?
Robert Haas robertmh...@gmail.com writes: On Sun, Jul 12, 2015 at 12:02 PM, Tom Lane t...@sss.pgh.pa.us wrote: As best I can tell (evidence below), the SQL standard requires that if a single query reads a table with a TABLESAMPLE clause multiple times (say, because it's on the inside of a nestloop), then the exact same set of sampled rows is returned each time. Hmm, I tend to agree that it would be good if it behaved that way. Otherwise, it seems like the behavior could be quite surprising. Yeah. As a concrete example, consider select * from t1, t2 tablesample ... where t1.x = t2.x and suppose that there are multiple occurrences of x = 10 in both tables. As things stand, if the join is done as a nestloop then a particular t2 row with x = 10 might appear in the output joined with some of the t1 rows with x = 10 but not with others. On the other hand, the results of a hash join would not be inconsistent in that way, since t2 would be read only once. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
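Tom's t1/t2 scenario can be sketched in miniature (a toy model in Python, not Postgres internals; all names here are illustrative): with a stateful per-query sampler, each nestloop rescan of t2 can draw a different subset, while a hash join samples t2 once and joins every outer row against the same subset.

```python
import random

t1 = [10, 10, 10]                       # outer rows, all with x = 10
t2 = [(10, 'a'), (10, 'b'), (10, 'c')]  # inner rows, all with x = 10

rng = random.Random(7)  # query-lifetime RNG; its state persists across rescans

def scan_t2_sampled(p=0.5):
    # Each call advances the shared RNG, so each nestloop rescan can
    # return a different subset of t2.
    return [row for row in t2 if rng.random() < p]

# Nestloop: t2 is rescanned (and resampled) once per outer row, so a
# given t2 row may join with some outer rows but not others.
nestloop = [(i, row) for i, x in enumerate(t1)
            for row in scan_t2_sampled() if row[0] == x]

# Hash join: t2 is sampled exactly once; every outer row joins
# against that single consistent subset.
once = scan_t2_sampled()
hashjoin = [(i, row) for i, x in enumerate(t1)
            for row in once if row[0] == x]
```

In the hash-join case each outer row sees an identical t2 subset by construction, which is exactly the consistency the nestloop path fails to guarantee.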
[HACKERS] TABLESAMPLE doesn't actually satisfy the SQL spec, does it?
As best I can tell (evidence below), the SQL standard requires that if a single query reads a table with a TABLESAMPLE clause multiple times (say, because it's on the inside of a nestloop), then the exact same set of sampled rows is returned each time. The BERNOULLI code, at least, fails to achieve this. Suppose that a concurrent session adds some tuples to a page in between scans. When that page is visited the next time, the sampler RNG will be advanced more times than it was previously, because it's called a number of times that depends on PageGetMaxOffsetNumber(). So the RNG state will be different at the end of the page, and then the set of tuples selected from subsequent pages will be completely different from what it was before. This will happen even if none of the added rows are committed yet, let alone visible to the query's snapshot, which makes it a failure of serializability even if you don't believe the argument that it violates the requirements for TABLESAMPLE. Now, as for the language-lawyering aspect of it, I read in SQL2011 7.6 <table reference> general rule 5a:

iv) Case:
1) If <sample method> specifies BERNOULLI, then the result of TF is a table containing approximately (N*S/100) rows of RT. The probability of a row of RT being included in result of TF is S/100. Further, whether a given row of RT is included in result of TF is independent of whether other rows of RT are included in result of TF.
2) Otherwise, result of TF is a table containing approximately (N*S/100) rows of RT. The probability of a row of RT being included in result of TF is S/100.
v) If TF contains outer references, then a table with identical rows is generated every time TF is evaluated with a given set of values for outer references.

This seems to me to clearly require that a fixed set of samples (ie, a table, whether real or virtual) is selected during any query. Note that this requirement is the same whether or not REPEATABLE is used.
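The history dependence described above can be sketched with a toy model (hypothetical Python, not the actual bernoulli.c logic): because there is one RNG draw per tuple slot, the RNG state carried out of a page depends on that page's tuple count, so an insert on one page perturbs which tuples are picked on every later page.

```python
import random

def bernoulli_scan(pages, seed, p=0.5):
    """Toy model of a history-dependent BERNOULLI scan: one RNG draw
    per tuple slot, so the state entering page N+1 depends on how many
    slots page N had."""
    rng = random.Random(seed)
    return [(pageno, tid)
            for pageno, tuples in enumerate(pages)
            for tid in tuples
            if rng.random() < p]

# The "same" table scanned twice within one query; a concurrent
# session inserts one tuple into page 0 between the scans.
before = [[1, 2, 3], [1, 2, 3], [1, 2, 3]]
after  = [[1, 2, 3, 4], [1, 2, 3], [1, 2, 3]]

first  = bernoulli_scan(before, seed=42)
second = bernoulli_scan(after, seed=42)
# The extra draw on page 0 shifts the RNG state, so pages 1 and 2 can
# yield a different tuple set even though their contents are unchanged.
```

Rescanning identical input with the same seed is repeatable; the inconsistency appears only once the draw count on an earlier page changes.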
Also, I found a description of IBM DB2's implementation of this feature, http://www.almaden.ibm.com/cs/people/peterh/idugjbig.pdf and in that they say: Semantically, sampling of a table occurs before any other query processing, such as applying predicates or performing joins. One can envision the original tables referenced in a query being initially replaced by temporary "reduced" tables containing sampled rows, and then normal query processing commencing on the reduced tables. (For performance reasons, actual processing may not occur in exactly this manner.) It follows, for example, that repeated accesses of a sampled table within the same query, such as in a nested-loop join or a correlated subquery, will see the same sample of that table within a single execution of that query. They do mention that alterations to a table may result in *successive* queries giving different results, even when using REPEATABLE. But I do not think it's okay for this to happen within a single query. That would mean that vagaries of plan choice, eg nestloop vs some other join method, would produce inconsistent results. A possible way around this problem is to redefine the sampling rule so that it is not history-dependent but depends only on the tuple TIDs. For instance, one could hash the TID of a candidate tuple, xor that with a hash of the seed being used for the current query, and then select the tuple if (hash/MAXINT) < P. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
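The history-independent rule proposed at the end of that message can be sketched as follows (hypothetical Python; the hash function and seed mixing are illustrative, not a proposed implementation): the verdict for a tuple depends only on the query seed and the tuple's TID, so it is stable across rescans and unaffected by concurrent inserts elsewhere in the table.

```python
import hashlib

def tid_selected(tid, seed, p):
    """Decide inclusion from (seed, TID) alone: hash them together,
    map the hash to a uniform value in [0, 1), and compare against the
    sampling probability -- no per-scan RNG state is involved."""
    block, offset = tid
    h = hashlib.sha256(f"{seed}:{block}:{offset}".encode()).digest()
    u = int.from_bytes(h[:8], "big") / 2.0**64  # uniform in [0, 1)
    return u < p

# The same tuple gets the same verdict on every scan within a query,
# so a nestloop rescan sees exactly the sample a single scan would.
sample = [tid for tid in [(0, 1), (0, 2), (1, 1), (1, 2)]
          if tid_selected(tid, seed=42, p=0.5)]
```

Because each tuple's inclusion is an independent function of its TID, inserting or deleting other tuples cannot change the decision, which is the property the per-page RNG approach lacks.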