Re: [HACKERS] Why so few built-in range types?
On ons, 2011-11-30 at 17:56 -0500, Robert Haas wrote:
> On Wed, Nov 30, 2011 at 3:58 PM, Stephen Frost sfr...@snowman.net wrote:
>> Erm, isn't there a contrib type that already does all that for you..? ip4r or whatever? Just saying, if you're looking for that capability..
> Oh, huh, good to know. Still, I'm not sure why you need to load a separate type to get this... there's no reason why the built-in CIDR type couldn't support it.

A couple of reasons:

- ip4 is fixed-length, so it's much faster. (Obviously, this is living on borrowed time. Who knows.)

- Conversely, it might be considered a feature that ip4 only stores IPv4 addresses.

- ip4 really only stores a single address: not a netmask, not sometimes a netmask, or sometimes a range, or sometimes a network and an address, or whatever. That really seems like the most common use case, and no matter what you do with the other types, some stupid netmask will appear in your output when you least expect it.

- Integrates with ip4r, which has GiST support.

- Some old-school internet gurus worked out why inet and cidr have to behave the way they do, which no one else understands and no one dares to discuss, whereas ip4/ip4r are simple and appear to be built for practical use.

Really, it's all about worse is better.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] bug of recovery?
On 04.10.2011 09:43, Fujii Masao wrote:
> On Mon, Oct 3, 2011 at 4:32 PM, Simon Riggs si...@2ndquadrant.com wrote:
>>> I don't think this should use the rm_safe_restartpoint machinery. As you said, it's not tied to any specific resource manager. And I've actually been thinking that we will get rid of rm_safe_restartpoint altogether in the future. The two things that still use it are the b-tree and gin, and I'd like to change both of those to not require any post-recovery cleanup step to finish multi-page operations, similar to what I did with GiST in 9.1.
>> I thought that was quite neat doing it that way, but there's no specific reason to do it that way I guess. If you're happy to rewrite the patch then I guess we're OK. I certainly would like to get rid of rm_safe_restartpoint in the longer term, hopefully sooner.
> Though Heikki might be already working on that,...

Just haven't gotten around to it. It's a fair amount of work with little user-visible benefit.

> anyway, the attached patch is the version which doesn't use the rm_safe_restartpoint machinery.

Thanks, committed.

-- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] review: CHECK FUNCTION statement
On ons, 2011-11-30 at 10:53 -0500, Tom Lane wrote:
> I think the important point here is that we need to support more than one level of validation, and that the higher levels can't really be applied by default in CREATE FUNCTION because they may fail on perfectly valid code.

How would this work with anything other than PL/pgSQL in practice? Here is an additional use case: There are a bunch of syntax and style checkers for Python: pylint, pyflakes, pep8, pychecker, and maybe more. I would like to have a way to use these for PL/Python. Right now I use a tool I wrote called plpylint (https://github.com/petere/plpylint), which pulls the source code out of the database and runs pylint on the client. That works well enough, but what is being discussed here could lead to a better solution.

So what I'd like to have is some way to say "check all plpythonu functions [in this schema or whatever] using checker pylint", where pylint was previously defined as a checker associated with the plpythonu language that actually invokes some user-defined function.

Also, what kind of report does this generate?
Re: [HACKERS] review: CHECK FUNCTION statement
Tom Lane wrote:
>> Do I understand right that the reason why the check function is different from the validator function is that it would be more difficult to add the checks to the validator function? Is that a good enough argument? From a user's perspective it is difficult to see why some checks are performed at function creation time, while others have to be explicitly checked with CHECK FUNCTION. I think it would be much more intuitive if CHECK FUNCTION does the same as function validation with check_function_bodies on.
> I think the important point here is that we need to support more than one level of validation, and that the higher levels can't really be applied by default in CREATE FUNCTION because they may fail on perfectly valid code.

I understand now. There are three levels of checking:
1) Validation with check_function_bodies = off (checks nothing).
2) Validation with check_function_bodies = on (checks syntax).
3) CHECK FUNCTION (checks RAISE and objects referenced in the function).

As long as 3) implies 2) (which I think it does), that makes sense. I guess I was led astray by the documentation in plhandler.sgml: "Validator functions should typically honor the check_function_bodies parameter: [...] this parameter is turned off by pg_dump so that it can load procedural language functions without worrying about possible dependencies of the function bodies on other database objects." "Dependencies on other database objects" seems more like a description of CHECK FUNCTION. But I guess that this documentation should be changed anyway to describe the check function.

> A bigger issue is that once you think about more than one kind of check, it becomes apparent that we might need some user-specifiable options to control which checks are applied. And I see no provision for that here.
My attempt at a syntax that could also cover Peter's wish for multiple checker functions:

CHECK FUNCTION { func(args) | ALL [IN SCHEMA schema] [FOR ROLE user] }
    [ USING check_function ]
    OPTIONS (optname optarg [, ...])

Yours, Laurenz Albe
Re: GiST for range types (was Re: [HACKERS] Range Types - typo + NULL string constructor)
Rebased with head.

-- With best regards, Alexander Korotkov.

rangetypegist-0.4.patch.gz
Re: [HACKERS] Prep object creation hooks, and related sepgsql updates
I tried to implement the remaining portion of the object creation permission patches using this approach, which temporarily saves contextual information using the existing ProcessUtility hook and ExecutorStart hook.

It seems to work fine for my first problem: the system catalog entry does not have all the information required to make an access control decision. In the case of the pg_database catalog, it does not inform us which database was its source. It also appears to work for my second problem: some internally used code paths invoke the object-access hook with OAT_POST_CREATE, so the extension is unable to decide whether permission checks should be applied or not. In the case of pg_class, heap_create_with_catalog() is invoked by make_new_heap(), not only DefineRelation() and OpenIntoRel(). So, this patch checks which statement eventually calls these routines to decide the necessity of permission checks.

All I did is a simple hack on the ProcessUtility hook and ExecutorStart hook; the post-creation hook then references the saved contextual information, as follows:

sepgsql_utility_command(...)
{
    sepgsql_context_info_t saved_context_info = sepgsql_context_info;

    PG_TRY();
    {
        sepgsql_context_info.cmdtype = nodeTag(parsetree);
        :
        if (next_ProcessUtility_hook)
            (*next_ProcessUtility_hook) (...);
        else
            standard_ProcessUtility(...);
    }
    PG_CATCH();
    {
        sepgsql_context_info = saved_context_info;
        PG_RE_THROW();
    }
    PG_END_TRY();

    sepgsql_context_info = saved_context_info;
}

Then,

sepgsql_relation_post_create(...)
{
    :
    /*
     * Some internally used code paths call heap_create_with_catalog(),
     * which launches this hook even though it does not need a permission
     * check on creation of the relation. So, we skip these cases.
     */
    switch (sepgsql_context_info.cmdtype)
    {
        case T_CreateStmt:
        case T_ViewStmt:
        case T_CreateSeqStmt:
        case T_CompositeTypeStmt:
        case T_CreateForeignTableStmt:
        case T_SelectStmt:
            break;
        default:
            /* internal calls */
            return;
    }
    :
}

At least, it is working.
However, it is not a perfect solution with respect to future updates of code paths in the core. Thanks,

2011/11/29 Kohei KaiGai kai...@kaigai.gr.jp:
> 2011/11/28 Dimitri Fontaine dimi...@2ndquadrant.fr:
>> Kohei KaiGai kai...@kaigai.gr.jp writes:
>>> I found a similar idea that acquires control on ProcessUtility_hook, saves the necessary contextual information in an auto variable, then kicks the original ProcessUtility_hook, and then references the contextual information from object_access_hook.
>> In this case that would be an INSTEAD OF trigger, from which you can call the original command with EXECUTE. You just have to protect yourself against infinite recursion, but that's doable. See attached example.
> Hmm... In my case, it does not need to depend on the command trigger. Let's see the attached patch; that hooks ProcessUtility_hook by sepgsql_utility_command, which then saves contextual information in auto variables. The object_access_hook with OAT_POST_CREATE shall be invoked from createdb(), which is also called by standard_ProcessUtility. In this context, sepgsql_database_post_create can reference a property that is not contained within the pg_database catalog (the name of the source database). At least, it may be able to solve my issues on hooks around object creation time. Thanks,
> -- KaiGai Kohei kai...@kaigai.gr.jp

-- KaiGai Kohei kai...@kaigai.gr.jp

pgsql-v9.2-sepgsql-create-permissions-part-1.database.patch
pgsql-v9.2-sepgsql-create-permissions-part-4.proc.patch
pgsql-v9.2-sepgsql-create-permissions-part-3.relation.patch
pgsql-v9.2-sepgsql-create-permissions-part-2.namespace.patch
Re: [HACKERS] WIP: Join push-down for foreign tables
On 17.11.2011 17:24, Tom Lane wrote:
> Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
>> When the FDW recognizes it's being asked to join a ForeignJoinPath and a ForeignPath, or two ForeignJoinPaths, it throws away the old SQL it constructed to do the two-way join, and builds a new one to join all three tables.
> It should certainly not throw away the old SQL --- that path could still be chosen.

Right, that was loose phrasing from me.

>> That seems tedious, when there are a lot of tables involved. A FDW like the pgsql_fdw that constructs an SQL query doesn't need to consider pairs of joins. It could just as well build the SQL for the three-way join directly. I think the API needs to reflect that.

Tom, what do you think of this part? I think it would be a lot more natural API if the planner could directly ask the FDW to construct a plan for a three (or more)-way join, instead of asking it to join a join relation into another relation. The proposed API is this:

+ FdwPlan *
+ PlanForeignJoin (Oid serverid,
+                  PlannerInfo *root,
+                  RelOptInfo *joinrel,
+                  JoinType jointype,
+                  SpecialJoinInfo *sjinfo,
+                  Path *outer_path,
+                  Path *inner_path,
+                  List *restrict_clauses,
+                  List *pathkeys);

The problem I have with this is that the FDW shouldn't need outer_path and inner_path. All the information it needs is in 'joinrel'. Except for outer joins, I guess; is there a convenient way to get the join types involved in a join rel? It's there in SpecialJoinInfo, but if the FDW is only passed the RelOptInfo representing the three-way join, it's not there.

Does the planner expect the result from the foreign server to be correctly sorted, if it passes pathkeys to that function?

>> I wonder if we should have a heuristic to not even consider doing a join locally, if it can be done remotely.
> I think this is a bad idea. It will require major restructuring of the planner, and sometimes it will fail to find the best plan, in return for not much.
> The nature of join planning is that we investigate a lot of dead ends.

Ok.

-- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] Inlining comparators as a performance optimisation
On 2 December 2011 01:13, Tom Lane t...@sss.pgh.pa.us wrote:
> Regardless of that, I'm still of the opinion that this patch is fundamentally misdesigned. The way it's set up, it is only possible to have a fast path for types that are hard-wired into tuplesort.c, and that flies in the face of twenty years' worth of effort to make Postgres an extensible system.

What the user doesn't know can't hurt them. I'm not of the opinion that an internal optimisation flies in the face of anything - is it really so hard for you to hold your nose for a fairly isolated patch like this?

> I really don't care about performance measurements: this is not an acceptable design, period.

If ever there was a justification for doing something so distasteful, I would think that a 60% decrease in the time taken to perform a raw sort for POD types (plus common covariants) would be it.

> What I want to see is some mechanism that pushes this out to the individual datatypes, so that both core and plug-in datatypes have access to the benefits. There are a number of ways that could be attacked, but the most obvious one is to invent additional, optional support function(s) for btree opclasses. I also still think that we should pursue the idea of a lower-overhead API for comparison functions. It may be that it's worth the code bulk to have an inlined copy of qsort for a small number of high-usage datatypes, but I think there are going to be a lot for which we aren't going to want to pay that price, and yet we could get some benefit from cutting the call overhead.

I'm not opposed to that. It just seems fairly tangential to what I've done here, as well as considerably less important to users, particularly when you look at the numbers I've produced. It would be nice to get a lesser gain for text and things like that too, but other than that I don't see a huge demand.

> A lower-overhead API would also save cycles in usage of the comparison functions from btree itself (remember that?).

Yes, I remember that.
> Why not do something similar there, and get similarly large benefits? I think in particular that the proposed approach to multiple sort keys is bogus and should be replaced by just using lower-overhead comparators. The gain per added code byte in doing it as proposed has got to be too low to be reasonable.

Reasonable by what standard? We may well be better off with lower-overhead comparators for the multiple scanKey case (though I wouldn't like to bet on it), but we would most certainly not be better off without discriminating between the single scanKey and multiple scanKey cases as I have. Look at the spreadsheet results_server.ods. Compare the "not inlined" and "optimized" sheets. It's clear from those numbers that, through a combination of inlining and of simply not having to consider a case that will never come up (multiple scanKeys), the inlining specialisation far outperforms the non-inlining, multiple scanKey specialisation for the same data (I simply commented out inlining specialisations so that non-inlining specialisations would always be called for int4 and friends, even if there was only a single scanKey). That's where the really dramatic improvements are seen. It's well worth having this additional benefit, because it's such a common case and the benefit is so big, and while I will be quite happy to pursue a plan to bring some of these benefits to all types such as the one you describe, I do not want to jettison the additional benefits described to do so. Isn't that reasonable?

We're talking about taking 6778.302ms off a 20468.0ms query, rather than 1860.332ms. Not exactly a subtle difference. Assuming that those figures are representative of the gains to be had by a generic mechanism that does not inline/specialise across the number of scanKeys, are you sure that that's worthwhile?
-- Peter Geoghegan http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training and Services
Re: [HACKERS] Why so few built-in range types?
* Peter Eisentraut (pete...@gmx.net) wrote:
> - ip4 really only stores a single address, not a netmask, not sometimes a netmask, or sometimes a range, or sometimes a network and an address, or whatever. That really seems like the most common use case, and no matter what you do with the other types, some stupid netmask will appear in your output when you least expect it.

This is definitely one of the funny complications with our built-in types. I don't feel that's a feature either. Nor do I consider it 'worse' that we have a type that actually makes sense. :) Regardless of who developed it, it's simply trying to do too much in one type. I'm also not convinced that our built-in types even operate in a completely sensible way when you consider all the interactions you could have between the different 'types' of that 'type', but I'll admit that I haven't got examples or illustrations of that; something better exists and is what I use and encourage others to use. In some ways, I would say this is akin to our built-in types vs. PostGIS. My argument isn't about features or capabilities in either case (though those are valuable too), it's about what's 'right' and makes sense, to me anyway.

Thanks, Stephen
Re: [HACKERS] Why so few built-in range types?
On Fri, Dec 2, 2011 at 3:42 AM, Peter Eisentraut pete...@gmx.net wrote:
> - ip4 is fixed-length, so it's much faster. (Obviously, this is living on borrowed time. Who knows.)

Fair point.

> - Conversely, it might be considered a feature that ip4 only stores IPv4 addresses.

True, although this can also be enforced by application logic or a check constraint quite easily. Of course that is likely not as fast, going to point #1.

> - ip4 really only stores a single address, not a netmask, not sometimes a netmask, or sometimes a range, or sometimes a network and an address, or whatever. That really seems like the most common use case, and no matter what you do with the other types, some stupid netmask will appear in your output when you least expect it.

Yes, this is mildly annoying; but at worst it is a defect of inet, not cidr, which does exactly what I'd expect a cidr type to do.

> - Integrates with ip4r, which has GiST support.

Well, OK, so I want GiST support for cidr. That's where this all started.

> - Some old-school internet gurus worked out why inet and cidr have to behave the way they do, which no one else understands, and no one dares to discuss, whereas ip4/ip4r are simple and appear to be built for practical use. Really, it's all about worse is better.

Heh, OK, well, that's above my pay grade.

-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Inlining comparators as a performance optimisation
On Fri, Dec 2, 2011 at 12:16 AM, Tom Lane t...@sss.pgh.pa.us wrote:
> It should be the same or better. Right now, we are going through FunctionCall2Coll to reach the FCI-style comparator. The shim function would be more or less equivalent to that, and since it's quite special purpose I would hope we could shave a cycle or two. For instance, we could probably afford to set up a dedicated FunctionCallInfo struct associated with the SortSupportInfo struct, and not have to reinitialize one each time.

OK, but I think it's also going to cost you an extra syscache lookup, no? You're going to have to check for this new support function first, and then if you don't find it, you'll have to check for the original one. I don't think there's any higher-level caching around opfamilies to save our bacon here, is there?

-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Why so few built-in range types?
- Quoting Tom Lane (t...@sss.pgh.pa.us), on 02.12.2011 at 05:21 -
> Robert Haas robertmh...@gmail.com writes:
>> On Thu, Dec 1, 2011 at 7:56 PM, Stephen Frost sfr...@snowman.net wrote:
>>> I don't have any particular care about if cidr has indexing support or not. I'm certainly not *against* it, except insofar as it encourages use of a data type that really could probably be better (by being more like ip4r..).
>> Not that you're biased or anything! :-p
> IIRC, a lot of the basic behavior of the inet/cidr types was designed by Paul Vixie (though he's not to blame for their I/O presentation). So I'm inclined to doubt that they're as broken as Stephen claims.
> regards, tom lane

I have looked at the ip4r README file and my use of the extension. According to the README, the main reasons for ip4r to exist are:
1. No index support for built-in datatypes.
2. They are variable-width datatypes, because inet/cidr supports IPv6.
3. Semantic overloading - no random ranges; you can combine an IP address and netmask in the inet datatype.

What I have found in my experience is that the semantics of inet/cidr are what you need in order to model IP networks - interfaces, addresses, routing tables, bgp sessions, LIR databases etc. In this regard the main semantic shortcomings of the ip4r datatype are:
1. It cannot represent address assignments. For example: ip4r('10.0.0.1/24') is invalid. You should represent it with two ip4r fields - ip4r('10.0.0.1') for the address and ip4r('10.0.0.0/24') for the net. Using built-in datatypes it can be represented as inet('10.0.0.1/24').
2. You could have ip4r random ranges that cannot exist in the IP network stack of any device. E.g. you could not configure a route as 10.0.0.2-10.0.0.6.
3. No IPv6 support.

So, from my viewpoint the semantic overloading of the inet type is what you want, because it represents the semantics of IP networks.

Best regards -- Luben Karavelov
Re: [HACKERS] review: CHECK FUNCTION statement
Peter Eisentraut pete...@gmx.net writes:
> On ons, 2011-11-30 at 10:53 -0500, Tom Lane wrote:
>> I think the important point here is that we need to support more than one level of validation, and that the higher levels can't really be applied by default in CREATE FUNCTION because they may fail on perfectly valid code.
> How would this work with anything other than PL/pgSQL in practice?

Well, that's TBD by the individual PL authors, but it hardly seems implausible that there might be lint-like checks applicable in many PLs. As long as we have the functionality pushed out to a PL-specific checker function, the details can be worked out later.

> So what I'd like to have is some way to say check all plpythonu functions [in this schema or whatever] using checker pylint where pylint was previously defined as a checker associated with the plpythonu language that actually invokes some user-defined function.

That sounds like a language-specific option to me.

> Also, what kind of report does this generate?

Good question. I suspect what Pavel has now will raise errors, but that doesn't scale very nicely to checking more than one function, or even to finding more than one bug in a single function. My first instinct is to say that it should work like plain EXPLAIN, ie, deliver a textual report that we send as if it were a query result.

regards, tom lane
Re: [HACKERS] Inlining comparators as a performance optimisation
Robert Haas robertmh...@gmail.com writes:
> OK, but I think it's also going to cost you an extra syscache lookup, no? You're going to have to check for this new support function first, and then if you don't find it, you'll have to check for the original one. I don't think there's any higher-level caching around opfamilies to save our bacon here, is there?

[ shrug... ] If you are bothered by that, get off your duff and provide the function for your datatype. But it's certainly going to be in the noise for btree index usage, and I submit that query parsing/setup involves quite a lot of syscache lookups already. I think that as a performance objection, the above is nonsensical. (And I would also comment that your proposal with a handle is going to involve a table search that's at least as expensive as a syscache lookup.)

regards, tom lane
[HACKERS] pg_upgrade and regclass
In Postgres 8.4, pg_upgrade preserved pg_class relfilenodes by creating files in the file system. In Postgres 9.0, we changed this by creating pg_upgrade_support functions which allow us to directly preserve pg_class.oids. Unfortunately, check.c was not updated to reflect this, and clusters using regclass were prevented from being upgraded by pg_upgrade.

I have developed the attached patch to allow clusters using regclass to be upgraded. I plan to apply it to PG 9.0, 9.1, and HEAD.

-- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +

diff --git a/contrib/pg_upgrade/check.c b/contrib/pg_upgrade/check.c
new file mode 100644
index d32a84c..7185f13
*** a/contrib/pg_upgrade/check.c
--- b/contrib/pg_upgrade/check.c
*** check_for_isn_and_int8_passing_mismatch(
*** 611,621 ****
  /*
   * check_for_reg_data_type_usage()
   *	pg_upgrade only preserves these system values:
! *		pg_class.relfilenode
   *		pg_type.oid
   *		pg_enum.oid
   *
! * Most of the reg* data types reference system catalog info that is
   * not preserved, and hence these data types cannot be used in user
   * tables upgraded by pg_upgrade.
   */
--- 611,621 ----
  /*
   * check_for_reg_data_type_usage()
   *	pg_upgrade only preserves these system values:
! *		pg_class.oid
   *		pg_type.oid
   *		pg_enum.oid
   *
! * Many of the reg* data types reference system catalog info that is
   * not preserved, and hence these data types cannot be used in user
   * tables upgraded by pg_upgrade.
   */
*** check_for_reg_data_type_usage(ClusterInf
*** 653,668 ****
  					NOT a.attisdropped AND
  					a.atttypid IN (
  						'pg_catalog.regproc'::pg_catalog.regtype,
! 						'pg_catalog.regprocedure'::pg_catalog.regtype,
  						'pg_catalog.regoper'::pg_catalog.regtype,
! 						'pg_catalog.regoperator'::pg_catalog.regtype,
! 						'pg_catalog.regclass'::pg_catalog.regtype,
  						/* regtype.oid is preserved, so 'regtype' is OK */
! 						'pg_catalog.regconfig'::pg_catalog.regtype,
! 						'pg_catalog.regdictionary'::pg_catalog.regtype) AND
! 					c.relnamespace = n.oid AND
! 					n.nspname != 'pg_catalog' AND
! 					n.nspname != 'information_schema');

  	ntups = PQntuples(res);
  	i_nspname = PQfnumber(res, nspname);
--- 653,668 ----
  					NOT a.attisdropped AND
  					a.atttypid IN (
  						'pg_catalog.regproc'::pg_catalog.regtype,
! 						'pg_catalog.regprocedure'::pg_catalog.regtype,
  						'pg_catalog.regoper'::pg_catalog.regtype,
! 						'pg_catalog.regoperator'::pg_catalog.regtype,
! 						/* regclass.oid is preserved, so 'regclass' is OK */
  						/* regtype.oid is preserved, so 'regtype' is OK */
! 						'pg_catalog.regconfig'::pg_catalog.regtype,
! 						'pg_catalog.regdictionary'::pg_catalog.regtype) AND
! 					c.relnamespace = n.oid AND
! 					n.nspname != 'pg_catalog' AND
! 					n.nspname != 'information_schema');

  	ntups = PQntuples(res);
  	i_nspname = PQfnumber(res, nspname);
Re: [HACKERS] Java LISTEN/NOTIFY client library work-around
2011/12/1 Kris Jurka bo...@ejurka.com:
> On Wed, 30 Nov 2011, Joel Jacobson wrote:
>> As you know, LISTEN/NOTIFY is broken in the Java client library. You have to do a SELECT 1 in a while-loop to receive the notifications. http://jdbc.postgresql.org/documentation/head/listennotify.html
> This documentation is out of date. Currently you can get notifications without polling the database if you are not using a SSL connection. You still must poll the driver, using PGConnection.getNotifications, but it will return new notifications received without an intermediate database query. This doesn't work over SSL and potentially some other connection types because it uses InputStream.available, which not all implementations support.
> Kris Jurka

I know it works without SSL, but we need SSL for this. If it would be possible to fix it, my company is, as I said, willing to pay for the cost of such a patch.

-- Joel Jacobson Trustly +46703603801 https://trustly.com
Re: [HACKERS] patch for type privileges
On 2011-12-01 22:14, Peter Eisentraut wrote:
> On tor, 2011-12-01 at 14:37 +0100, Yeb Havinga wrote:
>> On 2011-11-29 18:47, Peter Eisentraut wrote:
>>> On tis, 2011-11-29 at 07:07 +0200, Peter Eisentraut wrote:
>>>> On mån, 2011-11-28 at 11:41 +0100, Yeb Havinga wrote:
>>>>> On 2011-11-15 21:50, Peter Eisentraut wrote:
>>>>>> Patch attached.
>>>>> I cannot get the patch to apply, this is the output of patch -p1 --dry-run on HEAD.
>>>> I need to remerge it against concurrent range type activity. New patch attached.
>> I'm looking at your patch. One thing that puzzled me for a while was that I could not restrict access to base types (either built-in or user defined). Is this intentional?
> Works for me:
> =# create user foo;
> =# revoke usage on type int8 from public;
> =# \c - foo
> => create table test1 (a int4, b int8);
> ERROR: permission denied for type bigint

Hmm, even though I have 'revoke all on type int2 from public' in my psql history, I cannot repeat what I think was happening yesterday. Probably I was still superuser in the window I was testing with, but I will never know until time travel is invented. Or maybe I tested with a cast. Using a cast, it is possible to create a table with a code path through OpenIntoRel:

session 1:
t=# revoke all on type int2 from public;

session 2:
t=> create table t2 (a int2);
ERROR: permission denied for type smallint
t=> create table t as (select 1::int2 as a);
SELECT 1
t=> \d t
     Table "public.t"
 Column |   Type   | Modifiers
--------+----------+-----------
 a      | smallint |

Something different: as non-superuser I get this error when restricting a type I don't own:

t=> revoke all on type int2 from public;
ERROR: unrecognized objkind: 6

My current time is limited but I will be able to look more at the patch in a few more days.

regards, Yeb Havinga
Re: [HACKERS] Java LISTEN/NOTIFY client library work-around
On Thu, Dec 1, 2011 at 6:21 AM, Joel Jacobson j...@trustly.com wrote:
> 2011/12/1 Kris Jurka bo...@ejurka.com:
>> On Wed, 30 Nov 2011, Joel Jacobson wrote:
>>> As you know, LISTEN/NOTIFY is broken in the Java client library. You have to do a SELECT 1 in a while-loop to receive the notifications. http://jdbc.postgresql.org/documentation/head/listennotify.html
>> This documentation is out of date. Currently you can get notifications without polling the database if you are not using a SSL connection. You still must poll the driver, using PGConnection.getNotifications, but it will return new notifications received without an intermediate database query. This doesn't work over SSL and potentially some other connection types because it uses InputStream.available that not all implementations support.
> I know it works without SSL, but we need SSL for this. If it would be possible to fix it, my company is as I said willing to pay for the cost of such a patch.

I certainly don't want to discourage you from cleaning up the jdbc ssl situation, but one workaround might be to use stunnel.

merlin
Re: [HACKERS] WIP: Join push-down for foreign tables
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes: Tom, what do you think of this part? I think it would be a lot more natural API if the planner could directly ask the FDW to construct a plan for a three (or more)-way join, instead of asking it to join a join relation into another relation. I think this is fundamentally not going to work, at least not without major and IMO unwise surgery on the planner. Building up joins pairwise is what it does. Furthermore, you seem to be imagining that there is only one best path for any join, which isn't the case. We'll typically have several paths under consideration because of cheapest-startup versus cheapest-total and/or different resulting sort orders. If we do what you're suggesting, that's going to either break entirely or require a much more complicated API for PlanForeignJoin. Does the planner expect the result from the foreign server to be correctly sorted, if it passes pathkeys to that function? Well, the result path should be marked with pathkeys if it is known to be sorted a certain way, or with NIL if not. There's no prejudgment as to what a particular join method will produce. That does raise interesting questions though as to how to interact with the remote-end planner --- if we've reported that a path has certain pathkeys, that probably means that the SQL sent to the remote had better say ORDER BY, which would be kind of annoying if in the end we weren't depending on the path to be sorted. I'm not sure what it would take to pass that information back down, though. What we might have to do to make this work conveniently is generate two versions of every foreign path: one marked with pathkeys, and one not. And make sure the former has a somewhat-higher cost. Then we'd know from which path gets picked whether the plan is actually depending on sorted results. 
regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] review: CHECK FUNCTION statement
Hello Also, what kind of report does this generate? Good question. I suspect what Pavel has now will raise errors, but that doesn't scale very nicely to checking more than one function, or even to finding more than one bug in a single function. I stop on the first error now. The reason is to reuse functionality that can mark the critical point (the bug) of the embedded query in the console. Continuous checking is possible (in plpgsql), but a few critical bugs force a stop. For example, any buggy assignment to a record variable leaves it uninitialized, and any later checks are affected. This can happen when ASSIGN, FOR SELECT, or SELECT INTO statements are used. It is a small part of the possible bugs, but a relatively common pattern. So I didn't worry about it yet. My first instinct is to say that it should work like plain EXPLAIN, ie, deliver a textual report that we send as if it were a query result. It can - but EXPLAIN raises an exception too when there is an error. There should be some possibility to simply identify the result; I am not sure that checking for an empty result is a good way. Raising an exception should be an option. When we return text, then we have to think about some structured result form - line, position, message. An advantage of raising an exception on the first issue is that these questions are solved. So CHECK can return lines like EXPLAIN, but for the moment a GUC - something like check_raises_exception = on|off - could be nice and helpful. The default should be off. And I have to think about some FORMAT option. Is it a good plan? Pavel regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Inlining comparators as a performance optimisation
Tom Lane wrote: Robert Haas robertmh...@gmail.com writes: OK, but I think it's also going to cost you an extra syscache lookup, no? You're going to have to check for this new support function first, and then if you don't find it, you'll have to check for the original one. I don't think there's any higher-level caching around opfamilies to save our bacon here, is there? [ shrug... ] If you are bothered by that, get off your duff and provide the function for your datatype. But it's certainly going to be in the noise for btree index usage, and I submit that query parsing/setup involves quite a lot of syscache lookups already. I think that as a performance objection, the above is nonsensical. (And I would also comment that your proposal with a handle is going to involve a table search that's at least as expensive as a syscache lookup.) Agreed. Doing something once and doing something in the sort loop are two different overheads. I am excited by this major speedup Peter Geoghegan has found. Postgres doesn't have parallel query, which is often used for sorting, so we are already behind some of the databases we are compared against. Getting this speedup is definitely going to help us. And I do like the generic nature of where we are heading! -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Inlining comparators as a performance optimisation
[ shrug... ] If you are bothered by that, get off your duff and provide the function for your datatype. But it's certainly going to be in the noise for btree index usage, and I submit that query parsing/setup involves quite a lot of syscache lookups already. I think that as a performance objection, the above is nonsensical. (And I would also comment that your proposal with a handle is going to involve a table search that's at least as expensive as a syscache lookup.) Agreed. Doing something once and doing something in the sort loop are two different overheads. I am excited by this major speedup Peter Geoghegan has found. Postgres doesn't have parallel query, which is often used for sorting, so we are already behind some of the databases we are compared against. Getting this speedup is definitely going to help us. And I do like the generic nature of where we are heading! Oracle does not (or did not) have parallel sort either, and I have reports that Oracle sorts faster than PostgreSQL (but without any numbers). So any solution is welcome. Regards Pavel -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Inlining comparators as a performance optimisation
On Fri, Dec 2, 2011 at 2:35 PM, Bruce Momjian br...@momjian.us wrote: Agreed. Doing something once and doing something in the sort loop are two different overheads. OK, so I tried to code this up. Adding the new amproc wasn't too difficult (see attached). It wasn't obvious to me how to tie it into the tuplesort infrastructure, though, so instead of wasting time guessing what a sensible approach might be I'm going to use one of my lifelines and poll the audience (or is that ask an expert?). Currently the Tuplesortstate[1] has a pointer to an array of ScanKeyData objects, one per column being sorted. But now instead of FmgrInfo sk_func, the tuplesort code is going to want each scankey to contain SortSupportInfo(Data?) sk_sortsupport[2], because that's where we get the comparison function from. Should I just go ahead and add one more member to that struct, or is there some more appropriate way to handle this? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company [1] Consistent capitalization is for wimps. [2] Hey, we could abbreviate that SSI! Oh, wait... sort-support.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
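To make the mechanism under discussion concrete: the speedup comes from letting a datatype hand the sort code a plain C comparator to call directly, instead of invoking its btree support function through the fmgr trampoline (argument boxing, null handling) on every comparison. The sketch below uses hypothetical stand-in names (SortSupportInfoData, int_cmp) rather than the proposed API, just to show the shape of the idea.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the sort-support struct being discussed:
 * it carries a direct, fmgr-free comparator for the sort key's type. */
typedef struct SortSupportInfoData
{
    int (*comparator) (int a, int b);
} SortSupportInfoData;

static int
int_cmp(int a, int b)
{
    return (a > b) - (a < b);
}

static SortSupportInfoData int_sortsupport = { int_cmp };

/* qsort adapter: the inner loop pays one indirect call per
 * comparison, rather than a full function-manager invocation. */
static int
qsort_adapter(const void *a, const void *b)
{
    return int_sortsupport.comparator(*(const int *) a, *(const int *) b);
}
```

Robert's question then reduces to where this struct should hang off the tuplesort state — per sort column, next to (or instead of) the FmgrInfo in each scankey.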
Re: [HACKERS] review: CHECK FUNCTION statement
Hello My attempt at a syntax that could also cover Peter's wish for multiple checker functions: CHECK FUNCTION { func(args) | ALL [IN SCHEMA schema] [FOR ROLE user] } [ USING check_function ] OPTIONS (optname optarg [, ...]) A check_function is tied to one language, so you would have to specify the language if you want to specify a check_function (and more so if we were to have multiple check functions for one language). Regards Pavel Stehule -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Inlining comparators as a performance optimisation
Robert Haas wrote: On Fri, Dec 2, 2011 at 2:35 PM, Bruce Momjian br...@momjian.us wrote: Agreed. Doing something once and doing something in the sort loop are two different overheads. OK, so I tried to code this up. Adding the new amproc wasn't too difficult (see attached). It wasn't obvious to me how to tie it into the tuplesort infrastructure, though, so instead of wasting time guessing what a sensible approach might be I'm going to use one of my lifelines and poll the audience (or is that ask an expert?). Currently the Tuplesortstate[1] has a pointer to an array of ScanKeyData objects, one per column being sorted. But now instead of FmgrInfo sk_func, the tuplesort code is going to want each scankey to contain SortSupportInfo(Data?) sk_sortsupport[2], because that's where we get the comparison function from. Should I just go ahead and add one more member to that struct, or is there some more appropriate way to handle this? Is this code immediately usable anywhere else in our codebase, and if so, is it generic enough? -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] review: CHECK FUNCTION statement
2011/12/2 Pavel Stehule pavel.steh...@gmail.com: Hello My attempt at a syntax that could also cover Peter's wish for multiple checker functions: CHECK FUNCTION { func(args) | ALL [IN SCHEMA schema] [FOR ROLE user] } [ USING check_function ] OPTIONS (optname optarg [, ...]) Another idea for using CHECK FUNCTION: CHECK FUNCTION func(args) RETURNS ... AS $$ $$ LANGUAGE xxx This would check a function body without affecting the registered function. This is an addition to the previously defined syntax. Have a nice day Pavel -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] unite recovery.conf and postgresql.conf
All: *ping* Trying to restart this discussion, since the feature bogged down on spec. We have consensus that we need to change how replication mode is managed; surely we can reach consensus on how to change it? On 11/8/11 11:39 AM, Josh Berkus wrote: configuration data somewhere else, but we really need to be able to tell the difference between starting PITR, continuing PITR after a mid-recovery crash, and finished PITR, up and running normally. A GUC is not a good way to do that. Does a GUC make sense to you for how to handle standby/master for replication? -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP: Join push-down for foreign tables
On 02.12.2011 18:55, Tom Lane wrote: Heikki Linnakangasheikki.linnakan...@enterprisedb.com writes: Tom, what do you think of this part? I think it would be a lot more natural API if the planner could directly ask the FDW to construct a plan for a three (or more)-way join, instead of asking it to join a join relation into another relation. I think this is fundamentally not going to work, at least not without major and IMO unwise surgery on the planner. Building up joins pairwise is what it does. Furthermore, you seem to be imagining that there is only one best path for any join, which isn't the case. No, I understand that the planner considers many alternatives, even at the same time, because of different output sort orders and startup vs. total cost. I'm imagining that the planner would ask the FDW to construct the two-way joins, and consider joining the results of those locally to the third table, and also ask the FDW to construct the three-way join as whole. And then choose the cheapest alternative. We'll typically have several paths under consideration because of cheapest-startup versus cheapest-total and/or different resulting sort orders. If we do what you're suggesting, that's going to either break entirely or require a much more complicated API for PlanForeignJoin. I don't understand why the FDW should care about the order the joins are constructed in in the planner. From the FDW's point of view, there's no difference between joining ((A B) C) and (A (B C)). Unless you also want to consider doing a remote join between (A B) and C, where C is a foreign table but A and B are local tables. That would in theory be possible to execute in the remote server, by shipping the result of (A B) to the remote server, but we'd also need a quite different executor API to handle that. 
-- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
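Heikki's point that "there's no difference between joining ((A B) C) and (A (B C))" from the FDW's view, versus Tom's point that the planner builds joins strictly pairwise, can be illustrated with a toy dynamic-programming join enumerator: every multi-way join is formed by combining two smaller join relations, so all three association orders get costed anyway. The row counts and the cost model (sum of input costs plus product of input rows) below are invented for illustration; this is a sketch of the idea behind make_join_rel(), not planner code.

```c
#include <assert.h>

#define NREL 3

/* made-up base relation sizes for A, B, C */
static const double base_rows[NREL] = { 100.0, 10.0, 1000.0 };

/* output rows of joining a subset: product of members (no selectivity) */
static double
subset_rows(int s)
{
    double r = 1.0;

    for (int i = 0; i < NREL; i++)
        if (s & (1 << i))
            r *= base_rows[i];
    return r;
}

/* Cheapest cost to join all NREL relations, building strictly pairwise:
 * each subset's best plan combines two disjoint smaller subsets, so
 * ((A B) C), ((A C) B) and (A (B C)) are all considered. */
static double
cheapest_join_cost(void)
{
    int     full = (1 << NREL) - 1;
    double  best[1 << NREL];

    for (int i = 0; i < NREL; i++)
        best[1 << i] = 0.0;             /* treat base scans as free */

    for (int s = 3; s <= full; s++)
    {
        if ((s & (s - 1)) == 0)
            continue;                   /* single relation, already set */
        best[s] = -1.0;
        /* enumerate every split of s into two disjoint join inputs */
        for (int l = (s - 1) & s; l; l = (l - 1) & s)
        {
            int    r = s & ~l;
            double c = best[l] + best[r] + subset_rows(l) * subset_rows(r);

            if (best[s] < 0 || c < best[s])
                best[s] = c;
        }
    }
    return best[full];
}
```

With these numbers the winner is ((A B) C): joining the two small relations first keeps the intermediate result small. An FDW hooked in at the pairwise level would be asked about each of these splits in turn.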
Re: [HACKERS] Command Triggers
Hi, Hm. I just noticed a relatively big hole in the patch: the handling of deletion of dependent objects is currently nonexistent because they don't go through ProcessUtility... I'm currently thinking about the best way to implement that nicely. For other commands the original command string is passed - that obviously won't work for this case... Andres -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] FlexLocks
Kevin Grittner kevin.gritt...@wicourts.gov wrote: Robert Haas robertmh...@gmail.com wrote: Kevin Grittner kevin.gritt...@wicourts.gov wrote: The extraWaits code still looks like black magic to me [explanation of the extraWaits behavior] Thanks. I'll spend some time reviewing this part. There is some rearrangement of related code, and this should arm me with enough of a grasp to review that. I got through that without spotting any significant problems, although I offer the attached micro-optimizations for your consideration. (Applies over the top of your patches.) As far as I'm concerned it looks great and Ready for Committer except for the modularity/pluggability question. Perhaps that could be done as a follow-on patch (if deemed a good idea)? -Kevin

diff --git a/src/backend/storage/lmgr/procarraylock.c b/src/backend/storage/lmgr/procarraylock.c
index 7cd4b6b..13b51cb 100644
--- a/src/backend/storage/lmgr/procarraylock.c
+++ b/src/backend/storage/lmgr/procarraylock.c
@@ -153,9 +153,10 @@ ProcArrayLockClearTransaction(TransactionId latestXid)
 {
 	volatile ProcArrayLockStruct *lock = ProcArrayLockPointer();
 	PGPROC	   *proc = MyProc;
-	int			extraWaits = 0;
 	bool		mustwait;

+	Assert(TransactionIdIsValid(latestXid));
+
 	HOLD_INTERRUPTS();

 	/* Acquire mutex.  Time spent holding mutex should be short! */
@@ -186,8 +187,11 @@ ProcArrayLockClearTransaction(TransactionId latestXid)
 		/* Rats, must wait. */
 		proc->flWaitLink = lock->ending;
 		lock->ending = proc;
-		if (!TransactionIdIsValid(lock->latest_ending_xid) ||
-			TransactionIdPrecedes(lock->latest_ending_xid, latestXid))
+		/*
+		 * lock->latest_ending_xid may be invalid, but invalid transaction
+		 * IDs always precede valid ones.
+		 */
+		if (TransactionIdPrecedes(lock->latest_ending_xid, latestXid))
 			lock->latest_ending_xid = latestXid;
 		mustwait = true;
 	}
@@ -202,7 +206,9 @@ ProcArrayLockClearTransaction(TransactionId latestXid)
 	 */
 	if (mustwait)
 	{
-		extraWaits += FlexLockWait(ProcArrayLock, 2);
+		int			extraWaits;
+
+		extraWaits = FlexLockWait(ProcArrayLock, 2);
 		while (extraWaits-- > 0)
 			PGSemaphoreUnlock(&proc->sem);
 	}
@@ -247,7 +253,7 @@ ProcArrayLockRelease(void)
 	ending = lock->ending;
 	vproc = ending;

-	while (vproc != NULL)
+	do
 	{
 		volatile PGXACT *pgxact = &ProcGlobal->allPgXact[vproc->pgprocno];

@@ -258,7 +264,7 @@ ProcArrayLockRelease(void)
 		pgxact->nxids = 0;
 		pgxact->overflowed = false;
 		vproc = vproc->flWaitLink;
-	}
+	} while (vproc != NULL);

 	/* Also advance global latestCompletedXid */
 	if (TransactionIdPrecedes(ShmemVariableCache->latestCompletedXid,

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PL/Python SQL error code pass-through
On 24.11.2011 23:56, Jan Urbański wrote: On 24/11/11 16:15, Heikki Linnakangas wrote: On 24.11.2011 10:07, Jan Urbański wrote: On 23/11/11 17:24, Mika Eloranta wrote: [PL/Python in 9.1 does not preserve SQLSTATE of errors] Oops, you're right, it's a regression from 9.0 behaviour. The fix looks good to me, I changed one place to indent with tabs instead of spaces and added a regression test. (Forgot to mention earlier: I committed the patch to master and REL9_1_STABLE) In case of SPI errors we're preserving the following from the original ErrorData: * sqlerrcode (as of Mika's patch) * detail * hint * query * internalpos that leaves us with the following which are not preserved: * message * context * detail_log The message is being constructed from the Python exception name and I think that's useful. The context is being taken by the traceback string. I'm not sure if detail_log is ever set in these types of errors, probably not? So I guess we're safe. Ok. The problem with storing the entire ErrorData struct is that this information has to be transformed to Python objects, because we attach it to the Python exception that gets raised and in case it bubbles all the way up to the topmost PL/Python function, we recover these Python objects and use them to construct the ereport call. While the exception is inside Python, user code can interact with it, so it'd be hard to have C pointers to non-Python stuff there. Hmm, can the user also change the fields in the exception within python code, or are they read-only? Is the spidata attribute in the exception object visible to user code? -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Patch - Debug builds without optimization
Tom Lane wrote: Bruce Momjian br...@momjian.us writes: I have applied the attached patch to mention the debugger. OK?

      Server developers should consider using the configure options
      <option>--enable-cassert/</> and <option>--enable-debug/</> to enhance the
      ability to detect and debug server errors.  They should also consider
!     running configure with <literal>CFLAGS=-O0 -g</> if using a debugger.

I still think this is basically useless. If we're going to mention the topic at all, we should provide enough information to be helpful, which this does not. Furthermore, it's concretely wrong in that it suggests you need to say -g when --enable-debug already does that, and that it fails to note that all this advice is gcc-specific. I suggest wording along these lines: When developing code inside the server, it's recommended to use the configure options --enable-cassert, which turns on many run-time error checks, and --enable-debug, which improves the usefulness of debugging tools. If you use gcc, it's best to build with an optimization level of at least -O1, because using level -O0 disables some important compiler warnings (such as use of an uninitialized variable). However, nonzero optimization levels can complicate debugging because stepping through the compiled code will usually not match up one-to-one with source code lines. If you get confused while trying to debug optimized code, recompile the specific file(s) of interest with -O0. An easy way to do this with the Unix makefiles is make PROFILE=-O0 file.o. OK, I made some slight modifications and applied the attached patch. Ideally we could tell everyone to read the developer's FAQ, but that is too large for people who are debugging problems in our shipped code --- that is why I was excited to get something into our main docs. -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +

diff --git a/doc/src/sgml/installation.sgml b/doc/src/sgml/installation.sgml
new file mode 100644
index 1135961..75fb783
*** a/doc/src/sgml/installation.sgml
--- b/doc/src/sgml/installation.sgml
*** su - postgres
*** 1415,1424 ****
     <note>
      <para>
!      Server developers should consider using the configure options
!      <option>--enable-cassert</> and <option>--enable-debug</> to enhance the
!      ability to detect and debug server errors.  Your debugger might
!      also require specific compiler flags to produce useful output.
      </para>
     </note>
    </step>
--- 1415,1437 ----
     <note>
      <para>
!      When developing code inside the server, it is recommended to
!      use the configure options <option>--enable-cassert</> (which
!      turns on many run-time error checks) and <option>--enable-debug</>
!      (which improves the usefulness of debugging tools).
!     </para>
!
!     <para>
!      If using GCC, it is best to build with an optimization level of
!      at least <option>-O1</>, because using no optimization
!      (<option>-O0</>) disables some important compiler warnings (such
!      as the use of uninitialized variables).  However, non-zero
!      optimization levels can complicate debugging because stepping
!      through compiled code will usually not match up one-to-one with
!      source code lines.  If you get confused while trying to debug
!      optimized code, recompile the specific files of interest with
!      <option>-O0</>.  An easy way to do this is by passing an option
!      to <application>make</>: <command>gmake PROFILE=-O0 file.o</>.
      </para>
     </note>
    </step>

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Command Triggers
Hi all, There is also the question of how permission checks on the actual commands are handled (as opposed to checks on modifying command triggers): BEFORE and INSTEAD triggers will currently be called regardless of whether the user is actually allowed to perform said action (which is inconsistent with data triggers) and independent of whether the object they concern exists. I wonder if anybody considers that a problem? Andres -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] clog buffers (and FlexLocks)
Heikki tipped me off to a possible problem with CLOG contention that he noticed while doing some benchmarking, and so I (you know what's coming, right?) tried to reproduce it on Nate Boley's 32-core AMD machine. It turns out that increasing NUM_CLOG_BUFFERS from 8 to 32 delivers a significant performance boost on this server at higher concurrency levels. Although there is some possible effect even at 8 clients, it's much more pronounced at 16+ clients. I believe the root of the problem is the case where SimpleLruReadPage_ReadOnly fails to find the necessary page in a buffer and therefore needs to drop the shared lock, acquire the exclusive lock, and pull the page in. More testing is probably necessary, but ISTM that if you have many more CPUs than CLOG buffers, you could end up in a situation where the number of people waiting for a CLOG buffer is greater than the number of buffers. On these higher velocity tests, where you have 10k tps, you burn through a CLOG page every few seconds, so it's easy to imagine that when you go to examine a row updated earlier in the test, the relevant CLOG page is no longer resident. Another point here is that the flexlock patch on latest sources seems to be *reducing* performance on permanent tables and increasing it on unlogged tables, which seems quite bizarre. I'll see if I can look into what's going on there. Obligatory benchmark details: m = master, c = master w/NUM_CLOG_BUFFERS = 32, f = FlexLocks, b = FlexLocks w/NUM_CLOG_BUFFERS = 32; number following the letter is the client count. scale factor 100, median of three five-minute pgbench runs, shared_buffers = 8GB, maintenance_work_mem = 1GB, synchronous_commit = off, checkpoint_segments = 300, checkpoint_timeout = 15min, checkpoint_completion_target = 0.9, wal_writer_delay = 20ms.
Permanent Tables

m01 tps = 629.407759 (including connections establishing)
c01 tps = 624.163365 (including connections establishing)
f01 tps = 588.819568 (including connections establishing)
b01 tps = 622.116258 (including connections establishing)
m08 tps = 4199.008538 (including connections establishing)
c08 tps = 4325.508396 (including connections establishing)
f08 tps = 4154.397798 (including connections establishing)
b08 tps = 4442.209823 (including connections establishing)
m16 tps = 8011.430159 (including connections establishing)
c16 tps = 8424.109412 (including connections establishing)
f16 tps = 7783.139603 (including connections establishing)
b16 tps = 8524.645511 (including connections establishing)
m32 tps = 10025.079556 (including connections establishing)
c32 tps = 11247.358762 (including connections establishing)
f32 tps = 10139.320355 (including connections establishing)
b32 tps = 10942.857215 (including connections establishing)
m80 tps = 11072.246423 (including connections establishing)
c80 tps = 11845.972123 (including connections establishing)
f80 tps = 10525.232951 (including connections establishing)
b80 tps = 11757.289333 (including connections establishing)

Unlogged Tables

m01 tps = 669.939124 (including connections establishing)
c01 tps = 664.763349 (including connections establishing)
f01 tps = 671.539140 (including connections establishing)
b01 tps = 665.448818 (including connections establishing)
m08 tps = 4557.328825 (including connections establishing)
c08 tps = 4590.319949 (including connections establishing)
f08 tps = 4389.771668 (including connections establishing)
b08 tps = 4618.345372 (including connections establishing)
m16 tps = 8474.117765 (including connections establishing)
c16 tps = 9059.450494 (including connections establishing)
f16 tps = 8338.446241 (including connections establishing)
b16 tps = 9114.274464 (including connections establishing)
m32 tps = 16999.301828 (including connections establishing)
c32 tps = 19653.023856 (including connections establishing)
f32 tps = 19431.228726 (including connections establishing)
b32 tps = 24696.075282 (including connections establishing)
m80 tps = 14048.282651 (including connections establishing)

-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
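The thrash scenario described above — a working set of hot CLOG pages slightly larger than the buffer pool — can be illustrated with a toy LRU simulation: under cyclic reuse of 16 pages, an 8-buffer LRU misses on every single access, while a 32-buffer pool misses only on the 16 cold loads. This is only a sketch of the effect, not SLRU code; the access pattern and sizes are made up.

```c
#include <assert.h>
#include <string.h>

#define MAXBUF 64

/* Count misses for an LRU buffer pool of nbuf slots over a reference
 * string of page numbers.  buf[0] is most recently used. */
static int
lru_misses(int nbuf, const int *refs, int nrefs)
{
    int buf[MAXBUF];
    int n = 0, misses = 0;

    for (int i = 0; i < nrefs; i++)
    {
        int page = refs[i], pos = -1;

        for (int j = 0; j < n; j++)
            if (buf[j] == page) { pos = j; break; }
        if (pos < 0)
        {
            misses++;
            if (n < nbuf)
                n++;
            pos = n - 1;            /* last slot; eviction if pool full */
        }
        /* move-to-front: shift entries ahead of 'pos' down one slot */
        memmove(&buf[1], &buf[0], pos * sizeof(int));
        buf[0] = page;
    }
    return misses;
}
```

With a cyclic reference string over 16 pages, by the time the pattern wraps around, an 8-slot LRU has already evicted the page about to be requested — the classic sequential-flooding worst case, and a plausible analogue of "more waiters than CLOG buffers."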
Re: [HACKERS] WIP: Join push-down for foreign tables
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes: On 02.12.2011 18:55, Tom Lane wrote: Furthermore, you seem to be imagining that there is only one best path for any join, which isn't the case. No, I understand that the planner considers many alternatives, even at the same time, because of different output sort orders and startup vs. total cost. I'm imagining that the planner would ask the FDW to construct the two-way joins, and consider joining the results of those locally to the third table, and also ask the FDW to construct the three-way join as whole. And then choose the cheapest alternative. It probably makes sense to turn control over to the FDW just once to consider all possible foreign join types for a given join pair, ie we don't want to ask it separately about nestloop, hash, merge joins. But then we had better be able to let it generate multiple paths within the one call, and dump them all to add_path. You're still assuming that there is one unique best path for any join, and *that is not the case*, or at least we don't know which will be the best at the time we're generating join paths. We don't know whether fast-start is better than cheapest-total, nor which sort order might be the best, until we get up to the highest join level. So rather than returning a Path struct, it would have to just be given the joinrel struct and be expected to do add_path call(s) for itself. I don't understand why the FDW should care about the order the joins are constructed in in the planner. From the FDW's point of view, there's no difference between joining ((A B) C) and (A (B C)). Maybe there is, maybe there isn't. You're assuming too much about how the FDW does its join planning, I think --- in particular, FDWs that are backed by less than a Postgres-equivalent remote planner might well appreciate being walked through all the feasible join pairs. 
If we do it as I suggest above, the FDW could remember that it had already planned this joinrel and just drop out immediately if called again, if it's going to do it the way you're thinking. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP: Join push-down for foreign tables
On 03.12.2011 00:24, Tom Lane wrote: Heikki Linnakangasheikki.linnakan...@enterprisedb.com writes: On 02.12.2011 18:55, Tom Lane wrote: Furthermore, you seem to be imagining that there is only one best path for any join, which isn't the case. No, I understand that the planner considers many alternatives, even at the same time, because of different output sort orders and startup vs. total cost. I'm imagining that the planner would ask the FDW to construct the two-way joins, and consider joining the results of those locally to the third table, and also ask the FDW to construct the three-way join as whole. And then choose the cheapest alternative. It probably makes sense to turn control over to the FDW just once to consider all possible foreign join types for a given join pair, ie we don't want to ask it separately about nestloop, hash, merge joins. But then we had better be able to let it generate multiple paths within the one call, and dump them all to add_path. You're still assuming that there is one unique best path for any join, and *that is not the case*, or at least we don't know which will be the best at the time we're generating join paths. We don't know whether fast-start is better than cheapest-total, nor which sort order might be the best, until we get up to the highest join level. Hmm, so you're saying that the FDW function needs to be able to return multiple paths for a single joinrel. Fair enough, and that's not specific to remote joins. Even a single-table foreign scan could be implemented differently depending on whether you prefer fast-start or cheapest total. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP: Join push-down for foreign tables
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes: Hmm, so you're saying that the FDW function needs to be able to return multiple paths for a single joinrel. Fair enough, and that's not specific to remote joins. Even a single-table foreign scan could be implemented differently depending on whether you prefer fast-start or cheapest total. ... or ordered vs unordered, etc. Yeah, good point, we already got this wrong with the PlanForeignScan API. Good thing we didn't promise that would be stable. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
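The "multiple paths per rel" point Tom keeps making can be sketched as a domination test: a candidate path is kept unless some already-known path for the same relation is at least as good on every dimension the planner cares about. The struct and names below are invented for illustration (a single "sorted" flag stands in for pathkey lists); the real add_path() compares pathkeys, costs, and more.

```c
#include <assert.h>

/* Toy path: startup cost, total cost, and whether the output order
 * matches the query's desired order (stand-in for pathkeys). */
typedef struct ToyPath
{
    double startup_cost;
    double total_cost;
    int    sorted;
} ToyPath;

/* a dominates b if a is no worse on every dimension; an unsorted
 * path can never dominate a sorted one. */
static int
dominates(const ToyPath *a, const ToyPath *b)
{
    return a->startup_cost <= b->startup_cost &&
           a->total_cost <= b->total_cost &&
           a->sorted >= b->sorted;
}

/* Would we keep 'cand' given 'n' existing paths for the same rel? */
static int
keep_path(const ToyPath *cand, const ToyPath *existing, int n)
{
    for (int i = 0; i < n; i++)
        if (dominates(&existing[i], cand))
            return 0;
    return 1;
}
```

This is why a pre-sorted but more expensive foreign path and a cheaper unsorted one can both survive to the top join level — neither dominates the other — and why an FDW returning a single "best" path per joinrel would throw away alternatives the planner still needs.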
Re: [HACKERS] Command Triggers
Hi, First thing first: thank you Andres for a great review, I do appreciate it. Please find attached a newer version of the patch. The github repository is also updated.

Andres Freund and...@anarazel.de writes: I think at least a * command_tag text

Added.

Why is there explicit documentation of not checking the arguments of said trigger function? That seems to be a bit strange to me.

The code is searching for a function with the given name and 5 text arguments, whatever you say in the command. That could be changed later on if there's a need.

schemaname currently is mightily unusable because whether it is sent or not depends on whether the table was created with a schemaname specified or not. That doesn't seem to be a good approach.

I'm not sure about that, we're spitting out what the user entered.

Imo the output should fully schema qualify anything including operators. Yes, that's rather ugly but anything else seems to be too likely to lead to problems.

Will look into qualifying names. As an alternative it would be possible to pass the current search path but that doesn't seem to be the best approach to me; the trigger runs with the same search_path settings as the command, right?

Then there is nice stuff like CREATE TABLE (LIKE ...); What shall that return in future? I vote for showing it as the plain CREATE TABLE where all columns are specified.

I don't think so. Think about the main use cases for this patch, auditing and replication. In both cases you want to deal with what the user said, not what the system understood.

I think it would be sensible to allow multiple actions on which to trigger to be specified just as you can for normal triggers. I also would like an ALL option.

Both are now implemented. When dealing with an ANY command trigger, your procedure can get fired on a command for which we don't have internal support for back parsing etc., so only the command tag is not null.
I think that's the way to go here, as I don't want to think we need to implement full support for command triggers on ALTER OPERATOR FAMILY from the get go.

Currently the grammar allows options to be passed to command triggers. Do we want to keep that? If yes, with the same arcane set of datatypes as in data triggers? If yes it needs to be wired up.

In fact it's not the case, you always get called with the same 5 arguments, all nullable now that we have ANY COMMAND support.

* schema qualification: An option to add triggers only to a specific schema would be rather neat for many scenarios. I am not totally sure if it's worth the hassle of defining what happens in the edge cases of e.g. moving from one into another schema though.

I had written down support for a WHEN clause in command triggers, but the exact design has yet to be worked out. I'd like to have this facility but I'd like it even more if that could happen in a later patch.

Currently there is no locking strategy at all. Which e.g. means that there is no protection against two sessions changing stuff concurrently or such. Was that just left out - which is ok - or did you miss that?

I left out the locking as of now. I tried to fix some of it in the new patch though, but I will need to spend more time on that down the road.

Command triggers should only be allowed for the database owner.

Yes, that needs to happen soon, along with pg_dump and psql support.

In contrast to data triggers, command triggers cannot change what is done. They can either prevent it from running or not. Which imo is fine. The question I have is whether it would be a good idea to make it easy to prevent recursion. Especially if the triggers won't be per schema.

You already have ALTER TRIGGER foo ON COMMAND create table DISABLE; that you can use from the trigger procedure itself. Of course, locking is an issue if you want to go parallel on running commands with recursive triggers attached.
I think we might be able to skip solving that problem in the first run.

Imo the current callsite in ProcessUtility isn't that good. I think it would make more sense moving it into standard_ProcessUtility. It should be *after* the check_xact_readonly check in there in my opinion.

Makes sense, will fix in the next revision.

I also don't think it's a good idea to run the ExecBeforeOrInsteadOfCommandTriggers stuff there, because that's also the path that COMMIT/BEGIN and other stuff take. Those are pretty fast paths. I suggest making two switches in standard_ProcessUtility, one for the non-command trigger stuff which returns on success and one after that. Only after the first triggers are checked.

Agreed.

Also, you're repeatedly transforming the parse tree into a string. It would be better doing that only once instead of for every trigger...

It was only done once for before and instead of triggers (you can't have both on the same command), and another time for any and all after triggers, meaning at most twice.
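The "deparse at most once" point is an ordinary hoisting optimization: compute the command string lazily the first time any trigger needs it, then reuse it for every remaining trigger. A toy sketch of the idea (Python; the function names are invented, not the patch's actual code):

```python
def fire_command_triggers(parsetree, triggers, deparse):
    """Call each trigger with the command string, deparsing at most once."""
    command_string = None  # computed lazily, shared by all triggers
    for trig in triggers:
        if command_string is None:
            command_string = deparse(parsetree)  # the expensive step
        trig(command_string)

# Demonstrate that the expensive deparse runs once for three triggers.
calls = []
def fake_deparse(tree):
    calls.append(tree)
    return f"CREATE TABLE {tree}"

fired = []
fire_command_triggers("foo", [fired.append, fired.append, fired.append],
                      fake_deparse)
print(len(calls), fired)
```

With no triggers attached, the deparse step is skipped entirely, which matches the concern about keeping the COMMIT/BEGIN fast path cheap.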
Re: [HACKERS] backup_label during crash recovery: do we know how to solve it?
On Thu, Dec 1, 2011 at 3:47 PM, Robert Haas robertmh...@gmail.com wrote: On Tue, Nov 29, 2011 at 9:10 PM, Daniel Farina dan...@heroku.com wrote: Reviving a thread that has hit its second birthday: http://archives.postgresql.org/pgsql-hackers/2009-11/msg00024.php

In our case not being able to restart Postgres when it has been taken down in the middle of a base backup is starting to manifest as a serious source of downtime: basically, any backend crash or machine restart will cause postgres not to start without human intervention.

The message delivered is sufficiently scary and indirect enough (because of the plausible scenarios that could cause corruption if postgres were to make a decision automatically in the most general case) that it's not all that attractive to train a general operator rotation to assess what to do, as it involves reading and then, effectively, ignoring some awfully scary error messages and removing the backup label file. Even if the error messages weren't scary (itself a problem if one comes to the wrong conclusion as a result), the time spent digging around under short notice to confirm what's going on is a high pole in the tent for improving uptime for us, taking an extra five to ten minutes per common encounter.

Our problem is compounded by having a lot of databases that take base backups at attenuated rates in an unattended way, and therefore a human who may have been woken up from a sound sleep will have to figure out what was going on before they've reached consciousness, rather than a person with prior knowledge of having started a backup. Also, fairly unremarkable databases can take so long to back up that they may well have a greater than 20% chance of encountering this problem at any particular time: 20% of a day is less than 5 hours per day taken to do on-line backups.

Basically, we -- and anyone else with unattended physical backup schemes -- are punished rather severely by the current design.
This issue has some more recent related incarnations, even if for different reasons: http://archives.postgresql.org/pgsql-hackers/2011-01/msg00764.php Because backup_label "coming or going?" confusion in Postgres can have serious consequences, I wanted to post to the list first to solicit a minimal design to solve this problem. If it's fairly small in its mechanics then it may yet be feasible for the January CF.

In some ways, I feel like this problem is unsolvable by definition. If a backup is designed to be an exact copy of the data directory taken between pg_start_backup() and pg_stop_backup(), then by definition you can't distinguish between the original and the copy. That's what a copy *is*. Now, we could fix this by requiring an additional step when creating a backup. For example, we could create backup_label.not_really on the master and require the person taking the backup to rename it to backup_label on the slave before starting the postmaster. This could be an optional behavior, to preserve backward compatibility. Now the slave *isn't* an exact copy of the master, so PostgreSQL can distinguish.

I actually think such a protocol should be chosen. As is, I cannot say "yeah, restarting postgres is always designed to work in the presence of backups." Prior suggestions -- I think rejected -- were to use the recovery.conf as such a sentinel file, suggesting "I am restoring, not being backed up."

But it seems that this could also be worked around outside the database. We don't have built-in clusterware, so there must be something in the external environment that knows which server is supposed to be the master and which is supposed to be the standby. So, if you're on the master, remove the backup label file before starting the postmaster. If you're on the standby, don't.

Fundamentally this is true, but taking a backup should not make database restart a non-automatic process.
By some definition one could adapt their processes to remove backup_label at all these times, but I think this should be codified; I cannot think of any convincing reason to have that much freedom (or homework, depending how you look at it) to write one's own protocol for this from scratch. From an arm's length view, a database that cannot do a clean or non-clean restart at any time regardless of the existence of a concurrent on-line backup has a clear defect.

Here's a protocol: have pg_start_backup() write a file that just means "backing up". Restarts are OK, because that's all it means; it has no meaning to a recovery/restoration process. When one wishes to restore, one must touch a file -- not unlike the trigger file in recovery.conf (some have argued in the past this *should* be recovery.conf, except perhaps for its tendency to be moved to recovery.done) -- to have that behavior occur. How does that sound? All fundamentally possible right now, but the cause of slivers in my and other people's sides for years. -- fdr
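The protocol above separates two pieces of state: a marker that only means "a backup is in progress" (which never blocks a restart) and an explicit restore trigger the operator must create on a copied-out backup. A toy model of that state machine (Python; the file names and functions are illustrative assumptions, not what PostgreSQL actually uses):

```python
import os
import tempfile

BACKUP_MARKER = "backup_in_progress"    # written by pg_start_backup()
RESTORE_MARKER = "restore_from_backup"  # touched by the restore operator

def start_backup(datadir):
    """Mark that an on-line backup is running; nothing more."""
    open(os.path.join(datadir, BACKUP_MARKER), "w").close()

def stop_backup(datadir):
    os.remove(os.path.join(datadir, BACKUP_MARKER))

def startup_mode(datadir):
    """Decide what startup means; restarting mid-backup is always safe."""
    if os.path.exists(os.path.join(datadir, RESTORE_MARKER)):
        return "restore"   # the operator explicitly asked for restoration
    return "normal"        # the backup marker alone never blocks a restart

datadir = tempfile.mkdtemp()
start_backup(datadir)
# Crash and restart while the backup is still running: no human needed.
assert startup_mode(datadir) == "normal"
# On a copied-out backup, the operator touches the restore marker:
open(os.path.join(datadir, RESTORE_MARKER), "w").close()
print(startup_mode(datadir))
```

The key property is that the master's restart path never consults the backup marker, so an unattended backup can no longer leave the database unable to start.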
Re: [HACKERS] backup_label during crash recovery: do we know how to solve it?
On Fri, Dec 2, 2011 at 6:25 PM, Daniel Farina dan...@heroku.com wrote: Here's a protocol: have pg_start_backup() write a file that just means backing up. Restarts are OK, because that's all it means, it has no meaning to a recovery/restoration process. When one wishes to restore, one must touch a file -- not unlike the trigger file in recovery.conf (some have argued in the past this *should* be recovery.conf, except perhaps for its tendency to be moved to recovery.done) to have that behavior occur. It certainly doesn't seem to me that you need TWO files. If you create a file on the master, then you just need to remove it from the backup. But I think the use of such a new protocol should be optional; it's easy to provide backward-compatibility here. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Command Triggers
Excerpts from Andres Freund's message of Fri Dec 02 19:09:47 -0300 2011: Hi all, There is also the point about how permission checks on the actual commands (in comparison to modifying command triggers) and such are handled: BEFORE and INSTEAD will currently be called independently of the fact whether the user is actually allowed to do said action (which is inconsistent with data triggers) and independent of whether the object they concern exists. I wonder if anybody considers that a problem? Hmm, we currently even have a patch (or is it already committed?) to avoid locking objects before we know the user has permission on the object. Getting to the point of calling the trigger would surely be even worse. -- Álvaro Herrera alvhe...@commandprompt.com The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [HACKERS] Command Triggers
On Fri, Dec 2, 2011 at 7:09 PM, Alvaro Herrera alvhe...@commandprompt.com wrote: Hmm, we currently even have a patch (or is it already committed?) to avoid locking objects before we know the user has permission on the object. Getting to the point of calling the trigger would surely be even worse. I committed a first round of cleanup in that area, but unfortunately there is a lot more to be done. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Command Triggers
On Saturday, December 03, 2011 01:09:48 AM Alvaro Herrera wrote: Excerpts from Andres Freund's message of Fri Dec 02 19:09:47 -0300 2011: Hi all, There is also the point about how permission checks on the actual commands (in comparison to modifying command triggers) and such are handled: BEFORE and INSTEAD will currently be called independently of the fact whether the user is actually allowed to do said action (which is inconsistent with data triggers) and independent of whether the object they concern exists. I wonder if anybody considers that a problem? Hmm, we currently even have a patch (or is it already committed?) to avoid locking objects before we know the user has permission on the object. Getting to the point of calling the trigger would surely be even worse. Well, calling the trigger won't allow them to lock the object. It doesn't even confirm the existence of the table. Andres
Re: [HACKERS] Prep object creation hooks, and related sepgsql updates
On Fri, Dec 2, 2011 at 6:52 AM, Kohei KaiGai kai...@kaigai.gr.jp wrote: At least, it is working. However, it is not a perfect solution to the future updates of code paths in the core. Hmm. So, do you want this committed? If so, I think the major thing it lacks is documentation. I can't help noticing that this amounts, altogether, to less than 600 lines of code. I am not sure what your hesitation in taking this approach is. Certainly, there are things not to like in here, but I've seen a lot worse, and you can always refine it later. For a first cut, why not? Even if you had the absolutely perfect hooks in core, how much would it save compared to what's here now? How different would your ideal implementation be from what you've done here? As regards future updates of code paths in core, nothing in here looks terribly likely to get broken; or at least if it does then I think quite a lot of other things will get broken, too. Anything we do has some maintenance burden, and this doesn't look particularly bad to me. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company