Re: [HACKERS] lateral function as a subquery - WIP patch
Robert Haas robertmh...@gmail.com writes:
> On Fri, Mar 9, 2012 at 8:15 PM, Tom Lane t...@sss.pgh.pa.us wrote:
>> Um ... how do you get the subquery result rows to join to only the
>> correct rows of the other tables? This looks like an unconstrained
>> join to me, which is not what I believe the SQL spec for LATERAL to
>> be, and it doesn't seem especially useful either. (If a subquery
>> could do what people wanted, we'd not be hearing all the requests
>> for LATERAL.)
> I think LATERAL is intended as more or less an unconstrained nested
> loop with the lateral expression on the inner side, parameterized by
> value from the outer side. Typically it's a SRF.

Um ... if it's parameterized by values from a current row of the outer
side, then it's not an unconstrained join. That would be like doing an
inner indexscan join and producing a cross-join result.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [v9.2] Add GUC sepgsql.client_label
On 2012-03-09 21:49, Robert Haas wrote:
> On Tue, Mar 6, 2012 at 9:14 AM, Kohei KaiGai kai...@kaigai.gr.jp wrote:
>> [ new patch ]
> Are we absolutely certain that we want the semantics of
> sepgsql_setcon() to be transactional? Because if we made them
> non-transactional, this would be a whole lot simpler, and it would
> still meet the originally proposed use case, which is to allow the
> security context of a connection to be changed on a one-time basis
> before handing it off to a client application.

It would meet the original use case, but outside of that use case it
would be very easy to get POLA violations. Imagine a transaction like

1 - do stuff under label A
2 - setcon to B
3 - do stuff under label B

When that transaction fails due to a serialization error, one would
expect that when the transaction is replayed, the initial actions are
executed under label A. If it was B, or any other later label from the
original transaction, it would be very hard to develop software in user
space that could cope with this behaviour.

> As a separate but related note, the label management here seems to be
> excessively complicated. In particular, it seems to me that the
> semantics of sepgsql_get_client_label() become quite muddled under
> this patch. An explicitly-set label overrides the default label, but
> a trusted procedure's temporary label overrides that. So if you enter
> a trusted procedure, and it calls sepgsql_setcon(), it'll change the
> label, but no actual security transition will occur; then, when you
> exit the trusted procedure, you'll get the new label in place of
> whatever the caller had before. That seems like one heck of a
> surprising set of semantics.

I agree that sepgsql_get_*client*_label does not really match with a
trusted procedure temporarily changing it.

> It seems to me that it would make more sense to view the set of
> security labels in effect as a stack. When we enter a trusted
> procedure, it pushes a new label on the stack; when we exit a trusted
> procedure, it pops the top label off the stack. sepgsql_setcon()
> changes the top label on the stack. If we want to retain
> transactional semantics, then you can also have a toplevel label that
> doesn't participate in the stack; a commit copies the sole item on
> the stack into the toplevel label, and transaction start copies the
> toplevel label into an empty stack.

Yes, the additions by sepgsql_setcon look like a stack for pushing.
However, the current code that rolls back the pending list to a certain
savepoint matches code from e.g. StandbyReleaseLocks(), so having a
concept like this as a list is not unknown to the current codebase.

> In the current coding, I think client_label_peer is redundant with
> client_label_committed - once the latter is set, the former is unused,

IIUC, client_label_peer is used on reset of the client label, i.e.
calling sepgsql_setcon with NULL.

> - and what I'm proposing is that client_label_func shouldn't be
> separate, but rather should mutate the stack of labels maintained by
> client_label_pending.

The drawback of having trusted procedures push things on this stack is
that it would add memory allocation and size overhead when calling
trusted functions, especially for recursive functions.

> The docs need updating, both to reflect the version bump in
> sepgsql-regtest.te and to describe the actual feature.

I can probably write some docs tomorrow.

regards,
Yeb Havinga
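The stack proposal above can be modeled in a few lines. This is an illustrative Python toy with invented names, not the actual sepgsql code; it only sketches the push/pop/setcon/commit interactions being discussed:

```python
class LabelState:
    def __init__(self, initial):
        self.toplevel = initial       # committed ("toplevel") label
        self.stack = [initial]        # transaction start copies it onto an empty stack

    def enter_trusted_proc(self, label):
        self.stack.append(label)      # push on trusted procedure entry

    def exit_trusted_proc(self):
        self.stack.pop()              # pop on trusted procedure exit

    def setcon(self, label):
        self.stack[-1] = label        # sepgsql_setcon() changes only the top

    def commit(self):
        assert len(self.stack) == 1   # all trusted procedures have exited
        self.toplevel = self.stack[0]

s = LabelState("A")
s.setcon("B")                # explicit label change at the top level
s.enter_trusted_proc("T")    # procedure body runs under T ...
s.exit_trusted_proc()        # ... and the caller's label B comes back, not T
s.commit()                   # commit copies the sole stack item into toplevel
```

With these semantics, a sepgsql_setcon() call made inside a trusted procedure would affect only the procedure's own (topmost) label and be discarded on exit, which avoids the surprising label leak Robert describes.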
Re: [HACKERS] Future of our regular expression code
On Sat, Feb 18, 2012 at 21:16, Vik Reykja vikrey...@gmail.com wrote:
> I would be willing to have a go at translating test cases. I do not
> (yet) have the C knowledge to maintain the regex code, though.

I got suddenly swamped and forgot I had signed up for this. I'm still
pretty swamped, and I would like these regression tests to be in for
9.2, so if someone else would like to pick them up, I would be
grateful. If they're still not done by the time I resurface, I will
attack them again.
Re: [HACKERS] elegant and effective way for running jobs inside a database
On Fri, Mar 9, 2012 at 6:51 PM, Andrew Dunstan and...@dunslane.net wrote:
> On 03/09/2012 01:40 PM, Robert Haas wrote:
>> On Fri, Mar 9, 2012 at 12:02 PM, David E. Wheeler
>> da...@justatheory.com wrote:
>>> On Mar 9, 2012, at 7:55 AM, Merlin Moncure wrote:
>>>> 100% agree (having re-read the thread and Alvaro's idea having
>>>> sunk in). Being able to set up daemon processes side by side with
>>>> the postmaster would fit the bill nicely. It's pretty interesting
>>>> to think of all the places you could go with it.
>>> pgAgent could use it *right now*. I keep forgetting to restart it
>>> after restarting PostgreSQL and finding after a day or so that no
>>> jobs have run.
>> That can and should be fixed by teaching pgAgent that failing to
>> connect to the server, or getting disconnected, is not a fatal
>> error, but a reason to sleep and retry.
> Yeah. It's still not entirely clear to me what a
> postmaster-controlled daemon is going to be able to do that an
> external daemon can't.

Start and stop at the same time as postmaster, without any pain. It's a
considerable convenience to be able to design this aspect once and then
have all things linked to the postmaster follow that. It means people
will be able to write code that runs on all OSes easily, without
everybody having similar but slightly different code about starting up,
reading parameters, following security rules etc. Tight integration,
with good usability.

-- 
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] poll: CHECK TRIGGER?
Hello

2012/3/10 Tom Lane t...@sss.pgh.pa.us:
> Peter Eisentraut pete...@gmx.net writes:
>> But then I would have to map all language-specific error reports to
>> some SQL error scheme, which is not only cumbersome but pretty
>> useless. For example, a Python programmer will be familiar with the
>> typical output that pylint produces and how to fix it. If we hide
>> that output behind the layer of SQL-ness, that won't make things
>> easier for anyone.
> Yeah, this is a good point. I'm willing to concede that we are not
> close to having a uniform API that could be used for checker
> functions, so maybe what we should do for now is just invent
> plpgsql_check_function(regprocedure). I'd still like to see the
> question revisited sometime in the future, but it would be
> appropriate to have a few working examples of popular checker
> functions for different languages before we try to invent a common
> API.
> regards, tom lane

Here is a draft. I removed all the generic structures.

Regards
Pavel

Attachment: plpgsql_check_function.diff.gz (GNU Zip compressed data)
Re: [HACKERS] lateral function as a subquery - WIP patch
On Sat, Mar 10, 2012 at 4:16 AM, Tom Lane t...@sss.pgh.pa.us wrote:
> Robert Haas robertmh...@gmail.com writes:
>> On Fri, Mar 9, 2012 at 8:15 PM, Tom Lane t...@sss.pgh.pa.us wrote:
>>> Um ... how do you get the subquery result rows to join to only the
>>> correct rows of the other tables? This looks like an unconstrained
>>> join to me, which is not what I believe the SQL spec for LATERAL
>>> to be, and it doesn't seem especially useful either. (If a
>>> subquery could do what people wanted, we'd not be hearing all the
>>> requests for LATERAL.)
>> I think LATERAL is intended as more or less an unconstrained nested
>> loop with the lateral expression on the inner side, parameterized
>> by value from the outer side. Typically it's a SRF.
> Um ... if it's parameterized by values from a current row of the
> outer side, then it's not an unconstrained join. That would be like
> doing an inner indexscan join and producing a cross-join result.

True. I just meant that no join filter was implied.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
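The semantics the thread converges on, a nested loop whose inner side is re-evaluated per outer row with no additional join filter, can be modeled in a few lines. This is an illustrative Python sketch, not PostgreSQL code:

```python
outer_rows = [1, 2, 3]

def lateral_srf(x):
    # Stand-in for a set-returning function on the inner side,
    # parameterized by a value from the current outer row.
    return [x * 10, x * 10 + 1]

# Every row the lateral expression produces joins to exactly the outer
# row that parameterized it; there is no separate join filter, yet the
# result is not a full cross join.
result = [(o, v) for o in outer_rows for v in lateral_srf(o)]
```

This is why the join is neither "unconstrained" in Tom's sense (the parameterization restricts which inner rows pair with which outer rows) nor filtered in Robert's sense (no join qual is evaluated afterward).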
Re: [HACKERS] [v9.2] Add GUC sepgsql.client_label
On Sat, Mar 10, 2012 at 4:39 AM, Yeb Havinga yebhavi...@gmail.com wrote:
>> As a separate but related note, the label management here seems to
>> be excessively complicated. In particular, it seems to me that the
>> semantics of sepgsql_get_client_label() become quite muddled under
>> this patch. An explicitly-set label overrides the default label, but
>> a trusted procedure's temporary label overrides that. So if you
>> enter a trusted procedure, and it calls sepgsql_setcon(), it'll
>> change the label, but no actual security transition will occur;
>> then, when you exit the trusted procedure, you'll get the new label
>> in place of whatever the caller had before. That seems like one heck
>> of a surprising set of semantics.
> I agree that sepgsql_get_*client*_label does not really match with a
> trusted procedure temporarily changing it.

I'm not complaining about the name of the function; I'm complaining
about the semantics.

>> In the current coding, I think client_label_peer is redundant with
>> client_label_committed - once the latter is set, the former is
>> unused,
> IIUC client_label_peer is used on reset of the client label, i.e.
> calling sepgsql_setcon with NULL.

Ugh. What's the point of supporting that?

> The drawback of having trusted procedures push things on this stack
> is that it would add a memory alloc and size overhead when calling
> trusted functions, especially for recursive functions.

Compared to all the other overhead of using sepgsql, that is minuscule
and not worth worrying about.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] Is it time for triage on the open patches?
On Fri, Mar 9, 2012 at 7:47 PM, Tom Lane t...@sss.pgh.pa.us wrote:
> I think a reasonable way to proceed might be to get some consensus on
> a short list of patches we're willing to try to push to completion,
> then set a schedule accordingly, and then anything that doesn't get
> done by the deadline gets kicked to 9.3. Or we can just keep
> drifting. But the number of open patches that are *not* Ready for
> Committer, nigh two months after the CF started, is depressingly
> large.

* FOR KEY SHARE locks looks in very good shape, so I'm spending time on
it with a view to committing it next week if all goes well.

* pg_stat_statements looks good also; I hope someone is looking at
that.

It's a good thing we have so many patches. We just need to organise
ourselves to ensure that we're working on priority items and spot
important things that are receiving no attention.

At this stage the CF app isn't helping us much. We need some way to
indicate who is actively working on review, not just a list of people
who have at some point reviewed it. Patches without committers will
likely suffer, so we need to be able to see the Committers column on
the display, so we know whether a patch needs one assigned.

-- 
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] elegant and effective way for running jobs inside a database
On 2012-03-09 16:55, Merlin Moncure wrote:
> On Fri, Mar 9, 2012 at 9:36 AM, Kohei KaiGai kai...@kaigai.gr.jp wrote:
>> 2012/3/6 Alvaro Herrera alvhe...@commandprompt.com:
>>> It seems to me that the only thing that needs core support is the
>>> ability to start up the daemon when postmaster is ready to accept
>>> queries, and shut the daemon down when postmaster kills backends
>>> (either because one crashed, or because it's shutting down).
>> So, although my motivation is not something like Cron in core, it
>> seems to me Alvaro's idea is quite desirable and reasonable, to be
>> discussed in v9.3.
> 100% agree (having re-read the thread and Alvaro's idea having sunk
> in). Being able to set up daemon processes side by side with the
> postmaster would fit the bill nicely. It's pretty interesting to
> think of all the places you could go with it.
>
> merlin

Good to hear that (I hope that, even though English is not my native
language, I have understood the posts in this thread properly). I am
convinced that all of you will be proud of the new solution, a kind of
heartbeat for PostgreSQL. Maybe it is too poetic, but considering cron
or pgAgent instead of a real job manager is like considering a
defibrillator instead of a real heart. Currently, especially in web
applications, the question is not where to store data but how data can
flow, and how *fast*.

"It's pretty interesting to think of all the places you could go with
it." - in fact it is :)

Best regards,
Artur
Re: [HACKERS] patch for parallel pg_dump
On Thu, Feb 23, 2012 at 11:37 PM, Joachim Wieland j...@mcknight.de wrote:
> On Thu, Feb 16, 2012 at 1:29 PM, Robert Haas robertmh...@gmail.com wrote:
>> Can you provide an updated patch?
> Robert, updated patch is attached.

Well, I was hoping someone else would do some work on this, but here we
are. Some more comments from me:

+       /*
+        * If the version is lower and we don't have synchronized snapshots
+        * yet, we will error out earlier already. So either we have the
+        * feature or the user has given the explicit command not to use it.
+        * Note: If we have it, we always use it, you cannot switch it off
+        * then.
+        */

I don't agree with this design decision. I think that
--no-synchronized-snapshots should be available even on server versions
that support it.

+       if (archiveFormat != archDirectory && numWorkers > 1) {

Style.

-                const char *owner, bool withOids,
+                const char *owner,
+                unsigned long int relpages, bool withOids,

The new argument to ArchiveEntry() is unused. Removing it would
declutter things a good bit.

+#include "../../backend/port/pipe.c"

This seems really ugly. I suggest that we commit a separate patch to
move src/backend/port/pipe.c to src/port and refactor the interface so
that it has a signature something like this:

int win32_setup_pipe(int handles[2], char **error_string, int *error_code);

The backend can have a wrapper function around this that calls ereport
using the error_string and error_code, and any front-end code that
wants to use this can do so directly.

+/*
+ * The parallel error handler is called for any die_horribly() in a child
+ * or master process. It then takes control over shutting down the rest of
+ * the gang.
+ */

I think this needs to be revised to take control in exit_nicely(),
maybe by using on_exit_nicely(). Trapping die_horribly() won't catch
everything.

+       if (msgsize == 0 && !do_wait) {
+               setnonblocking(fd);
+       }

Style.

+       if (msg[msgsize] == '\0') {
+               return msg;
+       }

Style.

I find myself somewhat uncomfortable with the extent to which this is
relying on being able to set blocking and nonblocking flags on and off
repeatedly. Maybe there's no real problem with it except for being
inefficient, but the way I'd expect readMessageFromPipe() to be written
is: Always leave the pipe in blocking mode. If you want a non-blocking
read, then use select() first to check whether it's ready for read; if
not, just do the read directly. Actually, looking further, it seems
that you already have such logic in getMessageFromWorker(), so I'm
unclear on why do_wait needs to be passed down to readMessageFromPipe()
at all.

+ * Hence this function returns an (unsigned) int.
+ */
+static int

It doesn't look unsigned?

+extern const char *fmtQualifiedId(struct Archive *fout,
+                  const char *schema, const char *id);

I don't think we want to expose struct Archive to dumputils.h. Can we
find a different place to put this?

+enum escrow_action { GET, SET };
+static void
+parallel_error_handler_escrow_data(enum escrow_action act,
+                                   ParallelState *pstate)
+{
+       static ParallelState *s_pstate = NULL;
+
+       if (act == SET)
+               s_pstate = pstate;
+       else
+               *pstate = *s_pstate;
+}

This seems like a mighty complicated way to implement a global
variable.

+#ifdef HAVE_SETSID
+       /*
+        * If we can, we try to make each process the leader of its own
+        * process group. The reason is that if you hit Ctrl-C and they are
+        * all in the same process group, any termination sequence is
+        * possible, because every process will receive the signal. What
+        * often happens is that a worker receives the signal, terminates
+        * and the master detects that one of the workers had a problem,
+        * even before acting on its own signal. That's still okay because
+        * everyone still terminates but it looks a bit weird.
+        *
+        * With setsid() however, a Ctrl-C is only sent to the master and
+        * he can then cascade it to the worker processes.
+        */
+       setsid();
+#endif

This doesn't seem like a very good idea, because if the master fails to
propagate the ^C to all the workers for any reason, then the user will
have to hunt them down manually and kill them. Or imagine that someone
hits ^Z, for example. I think the right thing to do is to have the
workers catch the signal and handle it by terminating, passing along a
notification to the master
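The read strategy Robert describes, leaving the pipe in blocking mode and using select() only to decide whether a read would block, can be sketched as follows. This is illustrative Python on a Unix pipe, not the actual pg_dump code:

```python
import os
import select

def read_message(fd, do_wait):
    # fd stays in blocking mode the whole time; a non-blocking read is
    # emulated by polling readiness with select() (zero timeout) first.
    if not do_wait:
        ready, _, _ = select.select([fd], [], [], 0)
        if not ready:
            return None               # nothing available right now
    return os.read(fd, 4096)          # safe: select() said data is ready,
                                      # or the caller asked to block

r, w = os.pipe()
empty = read_message(r, do_wait=False)   # empty pipe, non-blocking probe
os.write(w, b"hello")
msg = read_message(r, do_wait=False)     # data is ready, so the read returns
```

The advantage over toggling O_NONBLOCK is that the descriptor's mode never changes, so there is no per-call fcntl traffic and no window where another code path sees the pipe in an unexpected mode.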
Re: [HACKERS] Is it time for triage on the open patches?
On Sat, Mar 10, 2012 at 9:04 AM, Simon Riggs si...@2ndquadrant.com wrote:
> * FOR KEY SHARE locks looks in very good shape and so I'm spending
> time on that with a view to committing it next week if all goes well

Álvaro is a committer and is perfectly capable of committing that patch
for himself, had we consensus on it. So far we don't, and I don't think
you can make a unilateral decision to the contrary. Maybe that's not
what you're proposing, but then what exactly are you proposing?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] elegant and effective way for running jobs inside a database
On 03/10/2012 07:11 AM, Simon Riggs wrote:
> On Fri, Mar 9, 2012 at 6:51 PM, Andrew Dunstan and...@dunslane.net wrote:
>> On 03/09/2012 01:40 PM, Robert Haas wrote:
>>> That can and should be fixed by teaching pgAgent that failing to
>>> connect to the server, or getting disconnected, is not a fatal
>>> error, but a reason to sleep and retry.
>> Yeah. It's still not entirely clear to me what a
>> postmaster-controlled daemon is going to be able to do that an
>> external daemon can't.
> Start and stop at the same time as postmaster, without any pain. It's
> a considerable convenience to be able to design this aspect once and
> then have all things linked to the postmaster follow that. It means
> people will be able to write code that runs on all OSes easily,
> without everybody having similar but slightly different code about
> starting up, reading parameters, following security rules etc. Tight
> integration, with good usability.

The devil is in the details, though, pace Mies van der Rohe. In
particular, it's the tight integration piece I'm worried about. What is
the postmaster supposed to do if the daemon start fails? What if it
gets a flood of failures? What access will the daemon have to Postgres
internals? What OS privileges will it have, since this would have to
run as the OS postgres user? In general I think we don't want arbitrary
processes running as the OS postgres user.

I accept that cron might not be the best tool for the job, since a) its
finest granularity is 1 minute, and b) it would need a new connection
for each job. But a well-written external daemon that runs as a
different user and is responsible for making its own connection to the
database, and re-establishing it if necessary, seems to me at least as
clean a design for a job scheduler as one that is stopped and started
by the postmaster.

cheers

andrew
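The sleep-and-retry behavior suggested for pgAgent earlier in the thread might look like this in outline. This is an illustrative Python sketch; connect(), the delays, and the failure simulation are invented for the example, not pgAgent code:

```python
import time

def run_with_retry(connect, work, retry_delay=0.01, max_attempts=5):
    """Treat a failed connection as a reason to sleep and retry,
    not as a fatal error."""
    for _ in range(max_attempts):
        try:
            conn = connect()
        except ConnectionError:
            time.sleep(retry_delay)   # server not up (yet); wait and retry
            continue
        return work(conn)
    raise RuntimeError("server never became reachable")

# Simulate a server that only accepts the third connection attempt,
# e.g. because PostgreSQL is still restarting.
attempts = {"n": 0}

def flaky_connect():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise ConnectionError("connection refused")
    return "conn"

result = run_with_retry(flaky_connect, lambda conn: "jobs ran")
```

A real daemon would loop forever with a bounded backoff instead of capping attempts, but the structural point is the same: the scheduler survives a postmaster restart without needing to be restarted itself.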
Re: [HACKERS] poll: CHECK TRIGGER?
Here is the draft, and here a somewhat cleaned-up version.

Regards
Pavel Stehule

Attachment: plpgsql_check_function.diff.gz (GNU Zip compressed data)
Re: [HACKERS] Is it time for triage on the open patches?
Simon Riggs si...@2ndquadrant.com writes:
> * pg_stat_statements looks good also, I hope someone is looking at
> that

I will take that one, if it ever gets marked RFC, but in the meantime I
plan to spend my time elsewhere.

> At this stage the CF app isn't helping us much. We need some way to
> indicate who is actively working on review, not just a list of people
> who have at some point reviewed it. Patches without committers will
> likely suffer, so we need to be able to see the Committers column on
> the display, so we know whether a patch needs one assigned.

I've kind of wished the Committer field could be seen in the list view
too, although I'm not sure what I'd give up for it (narrowing the other
fields doesn't sound good). Mainly what I wanted it for is to check
whether there is anything I claimed and have not dealt with; which is
not something I need to check every time I look, so perhaps some
different view could cover this need.

Another thing that would be really nice is the ability to sort open
items by last-activity time (disregarding the category labels), to make
it easier to spot which items are not receiving any attention.

regards, tom lane
Re: [HACKERS] Future of our regular expression code
Vik Reykja vikrey...@gmail.com writes:
> I got suddenly swamped and forgot I had signed up for this. I'm still
> pretty swamped and I would like these regression tests to be in for
> 9.2 so if someone else would like to pick them up, I would be
> grateful.

FWIW, I spent a few minutes looking at the Tcl regression tests and
realized that they are not in a form that's tremendously useful to us.
What they are, unsurprisingly, are Tcl scripts, and a lot of the
specific test cases are couched as calls to special-purpose Tcl
functions. I tried inserting some hooks that would print out the
arguments/results of the underlying regexp and regsub calls, but didn't
get far (my Tcl is way too rusty :-(). I also found that quite a few of
the test cases are concerned with features that are not accessible, or
at least not accessible in the same way, from our SQL functions. Those
test cases would still be worthwhile for a standalone library package,
but they won't be much use in a Postgres regression test.

regards, tom lane
Re: [HACKERS] pgsql_fdw, FDW for PostgreSQL server
Shigeru Hanada shigeru.han...@gmail.com writes:
> I've not read the whole of the patch yet, but I have basic questions.
> 1) IIUC, GetForeignRelSize should set baserel->rows to the number of
> rows the ForeignScan node returns to the upper node, but not the
> number of rows the FDW returns to the core executor, right?

It should be the number of rows estimated to pass the baserestrictinfo
restriction clauses, so yeah, not the same as what the FDW would
return, except in cases where all the restriction clauses are handled
internally by the FDW.

> BTW, once Fujita-san's ANALYZE support patch is merged, we will be
> able to get a rows estimation easily by calling
> clauselist_selectivity with baserel->tuples and baserestrictinfo.
> Otherwise, pgsql_fdw would still need to execute EXPLAIN on the
> remote side to get a meaningful rows estimation.

Yeah, one of the issues for that patch is how we see it coexisting with
the option of doing a remote-side EXPLAIN.

> 2) ISTM that pgsql_fdw needs to execute EXPLAIN on the remote side
> for each possible remote query to get meaningful cost estimation, and
> it requires pgsql_fdw to generate SQL statements in GetForeignPaths.
> I worry that I've misunderstood the intention of your design, because
> you've mentioned postponing SQL deparsing to createplan time.

If you want to get the cost estimates that way, then yes, you'd need to
do some SQL-statement construction earlier than final plan generation.
But it's not apparent to me that those statements would necessarily be
the same as, or even very similar to, what the final queries would be.
For instance, you'd probably try to reduce parameters to constants for
estimation purposes.

> GetForeignPaths
> 1) Repeat for each possible remote query:
> 1-1) Generate remote query, such as with-WHERE and with-ORDER BY.
> 1-2) Execute EXPLAIN of generated query, and get costs estimation
> (rows estimation is ignored because it's useless in planning).

Why do you say that?

regards, tom lane
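The clauselist_selectivity-based estimate discussed above is, in outline, the relation's tuple count times the combined selectivity of the restriction clauses. A back-of-envelope sketch (function and argument names are invented; the real planner combines clause selectivities with more care than simple multiplication, which assumes independent clauses):

```python
def estimate_rows(baserel_tuples, clause_selectivities):
    """Rough analogue of rows = tuples * clauselist_selectivity(...)."""
    sel = 1.0
    for s in clause_selectivities:
        sel *= s          # naive: treat restriction clauses as independent
    return baserel_tuples * sel

# e.g. 100,000 remote tuples, two WHERE clauses with selectivities
# 0.1 and 0.5 estimated from local statistics gathered by ANALYZE
rows = estimate_rows(100000, [0.1, 0.5])
```

This is why ANALYZE support matters for the FDW: with local statistics, the estimate above costs nothing, whereas without them every candidate remote query needs a round-trip EXPLAIN.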
Re: [HACKERS] pgsql_fdw, FDW for PostgreSQL server
Shigeru Hanada shigeru.han...@gmail.com writes:
> Thanks for the review. Agreed to write own deparser for pgsql_fdw
> which handles nodes which can be pushed down. Every SQL-based FDW
> which constructs SQL statements for each local query would need such
> a module inside.

Yeah. That's kind of annoying, and the first thing you think of is that
we ought to find a way to share that code somehow. But I think it's
folly to try to design a shared implementation until we have some
concrete implementations to compare. An Oracle FDW, for instance, would
need to emit SQL code with many differences in detail from pgsql_fdw.
It's not clear to me whether a shared implementation is even practical,
but for sure I don't want to try to build it before we've done some
prototype single-purpose implementations.

regards, tom lane
Re: [HACKERS] lateral function as a subquery - WIP patch
On 03/10/2012 02:15 AM, Tom Lane wrote:
> Um ... how do you get the subquery result rows to join to only the
> correct rows of the other tables?

The subquery just restricts the set of rows that the function has to
evaluate. The main query is supposed to perform the join. I understand
that such a join causes a repeated scan of the function if the function
is on the inner side.

> This looks like an unconstrained join to me, which is not what I
> believe the SQL spec for LATERAL to be,

O.k., then just forget about my proposal.

Thanks for your comments anyway,
Tony H.
Re: [HACKERS] Caching for stable expressions with constant arguments v6
On Sat, Mar 10, 2012 at 02:05, Tom Lane t...@sss.pgh.pa.us wrote: Marti Raudsepp ma...@juffo.org writes: [ cacheexpr-v8.patch ] A few comments Whoa, that's quite a few. Thanks for the review. * There's a lot of stuff that seems wrong in detail in eval_const_expressions_mutator, eg it'll try to wrap a plain vanilla Const with a CacheExpr. I see you've hacked that case inside insert_cache itself, but that seems like evidence of a poorly thought out recursion invariant. The function should have a better notion than that of whether caching is useful for a given subtree. Well my logic was this: accessing params and consts is just as fast as accessing the cache -- there's no evaluation involved. So there's no point in inserting CacheExpr in front of those. All other kinds of expressions require some sort of evaluation, so they are cached, if they weren't already constant-folded. In general I'd not expect it to insert a CacheExpr unless there is a Param or stable function call somewhere below the current point, but it seems not to be tracking that. I think if you continue on your Param or stable function train of thought, then it boils down to expressions that can't be constant-folded. And that's how it already works now -- constant-foldable expressions get reduced to a Const, which isn't cached. There's no need to track it specially since we can check whether we've got a Const node. You probably need at least a three-way indicator returned from each recursion level: subexpression is not cacheable (eg it contains a Var or volatile function); subexpression is cacheable but there is no reason to do so (default situation); or subexpression is cacheable and should be cached (it contains a Param or stable function). That could be done -- give a special cachability result in code paths that return a constant -- but that seems like complicating the logic and doesn't tell us anything we don't already know. 
* I'm having a hard time understanding what you did to simplify_function and friends; that's been whacked around quite a bit, in ways that are neither clearly correct nor clearly related to the goal of the patch. Sure, I'll split this out as a separate patch and send it up. * I believe the unconditional datumCopy call in ExecEvalCacheExpr will dump core if the value is null and the type is pass-by-reference. datumCopy already has a check for NULL pointer. But good point about skipping type properties lookup -- will change. * I think you should consider getting rid of the extra argument to ExecInitExpr, as well as the enable flag in CacheExprState, and instead driving cache-enable off a new flag added to EState Will do. * In the same vein, I think it would be better to avoid adding the extra argument to eval_const_expressions_mutator and instead pass that state around in the existing context struct argument. I'll have to see how ugly it gets to save restore the cachability field in recursive calls. I am entirely unimpressed with the fact that you can't pass the extra argument through expression_tree_mutator and thus have to dumb the code down to fail to cache underneath any node type for which there's not explicit coding in eval_const_expressions_mutator. Well I deliberately chose a whitelisting approach. Expression types that aren't important enough to be constant-folded probably aren't that important for caching either. Do you think it's safe to assume that expression types we don't know about are inherently cachable? + * NOTE: This function is not indempotent. Calling it twice over an expression * It seems like some of the ugliness here is due to thinking that selectivity functions can't cope with CacheExprs. Wouldn't it be a lot better to make them cope? I thought centralizing this CacheExpr-stripping to one place was a better idea than spraying it all around the code base. We can skip the copy *and* the check this way. 
Hmm, if I passed the context to insert_cache then we could avoid this problem. We could do the same for RelabelType in estimation mode, but I don't know how safe that is. * I don't think it's a good idea for simplify_and_arguments/simplify_or_arguments to arbitrarily reorder the arguments based on cacheability. I know that we tell people not to depend on order of evaluation in AND/OR lists, but that doesn't mean they listen Fair enough, will change. * I don't believe CacheExprs can be treated as always simple by ruleutils' isSimpleNode Ok. * I don't think I believe either of the changes you made to plpgsql's exec_simple_check_node. Will change. Regards, Marti -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
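Tom's suggested three-way indicator could be sketched roughly as follows. This is a hypothetical illustration only -- none of these names exist in the actual patch -- showing how each recursion level might report whether its subtree is uncacheable, cacheable but not worth it, or worth caching, with a parent combining its children's reports:

```c
/* Hypothetical sketch of the three-way cacheability result discussed
 * above; the enum and function names are invented for illustration. */
#include <assert.h>

typedef enum
{
    UNCACHEABLE,        /* contains a Var or volatile function */
    CACHEABLE_TRIVIAL,  /* could cache, but no reason to (e.g. Const) */
    CACHEABLE_USEFUL    /* contains a Param or stable function call */
} Cacheability;

/*
 * Combine a parent's running state with one child's report: a single
 * uncacheable child poisons the whole subtree, while a single "useful"
 * child upgrades an otherwise trivial subtree.
 */
static Cacheability
combine_cacheability(Cacheability parent, Cacheability child)
{
    if (parent == UNCACHEABLE || child == UNCACHEABLE)
        return UNCACHEABLE;
    if (parent == CACHEABLE_USEFUL || child == CACHEABLE_USEFUL)
        return CACHEABLE_USEFUL;
    return CACHEABLE_TRIVIAL;
}
```

Under this scheme a CacheExpr would be inserted only at the boundary where a CACHEABLE_USEFUL subtree hangs under a parent that is itself uncacheable, avoiding cache nodes over bare Consts or Params.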
Re: [HACKERS] pgsql_fdw, FDW for PostgreSQL server
On Sat, Mar 10, 2012 at 11:38:51AM -0500, Tom Lane wrote: Shigeru Hanada shigeru.han...@gmail.com writes: Thanks for the review. Agreed to write own deparser for pgsql_fdw which handles nodes which can be pushed down. Every SQL-based FDW which constructs SQL statement for each local query would need such module inside. Yeah. That's kind of annoying, and the first thing you think of is that we ought to find a way to share that code somehow. But I think it's folly to try to design a shared implementation until we have some concrete implementations to compare. An Oracle FDW, for instance, would need to emit SQL code with many differences in detail from pgsql_fdw. It's not clear to me whether a shared implementation is even practical, but for sure I don't want to try to build it before we've done some prototype single-purpose implementations. FWIW, this sounds like the compiler mechanism in SQLAlchemy for turning SQL node trees into strings. The basic idea is you define functions for converting nodes to strings. Stuff like And and Or works for every database, but then dialects can override different things. So you have for Postgres: Node(foo) = $1, but to other databases perhaps :field1. But most of the other code can be shared. http://docs.sqlalchemy.org/en/latest/core/compiler.html In my experience it works well for generating custom constructs. They have compilers for 11 different database engines, so it seems flexible enough. Mind you, they also handle DDL mapping (where most of the variation is) and datatype translations, which seems a lot further than we need here. Have a nice day, -- Martijn van Oosterhout klep...@svana.org http://svana.org/kleptog/ He who writes carelessly confesses thereby at the very outset that he does not attach much importance to his own thoughts. -- Arthur Schopenhauer
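The dialect-override idea Martijn describes could be modeled in C along these lines. This is an entirely hypothetical sketch (inspired by SQLAlchemy's compiler, not taken from any pgsql_fdw code): shared code renders the common parts of an expression, while a dialect struct supplies per-construct callbacks such as how a parameter placeholder is spelled:

```c
/* Hypothetical sketch: shared deparsing code with per-dialect
 * overrides for the parts that vary between databases. */
#include <assert.h>
#include <stdio.h>
#include <string.h>

typedef struct Dialect
{
    /* renders a parameter placeholder into buf; returns bytes written */
    int (*param)(char *buf, int paramno);
} Dialect;

/* PostgreSQL-style positional parameter: $1, $2, ... */
static int pg_param(char *buf, int paramno)
{
    return sprintf(buf, "$%d", paramno);
}

/* Named-parameter style used by some other databases: :p1, :p2, ... */
static int named_param(char *buf, int paramno)
{
    return sprintf(buf, ":p%d", paramno);
}

/* Shared code: render "col = <param>" using the dialect's spelling. */
static void deparse_eq_param(const Dialect *d, const char *col,
                             int paramno, char *buf)
{
    int n = sprintf(buf, "%s = ", col);
    d->param(buf + n, paramno);
}
```

The bulk of the deparser (boolean operators, parenthesization, and so on) would live in the shared function, with only the dialect-specific spellings behind function pointers.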
Re: [HACKERS] Caching for stable expressions with constant arguments v6
Marti Raudsepp ma...@juffo.org writes: On Sat, Mar 10, 2012 at 02:05, Tom Lane t...@sss.pgh.pa.us wrote: * There's a lot of stuff that seems wrong in detail in eval_const_expressions_mutator, eg it'll try to wrap a plain vanilla Const with a CacheExpr. I see you've hacked that case inside insert_cache itself, but that seems like evidence of a poorly thought out recursion invariant. The function should have a better notion than that of whether caching is useful for a given subtree. Well my logic was this: accessing params and consts is just as fast as accessing the cache -- there's no evaluation involved. So there's no point in inserting CacheExpr in front of those. All other kinds of expressions require some sort of evaluation, so they are cached, if they weren't already constant-folded. I think this is overkill. Inserting a CacheExpr is far from free: it costs cycles to make that node, cycles to copy it around, cycles to recurse through it during subsequent processing. We should not be putting them in to save trivial amounts of execution time. So I don't believe your argument that there is no such thing as a non-Const, non-volatile expression that shouldn't be cached. Examples of things that clearly are not expensive enough to deserve a cache node are RelabelType and PlaceHolderVar (the latter won't even be there at runtime). More generally, though, I think that caching something like, say, Param int4pl 1 is probably a net loss once you consider all the added overhead. I'm not going to argue for putting explicit cost considerations into the first version, but I don't want the infrastructure to be such that it's impossible to bolt that on later. * I believe the unconditional datumCopy call in ExecEvalCacheExpr will dump core if the value is null and the type is pass-by-reference. datumCopy already has a check for NULL pointer. 
You're assuming that a null Datum necessarily has an all-zero value, which is not a safe assumption; in many situations the datum word will be uninitialized. The null-pointer check in datumCopy is not particularly useful IMO. It's probably a hangover from fifteen years ago when the system's null-handling was a lot shakier than it is now. Do you think it's safe to assume that expression types we don't know about are inherently cachable? Presumably, anything that doesn't contain a Var nor set off contain_volatile_functions() should be safe to cache. I do not care for the assumption that this set of expression types is identical to the set that eval_const_expressions bothers to deal with. * It seems like some of the ugliness here is due to thinking that selectivity functions can't cope with CacheExprs. Wouldn't it be a lot better to make them cope? I thought centralizing this CacheExpr-stripping to one place was a better idea than spraying it all around the code base. How exactly do you figure that this avoids needing to know about them elsewhere? Not everything in the selectivity code starts out by doing estimate_expression_value. Perhaps more to the point, the stuff that *does* do that is generally going to punt if it doesn't get a plain Const back, so eliminating CacheExprs from non-Const cases isn't helping it anyway. regards, tom lane
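Tom's point about the uninitialized datum word can be illustrated with a minimal sketch. The types below are simplified stand-ins, not the real Datum/datumCopy machinery: the caller's isnull flag, not the datum word itself, is what says whether there is anything to copy, because a null datum's word may hold uninitialized garbage rather than a zero pointer:

```c
/* Simplified sketch (not actual PostgreSQL source): copying a
 * pass-by-reference datum must consult the isnull flag first,
 * since a null datum's word may be arbitrary garbage. */
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef uintptr_t Datum;

typedef struct varlena_sketch
{
    int   len;
    char  data[8];
} varlena_sketch;

static Datum
copy_byref_datum(Datum value, int isnull)
{
    if (isnull)
        return value;           /* never dereference a null datum */

    varlena_sketch *src = (varlena_sketch *) value;
    varlena_sketch *dst = malloc(sizeof(varlena_sketch));
    memcpy(dst, src, sizeof(varlena_sketch));
    return (Datum) dst;
}
```

A NULL-pointer check inside the copy routine would not save you here: the garbage word of a null datum is typically non-NULL, so an unconditional copy would dereference it.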
Re: [HACKERS] Is it time for triage on the open patches?
On Sat, Mar 10, 2012 at 2:55 PM, Robert Haas robertmh...@gmail.com wrote: On Sat, Mar 10, 2012 at 9:04 AM, Simon Riggs si...@2ndquadrant.com wrote: * FOR KEY SHARE locks looks in very good shape and so I'm spending time on that with a view to committing it next week if all goes well Álvaro is a committer and is perfectly capable of committing that patch for himself, had we consensus on it. So far we don't, and I don't think you can make a unilateral decision to the contrary. Maybe that's not what you're proposing, but then what exactly are you proposing? I don't think I can express myself more clearly than the above quote. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Is it time for triage on the open patches?
On Sat, Mar 10, 2012 at 3:47 PM, Tom Lane t...@sss.pgh.pa.us wrote: Simon Riggs si...@2ndquadrant.com writes: * pg_stat_statements looks good also, I hope someone is looking at that I will take that one, if it ever gets marked RFC, but in the meantime I plan to spend my time elsewhere. At this stage the CF app isn't helping us much. We need some way to indicate who is actively working on review, not just a list of people who have at some point reviewed it. Patches without committers will likely suffer, so we need to be able to see the Committers column on the display, so we know whether a patch needs one assigned. I've kind of wished the Committer field could be seen in the list view too, although I'm not sure what I'd give up for it (narrowing the other fields doesn't sound good). Mainly what I wanted it for is to check whether there is anything I claimed and have not dealt with, which is not something I need to check every time I look, so perhaps some different view could cover this need. Yes, that's a good idea. Another thing that would be really nice is the ability to sort open items by last-activity time (disregarding the category labels), to make it easier to spot which items are not receiving any attention. And another. My worry is things I care about, but may not be receiving the attention they deserve (IMHO) so I can do something about them before someone calls time. I guess that means I'd like to assign Importance values to them as well for my own use. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Review of pg_archivecleanup -x option patch
On Fri, Mar 9, 2012 at 9:07 AM, Robert Haas robertmh...@gmail.com wrote: On Fri, Mar 9, 2012 at 12:47 AM, Jaime Casanova ja...@2ndquadrant.com wrote: Sorry, here's the patch rebased and with the suggestion from Alex. Which reminds me, I never thanked him for the review (shame on me) :D with the patch this time This may be a stupid idea, but it seems to me that it might be better to dispense with all of the logic in here to detect whether the file name is still going to be long enough after chomping the extension. I feel like that's just making things complicated. while i like the idea of separating the logic, i don't like the results: for example i tried this (notice that i forgot the point), and it just says nothing (not even that the file with the extension but without the point doesn't exist). that's why we were checking that the length matches $ ./pg_archivecleanup -x bz2 /tmp 000100010058 $ -- Jaime Casanova www.2ndQuadrant.com Professional PostgreSQL: Soporte 24x7 y capacitación
Re: [HACKERS] Review of pg_archivecleanup -x option patch
On Sat, Mar 10, 2012 at 2:38 PM, Jaime Casanova ja...@2ndquadrant.com wrote: On Fri, Mar 9, 2012 at 9:07 AM, Robert Haas robertmh...@gmail.com wrote: On Fri, Mar 9, 2012 at 12:47 AM, Jaime Casanova ja...@2ndquadrant.com wrote: Sorry, here's the patch rebased and with the suggestion from Alex. Which reminds me, I never thanked him for the review (shame on me) :D with the patch this time This may be a stupid idea, but it seems to me that it might be better to dispense with all of the logic in here to detect whether the file name is still going to be long enough after chomping the extension. I feel like that's just making things complicated. while i like the idea of separating the logic, i don't like the results: for example i tried this (notice that i forgot the point), and it just says nothing (not even that the file with the extension but without the point doesn't exist). that's why we were checking that the length matches $ ./pg_archivecleanup -x bz2 /tmp 000100010058 $ i would add this in the middle of the TrimExtension() function:

if ((flen - (elen + 1) != XLOG_DATA_FNAME_LEN &&
     flen - (elen + 1) != XLOG_BACKUP_FNAME_LEN) &&
    (flen != XLOG_DATA_FNAME_LEN &&
     flen != XLOG_BACKUP_FNAME_LEN))
{
    fprintf(stderr, "%s: invalid filename input\n", progname);
    fprintf(stderr, "Try \"%s --help\" for more information.\n", progname);
    exit(2);
}

-- Jaime Casanova www.2ndQuadrant.com Professional PostgreSQL: Soporte 24x7 y capacitación
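The length check being discussed can be packaged as a standalone sketch. The constants below are assumptions for illustration (24 hex characters for a WAL segment name, 40 for a backup history file name, matching the lengths pg_archivecleanup deals in); a filename passes if its length matches either form, with or without a ".ext" suffix:

```c
/* Standalone sketch of the proposed filename-length validation;
 * the constant values are assumed, not copied from the source. */
#include <assert.h>
#include <string.h>

#define XLOG_DATA_FNAME_LEN    24   /* e.g. 000000010000000100000058 */
#define XLOG_BACKUP_FNAME_LEN  40   /* WAL name + ".xxxxxxxx.backup" */

static int
filename_length_ok(const char *fname, const char *ext)
{
    size_t flen = strlen(fname);
    size_t elen = ext ? strlen(ext) : 0;

    /* bare WAL segment or backup history file name */
    if (flen == XLOG_DATA_FNAME_LEN || flen == XLOG_BACKUP_FNAME_LEN)
        return 1;

    /* name plus "." plus the configured extension */
    if (elen > 0 &&
        (flen - (elen + 1) == XLOG_DATA_FNAME_LEN ||
         flen - (elen + 1) == XLOG_BACKUP_FNAME_LEN))
        return 1;

    return 0;
}
```

With this, the "-x bz2" case Jaime describes (extension given without its dot, so nothing matches) would be rejected with an error rather than silently doing nothing.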
[HACKERS] Refactoring simplify_function (was: Caching constant stable expressions)
Hi list, Per Tom's request, I split out this refactoring from my CacheExpr patch. Basically I'm just centralizing the eval_const_expressions_mutator() call on function arguments, from multiple different places to a single location. Without this, it would be a lot harder to implement argument caching logic in the CacheExpr patch. The old call tree goes like: case T_FuncExpr: - eval_const_expressions_mutator(args) - simplify_function - reorder_function_arguments - eval_const_expressions_mutator(args) - add_function_defaults - eval_const_expressions_mutator(args) New call tree: case T_FuncExpr: - simplify_function - simplify_copy_function_arguments - reorder_function_arguments - add_function_defaults - eval_const_expressions_mutator(args) Assumptions being made: * The code didn't depend on expanding existing arguments before adding defaults * operator calls don't need to expand default arguments -- it's currently impossible to CREATE OPERATOR with a function that has unspecified arguments Other changes: * simplify_function no longer needs a 'bool has_named_args' argument (it finds out itself). * I added 'bool mutate_args' in its place, since some callers need to mutate args beforehand. * reorder_function_arguments no longer needs to keep track of which default args were added. Regards, Marti diff --git a/src/backend/optimizer/util/clauses.c b/src/backend/optimizer/util/clauses.c new file mode 100644 index cd3da46..9485e95 *** a/src/backend/optimizer/util/clauses.c --- b/src/backend/optimizer/util/clauses.c *** static List *simplify_and_arguments(List *** 109,124 static Node *simplify_boolean_equality(Oid opno, List *args); static Expr *simplify_function(Expr *oldexpr, Oid funcid, Oid result_type, int32 result_typmod, Oid result_collid, ! Oid input_collid, List **args, ! bool has_named_args, ! bool allow_inline, eval_const_expressions_context *context); static List *reorder_function_arguments(List *args, Oid result_type, ! HeapTuple func_tuple, ! 
eval_const_expressions_context *context); static List *add_function_defaults(List *args, Oid result_type, ! HeapTuple func_tuple, ! eval_const_expressions_context *context); static List *fetch_function_defaults(HeapTuple func_tuple); static void recheck_cast_function_args(List *args, Oid result_type, HeapTuple func_tuple); --- 109,123 static Node *simplify_boolean_equality(Oid opno, List *args); static Expr *simplify_function(Expr *oldexpr, Oid funcid, Oid result_type, int32 result_typmod, Oid result_collid, ! Oid input_collid, List **args_p, ! bool mutate_args, bool allow_inline, eval_const_expressions_context *context); + static List *simplify_copy_function_arguments(List *old_args, Oid result_type, + HeapTuple func_tuple); static List *reorder_function_arguments(List *args, Oid result_type, ! HeapTuple func_tuple); static List *add_function_defaults(List *args, Oid result_type, ! HeapTuple func_tuple); static List *fetch_function_defaults(HeapTuple func_tuple); static void recheck_cast_function_args(List *args, Oid result_type, HeapTuple func_tuple); *** eval_const_expressions_mutator(Node *nod *** 2303,2336 case T_FuncExpr: { FuncExpr *expr = (FuncExpr *) node; ! List *args; ! bool has_named_args; Expr *simple; FuncExpr *newexpr; - ListCell *lc; - - /* - * Reduce constants in the FuncExpr's arguments, and check to - * see if there are any named args. - */ - args = NIL; - has_named_args = false; - foreach(lc, expr-args) - { - Node *arg = (Node *) lfirst(lc); - - arg = eval_const_expressions_mutator(arg, context); - if (IsA(arg, NamedArgExpr)) - has_named_args = true; - args = lappend(args, arg); - } /* * Code for op/func reduction is pretty bulky, so split it out * as a separate function. Note: exprTypmod normally returns * -1 for a FuncExpr, but not when the node is recognizably a * length coercion; we want to preserve the typmod in the ! * eventual Const if so. 
*/ simple = simplify_function((Expr *) expr, expr-funcid, --- 2302,2318 case T_FuncExpr: { FuncExpr *expr = (FuncExpr *) node; ! List *args = expr-args; Expr *simple; FuncExpr *newexpr; /* * Code for op/func reduction is pretty bulky, so split it out * as a separate function. Note: exprTypmod normally returns * -1 for a FuncExpr, but not when the node is recognizably a * length coercion; we want to preserve the typmod in the ! * eventual Const if so. This function also mutates the ! * argument list. */ simple = simplify_function((Expr *) expr, expr-funcid,
Re: [HACKERS] pg_prewarm
On Mar 9, 2012, at 2:34 PM, Robert Haas wrote: On Fri, Mar 9, 2012 at 5:42 AM, Hans-Jürgen Schönig postg...@cybertec.at wrote: we had some different idea here in the past: what if we had a procedure / method to allow people to save the list of current buffers / cached blocks to be written to disk (sorted). we could then reload this cache profile on startup in the background or people could load a certain cache content at runtime (maybe to test or whatever). writing those block ids in sorted order would help us to avoid some random I/O on reload. I don't think that's a bad idea at all, and someone actually did write a patch for it at one point, though it didn't get committed, partly I believe because of technical issues and partly because Greg Smith was uncertain how much good it did to restore shared_buffers without thinking about the OS cache. Personally, I don't buy into the latter objection: a lot of people are running with data sets that fit inside shared_buffers, and those people would benefit tremendously. However, this just provides mechanism, not policy, and is therefore more general. You could use pg_buffercache to save the cache contents at shutdown and pg_prewarm to load those blocks back in at startup, if you were so inclined. Or if you just want to load up your main relation, and its indexes, you can do that, too. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company i also think that it can be beneficial. once in a while people ask how to bring a database up to speed after a restart. i have seen more than one case when a DB was close to death after a restart because random I/O was simply killing it during cache warmup. it seems the problem is getting worse as we see machines with more and more RAM in the field. 
technically i would see a rather brute force approach: if we just spill out the list of blocks we got in shared buffer atm (not content of course, just physical location sorted by file / position in file) it would be good enough. if a block physically does not exist on reload any more it would not even be an issue and allow people basically to snapshot their cache status. we could allow named cache profiles or so and make a GUC to indicate if one of them should be preloaded on startup (background or beforehand - i see usecases for both approaches). yes, somehow linking to pg_buffercache makes a lot of sense. maybe just extending it with some extra functions is already enough for most cases. hans -- Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt Web: http://www.postgresql-support.de
Re: [HACKERS] pg_prewarm
Hi Robert, Just recently I asked on postgres-performance "PG as in-memory db? How to warm up and re-populate buffers? How to read in all tuples into memory?" Still open was the question of best practice for configuration and the relationship between the disk/OS cache vs. the Postgres cache. The main conclusion was: * Do a tar cf /dev/zero $PG_DATA/base either shortly before or shortly after the database is created * Do a seq scan SELECT * FROM osm_point. Is your tool a replacement of those above? -Stefan 2012/3/9 Robert Haas robertmh...@gmail.com: On Fri, Mar 9, 2012 at 10:53 AM, Jeff Janes jeff.ja...@gmail.com wrote: On Fri, Mar 9, 2012 at 5:21 AM, Robert Haas robertmh...@gmail.com wrote: On Fri, Mar 9, 2012 at 5:24 AM, Fujii Masao masao.fu...@gmail.com wrote: When a relation is loaded into cache, are corresponding indexes also loaded at the same time? No, although if you wanted to do that you could easily do so, using a query like this: select pg_prewarm(indexrelid, 'main', 'read', NULL, NULL) from pg_index where indrelid = 'your_table_name'::regclass; Could that be included in an example? Maybe admins are expected to know how to construct such queries off the cuff, but I always need to look it up each time which is rather tedious. Not a bad idea. I thought of including an Examples section, but it didn't seem quite worth it for the simple case of prewarming a heap. Might be worth it to also include this. In the patch: s/no special projection/no special protection/ OK, will fix. Thanks for putting this together. I will confess that it was 0% altruistic. Not having it was ruining my day.
:-) -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] [v9.2] Add GUC sepgsql.client_label
On 2012-03-10 14:06, Robert Haas wrote: On Sat, Mar 10, 2012 at 4:39 AM, Yeb Havinga yebhavi...@gmail.com wrote: As a separate but related note, the label management here seems to be excessively complicated. In particular, it seems to me that the semantics of sepgsql_get_client_label() become quite muddled under this patch. An explicitly-set label overrides the default label, but a trusted procedure's temporary label overrides that. So if you enter a trusted procedure, and it calls sepgsql_setcon(), it'll change the label, but no actual security transition will occur; then, when you exit the trusted procedure, you'll get the new label in place of whatever the caller had before. That seems like one heck of a surprising set of semantics. I agree that sepgsql_get_*client*_label does not really match with a trusted procedure temporarily changing it. I'm not complaining about the name of the function; I'm complaining about the semantics. The semantics are muddled because the client labels are mixed with labels from trusted procedures. The stack you described upthread may also exhibit surprising behaviour. If a trusted procedure is called, its label is pushed onto the stack. Suppose it pushes another label by calling sepgsql_setcon(). In the stack case, this is now the top of the stack and the result of sepgsql_get_client_label(). The procedure exits. What should now be the result of sepgsql_get_client_label()? Since labels are managed by a stack, we can only pop what's on top and need to pop twice, so we've lost the label pushed onto the stack by the trusted function, which means that trusted procedures cannot be used to change client labels beyond their own scope, but other functions could.
Maybe this semantics muddling of trusted procedures and setting the client label can be avoided by denying changing the client label with sepgsql_setcon() from a trusted procedure, which would also make some sense: ERROR: cannot override the client label of a trusted procedure during execution. -- Yeb Havinga http://www.mgrid.net/ Mastering Medical Data
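The surprising interaction Yeb describes can be shown with a toy model of the label stack. This contains no sepgsql code, only the control flow: a trusted procedure pushes its label, a sepgsql_setcon() inside it pushes another, and the procedure's exit must pop both frames, losing the label that setcon installed:

```c
/* Toy model of the label stack discussed above (no real sepgsql
 * code); demonstrates that a label set via setcon inside a trusted
 * procedure cannot outlive the procedure. */
#include <assert.h>
#include <string.h>

#define MAX_DEPTH 8

static const char *label_stack[MAX_DEPTH];
static int depth = 0;

static void push_label(const char *l) { label_stack[depth++] = l; }
static void pop_label(void)           { depth--; }
static const char *client_label(void) { return label_stack[depth - 1]; }
```

Walking through Yeb's scenario: push the session label A, push the trusted procedure's label T, then push B via setcon; the client label is now B, but when the procedure exits and both of its frames are popped, the session is back at A and B is gone.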
[HACKERS] about EncodeDateTime() arguments
We currently have void EncodeDateTime(struct pg_tm * tm, fsec_t fsec, int *tzp, char **tzn, int style, char *str) but tzn isn't used anywhere, only *tzn is used everywhere. Wouldn't it be clearer to remove that one level of indirection and instead have the signature be void EncodeDateTime(struct pg_tm * tm, fsec_t fsec, int *tzp, char *tzn, int style, char *str) or better yet void EncodeDateTime(struct pg_tm * tm, fsec_t fsec, const int *tzp, const char *tzn, int style, char *str)
Re: [HACKERS] Is it time for triage on the open patches?
On 10 March 2012 15:47, Tom Lane t...@sss.pgh.pa.us wrote: Simon Riggs si...@2ndquadrant.com writes: * pg_stat_statements looks good also, I hope someone is looking at that I will take that one, if it ever gets marked RFC, but in the meantime I plan to spend my time elsewhere. It has been marked RFC now. -- Peter Geoghegan http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training and Services
Re: [HACKERS] pg_crypto failures with llvm on OSX
On 03/09/2012 07:50 PM, Tom Lane wrote: Andrew Dunstan and...@dunslane.net writes: Buildfarm member mussel (OS X 10.7.3, llvm-gcc 4.2.1, x86_64) seems to be getting consistent warnings when running the pgcrypto regression tests, that look like this: WARNING: detected write past chunk end in ExprContext 0x7fec2b11eb58 Does anyone have an idea why that might be? Worksforme, on an up-to-date Lion system using exactly the same compiler version. I do see the deprecation warnings (which Apple seems to have plastered on every single OpenSSL function ... doesn't leave me with a warm feeling about their future plans). I suspect that mussel has an ABI-incompatible openssl library hanging around someplace. On my machine otool -L pgcrypto.so shows /usr/lib/libcrypto.0.9.8.dylib (compatibility version 0.9.8, current version 44.0.0) It'd be interesting to know what it says on mussel. Robert, please investigate. cheers andrew
Re: [HACKERS] pg_crypto failures with llvm on OSX
On Mar 10, 2012, at 4:19 PM, Andrew Dunstan and...@dunslane.net wrote: On 03/09/2012 07:50 PM, Tom Lane wrote: Andrew Dunstan and...@dunslane.net writes: Buildfarm member mussel (OS X 10.7.3, llvm-gcc 4.2.1, x86_64) seems to be getting consistent warnings when running the pgcrypto regression tests, that look like this: WARNING: detected write past chunk end in ExprContext 0x7fec2b11eb58 Does anyone have an idea why that might be? Worksforme, on an up-to-date Lion system using exactly the same compiler version. I do see the deprecation warnings (which Apple seems to have plastered on every single OpenSSL function ... doesn't leave me with a warm feeling about their future plans). I suspect that mussel has an ABI-incompatible openssl library hanging around someplace. On my machine otool -L pgcrypto.so shows /usr/lib/libcrypto.0.9.8.dylib (compatibility version 0.9.8, current version 44.0.0) It'd be interesting to know what it says on mussel. Robert, please investigate. creagers-imac:pgcrypto Robert$ otool -L pgcrypto.so pgcrypto.so: /opt/local/lib/libcrypto.1.0.0.dylib (compatibility version 1.0.0, current version 1.0.0) /opt/local/lib/libz.1.dylib (compatibility version 1.0.0, current version 1.2.5) /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 159.1.0) via Mac Ports cheers andrew
Re: [HACKERS] about EncodeDateTime() arguments
Peter Eisentraut pete...@gmx.net writes: We currently have void EncodeDateTime(struct pg_tm * tm, fsec_t fsec, int *tzp, char **tzn, int style, char *str) but tzn isn't used anywhere, only *tzn is used everywhere. Wouldn't it be clearer to remove that one level of indirection and instead have the signature be void EncodeDateTime(struct pg_tm * tm, fsec_t fsec, int *tzp, char *tzn, int style, char *str) or better yet void EncodeDateTime(struct pg_tm * tm, fsec_t fsec, const int *tzp, const char *tzn, int style, char *str) It appears to me that null-ness of tzp and tzn are used as a 3-way flag to identify the style of timezone output wanted (none, numeric, or alpha). It would probably be better yet if it went like enum tzstyle, int tzp, const char *tzn where tzp or tzn would be examined only if tzstyle said so. regards, tom lane
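The enum-based signature Tom suggests could look roughly like this. Everything here is a hypothetical illustration (the names and the offset sign convention are assumed, and do not match the real EncodeDateTime): an explicit enum replaces the NULL-ness of tzp/tzn as the three-way flag, and tzp/tzn are consulted only when the style says they matter:

```c
/* Hypothetical sketch of the enum-flag idea; not the real
 * EncodeDateTime. tzp is assumed here to be an offset in seconds
 * east of UTC, which may not match the real code's convention. */
#include <assert.h>
#include <stdio.h>
#include <string.h>

typedef enum
{
    TZ_NONE,      /* no timezone output */
    TZ_NUMERIC,   /* numeric offset, e.g. "+02" */
    TZ_ALPHA      /* abbreviation, e.g. "CET" */
} TzStyle;

/* Append the timezone part to an already-formatted time string;
 * tzp/tzn are examined only when tzstyle says they are relevant. */
static void
append_tz(char *str, TzStyle tzstyle, int tzp, const char *tzn)
{
    switch (tzstyle)
    {
        case TZ_NONE:
            break;
        case TZ_NUMERIC:
            sprintf(str + strlen(str), "%+03d", tzp / 3600);
            break;
        case TZ_ALPHA:
            sprintf(str + strlen(str), " %s", tzn);
            break;
    }
}
```

The advantage over the pointer-NULL-ness scheme is that the caller's intent is explicit, and passing an irrelevant tzp or tzn is harmless rather than accidentally switching output styles.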
Re: [HACKERS] pg_crypto failures with llvm on OSX
Robert Creager rob...@logicalchaos.org writes: On Mar 10, 2012, at 4:19 PM, Andrew Dunstan and...@dunslane.net wrote: On 03/09/2012 07:50 PM, Tom Lane wrote: I suspect that mussel has an ABI-incompatible openssl library hanging around someplace. On my machine otool -L pgcrypto.so shows /usr/lib/libcrypto.0.9.8.dylib (compatibility version 0.9.8, current version 44.0.0) It'd be interesting to know what it says on mussel. creagers-imac:pgcrypto Robert$ otool -L pgcrypto.so pgcrypto.so: /opt/local/lib/libcrypto.1.0.0.dylib (compatibility version 1.0.0, current version 1.0.0) /opt/local/lib/libz.1.dylib (compatibility version 1.0.0, current version 1.2.5) /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 159.1.0) Uh-huh. I will bet a good deal that you are compiling against the Apple-supplied openssl header files, but then linking against the MacPorts openssl shlib, which evidently is configured a bit differently than Apple's copy. [ pokes around in the mussel build logs ] I see -L/opt/local/lib in the link commands, which explains why you're linking to that shlib, and I don't see anything like -I/opt/local/include, which is probably needed to find the matching header files. There is a -I/opt/local/include/libxml2, but that isn't going to help unless your /opt/local layout is really weird. What's really odd though is that there is nothing in the configuration script that injects any of those switches. I think you've got some screwy global configuration on that machine, which you'd be well advised to try to get rid of --- it's tough for people to do remote diagnosis of buildfarm critters when there's relevant configuration that's not exposed in the config script. regards, tom lane
Re: [HACKERS] pg_crypto failures with llvm on OSX
On Mar 10, 2012, at 5:01 PM, Tom Lane t...@sss.pgh.pa.us wrote: What's really odd though is that there is nothing in the configuration script that injects any of those switches. I think you've got some screwy global configuration on that machine, which you'd be well advised to try to get rid of --- it's tough for people to do remote diagnosis of buildfarm critters when there's relevant configuration that's not exposed in the config script. No global config. Changed the path around to have /opt/local after the standard Apple ones, and it appears to be working fine. autoconf must be throwing in that path based on executables found? Later, Rob
Re: [HACKERS] pg_crypto failures with llvm on OSX
Robert Creager rob...@logicalchaos.org writes: On Mar 10, 2012, at 5:01 PM, Tom Lane t...@sss.pgh.pa.us wrote: What's really odd though is that there is nothing in the configuration script that injects any of those switches. I think you've got some screwy global configuration on that machine, which you'd be well advised to try to get rid of --- it's tough for people to do remote diagnosis of buildfarm critters when there's relevant configuration that's not exposed in the config script. No global config. Changed the path around to have /opt/local after the standard Apple ones, and it appears to be working fine. autoconf must be throwing in that path based on executables found? I don't believe autoconf would insert such stuff on its own authority. I'm wondering about CPPFLAGS, CFLAGS, LDFLAGS or similar variables being set in the environment that the buildfarm script is running in. Take a look at ~/.bash_profile and suchlike files. (I wonder whether it'd be a good idea for the buildfarm script to explicitly clear anything that autoconf pays attention to from its startup environment, so that you have to set these variables in the buildfarm config to make them have effect. If not that, maybe print env output to document what the situation is?) regards, tom lane
Re: [HACKERS] pg_crypto failures with llvm on OSX
On Mar 10, 2012, at 7:15 PM, Tom Lane t...@sss.pgh.pa.us wrote:
> I don't believe autoconf would insert such stuff on its own authority. I'm wondering about CPPFLAGS, CFLAGS, LDFLAGS or similar variables being set in the environment that the buildfarm script is running in. Take a look at ~/.bash_profile and suchlike files.

Nope. Only PATH is set.

Later,
Rob
Re: [HACKERS] pg_crypto failures with llvm on OSX
On 03/10/2012 09:15 PM, Tom Lane wrote:
> (I wonder whether it'd be a good idea for the buildfarm script to explicitly clear anything that autoconf pays attention to from its startup environment, so that you have to set these variables in the buildfarm config to make them have effect. If not that, maybe print "env" output to document what the situation is?)

I can put the latter in the next client release, unless people think it's a bit insecure to report arbitrary environment values. If we were to clear them, which would we clear?

cheers

andrew
Re: [HACKERS] pg_crypto failures with llvm on OSX
On Mar 10, 2012, at 7:54 PM, Andrew Dunstan and...@dunslane.net wrote:
> I can put the latter in the next client release, unless people think it's a bit insecure to report arbitrary environment values. If we were to clear them, which would we clear?

Why not just report the pertinent ones?

Rob
Re: [HACKERS] pg_prewarm
On Sat, Mar 10, 2012 at 4:35 PM, Stefan Keller sfkel...@gmail.com wrote:
> The main conclusion was:
> * Do a "tar cf /dev/zero $PG_DATA/base" either shortly before or shortly after the database is created
> * Do a seq scan "SELECT * FROM osm_point"
> Is your tool a replacement of those above?

It can be used that way, although it is more general. (The patch does include documentation...)

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
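[For reference, the cache-warming recipe quoted above can be sketched as a small script. This is a sketch under assumptions: the PGDATA path is a placeholder, and "osm_point" is simply the table named in the quoted message.]

```shell
#!/bin/sh
# Sketch of the cache-warming recipe quoted above. tar is pointed at
# /dev/zero rather than /dev/null because GNU tar special-cases /dev/null
# and skips reading file contents; /dev/zero forces a real read of every
# file, pulling it into the OS page cache.
# PGDATA below is a placeholder path, not from the original thread.
PGDATA="${PGDATA:-/var/lib/postgresql/data}"

warm_cache() {
    # read every byte of every file under $1
    tar cf /dev/zero "$1"
}

if [ -d "$PGDATA/base" ]; then
    warm_cache "$PGDATA/base"
fi

# Then pull the table into PostgreSQL's own shared buffers with a
# sequential scan, e.g.:
#   psql -c 'SELECT * FROM osm_point' >/dev/null
```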
[HACKERS] wal_buffers, redux
I've finally been able to run some more tests of the effect of adjusting wal_buffers to values higher than 16MB. I ran the test on the 16 core (x 4 hw threads/core) IBM POWER7 machine, with my usual configuration settings:

shared_buffers = 8GB
maintenance_work_mem = 1GB
synchronous_commit = off
checkpoint_segments = 300
checkpoint_timeout = 15min
checkpoint_completion_target = 0.9
wal_writer_delay = 20ms

I ran three 30-minute tests at scale factor 300 with wal_buffers set at various values from 16MB up to 160MB, in multiples of 16MB, using pgbench with 32 clients and 32 threads in each case. The short version is that 32MB seems to be significantly better than 16MB, by about 1000 tps, and after that it gets murky; full results are below.

16MB   tps = 13069.420658 (including connections establishing)
16MB   tps = 14370.293228 (including connections establishing)
16MB   tps = 13787.219680 (including connections establishing)
32MB   tps = 14916.068968 (including connections establishing)
32MB   tps = 14746.448728 (including connections establishing)
32MB   tps = 15110.676467 (including connections establishing)
48MB   tps = 15111.981870 (including connections establishing)
48MB   tps = 12824.628192 (including connections establishing)
48MB   tps = 15090.280081 (including connections establishing)
64MB   tps = 15382.740815 (including connections establishing)
64MB   tps = 12367.942312 (including connections establishing)
64MB   tps = 15195.405382 (including connections establishing)
80MB   tps = 15346.080326 (including connections establishing)
80MB   tps = 12791.192216 (including connections establishing)
80MB   tps = 14780.804054 (including connections establishing)
96MB   tps = 15476.229392 (including connections establishing)
96MB   tps = 15426.012162 (including connections establishing)
96MB   tps = 15548.671849 (including connections establishing)
112MB  tps = 15400.669675 (including connections establishing)
112MB  tps = 15676.416378 (including connections establishing)
112MB  tps = 15016.651083 (including connections establishing)
128MB  tps = 15811.463522 (including connections establishing)
128MB  tps = 15590.343669 (including connections establishing)
128MB  tps = 15256.867665 (including connections establishing)
144MB  tps = 15842.131696 (including connections establishing)
144MB  tps = 15669.880640 (including connections establishing)
144MB  tps = 15753.460908 (including connections establishing)
160MB  tps = 15658.726637 (including connections establishing)
160MB  tps = 15599.600752 (including connections establishing)
160MB  tps = 15311.198480 (including connections establishing)

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
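[For anyone wanting to reproduce a sweep like the one above, it amounts to a loop along these lines. This is a sketch only: the exact pgbench flags are a guess from the description in the message, and commands are echoed rather than executed, since changing wal_buffers requires a server restart that is elided here.]

```shell
#!/bin/sh
# Three 30-minute pgbench runs (32 clients, 32 threads) at each wal_buffers
# setting from 16MB to 160MB in 16MB steps, against a scale-factor-300
# database. Commands are echoed, not executed.
runs_per_setting=3
settings=""
mb=16
while [ "$mb" -le 160 ]; do
    settings="$settings ${mb}MB"
    mb=$((mb + 16))
done

for s in $settings; do
    echo "# set wal_buffers = $s in postgresql.conf and restart the server"
    i=1
    while [ "$i" -le "$runs_per_setting" ]; do
        # -T 1800 is a 30-minute run; database name is a placeholder
        echo "pgbench -c 32 -j 32 -T 1800 pgbench"
        i=$((i + 1))
    done
done
```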
Re: [HACKERS] pg_crypto failures with llvm on OSX
Robert Creager rob...@logicalchaos.org writes:
> On Mar 10, 2012, at 7:15 PM, Tom Lane t...@sss.pgh.pa.us wrote:
>> I don't believe autoconf would insert such stuff on its own authority. I'm wondering about CPPFLAGS, CFLAGS, LDFLAGS or similar variables being set in the environment that the buildfarm script is running in. Take a look at ~/.bash_profile and suchlike files.

> Nope. Only PATH is set.

Hmm ... [ eyeballs the mussel build reports some more ] ... ah-hah, look at this (at the bottom of the configure step for the last failed build):

configure: using CPPFLAGS= -I/opt/local/include/libxml2
configure: using LDFLAGS= -L/opt/local/lib -Wl,-dead_strip_dylibs

versus this in the first successful build:

configure: using CPPFLAGS= -I/usr/include/libxml2
configure: using LDFLAGS= -Wl,-dead_strip_dylibs

I will bet that those -I and -L switches are coming from this part of configure.in:

if test "$with_libxml" = yes ; then
  AC_CHECK_PROGS(XML2_CONFIG, xml2-config)
  if test -n "$XML2_CONFIG"; then
    for pgac_option in `$XML2_CONFIG --cflags`; do
      case $pgac_option in
        -I*|-D*) CPPFLAGS="$CPPFLAGS $pgac_option";;
      esac
    done
    for pgac_option in `$XML2_CONFIG --libs`; do
      case $pgac_option in
        -L*) LDFLAGS="$LDFLAGS $pgac_option";;
      esac
    done
  fi
fi

So the answer is that you've got a MacPorts libxml2 installation whose xml2-config program inserts these not terribly self-consistent switches, resulting in drive-by failure of the openssl configuration. When you switched your PATH around, instead of finding the MacPorts copy of xml2-config, configure found the one in /usr/bin, which provides -I/usr/include/libxml2 and no particular -L switch; hence no openssl problem. You're building with a different libxml2 than you were before, though.

Seems to me this is a MacPorts bug: their libxml2 and openssl packagings don't play nice together.

regards, tom lane
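[The flag-harvesting loop Tom points at can be exercised standalone. In this sketch a shell function stands in for MacPorts' xml2-config; the flag values it emits are illustrative, modeled on the failed-build output quoted above rather than MacPorts' actual output.]

```shell
#!/bin/sh
# Mimic configure's loop that copies -I/-D flags from `xml2-config --cflags`
# into CPPFLAGS and -L flags from `xml2-config --libs` into LDFLAGS.
# fake_xml2_config stands in for a real xml2-config program.
fake_xml2_config() {
    case "$1" in
        --cflags) echo "-I/opt/local/include/libxml2" ;;
        --libs)   echo "-L/opt/local/lib -lxml2 -lz -lm" ;;
    esac
}

CPPFLAGS=""
LDFLAGS=""
for pgac_option in $(fake_xml2_config --cflags); do
    case $pgac_option in
        -I*|-D*) CPPFLAGS="$CPPFLAGS $pgac_option" ;;
    esac
done
for pgac_option in $(fake_xml2_config --libs); do
    case $pgac_option in
        -L*) LDFLAGS="$LDFLAGS $pgac_option" ;;
    esac
done
# The -l switches are dropped; only the -L search path leaks into LDFLAGS.
# That is how an unrelated openssl configure check can end up resolving
# against MacPorts libraries.
```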
Re: [HACKERS] pg_crypto failures with llvm on OSX
Andrew Dunstan and...@dunslane.net writes:
> I can put the latter in the next client release, unless people think it's a bit insecure to report arbitrary environment values. If we were to clear them, which would we clear?

I think it'd be useful to print CPPFLAGS, CFLAGS, and LDFLAGS if the environment is supplying values for them (or maybe print their values after absorbing whatever is in the buildfarm animal's config). Peter might know whether there's anything else of great interest.

regards, tom lane
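[A minimal version of the reporting Tom suggests could look like the following. The variable list is an assumption on my part, a guess at "anything autoconf pays attention to", not a definitive set; the buildfarm client would presumably do this in Perl, but the idea is the same.]

```shell
#!/bin/sh
# Print only the build-relevant environment variables, and only when they
# are actually set, so a buildfarm log documents what configure saw
# without dumping the whole (possibly sensitive) environment.
report_build_env() {
    for v in CC CPPFLAGS CFLAGS LDFLAGS LIBS PATH; do
        # ${NAME+x} expands to "x" only when NAME is set
        eval "set_marker=\${$v+x}"
        if [ -n "$set_marker" ]; then
            eval "echo \"$v=\$$v\""
        fi
    done
}

report_build_env
```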